---
title: README
emoji: 
colorFrom: blue
colorTo: red
sdk: static
pinned: false
---

- **[2024-11]** 🔔🔔 We are excited to introduce **LMMs-Eval/v0.3.0**, focusing on audio understanding. Building upon LMMs-Eval/v0.2.0, we have added audio models and tasks. Now, LMMs-Eval provides a consistent evaluation toolkit across image, video, and audio modalities.
    
    [GitHub](https://github.com/EvolvingLMMs-Lab/lmms-eval) | [Documentation](https://github.com/EvolvingLMMs-Lab/lmms-eval/blob/main/docs/lmms-eval-0.3.md)

- **[2024-11]** 🤯🤯 We introduce **Multimodal SAE**, the first framework designed to interpret learned features in large-scale multimodal models using Sparse Autoencoders. Through our approach, we leverage LLaVA-OneVision-72B to analyze and explain the SAE-derived features of LLaVA-NeXT-LLaMA3-8B. Furthermore, we demonstrate the ability to steer model behavior by clamping specific features to alleviate hallucinations and avoid safety-related issues.
    
    [GitHub](https://github.com/EvolvingLMMs-Lab/multimodal-sae) | [Paper](https://arxiv.org/abs/2411.14982)

- **[2024-10]** 🔥🔥 We present **`LLaVA-Critic`**, the first open-source large multimodal model as a generalist evaluator for assessing LMM-generated responses across diverse multimodal tasks and scenarios.
    
    [GitHub](https://github.com/LLaVA-VL/LLaVA-NeXT) | [Blog](https://llava-vl.github.io/blog/2024-10-03-llava-critic/)
  
- **[2024-10]** 🎬🎬 Introducing **`LLaVA-Video`**, a family of open large multimodal models designed specifically for advanced video understanding. We are also open-sourcing **LLaVA-Video-178K**, a high-quality synthetic dataset for video instruction tuning.
    
    [GitHub](https://github.com/LLaVA-VL/LLaVA-NeXT) | [Blog](https://github.com/LLaVA-VL/LLaVA-NeXT)

- **[2024-08]** 🤞🤞 We present **`LLaVA-OneVision`**, a family of LMMs developed by consolidating insights into data, models, and visual representations.

    [GitHub](https://github.com/LLaVA-VL/LLaVA-NeXT) | [Blog](https://llava-vl.github.io/blog/2024-08-05-llava-onevision/)

- **[2024-06]** 🧑‍🎨🧑‍🎨 We release **`LLaVA-NeXT-Interleave`**, an LMM extending capabilities to real-world settings: Multi-image, Multi-frame (videos), Multi-view (3D), and Multi-patch (single-image).

    [GitHub](https://github.com/LLaVA-VL/LLaVA-NeXT) | [Blog](https://llava-vl.github.io/blog/2024-06-16-llava-next-interleave/)

- **[2024-06]** 🚀🚀 We release **`LongVA`**, a long-context multimodal model with state-of-the-art video understanding performance.

    [GitHub](https://github.com/EvolvingLMMs-Lab/LongVA) | [Blog](https://lmms-lab.github.io/posts/longva/)

<details>
  <summary>Older Updates (2024-06 and earlier)</summary>

- **[2024-06]** 🎬🎬 The **`lmms-eval/v0.2`** toolkit now supports video evaluations for models such as LLaVA-NeXT Video and Gemini 1.5 Pro.

    [GitHub](https://github.com/EvolvingLMMs-Lab/lmms-eval) | [Blog](https://lmms-lab.github.io/posts/lmms-eval-0.2/)

- **[2024-05]** 🚀🚀 We release **`LLaVA-NeXT Video`**, a model performing at Google's Gemini level on video understanding tasks.

    [GitHub](https://github.com/LLaVA-VL/LLaVA-NeXT) | [Blog](https://llava-vl.github.io/blog/2024-04-30-llava-next-video/)

- **[2024-05]** 🚀🚀 The **`LLaVA-NeXT`** model family reaches near GPT-4V performance on multimodal benchmarks, with models up to 110B parameters.

    [GitHub](https://github.com/LLaVA-VL/LLaVA-NeXT) | [Blog](https://llava-vl.github.io/blog/2024-05-10-llava-next-stronger-llms/)

- **[2024-03]** We release **`lmms-eval`**, a toolkit for holistic evaluation across 50+ multimodal datasets and 10+ models.

    [GitHub](https://github.com/EvolvingLMMs-Lab/lmms-eval) | [Blog](https://lmms-lab.github.io/posts/lmms-eval-0.1/)
</details>