ZhangYuanhan committed (verified)
Commit cae3b05 · Parent: 8baea48

Update README.md

Files changed (1):
  1. README.md (+23, -16)
README.md CHANGED

@@ -6,38 +6,45 @@ colorTo: red
 sdk: static
 pinned: false
 ---
-- **[2024-10]** 🔥🔥 We present `LLaVA-Critic`, the first open-source large multimodal model as a generalist evaluator for assessing LMM-generated responses across diverse multimodal tasks and scenarios.
+
+- **[2024-10]** 🔥🔥 We present **`LLaVA-Critic`**, the first open-source large multimodal model as a generalist evaluator for assessing LMM-generated responses across diverse multimodal tasks and scenarios.
 
  [GitHub](https://github.com/LLaVA-VL/LLaVA-NeXT) | [Blog](https://llava-vl.github.io/blog/2024-10-03-llava-critic/)
 
-- **[2024-10]** 🎬🎬 We present `LLaVA-Video`, a family of open large multimodal models (LMMs) designed specifically for advanced video understanding. We're excited to open-source LLaVA-Video-178K, a high-quality, synthetic dataset curated for video instruction tuning.
+- **[2024-10]** 🎬🎬 Introducing **`LLaVA-Video`**, a family of open large multimodal models designed specifically for advanced video understanding. We're open-sourcing **LLaVA-Video-178K**, a high-quality, synthetic dataset for video instruction tuning.
 
  [GitHub](https://github.com/LLaVA-VL/LLaVA-NeXT) | [Blog](https://github.com/LLaVA-VL/LLaVA-NeXT)
-
-- **[2024-08]** 🤞🤞 We present `LLaVA-OneVision`, a family of open large multimodal models (LMMs) developed by consolidating our insights into data, models, and visual representations in the LLaVA-NeXT blog series.
-
+
+- **[2024-08]** 🤞🤞 We present **`LLaVA-OneVision`**, a family of LMMs developed by consolidating insights into data, models, and visual representations.
+
  [GitHub](https://github.com/LLaVA-VL/LLaVA-NeXT) | [Blog](https://llava-vl.github.io/blog/2024-08-05-llava-onevision/)
+
+---
+
+<details>
+<summary>Older Updates (2024-06 and earlier)</summary>
+
+- **[2024-06]** 🧑‍🎨🧑‍🎨 We release **`LLaVA-NeXT-Interleave`**, an LMM extending capabilities to real-world settings: Multi-image, Multi-frame (videos), Multi-view (3D), and Multi-patch (single-image).
 
-- **[2024-06]** 🧑‍🎨🧑‍🎨 We release the `LLaVA-NeXT-Interleave`, an all-around LMM that extends the model capabilities to new real-world settings: Multi-image, Multi-frame (videos), Multi-view (3D) and maintains the performance of the Multi-patch (single-image) scenarios.
-
  [GitHub](https://github.com/LLaVA-VL/LLaVA-NeXT) | [Blog](https://llava-vl.github.io/blog/2024-06-16-llava-next-interleave/)
 
-- **[2024-06]** 🚀🚀 We release the `LongVA`, a long language model with state-of-the-art performance on video understanding tasks.
-
+- **[2024-06]** 🚀🚀 We release **`LongVA`**, a long language model with state-of-the-art video understanding performance.
+
  [GitHub](https://github.com/EvolvingLMMs-Lab/LongVA) | [Blog](https://lmms-lab.github.io/posts/longva/)
 
-- **[2024-06]** 🎬🎬 The `lmms-eval/v0.2` has been upgraded to support video evaluations for video models like LLaVA-NeXT Video and Gemini 1.5 Pro across tasks such as EgoSchema, PerceptionTest, VideoMME, and more.
-
+- **[2024-06]** 🎬🎬 The **`lmms-eval/v0.2`** toolkit now supports video evaluations for models like LLaVA-NeXT Video and Gemini 1.5 Pro.
+
  [GitHub](https://github.com/EvolvingLMMs-Lab/lmms-eval) | [Blog](https://lmms-lab.github.io/posts/lmms-eval-0.2/)
 
-- **[2024-05]** 🚀🚀 We release the `LLaVA-NeXT Video`, a video model with state-of-the-art performance and reaching to Google's Gemini level performance on diverse video understanding tasks.
-
+- **[2024-05]** 🚀🚀 We release **`LLaVA-NeXT Video`**, a model performing at Google's Gemini level on video understanding tasks.
+
  [GitHub](https://github.com/LLaVA-VL/LLaVA-NeXT) | [Blog](https://llava-vl.github.io/blog/2024-04-30-llava-next-video/)
 
-- **[2024-05]** 🚀🚀 We release the `LLaVA-NeXT` with state-of-the-art and near GPT-4V performance at multiple multimodal benchmarks. LLaVA model family now reaches at 72B, and 110B parameters level.
+- **[2024-05]** 🚀🚀 The **`LLaVA-NeXT`** model family reaches near GPT-4V performance on multimodal benchmarks, with models up to 110B parameters.
 
  [GitHub](https://github.com/LLaVA-VL/LLaVA-NeXT) | [Blog](https://llava-vl.github.io/blog/2024-05-10-llava-next-stronger-llms/)
 
-- **[2024-03]** We release the `lmms-eval`, a toolkit for holistic evaluations with 50+ multimodal datasets and 10+ models.
+- **[2024-03]** We release **`lmms-eval`**, a toolkit for holistic evaluations with 50+ multimodal datasets and 10+ models.
 
-[GitHub](https://github.com/EvolvingLMMs-Lab/lmms-eval) | [Blog](https://lmms-lab.github.io/posts/lmms-eval-0.1/)
+ [GitHub](https://github.com/EvolvingLMMs-Lab/lmms-eval) | [Blog](https://lmms-lab.github.io/posts/lmms-eval-0.1/)
+</details>