Update README.md
README.md
CHANGED
@@ -9,21 +9,21 @@ pipeline_tag: image-text-to-text
 ---
 
 ## Introduction
-Ovis is a novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings. For a comprehensive introduction, please refer to [Ovis paper](https://arxiv.org/abs/2405.20797) and [Ovis GitHub](https://github.com/AIDC-AI/Ovis).
+**Ovis** is a novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings. For a comprehensive introduction, please refer to [Ovis paper](https://arxiv.org/abs/2405.20797) and [Ovis GitHub](https://github.com/AIDC-AI/Ovis).
 
 <div align="center">
 <img src="https://cdn-uploads.huggingface.co/production/uploads/658a8a837959448ef5500ce5/TIlymOb86R6_Mez3bpmcB.png" width="100%" />
 </div>
 
 ## Model
-Built upon Ovis1.5, Ovis1.6 further enhances high-resolution image processing, is trained on a larger, more diverse, and higher-quality dataset, and refines the training process with DPO training following instruction-tuning.
+Built upon Ovis1.5, **Ovis1.6** further enhances high-resolution image processing, is trained on a larger, more diverse, and higher-quality dataset, and refines the training process with DPO training following instruction-tuning.
 
 | Ovis MLLMs | ViT | LLM | Model Weights |
 |:------------------|:-----------:|:------------------:|:---------------------------------------------------------------:|
 | Ovis1.6-Gemma2-9B | Siglip-400M | Gemma2-9B-It | [Huggingface](https://huggingface.co/AIDC-AI/Ovis1.6-Gemma2-9B) |
 
 ## Performance
-With just **10B** parameters, Ovis1.6-Gemma2-9B leads the [OpenCompass](https://github.com/open-compass/VLMEvalKit) benchmark among open-source MLLMs within **30B** parameters.
+With just **10B** parameters, **Ovis1.6-Gemma2-9B** leads the [OpenCompass](https://github.com/open-compass/VLMEvalKit) benchmark among open-source MLLMs within **30B** parameters.
 
 <div align="center">
 <img src="https://cdn-uploads.huggingface.co/production/uploads/658a8a837959448ef5500ce5/FBw_icZic56Dm1XyzJaxA.png" width="100%" />