Commit e279690 by xiangan (verified) · 1 Parent(s): 5418ee9

Update README.md

  1. README.md +44 -0
README.md CHANGED
@@ -9,4 +9,48 @@ base_model:
  - DeepGlint-AI/mlcd-vit-large-patch14-336
  ---
 
+
+ [[Paper]](https://arxiv.org/abs/2407.17331) [[GitHub]](https://github.com/deepglint/unicom)
+
+ ## Performance in RoboVQA and OpenEQA
+
+
+
+ | Benchmark | Metric / Category | MLCD-Embodied-7B | LLaVA OneVision-7B | GPT-4V | RoboMamba |
+ |----------------|-------------------|-------------------|--------------------|--------|-----------|
+ | **RoboVQA** | BLEU1 | **73.16** | 38.12 | - | 54.9 |
+ | | BLEU2 | **66.39** | 33.56 | - | 44.2 |
+ | | BLEU3 | **60.61** | 31.76 | - | 39.5 |
+ | | BLEU4 | **56.56** | 30.97 | - | 36.3 |
+ | **OpenEQA** | OBJECT-STATE-RECOGNITION | **71.83** | - | 63.2 | - |
+ | | OBJECT-RECOGNITION | **49.46** | - | 43.4 | - |
+ | | FUNCTIONAL-REASONING | 54.38 | - | **57.4** | - |
+ | | SPATIAL-UNDERSTANDING | **48.64** | - | 33.6 | - |
+ | | ATTRIBUTE-RECOGNITION | **67.08** | - | 57.2 | - |
+ | | WORLD-KNOWLEDGE | **53.87** | - | 50.7 | - |
+ | | OBJECT-LOCALIZATION | **43.06** | - | 42.0 | - |
+
+
+
+
+ ## General Ability Evaluation: Comparison with LLaVA OneVision-7B, GPT-4V, and GPT-4o
+
+ | Dataset | Split | MLCD-Embodied-7B | LLaVA OneVision-7B | GPT-4V | GPT-4o |
+ | :-- | :-: | :-: | :-: | :-: | :-: |
+ | AI2D | test | 79.9 | 81.4 | 78.2 | 94.2 |
+ | ChartQA | test | 83.0 | 80.0 | 78.5 | 85.7 |
+ | DocVQA | test | 91.6 | 87.5 | 88.4 | 92.8 |
+ | InfoVQA | val | 73.9 | 70.7 | - | - |
+ | InfoVQA | test | 70.0 | 68.8 | - | - |
+ | MMMU | val | 47.3 | 48.8 | 56.8 | 69.1 |
+ | MMStar | test | 58.5 | 61.7 | 57.1 | 63.9 |
+ | OCRBench | - | 749.0 | 697.0 | 656.0 | 805.0 |
+ | RealWorldQA | test | 68.9 | 66.3 | 61.4 | 58.6 |
+ | SeedBench | image | 74.9 | 75.4 | 49.9 | 76.2 |
+ | MMBench | en-dev | 81.1 | 83.2 | 81.3 | 83.4 |
+ | MMBench | en-test | 80.1 | 80.8 | 75.0 | - |
+ | MME | test | 578/1603 | 418/1580 | 517/1409 | - |
+
+
+
  We would like to express our gratitude to [Huajie Tan](https://huggingface.co/tanhuajie2001), [Yumeng Wang](https://huggingface.co/devymex), and [Yin Xie](https://huggingface.co/Yin-Xie) for their significant contributions to the experimental validation in MLLMs.