xiangan committed · Commit 14fd13e · verified · 1 Parent(s): cf984ef

Update README.md

Files changed (1):
  1. README.md +7 -6
README.md CHANGED
@@ -25,22 +25,23 @@ In our experiments, we replaced the CLIP model in [LLaVA-NeXT](https://github.co
 |:----------------|:-------------|:-------------|
 | LLM | Qwen2.5-7B | Qwen2.5-7B |
 | AI2D | **76.98** | 73.15 |
-| ChartQA | **67.84** | 66.52 |
-| DocVQA_val | **76.46** | 75.21 |
+| ScienceQA_img | **78.09** | 76.35 |
 | GQA | **64.17** | 63.31 |
 | InfoVQA_val | **43.48** | 38.88 |
 | MMBench_cn_dev | **74.83** | 72.51 |
 | MMBench_en_dev | **76.37** | 74.57 |
 | MME(cognition) | **432** | 384 |
 | MME(perception) | **1598** | 1512 |
+| SeedBench | **68.20** | 66.80 |
+| SeedBench_img | **73.75** | 72.72 |
+| MMStar | **50.98** | 48.98 |
 | MMMU | **44.30** | 44.20 |
 | OCRBench | **531.00** | 525.00 |
+| ChartQA | **67.84** | 66.52 |
+| DocVQA_val | **76.46** | 75.21 |
 | POPE | 88.69 | **88.83** |
-| ScienceQA_img | **78.09** | 76.35 |
 | TextVQA_val | 61.69 | **62.47** |
-| SeedBench | **68.20** | 66.80 |
-| SeedBench_img | **73.75** | 72.72 |
-| MMStar | **50.98** | 48.98 |
+
 
 
 