Update README.md
README.md CHANGED
@@ -46,7 +46,7 @@ In training, we used 1849 training dataset, and 200 validation dataset.
 We internally evaluated [VLMEvalKit](https://github.com/open-compass/VLMEvalKit?tab=readme-ov-file).
 We utilized **chatgpt-0125**, **gpt-4o-mini** and **gpt-4-turbo** in `MMBench`, `MathVista` and `MMVet`, respectively.
 
-| Model | MMStar | MathVista |
+| Model | MMStar | MathVista | HallusionBench | AI2D | OCRBench | MMVet | MMBench_V11 | AVG |
 |:---------:|:-----:|:------:|:-----:|:-----:|:----:|:-----:|:-----:|:-----:|
 | Step-1o (closed model) | 69.3 | **74.7** | **89.1** | 55.8 | **92.6** | **82.8** | 87.3 | **78.8** |
 | InternVL2.5-78B-MPO (Open) | **72.1** | 76.6 | 58.1 | **89.2** | 90.9 | 73.5 | **87.8** | 78.3 |