TianyuZhang committed on
Commit e3581ce · verified · 1 Parent(s): a840c98

Update README.md

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -11,6 +11,7 @@ short_description: The VCR-Wiki datasets
 
  This space contains all configurations for VCR-Wiki, introduced in VCR: Visual Caption Restoration (https://arxiv.org/abs/2406.06462).
  # News
+ - 🔥🔥🔥 **[2024-06-24]** We have updated our arXiv paper, which now includes results for Claude 3.5 Sonnet, Claude 3 Opus, GPT-4o, GPT-4-Turbo, Qwen-VL-Max, Reka Core and Gemini-1.5-pro. The evaluation script has also been released; see `src/evaluation/closed_source_eval.py` in the GitHub repo.
  - 🔥🔥🔥 **[2024-06-13]** We have released the evaluation code for open-source and closed-source models, along with the dataset creation pipeline, in [VCR's GitHub repo](https://github.com/tianyu-z/VCR).
  - 🔥🔥🔥 **[2024-06-12]** We have incorporated the VCR-wiki evaluation process into the [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval) framework. Users can now run the evaluation of models on the VCR-wiki test datasets with a one-line command.
  - 🔥🔥🔥 **[2024-06-11]** Our paper has been released on [arXiv](https://arxiv.org/abs/2406.06462), including the evaluation results of a series of models.