Update README.md

The `GME` models support three types of input: **text**, **image**, and **image-text pair**.

**Developed by**: Tongyi Lab, Alibaba Group

**Paper**: [GME: Improving Universal Multimodal Retrieval by Multimodal LLMs](http://arxiv.org/abs/2412.16855)

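To illustrate the three input types, here is a minimal usage sketch. It assumes the `gme_inference.py` helper script distributed with the GME model repositories and its `GmeQwen2VL` wrapper; treat the method names as assumptions and check the usage section of the model card for the exact interface.

```python
# Minimal sketch, assuming the `gme_inference.py` helper from the GME model
# repository; method names may differ across releases.
from gme_inference import GmeQwen2VL

gme = GmeQwen2VL("Alibaba-NLP/gme-Qwen2-VL-2B-Instruct")

texts = ["What kind of car is this?"]
images = ["https://example.com/car.jpg"]  # hypothetical image URL

e_text = gme.get_text_embeddings(texts=texts)      # text input
e_image = gme.get_image_embeddings(images=images)  # image input
e_fused = gme.get_fused_embeddings(texts=texts, images=images)  # image-text pair

# All three embeddings share one vector space, so similarities are comparable.
print((e_text * e_image).sum(-1))
```
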
## Model List

We validated the performance on our universal multimodal retrieval benchmark (**UMRB**).

The [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) English tab shows the text embedding performance of our model.

**More detailed experimental results can be found in the [paper](http://arxiv.org/abs/2412.16855)**.

## Limitations

We encourage and value diverse applications of GME models and continuous enhancement of the models themselves.

In addition to the open-source [GME](https://huggingface.co/collections/Alibaba-NLP/gme-models-67667e092da3491f630964d6) series, GME models are also available as commercial API services on Alibaba Cloud.

- [MultiModal Embedding Models](https://help.aliyun.com/zh/model-studio/developer-reference/multimodal-embedding-api-reference?spm=a2c4g.11186623.0.0.321c1d1cqmoJ5C): The `multimodal-embedding-v1` model service is available.

Note that the models behind the commercial APIs are not entirely identical to the open-source models.
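
As a quick illustration of calling the commercial service, here is a hedged sketch using the DashScope Python SDK. The `MultiModalEmbedding` interface and the input schema shown are assumptions; consult the API reference linked above for the authoritative request format.

```python
# Hedged sketch: assumes the DashScope Python SDK (`pip install dashscope`).
# The input schema for `multimodal-embedding-v1` is an assumption here;
# see the linked API reference for the exact format.
import dashscope

dashscope.api_key = "YOUR_API_KEY"  # issued in the Alibaba Cloud Model Studio console

resp = dashscope.MultiModalEmbedding.call(
    model="multimodal-embedding-v1",
    input=[
        {"text": "What kind of car is this?"},      # text input
        {"image": "https://example.com/car.jpg"},   # hypothetical image URL
    ],
)
print(resp)  # on success, the embeddings are carried in the response output
```
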

If you find our paper or models helpful, please consider citing:

```
@misc{zhang2024gme,
  title={GME: Improving Universal Multimodal Retrieval by Multimodal LLMs},
  author={Zhang, Xin and Zhang, Yanzhao and Xie, Wen and Li, Mingxin and Dai, Ziqi and Long, Dingkun and Xie, Pengjun and Zhang, Meishan and Li, Wenjie and Zhang, Min},
  year={2024},
  eprint={2412.16855},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={http://arxiv.org/abs/2412.16855},
}
```