# Model Card for InternViT-6B-448px-V1-0

<img src="https://cdn-uploads.huggingface.co/production/uploads/64119264f0f81eb569e0d569/s0wjRQcYFdcQZa2FZ3Om7.webp" alt="Image Description" width="300" height="300">

\[[Paper](https://arxiv.org/abs/2312.14238)\] \[[GitHub](https://github.com/OpenGVLab/InternVL)\] \[[Chat Demo](https://internvl.opengvlab.com/)\] \[[Explanation in Chinese](https://zhuanlan.zhihu.com/p/675877376)\]

- Params (M): 5903
- Image size: 448 x 448
- **Pretrain Dataset:** LAION-en, LAION-COCO, COYO, CC12M, CC3M, SBU, Wukong, LAION-multi, OCR-related datasets.
- **Note:** This model has 48 blocks, and we found that using the output after the fourth-to-last block worked best for VLLM. Therefore, when building a VLLM with this model, **please use the features from the fourth-to-last layer.**

## Model Usage (Image Embeddings)
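The note above says to take features from the fourth-to-last of the 48 blocks rather than the final one. A minimal Python sketch of that indexing, using stand-in blocks (not the real InternViT-6B modules or its loading API):

```python
# Minimal sketch (assumption: illustrative stand-in blocks, NOT the real model):
# run a 48-block stack, keep every intermediate output, and select the
# fourth-to-last block's output as the feature for the VLLM.
def run_blocks(x, blocks):
    """Apply each block in turn and record every intermediate output."""
    outputs = []
    for block in blocks:
        x = block(x)
        outputs.append(x)
    return outputs

# 48 stand-in blocks; each just increments its input so outputs are traceable.
blocks = [lambda v: v + 1 for _ in range(48)]
all_outputs = run_blocks(0, blocks)

final_features = all_outputs[-1]  # output after all 48 blocks
vllm_features = all_outputs[-4]   # fourth-to-last block, as the note recommends
```

With the real checkpoint, the analogous move in `transformers` would be to request all hidden states (e.g. `output_hidden_states=True`) and index `hidden_states[-4]`; treat that as an assumption to check against this card's usage section below.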