Update README.md
README.md CHANGED
@@ -10,16 +10,12 @@ datasets:
 pipeline_tag: image-feature-extraction
 ---
 
-#
-<p align="center">
-  <img src="https://cdn-uploads.huggingface.co/production/uploads/64119264f0f81eb569e0d569/s0wjRQcYFdcQZa2FZ3Om7.webp" alt="Image Description" width="300" height="300">
-</p>
+# InternViT-6B-448px-V1-0
 
 [\[🆕 Blog\]](https://internvl.github.io/blog/) [\[📜 InternVL 1.0 Paper\]](https://arxiv.org/abs/2312.14238) [\[📜 InternVL 1.5 Report\]](https://arxiv.org/abs/2404.16821) [\[🗨️ Chat Demo\]](https://internvl.opengvlab.com/)
 
 [\[🤗 HF Demo\]](https://huggingface.co/spaces/OpenGVLab/InternVL) [\[🚀 Quick Start\]](#model-usage) [\[🌐 Community-hosted API\]](https://rapidapi.com/adushar1320/api/internvl-chat) [\[📖 中文解读\]](https://zhuanlan.zhihu.com/p/675877376)
-
 
 We release InternViT-6B-448px-V1-0, which is integrated into [InternVL-Chat-V1-1](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1). In this update, we explored increasing the resolution to 448x448, enhancing Optical Character Recognition (OCR) capabilities, and improving support for Chinese conversations. For examples of the enhanced capabilities, please refer to the [LINK](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1#examples).
 
 ## Model Details
|
@@ -90,8 +86,3 @@ If you find this project useful in your research, please consider citing:
   year={2024}
 }
 ```
-
-
-## Acknowledgement
-
-InternVL is built with reference to the code of the following projects: [OpenAI CLIP](https://github.com/openai/CLIP), [Open CLIP](https://github.com/mlfoundations/open_clip), [CLIP Benchmark](https://github.com/LAION-AI/CLIP_benchmark), [EVA](https://github.com/baaivision/EVA/tree/master), [InternImage](https://github.com/OpenGVLab/InternImage), [ViT-Adapter](https://github.com/czczup/ViT-Adapter), [MMSegmentation](https://github.com/open-mmlab/mmsegmentation), [Transformers](https://github.com/huggingface/transformers), [DINOv2](https://github.com/facebookresearch/dinov2), [BLIP-2](https://github.com/salesforce/LAVIS/tree/main/projects/blip2), [Qwen-VL](https://github.com/QwenLM/Qwen-VL/tree/master/eval_mm), and [LLaVA-1.5](https://github.com/haotian-liu/LLaVA). Thanks for their awesome work!
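The card's `pipeline_tag: image-feature-extraction` and the 448x448 resolution mentioned above imply a 1x3x448x448 normalized tensor as model input; actual loading goes through `transformers.AutoModel.from_pretrained(..., trust_remote_code=True)` as linked under Quick Start. The sketch below only illustrates that input format with NumPy; the ImageNet-style `MEAN`/`STD` constants and the `to_model_input` helper are assumptions for illustration — check the model's `preprocessor_config.json` for the exact normalization it ships with.

```python
import numpy as np

# Assumed CLIP/ImageNet-style per-channel normalization constants.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def to_model_input(image_hwc_uint8: np.ndarray) -> np.ndarray:
    """Turn an HxWx3 uint8 image (already resized to 448x448) into a
    normalized 1x3x448x448 float32 batch, channels-first."""
    x = image_hwc_uint8.astype(np.float32) / 255.0   # [0, 255] -> [0, 1]
    x = (x - MEAN) / STD                             # per-channel normalize
    return x.transpose(2, 0, 1)[None]                # HWC -> 1xCxHxW

# A synthetic mid-gray 448x448 RGB image stands in for a real photo.
demo = np.full((448, 448, 3), 127, dtype=np.uint8)
batch = to_model_input(demo)
print(batch.shape)  # (1, 3, 448, 448)
```

Feeding such a batch (as a bfloat16 CUDA tensor) to the loaded model yields the image features this checkpoint is tagged for.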