Commit 976b0b8
Parent(s): c767ad8

Update README.md

README.md CHANGED
@@ -3,6 +3,8 @@ language:
 - en
 license: llama2
 pipeline_tag: image-text-to-text
+datasets:
+- lmms-lab/VideoChatGPT
 ---
 
 # LLaVA-NeXT-Video Model Card
@@ -18,7 +20,7 @@ LLaVA-Next-Video is an open-source chatbot trained by fine-tuning LLM on multimo
 The model is a current SOTA among open-source models on [VideoMME bench](https://arxiv.org/abs/2405.21075).
 Base LLM: [lmsys/vicuna-7b-v1.5](https://huggingface.co/lmsys/vicuna-13b-v1.5)
 
-
+![llava_next_video_arch](demo.png)
 
 
 **Model date:**
@@ -231,5 +233,4 @@ If you find our paper and code useful in your research:
 month={January},
 year={2024}
 }
-```
-
+```
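The substantive change in this commit is the new `datasets` key in the card's YAML frontmatter, which links the card to lmms-lab/VideoChatGPT. As a minimal sketch, that declared dataset could be pulled with the Hugging Face `datasets` library; whether the repo exposes a default, directly loadable config, and what its splits are called, is an assumption rather than something this commit specifies:

```python
# Minimal sketch: fetch the dataset declared in the card's new
# `datasets:` frontmatter key. Assumes `pip install datasets` and that
# lmms-lab/VideoChatGPT exposes a default, directly loadable config;
# check the dataset repo for its actual configs and splits.
from datasets import load_dataset

ds = load_dataset("lmms-lab/VideoChatGPT")  # repo id taken from the diff above
print(ds)  # inspect the available splits and features
```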
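The `pipeline_tag: image-text-to-text` line in the same frontmatter hints at how the model is meant to be queried. Purely as an illustration, here is a sketch of video inference through the transformers port of LLaVA-NeXT-Video; the checkpoint id `llava-hf/LLaVA-NeXT-Video-7B-hf` is an assumption (a community-converted checkpoint, not named anywhere on this commit page), and the API shown is the transformers integration rather than this repo's original LLaVA codebase:

```python
# Hypothetical sketch, not taken from this commit: runs the transformers
# port of LLaVA-NeXT-Video on a short clip. The checkpoint id below is
# an assumption; this commit page never names a transformers-ready repo.
import numpy as np
from transformers import (
    LlavaNextVideoForConditionalGeneration,
    LlavaNextVideoProcessor,
)

model_id = "llava-hf/LLaVA-NeXT-Video-7B-hf"  # assumed converted checkpoint
processor = LlavaNextVideoProcessor.from_pretrained(model_id)
model = LlavaNextVideoForConditionalGeneration.from_pretrained(model_id)

# Stand-in clip: 8 random RGB frames; in practice, sample frames from a
# real video with a reader such as PyAV or decord.
clip = np.random.randint(0, 256, (8, 336, 336, 3), dtype=np.uint8)

conversation = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is happening in this video?"},
            {"type": "video"},
        ],
    }
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(text=prompt, videos=clip, return_tensors="pt")

out = model.generate(**inputs, max_new_tokens=60)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```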