Spaces: Tonic/Llava-Video

Tonic committed on Oct 4, 2024

Commit 1488686 (unverified) · 1 Parent(s): 7e5cbb7

add demo

Files changed (1)

app.py CHANGED

@@ -30,11 +30,12 @@ description1 ="""The **🌋📹LLaVA-Video-7B-Qwen2** is a 7B parameter model  t
 This model leverages the **SO400M vision backbone** for visual input and Qwen2 for language processing, making it highly efficient in multi-modal reasoning, including visual and video-based tasks.
 🌋📹LLaVA-Video has larger variants of [32B](https://huggingface.co/lmms-lab/LLaVA-NeXT-Video-32B-Qwen) and [72B](https://huggingface.co/lmms-lab/LLaVA-Video-72B-Qwen2) and with a [variant](https://huggingface.co/lmms-lab/LLaVA-Video-7B-Qwen2-Video-Only) only trained on the new synthetic data
 For further details, please visit the [Project Page](https://github.com/LLaVA-VL/LLaVA-NeXT) or check out the corresponding [research paper](https://arxiv.org/abs/2410.02713).
-"""
-description2 ="""- **Architecture**: `LlavaQwenForCausalLM`
 - **Attention Heads**: 28
 - **Hidden Layers**: 28
 - **Hidden Size**: 3584
 - **Intermediate Size**: 18944
 - **Max Frames Supported**: 64
 - **Languages Supported**: English, Chinese

 This model leverages the **SO400M vision backbone** for visual input and Qwen2 for language processing, making it highly efficient in multi-modal reasoning, including visual and video-based tasks.
 🌋📹LLaVA-Video has larger variants of [32B](https://huggingface.co/lmms-lab/LLaVA-NeXT-Video-32B-Qwen) and [72B](https://huggingface.co/lmms-lab/LLaVA-Video-72B-Qwen2) and with a [variant](https://huggingface.co/lmms-lab/LLaVA-Video-7B-Qwen2-Video-Only) only trained on the new synthetic data
 For further details, please visit the [Project Page](https://github.com/LLaVA-VL/LLaVA-NeXT) or check out the corresponding [research paper](https://arxiv.org/abs/2410.02713).
+- **Architecture**: `LlavaQwenForCausalLM`
 - **Attention Heads**: 28
 - **Hidden Layers**: 28
 - **Hidden Size**: 3584
+"""
+description2 ="""
 - **Intermediate Size**: 18944
 - **Max Frames Supported**: 64
 - **Languages Supported**: English, Chinese