Commit 1488686 (unverified), committed by Tonic
1 Parent(s): 7e5cbb7
Files changed (1)
  1. app.py +3 -2
app.py CHANGED
@@ -30,11 +30,12 @@ description1 ="""The **🌋📹LLaVA-Video-7B-Qwen2** is a 7B parameter model t
 This model leverages the **SO400M vision backbone** for visual input and Qwen2 for language processing, making it highly efficient in multi-modal reasoning, including visual and video-based tasks.
 🌋📹LLaVA-Video has larger variants of [32B](https://huggingface.co/lmms-lab/LLaVA-NeXT-Video-32B-Qwen) and [72B](https://huggingface.co/lmms-lab/LLaVA-Video-72B-Qwen2) and with a [variant](https://huggingface.co/lmms-lab/LLaVA-Video-7B-Qwen2-Video-Only) only trained on the new synthetic data
 For further details, please visit the [Project Page](https://github.com/LLaVA-VL/LLaVA-NeXT) or check out the corresponding [research paper](https://arxiv.org/abs/2410.02713).
-"""
-description2 ="""- **Architecture**: `LlavaQwenForCausalLM`
+- **Architecture**: `LlavaQwenForCausalLM`
 - **Attention Heads**: 28
 - **Hidden Layers**: 28
 - **Hidden Size**: 3584
+"""
+description2 ="""
 - **Intermediate Size**: 18944
 - **Max Frames Supported**: 64
 - **Languages Supported**: English, Chinese
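
The net effect of this hunk is only a change in where the two Markdown strings are split: the first four spec bullets move into `description1`, and `description2` now starts at **Intermediate Size**. No spec value changes. A minimal sketch of the resulting layout follows, under stated assumptions: the prose paragraphs of `description1` are elided, the closing `"""` of `description2` lies outside the hunk and is assumed here to follow directly, and `bullet_count` is a hypothetical helper added only for illustration (it is not part of `app.py`).

```python
# Sketch of the two strings after commit 1488686.
# Assumption: the prose lines that precede the bullets in description1 are elided.
description1 = """- **Architecture**: `LlavaQwenForCausalLM`
- **Attention Heads**: 28
- **Hidden Layers**: 28
- **Hidden Size**: 3584
"""

# Assumption: the diff does not show where description2 ends; it is closed
# right after the last visible bullet purely for this illustration.
description2 = """
- **Intermediate Size**: 18944
- **Max Frames Supported**: 64
- **Languages Supported**: English, Chinese
"""


def bullet_count(text: str) -> int:
    """Hypothetical helper: count Markdown bullet lines in a description string."""
    return sum(1 for line in text.splitlines() if line.startswith("- "))
```

Calling `bullet_count` on each string gives a quick check that the seven visible spec bullets are now divided four and three between the two strings.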