TryingHard
committed on
Ovis1.6-Gemma2-9B-GPTQ-Int4 readme v1
README.md
CHANGED
@@ -10,7 +10,7 @@ language:
 - en
 ---

-# Ovis1.6-Gemma2-9B
 <div align="center">
 <img src=https://cdn-uploads.huggingface.co/production/uploads/637aebed7ce76c3b834cea37/3IK823BZ8w-mz_QfeYkDn.png width="30%"/>
 </div>
@@ -32,28 +32,42 @@ Built upon Ovis1.5, **Ovis1.6** further enhances high-resolution image processin
 |:------------------|:-----------:|:------------------:|:---------------------------------------------------------------:|:----------------------------------------------------------------:|
 | Ovis1.6-Gemma2-9B | Siglip-400M | Gemma2-9B-It | [Huggingface](https://huggingface.co/AIDC-AI/Ovis1.6-Gemma2-9B) | [Space](https://huggingface.co/spaces/AIDC-AI/Ovis1.6-Gemma2-9B) |

-##
-
-
-<div align="center">
-<img src="https://cdn-uploads.huggingface.co/production/uploads/637aebed7ce76c3b834cea37/ro7nBJmhHQMZYePZmmFJd.png" width="100%" />
-</div>

-
-
 ```bash
-
 ```
 ```python
 import torch
 from PIL import Image
-from transformers import

 # load model
-
-
-
-
 text_tokenizer = model.get_text_tokenizer()
 visual_tokenizer = model.get_visual_tokenizer()

@@ -140,6 +154,12 @@ for i in range(len(batch_input_ids)):
 ```
 </details>

 ## Citation
 If you find Ovis useful, please cite the paper
 ```
@@ -10,7 +10,7 @@ language:
 - en
 ---

+# Ovis1.6-Gemma2-9B-GPTQ-Int4
 <div align="center">
 <img src=https://cdn-uploads.huggingface.co/production/uploads/637aebed7ce76c3b834cea37/3IK823BZ8w-mz_QfeYkDn.png width="30%"/>
 </div>
@@ -32,28 +32,42 @@ Built upon Ovis1.5, **Ovis1.6** further enhances high-resolution image processin
 |:------------------|:-----------:|:------------------:|:---------------------------------------------------------------:|:----------------------------------------------------------------:|
 | Ovis1.6-Gemma2-9B | Siglip-400M | Gemma2-9B-It | [Huggingface](https://huggingface.co/AIDC-AI/Ovis1.6-Gemma2-9B) | [Space](https://huggingface.co/spaces/AIDC-AI/Ovis1.6-Gemma2-9B) |

+## Quantized Model: GPTQ-Int4
+We quantized Ovis1.6 with AutoGPTQ. Follow these steps to run it.

+### Installation
+1. Run the following commands to get a basic environment. Be sure to run with CUDA 12.1.
+```bash
+conda create -n <your_env_name> python=3.10
+conda activate <your_env_name>
+pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
+pip install numpy==1.24.3 transformers==4.44.2 pillow==10.3.0 gekko pandas
+```
+2. Build AutoGPTQ: We customized AutoGPTQ to support Ovis model quantization. You need to build from source to install the customized version.
 ```bash
+git clone https://github.com/kq-chen/AutoGPTQ.git
+cd AutoGPTQ
+pip install -vvv --no-build-isolation -e .
 ```
+Check [this](https://github.com/AutoGPTQ/AutoGPTQ/issues/194) first if you are building inside a Docker environment.
+
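To confirm that an environment matches the versions pinned in step 1, a small check along these lines may help. It is a minimal sketch and not part of this commit or the Ovis repository: the helper names `installed_version` and `check_pins` are hypothetical, and it relies only on the standard library's `importlib.metadata`. The pins are copied from the `pip install` commands above; `gekko` and `pandas` are left out because the README does not pin them.

```python
from importlib import metadata

# Pins copied from the pip commands in step 1; adjust if the install step changes.
PINNED = {
    "torch": "2.2.1",
    "torchvision": "0.17.1",
    "torchaudio": "2.2.1",
    "numpy": "1.24.3",
    "transformers": "4.44.2",
    "pillow": "10.3.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Compare installed versions with the pins and return a list of mismatches.

    Local build tags such as "+cu121" are ignored, so "2.2.1+cu121" matches "2.2.1".
    """
    problems = []
    for name, wanted in pins.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed (expected {wanted})")
        elif found.split("+")[0] != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    issues = check_pins(PINNED)
    print("\n".join(issues) if issues else "All pinned packages match.")
```

Run it inside the activated conda environment. The `lookup` parameter exists only so the comparison can be exercised without the packages installed.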
+### Usage
+Below is a code snippet to run Ovis1.6-Gemma2-9B-GPTQ-Int4 with multimodal inputs. For additional usage instructions, including inference wrapper and Gradio UI, please refer to [Ovis GitHub](https://github.com/AIDC-AI/Ovis?tab=readme-ov-file#inference).
 ```python
 import torch
 from PIL import Image
+from transformers import GenerationConfig
+from auto_gptq.modeling import OvisGPTQForCausalLM

 # load model
+load_device = "cuda:0" # customize load device
+model = OvisGPTQForCausalLM.from_pretrained(
+    "TryingHard/Ovis1.6-Gemma2-9B-GPTQ-Int4",
+    device=load_device,
+    multimodal_max_length=8192,
+    trust_remote_code=True
+)
+model.model.generation_config = GenerationConfig.from_pretrained("TryingHard/Ovis1.6-Gemma2-9B-GPTQ-Int4")
 text_tokenizer = model.get_text_tokenizer()
 visual_tokenizer = model.get_visual_tokenizer()

@@ -140,6 +154,12 @@ for i in range(len(batch_input_ids)):
 ```
 </details>

+
+## Performance
+Here we report the performance of Ovis1.6-Gemma2-9B-GPTQ-Int4. The results are obtained with VLMEvalkit.
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/645cb4b4a03f3ebb0bde20e0/pSKiBhCy1S6Fb1QODY_ZZ.png)
+
+
 ## Citation
 If you find Ovis useful, please cite the paper
 ```