Update README.md
README.md
CHANGED
@@ -266,7 +266,7 @@ If your GPU allows, load and run inference in half precision (`torch.float16` or
 
 ```diff
 model = AutoModelForVision2Seq.from_pretrained(
-    "lamm-mit/Cephalo-Idefics-2-vision-
+    "lamm-mit/Cephalo-Idefics-2-vision-10b-alpha",
 +    torch_dtype=torch.float16,
 ).to(DEVICE)
 ```
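For reference, the complete half-precision load this hunk produces would look roughly as follows. This is a sketch: `DEVICE` and the `AutoProcessor` line are not part of the diff and are assumed to follow the standard Idefics2 loading example.

```python
# Minimal sketch of the half-precision setup after this change.
# DEVICE and the processor line are assumptions, not shown in the diff.
import torch
from transformers import AutoProcessor, AutoModelForVision2Seq

DEVICE = "cuda:0"

processor = AutoProcessor.from_pretrained("lamm-mit/Cephalo-Idefics-2-vision-10b-alpha")
model = AutoModelForVision2Seq.from_pretrained(
    "lamm-mit/Cephalo-Idefics-2-vision-10b-alpha",
    torch_dtype=torch.float16,  # halves memory use vs. float32
).to(DEVICE)
```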
@@ -287,7 +287,7 @@ Make sure to install `flash-attn`. Refer to the [original repository of Flash Att
 
 ```diff
 model = AutoModelForVision2Seq.from_pretrained(
-    "lamm-mit/Cephalo-Idefics-2-vision-
+    "lamm-mit/Cephalo-Idefics-2-vision-10b-alpha",
 +    torch_dtype=torch.bfloat16,
 +    _attn_implementation="flash_attention_2",
 ).to(DEVICE)
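Correspondingly, the full Flash Attention 2 variant would look roughly like the sketch below. It requires a CUDA GPU with `flash-attn` installed, and `DEVICE` is again an assumption not shown in the hunk.

```python
# Sketch of the Flash Attention 2 load after this change. Requires
# `flash-attn`; bfloat16 is used because the kernels run in half precision.
import torch
from transformers import AutoModelForVision2Seq

DEVICE = "cuda:0"  # assumption: single-GPU setup as elsewhere in the README

model = AutoModelForVision2Seq.from_pretrained(
    "lamm-mit/Cephalo-Idefics-2-vision-10b-alpha",
    torch_dtype=torch.bfloat16,
    _attn_implementation="flash_attention_2",
).to(DEVICE)
```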
@@ -298,7 +298,7 @@ model = AutoModelForVision2Seq.from_pretrained(
 **4 bit quantization with bitsandbytes**
 
 <details><summary>Click to expand.</summary>
-It is possible to load 
+It is possible to load Cephalo-Idefics-2-vision-10b-alpha in 4bits with `bitsandbytes`. Make sure that you have `accelerate` and `bitsandbytes` installed.
 
 ```diff
 + from transformers import BitsAndBytesConfig
@@ -310,7 +310,7 @@ quantization_config = BitsAndBytesConfig(
     bnb_4bit_compute_dtype=torch.bfloat16
 )
 model = AutoModelForVision2Seq.from_pretrained(
-    "lamm-mit/Cephalo-Idefics-2-vision-
+    "lamm-mit/Cephalo-Idefics-2-vision-10b-alpha",
 +    torch_dtype=torch.bfloat16,
 +    quantization_config=quantization_config,
 ).to(DEVICE)
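Putting the last two hunks together, a self-contained 4-bit load would look roughly like the sketch below. Only `bnb_4bit_compute_dtype` and the two `+` lines are confirmed by the diff; the `nf4` quant type and double quantization are assumptions matching the common Idefics2 example, and `device_map="auto"` stands in for the README's `.to(DEVICE)` because recent `transformers` versions refuse to move a 4-bit-quantized model with `.to()`.

```python
# Sketch of the full 4-bit bitsandbytes load. Requires `accelerate`
# and `bitsandbytes`. Flagged lines are assumptions not in the diff.
import torch
from transformers import AutoModelForVision2Seq, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",        # assumption: not visible in the hunk
    bnb_4bit_use_double_quant=True,   # assumption: not visible in the hunk
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForVision2Seq.from_pretrained(
    "lamm-mit/Cephalo-Idefics-2-vision-10b-alpha",
    torch_dtype=torch.bfloat16,
    quantization_config=quantization_config,
    device_map="auto",  # 4-bit models cannot be moved with .to()
)
```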