changed use_flash_attention_2=True to attn_implementation="flash_attention_2"
#53
by macadeliccc - opened
README.md CHANGED
@@ -271,7 +271,7 @@ To do so, you first need to install [Flash Attention](https://github.com/Dao-AIL
 pip install flash-attn --no-build-isolation
 ```

-and then all you have to do is to pass `use_flash_attention_2=True` to `from_pretrained`:
+and then all you have to do is to pass `attn_implementation="flash_attention_2"` to `from_pretrained`:

 ```diff
 - model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True)
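For reference, a minimal sketch of what the updated call looks like with the new-style argument. The `model_id` value and dtype selection below are placeholders assumed for illustration (the README defines its own `model_id` and `torch_dtype` earlier); only the `attn_implementation="flash_attention_2"` keyword reflects the change in this diff.

```python
import torch
from transformers import AutoModelForSpeechSeq2Seq

# Placeholder values for illustration; the README defines its own model_id / torch_dtype.
model_id = "openai/whisper-large-v3"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# attn_implementation="flash_attention_2" replaces the older use_flash_attention_2=True flag.
# Flash Attention 2 requires the flash-attn package, a supported GPU, and fp16/bf16 weights.
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id,
    torch_dtype=torch_dtype,
    low_cpu_mem_usage=True,
    use_safetensors=True,
    attn_implementation="flash_attention_2",
)
```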