gmastrapas committed
Commit 9deac5f
Parent(s): ced12ef

docs: update README on xformers and flash-attn
README.md CHANGED

@@ -389,6 +389,14 @@ _, _, text_embeddings, image_embeddings = output
 
 </details>
 
+### On CUDA devices
+
+On a CUDA enabled torch environment, the model comes in `torch.bfloat16`
+precision by default. When running on CUDA, it is recommended to install
+[FlashAttention](https://github.com/Dao-AILab/flash-attention?tab=readme-ov-file#installation-and-features)
+and [xFormers](https://github.com/facebookresearch/xformers?tab=readme-ov-file#installing-xformers)
+to make use of their efficient attention mechanism implementations.
+
 
 ## License
 
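For context, the added paragraph is a setup recommendation: install the two attention backends and load the model on CUDA, where it defaults to `torch.bfloat16`. Below is a minimal sketch of that setup, assuming the model is loaded through Hugging Face `transformers` with `trust_remote_code=True`; the repository id `org/model-id` and the loading pattern are illustrative assumptions, not taken from this commit.

```python
# Sketch of the setup recommended in the README change above.
# Assumptions (not part of the commit): the model is loaded via Hugging Face
# transformers with trust_remote_code=True, and "org/model-id" is a placeholder.
#
# Suggested installs for efficient attention on CUDA (see links in the diff):
#   pip install flash-attn --no-build-isolation
#   pip install xformers
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("org/model-id", trust_remote_code=True)
model = model.to("cuda")

# Per the README text, the model comes in torch.bfloat16 precision by default
# in a CUDA-enabled torch environment.
print(next(model.parameters()).dtype)
```

If FlashAttention and xFormers are installed, the repository's attention layers can pick up their efficient kernels; without them the model still runs, just with the standard PyTorch attention path.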