Update README.md
README.md CHANGED
@@ -6,11 +6,12 @@ tags: []
## A faster half-precision version of ESM2-650 with FlashAttention2 and longer context
-FastESM is a Huggingface compatible plug in version of ESM2-650M rewritten with a newer PyTorch
+FastESM is a Huggingface-compatible plug-in version of ESM2-650M rewritten with a newer PyTorch attention implementation.
To enhance the weights with longer context and better fp16 support, we trained ESM2-650 for 50,000 additional steps with a traditional MLM objective (20% masking) in fp16 mixed precision on [OMGprot50](tattabio/OMG_prot50) up to a sequence length of **2048**.
-Outputting
+Outputting attention maps (or the contact prediction head) is not natively possible with SDPA. You can still pass `output_attentions` to have attention calculated manually and returned.
+Various other optimizations also make the base implementation slightly different than the one in transformers.
## Use with 🤗 transformers
```python
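# Not part of the commit: the diff hunk ends at the opening fence above,
# so the README's actual usage example is cut off here. What follows is a
# minimal sketch of the usage the new text describes. The repo id
# "Synthyra/FastESM2_650" is an assumption, as is loading via AutoModel
# with trust_remote_code.
import torch
from transformers import AutoModel, AutoTokenizer

model_path = "Synthyra/FastESM2_650"  # assumed repo id
device = "cuda" if torch.cuda.is_available() else "cpu"
# fp16 is the advertised precision; fall back to fp32 on CPU
dtype = torch.float16 if device == "cuda" else torch.float32

model = AutoModel.from_pretrained(
    model_path,
    torch_dtype=dtype,
    trust_remote_code=True,  # the model ships its own SDPA-based code
).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_path)

inputs = tokenizer("MSKGEELFTGVVPILVELDGDVNGHK", return_tensors="pt").to(device)
with torch.no_grad():
    # SDPA does not return attention weights natively; per the README,
    # output_attentions=True triggers a manual attention computation instead
    outputs = model(**inputs, output_attentions=True)

print(outputs.last_hidden_state.shape)  # (1, seq_len + 2, hidden_dim)
print(len(outputs.attentions))          # one attention map per layer
```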