lhallee committed on
Commit 9b6e7c5 · verified · 1 Parent(s): 8c20d0d

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -6,11 +6,12 @@ tags: []
 
 ## A faster half-precision version of ESM2-650 with FlashAttention2 and longer context
 
- FastESM is a Huggingface compatible plug in version of ESM2-650M rewritten with a newer PyTorch Attention implementation.
+ FastESM is a Huggingface-compatible plug-in version of ESM2-650M rewritten with a newer PyTorch attention implementation.
 
 To enhance the weights with longer context and better fp16 support, we trained ESM2-650 for 50,000 additional steps with a traditional MLM objective (20% masking) in fp16 mixed precision on [OMGprot50](tattabio/OMG_prot50) up to a sequence length of **2048**.
 
- Outputting attentions and predicting contacts are not possible from SDPA. Various other optimizations also make the base implementation slightly different than the one in transformers.
+ Outputting attention maps (or using the contact prediction head) is not natively possible with SDPA. You can still pass `output_attentions` to have attention calculated manually and returned.
+ Various other optimizations also make the base implementation slightly different from the one in transformers.
 
 ## Use with 🤗 transformers
 ```python
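
Since the updated text turns on the difference between fused SDPA and manually materialized attention, here is a minimal sketch of that distinction using only public PyTorch APIs; the tensor shapes are illustrative and this is not the model's actual fallback code:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: (batch, heads, seq_len, head_dim)
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 8, 16, 64)
v = torch.randn(1, 8, 16, 64)

# Fused path: scaled_dot_product_attention never materializes the
# softmax(QK^T / sqrt(d)) matrix, which is why it cannot return
# attention maps.
out_fused = F.scaled_dot_product_attention(q, k, v)

# Manual path (what an output_attentions fallback has to do):
# materialize the full (seq_len x seq_len) attention weights.
scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
weights = scores.softmax(dim=-1)
out_manual = weights @ v

# The two paths agree up to floating-point tolerance.
print(torch.allclose(out_fused, out_manual, atol=1e-5))
```

Skipping that materialization is where the fused kernel's speed and memory savings come from, so requesting attention maps gives up part of the benefit.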
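The snippet under "Use with 🤗 transformers" is truncated in this diff. As a hedged sketch only, loading a remote-code checkpoint like this one typically looks as follows; the repo id `Synthyra/FastESM2_650` and the example sequence are assumptions, not taken from the README:

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "Synthyra/FastESM2_650"  # hypothetical repo id, not confirmed by this diff
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)  # custom modeling code

sequence = "MALWMRLLPLLALLALWGPDPAAA"  # toy protein sequence
inputs = tokenizer(sequence, return_tensors="pt")

with torch.no_grad():
    # Per the README text above, output_attentions=True triggers the
    # manual attention fallback so that maps can be returned.
    outputs = model(**inputs, output_attentions=True)

print(outputs.last_hidden_state.shape)
print(len(outputs.attentions))  # one (batch, heads, seq, seq) map per layer
```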