lhallee committed
Commit 430aa02 · verified · 1 Parent(s): 5ac0109

Update README.md

Files changed (1): README.md (+7 -7)
README.md CHANGED
@@ -6,11 +6,11 @@ tags: []
 
 ## A faster half-precision version of ESM2-650 with FlashAttention2 and longer context
 
- FastESM is a fully Huggingface compatible version rewritten with a newer PyTorch Attention implementation which will run FlashAttention2 when possible.
+ FastESM is a Hugging Face-compatible, plug-in version of ESM2-650M rewritten with PyTorch's newer scaled dot-product attention (SDPA) implementation, which uses FlashAttention2 when possible.
 
- To produce the FastESM weights, we trained ESM2-650 50000 additional steps in fp16 mixed precision on [OMGprot50](tattabio/OMG_prot50) up to sequence length of **2048**.
+ To enhance the weights with longer context and better fp16 support, we trained ESM2-650 for 50,000 additional steps in fp16 mixed precision on [OMGprot50](https://huggingface.co/datasets/tattabio/OMG_prot50) up to a sequence length of **2048**.
 
- Outputting attentions and predicting contacts are not possible from SDPA. Various other optimizations also make the base implementation slightly different than the HF one.
+ Outputting attentions and predicting contacts are not possible with SDPA. Various other optimizations also make the base implementation slightly different from the one in transformers.
 
 ## Use with 🤗 transformers
 ```python
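The paragraph above relies on PyTorch SDPA, which dispatches to FlashAttention2 when the hardware and dtypes allow. The snippet below is a minimal sketch of that dispatch, not taken from the README: it pins SDPA to the FlashAttention backend and assumes a CUDA device, fp16 tensors, and PyTorch 2.3+ (`sdpa_kernel` / `SDPBackend`).

```python
# Illustrative sketch (not part of the README): pin SDPA to the FlashAttention
# backend and run one attention call in half precision.
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel

# (batch, heads, seq_len, head_dim) in fp16 on a CUDA device
q = k = v = torch.randn(1, 8, 128, 64, dtype=torch.float16, device='cuda')

with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v)

print(out.shape)  # torch.Size([1, 8, 128, 64])
```

FlashAttention never materializes the full attention matrix, which is why per-position attention maps (and the contact prediction built on them) are unavailable through this path.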
@@ -21,8 +21,8 @@ model_path = 'Synthyra/FastESM2_650'
 model = AutoModel.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True).eval()
 tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
 
- sequence = 'MSEQWENCE'
- tokenized = tokenizer(sequence, return_tensors='pt')
+ sequences = ['MPRTEIN', 'MSEQWENCE']
+ tokenized = tokenizer(sequences, padding=True, return_tensors='pt')
 with torch.no_grad():
     embeddings = model(**tokenized).last_hidden_state
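Assembled for convenience, the fragments in this hunk (plus the `model_path` shown in the hunk header) form the end-to-end usage below. The imports are inferred, so treat it as a sketch rather than a verbatim copy of the README.

```python
# Sketch assembled from the diff above; imports inferred, and README lines
# outside the hunk may differ slightly.
import torch
from transformers import AutoModel, AutoTokenizer

model_path = 'Synthyra/FastESM2_650'
model = AutoModel.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True).eval()
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

sequences = ['MPRTEIN', 'MSEQWENCE']
tokenized = tokenizer(sequences, padding=True, return_tensors='pt')

with torch.no_grad():
    embeddings = model(**tokenized).last_hidden_state  # (batch, padded_len, hidden_size)

print(embeddings.shape)
```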
 
@@ -56,7 +56,7 @@ _ = model.embed_dataset(
 )
 ```
 ## Model probes
- We employ linear probing techniques on various PLMs and standard datasets, similar our previous [paper](https://www.biorxiv.org/content/10.1101/2024.07.30.605924v1), to access the intrinsic correlation between pooled hidden states and valuable properties. ESMC (and thus ESM++) perform very well.
+ We employ linear probing techniques on various PLMs and standard datasets, similar to our previous [paper](https://www.biorxiv.org/content/10.1101/2024.07.30.605924v1), to assess the intrinsic correlation between pooled hidden states and valuable properties. FastESM performs very well.
 
 The plot below showcases performance normalized between the negative control (random vector embeddings) and the best performer. Classification task scores are averaged between MCC and F1 (or F1max for multilabel) and regression task scores are averaged between Spearman rho and R2.
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62f2bd3bdb7cbd214b658c48/4BvJwkXRFSGbMVqMksS8O.png)
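As a reading aid for the plot described above, here is a minimal sketch of the normalization it mentions; the function name and the numbers are illustrative, not taken from the probing code.

```python
# Illustrative only: map a probe score onto the plot's normalized axis,
# where 0 is the random-embedding control and 1 is the best performer.
def normalize_score(score: float, random_baseline: float, best: float) -> float:
    return (score - random_baseline) / (best - random_baseline)

# e.g. a model scoring 0.70 between a 0.40 control and a 0.90 best performer
print(round(normalize_score(0.70, random_baseline=0.40, best=0.90), 3))  # 0.6
```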
@@ -79,7 +79,7 @@ Requires PyTorch 2.5+ for the most savings, see [SDPA](https://pytorch.org/docs/
     author = { Hallee, L. and Bichara, D. and Gleghorn, J. P. },
     title = { FastESM2 },
     year = 2024,
-    url = { https://huggingface.co/Synthyra/FastESM2 },
+    url = { https://huggingface.co/Synthyra/FastESM2_650 },
     doi = { 10.57967/hf/3729 },
     publisher = { Hugging Face }
 }
 