Update README.md
README.md
CHANGED
@@ -6,11 +6,11 @@ tags: []

## A faster half-precision version of ESM2-650 with FlashAttention2 and longer context

-FastESM is a
+FastESM is a Hugging Face-compatible, plug-in version of ESM2-650M rewritten with PyTorch's newer scaled dot product attention (SDPA) implementation.

-To
+To extend the context and improve fp16 support, we trained ESM2-650 for 50,000 additional steps in fp16 mixed precision on [OMGprot50](https://huggingface.co/datasets/tattabio/OMG_prot50), up to a sequence length of **2048**.

-Outputting attentions and predicting contacts are not possible from SDPA. Various other optimizations also make the base implementation slightly different than the
+Outputting attentions and predicting contacts are not possible with SDPA. Various other optimizations also make the base implementation slightly different from the one in transformers.

## Use with 🤗 transformers
```python
@@ -21,8 +21,8 @@ model_path = 'Synthyra/FastESM2_650'
model = AutoModel.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True).eval()
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

-
-tokenized = tokenizer(
+sequences = ['MPRTEIN', 'MSEQWENCE']
+tokenized = tokenizer(sequences, padding=True, return_tensors='pt')
with torch.no_grad():
    embeddings = model(**tokenized).last_hidden_state

@@ -56,7 +56,7 @@ _ = model.embed_dataset(
)
```
## Model probes
-We employ linear probing techniques on various PLMs and standard datasets, similar our previous [paper](https://www.biorxiv.org/content/10.1101/2024.07.30.605924v1), to access the intrinsic correlation between pooled hidden states and valuable properties.
+We employ linear probing techniques on various PLMs and standard datasets, similar to our previous [paper](https://www.biorxiv.org/content/10.1101/2024.07.30.605924v1), to assess the intrinsic correlation between pooled hidden states and valuable properties. FastESM performs very well.

The plot below showcases performance normalized between the negative control (random vector embeddings) and the best performer. Classification task scores are averaged between MCC and F1 (or F1max for multilabel) and regression tasks are averaged between Spearman rho and R2.

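The probes operate on pooled hidden states, while the usage example above returns per-token states. Below is a minimal sketch of one common pooling choice, mean pooling over non-padding tokens; the pooling actually used for the probes is not specified here, so treat the choice as an assumption.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_path = 'Synthyra/FastESM2_650'
model = AutoModel.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True).eval()
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

sequences = ['MPRTEIN', 'MSEQWENCE']
tokenized = tokenizer(sequences, padding=True, return_tensors='pt')
with torch.no_grad():
    hidden = model(**tokenized).last_hidden_state  # (batch, seq_len, 1280)

# Average token embeddings over non-padding positions to get one vector per sequence.
mask = tokenized['attention_mask'].unsqueeze(-1).to(hidden.dtype)  # (batch, seq_len, 1)
pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)              # (batch, 1280)
```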
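For readers unfamiliar with linear probing, the sketch below fits a classifier on frozen embeddings and scores it with the average of MCC and F1, as described above. The scikit-learn probe and the random placeholder arrays standing in for real pooled embeddings and labels are illustrative assumptions, not the exact evaluation pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import matthews_corrcoef, f1_score
from sklearn.model_selection import train_test_split

# X stands in for pooled FastESM embeddings (n_sequences, hidden_dim); y for class labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 1280)).astype(np.float32)
y = rng.integers(0, 2, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Linear probe: the embeddings stay frozen, only this classifier is trained.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
preds = probe.predict(X_test)

# Classification score averaged between MCC and F1, as described above.
print(0.5 * (matthews_corrcoef(y_test, preds) + f1_score(y_test, preds)))
```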
@@ -79,7 +79,7 @@ Requires PyTorch 2.5+ for the most savings, see [SDPA](https://pytorch.org/docs/
author = { Hallee, L. and Bichara, D. and Gleghorn, J. P. },
title = { FastESM2 },
year = 2024,
-url = { https://huggingface.co/Synthyra/
+url = { https://huggingface.co/Synthyra/FastESM2_650 },
doi = { 10.57967/hf/3729 },
publisher = { Hugging Face }
}