ChatterjeeLab/MeMDLM

feature-extraction

sgoel30 committed on Oct 21, 2024

Commit c3c0264 · verified · 1 Parent(s): 56efb02

Update README.md

Files changed (1)

README.md +2 -4

@@ -24,7 +24,7 @@ pipeline_tag: fill-mask
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65bbea9a26c639b000501321/uWW6xnJZwQFWDS1QZNQTm.png)
-Masked Diffusion Language Models (MDLMs), introduced by Sahoo et al (arxiv.org/pdf/2406.07524), provide strong generative capabilities to BERT-style models. In this work, we pre-train and fine-tune ESM-2-150M protein language model (pLM) on the MDLM objective to scaffold functional motifs and unconditionally generate realistic, high-quality membrane protein sequences.
+[Masked Diffusion Language Models (MDLMs)](arxiv.org/pdf/2406.07524), introduced by Sahoo et al, provide strong generative capabilities to BERT-style models. In this work, we pre-train and fine-tune ESM-2-150M protein language model (pLM) on the MDLM objective to scaffold functional motifs and unconditionally generate realistic, high-quality membrane protein sequences.
 ## Model Usage
@@ -42,6 +42,4 @@ inputs = tokenizer(input_sequence, return_tensors="pt")
 output = model(**inputs)
 filled_protein_seq = tokenizer.decode(output.squeeze()) # contains the output protein sequence with filled mask tokens
-```
-This backbone model can be integrated with the [MDLM formulation](https://github.com/kuleshov-group/mdlm) by setting the model backbone type to "hf_dit" and setting the HuggingFace Model ID to "ChatterjeeLab/MeMDLM"
+```