Update README.md
README.md
CHANGED
@@ -24,7 +24,7 @@ pipeline_tag: fill-mask
 
 
 
-Masked Diffusion Language Models (MDLMs)
+[Masked Diffusion Language Models (MDLMs)](arxiv.org/pdf/2406.07524), introduced by Sahoo et al, provide strong generative capabilities to BERT-style models. In this work, we pre-train and fine-tune ESM-2-150M protein language model (pLM) on the MDLM objective to scaffold functional motifs and unconditionally generate realistic, high-quality membrane protein sequences.
 
 ## Model Usage
 
@@ -42,6 +42,4 @@ inputs = tokenizer(input_sequence, return_tensors="pt")
 output = model(**inputs)
 
 filled_protein_seq = tokenizer.decode(output.squeeze()) # contains the output protein sequence with filled mask tokens
-```
-
-This backbone model can be integrated with the [MDLM formulation](https://github.com/kuleshov-group/mdlm) by setting the model backbone type to "hf_dit" and setting the HuggingFace Model ID to "ChatterjeeLab/MeMDLM"
+```
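
The context lines in the second hunk pass `output` straight to `tokenizer.decode`. For a fill-mask model there is usually a step in between: taking per-position scores over the vocabulary and choosing a token for each masked position. The sketch below illustrates that step only, under stated assumptions. It uses plain Python lists instead of the real model and tokenizer, and `VOCAB`, `MASK_ID`, `fill_masks`, and `decode` are hypothetical names that do not appear in the README. Skipping the mask token during the argmax is a design choice made here, not something the README states.

```python
# Hypothetical toy vocabulary; the real tokenizer's ids and tokens differ.
VOCAB = ["<mask>", "M", "K", "L", "V"]
MASK_ID = 0


def fill_masks(token_ids, logits, mask_id=MASK_ID):
    """Replace each masked position with the highest-scoring non-mask token id.

    token_ids: list of int, the input sequence including mask ids.
    logits: list of lists of float, one row of vocabulary scores per position.
    """
    filled = []
    for tok, row in zip(token_ids, logits):
        if tok == mask_id:
            # Greedy choice: argmax over the vocabulary, excluding the mask token.
            candidates = [i for i in range(len(row)) if i != mask_id]
            tok = max(candidates, key=lambda i: row[i])
        filled.append(tok)
    return filled


def decode(token_ids):
    """Map token ids back to a string using the toy vocabulary."""
    return "".join(VOCAB[i] for i in token_ids)


token_ids = [1, 0, 3, 0]  # "M <mask> L <mask>"
logits = [
    [0.0, 5.0, 0.1, 0.1, 0.1],  # unmasked position, left unchanged
    [0.2, 0.1, 3.0, 0.5, 0.4],  # favours "K"
    [0.0, 0.1, 0.1, 4.0, 0.1],  # unmasked position, left unchanged
    [9.0, 0.1, 0.2, 0.3, 2.0],  # mask token scores highest but is skipped
]
filled_protein_seq = decode(fill_masks(token_ids, logits))
```

With the real model, the same idea would apply to whatever tensor of logits the forward pass returns; whether `output` is that tensor or an object wrapping it is not shown in this diff.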