Model | #Params | d_model | layers | lm loss uniref-100
--- | --- | --- | --- | ---
[XLarge](https://huggingface.co/lightonai/RITA_xl) | 1.2B | 2048 | 24 | 1.70

For full results see our preprint: https://arxiv.org/abs/2205.05789
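As a quick sanity check on the loss column above, a language-model loss can be converted to a per-token perplexity via `exp(loss)` — assuming the reported lm loss is mean per-token cross-entropy in nats (an assumption here; see the preprint for the exact metric definition):

``` python
import math

# Reported uniref-100 lm loss for RITA XLarge (from the table above).
xl_loss = 1.70

# If the loss is mean per-token cross-entropy in nats (assumption),
# per-token perplexity is exp(loss).
perplexity = math.exp(xl_loss)
print(f"XLarge per-token perplexity ≈ {perplexity:.2f}")  # ≈ 5.47
```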

## Usage
Instantiate a model like so:
``` python
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("lightonai/RITA_l", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("lightonai/RITA_l")
```

For generation we support pipelines:
``` python
from transformers import pipeline

rita_gen = pipeline('text-generation', model=model, tokenizer=tokenizer)
sequences = rita_gen("MAB", max_length=20, do_sample=True, top_k=950, repetition_penalty=1.2,
                     num_return_sequences=2, eos_token_id=2)
for seq in sequences:
    print(f"seq: {seq['generated_text'].replace(' ', '')}")
```
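The `text-generation` pipeline returns a list of dicts keyed by `generated_text`, and the tokenizer emits space-separated residues, hence the `replace(' ', '')` above. A minimal sketch of that post-processing step, using a hypothetical pipeline output so it runs without downloading the model:

``` python
# Hypothetical pipeline output (the real one comes from rita_gen above).
sequences = [
    {"generated_text": "M A B V L K"},
    {"generated_text": "M A B G G"},
]

# Strip the spaces the tokenizer inserts between residues.
cleaned = [seq["generated_text"].replace(" ", "") for seq in sequences]
print(cleaned)  # → ['MABVLK', 'MABGG']
```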

## How to cite

``` bibtex
@misc{RITA2022,
  doi = {10.48550/ARXIV.2205.05789},
  url = {https://arxiv.org/abs/2205.05789},
  author = {Hesslow, Daniel and Zanichelli, Niccoló and Notin, Pascal and Poli, Iacopo and Marks, Debora},
  title = {RITA: a Study on Scaling Up Generative Protein Sequence Models},
  publisher = {arXiv},
  year = {2022},
}
```