Zyphra/Mamba-370M

yury-zyphra committed on Mar 13, 2024

Commit ba3dce0 (verified) · 1 parent: 724d7c7

Update README.md

Files changed (1)

README.md CHANGED

@@ -10,4 +10,13 @@ from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel
 model = MambaLMHeadModel.from_pretrained("Zyphra/Mamba-370M", iteration=10_000, device="cuda")
 ```
-If iteration is not specified, then the model from the root of the repository is loaded, which is the final iteration (610,351).
+If iteration is not specified, then the model from the root of the repository is loaded, which is the final iteration (610,351).
+Here is a snippet for text generation:
+```
+import transformers, torch
+tokenizer = transformers.AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
+inp_ids = torch.as_tensor([tokenizer.encode("Hello! How are you?")]).to("cuda")
+out_ids = model.generate(inp_ids, max_length=100)
+print(tokenizer.decode(out_ids[0]))
+```
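
Reviewer note: the snippet added above passes `max_length=100` directly. A minimal sketch of a wrapper follows, under the assumption that `max_length` counts the prompt tokens together with the generated ones (this matches my reading of the upstream `mamba_ssm` generation utilities, but it has not been checked against this repository). The names `total_length` and `generate_text` are hypothetical and not part of the README or the library; `model` and `tokenizer` are expected to be the objects created in the README snippets.

```python
def total_length(prompt_len: int, new_tokens: int) -> int:
    """Return the max_length needed to get `new_tokens` tokens after the prompt.

    Assumption: max_length is the total sequence length (prompt + generated).
    """
    if prompt_len < 0 or new_tokens < 0:
        raise ValueError("lengths must be non-negative")
    return prompt_len + new_tokens


def generate_text(model, tokenizer, prompt: str, new_tokens: int = 50,
                  device: str = "cuda") -> str:
    """Hypothetical wrapper around the README generation snippet."""
    # Imported lazily so total_length() stays usable without torch installed.
    import torch

    # Encode the prompt into a batch of size 1, as in the README snippet.
    inp_ids = torch.as_tensor([tokenizer.encode(prompt)]).to(device)
    # Request prompt length + new_tokens as the total length (see assumption above).
    out_ids = model.generate(
        inp_ids, max_length=total_length(inp_ids.shape[1], new_tokens)
    )
    return tokenizer.decode(out_ids[0])
```

With the objects from the README, a call would look like `generate_text(model, tokenizer, "Hello! How are you?", new_tokens=50)`.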