BerenMillidge committed
Commit
8dd106c
1 Parent(s): 88a4c87

Update README.md

Files changed (1): README.md (+13, −0)
README.md CHANGED
@@ -7,6 +7,8 @@ Zamba-7B-v1 is a hybrid model between Mamba, a state-space model, and transforme
 
 Note: the current Huggingface implementation of Zamba performs slower than our internal implementation. We are working to fix this with the Huggingface team.
 
+Our technical report describing the training of Zamba is available [here](https://arxiv.org/abs/2405.16712).
+
 ## Quick start
 
 ### Prerequisites
@@ -43,6 +45,17 @@ outputs = model.generate(**input_ids, max_new_tokens=100)
 print(tokenizer.decode(outputs[0]))
 ```
 
+## Citation
+
+If you find Zamba useful in your work, please cite it as:
+
+@article{glorioso2024zamba,
+  title={Zamba: A Compact 7B SSM Hybrid Model},
+  author={Glorioso, Paolo and Anthony, Quentin and Tokpanov, Yury and Whittington, James and Pilault, Jonathan and Ibrahim, Adam and Millidge, Beren},
+  journal={arXiv preprint arXiv:2405.16712},
+  year={2024}
+}
+
 ## Notice
 
 Zamba is a pretrained base model and therefore does not have any moderation mechanism. In addition, one should not expect good chat performance, as this model was not fine-tuned for chat.
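The diff only shows fragments of the quick-start flow (`model.generate(**input_ids, ...)` and `tokenizer.decode(outputs[0])`). As a sketch of how those fragments fit together, assuming the standard `transformers` `AutoTokenizer`/`AutoModelForCausalLM` API; the model id `Zyphra/Zamba-7B-v1`, the `device_map`/`torch_dtype` arguments, and the prompt are assumptions, not the README's verbatim code:

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=100):
    """Tokenize a prompt, generate a continuation, and decode it back to text."""
    # Matches the fragments in the diff: generate(**input_ids, max_new_tokens=...)
    # followed by tokenizer.decode(outputs[0]).
    input_ids = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0])


if __name__ == "__main__":
    # Requires `pip install torch transformers` and enough memory for a 7B model.
    # The model id "Zyphra/Zamba-7B-v1" is an assumption based on the model name.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("Zyphra/Zamba-7B-v1")
    model = AutoModelForCausalLM.from_pretrained(
        "Zyphra/Zamba-7B-v1", device_map="auto", torch_dtype=torch.bfloat16
    )
    print(generate_text(model, tokenizer, "What is a state-space model?"))
```

Since Zamba is a pretrained base model (see the Notice), a plain completion prompt like this is more appropriate than a chat-style prompt.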