qanthony-z committed
Commit 7274961
Parent(s): 1cb4251
update bar charts

README.md CHANGED
@@ -54,11 +54,12 @@ print((tokenizer.decode(outputs[0])))
 Zamba2-1.2B-Instruct achieves leading instruction-following and multi-turn chat performance for a model of its size and matches strong models significantly larger. For instance, Zamba2-1.2B-Instruct outperforms Gemma2-2B-Instruct, a very strong model over 2x its size.
 
 <center>
-<img src="https://cdn-uploads.huggingface.co/production/uploads/65bc13717c6ad1994b6619e9/
+<img src="https://cdn-uploads.huggingface.co/production/uploads/65bc13717c6ad1994b6619e9/ceOUHVeJPhBgwTDCsR9Y6.png" width="900"/>
 </center>
 
-
-
+
+| Model | Size | Aggregate MT-Bench | IFEval |
+|:-------------:|:----:|:-------------:|:----:|
 | **Zamba2-1.2B-Instruct** | 1.2B | **59.53** | **41.45** |
 | Gemma2-2B-Instruct | 2.7B | 51.69 | 42.20 |
 | H2O-Danube-1.8B-Chat | 1.6B | 49.78 | 27.95 |
@@ -69,7 +70,7 @@ Zamba2-1.2B-Instruct achieves leading instruction-following and multi-turn chat
 Moreover, due to its unique hybrid SSM architecture, Zamba2-1.2B-Instruct achieves extremely low inference latency and rapid generation with a significantly smaller memory footprint than comparable transformer-based models.
 
 <center>
-<img src="https://cdn-uploads.huggingface.co/production/uploads/65bc13717c6ad1994b6619e9/
+<img src="https://cdn-uploads.huggingface.co/production/uploads/65bc13717c6ad1994b6619e9/tQ-j1krA634EfTU1Lp3E7.png" width="700" alt="Zamba performance">
 </center>
 