Commit d7170f2 by BerenMillidge · Parent: 88bc14a
Update README.md

README.md CHANGED
@@ -51,6 +51,17 @@ Zamba2-2.7B-Instruct punches dramatically above its weight, achieving extremely
<img src="https://cdn-uploads.huggingface.co/production/uploads/64e40335c0edca443ef8af3e/wXFMLXZA2-xz2PDyUMwTI.png" width="600"/>
| Model | Size | MT-Bench | IFEval |
|-------|------|----------|--------|
| **Zamba2-2.7B-Instruct** | 2.7B | **72.40** | **53.96** |
| Mistral-7B-Instruct | 7B | 66.4 | 45.3 |
| Gemma2-2B-Instruct | 2.6B | 51.69 | 48.8 |
| H2O-Danube-4B-Chat | 4B | 52.57 | 45.44 |
| StableLM-Zephyr-3B | 3B | 66.43 | 36.83 |
Moreover, due to its unique hybrid SSM architecture, Zamba2-2.7B-Instruct achieves extremely low inference latency and rapid generation with a significantly smaller memory footprint than comparable transformer-based models.

Time to First Token (TTFT) | Output Generation
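As a rough illustration of how these two latency metrics can be computed, the sketch below times a streaming token source. It is a generic measurement helper, not part of the Zamba2 API; `fake_stream` is a hypothetical stand-in for a real model's token stream (e.g. one produced via a streaming generate call).

```python
import time

def measure_latency(token_stream):
    """Return (time-to-first-token, tokens/sec) for an iterable that
    yields tokens as they are produced."""
    start = time.perf_counter()
    ttft = None
    count = 0
    for _ in token_stream:
        if ttft is None:
            # First token arrived: record time-to-first-token.
            ttft = time.perf_counter() - start
        count += 1
    total = time.perf_counter() - start
    tokens_per_sec = count / total if total > 0 else float("inf")
    return ttft, tokens_per_sec

# Hypothetical stand-in simulating a model that streams 5 tokens.
def fake_stream(n=5, delay=0.01):
    for _ in range(n):
        time.sleep(delay)
        yield "tok"

ttft, tps = measure_latency(fake_stream())
```

With a real model, the same helper can wrap the model's streaming output; TTFT then reflects prefill cost, while tokens/sec reflects steady-state generation speed.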