Update README.md
README.md
CHANGED
@@ -28,13 +28,13 @@ The adapter was trained via SFT on random subsets of the following:
* <a href="https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized"> HuggingFaceH4/ultrafeedback_binarized </a> (25K - chosen answers only)

## Performance
-| Models            | Llama2-7B (fp16)| Llama2-7B (HQQ-
+| Models            | Llama2-7B (fp16)| Llama2-7B (HQQ 1-bit)| Llama2-7B (HQQ+ 1-bit)| QuIP# (2-bit)|
|-------------------|-----------------|----------------------|-----------------------|--------------|
| Wiki perplexity   | 5.18            | 9866                 | <b>8.53</b>           | 8.54         |
| VRAM (GB)         | 13.5            | <b>1.76</b>          | 1.85                  | 2.72         |
| Forward time (sec)| <b>0.1</b>      | 0.231                | 0.257                 | 0.353        |

-| Models             | Llama2-7B-chat (fp16)| Llama2-7B-chat (HQQ-
+| Models             | Llama2-7B-chat (fp16)| Llama2-7B-chat (HQQ 1-bit)| Llama2-7B-chat (HQQ+ 1-bit)|
|--------------------|----------------------|---------------------------|----------------------------|
| ARC (25-shot)      | 53.67                | 21.59                     | 31.14                      |
| HellaSwag (10-shot)| 78.56                | 25.66                     | 52.96                      |
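The SFT data mix above lists a 25K "chosen answers only" slice of HuggingFaceH4/ultrafeedback_binarized. As a rough illustration only (not the authors' training script), the sketch below shows one way such a subset could be drawn with the `datasets` library; the split name `train_prefs`, the `chosen` column, and the sampling seed are assumptions about the dataset layout, not details taken from this model card.

```python
# Minimal sketch: sample ~25K examples and keep only the "chosen" conversation
# for SFT. Split/column names are assumptions; adjust to the actual dataset schema.
from datasets import load_dataset

raw = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

# Shuffle, take a random 25K-example slice, then keep only the chosen messages.
n = min(25_000, len(raw))
sft_subset = (
    raw.shuffle(seed=0)
       .select(range(n))
       .map(lambda ex: {"messages": ex["chosen"]}, remove_columns=raw.column_names)
)

print(sft_subset)  # a Dataset with a single "messages" column, ready for SFT formatting
```

Dropping all original columns and keeping only the chosen-side messages mirrors the "chosen answers only" description; the resulting `messages` lists can then be rendered with whatever chat template the base model expects.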