Update README.md
README.md
CHANGED
@@ -28,13 +28,13 @@ The adapter was trained via SFT on random subsets of the following:
* <a href="https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized"> HuggingFaceH4/ultrafeedback_binarized </a> (25K - chosen answers only)

## Performance
-| Models            | Llama2-7B (fp16)| Llama2-7B (HQQ-
+| Models            | Llama2-7B (fp16)| Llama2-7B (HQQ 1-bit)| Llama2-7B (HQQ+ 1-bit)| QuIP# (2-bit)|
|-------------------|-----------------|----------------------|-----------------------|--------------|
| Wiki perplexity   | 5.18            | 9866                 | <b>8.53</b>           | 8.54         |
| VRAM (GB)         | 13.5            | <b>1.76</b>          | 1.85                  | 2.72         |
| Forward time (sec)| <b>0.1</b>      | 0.231                | 0.257                 | 0.353        |

-| Models             | Llama2-7B-chat (fp16)| Llama2-7B-chat (HQQ-
+| Models             | Llama2-7B-chat (fp16)| Llama2-7B-chat (HQQ 1-bit)| Llama2-7B-chat (HQQ+ 1-bit)|
|--------------------|----------------------|---------------------------|----------------------------|
| ARC (25-shot)      | 53.67                | 21.59                     | 31.14                      |
| HellaSwag (10-shot)| 78.56                | 25.66                     | 52.96                      |
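The SFT data mix above lists a 25K "chosen answers only" slice of HuggingFaceH4/ultrafeedback_binarized. As a rough illustration only (not the authors' training script), the sketch below shows one way such a subset could be drawn with the `datasets` library; the split name `train_prefs`, the `chosen` column, and the sampling seed are assumptions about the dataset layout, not details taken from this model card.

```python
# Minimal sketch: sample ~25K examples and keep only the "chosen" conversation
# for SFT. Split/column names are assumptions; adjust to the actual dataset schema.
from datasets import load_dataset

raw = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

# Shuffle, take a random 25K-example slice, then keep only the chosen messages.
n = min(25_000, len(raw))
sft_subset = (
    raw.shuffle(seed=0)
       .select(range(n))
       .map(lambda ex: {"messages": ex["chosen"]}, remove_columns=raw.column_names)
)

print(sft_subset)  # a Dataset with a single "messages" column, ready for SFT formatting
```

Dropping all original columns and keeping only the chosen-side messages mirrors the "chosen answers only" description; the resulting `messages` lists can then be rendered with whatever chat template the base model expects.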