## Overview

This repository contains a compressed version of the Meta Llama-3-8B-Instruct model, using the Palu framework for KV-Cache compression. Palu reduces the hidden dimensions of the KV-Cache through low-rank decomposition, significantly shrinking the model's memory footprint while largely preserving the original model's performance (see the evaluation results below).
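
As a rough illustration of the idea (not the actual Palu implementation), the toy sketch below factors a key projection into two low-rank matrices so that only the smaller latent states need to be cached; all shapes and weights here are made up for demonstration.

```python
import torch

# Toy illustration of low-rank KV-Cache compression in the spirit of Palu
# (illustrative only; shapes and weights are made up, this is not Palu's code).
# A key projection W_k is factored as W_k ≈ A @ B with a small inner rank, so
# only the rank-sized latent states h @ A need to be cached; full keys are
# reconstructed from the latents with B when attention needs them.

hidden, rank, seq_len = 4096, 1024, 8

W_k = torch.randn(hidden, hidden) / hidden ** 0.5   # stand-in key projection
U, S, Vh = torch.linalg.svd(W_k)                    # low-rank factorization
A = U[:, :rank] * S[:rank]                          # (hidden, rank)
B = Vh[:rank, :]                                    # (rank, hidden)

h = torch.randn(seq_len, hidden)                    # hidden states at one layer

full_keys = h @ W_k           # what an uncompressed KV-Cache would store
latent_cache = h @ A          # what a Palu-style cache stores (4x smaller here)
reconstructed = latent_cache @ B

print(full_keys.shape, latent_cache.shape)          # (8, 4096) vs (8, 1024)
print(torch.norm(full_keys - reconstructed))        # error from rank truncation
```
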
## Evaluation Results

### Perplexity (PPL)

| Model                                  | PPL    |
|----------------------------------------|--------|
| **meta-llama-3-8b-instruct-palu**      | 8.8309 |
| **meta-llama-3-8b-instruct (Base)**    | 8.2845 |
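
This card does not state which corpus or context length the perplexity above was measured on; the snippet below is only a generic sketch of how a comparable score can be computed with `transformers` and `datasets` (WikiText-2 and a 2048-token window are assumptions, not the settings used here).

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Generic perplexity sketch; dataset and window size are assumptions.
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"   # or this Palu-compressed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
).eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)

max_len, nlls, n_tokens = 2048, [], 0
for start in range(0, ids.size(1) - 1, max_len):
    chunk = ids[:, start : start + max_len]
    with torch.no_grad():
        # labels = input_ids gives the mean next-token NLL over the chunk
        loss = model(chunk, labels=chunk).loss
    nlls.append(loss * (chunk.size(1) - 1))
    n_tokens += chunk.size(1) - 1

print("ppl:", torch.exp(torch.stack(nlls).sum() / n_tokens).item())
```
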

### Zero-shot Evaluation

#### meta-llama-3-8b-instruct-palu

| Tasks         | Version | Filter | n-shot | Metric   | Value  | Stderr  |
|---------------|---------|--------|--------|----------|--------|---------|
| winogrande    | 1       | none   | 0      | acc      | 0.7277 | ±0.0125 |
| arc_challenge | 1       | none   | 0      | acc      | 0.4949 | ±0.0146 |
|               |         |        | 0      | acc_norm | 0.5427 | ±0.0146 |
| arc_easy      | 1       | none   | 0      | acc      | 0.7942 | ±0.0083 |
|               |         |        | 0      | acc_norm | 0.7551 | ±0.0088 |
| piqa          | 1       | none   | 0      | acc      | 0.7655 | ±0.0099 |
|               |         |        | 0      | acc_norm | 0.7644 | ±0.0099 |
| hellaswag     | 1       | none   | 0      | acc      | 0.5664 | ±0.0049 |
|               |         |        | 0      | acc_norm | 0.7511 | ±0.0043 |
| openbookqa    | 1       | none   | 0      | acc      | 0.3360 | ±0.0211 |
|               |         |        | 0      | acc_norm | 0.4380 | ±0.0222 |

#### meta-llama-3-8b-instruct (Base)

| Tasks         | Version | Filter | n-shot | Metric   | Value  | Stderr  |
|---------------|---------|--------|--------|----------|--------|---------|
| winogrande    | 1       | none   | 0      | acc      | 0.7206 | ±0.0126 |
| arc_challenge | 1       | none   | 0      | acc      | 0.5299 | ±0.0146 |
|               |         |        | 0      | acc_norm | 0.5683 | ±0.0145 |
| arc_easy      | 1       | none   | 0      | acc      | 0.8161 | ±0.0079 |
|               |         |        | 0      | acc_norm | 0.7976 | ±0.0082 |
| piqa          | 1       | none   | 0      | acc      | 0.7867 | ±0.0096 |
|               |         |        | 0      | acc_norm | 0.7856 | ±0.0096 |
| hellaswag     | 1       | none   | 0      | acc      | 0.5769 | ±0.0049 |
|               |         |        | 0      | acc_norm | 0.7581 | ±0.0043 |
| openbookqa    | 1       | none   | 0      | acc      | 0.3420 | ±0.0212 |
|               |         |        | 0      | acc_norm | 0.4320 | ±0.0222 |
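
The tables above follow the output format of EleutherAI's lm-evaluation-harness. A run along the lines below should produce comparable tables; the exact harness version, batch size, and model arguments used for this card are not stated, and the Palu-compressed checkpoint may additionally require the Palu codebase to load, so treat this as an approximate recipe rather than the original command.

```python
import lm_eval

# Zero-shot evaluation sketch using the lm-evaluation-harness v0.4+ Python API.
# The model id and batch size here are placeholders/assumptions.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=meta-llama/Meta-Llama-3-8B-Instruct,dtype=bfloat16",
    tasks=["winogrande", "arc_challenge", "arc_easy", "piqa", "hellaswag", "openbookqa"],
    num_fewshot=0,
    batch_size=8,
)

print(results["results"])  # per-task acc / acc_norm with stderr, as in the tables above
```
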

### Long-Bench Evaluation

#### triviaqa

| Model                                  | Score |
|----------------------------------------|-------|
| **meta-llama-3-8b-instruct-palu**      | 89.45 |
| **meta-llama-3-8b-instruct (Base)**    | 90.56 |

#### qasper

| Model                                  | Score |
|----------------------------------------|-------|
| **meta-llama-3-8b-instruct-palu**      | 34.92 |
| **meta-llama-3-8b-instruct (Base)**    | 31.74 |

## Key Features

- **Model**: Meta Llama-3-8B-Instruct
- **Compression Framework**: Palu