Commit 6740a6a
1 Parent(s): 4de54e6
Update README.md (#1)
- Update README.md (f3287102ab98d9d51056101e439291670af4acdd)
Co-authored-by: Alexandre Marques <[email protected]>
README.md
CHANGED
@@ -47,13 +47,13 @@ Model evaluation metrics and results. [UPDATE]
 
 | Benchmark | Metric | Llama-2-7b | Llama-2-7b-pruned70-retrained |
 |------------------------------------------------|---------------|-------------|-------------------------------|
-| [MMLU](https://arxiv.org/abs/2009.03300) | 5-shot
-| [HellaSwag](https://arxiv.org/abs/1905.07830) | 0-shot |
-| [WinoGrande](https://arxiv.org/abs/1907.10641) |
-| [ARC-c](https://arxiv.org/abs/1911.01547) |
-| [TruthfulQA](https://arxiv.org/abs/2109.07958) | 5-shot |
-| [
-| [
+| [MMLU](https://arxiv.org/abs/2009.03300) | 5-shot | 46.9% | 36.5% |
+| [HellaSwag](https://arxiv.org/abs/1905.07830) | 0-shot | 78.6% | 74.1% |
+| [WinoGrande](https://arxiv.org/abs/1907.10641) | 5-shot | 74.0% | 69.5% |
+| [ARC-c](https://arxiv.org/abs/1911.01547) | 25-shot | 53.1% | 45.4% |
+| [TruthfulQA](https://arxiv.org/abs/2109.07958) | 5-shot | 38.8% | 36.7% |
+| [GSM8K](https://arxiv.org/abs/2110.14168) | 5-shot | 14.5% | 8.0% |
+| [HumanEval](https://arxiv.org/abs/2107.03374) | pass@1 | 13.4% | 14.4% |
 
 ## Model Training Details
 