Update README.md
README.md CHANGED
@@ -48,8 +48,7 @@ This model performs significantly worse than Memphis-CoT on benchmarks, despite

 | Model | GSM8K (5-shot) | AGIEval (English/Nous subset, acc_norm) | BIG Bench Hard (CoT, few-shot*) |
 |:---------------------------------------------------------------------------|:---------------|:----------------------------------------|:------------------------------ |
-| [StableLM 3B Base](https://hf.co/stabilityai/stablelm-3b-4e1t) |
+| [StableLM 3B Base](https://hf.co/stabilityai/stablelm-3b-4e1t) | 2.05% | 25.14% | 36.75% |
 | [Memphis-CoT 3B](https://hf.co/euclaise/Memphis-CoT-3B) | 13.8% | 26.24% | 38.24% |
 | [Memphis-scribe 3B alpha](https://hf.co/euclaise/Memphis-scribe-3B-alpha) | 12.28% | 23.92% | |
-
-*5-shot, as performed automatically by LM Evaluation Harness bbh_cot_fewshot even with num_fewshot=0
+*5-shot, as performed automatically by LM Evaluation Harness bbh_cot_fewshot even with num_fewshot=0