Update README.md
README.md CHANGED
@@ -24,4 +24,19 @@ Benchmarks for this model:
 
 Benchmarks for base Qwen/QwQ-32B model:
 
-
+|    Tasks     |Version|Filter|n-shot|  Metric  |   |Value |   |Stderr|
+|--------------|------:|------|-----:|----------|---|-----:|---|-----:|
+|arc_challenge |      1|none  |     0|acc       |↑  |0.5367|±  |0.0146|
+|              |       |none  |     0|acc_norm  |↑  |0.5563|±  |0.0145|
+|arc_easy      |      1|none  |     0|acc       |↑  |0.8102|±  |0.0080|
+|              |       |none  |     0|acc_norm  |↑  |0.7866|±  |0.0084|
+|hellaswag     |      1|none  |     0|acc       |↑  |0.6516|±  |0.0048|
+|              |       |none  |     0|acc_norm  |↑  |0.8407|±  |0.0037|
+|lambada_openai|      1|none  |     0|acc       |↑  |0.6683|±  |0.0066|
+|              |       |none  |     0|perplexity|↓  |3.8310|±  |0.0893|
+|piqa          |      1|none  |     0|acc       |↑  |0.7976|±  |0.0094|
+|              |       |none  |     0|acc_norm  |↑  |0.8118|±  |0.0091|
+|sciq          |      1|none  |     0|acc       |↑  |0.9630|±  |0.0060|
+|              |       |none  |     0|acc_norm  |↑  |0.9490|±  |0.0070|
+|winogrande    |      1|none  |     0|acc       |↑  |0.7048|±  |0.0128|
+|mmlu          |      2|none  |      |acc       |↑  |0.7985|±  |0.0032|