Update README.md
README.md
CHANGED
@@ -143,12 +143,23 @@ print(make_table(results))
 |----------------------------------|-------------|-------------------|
 | | Phi-4 mini-Ins | phi4-mini-8dq4w |
 | **Popular aggregated benchmark** | | |
-| mmlu | 66.73
-| mmlu_pro | 44.71
+| mmlu | 66.73 | 63.11 |
+| mmlu_pro | 44.71 | 35.31 |
 | **Reasoning** | | |
-
+| arc_challenge | TODO | TODO |
+| gpqa | TODO | TODO |
+| hellaswag | 54.57 | 53.24 |
+| openbookqa | TODO | TODO |
+| piqa | TODO | TODO |
+| siqa | TODO | TODO |
+| truthfulqa | TODO | TODO |
+| winogrande | TODO | TODO |
 | **Multilingual** | | |
+| Mgsm | TODO | TODO |
+| mgsm_cot_native | TODO | TODO |
 | **Math** | | |
+| gsm8k | TODO | TODO |
+| Mathqa | TODO | TODO |
 | **Overall** | **TODO** | **TODO** |
 
 