guangy10 committed on
Commit d50e425 · verified · 1 Parent(s): 3b45b2f

More metrics update

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -186,13 +186,13 @@ Okay, the user is asking if I can talk to them. First, I need to clarify that I
 | bbh | - | - |
 | **Reasoning** | | |
 | hellaswag | - | 54.39 |
-| gpqa_main_zeroshot | - | 27.46 |
+| gpqa_main_zeroshot | 32.37 | 27.46 |
 | **Multilingual** | | |
 | m_mmlu | - | - |
-| mgsm_en_cot_en | - | 40.40 |
+| mgsm_en_cot_en | 66.80 | 40.40 |
 | **Math** | | |
-| gsm8k | - | 58.08 |
-| leaderboard_math_hard (v3) | - | 19.94 |
+| gsm8k | 72.71 | 58.08 |
+| leaderboard_math_hard (v3) | 27.87 | 19.94 |
 | **Overall** | - | - |
 
 <details>