multiple_choice_score: there are 1132 tasks in prompt
multiple_choice_score: reading tasks.multiple_choice_score: failed to read task 20 of 1132