Watt-Tool-8B-GGUF / scores /Watt-Tool-8B-q4_k_m.mmlu
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 1548 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 1548 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.
task acc_norm
1 100.00000000
2 50.00000000
3 33.33333333
4 50.00000000
5 40.00000000
6 33.33333333
7 42.85714286
8 50.00000000
9 44.44444444
10 50.00000000
11 45.45454545
12 50.00000000
13 46.15384615
14 42.85714286
15 46.66666667
16 50.00000000
17 52.94117647
18 50.00000000
19 52.63157895
20 55.00000000
21 57.14285714
22 54.54545455
23 52.17391304
24 54.16666667
25 56.00000000
26 53.84615385
27 55.55555556
28 57.14285714
29 58.62068966
30 60.00000000
31 58.06451613
32 59.37500000
33 57.57575758
34 58.82352941
35 60.00000000
36 61.11111111
37 62.16216216
38 63.15789474
39 61.53846154
40 60.00000000
41 58.53658537
42 57.14285714
43 58.13953488
44 56.81818182
45 57.77777778
46 56.52173913
47 55.31914894
48 56.25000000
49 57.14285714
50 56.00000000
51 54.90196078
52 53.84615385
53 52.83018868
54 51.85185185
55 52.72727273
56 53.57142857
57 54.38596491
58 53.44827586
59 52.54237288
60 51.66666667
61 52.45901639
62 51.61290323
63 50.79365079
64 50.00000000
65 49.23076923
66 48.48484848
67 47.76119403
68 47.05882353
69 46.37681159
70 47.14285714
71 47.88732394
72 47.22222222
73 46.57534247
74 47.29729730
75 48.00000000
76 48.68421053
77 49.35064935
78 50.00000000
79 49.36708861
80 50.00000000
81 50.61728395
82 50.00000000
83 50.60240964
84 51.19047619
85 50.58823529
86 50.00000000
87 50.57471264
88 51.13636364
89 50.56179775
90 50.00000000
91 49.45054945
92 48.91304348
93 48.38709677
94 48.93617021
95 48.42105263
96 48.95833333
97 49.48453608
98 48.97959184
99 49.49494949
100 49.00000000
101 48.51485149
102 48.03921569
103 47.57281553
104 48.07692308
105 48.57142857
106 49.05660377
107 48.59813084
108 49.07407407
109 49.54128440
110 50.00000000
111 50.45045045
112 50.00000000
113 49.55752212
114 50.00000000
115 50.43478261
116 50.00000000
117 49.57264957
118 49.15254237
119 49.57983193
120 49.16666667
121 49.58677686
122 49.18032787
123 48.78048780
124 48.38709677
125 48.80000000
126 48.41269841
127 48.03149606
128 47.65625000
129 47.28682171
130 46.92307692
131 46.56488550
132 46.96969697
133 46.61654135
134 46.26865672
135 45.92592593
136 45.58823529
137 45.25547445
138 44.92753623
139 45.32374101
140 45.71428571
141 46.09929078
142 46.47887324
143 46.85314685
144 47.22222222
145 47.58620690
146 47.94520548
147 47.61904762
148 47.97297297
149 47.65100671
150 47.33333333
151 47.01986755
152 47.36842105
153 47.71241830
154 47.40259740
155 47.09677419
156 46.79487179
157 47.13375796
158 47.46835443
159 47.16981132
160 47.50000000
161 47.20496894
162 46.91358025
163 46.62576687
164 46.34146341
165 46.66666667
166 46.38554217
167 46.70658683
168 46.42857143
169 46.74556213
170 46.47058824
171 46.78362573
172 46.51162791
173 46.24277457
174 45.97701149
175 45.71428571
176 46.02272727
177 45.76271186
178 45.50561798
179 45.81005587
180 45.55555556
181 45.30386740
182 45.05494505
183 45.35519126
184 45.10869565
185 45.40540541
186 45.69892473
187 45.45454545
188 45.74468085
189 46.03174603
190 45.78947368
191 46.07329843
192 45.83333333
193 45.59585492
194 45.87628866
195 46.15384615
196 46.42857143
197 46.19289340
198 46.46464646
199 46.23115578
200 46.00000000
201 45.77114428
202 45.54455446
203 45.32019704
204 45.09803922
205 45.36585366
206 45.14563107
207 45.41062802
208 45.19230769
209 44.97607656
210 44.76190476
211 45.02369668
212 44.81132075
213 44.60093897
214 44.39252336
215 44.65116279
216 44.90740741
217 44.70046083
218 44.95412844
219 45.20547945
220 45.45454545
221 45.24886878
222 45.04504505
223 45.29147982
224 45.08928571
225 45.33333333
226 45.13274336
227 44.93392070
228 45.17543860
229 44.97816594
230 44.78260870
231 45.02164502
232 45.25862069
233 45.06437768
234 45.29914530
235 45.10638298
236 44.91525424
237 44.72573840
238 44.53781513
239 44.35146444
240 44.58333333
241 44.39834025
242 44.62809917
243 44.85596708
244 44.67213115
245 44.89795918
246 44.71544715
247 44.93927126
248 44.75806452
249 44.57831325
250 44.40000000
251 44.22310757
252 44.04761905
253 43.87351779
254 43.70078740
255 43.52941176
256 43.75000000
257 43.96887160
258 43.79844961
259 43.62934363
260 43.84615385
261 43.67816092
262 43.51145038
263 43.34600760
264 43.18181818
265 43.01886792
266 43.23308271
267 43.07116105
268 42.91044776
269 43.12267658
270 43.33333333
271 43.17343173
272 43.38235294
273 43.22344322
274 43.06569343
275 42.90909091
276 43.11594203
277 43.32129964
278 43.16546763
279 43.01075269
280 42.85714286
281 42.70462633
282 42.55319149
283 42.40282686
284 42.25352113
285 42.10526316
286 41.95804196
287 42.16027875
288 42.01388889
289 41.86851211
290 42.06896552
291 42.26804124
292 42.46575342
293 42.66211604
294 42.51700680
295 42.37288136
296 42.22972973
297 42.08754209
298 42.28187919
299 42.47491639
300 42.66666667
301 42.52491694
302 42.71523179
303 42.57425743
304 42.43421053
305 42.29508197
306 42.15686275
307 42.34527687
308 42.20779221
309 42.07119741
310 41.93548387
311 42.12218650
312 42.30769231
313 42.17252396
314 42.03821656
315 41.90476190
316 41.77215190
317 41.95583596
318 41.82389937
319 42.00626959
320 41.87500000
321 41.74454829
322 41.92546584
323 41.79566563
324 41.66666667
325 41.84615385
326 41.71779141
327 41.89602446
328 41.76829268
329 41.64133739
330 41.51515152
331 41.38972810
332 41.26506024
333 41.44144144
334 41.31736527
335 41.49253731
336 41.66666667
337 41.54302671
338 41.42011834
339 41.29793510
340 41.47058824
341 41.64222874
342 41.81286550
343 41.69096210
344 41.56976744
345 41.44927536
346 41.61849711
347 41.78674352
348 41.66666667
349 41.54727794
350 41.71428571
351 41.59544160
352 41.47727273
353 41.64305949
354 41.52542373
355 41.40845070
356 41.57303371
357 41.45658263
358 41.34078212
359 41.22562674
360 41.38888889
361 41.55124654
362 41.71270718
363 41.59779614
364 41.48351648
365 41.36986301
366 41.53005464
367 41.41689373
368 41.30434783
369 41.19241192
370 41.35135135
371 41.23989218
372 41.12903226
373 41.01876676
374 40.90909091
375 41.06666667
376 40.95744681
377 41.11405836
378 41.26984127
379 41.16094987
380 41.05263158
381 41.20734908
382 41.36125654
383 41.51436031
384 41.40625000
385 41.55844156
386 41.45077720
387 41.34366925
388 41.23711340
389 41.38817481
390 41.53846154
391 41.68797954
392 41.83673469
393 41.73027990
394 41.62436548
395 41.77215190
396 41.66666667
397 41.81360202
398 41.70854271
399 41.85463659
400 41.75000000
401 41.89526185
402 42.03980100
403 41.93548387
404 42.07920792
405 42.22222222
406 42.11822660
407 42.01474201
408 41.91176471
409 42.05378973
410 41.95121951
411 41.84914842
412 41.74757282
413 41.64648910
414 41.54589372
415 41.44578313
416 41.58653846
417 41.48681055
418 41.38755981
419 41.52744630
420 41.42857143
421 41.56769596
422 41.46919431
423 41.37115839
424 41.50943396
425 41.64705882
426 41.78403756
427 41.92037471
428 41.82242991
429 41.95804196
430 41.86046512
431 41.99535963
432 42.12962963
433 42.03233256
434 41.93548387
435 41.83908046
436 41.74311927
437 41.64759725
438 41.78082192
439 41.91343964
440 41.81818182
441 41.72335601
442 41.62895928
443 41.53498871
444 41.44144144
445 41.34831461
446 41.25560538
447 41.16331096
448 41.07142857
449 40.97995546
450 41.11111111
451 41.24168514
452 41.15044248
453 41.28035320
454 41.18942731
455 41.09890110
456 41.22807018
457 41.35667396
458 41.26637555
459 41.17647059
460 41.30434783
461 41.21475054
462 41.12554113
463 41.03671706
464 41.16379310
465 41.07526882
466 40.98712446
467 41.11349036
468 41.23931624
469 41.36460554
470 41.27659574
471 41.18895966
472 41.10169492
473 41.22621564
474 41.13924051
475 41.05263158
476 41.17647059
477 41.09014675
478 41.21338912
479 41.12734864
480 41.25000000
481 41.16424116
482 41.07883817
483 40.99378882
484 40.90909091
485 40.82474227
486 40.94650206
487 41.06776181
488 40.98360656
489 40.89979550
490 41.02040816
491 40.93686354
492 40.85365854
493 40.77079108
494 40.89068826
495 41.01010101
496 41.12903226
497 41.04627767
498 40.96385542
499 40.88176353
500 41.00000000
501 40.91816367
502 40.83665339
503 40.75546720
504 40.67460317
505 40.59405941
506 40.71146245
507 40.63116371
508 40.74803150
509 40.66797642
510 40.78431373
511 40.70450098
512 40.62500000
513 40.54580897
514 40.46692607
515 40.38834951
516 40.50387597
517 40.61895551
518 40.73359073
519 40.84778420
520 40.76923077
521 40.69097889
522 40.80459770
523 40.72657744
524 40.83969466
525 40.76190476
526 40.68441065
527 40.60721063
528 40.53030303
529 40.45368620
530 40.56603774
531 40.48964218
532 40.41353383
533 40.33771107
534 40.44943820
535 40.56074766
536 40.48507463
537 40.59590317
538 40.70631970
539 40.63079777
540 40.55555556
541 40.48059150
542 40.59040590
543 40.51565378
544 40.62500000
545 40.55045872
546 40.47619048
547 40.40219378
548 40.51094891
549 40.43715847
550 40.36363636
551 40.47186933
552 40.39855072
553 40.32549729
554 40.43321300
555 40.36036036
556 40.46762590
557 40.57450628
558 40.68100358
559 40.60822898
560 40.53571429
561 40.46345811
562 40.56939502
563 40.67495560
564 40.60283688
565 40.70796460
566 40.63604240
567 40.56437390
568 40.49295775
569 40.42179262
570 40.35087719
571 40.28021016
572 40.38461538
573 40.31413613
574 40.24390244
575 40.17391304
576 40.27777778
577 40.38128250
578 40.48442907
579 40.41450777
580 40.51724138
581 40.61962134
582 40.54982818
583 40.65180103
584 40.75342466
585 40.68376068
586 40.78498294
587 40.71550256
588 40.64625850
589 40.57724958
590 40.50847458
591 40.43993232
592 40.37162162
593 40.47217538
594 40.57239057
595 40.50420168
596 40.60402685
597 40.53601340
598 40.46822742
599 40.56761269
600 40.50000000
601 40.43261231
602 40.36544850
603 40.46434494
604 40.39735099
605 40.33057851
606 40.26402640
607 40.36243822
608 40.29605263
609 40.22988506
610 40.16393443
611 40.26186579
612 40.19607843
613 40.29363785
614 40.39087948
615 40.32520325
616 40.42207792
617 40.51863857
618 40.45307443
619 40.38772213
620 40.32258065
621 40.25764895
622 40.19292605
623 40.12841091
624 40.06410256
625 40.00000000
626 40.09584665
627 40.03189793
628 39.96815287
629 40.06359300
630 40.15873016
631 40.09508716
632 40.18987342
633 40.12638231
634 40.06309148
635 40.00000000
636 39.93710692
637 39.87441130
638 39.81191223
639 39.90610329
640 40.00000000
641 39.93759750
642 39.87538941
643 39.81337481
644 39.90683230
645 39.84496124
646 39.78328173
647 39.72179289
648 39.66049383
649 39.59938367
650 39.53846154
651 39.47772657
652 39.41717791
653 39.35681470
654 39.44954128
655 39.38931298
656 39.48170732
657 39.42161339
658 39.51367781
659 39.45371775
660 39.39393939
661 39.33434191
662 39.27492447
663 39.21568627
664 39.15662651
665 39.24812030
666 39.18918919
667 39.28035982
668 39.37125749
669 39.31240658
670 39.40298507
671 39.34426230
672 39.28571429
673 39.22734027
674 39.16913947
675 39.11111111
676 39.05325444
677 38.99556869
678 38.93805310
679 38.88070692
680 38.97058824
681 39.06020558
682 39.00293255
683 38.94582723
684 39.03508772
685 38.97810219
686 39.06705539
687 39.15574964
688 39.09883721
689 39.04208999
690 38.98550725
691 39.07380608
692 39.01734104
693 38.96103896
694 39.04899135
695 38.99280576
696 38.93678161
697 38.88091822
698 38.96848138
699 39.05579399
700 39.00000000
701 39.08701854
702 39.17378917
703 39.26031294
704 39.34659091
705 39.29078014
706 39.23512748
707 39.17963225
708 39.12429379
709 39.06911142
710 39.01408451
711 39.09985935
712 39.18539326
713 39.27068724
714 39.35574230
715 39.30069930
716 39.24581006
717 39.19107392
718 39.13649025
719 39.08205841
720 39.02777778
721 38.97364771
722 38.91966759
723 39.00414938
724 38.95027624
725 39.03448276
726 39.11845730
727 39.06464924
728 39.14835165
729 39.09465021
730 39.04109589
731 38.98768810
732 38.93442623
733 39.01773533
734 39.10081744
735 39.18367347
736 39.13043478
737 39.21302578
738 39.29539295
739 39.37753721
740 39.32432432
741 39.27125506
742 39.35309973
743 39.43472409
744 39.51612903
745 39.46308725
746 39.41018767
747 39.35742972
748 39.30481283
749 39.25233645
750 39.33333333
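The acc_norm column above reads as a running (cumulative) accuracy in percent: row n gives the share of the first n tasks answered correctly. That interpretation is an assumption based on the values themselves (100, 50, 33.33, ...), not on the tool's documentation. Under it, each task's pass/fail outcome can be recovered by differencing the implied correct counts. The function name `per_task_outcomes` below is hypothetical.

```python
def per_task_outcomes(running_acc):
    """Recover 0/1 outcomes from a cumulative accuracy column (in percent).

    Assumes row n holds 100 * (correct answers among the first n tasks) / n.
    """
    outcomes = []
    prev_correct = 0
    for n, acc in enumerate(running_acc, start=1):
        # Implied number of correct answers after n tasks.
        correct = round(acc * n / 100)
        outcomes.append(correct - prev_correct)
        prev_correct = correct
    return outcomes


# First five rows of the table above.
first_rows = [100.0, 50.0, 33.33333333, 50.0, 40.0]
print(per_task_outcomes(first_rows))  # [1, 0, 0, 1, 0]
```

Applied to the last row, the same arithmetic gives 39.33333333 * 750 / 100 = 295 correct answers out of 750.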
Final result: 39.3333 +/- 1.7849
Random chance: 25.0000 +/- 1.5822
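The two uncertainties reported above are consistent with a binomial standard error of the form sqrt(p * (1 - p) / (n - 1)) with n = 750. That formula is inferred by reproducing the logged numbers and is not confirmed against the tool's source, so the sketch below should be read as a sanity check rather than the tool's actual implementation. The helper name `binomial_std_error` and the z-score comparison are illustrative additions.

```python
import math


def binomial_std_error(acc_percent, n_tasks):
    """Standard error (in percent) of an accuracy measured over n_tasks.

    Assumption: uses the n - 1 denominator, which reproduces the logged values.
    """
    p = acc_percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / (n_tasks - 1))


n_tasks = 750        # tasks selected in this run
final_acc = 39.3333  # "Final result" from the log
chance_acc = 25.0    # "Random chance" from the log

se_final = binomial_std_error(final_acc, n_tasks)
se_chance = binomial_std_error(chance_acc, n_tasks)

# Rough z-score for the gap between the model and random guessing,
# treating the two errors as independent (an illustrative simplification).
z = (final_acc - chance_acc) / math.sqrt(se_final**2 + se_chance**2)

print(f"{se_final:.4f} {se_chance:.4f}")  # 1.7849 1.5822
print(f"z = {z:.2f}")
```

Under these assumptions the score sits roughly six combined standard errors above random chance.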