Watt-Tool-8B-GGUF/scores/Watt-Tool-8B-q4_k_m-naive.mmlu
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 1548 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 1548 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.
task acc_norm
1 100.00000000
2 50.00000000
3 33.33333333
4 50.00000000
5 40.00000000
6 33.33333333
7 42.85714286
8 50.00000000
9 44.44444444
10 50.00000000
11 45.45454545
12 41.66666667
13 46.15384615
14 42.85714286
15 46.66666667
16 50.00000000
17 52.94117647
18 50.00000000
19 52.63157895
20 55.00000000
21 57.14285714
22 54.54545455
23 52.17391304
24 54.16666667
25 56.00000000
26 53.84615385
27 55.55555556
28 57.14285714
29 58.62068966
30 60.00000000
31 61.29032258
32 62.50000000
33 60.60606061
34 61.76470588
35 62.85714286
36 63.88888889
37 62.16216216
38 63.15789474
39 61.53846154
40 62.50000000
41 60.97560976
42 59.52380952
43 60.46511628
44 59.09090909
45 60.00000000
46 58.69565217
47 57.44680851
48 58.33333333
49 59.18367347
50 58.00000000
51 56.86274510
52 55.76923077
53 54.71698113
54 53.70370370
55 54.54545455
56 55.35714286
57 56.14035088
58 55.17241379
59 54.23728814
60 53.33333333
61 54.09836066
62 53.22580645
63 52.38095238
64 51.56250000
65 50.76923077
66 50.00000000
67 49.25373134
68 48.52941176
69 47.82608696
70 48.57142857
71 49.29577465
72 48.61111111
73 47.94520548
74 48.64864865
75 49.33333333
76 50.00000000
77 50.64935065
78 51.28205128
79 50.63291139
80 51.25000000
81 51.85185185
82 51.21951220
83 51.80722892
84 52.38095238
85 51.76470588
86 51.16279070
87 51.72413793
88 52.27272727
89 51.68539326
90 51.11111111
91 50.54945055
92 50.00000000
93 49.46236559
94 50.00000000
95 49.47368421
96 50.00000000
97 50.51546392
98 50.00000000
99 50.50505051
100 50.00000000
101 49.50495050
102 49.01960784
103 48.54368932
104 49.03846154
105 49.52380952
106 50.00000000
107 49.53271028
108 50.00000000
109 50.45871560
110 50.90909091
111 51.35135135
112 50.89285714
113 50.44247788
114 50.87719298
115 51.30434783
116 51.72413793
117 51.28205128
118 50.84745763
119 51.26050420
120 50.83333333
121 51.23966942
122 50.81967213
123 50.40650407
124 50.00000000
125 50.40000000
126 50.00000000
127 49.60629921
128 49.21875000
129 48.83720930
130 48.46153846
131 48.85496183
132 49.24242424
133 48.87218045
134 48.50746269
135 48.14814815
136 47.79411765
137 48.17518248
138 47.82608696
139 48.20143885
140 48.57142857
141 48.93617021
142 49.29577465
143 49.65034965
144 50.00000000
145 50.34482759
146 50.68493151
147 50.34013605
148 50.67567568
149 50.33557047
150 50.00000000
151 49.66887417
152 50.00000000
153 50.32679739
154 50.00000000
155 49.67741935
156 49.35897436
157 49.68152866
158 50.00000000
159 49.68553459
160 50.00000000
161 49.68944099
162 49.38271605
163 49.07975460
164 48.78048780
165 49.09090909
166 48.79518072
167 49.10179641
168 48.80952381
169 49.11242604
170 48.82352941
171 49.12280702
172 48.83720930
173 48.55491329
174 48.85057471
175 48.57142857
176 48.29545455
177 48.02259887
178 47.75280899
179 48.04469274
180 47.77777778
181 47.51381215
182 47.25274725
183 47.54098361
184 47.28260870
185 47.56756757
186 47.84946237
187 47.59358289
188 47.87234043
189 48.14814815
190 47.89473684
191 48.16753927
192 47.91666667
193 47.66839378
194 47.93814433
195 48.20512821
196 48.46938776
197 48.22335025
198 48.48484848
199 48.24120603
200 48.00000000
201 47.76119403
202 47.52475248
203 47.29064039
204 47.05882353
205 47.31707317
206 47.08737864
207 47.34299517
208 47.11538462
209 46.88995215
210 46.66666667
211 46.91943128
212 46.69811321
213 46.47887324
214 46.26168224
215 46.51162791
216 46.75925926
217 46.54377880
218 46.78899083
219 47.03196347
220 47.27272727
221 47.05882353
222 46.84684685
223 47.08520179
224 47.32142857
225 47.55555556
226 47.34513274
227 47.13656388
228 47.36842105
229 47.16157205
230 46.95652174
231 46.75324675
232 46.98275862
233 47.21030043
234 47.43589744
235 47.23404255
236 47.03389831
237 46.83544304
238 46.63865546
239 46.44351464
240 46.66666667
241 46.47302905
242 46.69421488
243 46.91358025
244 46.72131148
245 46.93877551
246 46.74796748
247 46.96356275
248 46.77419355
249 46.58634538
250 46.40000000
251 46.21513944
252 46.03174603
253 45.84980237
254 45.66929134
255 45.49019608
256 45.70312500
257 45.91439689
258 45.73643411
259 45.55984556
260 45.76923077
261 45.59386973
262 45.41984733
263 45.24714829
264 45.07575758
265 45.28301887
266 45.48872180
267 45.31835206
268 45.14925373
269 45.35315985
270 45.55555556
271 45.38745387
272 45.58823529
273 45.42124542
274 45.25547445
275 45.09090909
276 45.28985507
277 45.48736462
278 45.32374101
279 45.16129032
280 45.00000000
281 44.83985765
282 44.68085106
283 44.52296820
284 44.36619718
285 44.21052632
286 44.05594406
287 44.25087108
288 44.09722222
289 43.94463668
290 44.13793103
291 44.32989691
292 44.52054795
293 44.70989761
294 44.55782313
295 44.40677966
296 44.25675676
297 44.10774411
298 44.29530201
299 44.48160535
300 44.66666667
301 44.51827243
302 44.70198675
303 44.55445545
304 44.40789474
305 44.26229508
306 44.11764706
307 44.29967427
308 44.15584416
309 44.01294498
310 43.87096774
311 44.05144695
312 44.23076923
313 44.08945687
314 43.94904459
315 43.80952381
316 43.67088608
317 43.84858044
318 43.71069182
319 43.88714734
320 43.75000000
321 43.61370717
322 43.78881988
323 43.65325077
324 43.51851852
325 43.69230769
326 43.55828221
327 43.73088685
328 43.59756098
329 43.46504559
330 43.33333333
331 43.20241692
332 43.37349398
333 43.54354354
334 43.41317365
335 43.58208955
336 43.75000000
337 43.62017804
338 43.49112426
339 43.36283186
340 43.52941176
341 43.40175953
342 43.56725146
343 43.44023324
344 43.31395349
345 43.18840580
346 43.35260116
347 43.51585014
348 43.39080460
349 43.26647564
350 43.42857143
351 43.30484330
352 43.18181818
353 43.34277620
354 43.22033898
355 43.09859155
356 43.25842697
357 43.13725490
358 43.01675978
359 42.89693593
360 43.05555556
361 43.21329640
362 43.37016575
363 43.25068871
364 43.13186813
365 43.01369863
366 43.16939891
367 43.05177112
368 42.93478261
369 42.81842818
370 42.97297297
371 42.85714286
372 42.74193548
373 42.62734584
374 42.51336898
375 42.66666667
376 42.55319149
377 42.70557029
378 42.85714286
379 42.74406332
380 42.63157895
381 42.78215223
382 42.93193717
383 43.08093995
384 42.96875000
385 43.11688312
386 43.00518135
387 42.89405685
388 42.78350515
389 42.93059126
390 43.07692308
391 43.22250639
392 43.36734694
393 43.25699746
394 43.14720812
395 43.29113924
396 43.43434343
397 43.32493703
398 43.21608040
399 43.35839599
400 43.25000000
401 43.39152120
402 43.53233831
403 43.42431762
404 43.56435644
405 43.70370370
406 43.59605911
407 43.48894349
408 43.38235294
409 43.52078240
410 43.41463415
411 43.30900243
412 43.20388350
413 43.09927361
414 42.99516908
415 42.89156627
416 43.02884615
417 42.92565947
418 42.82296651
419 42.95942721
420 43.09523810
421 43.23040380
422 43.12796209
423 43.02600473
424 43.16037736
425 43.29411765
426 43.19248826
427 43.09133489
428 42.99065421
429 43.12354312
430 43.02325581
431 43.15545244
432 43.28703704
433 43.18706697
434 43.08755760
435 42.98850575
436 42.88990826
437 42.79176201
438 42.92237443
439 43.05239180
440 42.95454545
441 42.85714286
442 42.76018100
443 42.66365688
444 42.56756757
445 42.47191011
446 42.60089686
447 42.50559284
448 42.41071429
449 42.31625835
450 42.44444444
451 42.57206208
452 42.47787611
453 42.60485651
454 42.51101322
455 42.41758242
456 42.54385965
457 42.66958425
458 42.57641921
459 42.48366013
460 42.60869565
461 42.51626898
462 42.42424242
463 42.33261339
464 42.45689655
465 42.36559140
466 42.27467811
467 42.39828694
468 42.52136752
469 42.64392324
470 42.55319149
471 42.46284501
472 42.37288136
473 42.49471459
474 42.40506329
475 42.31578947
476 42.43697479
477 42.34800839
478 42.46861925
479 42.37995825
480 42.50000000
481 42.41164241
482 42.32365145
483 42.23602484
484 42.14876033
485 42.06185567
486 42.18106996
487 42.29979466
488 42.21311475
489 42.12678937
490 42.24489796
491 42.15885947
492 42.07317073
493 41.98782961
494 42.10526316
495 42.22222222
496 42.33870968
497 42.25352113
498 42.16867470
499 42.08416834
500 42.20000000
501 42.11576846
502 42.03187251
503 41.94831014
504 41.86507937
505 41.78217822
506 41.89723320
507 41.81459566
508 41.92913386
509 41.84675835
510 41.96078431
511 41.87866928
512 41.79687500
513 41.71539961
514 41.63424125
515 41.74757282
516 41.86046512
517 41.97292070
518 42.08494208
519 42.19653179
520 42.30769231
521 42.22648752
522 42.33716475
523 42.25621415
524 42.36641221
525 42.28571429
526 42.20532319
527 42.12523719
528 42.04545455
529 41.96597353
530 42.07547170
531 41.99623352
532 41.91729323
533 41.83864916
534 41.94756554
535 42.05607477
536 41.97761194
537 42.08566108
538 42.19330855
539 42.11502783
540 42.03703704
541 41.95933457
542 42.06642066
543 41.98895028
544 42.09558824
545 42.01834862
546 41.94139194
547 41.86471664
548 41.97080292
549 41.89435337
550 41.81818182
551 41.92377495
552 41.84782609
553 41.77215190
554 41.69675090
555 41.62162162
556 41.72661871
557 41.83123878
558 41.93548387
559 41.86046512
560 41.78571429
561 41.71122995
562 41.81494662
563 41.91829485
564 41.84397163
565 41.94690265
566 41.87279152
567 41.79894180
568 41.72535211
569 41.65202109
570 41.57894737
571 41.50612960
572 41.60839161
573 41.53577661
574 41.46341463
575 41.39130435
576 41.49305556
577 41.59445407
578 41.69550173
579 41.62348877
580 41.72413793
581 41.82444062
582 41.75257732
583 41.85248714
584 41.95205479
585 41.88034188
586 41.97952218
587 41.90800681
588 41.83673469
589 41.76570458
590 41.69491525
591 41.62436548
592 41.55405405
593 41.65261383
594 41.75084175
595 41.68067227
596 41.77852349
597 41.70854271
598 41.63879599
599 41.73622705
600 41.66666667
601 41.59733777
602 41.52823920
603 41.62520730
604 41.55629139
605 41.48760331
606 41.41914191
607 41.51565074
608 41.44736842
609 41.37931034
610 41.31147541
611 41.40752864
612 41.33986928
613 41.43556281
614 41.53094463
615 41.46341463
616 41.55844156
617 41.65316045
618 41.58576052
619 41.51857835
620 41.45161290
621 41.38486312
622 41.31832797
623 41.25200642
624 41.18589744
625 41.12000000
626 41.21405751
627 41.14832536
628 41.08280255
629 41.17647059
630 41.26984127
631 41.36291601
632 41.45569620
633 41.39020537
634 41.32492114
635 41.25984252
636 41.19496855
637 41.13029827
638 41.06583072
639 41.15805947
640 41.25000000
641 41.18564743
642 41.12149533
643 41.05754277
644 41.14906832
645 41.08527132
646 41.02167183
647 40.95826893
648 40.89506173
649 40.83204931
650 40.76923077
651 40.70660522
652 40.64417178
653 40.58192956
654 40.67278287
655 40.61068702
656 40.70121951
657 40.63926941
658 40.72948328
659 40.66767830
660 40.60606061
661 40.54462935
662 40.48338369
663 40.42232278
664 40.51204819
665 40.60150376
666 40.54054054
667 40.62968516
668 40.71856287
669 40.80717489
670 40.89552239
671 40.83457526
672 40.77380952
673 40.71322437
674 40.65281899
675 40.59259259
676 40.53254438
677 40.47267356
678 40.41297935
679 40.35346097
680 40.44117647
681 40.52863436
682 40.46920821
683 40.40995608
684 40.49707602
685 40.43795620
686 40.37900875
687 40.46579330
688 40.40697674
689 40.34833091
690 40.28985507
691 40.37626628
692 40.31791908
693 40.25974026
694 40.34582133
695 40.28776978
696 40.37356322
697 40.31563845
698 40.25787966
699 40.34334764
700 40.28571429
701 40.37089872
702 40.45584046
703 40.54054054
704 40.62500000
705 40.56737589
706 40.50991501
707 40.45261669
708 40.39548023
709 40.47954866
710 40.42253521
711 40.50632911
712 40.58988764
713 40.67321178
714 40.75630252
715 40.69930070
716 40.64245810
717 40.58577406
718 40.52924791
719 40.47287900
720 40.41666667
721 40.36061026
722 40.30470914
723 40.38727524
724 40.33149171
725 40.41379310
726 40.35812672
727 40.30261348
728 40.38461538
729 40.32921811
730 40.27397260
731 40.21887825
732 40.16393443
733 40.24556617
734 40.32697548
735 40.40816327
736 40.35326087
737 40.43419267
738 40.51490515
739 40.59539919
740 40.54054054
741 40.48582996
742 40.56603774
743 40.64602961
744 40.72580645
745 40.67114094
746 40.61662198
747 40.56224900
748 40.50802139
749 40.45393858
750 40.53333333
Final result: 40.5333 +/- 1.7939
Random chance: 25.0000 +/- 1.5822
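
The two summary lines can be reproduced from the counts alone. The sketch below is an illustration, not the tool's own code. It makes three assumptions: 304 correct answers out of 750 (inferred from the final running accuracy of 40.53333333 at task 750), a binomial standard error with an n - 1 denominator (chosen because it matches the printed uncertainties), and four answer options per task (inferred from the 25% chance level). The function name binomial_stderr is hypothetical.

```python
import math


def binomial_stderr(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n tasks.

    Assumption: an n - 1 denominator, chosen because it reproduces the
    uncertainties printed in the log; the tool's source was not consulted.
    """
    return math.sqrt(p * (1.0 - p) / (n - 1))


n_tasks = 750      # tasks scored, from the log
n_correct = 304    # inferred: 40.53333333% of 750
n_options = 4      # inferred from the 25% random-chance level

accuracy = n_correct / n_tasks
chance = 1.0 / n_options

# Both values are scaled to percent to match the log's formatting.
print(f"Final result: {100 * accuracy:.4f} +/- {100 * binomial_stderr(accuracy, n_tasks):.4f}")
print(f"Random chance: {100 * chance:.4f} +/- {100 * binomial_stderr(chance, n_tasks):.4f}")
```

Under these assumptions the printed values agree with the log to four decimal places. The same function applies to any row of the running table by substituting that row's task count and accuracy.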