File size: 12,600 Bytes
6583e65
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 1548 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 1548 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	100.00000000
2	50.00000000
3	33.33333333
4	50.00000000
5	40.00000000
6	33.33333333
7	42.85714286
8	50.00000000
9	44.44444444
10	50.00000000
11	45.45454545
12	41.66666667
13	46.15384615
14	42.85714286
15	46.66666667
16	50.00000000
17	52.94117647
18	50.00000000
19	52.63157895
20	55.00000000
21	57.14285714
22	54.54545455
23	52.17391304
24	54.16666667
25	56.00000000
26	53.84615385
27	55.55555556
28	57.14285714
29	58.62068966
30	60.00000000
31	61.29032258
32	62.50000000
33	60.60606061
34	61.76470588
35	62.85714286
36	63.88888889
37	62.16216216
38	63.15789474
39	61.53846154
40	62.50000000
41	60.97560976
42	59.52380952
43	60.46511628
44	59.09090909
45	60.00000000
46	58.69565217
47	57.44680851
48	58.33333333
49	59.18367347
50	58.00000000
51	56.86274510
52	55.76923077
53	54.71698113
54	53.70370370
55	54.54545455
56	55.35714286
57	56.14035088
58	55.17241379
59	54.23728814
60	53.33333333
61	54.09836066
62	53.22580645
63	52.38095238
64	51.56250000
65	50.76923077
66	50.00000000
67	49.25373134
68	48.52941176
69	47.82608696
70	48.57142857
71	49.29577465
72	48.61111111
73	47.94520548
74	48.64864865
75	49.33333333
76	50.00000000
77	50.64935065
78	51.28205128
79	50.63291139
80	51.25000000
81	51.85185185
82	51.21951220
83	51.80722892
84	52.38095238
85	51.76470588
86	51.16279070
87	51.72413793
88	52.27272727
89	51.68539326
90	51.11111111
91	50.54945055
92	50.00000000
93	49.46236559
94	50.00000000
95	49.47368421
96	50.00000000
97	50.51546392
98	50.00000000
99	50.50505051
100	50.00000000
101	49.50495050
102	49.01960784
103	48.54368932
104	49.03846154
105	49.52380952
106	50.00000000
107	49.53271028
108	50.00000000
109	50.45871560
110	50.90909091
111	51.35135135
112	50.89285714
113	50.44247788
114	50.87719298
115	51.30434783
116	51.72413793
117	51.28205128
118	50.84745763
119	51.26050420
120	50.83333333
121	51.23966942
122	50.81967213
123	50.40650407
124	50.00000000
125	50.40000000
126	50.00000000
127	49.60629921
128	49.21875000
129	48.83720930
130	48.46153846
131	48.85496183
132	49.24242424
133	48.87218045
134	48.50746269
135	48.14814815
136	47.79411765
137	48.17518248
138	47.82608696
139	48.20143885
140	48.57142857
141	48.93617021
142	49.29577465
143	49.65034965
144	50.00000000
145	50.34482759
146	50.68493151
147	50.34013605
148	50.67567568
149	50.33557047
150	50.00000000
151	49.66887417
152	50.00000000
153	50.32679739
154	50.00000000
155	49.67741935
156	49.35897436
157	49.68152866
158	50.00000000
159	49.68553459
160	50.00000000
161	49.68944099
162	49.38271605
163	49.07975460
164	48.78048780
165	49.09090909
166	48.79518072
167	49.10179641
168	48.80952381
169	49.11242604
170	48.82352941
171	49.12280702
172	48.83720930
173	48.55491329
174	48.85057471
175	48.57142857
176	48.29545455
177	48.02259887
178	47.75280899
179	48.04469274
180	47.77777778
181	47.51381215
182	47.25274725
183	47.54098361
184	47.28260870
185	47.56756757
186	47.84946237
187	47.59358289
188	47.87234043
189	48.14814815
190	47.89473684
191	48.16753927
192	47.91666667
193	47.66839378
194	47.93814433
195	48.20512821
196	48.46938776
197	48.22335025
198	48.48484848
199	48.24120603
200	48.00000000
201	47.76119403
202	47.52475248
203	47.29064039
204	47.05882353
205	47.31707317
206	47.08737864
207	47.34299517
208	47.11538462
209	46.88995215
210	46.66666667
211	46.91943128
212	46.69811321
213	46.47887324
214	46.26168224
215	46.51162791
216	46.75925926
217	46.54377880
218	46.78899083
219	47.03196347
220	47.27272727
221	47.05882353
222	46.84684685
223	47.08520179
224	47.32142857
225	47.55555556
226	47.34513274
227	47.13656388
228	47.36842105
229	47.16157205
230	46.95652174
231	46.75324675
232	46.98275862
233	47.21030043
234	47.43589744
235	47.23404255
236	47.03389831
237	46.83544304
238	46.63865546
239	46.44351464
240	46.66666667
241	46.47302905
242	46.69421488
243	46.91358025
244	46.72131148
245	46.93877551
246	46.74796748
247	46.96356275
248	46.77419355
249	46.58634538
250	46.40000000
251	46.21513944
252	46.03174603
253	45.84980237
254	45.66929134
255	45.49019608
256	45.70312500
257	45.91439689
258	45.73643411
259	45.55984556
260	45.76923077
261	45.59386973
262	45.41984733
263	45.24714829
264	45.07575758
265	45.28301887
266	45.48872180
267	45.31835206
268	45.14925373
269	45.35315985
270	45.55555556
271	45.38745387
272	45.58823529
273	45.42124542
274	45.25547445
275	45.09090909
276	45.28985507
277	45.48736462
278	45.32374101
279	45.16129032
280	45.00000000
281	44.83985765
282	44.68085106
283	44.52296820
284	44.36619718
285	44.21052632
286	44.05594406
287	44.25087108
288	44.09722222
289	43.94463668
290	44.13793103
291	44.32989691
292	44.52054795
293	44.70989761
294	44.55782313
295	44.40677966
296	44.25675676
297	44.10774411
298	44.29530201
299	44.48160535
300	44.66666667
301	44.51827243
302	44.70198675
303	44.55445545
304	44.40789474
305	44.26229508
306	44.11764706
307	44.29967427
308	44.15584416
309	44.01294498
310	43.87096774
311	44.05144695
312	44.23076923
313	44.08945687
314	43.94904459
315	43.80952381
316	43.67088608
317	43.84858044
318	43.71069182
319	43.88714734
320	43.75000000
321	43.61370717
322	43.78881988
323	43.65325077
324	43.51851852
325	43.69230769
326	43.55828221
327	43.73088685
328	43.59756098
329	43.46504559
330	43.33333333
331	43.20241692
332	43.37349398
333	43.54354354
334	43.41317365
335	43.58208955
336	43.75000000
337	43.62017804
338	43.49112426
339	43.36283186
340	43.52941176
341	43.40175953
342	43.56725146
343	43.44023324
344	43.31395349
345	43.18840580
346	43.35260116
347	43.51585014
348	43.39080460
349	43.26647564
350	43.42857143
351	43.30484330
352	43.18181818
353	43.34277620
354	43.22033898
355	43.09859155
356	43.25842697
357	43.13725490
358	43.01675978
359	42.89693593
360	43.05555556
361	43.21329640
362	43.37016575
363	43.25068871
364	43.13186813
365	43.01369863
366	43.16939891
367	43.05177112
368	42.93478261
369	42.81842818
370	42.97297297
371	42.85714286
372	42.74193548
373	42.62734584
374	42.51336898
375	42.66666667
376	42.55319149
377	42.70557029
378	42.85714286
379	42.74406332
380	42.63157895
381	42.78215223
382	42.93193717
383	43.08093995
384	42.96875000
385	43.11688312
386	43.00518135
387	42.89405685
388	42.78350515
389	42.93059126
390	43.07692308
391	43.22250639
392	43.36734694
393	43.25699746
394	43.14720812
395	43.29113924
396	43.43434343
397	43.32493703
398	43.21608040
399	43.35839599
400	43.25000000
401	43.39152120
402	43.53233831
403	43.42431762
404	43.56435644
405	43.70370370
406	43.59605911
407	43.48894349
408	43.38235294
409	43.52078240
410	43.41463415
411	43.30900243
412	43.20388350
413	43.09927361
414	42.99516908
415	42.89156627
416	43.02884615
417	42.92565947
418	42.82296651
419	42.95942721
420	43.09523810
421	43.23040380
422	43.12796209
423	43.02600473
424	43.16037736
425	43.29411765
426	43.19248826
427	43.09133489
428	42.99065421
429	43.12354312
430	43.02325581
431	43.15545244
432	43.28703704
433	43.18706697
434	43.08755760
435	42.98850575
436	42.88990826
437	42.79176201
438	42.92237443
439	43.05239180
440	42.95454545
441	42.85714286
442	42.76018100
443	42.66365688
444	42.56756757
445	42.47191011
446	42.60089686
447	42.50559284
448	42.41071429
449	42.31625835
450	42.44444444
451	42.57206208
452	42.47787611
453	42.60485651
454	42.51101322
455	42.41758242
456	42.54385965
457	42.66958425
458	42.57641921
459	42.48366013
460	42.60869565
461	42.51626898
462	42.42424242
463	42.33261339
464	42.45689655
465	42.36559140
466	42.27467811
467	42.39828694
468	42.52136752
469	42.64392324
470	42.55319149
471	42.46284501
472	42.37288136
473	42.49471459
474	42.40506329
475	42.31578947
476	42.43697479
477	42.34800839
478	42.46861925
479	42.37995825
480	42.50000000
481	42.41164241
482	42.32365145
483	42.23602484
484	42.14876033
485	42.06185567
486	42.18106996
487	42.29979466
488	42.21311475
489	42.12678937
490	42.24489796
491	42.15885947
492	42.07317073
493	41.98782961
494	42.10526316
495	42.22222222
496	42.33870968
497	42.25352113
498	42.16867470
499	42.08416834
500	42.20000000
501	42.11576846
502	42.03187251
503	41.94831014
504	41.86507937
505	41.78217822
506	41.89723320
507	41.81459566
508	41.92913386
509	41.84675835
510	41.96078431
511	41.87866928
512	41.79687500
513	41.71539961
514	41.63424125
515	41.74757282
516	41.86046512
517	41.97292070
518	42.08494208
519	42.19653179
520	42.30769231
521	42.22648752
522	42.33716475
523	42.25621415
524	42.36641221
525	42.28571429
526	42.20532319
527	42.12523719
528	42.04545455
529	41.96597353
530	42.07547170
531	41.99623352
532	41.91729323
533	41.83864916
534	41.94756554
535	42.05607477
536	41.97761194
537	42.08566108
538	42.19330855
539	42.11502783
540	42.03703704
541	41.95933457
542	42.06642066
543	41.98895028
544	42.09558824
545	42.01834862
546	41.94139194
547	41.86471664
548	41.97080292
549	41.89435337
550	41.81818182
551	41.92377495
552	41.84782609
553	41.77215190
554	41.69675090
555	41.62162162
556	41.72661871
557	41.83123878
558	41.93548387
559	41.86046512
560	41.78571429
561	41.71122995
562	41.81494662
563	41.91829485
564	41.84397163
565	41.94690265
566	41.87279152
567	41.79894180
568	41.72535211
569	41.65202109
570	41.57894737
571	41.50612960
572	41.60839161
573	41.53577661
574	41.46341463
575	41.39130435
576	41.49305556
577	41.59445407
578	41.69550173
579	41.62348877
580	41.72413793
581	41.82444062
582	41.75257732
583	41.85248714
584	41.95205479
585	41.88034188
586	41.97952218
587	41.90800681
588	41.83673469
589	41.76570458
590	41.69491525
591	41.62436548
592	41.55405405
593	41.65261383
594	41.75084175
595	41.68067227
596	41.77852349
597	41.70854271
598	41.63879599
599	41.73622705
600	41.66666667
601	41.59733777
602	41.52823920
603	41.62520730
604	41.55629139
605	41.48760331
606	41.41914191
607	41.51565074
608	41.44736842
609	41.37931034
610	41.31147541
611	41.40752864
612	41.33986928
613	41.43556281
614	41.53094463
615	41.46341463
616	41.55844156
617	41.65316045
618	41.58576052
619	41.51857835
620	41.45161290
621	41.38486312
622	41.31832797
623	41.25200642
624	41.18589744
625	41.12000000
626	41.21405751
627	41.14832536
628	41.08280255
629	41.17647059
630	41.26984127
631	41.36291601
632	41.45569620
633	41.39020537
634	41.32492114
635	41.25984252
636	41.19496855
637	41.13029827
638	41.06583072
639	41.15805947
640	41.25000000
641	41.18564743
642	41.12149533
643	41.05754277
644	41.14906832
645	41.08527132
646	41.02167183
647	40.95826893
648	40.89506173
649	40.83204931
650	40.76923077
651	40.70660522
652	40.64417178
653	40.58192956
654	40.67278287
655	40.61068702
656	40.70121951
657	40.63926941
658	40.72948328
659	40.66767830
660	40.60606061
661	40.54462935
662	40.48338369
663	40.42232278
664	40.51204819
665	40.60150376
666	40.54054054
667	40.62968516
668	40.71856287
669	40.80717489
670	40.89552239
671	40.83457526
672	40.77380952
673	40.71322437
674	40.65281899
675	40.59259259
676	40.53254438
677	40.47267356
678	40.41297935
679	40.35346097
680	40.44117647
681	40.52863436
682	40.46920821
683	40.40995608
684	40.49707602
685	40.43795620
686	40.37900875
687	40.46579330
688	40.40697674
689	40.34833091
690	40.28985507
691	40.37626628
692	40.31791908
693	40.25974026
694	40.34582133
695	40.28776978
696	40.37356322
697	40.31563845
698	40.25787966
699	40.34334764
700	40.28571429
701	40.37089872
702	40.45584046
703	40.54054054
704	40.62500000
705	40.56737589
706	40.50991501
707	40.45261669
708	40.39548023
709	40.47954866
710	40.42253521
711	40.50632911
712	40.58988764
713	40.67321178
714	40.75630252
715	40.69930070
716	40.64245810
717	40.58577406
718	40.52924791
719	40.47287900
720	40.41666667
721	40.36061026
722	40.30470914
723	40.38727524
724	40.33149171
725	40.41379310
726	40.35812672
727	40.30261348
728	40.38461538
729	40.32921811
730	40.27397260
731	40.21887825
732	40.16393443
733	40.24556617
734	40.32697548
735	40.40816327
736	40.35326087
737	40.43419267
738	40.51490515
739	40.59539919
740	40.54054054
741	40.48582996
742	40.56603774
743	40.64602961
744	40.72580645
745	40.67114094
746	40.61662198
747	40.56224900
748	40.50802139
749	40.45393858
750	40.53333333

Final result: 40.5333 +/- 1.7939
Random chance: 25.0000 +/- 1.5822