File size: 12,600 Bytes
6583e65
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 1548 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 1548 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	100.00000000
2	50.00000000
3	33.33333333
4	50.00000000
5	40.00000000
6	33.33333333
7	42.85714286
8	50.00000000
9	44.44444444
10	50.00000000
11	45.45454545
12	50.00000000
13	46.15384615
14	42.85714286
15	46.66666667
16	50.00000000
17	52.94117647
18	50.00000000
19	52.63157895
20	55.00000000
21	57.14285714
22	54.54545455
23	52.17391304
24	54.16666667
25	56.00000000
26	53.84615385
27	55.55555556
28	57.14285714
29	58.62068966
30	60.00000000
31	58.06451613
32	59.37500000
33	57.57575758
34	58.82352941
35	60.00000000
36	61.11111111
37	62.16216216
38	63.15789474
39	61.53846154
40	60.00000000
41	58.53658537
42	57.14285714
43	58.13953488
44	56.81818182
45	57.77777778
46	56.52173913
47	55.31914894
48	56.25000000
49	57.14285714
50	56.00000000
51	54.90196078
52	53.84615385
53	52.83018868
54	51.85185185
55	52.72727273
56	53.57142857
57	54.38596491
58	53.44827586
59	52.54237288
60	51.66666667
61	52.45901639
62	51.61290323
63	50.79365079
64	50.00000000
65	49.23076923
66	48.48484848
67	47.76119403
68	47.05882353
69	46.37681159
70	47.14285714
71	47.88732394
72	47.22222222
73	46.57534247
74	47.29729730
75	48.00000000
76	48.68421053
77	49.35064935
78	50.00000000
79	49.36708861
80	50.00000000
81	50.61728395
82	50.00000000
83	50.60240964
84	51.19047619
85	50.58823529
86	50.00000000
87	50.57471264
88	51.13636364
89	50.56179775
90	50.00000000
91	49.45054945
92	48.91304348
93	48.38709677
94	48.93617021
95	48.42105263
96	48.95833333
97	49.48453608
98	48.97959184
99	49.49494949
100	49.00000000
101	48.51485149
102	48.03921569
103	47.57281553
104	48.07692308
105	48.57142857
106	49.05660377
107	48.59813084
108	49.07407407
109	49.54128440
110	50.00000000
111	50.45045045
112	50.00000000
113	49.55752212
114	50.00000000
115	50.43478261
116	50.00000000
117	49.57264957
118	49.15254237
119	49.57983193
120	49.16666667
121	49.58677686
122	49.18032787
123	48.78048780
124	48.38709677
125	48.80000000
126	48.41269841
127	48.03149606
128	47.65625000
129	47.28682171
130	46.92307692
131	46.56488550
132	46.96969697
133	46.61654135
134	46.26865672
135	45.92592593
136	45.58823529
137	45.25547445
138	44.92753623
139	45.32374101
140	45.71428571
141	46.09929078
142	46.47887324
143	46.85314685
144	47.22222222
145	47.58620690
146	47.94520548
147	47.61904762
148	47.97297297
149	47.65100671
150	47.33333333
151	47.01986755
152	47.36842105
153	47.71241830
154	47.40259740
155	47.09677419
156	46.79487179
157	47.13375796
158	47.46835443
159	47.16981132
160	47.50000000
161	47.20496894
162	46.91358025
163	46.62576687
164	46.34146341
165	46.66666667
166	46.38554217
167	46.70658683
168	46.42857143
169	46.74556213
170	46.47058824
171	46.78362573
172	46.51162791
173	46.24277457
174	45.97701149
175	45.71428571
176	46.02272727
177	45.76271186
178	45.50561798
179	45.81005587
180	45.55555556
181	45.30386740
182	45.05494505
183	45.35519126
184	45.10869565
185	45.40540541
186	45.69892473
187	45.45454545
188	45.74468085
189	46.03174603
190	45.78947368
191	46.07329843
192	45.83333333
193	45.59585492
194	45.87628866
195	46.15384615
196	46.42857143
197	46.19289340
198	46.46464646
199	46.23115578
200	46.00000000
201	45.77114428
202	45.54455446
203	45.32019704
204	45.09803922
205	45.36585366
206	45.14563107
207	45.41062802
208	45.19230769
209	44.97607656
210	44.76190476
211	45.02369668
212	44.81132075
213	44.60093897
214	44.39252336
215	44.65116279
216	44.90740741
217	44.70046083
218	44.95412844
219	45.20547945
220	45.45454545
221	45.24886878
222	45.04504505
223	45.29147982
224	45.08928571
225	45.33333333
226	45.13274336
227	44.93392070
228	45.17543860
229	44.97816594
230	44.78260870
231	45.02164502
232	45.25862069
233	45.06437768
234	45.29914530
235	45.10638298
236	44.91525424
237	44.72573840
238	44.53781513
239	44.35146444
240	44.58333333
241	44.39834025
242	44.62809917
243	44.85596708
244	44.67213115
245	44.89795918
246	44.71544715
247	44.93927126
248	44.75806452
249	44.57831325
250	44.40000000
251	44.22310757
252	44.04761905
253	43.87351779
254	43.70078740
255	43.52941176
256	43.75000000
257	43.96887160
258	43.79844961
259	43.62934363
260	43.84615385
261	43.67816092
262	43.51145038
263	43.34600760
264	43.18181818
265	43.01886792
266	43.23308271
267	43.07116105
268	42.91044776
269	43.12267658
270	43.33333333
271	43.17343173
272	43.38235294
273	43.22344322
274	43.06569343
275	42.90909091
276	43.11594203
277	43.32129964
278	43.16546763
279	43.01075269
280	42.85714286
281	42.70462633
282	42.55319149
283	42.40282686
284	42.25352113
285	42.10526316
286	41.95804196
287	42.16027875
288	42.01388889
289	41.86851211
290	42.06896552
291	42.26804124
292	42.46575342
293	42.66211604
294	42.51700680
295	42.37288136
296	42.22972973
297	42.08754209
298	42.28187919
299	42.47491639
300	42.66666667
301	42.52491694
302	42.71523179
303	42.57425743
304	42.43421053
305	42.29508197
306	42.15686275
307	42.34527687
308	42.20779221
309	42.07119741
310	41.93548387
311	42.12218650
312	42.30769231
313	42.17252396
314	42.03821656
315	41.90476190
316	41.77215190
317	41.95583596
318	41.82389937
319	42.00626959
320	41.87500000
321	41.74454829
322	41.92546584
323	41.79566563
324	41.66666667
325	41.84615385
326	41.71779141
327	41.89602446
328	41.76829268
329	41.64133739
330	41.51515152
331	41.38972810
332	41.26506024
333	41.44144144
334	41.31736527
335	41.49253731
336	41.66666667
337	41.54302671
338	41.42011834
339	41.29793510
340	41.47058824
341	41.64222874
342	41.81286550
343	41.69096210
344	41.56976744
345	41.44927536
346	41.61849711
347	41.78674352
348	41.66666667
349	41.54727794
350	41.71428571
351	41.59544160
352	41.47727273
353	41.64305949
354	41.52542373
355	41.40845070
356	41.57303371
357	41.45658263
358	41.34078212
359	41.22562674
360	41.38888889
361	41.55124654
362	41.71270718
363	41.59779614
364	41.48351648
365	41.36986301
366	41.53005464
367	41.41689373
368	41.30434783
369	41.19241192
370	41.35135135
371	41.23989218
372	41.12903226
373	41.01876676
374	40.90909091
375	41.06666667
376	40.95744681
377	41.11405836
378	41.26984127
379	41.16094987
380	41.05263158
381	41.20734908
382	41.36125654
383	41.51436031
384	41.40625000
385	41.55844156
386	41.45077720
387	41.34366925
388	41.23711340
389	41.38817481
390	41.53846154
391	41.68797954
392	41.83673469
393	41.73027990
394	41.62436548
395	41.77215190
396	41.66666667
397	41.81360202
398	41.70854271
399	41.85463659
400	41.75000000
401	41.89526185
402	42.03980100
403	41.93548387
404	42.07920792
405	42.22222222
406	42.11822660
407	42.01474201
408	41.91176471
409	42.05378973
410	41.95121951
411	41.84914842
412	41.74757282
413	41.64648910
414	41.54589372
415	41.44578313
416	41.58653846
417	41.48681055
418	41.38755981
419	41.52744630
420	41.42857143
421	41.56769596
422	41.46919431
423	41.37115839
424	41.50943396
425	41.64705882
426	41.78403756
427	41.92037471
428	41.82242991
429	41.95804196
430	41.86046512
431	41.99535963
432	42.12962963
433	42.03233256
434	41.93548387
435	41.83908046
436	41.74311927
437	41.64759725
438	41.78082192
439	41.91343964
440	41.81818182
441	41.72335601
442	41.62895928
443	41.53498871
444	41.44144144
445	41.34831461
446	41.25560538
447	41.16331096
448	41.07142857
449	40.97995546
450	41.11111111
451	41.24168514
452	41.15044248
453	41.28035320
454	41.18942731
455	41.09890110
456	41.22807018
457	41.35667396
458	41.26637555
459	41.17647059
460	41.30434783
461	41.21475054
462	41.12554113
463	41.03671706
464	41.16379310
465	41.07526882
466	40.98712446
467	41.11349036
468	41.23931624
469	41.36460554
470	41.27659574
471	41.18895966
472	41.10169492
473	41.22621564
474	41.13924051
475	41.05263158
476	41.17647059
477	41.09014675
478	41.21338912
479	41.12734864
480	41.25000000
481	41.16424116
482	41.07883817
483	40.99378882
484	40.90909091
485	40.82474227
486	40.94650206
487	41.06776181
488	40.98360656
489	40.89979550
490	41.02040816
491	40.93686354
492	40.85365854
493	40.77079108
494	40.89068826
495	41.01010101
496	41.12903226
497	41.04627767
498	40.96385542
499	40.88176353
500	41.00000000
501	40.91816367
502	40.83665339
503	40.75546720
504	40.67460317
505	40.59405941
506	40.71146245
507	40.63116371
508	40.74803150
509	40.66797642
510	40.78431373
511	40.70450098
512	40.62500000
513	40.54580897
514	40.46692607
515	40.38834951
516	40.50387597
517	40.61895551
518	40.73359073
519	40.84778420
520	40.76923077
521	40.69097889
522	40.80459770
523	40.72657744
524	40.83969466
525	40.76190476
526	40.68441065
527	40.60721063
528	40.53030303
529	40.45368620
530	40.56603774
531	40.48964218
532	40.41353383
533	40.33771107
534	40.44943820
535	40.56074766
536	40.48507463
537	40.59590317
538	40.70631970
539	40.63079777
540	40.55555556
541	40.48059150
542	40.59040590
543	40.51565378
544	40.62500000
545	40.55045872
546	40.47619048
547	40.40219378
548	40.51094891
549	40.43715847
550	40.36363636
551	40.47186933
552	40.39855072
553	40.32549729
554	40.43321300
555	40.36036036
556	40.46762590
557	40.57450628
558	40.68100358
559	40.60822898
560	40.53571429
561	40.46345811
562	40.56939502
563	40.67495560
564	40.60283688
565	40.70796460
566	40.63604240
567	40.56437390
568	40.49295775
569	40.42179262
570	40.35087719
571	40.28021016
572	40.38461538
573	40.31413613
574	40.24390244
575	40.17391304
576	40.27777778
577	40.38128250
578	40.48442907
579	40.41450777
580	40.51724138
581	40.61962134
582	40.54982818
583	40.65180103
584	40.75342466
585	40.68376068
586	40.78498294
587	40.71550256
588	40.64625850
589	40.57724958
590	40.50847458
591	40.43993232
592	40.37162162
593	40.47217538
594	40.57239057
595	40.50420168
596	40.60402685
597	40.53601340
598	40.46822742
599	40.56761269
600	40.50000000
601	40.43261231
602	40.36544850
603	40.46434494
604	40.39735099
605	40.33057851
606	40.26402640
607	40.36243822
608	40.29605263
609	40.22988506
610	40.16393443
611	40.26186579
612	40.19607843
613	40.29363785
614	40.39087948
615	40.32520325
616	40.42207792
617	40.51863857
618	40.45307443
619	40.38772213
620	40.32258065
621	40.25764895
622	40.19292605
623	40.12841091
624	40.06410256
625	40.00000000
626	40.09584665
627	40.03189793
628	39.96815287
629	40.06359300
630	40.15873016
631	40.09508716
632	40.18987342
633	40.12638231
634	40.06309148
635	40.00000000
636	39.93710692
637	39.87441130
638	39.81191223
639	39.90610329
640	40.00000000
641	39.93759750
642	39.87538941
643	39.81337481
644	39.90683230
645	39.84496124
646	39.78328173
647	39.72179289
648	39.66049383
649	39.59938367
650	39.53846154
651	39.47772657
652	39.41717791
653	39.35681470
654	39.44954128
655	39.38931298
656	39.48170732
657	39.42161339
658	39.51367781
659	39.45371775
660	39.39393939
661	39.33434191
662	39.27492447
663	39.21568627
664	39.15662651
665	39.24812030
666	39.18918919
667	39.28035982
668	39.37125749
669	39.31240658
670	39.40298507
671	39.34426230
672	39.28571429
673	39.22734027
674	39.16913947
675	39.11111111
676	39.05325444
677	38.99556869
678	38.93805310
679	38.88070692
680	38.97058824
681	39.06020558
682	39.00293255
683	38.94582723
684	39.03508772
685	38.97810219
686	39.06705539
687	39.15574964
688	39.09883721
689	39.04208999
690	38.98550725
691	39.07380608
692	39.01734104
693	38.96103896
694	39.04899135
695	38.99280576
696	38.93678161
697	38.88091822
698	38.96848138
699	39.05579399
700	39.00000000
701	39.08701854
702	39.17378917
703	39.26031294
704	39.34659091
705	39.29078014
706	39.23512748
707	39.17963225
708	39.12429379
709	39.06911142
710	39.01408451
711	39.09985935
712	39.18539326
713	39.27068724
714	39.35574230
715	39.30069930
716	39.24581006
717	39.19107392
718	39.13649025
719	39.08205841
720	39.02777778
721	38.97364771
722	38.91966759
723	39.00414938
724	38.95027624
725	39.03448276
726	39.11845730
727	39.06464924
728	39.14835165
729	39.09465021
730	39.04109589
731	38.98768810
732	38.93442623
733	39.01773533
734	39.10081744
735	39.18367347
736	39.13043478
737	39.21302578
738	39.29539295
739	39.37753721
740	39.32432432
741	39.27125506
742	39.35309973
743	39.43472409
744	39.51612903
745	39.46308725
746	39.41018767
747	39.35742972
748	39.30481283
749	39.25233645
750	39.33333333

Final result: 39.3333 +/- 1.7849
Random chance: 25.0000 +/- 1.5822