---
license: mit
base_model: openai-community/gpt2
tags:
- generated_from_trainer
datasets:
- gokuls/wiki_book_corpus_raw_dataset_tiny
metrics:
- accuracy
model-index:
- name: gpt_train_2_768
  results:
  - task:
      name: Causal Language Modeling
      type: text-generation
    dataset:
      name: gokuls/wiki_book_corpus_raw_dataset_tiny
      type: gokuls/wiki_book_corpus_raw_dataset_tiny
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.10393614847954215
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# gpt_train_2_768

This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on the gokuls/wiki_book_corpus_raw_dataset_tiny dataset.
At the last logged step (step 392, epoch 0.0380), it achieves the following results on the evaluation set:
- Loss: 7.4883
- Accuracy: 0.1039
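
For readers who prefer perplexity, the evaluation loss can be converted with `exp(loss)`. This is a minimal sketch that assumes the reported loss is the mean per-token cross-entropy in nats; the card itself does not state the unit.

```python
import math

# Evaluation loss reported above; assumed to be mean per-token cross-entropy in nats.
eval_loss = 7.4883

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)

print(f"perplexity is roughly {perplexity:.0f}")
```

A perplexity in this range is consistent with the low next-token accuracy reported alongside it.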

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 10
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
- mixed_precision_training: Native AMP
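
To illustrate what `lr_scheduler_type: linear` implies for the learning rate, here is a small pure-Python sketch of a linear warmup and decay rule. The function name `linear_lr`, the zero warmup, and the value of `total_steps` are assumptions for illustration; the card reports neither warmup steps nor the planned number of optimizer steps.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 1e-05, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # After warmup, decay linearly so the rate reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical schedule length; not taken from the card.
total_steps = 1_000_000
for step in (0, 392, 500_000, 1_000_000):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the learning rate is still very close to `1e-05` at step 392, because only a tiny fraction of the schedule has elapsed.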

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy |
|:-------------:|:------:|:----:|:---------------:|:--------:|
| 10.9688       | 0.0001 | 1    | 10.9688         | 0.0000   |
| 10.9609       | 0.0002 | 2    | 10.9688         | 0.0000   |
| 10.9609       | 0.0003 | 3    | 10.9688         | 0.0000   |
| 10.9609       | 0.0004 | 4    | 10.9688         | 0.0000   |
| 10.9609       | 0.0005 | 5    | 10.9688         | 0.0000   |
| 10.9688       | 0.0006 | 6    | 10.9688         | 0.0000   |
| 10.9609       | 0.0007 | 7    | 10.9688         | 0.0000   |
| 10.9609       | 0.0008 | 8    | 10.9688         | 0.0000   |
| 10.9688       | 0.0009 | 9    | 10.9688         | 0.0000   |
| 10.9531       | 0.0010 | 10   | 10.9688         | 0.0000   |
| 10.9688       | 0.0011 | 11   | 10.9688         | 0.0000   |
| 10.9688       | 0.0012 | 12   | 10.9688         | 0.0000   |
| 10.9531       | 0.0013 | 13   | 10.9688         | 0.0000   |
| 10.9609       | 0.0014 | 14   | 10.9688         | 0.0000   |
| 10.9688       | 0.0015 | 15   | 10.9688         | 0.0000   |
| 10.9766       | 0.0015 | 16   | 10.9688         | 0.0000   |
| 10.9688       | 0.0016 | 17   | 10.9688         | 0.0000   |
| 10.9609       | 0.0017 | 18   | 10.8828         | 0.0007   |
| 10.8906       | 0.0018 | 19   | 10.8047         | 0.0051   |
| 10.8359       | 0.0019 | 20   | 10.7188         | 0.0112   |
| 10.75         | 0.0020 | 21   | 10.6484         | 0.0175   |
| 10.6719       | 0.0021 | 22   | 10.5781         | 0.0280   |
| 10.6172       | 0.0022 | 23   | 10.5            | 0.0392   |
| 10.5391       | 0.0023 | 24   | 10.4375         | 0.0447   |
| 10.5078       | 0.0024 | 25   | 10.3828         | 0.0478   |
| 10.4609       | 0.0025 | 26   | 10.3125         | 0.0499   |
| 10.3906       | 0.0026 | 27   | 10.2656         | 0.0511   |
| 10.3281       | 0.0027 | 28   | 10.2109         | 0.0521   |
| 10.2656       | 0.0028 | 29   | 10.1641         | 0.0531   |
| 10.25         | 0.0029 | 30   | 10.1172         | 0.0537   |
| 10.2031       | 0.0030 | 31   | 10.0703         | 0.0544   |
| 10.1641       | 0.0031 | 32   | 10.0312         | 0.0552   |
| 10.125        | 0.0032 | 33   | 9.9922          | 0.0558   |
| 10.0859       | 0.0033 | 34   | 9.9609          | 0.0562   |
| 10.0391       | 0.0034 | 35   | 9.9219          | 0.0566   |
| 10.0156       | 0.0035 | 36   | 9.8906          | 0.0568   |
| 9.9609        | 0.0036 | 37   | 9.8594          | 0.0567   |
| 9.9141        | 0.0037 | 38   | 9.8359          | 0.0566   |
| 9.875         | 0.0038 | 39   | 9.8047          | 0.0568   |
| 9.8672        | 0.0039 | 40   | 9.7812          | 0.0569   |
| 9.8438        | 0.0040 | 41   | 9.7578          | 0.0568   |
| 9.7969        | 0.0041 | 42   | 9.7344          | 0.0565   |
| 9.8203        | 0.0042 | 43   | 9.7109          | 0.0564   |
| 9.7891        | 0.0043 | 44   | 9.6875          | 0.0564   |
| 9.7031        | 0.0044 | 45   | 9.6719          | 0.0566   |
| 9.7344        | 0.0045 | 46   | 9.6484          | 0.0569   |
| 9.7266        | 0.0046 | 47   | 9.6328          | 0.0573   |
| 9.7031        | 0.0046 | 48   | 9.6172          | 0.0579   |
| 9.7109        | 0.0047 | 49   | 9.6016          | 0.0585   |
| 9.6406        | 0.0048 | 50   | 9.5781          | 0.0591   |
| 9.6797        | 0.0049 | 51   | 9.5625          | 0.0597   |
| 9.6328        | 0.0050 | 52   | 9.5469          | 0.0605   |
| 9.6172        | 0.0051 | 53   | 9.5312          | 0.0612   |
| 9.6172        | 0.0052 | 54   | 9.5234          | 0.0615   |
| 9.5703        | 0.0053 | 55   | 9.5078          | 0.0617   |
| 9.5781        | 0.0054 | 56   | 9.4922          | 0.0618   |
| 9.5938        | 0.0055 | 57   | 9.4766          | 0.0620   |
| 9.5391        | 0.0056 | 58   | 9.4688          | 0.0621   |
| 9.4922        | 0.0057 | 59   | 9.4531          | 0.0620   |
| 9.4688        | 0.0058 | 60   | 9.4375          | 0.0620   |
| 9.4922        | 0.0059 | 61   | 9.4297          | 0.0620   |
| 9.4609        | 0.0060 | 62   | 9.4141          | 0.0620   |
| 9.4297        | 0.0061 | 63   | 9.4062          | 0.0620   |
| 9.4844        | 0.0062 | 64   | 9.3906          | 0.0620   |
| 9.4531        | 0.0063 | 65   | 9.3828          | 0.0622   |
| 9.4375        | 0.0064 | 66   | 9.3672          | 0.0625   |
| 9.4375        | 0.0065 | 67   | 9.3594          | 0.0628   |
| 9.3984        | 0.0066 | 68   | 9.3438          | 0.0630   |
| 9.4062        | 0.0067 | 69   | 9.3359          | 0.0632   |
| 9.3984        | 0.0068 | 70   | 9.3203          | 0.0633   |
| 9.4375        | 0.0069 | 71   | 9.3125          | 0.0633   |
| 9.3828        | 0.0070 | 72   | 9.3047          | 0.0634   |
| 9.3594        | 0.0071 | 73   | 9.2891          | 0.0634   |
| 9.3438        | 0.0072 | 74   | 9.2812          | 0.0634   |
| 9.3672        | 0.0073 | 75   | 9.2734          | 0.0634   |
| 9.3125        | 0.0074 | 76   | 9.2578          | 0.0634   |
| 9.3047        | 0.0075 | 77   | 9.25            | 0.0633   |
| 9.2969        | 0.0076 | 78   | 9.2422          | 0.0632   |
| 9.2891        | 0.0077 | 79   | 9.2266          | 0.0631   |
| 9.2812        | 0.0077 | 80   | 9.2188          | 0.0631   |
| 9.2656        | 0.0078 | 81   | 9.2109          | 0.0632   |
| 9.2422        | 0.0079 | 82   | 9.2031          | 0.0633   |
| 9.2656        | 0.0080 | 83   | 9.1875          | 0.0635   |
| 9.25          | 0.0081 | 84   | 9.1797          | 0.0637   |
| 9.2344        | 0.0082 | 85   | 9.1719          | 0.0639   |
| 9.2266        | 0.0083 | 86   | 9.1562          | 0.0640   |
| 9.25          | 0.0084 | 87   | 9.1484          | 0.0641   |
| 9.1406        | 0.0085 | 88   | 9.1406          | 0.0641   |
| 9.1562        | 0.0086 | 89   | 9.1328          | 0.0642   |
| 9.2031        | 0.0087 | 90   | 9.1172          | 0.0641   |
| 9.1406        | 0.0088 | 91   | 9.1094          | 0.0642   |
| 9.1406        | 0.0089 | 92   | 9.1016          | 0.0643   |
| 9.1406        | 0.0090 | 93   | 9.0938          | 0.0644   |
| 9.1328        | 0.0091 | 94   | 9.0781          | 0.0644   |
| 9.125         | 0.0092 | 95   | 9.0703          | 0.0645   |
| 9.1016        | 0.0093 | 96   | 9.0625          | 0.0646   |
| 9.125         | 0.0094 | 97   | 9.0547          | 0.0648   |
| 9.0625        | 0.0095 | 98   | 9.0391          | 0.0652   |
| 9.0859        | 0.0096 | 99   | 9.0312          | 0.0655   |
| 9.0547        | 0.0097 | 100  | 9.0234          | 0.0657   |
| 9.0547        | 0.0098 | 101  | 9.0156          | 0.0658   |
| 9.0625        | 0.0099 | 102  | 9.0078          | 0.0659   |
| 9.0547        | 0.0100 | 103  | 8.9922          | 0.0661   |
| 9.0156        | 0.0101 | 104  | 8.9844          | 0.0662   |
| 9.0391        | 0.0102 | 105  | 8.9766          | 0.0664   |
| 9.0234        | 0.0103 | 106  | 8.9688          | 0.0664   |
| 9.0234        | 0.0104 | 107  | 8.9609          | 0.0664   |
| 8.9766        | 0.0105 | 108  | 8.9453          | 0.0664   |
| 8.9922        | 0.0106 | 109  | 8.9375          | 0.0665   |
| 8.9453        | 0.0107 | 110  | 8.9297          | 0.0665   |
| 8.9609        | 0.0108 | 111  | 8.9219          | 0.0664   |
| 8.9766        | 0.0108 | 112  | 8.9141          | 0.0664   |
| 8.9844        | 0.0109 | 113  | 8.8984          | 0.0666   |
| 8.9453        | 0.0110 | 114  | 8.8906          | 0.0669   |
| 8.9688        | 0.0111 | 115  | 8.8828          | 0.0673   |
| 8.9766        | 0.0112 | 116  | 8.875           | 0.0677   |
| 8.9297        | 0.0113 | 117  | 8.8672          | 0.0682   |
| 8.9297        | 0.0114 | 118  | 8.8594          | 0.0689   |
| 8.8672        | 0.0115 | 119  | 8.8516          | 0.0694   |
| 8.8906        | 0.0116 | 120  | 8.8359          | 0.0700   |
| 8.8984        | 0.0117 | 121  | 8.8281          | 0.0703   |
| 8.8984        | 0.0118 | 122  | 8.8203          | 0.0704   |
| 8.8828        | 0.0119 | 123  | 8.8125          | 0.0706   |
| 8.8594        | 0.0120 | 124  | 8.8047          | 0.0707   |
| 8.8281        | 0.0121 | 125  | 8.7969          | 0.0708   |
| 8.8359        | 0.0122 | 126  | 8.7812          | 0.0710   |
| 8.8359        | 0.0123 | 127  | 8.7734          | 0.0711   |
| 8.8281        | 0.0124 | 128  | 8.7656          | 0.0710   |
| 8.8438        | 0.0125 | 129  | 8.7578          | 0.0707   |
| 8.7578        | 0.0126 | 130  | 8.75            | 0.0702   |
| 8.7812        | 0.0127 | 131  | 8.7422          | 0.0698   |
| 8.7734        | 0.0128 | 132  | 8.7344          | 0.0697   |
| 8.7812        | 0.0129 | 133  | 8.7266          | 0.0701   |
| 8.7891        | 0.0130 | 134  | 8.7188          | 0.0707   |
| 8.7656        | 0.0131 | 135  | 8.7031          | 0.0713   |
| 8.7891        | 0.0132 | 136  | 8.6953          | 0.0719   |
| 8.7188        | 0.0133 | 137  | 8.6875          | 0.0726   |
| 8.7266        | 0.0134 | 138  | 8.6797          | 0.0733   |
| 8.75          | 0.0135 | 139  | 8.6719          | 0.0737   |
| 8.7188        | 0.0136 | 140  | 8.6641          | 0.0740   |
| 8.7344        | 0.0137 | 141  | 8.6562          | 0.0742   |
| 8.6641        | 0.0138 | 142  | 8.6484          | 0.0742   |
| 8.7031        | 0.0139 | 143  | 8.6406          | 0.0741   |
| 8.6797        | 0.0139 | 144  | 8.6328          | 0.0741   |
| 8.6797        | 0.0140 | 145  | 8.6172          | 0.0739   |
| 8.6719        | 0.0141 | 146  | 8.6094          | 0.0736   |
| 8.6641        | 0.0142 | 147  | 8.6016          | 0.0736   |
| 8.6484        | 0.0143 | 148  | 8.5938          | 0.0737   |
| 8.6172        | 0.0144 | 149  | 8.5859          | 0.0741   |
| 8.6719        | 0.0145 | 150  | 8.5781          | 0.0746   |
| 8.6406        | 0.0146 | 151  | 8.5703          | 0.0750   |
| 8.6172        | 0.0147 | 152  | 8.5625          | 0.0754   |
| 8.6094        | 0.0148 | 153  | 8.5547          | 0.0756   |
| 8.6016        | 0.0149 | 154  | 8.5469          | 0.0756   |
| 8.5625        | 0.0150 | 155  | 8.5391          | 0.0755   |
| 8.5312        | 0.0151 | 156  | 8.5312          | 0.0756   |
| 8.5703        | 0.0152 | 157  | 8.5234          | 0.0756   |
| 8.6172        | 0.0153 | 158  | 8.5156          | 0.0757   |
| 8.5781        | 0.0154 | 159  | 8.5078          | 0.0757   |
| 8.6016        | 0.0155 | 160  | 8.5             | 0.0759   |
| 8.5547        | 0.0156 | 161  | 8.4922          | 0.0762   |
| 8.5547        | 0.0157 | 162  | 8.4844          | 0.0766   |
| 8.5312        | 0.0158 | 163  | 8.4766          | 0.0767   |
| 8.5           | 0.0159 | 164  | 8.4688          | 0.0767   |
| 8.5312        | 0.0160 | 165  | 8.4609          | 0.0766   |
| 8.5312        | 0.0161 | 166  | 8.4531          | 0.0766   |
| 8.4531        | 0.0162 | 167  | 8.4453          | 0.0767   |
| 8.4766        | 0.0163 | 168  | 8.4375          | 0.0768   |
| 8.4766        | 0.0164 | 169  | 8.4297          | 0.0770   |
| 8.4688        | 0.0165 | 170  | 8.4219          | 0.0772   |
| 8.4922        | 0.0166 | 171  | 8.4141          | 0.0775   |
| 8.4375        | 0.0167 | 172  | 8.4141          | 0.0777   |
| 8.4609        | 0.0168 | 173  | 8.4062          | 0.0777   |
| 8.4141        | 0.0169 | 174  | 8.3984          | 0.0777   |
| 8.4531        | 0.0170 | 175  | 8.3906          | 0.0778   |
| 8.3984        | 0.0170 | 176  | 8.3828          | 0.0778   |
| 8.4141        | 0.0171 | 177  | 8.375           | 0.0779   |
| 8.4453        | 0.0172 | 178  | 8.3672          | 0.0781   |
| 8.4219        | 0.0173 | 179  | 8.3594          | 0.0783   |
| 8.4219        | 0.0174 | 180  | 8.3516          | 0.0785   |
| 8.4062        | 0.0175 | 181  | 8.3438          | 0.0785   |
| 8.3984        | 0.0176 | 182  | 8.3359          | 0.0787   |
| 8.3828        | 0.0177 | 183  | 8.3281          | 0.0790   |
| 8.375         | 0.0178 | 184  | 8.3203          | 0.0792   |
| 8.3594        | 0.0179 | 185  | 8.3125          | 0.0795   |
| 8.375         | 0.0180 | 186  | 8.3125          | 0.0797   |
| 8.3125        | 0.0181 | 187  | 8.3047          | 0.0796   |
| 8.3438        | 0.0182 | 188  | 8.2969          | 0.0796   |
| 8.3281        | 0.0183 | 189  | 8.2891          | 0.0795   |
| 8.3359        | 0.0184 | 190  | 8.2812          | 0.0795   |
| 8.3047        | 0.0185 | 191  | 8.2734          | 0.0798   |
| 8.3359        | 0.0186 | 192  | 8.2656          | 0.0800   |
| 8.3047        | 0.0187 | 193  | 8.2578          | 0.0803   |
| 8.2969        | 0.0188 | 194  | 8.2578          | 0.0805   |
| 8.3203        | 0.0189 | 195  | 8.25            | 0.0807   |
| 8.2734        | 0.0190 | 196  | 8.2422          | 0.0809   |
| 8.25          | 0.0191 | 197  | 8.2344          | 0.0809   |
| 8.2734        | 0.0192 | 198  | 8.2266          | 0.0810   |
| 8.2109        | 0.0193 | 199  | 8.2188          | 0.0809   |
| 8.25          | 0.0194 | 200  | 8.2109          | 0.0809   |
| 8.2734        | 0.0195 | 201  | 8.2031          | 0.0810   |
| 8.2188        | 0.0196 | 202  | 8.2031          | 0.0812   |
| 8.2578        | 0.0197 | 203  | 8.1953          | 0.0816   |
| 8.2344        | 0.0198 | 204  | 8.1875          | 0.0819   |
| 8.2969        | 0.0199 | 205  | 8.1797          | 0.0823   |
| 8.2812        | 0.0200 | 206  | 8.1719          | 0.0825   |
| 8.2578        | 0.0201 | 207  | 8.1641          | 0.0824   |
| 8.2031        | 0.0201 | 208  | 8.1641          | 0.0824   |
| 8.1953        | 0.0202 | 209  | 8.1562          | 0.0822   |
| 8.2344        | 0.0203 | 210  | 8.1484          | 0.0821   |
| 8.1484        | 0.0204 | 211  | 8.1406          | 0.0822   |
| 8.2188        | 0.0205 | 212  | 8.1328          | 0.0824   |
| 8.1406        | 0.0206 | 213  | 8.1328          | 0.0826   |
| 8.1641        | 0.0207 | 214  | 8.125           | 0.0829   |
| 8.1328        | 0.0208 | 215  | 8.1172          | 0.0831   |
| 8.1875        | 0.0209 | 216  | 8.1094          | 0.0833   |
| 8.1719        | 0.0210 | 217  | 8.1016          | 0.0835   |
| 8.125         | 0.0211 | 218  | 8.1016          | 0.0835   |
| 8.1172        | 0.0212 | 219  | 8.0938          | 0.0835   |
| 8.1172        | 0.0213 | 220  | 8.0859          | 0.0834   |
| 8.1562        | 0.0214 | 221  | 8.0781          | 0.0835   |
| 8.0781        | 0.0215 | 222  | 8.0781          | 0.0838   |
| 8.1094        | 0.0216 | 223  | 8.0703          | 0.0840   |
| 8.0938        | 0.0217 | 224  | 8.0625          | 0.0843   |
| 8.0938        | 0.0218 | 225  | 8.0547          | 0.0846   |
| 8.1016        | 0.0219 | 226  | 8.0469          | 0.0847   |
| 8.1094        | 0.0220 | 227  | 8.0469          | 0.0846   |
| 8.1016        | 0.0221 | 228  | 8.0391          | 0.0844   |
| 8.0859        | 0.0222 | 229  | 8.0312          | 0.0844   |
| 8.0859        | 0.0223 | 230  | 8.0312          | 0.0845   |
| 8.1094        | 0.0224 | 231  | 8.0234          | 0.0849   |
| 8.1016        | 0.0225 | 232  | 8.0156          | 0.0853   |
| 8.0859        | 0.0226 | 233  | 8.0078          | 0.0856   |
| 8.0859        | 0.0227 | 234  | 8.0078          | 0.0857   |
| 8.0781        | 0.0228 | 235  | 8.0             | 0.0857   |
| 8.0234        | 0.0229 | 236  | 7.9922          | 0.0856   |
| 8.0391        | 0.0230 | 237  | 7.9883          | 0.0855   |
| 8.0078        | 0.0231 | 238  | 7.9844          | 0.0855   |
| 8.0078        | 0.0232 | 239  | 7.9766          | 0.0857   |
| 7.9883        | 0.0232 | 240  | 7.9727          | 0.0862   |
| 7.9805        | 0.0233 | 241  | 7.9648          | 0.0865   |
| 8.0234        | 0.0234 | 242  | 7.9609          | 0.0868   |
| 7.9961        | 0.0235 | 243  | 7.9570          | 0.0870   |
| 8.0156        | 0.0236 | 244  | 7.9492          | 0.0870   |
| 7.9766        | 0.0237 | 245  | 7.9453          | 0.0869   |
| 7.9297        | 0.0238 | 246  | 7.9414          | 0.0866   |
| 7.9336        | 0.0239 | 247  | 7.9375          | 0.0865   |
| 7.9219        | 0.0240 | 248  | 7.9297          | 0.0866   |
| 7.957         | 0.0241 | 249  | 7.9258          | 0.0869   |
| 7.9453        | 0.0242 | 250  | 7.9180          | 0.0874   |
| 7.9805        | 0.0243 | 251  | 7.9141          | 0.0879   |
| 7.9531        | 0.0244 | 252  | 7.9102          | 0.0883   |
| 7.9102        | 0.0245 | 253  | 7.9062          | 0.0885   |
| 7.9844        | 0.0246 | 254  | 7.8984          | 0.0886   |
| 7.9414        | 0.0247 | 255  | 7.8945          | 0.0885   |
| 7.9453        | 0.0248 | 256  | 7.8906          | 0.0883   |
| 7.9219        | 0.0249 | 257  | 7.8867          | 0.0883   |
| 7.9141        | 0.0250 | 258  | 7.8828          | 0.0885   |
| 7.9258        | 0.0251 | 259  | 7.875           | 0.0889   |
| 7.957         | 0.0252 | 260  | 7.8711          | 0.0893   |
| 7.8984        | 0.0253 | 261  | 7.8672          | 0.0896   |
| 7.8945        | 0.0254 | 262  | 7.8633          | 0.0898   |
| 7.9141        | 0.0255 | 263  | 7.8594          | 0.0899   |
| 7.9453        | 0.0256 | 264  | 7.8555          | 0.0899   |
| 7.8672        | 0.0257 | 265  | 7.8477          | 0.0900   |
| 7.9375        | 0.0258 | 266  | 7.8438          | 0.0902   |
| 7.9219        | 0.0259 | 267  | 7.8398          | 0.0905   |
| 7.8555        | 0.0260 | 268  | 7.8359          | 0.0907   |
| 7.8984        | 0.0261 | 269  | 7.8320          | 0.0908   |
| 7.8906        | 0.0262 | 270  | 7.8281          | 0.0909   |
| 7.8711        | 0.0263 | 271  | 7.8242          | 0.0910   |
| 7.8633        | 0.0263 | 272  | 7.8203          | 0.0909   |
| 7.8633        | 0.0264 | 273  | 7.8164          | 0.0909   |
| 7.8789        | 0.0265 | 274  | 7.8125          | 0.0909   |
| 7.8438        | 0.0266 | 275  | 7.8086          | 0.0910   |
| 7.8789        | 0.0267 | 276  | 7.8047          | 0.0911   |
| 7.8516        | 0.0268 | 277  | 7.8008          | 0.0912   |
| 7.8711        | 0.0269 | 278  | 7.7969          | 0.0913   |
| 7.8008        | 0.0270 | 279  | 7.7930          | 0.0916   |
| 7.8477        | 0.0271 | 280  | 7.7891          | 0.0918   |
| 7.8086        | 0.0272 | 281  | 7.7852          | 0.0919   |
| 7.8398        | 0.0273 | 282  | 7.7812          | 0.0920   |
| 7.8008        | 0.0274 | 283  | 7.7773          | 0.0922   |
| 7.8281        | 0.0275 | 284  | 7.7734          | 0.0922   |
| 7.7852        | 0.0276 | 285  | 7.7695          | 0.0926   |
| 7.793         | 0.0277 | 286  | 7.7656          | 0.0929   |
| 7.8086        | 0.0278 | 287  | 7.7617          | 0.0931   |
| 7.7812        | 0.0279 | 288  | 7.7578          | 0.0931   |
| 7.793         | 0.0280 | 289  | 7.7539          | 0.0931   |
| 7.7539        | 0.0281 | 290  | 7.75            | 0.0931   |
| 7.75          | 0.0282 | 291  | 7.7461          | 0.0930   |
| 7.8164        | 0.0283 | 292  | 7.7422          | 0.0930   |
| 7.7539        | 0.0284 | 293  | 7.7422          | 0.0931   |
| 7.8086        | 0.0285 | 294  | 7.7383          | 0.0932   |
| 7.793         | 0.0286 | 295  | 7.7344          | 0.0936   |
| 7.7695        | 0.0287 | 296  | 7.7305          | 0.0937   |
| 7.75          | 0.0288 | 297  | 7.7266          | 0.0938   |
| 7.7891        | 0.0289 | 298  | 7.7227          | 0.0938   |
| 7.7773        | 0.0290 | 299  | 7.7188          | 0.0936   |
| 7.7227        | 0.0291 | 300  | 7.7148          | 0.0935   |
| 7.7109        | 0.0292 | 301  | 7.7148          | 0.0937   |
| 7.7148        | 0.0293 | 302  | 7.7109          | 0.0939   |
| 7.7812        | 0.0294 | 303  | 7.7070          | 0.0940   |
| 7.7109        | 0.0294 | 304  | 7.7031          | 0.0941   |
| 7.7539        | 0.0295 | 305  | 7.6992          | 0.0942   |
| 7.7734        | 0.0296 | 306  | 7.6992          | 0.0943   |
| 7.6914        | 0.0297 | 307  | 7.6953          | 0.0943   |
| 7.6445        | 0.0298 | 308  | 7.6914          | 0.0944   |
| 7.6953        | 0.0299 | 309  | 7.6875          | 0.0945   |
| 7.75          | 0.0300 | 310  | 7.6836          | 0.0946   |
| 7.7539        | 0.0301 | 311  | 7.6836          | 0.0949   |
| 7.6953        | 0.0302 | 312  | 7.6797          | 0.0951   |
| 7.7188        | 0.0303 | 313  | 7.6758          | 0.0951   |
| 7.6914        | 0.0304 | 314  | 7.6719          | 0.0953   |
| 7.7344        | 0.0305 | 315  | 7.6719          | 0.0954   |
| 7.7383        | 0.0306 | 316  | 7.6680          | 0.0953   |
| 7.6875        | 0.0307 | 317  | 7.6641          | 0.0950   |
| 7.6914        | 0.0308 | 318  | 7.6602          | 0.0947   |
| 7.6758        | 0.0309 | 319  | 7.6602          | 0.0945   |
| 7.6836        | 0.0310 | 320  | 7.6562          | 0.0947   |
| 7.6914        | 0.0311 | 321  | 7.6523          | 0.0950   |
| 7.6719        | 0.0312 | 322  | 7.6523          | 0.0954   |
| 7.6914        | 0.0313 | 323  | 7.6484          | 0.0958   |
| 7.6094        | 0.0314 | 324  | 7.6445          | 0.0961   |
| 7.7148        | 0.0315 | 325  | 7.6406          | 0.0962   |
| 7.6641        | 0.0316 | 326  | 7.6406          | 0.0961   |
| 7.6602        | 0.0317 | 327  | 7.6367          | 0.0961   |
| 7.7031        | 0.0318 | 328  | 7.6328          | 0.0963   |
| 7.6953        | 0.0319 | 329  | 7.6328          | 0.0966   |
| 7.6445        | 0.0320 | 330  | 7.6289          | 0.0968   |
| 7.6445        | 0.0321 | 331  | 7.625           | 0.0969   |
| 7.6445        | 0.0322 | 332  | 7.625           | 0.0969   |
| 7.668         | 0.0323 | 333  | 7.6211          | 0.0968   |
| 7.6523        | 0.0324 | 334  | 7.6172          | 0.0967   |
| 7.6602        | 0.0325 | 335  | 7.6172          | 0.0968   |
| 7.6328        | 0.0325 | 336  | 7.6133          | 0.0972   |
| 7.6523        | 0.0326 | 337  | 7.6094          | 0.0976   |
| 7.6133        | 0.0327 | 338  | 7.6094          | 0.0981   |
| 7.6367        | 0.0328 | 339  | 7.6055          | 0.0984   |
| 7.6641        | 0.0329 | 340  | 7.6016          | 0.0985   |
| 7.6367        | 0.0330 | 341  | 7.6016          | 0.0985   |
| 7.6133        | 0.0331 | 342  | 7.5977          | 0.0985   |
| 7.6016        | 0.0332 | 343  | 7.5977          | 0.0984   |
| 7.668         | 0.0333 | 344  | 7.5938          | 0.0984   |
| 7.6172        | 0.0334 | 345  | 7.5898          | 0.0984   |
| 7.6016        | 0.0335 | 346  | 7.5898          | 0.0985   |
| 7.6328        | 0.0336 | 347  | 7.5859          | 0.0985   |
| 7.668         | 0.0337 | 348  | 7.5820          | 0.0986   |
| 7.6719        | 0.0338 | 349  | 7.5820          | 0.0987   |
| 7.6602        | 0.0339 | 350  | 7.5781          | 0.0989   |
| 7.6641        | 0.0340 | 351  | 7.5742          | 0.0992   |
| 7.6445        | 0.0341 | 352  | 7.5742          | 0.0994   |
| 7.5781        | 0.0342 | 353  | 7.5703          | 0.0995   |
| 7.6523        | 0.0343 | 354  | 7.5703          | 0.0996   |
| 7.6562        | 0.0344 | 355  | 7.5664          | 0.0996   |
| 7.5977        | 0.0345 | 356  | 7.5664          | 0.0998   |
| 7.5977        | 0.0346 | 357  | 7.5625          | 0.0998   |
| 7.5508        | 0.0347 | 358  | 7.5625          | 0.0997   |
| 7.6172        | 0.0348 | 359  | 7.5586          | 0.0997   |
| 7.5469        | 0.0349 | 360  | 7.5547          | 0.0997   |
| 7.6172        | 0.0350 | 361  | 7.5547          | 0.0997   |
| 7.625         | 0.0351 | 362  | 7.5508          | 0.0998   |
| 7.6289        | 0.0352 | 363  | 7.5508          | 0.0999   |
| 7.5234        | 0.0353 | 364  | 7.5469          | 0.1002   |
| 7.5703        | 0.0354 | 365  | 7.5430          | 0.1006   |
| 7.5859        | 0.0355 | 366  | 7.5430          | 0.1010   |
| 7.5469        | 0.0356 | 367  | 7.5391          | 0.1014   |
| 7.5508        | 0.0356 | 368  | 7.5391          | 0.1016   |
| 7.6172        | 0.0357 | 369  | 7.5352          | 0.1017   |
| 7.6172        | 0.0358 | 370  | 7.5352          | 0.1017   |
| 7.5352        | 0.0359 | 371  | 7.5312          | 0.1018   |
| 7.5859        | 0.0360 | 372  | 7.5312          | 0.1018   |
| 7.5586        | 0.0361 | 373  | 7.5273          | 0.1017   |
| 7.6406        | 0.0362 | 374  | 7.5273          | 0.1017   |
| 7.5273        | 0.0363 | 375  | 7.5234          | 0.1018   |
| 7.5312        | 0.0364 | 376  | 7.5195          | 0.1020   |
| 7.5898        | 0.0365 | 377  | 7.5195          | 0.1023   |
| 7.5898        | 0.0366 | 378  | 7.5156          | 0.1027   |
| 7.543         | 0.0367 | 379  | 7.5156          | 0.1029   |
| 7.5156        | 0.0368 | 380  | 7.5117          | 0.1030   |
| 7.5664        | 0.0369 | 381  | 7.5117          | 0.1031   |
| 7.5625        | 0.0370 | 382  | 7.5078          | 0.1031   |
| 7.5312        | 0.0371 | 383  | 7.5078          | 0.1032   |
| 7.625         | 0.0372 | 384  | 7.5078          | 0.1032   |
| 7.5898        | 0.0373 | 385  | 7.5039          | 0.1034   |
| 7.5625        | 0.0374 | 386  | 7.5             | 0.1035   |
| 7.5664        | 0.0375 | 387  | 7.5             | 0.1037   |
| 7.4609        | 0.0376 | 388  | 7.4961          | 0.1039   |
| 7.5469        | 0.0377 | 389  | 7.4961          | 0.1040   |
| 7.5742        | 0.0378 | 390  | 7.4922          | 0.1040   |
| 7.4375        | 0.0379 | 391  | 7.4922          | 0.1040   |
| 7.4961        | 0.0380 | 392  | 7.4883          | 0.1039   |
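
The table stops at epoch 0.0380, well short of the configured `num_epochs: 100`. The sketch below estimates the number of optimizer steps per epoch from the last logged row. It is only an estimate: the epoch column is rounded to four decimals, and the variable names are illustrative.

```python
# Last logged row of the table above.
last_step = 392
last_epoch = 0.0380  # rounded to four decimals in the table

# Point estimate of optimizer steps per epoch.
steps_per_epoch = last_step / last_epoch

# Because of rounding, the true epoch value lies in [0.03795, 0.03805).
low = last_step / 0.03805
high = last_step / 0.03795

print(f"steps per epoch: about {steps_per_epoch:.0f} (between {low:.0f} and {high:.0f})")
print(f"fraction of one epoch covered: {last_epoch:.1%}")
```

By this estimate the logged run covers under 4% of a single pass over the training data.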


### Framework versions

- Transformers 4.41.2
- PyTorch 2.1.0a0+32f93b1
- Datasets 2.20.0
- Tokenizers 0.19.1