metadata
license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: distilgpt2-custom-functions-dataset-python
    results: []

distilgpt2-custom-functions-dataset-python

This model is a fine-tuned version of distilgpt2. The training dataset is not specified in this card (the auto-generated text reports it as None). It achieves the following results on the evaluation set:

  • Loss: 1.0449
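For language models, an evaluation loss is often easier to read as a perplexity. The sketch below converts the loss reported above, assuming it is the mean per-token cross-entropy in nats (the card does not state the units, so treat the result as indicative only).

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 1.0449

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```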

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
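As a rough illustration, the values above can be written as a plain dictionary keyed by the matching `transformers.TrainingArguments` argument names, together with a small function showing what a linear schedule implies for the learning rate over the run. This is a sketch, not the original training script: the per-device batch sizes assume a single device, the total of 430 steps is taken from the last row of the training results table, and `warmup_steps=0` is an assumption because the card lists no warmup setting. `linear_lr` is a hypothetical helper written for this example.

```python
# Hyperparameters from the list above, keyed by the corresponding
# transformers.TrainingArguments argument names.
hparams = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 8,  # assumes a single device
    "per_device_eval_batch_size": 8,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}

TOTAL_STEPS = 430  # last step in the training results table


def linear_lr(step, base_lr=hparams["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate after `step` optimizer updates under linear decay.

    warmup_steps=0 is an assumption; the card does not list warmup settings.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 215, 430):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 2e-05, is about half that at step 215, and reaches zero at step 430. The dictionary could be passed on as `TrainingArguments(output_dir=..., **hparams)`, where the output directory is left for the reader to choose.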

Training results

Training Loss Epoch Step Validation Loss
3.2459 0.02 1 2.9042
3.145 0.05 2 2.7960
3.152 0.07 3 2.7130
2.8803 0.09 4 2.6500
3.0041 0.12 5 2.5872
2.9888 0.14 6 2.5306
2.8988 0.16 7 2.4758
2.8324 0.19 8 2.4277
2.7391 0.21 9 2.3884
2.5464 0.23 10 2.3563
2.7032 0.26 11 2.3296
2.5574 0.28 12 2.2887
2.4388 0.3 13 2.2435
2.4412 0.33 14 2.2067
2.5202 0.35 15 2.1748
2.539 0.37 16 2.1492
2.3816 0.4 17 2.1224
2.2194 0.42 18 2.0913
2.3143 0.44 19 2.0629
2.4338 0.47 20 2.0381
2.2438 0.49 21 2.0147
2.2331 0.51 22 1.9929
2.3193 0.53 23 1.9737
2.4079 0.56 24 1.9516
2.0871 0.58 25 1.9325
2.0421 0.6 26 1.9131
2.0962 0.63 27 1.8915
2.2952 0.65 28 1.8700
2.0131 0.67 29 1.8527
2.0945 0.7 30 1.8395
2.0249 0.72 31 1.8241
1.7714 0.74 32 1.8035
2.1155 0.77 33 1.7834
2.0701 0.79 34 1.7706
2.1761 0.81 35 1.7681
2.024 0.84 36 1.7678
2.0026 0.86 37 1.7592
1.9519 0.88 38 1.7403
2.0078 0.91 39 1.7192
2.0886 0.93 40 1.7038
1.9394 0.95 41 1.6932
1.9368 0.98 42 1.6838
1.9336 1.0 43 1.6746
1.9114 1.02 44 1.6668
1.6817 1.05 45 1.6600
1.7199 1.07 46 1.6530
2.1471 1.09 47 1.6443
1.876 1.12 48 1.6326
2.0242 1.14 49 1.6220
1.7853 1.16 50 1.6124
1.8848 1.19 51 1.6044
1.9169 1.21 52 1.5957
1.8757 1.23 53 1.5863
1.7563 1.26 54 1.5776
1.8712 1.28 55 1.5684
1.5925 1.3 56 1.5592
1.8532 1.33 57 1.5499
1.8703 1.35 58 1.5420
1.8791 1.37 59 1.5394
1.7589 1.4 60 1.5366
1.7686 1.42 61 1.5292
1.7732 1.44 62 1.5185
1.7275 1.47 63 1.5062
1.5971 1.49 64 1.4932
1.8588 1.51 65 1.4846
1.7586 1.53 66 1.4779
1.6546 1.56 67 1.4715
1.7612 1.58 68 1.4659
1.8496 1.6 69 1.4611
1.7064 1.63 70 1.4560
1.7603 1.65 71 1.4510
1.8635 1.67 72 1.4483
1.6867 1.7 73 1.4458
1.7442 1.72 74 1.4410
1.6289 1.74 75 1.4360
1.6314 1.77 76 1.4298
1.7503 1.79 77 1.4234
1.8372 1.81 78 1.4191
1.6873 1.84 79 1.4144
1.5076 1.86 80 1.4099
1.6742 1.88 81 1.4053
1.7419 1.91 82 1.4014
1.7564 1.93 83 1.3965
1.6603 1.95 84 1.3927
1.5957 1.98 85 1.3890
1.5423 2.0 86 1.3843
1.8173 2.02 87 1.3799
1.6287 2.05 88 1.3760
1.5382 2.07 89 1.3733
1.7807 2.09 90 1.3720
1.4352 2.12 91 1.3698
1.6497 2.14 92 1.3691
1.5239 2.16 93 1.3673
1.5729 2.19 94 1.3659
1.601 2.21 95 1.3638
1.5532 2.23 96 1.3605
1.6227 2.26 97 1.3579
1.4461 2.28 98 1.3540
1.6625 2.3 99 1.3497
1.4653 2.33 100 1.3454
1.5603 2.35 101 1.3407
1.6055 2.37 102 1.3369
1.688 2.4 103 1.3339
1.6214 2.42 104 1.3329
1.6715 2.44 105 1.3312
1.4717 2.47 106 1.3275
1.6041 2.49 107 1.3241
1.3908 2.51 108 1.3196
1.4255 2.53 109 1.3160
1.6491 2.56 110 1.3138
1.4839 2.58 111 1.3117
1.6332 2.6 112 1.3103
1.5875 2.63 113 1.3095
1.6515 2.65 114 1.3085
1.5876 2.67 115 1.3047
1.571 2.7 116 1.2994
1.5104 2.72 117 1.2942
1.5415 2.74 118 1.2912
1.5497 2.77 119 1.2890
1.4377 2.79 120 1.2866
1.6653 2.81 121 1.2837
1.8368 2.84 122 1.2820
1.3668 2.86 123 1.2807
1.5136 2.88 124 1.2790
1.4872 2.91 125 1.2770
1.5815 2.93 126 1.2747
1.5128 2.95 127 1.2733
1.6116 2.98 128 1.2718
1.1527 3.0 129 1.2696
1.7834 3.02 130 1.2659
1.4598 3.05 131 1.2609
1.3641 3.07 132 1.2572
1.3965 3.09 133 1.2533
1.5373 3.12 134 1.2502
1.5436 3.14 135 1.2475
1.4855 3.16 136 1.2462
1.4651 3.19 137 1.2450
1.5451 3.21 138 1.2424
1.371 3.23 139 1.2399
1.494 3.26 140 1.2363
1.4795 3.28 141 1.2334
1.4884 3.3 142 1.2316
1.3325 3.33 143 1.2296
1.4838 3.35 144 1.2270
1.6316 3.37 145 1.2246
1.7781 3.4 146 1.2223
1.5818 3.42 147 1.2199
1.3451 3.44 148 1.2181
1.3948 3.47 149 1.2159
1.3582 3.49 150 1.2141
1.308 3.51 151 1.2128
1.5047 3.53 152 1.2115
1.5315 3.56 153 1.2111
1.292 3.58 154 1.2100
1.3923 3.6 155 1.2087
1.4429 3.63 156 1.2072
1.4175 3.65 157 1.2053
1.2825 3.67 158 1.2027
1.6203 3.7 159 1.2009
1.3873 3.72 160 1.1996
1.3914 3.74 161 1.1979
1.397 3.77 162 1.1965
1.3203 3.79 163 1.1955
1.3472 3.81 164 1.1944
1.4608 3.84 165 1.1924
1.5623 3.86 166 1.1910
1.3188 3.88 167 1.1899
1.5339 3.91 168 1.1891
1.2599 3.93 169 1.1891
1.4341 3.95 170 1.1896
1.5623 3.98 171 1.1902
1.4875 4.0 172 1.1911
1.4397 4.02 173 1.1928
1.3585 4.05 174 1.1952
1.5946 4.07 175 1.1954
1.2969 4.09 176 1.1936
1.4746 4.12 177 1.1905
1.2995 4.14 178 1.1869
1.3867 4.16 179 1.1837
1.3056 4.19 180 1.1811
1.4109 4.21 181 1.1792
1.3159 4.23 182 1.1778
1.5643 4.26 183 1.1758
1.565 4.28 184 1.1742
1.3642 4.3 185 1.1725
1.2332 4.33 186 1.1713
1.4746 4.35 187 1.1706
1.6079 4.37 188 1.1706
1.4417 4.4 189 1.1710
1.5759 4.42 190 1.1716
1.4531 4.44 191 1.1730
1.1457 4.47 192 1.1725
1.3571 4.49 193 1.1699
1.3083 4.51 194 1.1667
1.4778 4.53 195 1.1629
1.4744 4.56 196 1.1596
1.3267 4.58 197 1.1576
1.3062 4.6 198 1.1559
1.4942 4.63 199 1.1541
1.3481 4.65 200 1.1523
1.336 4.67 201 1.1504
1.178 4.7 202 1.1484
1.4255 4.72 203 1.1470
1.2686 4.74 204 1.1463
1.2701 4.77 205 1.1462
1.4147 4.79 206 1.1471
1.1865 4.81 207 1.1472
1.6298 4.84 208 1.1466
1.4168 4.86 209 1.1448
1.2948 4.88 210 1.1428
1.3592 4.91 211 1.1403
1.1308 4.93 212 1.1380
1.4951 4.95 213 1.1365
1.2861 4.98 214 1.1356
1.5495 5.0 215 1.1353
1.3874 5.02 216 1.1352
1.4681 5.05 217 1.1352
1.3233 5.07 218 1.1348
1.3016 5.09 219 1.1342
1.1397 5.12 220 1.1336
1.4272 5.14 221 1.1335
1.2665 5.16 222 1.1340
1.3082 5.19 223 1.1344
1.3535 5.21 224 1.1344
1.3061 5.23 225 1.1335
1.6763 5.26 226 1.1323
1.364 5.28 227 1.1309
1.3174 5.3 228 1.1295
1.392 5.33 229 1.1283
1.2964 5.35 230 1.1268
1.4763 5.37 231 1.1254
1.5131 5.4 232 1.1246
1.3607 5.42 233 1.1236
1.1999 5.44 234 1.1220
1.3008 5.47 235 1.1198
1.4308 5.49 236 1.1182
1.3658 5.51 237 1.1169
1.237 5.53 238 1.1162
1.2438 5.56 239 1.1157
1.2446 5.58 240 1.1148
1.1169 5.6 241 1.1144
1.1357 5.63 242 1.1138
1.2117 5.65 243 1.1135
1.6094 5.67 244 1.1132
1.4128 5.7 245 1.1121
1.1179 5.72 246 1.1108
1.5971 5.74 247 1.1090
1.4945 5.77 248 1.1075
1.3786 5.79 249 1.1064
1.2543 5.81 250 1.1055
1.2173 5.84 251 1.1051
1.2902 5.86 252 1.1045
1.349 5.88 253 1.1040
1.4361 5.91 254 1.1039
1.2974 5.93 255 1.1043
1.3718 5.95 256 1.1054
1.3251 5.98 257 1.1058
1.0946 6.0 258 1.1054
1.5179 6.02 259 1.1044
1.2111 6.05 260 1.1029
1.2518 6.07 261 1.1017
1.3981 6.09 262 1.1007
1.2579 6.12 263 1.0997
1.3362 6.14 264 1.0988
1.3431 6.16 265 1.0979
1.1871 6.19 266 1.0966
1.3084 6.21 267 1.0955
1.274 6.23 268 1.0943
1.3527 6.26 269 1.0930
1.1882 6.28 270 1.0921
1.4504 6.3 271 1.0914
1.2413 6.33 272 1.0907
1.3834 6.35 273 1.0905
1.4718 6.37 274 1.0902
1.2267 6.4 275 1.0900
1.2695 6.42 276 1.0903
1.3234 6.44 277 1.0900
1.3627 6.47 278 1.0897
1.261 6.49 279 1.0893
1.2832 6.51 280 1.0886
1.2676 6.53 281 1.0876
1.2157 6.56 282 1.0866
1.4472 6.58 283 1.0855
1.5994 6.6 284 1.0846
1.1464 6.63 285 1.0838
1.383 6.65 286 1.0832
1.3188 6.67 287 1.0828
1.1299 6.7 288 1.0825
1.332 6.72 289 1.0822
1.3761 6.74 290 1.0817
1.2563 6.77 291 1.0812
1.2101 6.79 292 1.0807
1.0416 6.81 293 1.0803
1.2677 6.84 294 1.0804
1.3173 6.86 295 1.0804
1.276 6.88 296 1.0809
1.3172 6.91 297 1.0819
1.3541 6.93 298 1.0829
1.167 6.95 299 1.0828
1.3084 6.98 300 1.0818
1.4632 7.0 301 1.0802
1.2737 7.02 302 1.0787
1.339 7.05 303 1.0771
1.2764 7.07 304 1.0755
1.1898 7.09 305 1.0738
1.0969 7.12 306 1.0725
1.3272 7.14 307 1.0713
1.3145 7.16 308 1.0706
1.3092 7.19 309 1.0700
1.4391 7.21 310 1.0693
1.3287 7.23 311 1.0687
1.1278 7.26 312 1.0678
1.3053 7.28 313 1.0669
1.3056 7.3 314 1.0661
1.2655 7.33 315 1.0654
1.2001 7.35 316 1.0652
1.2686 7.37 317 1.0652
1.2423 7.4 318 1.0655
1.2005 7.42 319 1.0660
1.2882 7.44 320 1.0662
1.2379 7.47 321 1.0662
1.1779 7.49 322 1.0656
1.2649 7.51 323 1.0654
1.3644 7.53 324 1.0654
1.3904 7.56 325 1.0653
1.1946 7.58 326 1.0647
1.3145 7.6 327 1.0643
1.3241 7.63 328 1.0641
1.2184 7.65 329 1.0636
1.3236 7.67 330 1.0633
1.1561 7.7 331 1.0630
1.1665 7.72 332 1.0628
1.1769 7.74 333 1.0626
1.3381 7.77 334 1.0623
1.3596 7.79 335 1.0620
1.2975 7.81 336 1.0615
1.3536 7.84 337 1.0610
1.2781 7.86 338 1.0606
1.3603 7.88 339 1.0602
1.3506 7.91 340 1.0598
1.2219 7.93 341 1.0596
1.3136 7.95 342 1.0595
1.4301 7.98 343 1.0596
1.2822 8.0 344 1.0595
1.3211 8.02 345 1.0592
1.1138 8.05 346 1.0591
1.3094 8.07 347 1.0589
1.3455 8.09 348 1.0586
1.1251 8.12 349 1.0582
1.1669 8.14 350 1.0577
1.2009 8.16 351 1.0572
1.2289 8.19 352 1.0570
1.294 8.21 353 1.0567
1.3906 8.23 354 1.0565
1.1368 8.26 355 1.0562
1.255 8.28 356 1.0560
1.2174 8.3 357 1.0557
1.3693 8.33 358 1.0553
1.3344 8.35 359 1.0549
1.4558 8.37 360 1.0546
1.3396 8.4 361 1.0543
1.0955 8.42 362 1.0541
1.2509 8.44 363 1.0539
1.1303 8.47 364 1.0538
1.2599 8.49 365 1.0537
1.1635 8.51 366 1.0536
1.2345 8.53 367 1.0534
1.2716 8.56 368 1.0534
1.3879 8.58 369 1.0533
1.2097 8.6 370 1.0532
1.2209 8.63 371 1.0533
1.3697 8.65 372 1.0536
1.2573 8.67 373 1.0539
1.251 8.7 374 1.0541
1.1992 8.72 375 1.0543
1.3785 8.74 376 1.0543
1.4019 8.77 377 1.0539
1.1751 8.79 378 1.0536
1.2954 8.81 379 1.0533
1.2145 8.84 380 1.0526
1.2806 8.86 381 1.0519
1.4749 8.88 382 1.0511
1.3058 8.91 383 1.0505
1.2426 8.93 384 1.0500
1.2222 8.95 385 1.0495
1.1779 8.98 386 1.0492
1.275 9.0 387 1.0489
1.3287 9.02 388 1.0487
1.4447 9.05 389 1.0486
1.3054 9.07 390 1.0484
1.2826 9.09 391 1.0481
1.1033 9.12 392 1.0479
1.3264 9.14 393 1.0477
1.2907 9.16 394 1.0475
1.2442 9.19 395 1.0473
1.1201 9.21 396 1.0472
1.3126 9.23 397 1.0470
1.1136 9.26 398 1.0469
1.1632 9.28 399 1.0468
1.1759 9.3 400 1.0466
1.3374 9.33 401 1.0466
1.2542 9.35 402 1.0465
0.9985 9.37 403 1.0465
1.0251 9.4 404 1.0465
1.2837 9.42 405 1.0464
1.4331 9.44 406 1.0463
1.3613 9.47 407 1.0462
1.3092 9.49 408 1.0461
1.2802 9.51 409 1.0460
1.199 9.53 410 1.0460
1.4372 9.56 411 1.0459
1.1819 9.58 412 1.0459
1.1887 9.6 413 1.0458
1.0923 9.63 414 1.0457
1.3018 9.65 415 1.0456
1.2139 9.67 416 1.0454
1.264 9.7 417 1.0453
1.276 9.72 418 1.0453
1.1202 9.74 419 1.0452
1.3544 9.77 420 1.0452
1.1664 9.79 421 1.0452
1.2598 9.81 422 1.0451
1.3336 9.84 423 1.0451
1.3495 9.86 424 1.0450
1.3178 9.88 425 1.0450
1.2631 9.91 426 1.0449
1.1517 9.93 427 1.0449
1.1755 9.95 428 1.0449
1.2306 9.98 429 1.0449
1.1321 10.0 430 1.0449
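The table also allows a few quantities to be derived that the card does not state directly. The sketch below parses four rows copied from the table, computes the number of optimizer steps per epoch, and bounds the training set size. The size estimate assumes one device, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these are confirmed by the card. `parse_row` is a hypothetical helper written for this example.

```python
# Rows copied from the table above:
# training loss, epoch, step, validation loss.
rows = """\
3.2459 0.02 1 2.9042
1.9336 1.0 43 1.6746
1.5495 5.0 215 1.1353
1.1321 10.0 430 1.0449
"""


def parse_row(line):
    """Split one whitespace-separated table row into typed fields."""
    train_loss, epoch, step, val_loss = line.split()
    return {
        "train_loss": float(train_loss),
        "epoch": float(epoch),
        "step": int(step),
        "val_loss": float(val_loss),
    }


records = [parse_row(line) for line in rows.splitlines()]
first, last = records[0], records[-1]

# 430 steps over 10 epochs.
steps_per_epoch = last["step"] / last["epoch"]

train_batch_size = 8  # from the hyperparameter list

# Assumes one device, no gradient accumulation, last partial batch kept.
max_examples = int(steps_per_epoch) * train_batch_size
min_examples = (int(steps_per_epoch) - 1) * train_batch_size + 1

# Total reduction in validation loss between the first and last evaluation.
val_loss_drop = first["val_loss"] - last["val_loss"]

print(steps_per_epoch, min_examples, max_examples, round(val_loss_drop, 4))
```

With these assumptions the run works out to 43 steps per epoch, which corresponds to roughly 337 to 344 training examples. That is a small dataset, and it is consistent with the validation loss flattening out over the final epochs.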

Framework versions

  • Transformers 4.29.2
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3