---
license: apache-2.0
base_model: distilgpt2
tags:
  - generated_from_trainer
model-index:
  - name: finetuned-model
    results: []
---

# finetuned-model

This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 6.1626
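
As a quick-start reference, here is a minimal generation sketch using the `transformers` pipeline API. The repo id `abvijaykumar/finetuned-model` is assumed from this card's title and may need adjusting to the actual Hub path.

```python
# Minimal sketch: load the fine-tuned checkpoint for text generation.
# The repo id is an assumption inferred from this card, not confirmed by it.
from transformers import pipeline

generator = pipeline("text-generation", model="abvijaykumar/finetuned-model")
print(generator("Once upon a time", max_new_tokens=40)[0]["generated_text"])
```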

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
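
As a rough reconstruction, the settings above correspond to the standard `transformers` `TrainingArguments` shown below; the output path and evaluation cadence are assumptions not recorded on the card, and the Adam settings listed are simply the Trainer's defaults.

```python
# Hedged sketch: the reported hyperparameters expressed as TrainingArguments.
# output_dir and evaluation_strategy are assumptions not stated on the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="finetuned-model",   # placeholder output path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                 # betas=(0.9, 0.999) and epsilon=1e-08
    adam_beta2=0.999,               # are the Trainer's default Adam values
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",    # assumed: the card logs one eval per epoch
)
```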

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 6 | 6.6915 |
| No log | 2.0 | 12 | 6.6732 |
| No log | 3.0 | 18 | 6.6555 |
| No log | 4.0 | 24 | 6.6385 |
| No log | 5.0 | 30 | 6.6222 |
| No log | 6.0 | 36 | 6.6063 |
| No log | 7.0 | 42 | 6.5910 |
| No log | 8.0 | 48 | 6.5760 |
| No log | 9.0 | 54 | 6.5617 |
| No log | 10.0 | 60 | 6.5482 |
| No log | 11.0 | 66 | 6.5351 |
| No log | 12.0 | 72 | 6.5223 |
| No log | 13.0 | 78 | 6.5101 |
| No log | 14.0 | 84 | 6.4982 |
| No log | 15.0 | 90 | 6.4869 |
| No log | 16.0 | 96 | 6.4758 |
| No log | 17.0 | 102 | 6.4653 |
| No log | 18.0 | 108 | 6.4551 |
| No log | 19.0 | 114 | 6.4453 |
| No log | 20.0 | 120 | 6.4357 |
| No log | 21.0 | 126 | 6.4266 |
| No log | 22.0 | 132 | 6.4177 |
| No log | 23.0 | 138 | 6.4090 |
| No log | 24.0 | 144 | 6.4006 |
| No log | 25.0 | 150 | 6.3924 |
| No log | 26.0 | 156 | 6.3845 |
| No log | 27.0 | 162 | 6.3768 |
| No log | 28.0 | 168 | 6.3696 |
| No log | 29.0 | 174 | 6.3625 |
| No log | 30.0 | 180 | 6.3557 |
| No log | 31.0 | 186 | 6.3489 |
| No log | 32.0 | 192 | 6.3423 |
| No log | 33.0 | 198 | 6.3357 |
| No log | 34.0 | 204 | 6.3294 |
| No log | 35.0 | 210 | 6.3235 |
| No log | 36.0 | 216 | 6.3176 |
| No log | 37.0 | 222 | 6.3119 |
| No log | 38.0 | 228 | 6.3064 |
| No log | 39.0 | 234 | 6.3010 |
| No log | 40.0 | 240 | 6.2957 |
| No log | 41.0 | 246 | 6.2907 |
| No log | 42.0 | 252 | 6.2859 |
| No log | 43.0 | 258 | 6.2811 |
| No log | 44.0 | 264 | 6.2765 |
| No log | 45.0 | 270 | 6.2720 |
| No log | 46.0 | 276 | 6.2675 |
| No log | 47.0 | 282 | 6.2632 |
| No log | 48.0 | 288 | 6.2590 |
| No log | 49.0 | 294 | 6.2550 |
| No log | 50.0 | 300 | 6.2511 |
| No log | 51.0 | 306 | 6.2473 |
| No log | 52.0 | 312 | 6.2437 |
| No log | 53.0 | 318 | 6.2400 |
| No log | 54.0 | 324 | 6.2365 |
| No log | 55.0 | 330 | 6.2331 |
| No log | 56.0 | 336 | 6.2298 |
| No log | 57.0 | 342 | 6.2267 |
| No log | 58.0 | 348 | 6.2237 |
| No log | 59.0 | 354 | 6.2206 |
| No log | 60.0 | 360 | 6.2177 |
| No log | 61.0 | 366 | 6.2149 |
| No log | 62.0 | 372 | 6.2121 |
| No log | 63.0 | 378 | 6.2095 |
| No log | 64.0 | 384 | 6.2068 |
| No log | 65.0 | 390 | 6.2043 |
| No log | 66.0 | 396 | 6.2019 |
| No log | 67.0 | 402 | 6.1995 |
| No log | 68.0 | 408 | 6.1973 |
| No log | 69.0 | 414 | 6.1950 |
| No log | 70.0 | 420 | 6.1929 |
| No log | 71.0 | 426 | 6.1909 |
| No log | 72.0 | 432 | 6.1889 |
| No log | 73.0 | 438 | 6.1870 |
| No log | 74.0 | 444 | 6.1851 |
| No log | 75.0 | 450 | 6.1834 |
| No log | 76.0 | 456 | 6.1817 |
| No log | 77.0 | 462 | 6.1801 |
| No log | 78.0 | 468 | 6.1786 |
| No log | 79.0 | 474 | 6.1772 |
| No log | 80.0 | 480 | 6.1758 |
| No log | 81.0 | 486 | 6.1745 |
| No log | 82.0 | 492 | 6.1733 |
| No log | 83.0 | 498 | 6.1722 |
| 6.7021 | 84.0 | 504 | 6.1711 |
| 6.7021 | 85.0 | 510 | 6.1701 |
| 6.7021 | 86.0 | 516 | 6.1691 |
| 6.7021 | 87.0 | 522 | 6.1683 |
| 6.7021 | 88.0 | 528 | 6.1674 |
| 6.7021 | 89.0 | 534 | 6.1666 |
| 6.7021 | 90.0 | 540 | 6.1660 |
| 6.7021 | 91.0 | 546 | 6.1653 |
| 6.7021 | 92.0 | 552 | 6.1647 |
| 6.7021 | 93.0 | 558 | 6.1642 |
| 6.7021 | 94.0 | 564 | 6.1638 |
| 6.7021 | 95.0 | 570 | 6.1634 |
| 6.7021 | 96.0 | 576 | 6.1631 |
| 6.7021 | 97.0 | 582 | 6.1629 |
| 6.7021 | 98.0 | 588 | 6.1627 |
| 6.7021 | 99.0 | 594 | 6.1626 |
| 6.7021 | 100.0 | 600 | 6.1626 |
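
For intuition, assuming the reported loss is the Trainer's usual mean per-token cross-entropy in nats, the final validation loss corresponds to a perplexity of roughly exp(6.1626) ≈ 475, as the sketch below computes.

```python
# Perplexity implied by the final validation loss, assuming the loss is the
# standard per-token cross-entropy in nats (the transformers Trainer default).
import math

final_eval_loss = 6.1626
print(f"perplexity ≈ {math.exp(final_eval_loss):.1f}")  # ≈ 474.7
```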

### Framework versions

- Transformers 4.33.1
- Pytorch 2.0.1
- Datasets 2.14.5
- Tokenizers 0.13.3