---
library_name: transformers
license: apache-2.0
base_model: distilgpt2
tags:
  - generated_from_trainer
model-index:
  - name: debiased_gpt_2
    results: []
---

# debiased_gpt_2

This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on an unknown dataset. It achieves the following result on the evaluation set:

- Loss: 0.6430
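
Until the card is filled in, a minimal loading sketch may help. The repo id `ncantalupa/debiased_gpt_2` is inferred from this card's title and may not match where the checkpoint actually lives:

```python
from transformers import pipeline

# Repo id inferred from the card title; adjust if the checkpoint lives elsewhere.
generator = pipeline("text-generation", model="ncantalupa/debiased_gpt_2")

print(generator("The nurse said that", max_new_tokens=30)[0]["generated_text"])
```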

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 50
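
These values map onto `transformers.TrainingArguments` roughly as follows. A minimal sketch, assuming single-device training (so `per_device_*_batch_size` equals the batch sizes above) and a placeholder `output_dir`:

```python
from transformers import TrainingArguments

# Only the hyperparameter values come from this card; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="debiased_gpt_2",
    learning_rate=2e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=50,
)
```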

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 4    | 1.9343          |
| No log        | 2.0   | 8    | 1.8912          |
| No log        | 3.0   | 12   | 1.6980          |
| No log        | 4.0   | 16   | 1.5873          |
| No log        | 5.0   | 20   | 1.5396          |
| No log        | 6.0   | 24   | 1.4100          |
| No log        | 7.0   | 28   | 1.3086          |
| No log        | 8.0   | 32   | 1.2451          |
| No log        | 9.0   | 36   | 1.1922          |
| No log        | 10.0  | 40   | 1.0959          |
| No log        | 11.0  | 44   | 1.0326          |
| No log        | 12.0  | 48   | 0.9796          |
| No log        | 13.0  | 52   | 0.9345          |
| No log        | 14.0  | 56   | 0.9033          |
| No log        | 15.0  | 60   | 0.8493          |
| No log        | 16.0  | 64   | 0.8235          |
| No log        | 17.0  | 68   | 0.8197          |
| No log        | 18.0  | 72   | 0.7852          |
| No log        | 19.0  | 76   | 0.7588          |
| No log        | 20.0  | 80   | 0.7384          |
| No log        | 21.0  | 84   | 0.7320          |
| No log        | 22.0  | 88   | 0.7220          |
| No log        | 23.0  | 92   | 0.7097          |
| No log        | 24.0  | 96   | 0.7019          |
| No log        | 25.0  | 100  | 0.6938          |
| No log        | 26.0  | 104  | 0.6912          |
| No log        | 27.0  | 108  | 0.6869          |
| No log        | 28.0  | 112  | 0.6760          |
| No log        | 29.0  | 116  | 0.6719          |
| No log        | 30.0  | 120  | 0.6732          |
| No log        | 31.0  | 124  | 0.6657          |
| No log        | 32.0  | 128  | 0.6610          |
| No log        | 33.0  | 132  | 0.6611          |
| No log        | 34.0  | 136  | 0.6624          |
| No log        | 35.0  | 140  | 0.6622          |
| No log        | 36.0  | 144  | 0.6595          |
| No log        | 37.0  | 148  | 0.6564          |
| No log        | 38.0  | 152  | 0.6523          |
| No log        | 39.0  | 156  | 0.6498          |
| No log        | 40.0  | 160  | 0.6493          |
| No log        | 41.0  | 164  | 0.6502          |
| No log        | 42.0  | 168  | 0.6503          |
| No log        | 43.0  | 172  | 0.6485          |
| No log        | 44.0  | 176  | 0.6471          |
| No log        | 45.0  | 180  | 0.6466          |
| No log        | 46.0  | 184  | 0.6458          |
| No log        | 47.0  | 188  | 0.6449          |
| No log        | 48.0  | 192  | 0.6438          |
| No log        | 49.0  | 196  | 0.6432          |
| No log        | 50.0  | 200  | 0.6430          |
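
The Training Loss column reads "No log" presumably because the run's 200 optimizer steps never reached the Trainer's default logging interval (500 steps), so only per-epoch validation losses were recorded. Assuming the reported loss is the usual token-level cross-entropy for causal language modeling, the final validation loss corresponds to a perplexity of roughly exp(0.6430) ≈ 1.90:

```python
import math

# Perplexity of a causal LM is exp(mean per-token cross-entropy loss).
final_validation_loss = 0.6430
perplexity = math.exp(final_validation_loss)
print(f"{perplexity:.3f}")  # ~1.902
```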

### Framework versions

- Transformers 4.46.3
- Pytorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.20.3