---
license: apache-2.0
base_model: distilgpt2
tags:
  - generated_from_keras_callback
model-index:
  - name: node-py/my_awesome_eli5_clm-model
    results: []
---

# node-py/my_awesome_eli5_clm-model

This model is a fine-tuned version of distilgpt2 on an unknown dataset. It achieves the following results on the evaluation set:

- Train Loss: 1.5653
- Validation Loss: 1.6131
- Epoch: 37

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
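
To make the optimizer settings concrete, here is a minimal, illustrative sketch of the decoupled weight-decay (AdamW-style) update rule for a single scalar parameter, using the hyperparameters listed above. This is not the actual `AdamWeightDecay` implementation from `transformers`, only a plain-Python illustration of the update it performs:

```python
import math

def adamw_step(param, grad, m, v, t,
               lr=2e-5, beta_1=0.9, beta_2=0.999,
               eps=1e-7, weight_decay_rate=0.01):
    """One AdamW update for a scalar parameter (illustrative sketch)."""
    m = beta_1 * m + (1 - beta_1) * grad       # first-moment estimate
    v = beta_2 * v + (1 - beta_2) * grad ** 2  # second-moment estimate
    m_hat = m / (1 - beta_1 ** t)              # bias correction
    v_hat = v / (1 - beta_2 ** t)
    # Decoupled weight decay: applied directly to the parameter,
    # not folded into the gradient.
    param = param - lr * (m_hat / (math.sqrt(v_hat) + eps)
                          + weight_decay_rate * param)
    return param, m, v

# Three steps with a constant gradient of 0.5 on a parameter at 1.0.
p, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):
    p, m, v = adamw_step(p, 0.5, m, v, t)
```

With `learning_rate=2e-05` each step moves the parameter by roughly `lr` per step, which is why dozens of epochs are needed for the loss curve below to flatten out.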

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 2.5315     | 2.2372          | 0     |
| 2.2709     | 2.1303          | 1     |
| 2.1837     | 2.0685          | 2     |
| 2.1268     | 2.0216          | 3     |
| 2.0821     | 1.9830          | 4     |
| 2.0436     | 1.9497          | 5     |
| 2.0105     | 1.9194          | 6     |
| 1.9810     | 1.8955          | 7     |
| 1.9552     | 1.8767          | 8     |
| 1.9311     | 1.8544          | 9     |
| 1.9080     | 1.8386          | 10    |
| 1.8864     | 1.8183          | 11    |
| 1.8676     | 1.7983          | 12    |
| 1.8487     | 1.7856          | 13    |
| 1.8304     | 1.7766          | 14    |
| 1.8150     | 1.7672          | 15    |
| 1.7992     | 1.7472          | 16    |
| 1.7841     | 1.7402          | 17    |
| 1.7687     | 1.7266          | 18    |
| 1.7554     | 1.7215          | 19    |
| 1.7422     | 1.7091          | 20    |
| 1.7279     | 1.7099          | 21    |
| 1.7163     | 1.6969          | 22    |
| 1.7051     | 1.6856          | 23    |
| 1.6925     | 1.6795          | 24    |
| 1.6819     | 1.6712          | 25    |
| 1.6709     | 1.6665          | 26    |
| 1.6593     | 1.6606          | 27    |
| 1.6504     | 1.6572          | 28    |
| 1.6402     | 1.6542          | 29    |
| 1.6308     | 1.6493          | 30    |
| 1.6205     | 1.6393          | 31    |
| 1.6104     | 1.6329          | 32    |
| 1.5999     | 1.6361          | 33    |
| 1.5915     | 1.6329          | 34    |
| 1.5832     | 1.6229          | 35    |
| 1.5746     | 1.6142          | 36    |
| 1.5653     | 1.6131          | 37    |
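
Since these are cross-entropy losses, a quick sanity check is to convert the final epoch's numbers to perplexity, which is simply `exp(loss)`:

```python
import math

# Final-epoch (37) losses from the table above.
train_ppl = math.exp(1.5653)  # train perplexity, roughly 4.8
val_ppl = math.exp(1.6131)    # validation perplexity, roughly 5.0
```

The small gap between train and validation perplexity suggests the model was not yet strongly overfitting at epoch 37.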

### Framework versions

- Transformers 4.44.0
- TensorFlow 2.16.1
- Datasets 2.21.0
- Tokenizers 0.19.1