metadata

license: apache-2.0
base_model: distilgpt2
tags:
  - generated_from_keras_callback
model-index:
  - name: node-py/my_awesome_eli5_clm-model
    results: []

node-py/my_awesome_eli5_clm-model

This model is a fine-tuned version of distilgpt2 on an unknown dataset. After the final training epoch (epoch 37, counting from 0) it reports the following losses, with the validation loss measured on the evaluation set:

Train Loss: 1.5653
Validation Loss: 1.6131
Epoch: 37
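
For a causal language model, the reported loss is usually the mean per-token cross-entropy in nats, and its exponential is the perplexity. The sketch below assumes that convention applies to the losses above; the function name `perplexity` is illustrative and not part of the training code.

```python
import math

# Final-epoch losses copied from the results above.
TRAIN_LOSS = 1.5653
VALIDATION_LOSS = 1.6131


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


if __name__ == "__main__":
    print(f"train perplexity:      {perplexity(TRAIN_LOSS):.2f}")
    print(f"validation perplexity: {perplexity(VALIDATION_LOSS):.2f}")
```

Under that assumption the validation perplexity comes out at roughly 5, which is easier to compare across models that share a tokenizer than the raw loss.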

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
training_precision: float32
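
The optimizer settings above match the keyword arguments of the `AdamWeightDecay` class that Transformers provides for TensorFlow. The following is a minimal sketch of recreating it, assuming TensorFlow and Transformers are installed. `build_optimizer` is a hypothetical helper name, and the `decay: 0.0` entry is deliberately left out because whether it is accepted depends on the Keras version in use.

```python
# Hyperparameters copied from the list above.
# 'decay': 0.0 is omitted on purpose (assumption: it is a no-op at 0.0).
OPTIMIZER_KWARGS = {
    "learning_rate": 2e-05,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}


def build_optimizer():
    """Recreate the optimizer; requires TensorFlow and Transformers."""
    # Imported lazily so the settings can be inspected without TensorFlow.
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**OPTIMIZER_KWARGS)
```

The returned object can be passed to `model.compile(optimizer=...)` on a Keras model. No learning-rate schedule is listed above, so this sketch uses the constant rate of 2e-05.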

Training results

Train Loss	Validation Loss	Epoch
2.5315	2.2372	0
2.2709	2.1303	1
2.1837	2.0685	2
2.1268	2.0216	3
2.0821	1.9830	4
2.0436	1.9497	5
2.0105	1.9194	6
1.9810	1.8955	7
1.9552	1.8767	8
1.9311	1.8544	9
1.9080	1.8386	10
1.8864	1.8183	11
1.8676	1.7983	12
1.8487	1.7856	13
1.8304	1.7766	14
1.8150	1.7672	15
1.7992	1.7472	16
1.7841	1.7402	17
1.7687	1.7266	18
1.7554	1.7215	19
1.7422	1.7091	20
1.7279	1.7099	21
1.7163	1.6969	22
1.7051	1.6856	23
1.6925	1.6795	24
1.6819	1.6712	25
1.6709	1.6665	26
1.6593	1.6606	27
1.6504	1.6572	28
1.6402	1.6542	29
1.6308	1.6493	30
1.6205	1.6393	31
1.6104	1.6329	32
1.5999	1.6361	33
1.5915	1.6329	34
1.5832	1.6229	35
1.5746	1.6142	36
1.5653	1.6131	37
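
The table above can be summarized with a few lines of plain Python. This is an illustrative sketch: the helper names are hypothetical, and the values are copied from the table, with the list index serving as the epoch number.

```python
# (train_loss, validation_loss) per epoch, copied from the table above.
HISTORY = [
    (2.5315, 2.2372), (2.2709, 2.1303), (2.1837, 2.0685), (2.1268, 2.0216),
    (2.0821, 1.9830), (2.0436, 1.9497), (2.0105, 1.9194), (1.9810, 1.8955),
    (1.9552, 1.8767), (1.9311, 1.8544), (1.9080, 1.8386), (1.8864, 1.8183),
    (1.8676, 1.7983), (1.8487, 1.7856), (1.8304, 1.7766), (1.8150, 1.7672),
    (1.7992, 1.7472), (1.7841, 1.7402), (1.7687, 1.7266), (1.7554, 1.7215),
    (1.7422, 1.7091), (1.7279, 1.7099), (1.7163, 1.6969), (1.7051, 1.6856),
    (1.6925, 1.6795), (1.6819, 1.6712), (1.6709, 1.6665), (1.6593, 1.6606),
    (1.6504, 1.6572), (1.6402, 1.6542), (1.6308, 1.6493), (1.6205, 1.6393),
    (1.6104, 1.6329), (1.5999, 1.6361), (1.5915, 1.6329), (1.5832, 1.6229),
    (1.5746, 1.6142), (1.5653, 1.6131),
]


def best_epoch(history):
    """Epoch with the lowest validation loss."""
    return min(range(len(history)), key=lambda e: history[e][1])


def first_crossover(history):
    """First epoch where train loss falls below validation loss, or None."""
    for epoch, (train, val) in enumerate(history):
        if train < val:
            return epoch
    return None


def validation_regressions(history):
    """Epochs whose validation loss is higher than the previous epoch's."""
    return [e for e in range(1, len(history))
            if history[e][1] > history[e - 1][1]]


if __name__ == "__main__":
    print("best epoch:", best_epoch(HISTORY))
    print("first train < validation:", first_crossover(HISTORY))
    print("validation loss rose at:", validation_regressions(HISTORY))
```

On this table the validation loss is lowest at the final epoch, rises briefly at epochs 21 and 33, and the training loss first drops below the validation loss at epoch 27.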

Framework versions

Transformers 4.44.0
TensorFlow 2.16.1
Datasets 2.21.0
Tokenizers 0.19.1