metadata

license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: flan-t5-large-extraction-cnndm_4000-all
    results: []

flan-t5-large-extraction-cnndm_4000-all

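This model is a fine-tuned version of google/flan-t5-large. The training dataset was not recorded (the card generator lists it as None). It achieves the following results on the evaluation set, which match the step 1000 (epoch 2.0) row of the training results: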

Loss: 1.7290
Rouge1: 35.0775
Rouge2: 15.2209
Rougel: 30.1796
Rougelsum: 30.1599
Gen Len: 19.0

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 24
seed: 1799
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
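
As a rough illustration of what a linear schedule means for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (none are listed above) and 500 optimizer steps per epoch, a figure inferred from the training log, where step 200 corresponds to epoch 0.4. The function and constant names are illustrative and are not taken from the training script.

```python
# Sketch of a linear learning-rate schedule (optional warmup, then linear decay).
# Assumptions not stated in the card: zero warmup steps, and
# 500 optimizer steps per epoch (inferred from step 200 == epoch 0.4).

BASE_LR = 2e-05          # learning_rate from the card
NUM_EPOCHS = 10          # num_epochs from the card
STEPS_PER_EPOCH = 500    # assumption: 200 steps / 0.4 epoch
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Return the learning rate at `step` under linear warmup then linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 1000, 1600, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the rate at step 1600, the last step logged in this card, would be about 1.36e-05, and a full 10-epoch run would reach zero at step 5000.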

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
2.1464	0.4	200	1.8323	35.2242	15.3495	30.142	30.1331	19.0
1.9817	0.8	400	1.7729	34.3798	14.7287	29.5447	29.6052	18.986
1.8842	1.2	600	1.7602	34.5807	15.1707	29.7768	29.8081	18.986
1.8129	1.6	800	1.7629	34.5103	15.231	29.9182	29.9333	19.0
1.8238	2.0	1000	1.7290	35.0775	15.2209	30.1796	30.1599	19.0
1.7199	2.4	1200	1.7354	34.6552	15.7256	30.1894	30.2207	18.998
1.7128	2.8	1400	1.7407	34.7198	15.5771	30.0585	30.0442	19.0
1.6816	3.2	1600	1.7508	34.9611	15.5792	30.3518	30.3638	19.0
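
The headline metrics match the row with the lowest validation loss (step 1000). The sketch below, with rows copied from the table above, shows how that row could be picked out and which rows score highest on the individual ROUGE metrics. The helper name best_by is illustrative only, and whether the original run selected its checkpoint this way is an assumption.

```python
# Rows copied from the training results table:
# (step, validation_loss, rouge1, rouge2, rougeL)
ROWS = [
    (200, 1.8323, 35.2242, 15.3495, 30.142),
    (400, 1.7729, 34.3798, 14.7287, 29.5447),
    (600, 1.7602, 34.5807, 15.1707, 29.7768),
    (800, 1.7629, 34.5103, 15.231, 29.9182),
    (1000, 1.7290, 35.0775, 15.2209, 30.1796),
    (1200, 1.7354, 34.6552, 15.7256, 30.1894),
    (1400, 1.7407, 34.7198, 15.5771, 30.0585),
    (1600, 1.7508, 34.9611, 15.5792, 30.3518),
]


def best_by(rows, column, higher_is_better):
    """Return the row with the best value in `column` (hypothetical helper)."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


best_loss = best_by(ROWS, 1, higher_is_better=False)
best_rouge1 = best_by(ROWS, 2, higher_is_better=True)
best_rouge2 = best_by(ROWS, 3, higher_is_better=True)
best_rougel = best_by(ROWS, 4, higher_is_better=True)

if __name__ == "__main__":
    print("lowest validation loss at step", best_loss[0])
    print("highest Rouge1 at step", best_rouge1[0])
    print("highest Rouge2 at step", best_rouge2[0])
    print("highest RougeL at step", best_rougel[0])
```

On these rows the lowest validation loss is at step 1000, while the highest Rouge1, Rouge2, and RougeL fall at steps 200, 1200, and 1600 respectively, so the choice of selection metric changes which checkpoint looks best.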

Framework versions

Transformers 4.18.0
PyTorch 1.10.0+cu111
Datasets 2.5.1
Tokenizers 0.12.1