flan-t5-base-prompt_tuning-cnn-dailymail

This model is a version of google/flan-t5-base adapted to the cnn_dailymail dataset with prompt tuning via PEFT. It achieves the following results on the evaluation set (these match the final epoch in the training results):

Loss: 19.3074
Rouge1: 0.0787
Rouge2: 0.0088
Rougel: 0.0609
Rougelsum: 0.0733
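
For readers unfamiliar with the metric, the sketch below illustrates what a ROUGE-1 F-measure captures: unigram overlap between a generated summary and a reference. It is a simplified illustration, not the evaluation code used for this model. The function name `rouge1_f1` and the lowercase whitespace tokenization are assumptions; standard ROUGE implementations use their own tokenization and optional stemming, so their scores will differ.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F-measure (hypothetical helper, whitespace tokens)."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped unigram overlap: each token counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the cat sat", "the cat ran"))
```

A Rouge1 of 0.0787 on this scale means that, on average, very little of the generated text overlaps with the reference summaries.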

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.03
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 40
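
The hyperparameters above can be combined with the step counts in the training results (188 optimizer steps per epoch) to sketch the learning-rate schedule. This is a minimal illustration, assuming zero warmup steps and a linear decay to zero over the whole run; neither is stated in this card. The helper `linear_lr` is hypothetical. The examples-per-epoch figure additionally assumes a single device and no gradient accumulation.

```python
INITIAL_LR = 0.03        # learning_rate from the card
NUM_EPOCHS = 40          # num_epochs from the card
TRAIN_BATCH_SIZE = 8     # train_batch_size from the card
STEPS_PER_EPOCH = 188    # step count at epoch 1.0 in the training results

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 40 * 188 = 7520

# Upper bound on training examples seen per epoch (the last batch may be
# partial). Assumes one device and no gradient accumulation.
MAX_EXAMPLES_PER_EPOCH = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE  # 1504


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Linear schedule: optional linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup.
    """
    if step < warmup_steps:
        return INITIAL_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return INITIAL_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    for step in (0, STEPS_PER_EPOCH, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate falls from 0.03 at step 0 to half that value at step 3760 (epoch 20) and reaches zero at step 7520.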

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum
20.9307	1.0	188	19.4344	0.1471	0.0433	0.1103	0.1337
20.4274	2.0	376	20.1245	0.1199	0.0299	0.0953	0.1135
20.1641	3.0	564	19.5964	0.1178	0.024	0.0909	0.1072
20.5294	4.0	752	19.2955	0.1164	0.0213	0.0882	0.1055
20.6452	5.0	940	19.4288	0.1179	0.0239	0.0895	0.1072
20.6916	6.0	1128	19.1208	0.0997	0.0186	0.0795	0.093
20.8065	7.0	1316	18.9300	0.0865	0.0116	0.0688	0.08
20.1431	8.0	1504	19.7751	0.1118	0.0247	0.0869	0.1023
20.5281	9.0	1692	20.0590	0.1216	0.0278	0.0923	0.1118
20.1805	10.0	1880	19.3949	0.1025	0.0145	0.0818	0.0948
20.4289	11.0	2068	19.1645	0.0844	0.0086	0.0656	0.0753
20.1469	12.0	2256	19.4850	0.0905	0.0062	0.0697	0.0831
20.9285	13.0	2444	19.3351	0.0853	0.0077	0.067	0.0785
20.1419	14.0	2632	19.1241	0.0886	0.0097	0.0684	0.0822
20.5547	15.0	2820	19.1532	0.0897	0.0077	0.0704	0.0804
19.5719	16.0	3008	19.2346	0.0885	0.0107	0.0659	0.0794
20.3043	17.0	3196	19.3873	0.105	0.0188	0.0829	0.0949
20.5935	18.0	3384	19.3345	0.1132	0.0203	0.0874	0.1025
20.413	19.0	3572	18.8964	0.0751	0.0065	0.0593	0.0686
19.9286	20.0	3760	18.8474	0.0813	0.0082	0.0648	0.0725
19.9246	21.0	3948	19.3425	0.0844	0.0096	0.0694	0.0765
20.4844	22.0	4136	19.4680	0.1012	0.0143	0.0782	0.0923
20.1571	23.0	4324	19.5483	0.0808	0.0093	0.0665	0.0762
20.0099	24.0	4512	18.5052	0.056	0.0029	0.0479	0.0516
19.6279	25.0	4700	18.7629	0.0735	0.0082	0.0603	0.0649
19.303	26.0	4888	19.3608	0.1015	0.0124	0.0766	0.0885
20.8774	27.0	5076	19.3038	0.1008	0.013	0.0807	0.0932
20.1431	28.0	5264	19.3426	0.0991	0.0156	0.078	0.0918
20.4304	29.0	5452	19.3918	0.0905	0.0102	0.0734	0.0812
19.6689	30.0	5640	19.3527	0.088	0.0105	0.0669	0.0785
20.661	31.0	5828	19.4042	0.0996	0.0149	0.0767	0.0887
20.2962	32.0	6016	19.3871	0.0758	0.0101	0.0617	0.0702
20.5865	33.0	6204	19.3255	0.0786	0.0106	0.064	0.0733
21.4763	34.0	6392	19.3113	0.0755	0.0087	0.0623	0.0688
21.3826	35.0	6580	19.3089	0.075	0.0076	0.0609	0.0689
20.8869	36.0	6768	19.3614	0.0906	0.0143	0.0692	0.0812
20.527	37.0	6956	19.3784	0.0874	0.0099	0.0686	0.0797
19.5026	38.0	7144	19.4145	0.0888	0.0111	0.068	0.0823
19.3852	39.0	7332	19.3794	0.0815	0.0093	0.0616	0.0742
20.5347	40.0	7520	19.3074	0.0787	0.0088	0.0609	0.0733
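
The table can be scanned programmatically to see which epochs were best under each criterion. The sketch below copies the Epoch, Validation Loss, and Rouge1 columns from the table above; the variable and function names are illustrative only.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (1, 19.4344, 0.1471), (2, 20.1245, 0.1199), (3, 19.5964, 0.1178), (4, 19.2955, 0.1164),
    (5, 19.4288, 0.1179), (6, 19.1208, 0.0997), (7, 18.9300, 0.0865), (8, 19.7751, 0.1118),
    (9, 20.0590, 0.1216), (10, 19.3949, 0.1025), (11, 19.1645, 0.0844), (12, 19.4850, 0.0905),
    (13, 19.3351, 0.0853), (14, 19.1241, 0.0886), (15, 19.1532, 0.0897), (16, 19.2346, 0.0885),
    (17, 19.3873, 0.105), (18, 19.3345, 0.1132), (19, 18.8964, 0.0751), (20, 18.8474, 0.0813),
    (21, 19.3425, 0.0844), (22, 19.4680, 0.1012), (23, 19.5483, 0.0808), (24, 18.5052, 0.056),
    (25, 18.7629, 0.0735), (26, 19.3608, 0.1015), (27, 19.3038, 0.1008), (28, 19.3426, 0.0991),
    (29, 19.3918, 0.0905), (30, 19.3527, 0.088), (31, 19.4042, 0.0996), (32, 19.3871, 0.0758),
    (33, 19.3255, 0.0786), (34, 19.3113, 0.0755), (35, 19.3089, 0.075), (36, 19.3614, 0.0906),
    (37, 19.3784, 0.0874), (38, 19.4145, 0.0888), (39, 19.3794, 0.0815), (40, 19.3074, 0.0787),
]


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def best_by_rouge1(rows):
    """Return the row with the highest Rouge1 score."""
    return max(rows, key=lambda row: row[2])


if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss(ROWS))
    print("highest Rouge1:", best_by_rouge1(ROWS))
    print("final epoch:", ROWS[-1])
```

In this table the lowest validation loss (epoch 24) coincides with the lowest Rouge1, while the highest Rouge1 occurs at epoch 1, so the two criteria would select different checkpoints.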

Framework versions

PEFT 0.8.2
Transformers 4.37.0
Pytorch 2.1.2
Datasets 2.1.0
Tokenizers 0.15.1

RMWeerasinghe/flan-t5-base-prompt_tuning-cnn-dailymail