flan-t5-rouge-squad-qg-test

This model is a fine-tuned version of google/flan-t5-small on an unspecified dataset (the model name suggests a SQuAD-based question-generation task). It achieves the following results on the evaluation set:

  • Loss: 0.4416
  • ROUGE-1: 0.3489
  • ROUGE-2: 0.1081
  • ROUGE-L: 0.3225
  • ROUGE-Lsum: 0.3335
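For reference, ROUGE-1 is the F-measure of unigram overlap between a generated text and the reference. The scores above were presumably produced with standard ROUGE tooling; the function below is only a minimal pure-Python sketch of the same idea (whitespace tokenization, no stemming), so its values will not exactly match a library implementation.

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """ROUGE-1 F1: F-measure over unigram overlap between prediction and reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each reference token counts at most as often as it appears.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("what year was the treaty signed",
                      "in what year was the treaty signed"), 4))  # → 0.9231
```

ROUGE-2 replaces unigrams with bigrams, and ROUGE-L/Lsum score the longest common subsequence instead of n-gram overlap.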

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 80
  • eval_batch_size: 80
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 320
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 160
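A few of the hyperparameters above can be cross-checked with simple arithmetic. The sketch below assumes a linear schedule with no warmup and a 320-step cap (warmup and max_steps settings are not reported in this card); it reproduces the effective batch size and the final epoch seen in the results table.

```python
# Cross-check the reported hyperparameters against the results table.
train_batch_size = 80
gradient_accumulation_steps = 4

# Effective (total) train batch size = per-device batch size x accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # → 320

# The table logs 3 optimizer steps per epoch (steps 3, 6, 9, ...) and stops at
# step 320, i.e. after roughly 320 / 3 ≈ 106.7 epochs rather than the configured
# 160, consistent with a 320-step cap (an assumption; max_steps is not shown).
steps_per_epoch = 3
last_step = 320
print(round(last_step / steps_per_epoch, 1))  # → 106.7

# With a linear scheduler and no warmup (assumed), the learning rate decays
# from 5e-4 toward 0 over the run; halfway through it would be:
learning_rate = 5e-4
lr_at = lambda step: learning_rate * (1 - step / last_step)
print(lr_at(160))  # → 0.00025
```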

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|--------------:|------:|-----:|----------------:|--------:|--------:|--------:|-----------:|
| 41.7986 | 1.0 | 3 | 14.9730 | 0.0645 | 0.0187 | 0.0619 | 0.0626 |
| 18.0367 | 2.0 | 6 | 6.4506 | 0.0696 | 0.0369 | 0.0673 | 0.0690 |
| 11.6807 | 3.0 | 9 | 4.8843 | 0.1134 | 0.0385 | 0.0991 | 0.1016 |
| 9.5977 | 4.0 | 12 | 4.1902 | 0.0615 | 0.0227 | 0.0529 | 0.0562 |
| 8.382 | 5.0 | 15 | 3.7084 | 0.0108 | 0.0017 | 0.0108 | 0.0107 |
| 7.3099 | 6.0 | 18 | 3.1393 | 0.0334 | 0.0139 | 0.0321 | 0.0319 |
| 6.2255 | 7.0 | 21 | 2.6959 | 0.0484 | 0.0206 | 0.0471 | 0.0478 |
| 5.3866 | 8.0 | 24 | 2.2886 | 0.0942 | 0.0388 | 0.0902 | 0.0928 |
| 4.4362 | 9.0 | 27 | 1.6919 | 0.1476 | 0.0517 | 0.1244 | 0.1336 |
| 3.5819 | 10.0 | 30 | 1.2444 | 0.2204 | 0.0785 | 0.1939 | 0.2064 |
| 2.7713 | 11.0 | 33 | 0.9173 | 0.3423 | 0.1261 | 0.3144 | 0.3277 |
| 2.1415 | 12.0 | 36 | 0.6726 | 0.3756 | 0.1188 | 0.3440 | 0.3607 |
| 1.6248 | 13.0 | 39 | 0.4801 | 0.3757 | 0.1220 | 0.3387 | 0.3573 |
| 1.3172 | 14.0 | 42 | 0.3892 | 0.3855 | 0.1316 | 0.3478 | 0.3655 |
| 1.0707 | 15.0 | 45 | 0.3425 | 0.3863 | 0.1358 | 0.3514 | 0.3691 |
| 0.8661 | 16.0 | 48 | 0.3104 | 0.3820 | 0.1376 | 0.3447 | 0.3624 |
| 0.7925 | 17.0 | 51 | 0.2946 | 0.3937 | 0.1408 | 0.3620 | 0.3738 |
| 0.6878 | 18.0 | 54 | 0.2863 | 0.3893 | 0.1375 | 0.3568 | 0.3672 |
| 0.6841 | 19.0 | 57 | 0.2810 | 0.3959 | 0.1427 | 0.3635 | 0.3731 |
| 0.6014 | 20.0 | 60 | 0.2782 | 0.3991 | 0.1447 | 0.3663 | 0.3794 |
| 0.5921 | 21.0 | 63 | 0.2786 | 0.4018 | 0.1467 | 0.3696 | 0.3841 |
| 0.5582 | 22.0 | 66 | 0.2776 | 0.3967 | 0.1414 | 0.3618 | 0.3767 |
| 0.5268 | 23.0 | 69 | 0.2785 | 0.3984 | 0.1479 | 0.3669 | 0.3822 |
| 0.4784 | 24.0 | 72 | 0.2796 | 0.4031 | 0.1519 | 0.3709 | 0.3851 |
| 0.4378 | 25.0 | 75 | 0.2831 | 0.4001 | 0.1501 | 0.3667 | 0.3805 |
| 0.4395 | 26.0 | 78 | 0.2876 | 0.4015 | 0.1522 | 0.3691 | 0.3811 |
| 0.4269 | 27.0 | 81 | 0.2897 | 0.4055 | 0.1455 | 0.3747 | 0.3845 |
| 0.3955 | 28.0 | 84 | 0.2925 | 0.3912 | 0.1330 | 0.3595 | 0.3694 |
| 0.3876 | 29.0 | 87 | 0.2976 | 0.3881 | 0.1354 | 0.3592 | 0.3677 |
| 0.3593 | 30.0 | 90 | 0.3008 | 0.3875 | 0.1374 | 0.3580 | 0.3686 |
| 0.3477 | 31.0 | 93 | 0.3038 | 0.3792 | 0.1303 | 0.3507 | 0.3609 |
| 0.3368 | 32.0 | 96 | 0.3079 | 0.3854 | 0.1331 | 0.3601 | 0.3677 |
| 0.3019 | 33.0 | 99 | 0.3134 | 0.3820 | 0.1272 | 0.3524 | 0.3633 |
| 0.3141 | 34.0 | 102 | 0.3202 | 0.3733 | 0.1229 | 0.3431 | 0.3541 |
| 0.2914 | 35.0 | 105 | 0.3233 | 0.3814 | 0.1257 | 0.3514 | 0.3638 |
| 0.2817 | 36.0 | 108 | 0.3250 | 0.3822 | 0.1316 | 0.3563 | 0.3636 |
| 0.2875 | 37.0 | 111 | 0.3280 | 0.3898 | 0.1405 | 0.3650 | 0.3737 |
| 0.267 | 38.0 | 114 | 0.3343 | 0.3878 | 0.1353 | 0.3616 | 0.3708 |
| 0.264 | 39.0 | 117 | 0.3375 | 0.3761 | 0.1182 | 0.3484 | 0.3589 |
| 0.2519 | 40.0 | 120 | 0.3372 | 0.3781 | 0.1228 | 0.3504 | 0.3606 |
| 0.2508 | 41.0 | 123 | 0.3382 | 0.3810 | 0.1244 | 0.3538 | 0.3635 |
| 0.2373 | 42.0 | 126 | 0.3460 | 0.3805 | 0.1230 | 0.3533 | 0.3632 |
| 0.2316 | 43.0 | 129 | 0.3533 | 0.3692 | 0.1125 | 0.3396 | 0.3514 |
| 0.2271 | 44.0 | 132 | 0.3552 | 0.3576 | 0.1133 | 0.3313 | 0.3394 |
| 0.2133 | 45.0 | 135 | 0.3565 | 0.3643 | 0.1244 | 0.3401 | 0.3481 |
| 0.2167 | 46.0 | 138 | 0.3602 | 0.3683 | 0.1245 | 0.3408 | 0.3490 |
| 0.2119 | 47.0 | 141 | 0.3647 | 0.3694 | 0.1278 | 0.3399 | 0.3493 |
| 0.1976 | 48.0 | 144 | 0.3677 | 0.3590 | 0.1194 | 0.3322 | 0.3414 |
| 0.2133 | 49.0 | 147 | 0.3720 | 0.3531 | 0.1115 | 0.3275 | 0.3351 |
| 0.1923 | 50.0 | 150 | 0.3746 | 0.3621 | 0.1189 | 0.3339 | 0.3413 |
| 0.1854 | 51.0 | 153 | 0.3760 | 0.3707 | 0.1280 | 0.3438 | 0.3528 |
| 0.1872 | 52.0 | 156 | 0.3767 | 0.3635 | 0.1219 | 0.3358 | 0.3463 |
| 0.1827 | 53.0 | 159 | 0.3790 | 0.3657 | 0.1196 | 0.3384 | 0.3494 |
| 0.1801 | 54.0 | 162 | 0.3833 | 0.3611 | 0.1195 | 0.3276 | 0.3426 |
| 0.1787 | 55.0 | 165 | 0.3903 | 0.3595 | 0.1202 | 0.3285 | 0.3411 |
| 0.1713 | 56.0 | 168 | 0.3923 | 0.3566 | 0.1179 | 0.3258 | 0.3379 |
| 0.1626 | 57.0 | 171 | 0.3941 | 0.3497 | 0.1152 | 0.3185 | 0.3325 |
| 0.1599 | 58.0 | 174 | 0.3922 | 0.3605 | 0.1216 | 0.3305 | 0.3448 |
| 0.1603 | 59.0 | 177 | 0.3929 | 0.3478 | 0.1079 | 0.3188 | 0.3329 |
| 0.1794 | 60.0 | 180 | 0.3958 | 0.3455 | 0.1057 | 0.3179 | 0.3319 |
| 0.1626 | 61.0 | 183 | 0.3997 | 0.3481 | 0.1078 | 0.3203 | 0.3320 |
| 0.1433 | 62.0 | 186 | 0.4019 | 0.3529 | 0.1129 | 0.3278 | 0.3386 |
| 0.1489 | 63.0 | 189 | 0.4008 | 0.3446 | 0.1137 | 0.3220 | 0.3291 |
| 0.1595 | 64.0 | 192 | 0.4009 | 0.3579 | 0.1159 | 0.3345 | 0.3421 |
| 0.1557 | 65.0 | 195 | 0.4044 | 0.3506 | 0.1165 | 0.3269 | 0.3342 |
| 0.1435 | 66.0 | 198 | 0.4094 | 0.3404 | 0.1082 | 0.3159 | 0.3257 |
| 0.1427 | 67.0 | 201 | 0.4140 | 0.3450 | 0.1103 | 0.3193 | 0.3301 |
| 0.1494 | 68.0 | 204 | 0.4163 | 0.3421 | 0.1090 | 0.3198 | 0.3276 |
| 0.1493 | 69.0 | 207 | 0.4137 | 0.3481 | 0.1101 | 0.3230 | 0.3318 |
| 0.14 | 70.0 | 210 | 0.4107 | 0.3438 | 0.1083 | 0.3193 | 0.3277 |
| 0.1338 | 71.0 | 213 | 0.4107 | 0.3432 | 0.1068 | 0.3199 | 0.3270 |
| 0.1302 | 72.0 | 216 | 0.4134 | 0.3573 | 0.1097 | 0.3317 | 0.3428 |
| 0.1354 | 73.0 | 219 | 0.4162 | 0.3525 | 0.1092 | 0.3270 | 0.3376 |
| 0.1379 | 74.0 | 222 | 0.4193 | 0.3402 | 0.1069 | 0.3177 | 0.3249 |
| 0.1272 | 75.0 | 225 | 0.4233 | 0.3397 | 0.1059 | 0.3173 | 0.3244 |
| 0.1331 | 76.0 | 228 | 0.4248 | 0.3364 | 0.1021 | 0.3149 | 0.3223 |
| 0.1211 | 77.0 | 231 | 0.4258 | 0.3459 | 0.1076 | 0.3235 | 0.3312 |
| 0.1324 | 78.0 | 234 | 0.4267 | 0.3488 | 0.1066 | 0.3257 | 0.3335 |
| 0.1275 | 79.0 | 237 | 0.4272 | 0.3458 | 0.1165 | 0.3201 | 0.3301 |
| 0.1265 | 80.0 | 240 | 0.4279 | 0.3519 | 0.1188 | 0.3288 | 0.3366 |
| 0.1227 | 81.0 | 243 | 0.4293 | 0.3458 | 0.1093 | 0.3261 | 0.3317 |
| 0.1213 | 82.0 | 246 | 0.4323 | 0.3437 | 0.1051 | 0.3189 | 0.3288 |
| 0.1275 | 83.0 | 249 | 0.4347 | 0.3457 | 0.1065 | 0.3212 | 0.3318 |
| 0.1233 | 84.0 | 252 | 0.4346 | 0.3491 | 0.1048 | 0.3235 | 0.3337 |
| 0.1168 | 85.0 | 255 | 0.4349 | 0.3450 | 0.1035 | 0.3208 | 0.3314 |
| 0.1184 | 86.0 | 258 | 0.4347 | 0.3480 | 0.1050 | 0.3255 | 0.3336 |
| 0.1246 | 87.0 | 261 | 0.4336 | 0.3483 | 0.1058 | 0.3272 | 0.3347 |
| 0.1167 | 88.0 | 264 | 0.4333 | 0.3470 | 0.1065 | 0.3269 | 0.3343 |
| 0.1203 | 89.0 | 267 | 0.4334 | 0.3494 | 0.1112 | 0.3278 | 0.3351 |
| 0.1139 | 90.0 | 270 | 0.4339 | 0.3460 | 0.1114 | 0.3253 | 0.3314 |
| 0.1202 | 91.0 | 273 | 0.4341 | 0.3497 | 0.1103 | 0.3252 | 0.3352 |
| 0.1174 | 92.0 | 276 | 0.4344 | 0.3497 | 0.1103 | 0.3252 | 0.3352 |
| 0.1164 | 93.0 | 279 | 0.4350 | 0.3504 | 0.1099 | 0.3249 | 0.3365 |
| 0.1114 | 94.0 | 282 | 0.4357 | 0.3445 | 0.1073 | 0.3188 | 0.3299 |
| 0.1094 | 95.0 | 285 | 0.4368 | 0.3455 | 0.1076 | 0.3197 | 0.3308 |
| 0.114 | 96.0 | 288 | 0.4376 | 0.3483 | 0.1105 | 0.3236 | 0.3336 |
| 0.1147 | 97.0 | 291 | 0.4381 | 0.3458 | 0.1099 | 0.3207 | 0.3303 |
| 0.116 | 98.0 | 294 | 0.4386 | 0.3458 | 0.1099 | 0.3207 | 0.3303 |
| 0.1187 | 99.0 | 297 | 0.4393 | 0.3499 | 0.1100 | 0.3234 | 0.3341 |
| 0.1112 | 100.0 | 300 | 0.4399 | 0.3519 | 0.1146 | 0.3260 | 0.3368 |
| 0.1124 | 101.0 | 303 | 0.4404 | 0.3519 | 0.1146 | 0.3260 | 0.3368 |
| 0.117 | 102.0 | 306 | 0.4408 | 0.3489 | 0.1081 | 0.3225 | 0.3335 |
| 0.1101 | 103.0 | 309 | 0.4412 | 0.3489 | 0.1081 | 0.3225 | 0.3335 |
| 0.1135 | 104.0 | 312 | 0.4415 | 0.3472 | 0.1075 | 0.3208 | 0.3311 |
| 0.1141 | 105.0 | 315 | 0.4416 | 0.3489 | 0.1081 | 0.3225 | 0.3335 |
| 0.1201 | 106.0 | 318 | 0.4416 | 0.3489 | 0.1081 | 0.3225 | 0.3335 |
| 0.2258 | 106.8 | 320 | 0.4416 | 0.3489 | 0.1081 | 0.3225 | 0.3335 |

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Model size: 77M parameters (F32 tensors, Safetensors format)

Model tree for devagonal/flan-t5-rouge-squad-qg-test

  • Fine-tuned from google/flan-t5-small