# mt5-base_V25775_V44105_V53874
This model is a fine-tuned version of emilstabil/mt5-base_V25775_V44105 on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):
- Loss: 2.4133
- Rouge1: 32.5092
- Rouge2: 11.7441
- RougeL: 21.6511
- RougeLsum: 26.5277
- Gen Len: 89.5536
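Since the card provides no usage example, below is a minimal inference sketch using the standard Transformers API. It assumes the model is used for abstractive summarization (the Gen Len of ~90 tokens suggests summary-length outputs); the generation settings (`num_beams`, `max_length`) are illustrative and not taken from any training configuration.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "emilstabil/mt5-base_V25775_V44105_V53874"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "..."  # replace with the document to summarize

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
# Generation settings are illustrative; tune num_beams / max_length for your data.
summary_ids = model.generate(**inputs, num_beams=4, max_length=128)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```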
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (mirrored in the configuration sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 3
- eval_batch_size: 3
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 40
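For reference, these settings map onto `Seq2SeqTrainingArguments` roughly as sketched below. Dataset, model, and `Seq2SeqTrainer` setup are omitted; `output_dir` is a placeholder, and the 500-step eval cadence is inferred from the results table rather than stated in the card. Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Transformers default optimizer, so it needs no explicit flag.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-base_V25775_V44105_V53874",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=3,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=40,
    evaluation_strategy="steps",  # inferred: the table reports eval every 500 steps
    eval_steps=500,
    predict_with_generate=True,   # assumption: needed to compute ROUGE on generated text
)
```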
### Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
---|---|---|---|---|---|---|---|---|
1.7944 | 0.81 | 500 | 2.1910 | 29.7996 | 11.2696 | 20.869 | 24.5115 | 82.6953 |
1.7465 | 1.61 | 1000 | 2.1442 | 29.7799 | 10.6877 | 20.6595 | 24.4396 | 80.9399 |
1.7379 | 2.42 | 1500 | 2.1823 | 30.3924 | 11.0181 | 20.9591 | 25.0604 | 87.3562 |
1.6977 | 3.23 | 2000 | 2.1876 | 29.3696 | 10.6016 | 20.5417 | 24.0967 | 78.1373 |
1.6613 | 4.03 | 2500 | 2.1891 | 29.777 | 10.8733 | 20.5695 | 24.636 | 77.4635 |
1.6424 | 4.84 | 3000 | 2.1925 | 30.5398 | 11.2902 | 21.0876 | 25.0424 | 82.794 |
1.6131 | 5.65 | 3500 | 2.2061 | 30.4751 | 11.2886 | 21.0148 | 24.9771 | 79.9099 |
1.6193 | 6.45 | 4000 | 2.2357 | 30.8465 | 11.0227 | 21.1036 | 25.1891 | 82.6738 |
1.5806 | 7.26 | 4500 | 2.2180 | 31.4661 | 11.5008 | 21.4756 | 26.0325 | 86.1202 |
1.5742 | 8.06 | 5000 | 2.2132 | 31.3554 | 11.3481 | 21.259 | 25.7304 | 86.2146 |
1.5653 | 8.87 | 5500 | 2.2133 | 32.3515 | 11.5784 | 21.9243 | 26.6567 | 90.4635 |
1.5532 | 9.68 | 6000 | 2.2253 | 31.1892 | 11.2645 | 21.1858 | 25.8852 | 87.5408 |
1.5142 | 10.48 | 6500 | 2.2360 | 30.1483 | 10.9003 | 20.9238 | 24.7488 | 78.4335 |
1.5105 | 11.29 | 7000 | 2.2462 | 31.1562 | 11.3171 | 21.3149 | 25.5669 | 85.0 |
1.5068 | 12.1 | 7500 | 2.2288 | 30.1954 | 11.2925 | 20.9437 | 24.9113 | 76.6094 |
1.483 | 12.9 | 8000 | 2.2445 | 30.4498 | 11.3156 | 21.0888 | 25.0539 | 79.2704 |
1.4544 | 13.71 | 8500 | 2.2285 | 31.6744 | 11.7017 | 21.964 | 26.3215 | 85.2146 |
1.4833 | 14.52 | 9000 | 2.2336 | 31.2326 | 11.3786 | 21.2688 | 25.5345 | 83.176 |
1.4305 | 15.32 | 9500 | 2.2555 | 31.1458 | 11.109 | 21.1361 | 25.4995 | 86.5408 |
1.4607 | 16.13 | 10000 | 2.2693 | 31.2104 | 11.5511 | 21.4548 | 25.669 | 84.133 |
1.4181 | 16.94 | 10500 | 2.2606 | 32.0839 | 11.4895 | 21.353 | 26.02 | 90.1888 |
1.4191 | 17.74 | 11000 | 2.2547 | 32.0803 | 11.6566 | 21.7206 | 26.3547 | 86.5494 |
1.4009 | 18.55 | 11500 | 2.2888 | 31.2863 | 11.5981 | 21.498 | 25.6535 | 82.5665 |
1.3916 | 19.35 | 12000 | 2.2781 | 31.6163 | 11.2085 | 21.253 | 25.8589 | 90.6009 |
1.3915 | 20.16 | 12500 | 2.2871 | 31.398 | 11.2152 | 21.41 | 25.7807 | 84.1245 |
1.3778 | 20.97 | 13000 | 2.2808 | 31.9543 | 11.5922 | 21.6471 | 26.1187 | 87.9871 |
1.3398 | 21.77 | 13500 | 2.3114 | 32.5911 | 11.7559 | 21.6985 | 26.4832 | 90.2618 |
1.3669 | 22.58 | 14000 | 2.3005 | 32.2284 | 11.8151 | 21.8298 | 26.256 | 89.2532 |
1.3159 | 23.39 | 14500 | 2.3152 | 32.189 | 11.6752 | 21.6752 | 26.4623 | 89.6524 |
1.3231 | 24.19 | 15000 | 2.3172 | 32.2582 | 11.7664 | 21.7995 | 26.5449 | 88.6524 |
1.3014 | 25.0 | 15500 | 2.3247 | 32.3611 | 11.6169 | 21.7312 | 26.5212 | 89.176 |
1.2752 | 25.81 | 16000 | 2.3349 | 32.0774 | 11.8314 | 21.7343 | 26.6137 | 88.4077 |
1.2787 | 26.61 | 16500 | 2.3302 | 31.7149 | 11.4065 | 21.3784 | 26.1065 | 88.1202 |
1.2728 | 27.42 | 17000 | 2.3484 | 32.359 | 11.7853 | 21.8351 | 26.4675 | 88.4807 |
1.2524 | 28.23 | 17500 | 2.3529 | 32.1259 | 11.8012 | 21.6175 | 26.1721 | 88.4206 |
1.236 | 29.03 | 18000 | 2.3635 | 32.0371 | 11.7357 | 21.7101 | 26.387 | 87.5665 |
1.2356 | 29.84 | 18500 | 2.3694 | 32.4209 | 11.4981 | 21.558 | 26.5013 | 91.9614 |
1.2239 | 30.65 | 19000 | 2.3739 | 32.2042 | 11.6382 | 21.6439 | 26.3635 | 88.5107 |
1.2158 | 31.45 | 19500 | 2.3792 | 32.6755 | 11.8155 | 21.7073 | 26.7322 | 89.9871 |
1.2084 | 32.26 | 20000 | 2.3922 | 33.1023 | 11.7153 | 21.9296 | 27.1142 | 92.3906 |
1.1994 | 33.06 | 20500 | 2.3991 | 32.6802 | 11.4579 | 21.5642 | 26.6404 | 93.0215 |
1.2011 | 33.87 | 21000 | 2.3956 | 32.9197 | 11.8239 | 21.8725 | 26.8542 | 92.1803 |
1.1993 | 34.68 | 21500 | 2.4024 | 32.1903 | 11.579 | 21.597 | 26.5418 | 91.4335 |
1.1688 | 35.48 | 22000 | 2.3975 | 32.4983 | 11.6353 | 21.5989 | 26.5309 | 89.3648 |
1.1969 | 36.29 | 22500 | 2.4042 | 32.8631 | 11.8492 | 21.8471 | 26.847 | 90.3433 |
1.1595 | 37.1 | 23000 | 2.4141 | 32.708 | 11.7882 | 21.7535 | 26.6902 | 90.6609 |
1.1755 | 37.9 | 23500 | 2.4188 | 32.552 | 11.8842 | 21.8309 | 26.8171 | 91.3305 |
1.1613 | 38.71 | 24000 | 2.4159 | 32.3059 | 11.6832 | 21.7439 | 26.5204 | 89.7639 |
1.1549 | 39.52 | 24500 | 2.4133 | 32.5092 | 11.7441 | 21.6511 | 26.5277 | 89.5536 |
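The ROUGE columns above appear to be F-measures scaled to percentages. A minimal sketch of how such numbers can be reproduced with the `evaluate` library follows; the actual evaluation dataset is not specified in this card, so the prediction and reference lists are placeholders.

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholders: the card does not specify the evaluation data.
predictions = ["generated summary for document one ..."]
references = ["reference summary for document one ..."]

scores = rouge.compute(predictions=predictions, references=references)
# evaluate returns fractions in [0, 1]; scale by 100 to match the table above.
print({name: round(value * 100, 4) for name, value in scores.items()})
```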
### Framework versions
- Transformers 4.32.1
- Pytorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3
### Model tree

google/mt5-base → emilstabil/mt5-base_V25775 → emilstabil/mt5-base_V25775_V44105 → emilstabil/mt5-base_V25775_V44105_V53874 (this model)