german-jeopardy-longt5-base-128

This model is a fine-tuned version of google/long-t5-tglobal-base on the lmqg/qg_dequad dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8010
  • Brevity Penalty: 0.8577 (see the sanity-check sketch after this list)
  • System Length: 18026
  • Reference Length: 20793
  • ROUGE-1: 35.34
  • ROUGE-2: 16.82
  • ROUGE-L: 34.13
  • ROUGE-Lsum: 34.14
  • Exact Match: 1.41
  • BLEU: 10.73
  • F1: 34.55
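
As a sanity check, the reported brevity penalty follows directly from the system and reference lengths above via the standard BLEU definition. A minimal sketch:

```python
# Reproduce the reported brevity penalty from the listed lengths using the
# standard BLEU formula: BP = exp(1 - ref_len / sys_len) when the system
# output is shorter than the reference, clamped to 1.0 otherwise.
import math

system_length, reference_length = 18026, 20793
bp = min(1.0, math.exp(1 - reference_length / system_length))
print(f"{bp:.4f}")  # -> 0.8577, matching the value above
```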

Model description

See google/long-t5-tglobal-base for more information about the model architecture.
The model was trained on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.

Intended uses & limitations

This model can be used for question generation on German text.
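
A minimal usage sketch with the transformers library is shown below. The input format (task prefix plus the answer span wrapped in <hl> tokens) is an assumption based on the lmqg question-generation convention and is not confirmed by this card; adjust it if outputs look off.

```python
# Hedged usage sketch: loads the model and generates one German question.
# The prompt format below is assumed, not documented in this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "GiantTreeG/german-jeopardy-longt5-base-128"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Context with the target answer highlighted (lmqg-style, assumed format).
text = (
    "generate question: Die Hauptstadt von Deutschland ist <hl> Berlin <hl>. "
    "Die Stadt hat rund 3,7 Millionen Einwohner."
)
inputs = tokenizer(text, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```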

Training and evaluation data

See lmqg/qg_dequad.
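
The dataset can be inspected with the datasets library version pinned in the framework versions below; the split names assumed here follow the standard lmqg layout. A small sketch:

```python
# Load and inspect the lmqg/qg_dequad dataset (German SQuAD-derived QG data).
# With datasets 2.12.0 the loading script runs without extra flags; newer
# versions may require trust_remote_code=True.
from datasets import load_dataset

dataset = load_dataset("lmqg/qg_dequad")
print(dataset)               # expected splits: train / validation / test
print(dataset["train"][0])   # one example with context, answer, and question
```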

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto transformers training arguments follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 7
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adafactor
  • lr_scheduler_type: constant
  • num_epochs: 20
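
A hypothetical sketch of how these values map onto transformers.Seq2SeqTrainingArguments; the actual training script is not part of this card, and output_dir is a placeholder:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="german-jeopardy-longt5-base-128",  # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=7,
    gradient_accumulation_steps=16,  # 8 x 16 = 128 effective train batch size
    optim="adafactor",
    lr_scheduler_type="constant",
    num_train_epochs=20,
)
```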

Training results

| Training Loss | Epoch | Step | Validation Loss | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Totals 1 | Totals 2 | Totals 3 | Totals 4 | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Brevity Penalty | System Length | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Exact Match | BLEU | Mean Generated Length | F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3.458 | 0.99 | 72 | 2.3696 | 5618 | 1383 | 463 | 116 | 15080 | 12876 | 10672 | 8468 | 37.2546 | 10.7409 | 4.3385 | 1.3699 | 0.6642 | 15080 | 21250 | 0.2266 | 0.0841 | 0.2197 | 0.2196 | 0.0005 | 4.6384 | 11.3013 | 0.2226 |
| 2.7548 | 1.99 | 145 | 2.1310 | 6361 | 1807 | 700 | 254 | 16130 | 13926 | 11722 | 9518 | 39.4358 | 12.9757 | 5.9717 | 2.6686 | 0.728 | 16130 | 21250 | 0.2706 | 0.1122 | 0.2596 | 0.2596 | 0.0036 | 6.9183 | 12.206 | 0.2635 |
| 2.5084 | 2.99 | 218 | 2.0244 | 6758 | 2001 | 780 | 285 | 16871 | 14667 | 12463 | 10259 | 40.0569 | 13.6429 | 6.2585 | 2.778 | 0.7714 | 16871 | 21250 | 0.2888 | 0.1258 | 0.2766 | 0.2767 | 0.0045 | 7.616 | 12.8825 | 0.2832 |
| 2.3562 | 4.0 | 291 | 1.9501 | 7011 | 2193 | 908 | 360 | 16796 | 14592 | 12388 | 10184 | 41.7421 | 15.0288 | 7.3297 | 3.535 | 0.7671 | 16796 | 21250 | 0.303 | 0.1375 | 0.2892 | 0.2894 | 0.0077 | 8.6611 | 12.9142 | 0.2978 |
| 2.2383 | 5.0 | 364 | 1.8874 | 7245 | 2386 | 1015 | 435 | 16708 | 14504 | 12300 | 10096 | 43.3625 | 16.4506 | 8.252 | 4.3086 | 0.762 | 16708 | 21250 | 0.3198 | 0.1498 | 0.3077 | 0.3079 | 0.0113 | 9.6159 | 12.8417 | 0.3155 |
| 2.1576 | 5.99 | 436 | 1.8593 | 7378 | 2382 | 997 | 429 | 17014 | 14810 | 12606 | 10402 | 43.3643 | 16.0837 | 7.9089 | 4.1242 | 0.7796 | 17014 | 21250 | 0.326 | 0.1497 | 0.3132 | 0.3132 | 0.0109 | 9.5745 | 13.2187 | 0.3215 |
| 2.0356 | 6.99 | 509 | 1.8133 | 7570 | 2520 | 1097 | 482 | 16999 | 14795 | 12591 | 10387 | 44.532 | 17.0328 | 8.7126 | 4.6404 | 0.7787 | 16999 | 21250 | 0.3384 | 0.158 | 0.3258 | 0.3257 | 0.0123 | 10.3053 | 13.0368 | 0.3339 |
| 1.9575 | 7.99 | 582 | 1.7856 | 7764 | 2637 | 1175 | 545 | 17379 | 15175 | 12971 | 10767 | 44.6746 | 17.3773 | 9.0587 | 5.0618 | 0.8003 | 17379 | 21250 | 0.345 | 0.1625 | 0.3322 | 0.3324 | 0.0136 | 10.993 | 13.4719 | 0.3407 |
| 1.8889 | 9.0 | 655 | 1.7666 | 7766 | 2644 | 1184 | 532 | 17102 | 14898 | 12694 | 10490 | 45.4099 | 17.7473 | 9.3272 | 5.0715 | 0.7846 | 17102 | 21250 | 0.3487 | 0.1636 | 0.3348 | 0.335 | 0.0123 | 10.9637 | 13.2164 | 0.3438 |
| 1.8201 | 10.0 | 728 | 1.7415 | 7737 | 2680 | 1238 | 587 | 17156 | 14952 | 12748 | 10544 | 45.0979 | 17.924 | 9.7113 | 5.5671 | 0.7877 | 17156 | 21250 | 0.3453 | 0.1666 | 0.3332 | 0.3333 | 0.0163 | 11.3891 | 13.1388 | 0.3406 |
| 1.7882 | 10.99 | 800 | 1.7331 | 7859 | 2722 | 1241 | 572 | 17364 | 15160 | 12956 | 10752 | 45.2603 | 17.9551 | 9.5786 | 5.3199 | 0.7995 | 17364 | 21250 | 0.3524 | 0.1673 | 0.3387 | 0.3385 | 0.0145 | 11.4047 | 13.4052 | 0.3473 |
| 1.7095 | 11.99 | 873 | 1.7194 | 7968 | 2783 | 1292 | 625 | 17467 | 15263 | 13059 | 10855 | 45.6175 | 18.2336 | 9.8936 | 5.7577 | 0.8053 | 17467 | 21250 | 0.3547 | 0.1708 | 0.3418 | 0.3414 | 0.0154 | 11.8807 | 13.4437 | 0.3495 |
| 1.6619 | 12.99 | 946 | 1.7032 | 8011 | 2796 | 1286 | 604 | 17433 | 15229 | 13025 | 10821 | 45.9531 | 18.3597 | 9.8733 | 5.5817 | 0.8034 | 17433 | 21250 | 0.3584 | 0.1736 | 0.3454 | 0.3454 | 0.0154 | 11.7968 | 13.4964 | 0.3526 |
| 1.6103 | 13.99 | 1019 | 1.7028 | 8154 | 2891 | 1347 | 636 | 17665 | 15461 | 13257 | 11053 | 46.1591 | 18.6987 | 10.1607 | 5.7541 | 0.8163 | 17665 | 21250 | 0.3659 | 0.1795 | 0.3509 | 0.3508 | 0.015 | 12.235 | 13.7223 | 0.3602 |
| 1.565 | 15.0 | 1092 | 1.6955 | 8135 | 2897 | 1362 | 665 | 17530 | 15326 | 13122 | 10918 | 46.4062 | 18.9025 | 10.3795 | 6.0909 | 0.8088 | 17530 | 21250 | 0.3668 | 0.1808 | 0.3518 | 0.3516 | 0.02 | 12.4116 | 13.6107 | 0.3603 |
| 1.522 | 16.0 | 1165 | 1.6793 | 8271 | 2982 | 1414 | 697 | 17946 | 15742 | 13538 | 11334 | 46.0883 | 18.943 | 10.4447 | 6.1496 | 0.8318 | 17946 | 21250 | 0.3695 | 0.1828 | 0.354 | 0.354 | 0.0191 | 12.8008 | 13.9192 | 0.3632 |
| 1.5022 | 16.99 | 1237 | 1.6849 | 8244 | 2967 | 1392 | 680 | 17510 | 15306 | 13102 | 10898 | 47.0817 | 19.3846 | 10.6243 | 6.2397 | 0.8077 | 17510 | 21250 | 0.3728 | 0.184 | 0.3569 | 0.3569 | 0.0191 | 12.6672 | 13.6243 | 0.366 |
| 1.4359 | 17.99 | 1310 | 1.6862 | 8328 | 3050 | 1448 | 717 | 17873 | 15669 | 13465 | 11261 | 46.5954 | 19.4652 | 10.7538 | 6.3671 | 0.8278 | 17873 | 21250 | 0.3742 | 0.1866 | 0.3582 | 0.3583 | 0.0181 | 13.0683 | 13.7255 | 0.3671 |
| 1.3994 | 18.99 | 1383 | 1.6775 | 8272 | 2998 | 1417 | 704 | 17645 | 15441 | 13237 | 11033 | 46.8801 | 19.4158 | 10.7048 | 6.3809 | 0.8152 | 17645 | 21250 | 0.3739 | 0.1866 | 0.3583 | 0.3581 | 0.0213 | 12.8728 | 13.6956 | 0.3673 |
| 1.3609 | 19.78 | 1440 | 1.6884 | 8347 | 3062 | 1465 | 723 | 17823 | 15619 | 13415 | 11211 | 46.8327 | 19.6043 | 10.9206 | 6.449 | 0.8251 | 17823 | 21250 | 0.3761 | 0.1886 | 0.3601 | 0.3596 | 0.0204 | 13.1569 | 13.7328 | 0.3692 |
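
The Counts, Totals, Precisions, Brevity Penalty, System Length, and Reference Length columns match the n-gram statistics (n = 1 to 4) that sacreBLEU reports. A hypothetical sketch of recomputing these metrics with the evaluate library (the actual evaluation script is not included in this card):

```python
# Illustrative recomputation of the table's BLEU and ROUGE columns on toy
# data; numbers here will not match the table, which was computed on the
# full evaluation set.
import evaluate

predictions = ["Wie viele Einwohner hat Berlin?"]
references = [["Wie viele Einwohner hat die Stadt Berlin?"]]

sacrebleu = evaluate.load("sacrebleu")  # counts, totals, precisions, bp, lengths
rouge = evaluate.load("rouge")          # rouge1, rouge2, rougeL, rougeLsum

print(sacrebleu.compute(predictions=predictions, references=references))
print(rouge.compute(predictions=predictions, references=[r[0] for r in references]))
```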

Framework versions

  • Transformers 4.32.1
  • PyTorch 2.1.0
  • Datasets 2.12.0
  • Tokenizers 0.13.3