TED_CLM_gpt2_tedlium_additional_head

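This model is a fine-tuned version of Lakoc/gpt2_512h_16l_add_head8. The card does not name the training dataset (it is recorded as None). At the last logged evaluation (step 72000, epoch 14.92), the model reaches a validation loss of 1.9139 and an accuracy of 0.5529 on the evaluation set.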

Model description

More information needed

The hyperparameters used during training are not listed in this card. Training results, logged every 3000 steps:

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.2945	0.62	3000	2.4760	0.4352
2.0669	1.24	6000	2.2729	0.4767
1.9754	1.86	9000	2.1827	0.4974
1.9292	2.49	12000	2.1086	0.5139
1.8983	3.11	15000	2.0666	0.5223
1.8853	3.73	18000	2.0389	0.5278
1.8708	4.35	21000	2.0216	0.5301
1.8524	4.97	24000	2.0024	0.5352
1.836	5.59	27000	1.9915	0.5365
1.8219	6.22	30000	1.9847	0.5410
1.8134	6.84	33000	1.9670	0.5408
1.8088	7.46	36000	1.9736	0.5425
1.8011	8.08	39000	1.9610	0.5426
1.7901	8.7	42000	1.9519	0.5459
1.7829	9.32	45000	1.9524	0.5463
1.7865	9.94	48000	1.9424	0.5479
1.7775	10.57	51000	1.9421	0.5480
1.7698	11.19	54000	1.9346	0.5486
1.767	11.81	57000	1.9249	0.5493
1.7578	12.43	60000	1.9262	0.5500
1.7613	13.05	63000	1.9185	0.5508
1.7591	13.67	66000	1.9191	0.5523
1.7489	14.29	69000	1.9159	0.5522
1.7506	14.92	72000	1.9139	0.5529
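
To put the validation losses on a more familiar scale, they can be converted to perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card. The `perplexity` helper and the `checkpoints` list are illustrative names, not part of the model's code.

```python
import math

# (step, validation_loss) pairs copied from the first and last rows of the table above.
checkpoints = [(3000, 2.4760), (72000, 1.9139)]


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(loss_nats)


for step, loss in checkpoints:
    print(f"step {step}: validation loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
```

Under that assumption, validation perplexity falls from roughly 11.9 at step 3000 to roughly 6.8 at step 72000.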