age_sentence_cosine

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss
5.2482	0.0127	2000	4.2534
4.1494	0.0254	4000	4.0313
4.003	0.0381	6000	3.9020
3.8886	0.0508	8000	3.8179
3.8368	0.0635	10000	3.7863
3.7814	0.0762	12000	3.7358
3.7297	0.0889	14000	3.6976
3.726	0.1016	16000	3.6959
3.6673	0.1143	18000	3.6500
3.6451	0.1270	20000	3.6304
3.6604	0.1397	22000	3.6592
3.5986	0.1524	24000	3.6039
3.5943	0.1651	26000	3.6218
3.5968	0.1778	28000	3.5981
3.5601	0.1905	30000	3.5745
3.5672	0.2032	32000	3.5908
3.5477	0.2159	34000	3.5710
3.5269	0.2286	36000	3.5532
3.5498	0.2413	38000	3.5685
3.5047	0.2540	40000	3.5435
3.5003	0.2667	42000	3.5336
3.5286	0.2794	44000	3.5777
3.4799	0.2921	46000	3.5303
3.4887	0.3048	48000	3.5562
3.4944	0.3175	50000	3.5346
3.4681	0.3302	52000	3.5193
3.4823	0.3429	54000	3.5421
3.4663	0.3556	56000	3.5235
3.4521	0.3682	58000	3.5071
3.4818	0.3809	60000	3.5267
3.4372	0.3936	62000	3.5073
3.4403	0.4063	64000	3.5032
3.4689	0.4190	66000	3.5450
3.4262	0.4317	68000	3.4986
3.4406	0.4444	70000	3.5267
3.4416	0.4571	72000	3.5079
3.4209	0.4698	74000	3.4937
3.4417	0.4825	76000	3.5148
3.4205	0.4952	78000	3.4973
3.4128	0.5079	80000	3.4883
3.4453	0.5206	82000	3.5062
3.3983	0.5333	84000	3.4864
3.4047	0.5460	86000	3.5144
3.4332	0.5587	88000	3.5226
3.3971	0.5714	90000	3.4823
3.4102	0.5841	92000	3.5111
3.4115	0.5968	94000	3.4884
3.3939	0.6095	96000	3.4796
3.4183	0.6222	98000	3.5024
3.3914	0.6349	100000	3.4837
3.3908	0.6476	102000	3.4744
3.4234	0.6603	104000	3.4961
3.3744	0.6730	106000	3.4737
3.3865	0.6857	108000	3.5026