SmolLM-1.7B-sft-160k

This model is a fine-tuned version of HuggingFaceTB/SmolLM-1.7B on the generator dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Model Preparation Time
1.1547	0.0518	200	1.1745	0.0348
1.1416	0.1037	400	1.1734	0.0348
1.1334	0.1555	600	1.1720	0.0348
1.1415	0.2073	800	1.1710	0.0348
1.1399	0.2591	1000	1.1697	0.0348
1.1448	0.3110	1200	1.1685	0.0348
1.1429	0.3628	1400	1.1673	0.0348
1.1368	0.4146	1600	1.1665	0.0348
1.1309	0.4664	1800	1.1656	0.0348
1.1429	0.5183	2000	1.1646	0.0348
1.1474	0.5701	2200	1.1638	0.0348
1.1311	0.6219	2400	1.1633	0.0348
1.126	0.6737	2600	1.1625	0.0348
1.1356	0.7256	2800	1.1618	0.0348
1.1329	0.7774	3000	1.1613	0.0348
1.129	0.8292	3200	1.1610	0.0348
1.1347	0.8811	3400	1.1605	0.0348
1.1305	0.9329	3600	1.1602	0.0348
1.1278	0.9847	3800	1.1601	0.0348