smol-135-tq

This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M on an unknown dataset. It achieves the following results on the evaluation set, with per-class metrics reported for the four labels <, >, =, and -:

  • Loss: 0.1959
  • < Precision: 0.9313
  • < Recall: 0.9555
  • < F1-score: 0.9432
  • < Support: 2808.0
  • > Precision: 0.9305
  • > Recall: 0.9134
  • > F1-score: 0.9218
  • > Support: 1743.0
  • = Precision: 0.8039
  • = Recall: 0.7305
  • = F1-score: 0.7655
  • = Support: 449.0
  • - Precision: 0.0
  • - Recall: 0.0
  • - F1-score: 0.0
  • - Support: 0.0
  • Accuracy: 0.9206
  • Macro Avg Precision: 0.6664
  • Macro Avg Recall: 0.6498
  • Macro Avg F1-score: 0.6576
  • Macro Avg Support: 5000.0
  • Weighted Avg Precision: 0.9196
  • Weighted Avg Recall: 0.9206
  • Weighted Avg F1-score: 0.9198
  • Weighted Avg Support: 5000.0
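
The macro averages sit well below the per-class scores because they are unweighted means over all four labels, including the - label, which has zero support in the evaluation set and therefore scores 0.0. A quick sanity check of the averaging arithmetic (standard classification-report style; the numbers are copied from the list above, shown here for precision):

```python
# Per-class precision and support, copied from the evaluation results above.
precision = {"<": 0.9313, ">": 0.9305, "=": 0.8039, "-": 0.0}
support = {"<": 2808, ">": 1743, "=": 449, "-": 0}

# Macro average: unweighted mean over all labels, zero-support "-" included.
macro = sum(precision.values()) / len(precision)
print(f"macro avg precision: {macro:.4f}")  # 0.6664, matching the report

# Weighted average: mean weighted by each label's support.
total = sum(support.values())
weighted = sum(precision[c] * support[c] for c in precision) / total
print(f"weighted avg precision: {weighted:.4f}")  # 0.9196, matching the report
```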

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 512
  • total_eval_batch_size: 256
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: reduce_lr_on_plateau
  • num_epochs: 30
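
For reference, a minimal sketch of how this configuration maps onto transformers.TrainingArguments. Only the hyperparameter values come from the list above; the output directory is a placeholder, and eval_strategy="epoch" is inferred from the per-epoch validation results in the table that follows.

```python
from transformers import TrainingArguments

# Per-device batch size 64 on 4 GPUs with 2 gradient-accumulation steps
# reproduces the reported total train batch size of 64 * 4 * 2 = 512
# (and total eval batch size of 64 * 4 = 256).
training_args = TrainingArguments(
    output_dir="smol-135-tq",  # placeholder, not taken from the model card
    learning_rate=1e-3,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=2,
    num_train_epochs=30,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="reduce_lr_on_plateau",
    eval_strategy="epoch",  # inferred: validation is reported once per epoch below
)
```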

Training results

| Training Loss | Epoch | Step | Validation Loss | < Precision | < Recall | < F1-score | < Support | > Precision | > Recall | > F1-score | > Support | = Precision | = Recall | = F1-score | = Support | - Precision | - Recall | - F1-score | - Support | Accuracy | Macro Avg Precision | Macro Avg Recall | Macro Avg F1-score | Macro Avg Support | Weighted Avg Precision | Weighted Avg Recall | Weighted Avg F1-score | Weighted Avg Support |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.8652 | 1.0 | 75 | 0.3609 | 0.6438 | 0.9127 | 0.7550 | 2808.0 | 0.7399 | 0.4326 | 0.5460 | 1743.0 | 0.0 | 0.0 | 0.0 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.6634 | 0.3459 | 0.3363 | 0.3253 | 5000.0 | 0.6195 | 0.6634 | 0.6144 | 5000.0 |
| 0.5947 | 2.0 | 150 | 0.2978 | 0.7978 | 0.8597 | 0.8276 | 2808.0 | 0.7472 | 0.6850 | 0.7148 | 1743.0 | 0.5106 | 0.4276 | 0.4655 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.76 | 0.5139 | 0.4931 | 0.5019 | 5000.0 | 0.7543 | 0.76 | 0.7557 | 5000.0 |
| 0.4873 | 3.0 | 225 | 0.2586 | 0.8546 | 0.8600 | 0.8573 | 2808.0 | 0.7672 | 0.7849 | 0.7760 | 1743.0 | 0.6036 | 0.5256 | 0.5619 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8038 | 0.5563 | 0.5426 | 0.5488 | 5000.0 | 0.8016 | 0.8038 | 0.8024 | 5000.0 |
| 0.4009 | 4.0 | 300 | 0.2340 | 0.8798 | 0.8786 | 0.8792 | 2808.0 | 0.8217 | 0.8090 | 0.8153 | 1743.0 | 0.5896 | 0.6303 | 0.6093 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.832 | 0.5728 | 0.5795 | 0.5759 | 5000.0 | 0.8335 | 0.832 | 0.8327 | 5000.0 |
| 0.3242 | 5.0 | 375 | 0.2144 | 0.8869 | 0.9192 | 0.9028 | 2808.0 | 0.8504 | 0.8543 | 0.8523 | 1743.0 | 0.7611 | 0.5746 | 0.6548 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8656 | 0.6246 | 0.5870 | 0.6025 | 5000.0 | 0.8629 | 0.8656 | 0.8629 | 5000.0 |
| 0.3002 | 6.0 | 450 | 0.2057 | 0.9039 | 0.9181 | 0.9110 | 2808.0 | 0.8655 | 0.8675 | 0.8665 | 1743.0 | 0.7332 | 0.6548 | 0.6918 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8768 | 0.6256 | 0.6101 | 0.6173 | 5000.0 | 0.8752 | 0.8768 | 0.8758 | 5000.0 |
| 0.2216 | 7.0 | 525 | 0.1920 | 0.8967 | 0.9402 | 0.9179 | 2808.0 | 0.8881 | 0.8698 | 0.8788 | 1743.0 | 0.7937 | 0.6169 | 0.6942 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8866 | 0.6446 | 0.6067 | 0.6228 | 5000.0 | 0.8845 | 0.8866 | 0.8842 | 5000.0 |
| 0.214 | 8.0 | 600 | 0.2088 | 0.9230 | 0.9220 | 0.9225 | 2808.0 | 0.8693 | 0.8853 | 0.8772 | 1743.0 | 0.7286 | 0.6815 | 0.7043 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8876 | 0.6302 | 0.6222 | 0.6260 | 5000.0 | 0.8868 | 0.8876 | 0.8871 | 5000.0 |
| 0.2029 | 9.0 | 675 | 0.2069 | 0.9010 | 0.9402 | 0.9202 | 2808.0 | 0.8986 | 0.8589 | 0.8783 | 1743.0 | 0.7698 | 0.6927 | 0.7292 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8896 | 0.6423 | 0.6229 | 0.6319 | 5000.0 | 0.8884 | 0.8896 | 0.8884 | 5000.0 |
| 0.2235 | 10.0 | 750 | 0.1974 | 0.9253 | 0.9263 | 0.9258 | 2808.0 | 0.8807 | 0.8933 | 0.8869 | 1743.0 | 0.7601 | 0.7127 | 0.7356 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8956 | 0.6415 | 0.6331 | 0.6371 | 5000.0 | 0.8949 | 0.8956 | 0.8952 | 5000.0 |
| 0.1841 | 11.0 | 825 | 0.1988 | 0.9152 | 0.9384 | 0.9267 | 2808.0 | 0.9093 | 0.8738 | 0.8912 | 1743.0 | 0.7466 | 0.7416 | 0.7441 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8982 | 0.6428 | 0.6385 | 0.6405 | 5000.0 | 0.8980 | 0.8982 | 0.8979 | 5000.0 |
| 0.1704 | 12.0 | 900 | 0.2004 | 0.9334 | 0.9281 | 0.9307 | 2808.0 | 0.8847 | 0.9071 | 0.8958 | 1743.0 | 0.7672 | 0.7194 | 0.7425 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.902 | 0.6463 | 0.6386 | 0.6422 | 5000.0 | 0.9015 | 0.902 | 0.9016 | 5000.0 |
| 0.1639 | 13.0 | 975 | 0.1904 | 0.9387 | 0.9330 | 0.9359 | 2808.0 | 0.8923 | 0.9266 | 0.9091 | 1743.0 | 0.7769 | 0.6904 | 0.7311 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.909 | 0.6520 | 0.6375 | 0.6440 | 5000.0 | 0.9080 | 0.909 | 0.9082 | 5000.0 |
| 0.1808 | 14.0 | 1050 | 0.1972 | 0.9216 | 0.9459 | 0.9336 | 2808.0 | 0.9133 | 0.9002 | 0.9067 | 1743.0 | 0.7975 | 0.7105 | 0.7515 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9088 | 0.6581 | 0.6391 | 0.6479 | 5000.0 | 0.9075 | 0.9088 | 0.9078 | 5000.0 |
| 0.1664 | 15.0 | 1125 | 0.2045 | 0.9275 | 0.9345 | 0.9310 | 2808.0 | 0.8991 | 0.9002 | 0.8997 | 1743.0 | 0.7653 | 0.7261 | 0.7451 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9038 | 0.6480 | 0.6402 | 0.6439 | 5000.0 | 0.9031 | 0.9038 | 0.9034 | 5000.0 |
| 0.1329 | 16.0 | 1200 | 0.1922 | 0.9375 | 0.9402 | 0.9388 | 2808.0 | 0.9054 | 0.9174 | 0.9114 | 1743.0 | 0.7799 | 0.7261 | 0.7520 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.913 | 0.6557 | 0.6459 | 0.6506 | 5000.0 | 0.9122 | 0.913 | 0.9125 | 5000.0 |
| 0.1289 | 17.0 | 1275 | 0.1983 | 0.9451 | 0.9387 | 0.9419 | 2808.0 | 0.9059 | 0.9225 | 0.9142 | 1743.0 | 0.7729 | 0.7506 | 0.7616 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9162 | 0.6560 | 0.6530 | 0.6544 | 5000.0 | 0.9160 | 0.9162 | 0.9161 | 5000.0 |
| 0.1276 | 18.0 | 1350 | 0.1980 | 0.9415 | 0.9405 | 0.9410 | 2808.0 | 0.9156 | 0.9151 | 0.9154 | 1743.0 | 0.7594 | 0.7661 | 0.7627 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.916 | 0.6541 | 0.6554 | 0.6548 | 5000.0 | 0.9161 | 0.916 | 0.9161 | 5000.0 |
| 0.1334 | 19.0 | 1425 | 0.1959 | 0.9313 | 0.9555 | 0.9432 | 2808.0 | 0.9305 | 0.9134 | 0.9218 | 1743.0 | 0.8039 | 0.7305 | 0.7655 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9206 | 0.6664 | 0.6498 | 0.6576 | 5000.0 | 0.9196 | 0.9206 | 0.9198 | 5000.0 |
| 0.1394 | 20.0 | 1500 | 0.2005 | 0.9487 | 0.9352 | 0.9419 | 2808.0 | 0.9068 | 0.9271 | 0.9169 | 1743.0 | 0.7689 | 0.7706 | 0.7697 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9176 | 0.6561 | 0.6582 | 0.6571 | 5000.0 | 0.9180 | 0.9176 | 0.9177 | 5000.0 |
| 0.1305 | 21.0 | 1575 | 0.2037 | 0.9247 | 0.9573 | 0.9407 | 2808.0 | 0.9353 | 0.9036 | 0.9192 | 1743.0 | 0.7897 | 0.7194 | 0.7529 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9172 | 0.6624 | 0.6451 | 0.6532 | 5000.0 | 0.9162 | 0.9172 | 0.9163 | 5000.0 |
| 0.1283 | 22.0 | 1650 | 0.2020 | 0.9366 | 0.9466 | 0.9416 | 2808.0 | 0.9242 | 0.9099 | 0.9170 | 1743.0 | 0.7668 | 0.7617 | 0.7642 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9172 | 0.6569 | 0.6545 | 0.6557 | 5000.0 | 0.9170 | 0.9172 | 0.9171 | 5000.0 |
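
The card does not say which task head the checkpoint uses, so the snippet below only inspects the stored config rather than assuming one; the architecture name in the comment is an illustrative guess, not verified against the repository.

```python
from transformers import AutoConfig

repo = "hugosousa/smol-135-tq"

# Read the checkpoint's config to see which head it ships with and which
# label mapping (if any) was saved alongside the weights.
config = AutoConfig.from_pretrained(repo)
print(config.architectures)               # e.g. ["LlamaForSequenceClassification"] (unverified guess)
print(getattr(config, "id2label", None))  # label mapping, if one was saved
```

If the printed architecture turns out to be a sequence-classification head, the matching Auto class (AutoModelForSequenceClassification) will load it directly.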

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.21.0
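
A small check, assuming exact version matches are desired, that a local environment lines up with the versions listed above:

```python
# Compare installed package versions against the pins from this model card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.47.1",
    "torch": "2.5.1+cu124",
    "datasets": "3.0.1",
    "tokenizers": "0.21.0",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    print(f"{name}: expected {want}, installed {have}" + ("" if have == want else "  <-- mismatch"))
```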