smol-135-tq-closure-augment

This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1047
  • < Precision: 0.9674
  • < Recall: 0.9708
  • < F1-score: 0.9691
  • < Support: 4865.0
  • > Precision: 0.9686
  • > Recall: 0.9706
  • > F1-score: 0.9696
  • > Support: 4865.0
  • = Precision: 0.8734
  • = Recall: 0.8065
  • = F1-score: 0.8386
  • = Support: 248.0
  • - Precision: 0.4286
  • - Recall: 0.2727
  • - F1-score: 0.3333
  • - Support: 22.0
  • Accuracy: 0.9651
  • Macro Avg Precision: 0.8095
  • Macro Avg Recall: 0.7551
  • Macro Avg F1-score: 0.7777
  • Macro Avg Support: 10000.0
  • Weighted Avg Precision: 0.9645
  • Weighted Avg Recall: 0.9651
  • Weighted Avg F1-score: 0.9647
  • Weighted Avg Support: 10000.0
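
For reference, a minimal loading sketch with the transformers API is shown below. It assumes the checkpoint keeps SmolLM2's causal-LM head and is prompted with plain text; the prompt format and label decoding used during fine-tuning are not documented in this card, so the prompt in the sketch is purely illustrative.

```python
# Minimal loading sketch (assumptions: the checkpoint keeps SmolLM2's causal-LM head
# and is prompted with plain text; the prompt format and label decoding used during
# fine-tuning are not documented in this card, so the prompt below is illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hugosousa/smol-135-tq-closure-augment"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Hypothetical prompt: the evaluation metrics above report the relation labels
# "<", ">", "=", and "-", so the model is presumably expected to emit one of them.
prompt = "Classify the temporal relation between the two events: ..."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=4)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```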

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 512
  • total_eval_batch_size: 256
  • optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: reduce_lr_on_plateau
  • num_epochs: 30
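
As a rough reconstruction, the values above map onto the standard Hugging Face Trainer configuration as in the sketch below; the output directory, bf16 flag, and evaluation strategy are assumptions and were not stated in the original card.

```python
# Rough reconstruction of the training configuration above, assuming the standard
# Hugging Face Trainer; output_dir, bf16, and eval_strategy are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="smol-135-tq-closure-augment",   # hypothetical
    learning_rate=1e-3,
    per_device_train_batch_size=64,             # 64 x 4 GPUs x 2 accumulation = 512 total
    per_device_eval_batch_size=64,              # 64 x 4 GPUs = 256 total
    gradient_accumulation_steps=2,
    num_train_epochs=30,
    seed=42,
    optim="adamw_torch",                        # AdamW with betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="reduce_lr_on_plateau",
    eval_strategy="epoch",                      # assumption: per-epoch evaluation, as in the table below
    bf16=True,                                  # assumption, consistent with the published BF16 weights
)
```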

Training results

| Training Loss | Epoch | Step | Validation Loss | < Precision | < Recall | < F1-score | < Support | > Precision | > Recall | > F1-score | > Support | = Precision | = Recall | = F1-score | = Support | - Precision | - Recall | - F1-score | - Support | Accuracy | Macro Avg Precision | Macro Avg Recall | Macro Avg F1-score | Macro Avg Support | Weighted Avg Precision | Weighted Avg Recall | Weighted Avg F1-score | Weighted Avg Support |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.4482 | 1.0 | 981 | 0.2402 | 0.8193 | 0.8399 | 0.8295 | 4865.0 | 0.8187 | 0.8436 | 0.8309 | 4865.0 | 0.0 | 0.0 | 0.0 | 248.0 | 0.0 | 0.0 | 0.0 | 22.0 | 0.819 | 0.4095 | 0.4209 | 0.4151 | 10000.0 | 0.7969 | 0.819 | 0.8078 | 10000.0 |
| 0.2725 | 2.0 | 1962 | 0.1563 | 0.9366 | 0.9137 | 0.9250 | 4865.0 | 0.8957 | 0.9531 | 0.9235 | 4865.0 | 0.7532 | 0.2339 | 0.3569 | 248.0 | 0.0 | 0.0 | 0.0 | 22.0 | 0.914 | 0.6464 | 0.5252 | 0.5514 | 10000.0 | 0.9101 | 0.914 | 0.9081 | 10000.0 |
| 0.2609 | 3.0 | 2943 | 0.1362 | 0.9356 | 0.9464 | 0.9409 | 4865.0 | 0.9464 | 0.9361 | 0.9412 | 4865.0 | 0.6479 | 0.6976 | 0.6718 | 248.0 | 0.0 | 0.0 | 0.0 | 22.0 | 0.9331 | 0.6325 | 0.6450 | 0.6385 | 10000.0 | 0.9316 | 0.9331 | 0.9323 | 10000.0 |
| 0.2188 | 4.0 | 3924 | 0.1212 | 0.9452 | 0.9599 | 0.9525 | 4865.0 | 0.9559 | 0.9494 | 0.9527 | 4865.0 | 0.7803 | 0.7016 | 0.7389 | 248.0 | 0.5 | 0.0909 | 0.1538 | 22.0 | 0.9465 | 0.7953 | 0.6755 | 0.6995 | 10000.0 | 0.9453 | 0.9465 | 0.9455 | 10000.0 |
| 0.2196 | 5.0 | 4905 | 0.1162 | 0.9540 | 0.9632 | 0.9586 | 4865.0 | 0.9608 | 0.9568 | 0.9588 | 4865.0 | 0.7667 | 0.7419 | 0.7541 | 248.0 | 1.0 | 0.1364 | 0.24 | 22.0 | 0.9528 | 0.9204 | 0.6996 | 0.7279 | 10000.0 | 0.9528 | 0.9528 | 0.9520 | 10000.0 |
| 0.2002 | 6.0 | 5886 | 0.1131 | 0.9548 | 0.9630 | 0.9589 | 4865.0 | 0.9539 | 0.9620 | 0.9579 | 4865.0 | 0.8743 | 0.6452 | 0.7425 | 248.0 | 0.5 | 0.0909 | 0.1538 | 22.0 | 0.9527 | 0.8208 | 0.6653 | 0.7033 | 10000.0 | 0.9514 | 0.9527 | 0.9513 | 10000.0 |
| 0.2211 | 7.0 | 6867 | 0.1111 | 0.9552 | 0.9718 | 0.9634 | 4865.0 | 0.9694 | 0.9587 | 0.9640 | 4865.0 | 0.8533 | 0.7742 | 0.8118 | 248.0 | 0.4286 | 0.2727 | 0.3333 | 22.0 | 0.959 | 0.8016 | 0.7444 | 0.7682 | 10000.0 | 0.9584 | 0.959 | 0.9586 | 10000.0 |
| 0.1976 | 8.0 | 7848 | 0.1137 | 0.9502 | 0.9720 | 0.9610 | 4865.0 | 0.9694 | 0.9496 | 0.9594 | 4865.0 | 0.8075 | 0.7782 | 0.7926 | 248.0 | 0.1667 | 0.1364 | 0.15 | 22.0 | 0.9545 | 0.7234 | 0.7091 | 0.7157 | 10000.0 | 0.9542 | 0.9545 | 0.9543 | 10000.0 |
| 0.1912 | 9.0 | 8829 | 0.1070 | 0.9677 | 0.9605 | 0.9641 | 4865.0 | 0.9566 | 0.9694 | 0.9629 | 4865.0 | 0.8475 | 0.8065 | 0.8264 | 248.0 | 1.0 | 0.2273 | 0.3704 | 22.0 | 0.9594 | 0.9429 | 0.7409 | 0.7810 | 10000.0 | 0.9594 | 0.9594 | 0.9588 | 10000.0 |
| 0.1777 | 10.0 | 9810 | 0.1077 | 0.9654 | 0.9591 | 0.9623 | 4865.0 | 0.9564 | 0.9704 | 0.9634 | 4865.0 | 0.8829 | 0.7903 | 0.8340 | 248.0 | 0.4444 | 0.1818 | 0.2581 | 22.0 | 0.9587 | 0.8123 | 0.7254 | 0.7544 | 10000.0 | 0.9579 | 0.9587 | 0.9581 | 10000.0 |
| 0.1766 | 11.0 | 10791 | 0.1084 | 0.9621 | 0.9659 | 0.9640 | 4865.0 | 0.9633 | 0.9651 | 0.9642 | 4865.0 | 0.8584 | 0.8065 | 0.8316 | 248.0 | 0.4444 | 0.1818 | 0.2581 | 22.0 | 0.9598 | 0.8071 | 0.7298 | 0.7545 | 10000.0 | 0.9590 | 0.9598 | 0.9592 | 10000.0 |
| 0.1709 | 12.0 | 11772 | 0.1066 | 0.9623 | 0.9698 | 0.9660 | 4865.0 | 0.9671 | 0.9655 | 0.9663 | 4865.0 | 0.8789 | 0.7903 | 0.8323 | 248.0 | 0.2353 | 0.1818 | 0.2051 | 22.0 | 0.9615 | 0.7609 | 0.7268 | 0.7424 | 10000.0 | 0.9609 | 0.9615 | 0.9611 | 10000.0 |
| 0.1805 | 13.0 | 12753 | 0.1076 | 0.9703 | 0.9614 | 0.9658 | 4865.0 | 0.9598 | 0.9727 | 0.9662 | 4865.0 | 0.8636 | 0.7661 | 0.8120 | 248.0 | 0.2333 | 0.3182 | 0.2692 | 22.0 | 0.9606 | 0.7568 | 0.7546 | 0.7533 | 10000.0 | 0.9610 | 0.9606 | 0.9607 | 10000.0 |
| 0.1854 | 14.0 | 13734 | 0.1057 | 0.9731 | 0.9581 | 0.9655 | 4865.0 | 0.9585 | 0.9731 | 0.9657 | 4865.0 | 0.8031 | 0.8387 | 0.8205 | 248.0 | 0.4167 | 0.2273 | 0.2941 | 22.0 | 0.9608 | 0.7878 | 0.7493 | 0.7615 | 10000.0 | 0.9605 | 0.9608 | 0.9605 | 10000.0 |
| 0.1697 | 15.0 | 14715 | 0.1047 | 0.9674 | 0.9708 | 0.9691 | 4865.0 | 0.9686 | 0.9706 | 0.9696 | 4865.0 | 0.8734 | 0.8065 | 0.8386 | 248.0 | 0.4286 | 0.2727 | 0.3333 | 22.0 | 0.9651 | 0.8095 | 0.7551 | 0.7777 | 10000.0 | 0.9645 | 0.9651 | 0.9647 | 10000.0 |
| 0.1747 | 16.0 | 15696 | 0.1061 | 0.9656 | 0.9706 | 0.9681 | 4865.0 | 0.9713 | 0.9671 | 0.9692 | 4865.0 | 0.8110 | 0.8306 | 0.8207 | 248.0 | 0.5 | 0.2727 | 0.3529 | 22.0 | 0.9639 | 0.8120 | 0.7603 | 0.7777 | 10000.0 | 0.9635 | 0.9639 | 0.9636 | 10000.0 |
| 0.176 | 17.0 | 16677 | 0.1056 | 0.9697 | 0.9677 | 0.9687 | 4865.0 | 0.9651 | 0.9720 | 0.9686 | 4865.0 | 0.8696 | 0.8065 | 0.8368 | 248.0 | 0.4 | 0.2727 | 0.3243 | 22.0 | 0.9643 | 0.8011 | 0.7547 | 0.7746 | 10000.0 | 0.9637 | 0.9643 | 0.9640 | 10000.0 |
| 0.1541 | 18.0 | 17658 | 0.1038 | 0.9661 | 0.9714 | 0.9687 | 4865.0 | 0.9688 | 0.9700 | 0.9694 | 4865.0 | 0.8884 | 0.8024 | 0.8432 | 248.0 | 0.4615 | 0.2727 | 0.3429 | 22.0 | 0.965 | 0.8212 | 0.7541 | 0.7811 | 10000.0 | 0.9644 | 0.965 | 0.9646 | 10000.0 |
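
The per-class and averaged columns above follow scikit-learn's classification_report layout (precision, recall, F1-score, and support for each of the four relation labels, plus accuracy and macro/weighted averages). A minimal sketch of how such a report is produced, using hypothetical gold and predicted labels:

```python
# The per-class and averaged columns above follow scikit-learn's classification_report
# layout; a minimal sketch with hypothetical gold and predicted relation labels.
from sklearn.metrics import classification_report

labels = ["<", ">", "=", "-"]            # the four relation classes reported above
y_true = ["<", ">", "=", "<", "-", ">"]  # hypothetical gold labels
y_pred = ["<", ">", "=", ">", "-", ">"]  # hypothetical predictions

print(classification_report(y_true, y_pred, labels=labels, digits=4, zero_division=0))
```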

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.21.0
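
A small convenience sketch for checking that a local environment matches the versions listed above (exact version pinning is not required to load the model):

```python
# Convenience sketch: check whether the local environment matches the versions above.
from importlib.metadata import version

for pkg, expected in [
    ("transformers", "4.47.1"),
    ("torch", "2.5.1+cu124"),
    ("datasets", "3.0.1"),
    ("tokenizers", "0.21.0"),
]:
    print(f"{pkg}: installed {version(pkg)}, card lists {expected}")
```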