	---
	library_name: transformers
	license: apache-2.0
	base_model: HuggingFaceTB/SmolLM2-135M
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	model-index:
	- name: smol-135-tq-closure-augment-synthetic
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# smol-135-tq-closure-augment-synthetic

	This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
	It achieves the following results on the evaluation set:
	- Loss: 0.1898
	- `<` Precision: 0.9121
	- `<` Recall: 0.9051
	- `<` F1-score: 0.9086
	- `<` Support: 7717.0
	- `>` Precision: 0.9113
	- `>` Recall: 0.9016
	- `>` F1-score: 0.9065
	- `>` Support: 7717.0
	- `=` Precision: 0.7992
	- `=` Recall: 0.8098
	- `=` F1-score: 0.8045
	- `=` Support: 3244.0
	- `-` Precision: 0.7401
	- `-` Recall: 0.7950
	- `-` F1-score: 0.7666
	- `-` Support: 1322.0
	- Accuracy: 0.8810
	- Macro Avg Precision: 0.8407
	- Macro Avg Recall: 0.8529
	- Macro Avg F1-score: 0.8465
	- Macro Avg Support: 20000.0
	- Weighted Avg Precision: 0.8821
	- Weighted Avg Recall: 0.8810
	- Weighted Avg F1-score: 0.8815
	- Weighted Avg Support: 20000.0
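
The macro and weighted averages can be sanity-checked from the per-class figures. The following is a minimal sketch, not the evaluation code used for this model: it recomputes the two F1 averages from the rounded per-class F1 scores and supports listed above, so the results should agree with the reported values only to within rounding. The dictionary name `per_class` and the variable names are illustrative.

```python
# Per-class F1 and support copied from the evaluation results above.
# The keys are the four class labels as they appear in the card.
per_class = {
    "<": {"f1": 0.9086, "support": 7717},
    ">": {"f1": 0.9065, "support": 7717},
    "=": {"f1": 0.8045, "support": 3244},
    "-": {"f1": 0.7666, "support": 1322},
}

# Total number of evaluation examples across all classes.
total_support = sum(c["support"] for c in per_class.values())

# Macro average: unweighted mean over classes.
macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)

# Weighted average: mean over classes, weighted by class support.
weighted_f1 = (
    sum(c["f1"] * c["support"] for c in per_class.values()) / total_support
)

print(f"total support: {total_support}")
print(f"macro F1:      {macro_f1:.4f}")
print(f"weighted F1:   {weighted_f1:.4f}")
```

The gap between the macro and weighted F1 reflects the class imbalance: the two largest classes (`<` and `>`) score highest, so weighting by support pulls the average up.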

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.001
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 4
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 512
	- total_eval_batch_size: 256
	- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
	- lr_scheduler_type: reduce_lr_on_plateau
	- num_epochs: 30
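
The total batch sizes follow from the per-device values. The sketch below reproduces that arithmetic and adds a rough upper bound on the number of training examples seen per epoch. The bound assumes that the step count of 2708 at epoch 1.0 in the training results counts optimizer updates and that every update consumes a full batch; neither assumption is confirmed by the card, so treat the figure as an estimate.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 64            # per device
eval_batch_size = 64             # per device
num_devices = 4
gradient_accumulation_steps = 2

# Examples consumed per optimizer update across all devices.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)

# Evaluation does not accumulate gradients, so only devices multiply.
total_eval_batch_size = eval_batch_size * num_devices

# Step count at epoch 1.0, taken from the training results table.
# Assumption: each step is one optimizer update on a full batch.
steps_per_epoch = 2708
max_examples_per_epoch = steps_per_epoch * total_train_batch_size

print(total_train_batch_size)   # 512, matching the card
print(total_eval_batch_size)    # 256, matching the card
print(max_examples_per_epoch)   # upper bound, estimate only
```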

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | `<` Precision | `<` Recall | `<` F1-score | `<` Support | `>` Precision | `>` Recall | `>` F1-score | `>` Support | `=` Precision | `=` Recall | `=` F1-score | `=` Support | `-` Precision | `-` Recall | `-` F1-score | `-` Support | Accuracy | Macro Avg Precision | Macro Avg Recall | Macro Avg F1-score | Macro Avg Support | Weighted Avg Precision | Weighted Avg Recall | Weighted Avg F1-score | Weighted Avg Support |
	|:-------------:|:-----:|:-----:|:---------------:|:-----------:|:--------:|:----------:|:---------:|:-----------:|:--------:|:----------:|:---------:|:-----------:|:--------:|:----------:|:---------:|:-----------:|:--------:|:----------:|:---------:|:--------:|:-------------------:|:----------------:|:------------------:|:-----------------:|:----------------------:|:-------------------:|:---------------------:|:--------------------:|
	| 0.2065 | 1.0 | 2708 | 0.1948 | 0.9182 | 0.8800 | 0.8987 | 7717.0 | 0.9012 | 0.8923 | 0.8967 | 7717.0 | 0.7478 | 0.8576 | 0.7990 | 3244.0 | 0.7788 | 0.7322 | 0.7548 | 1322.0 | 0.8713 | 0.8365 | 0.8405 | 0.8373 | 20000.0 | 0.8748 | 0.8713 | 0.8722 | 20000.0 |
	| 0.1833 | 2.0 | 5416 | 0.1898 | 0.9121 | 0.9051 | 0.9086 | 7717.0 | 0.9113 | 0.9016 | 0.9065 | 7717.0 | 0.7992 | 0.8098 | 0.8045 | 3244.0 | 0.7401 | 0.7950 | 0.7666 | 1322.0 | 0.8810 | 0.8407 | 0.8529 | 0.8465 | 20000.0 | 0.8821 | 0.8810 | 0.8815 | 20000.0 |
	| 0.1415 | 3.0 | 8124 | 0.2006 | 0.8913 | 0.9220 | 0.9064 | 7717.0 | 0.9039 | 0.9116 | 0.9077 | 7717.0 | 0.8096 | 0.7747 | 0.7917 | 3244.0 | 0.8018 | 0.6853 | 0.7390 | 1322.0 | 0.8784 | 0.8516 | 0.8234 | 0.8362 | 20000.0 | 0.8770 | 0.8784 | 0.8772 | 20000.0 |
	| 0.1136 | 4.0 | 10832 | 0.2063 | 0.9045 | 0.9136 | 0.9090 | 7717.0 | 0.9038 | 0.9106 | 0.9072 | 7717.0 | 0.7968 | 0.8039 | 0.8004 | 3244.0 | 0.7876 | 0.6899 | 0.7355 | 1322.0 | 0.8799 | 0.8482 | 0.8295 | 0.8380 | 20000.0 | 0.8790 | 0.8799 | 0.8792 | 20000.0 |
	| 0.1051 | 5.0 | 13540 | 0.2285 | 0.9131 | 0.9079 | 0.9105 | 7717.0 | 0.9138 | 0.9093 | 0.9115 | 7717.0 | 0.7882 | 0.7975 | 0.7928 | 3244.0 | 0.7313 | 0.7557 | 0.7433 | 1322.0 | 0.8804 | 0.8366 | 0.8426 | 0.8395 | 20000.0 | 0.8811 | 0.8804 | 0.8807 | 20000.0 |
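
The headline evaluation results at the top of this card match the epoch 2 row, which is also the row with the lowest validation loss. The sketch below picks that row out of the logged history. Whether the published checkpoint was actually selected by validation loss is an assumption; the card does not state the selection criterion. The list name `history` and the tuple layout are illustrative.

```python
# (epoch, training loss, validation loss, accuracy), copied from the table above.
history = [
    (1.0, 0.2065, 0.1948, 0.8713),
    (2.0, 0.1833, 0.1898, 0.8810),
    (3.0, 0.1415, 0.2006, 0.8784),
    (4.0, 0.1136, 0.2063, 0.8799),
    (5.0, 0.1051, 0.2285, 0.8804),
]

# Select the row with the smallest validation loss (index 2 of each tuple).
best_epoch, best_train_loss, best_val_loss, best_accuracy = min(
    history, key=lambda row: row[2]
)

print(f"best epoch by validation loss: {best_epoch}")
print(f"validation loss: {best_val_loss}, accuracy: {best_accuracy}")
```

After epoch 2 the training loss keeps falling while the validation loss rises, a pattern usually read as the onset of overfitting.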


	### Framework versions

	- Transformers 4.47.1
	- Pytorch 2.5.1+cu124
	- Datasets 3.0.1
	- Tokenizers 0.21.0