---
library_name: transformers
license: apache-2.0
base_model: HuggingFaceTB/SmolLM2-135M
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: smol-135-tq-closure-augment-synthetic
    results: []
---

smol-135-tq-closure-augment-synthetic

This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M; the fine-tuning dataset is not named in this card. It achieves the following results on the evaluation set:

  • Loss: 0.1898
  • < Precision: 0.9121
  • < Recall: 0.9051
  • < F1-score: 0.9086
  • < Support: 7717.0
  • > Precision: 0.9113
  • > Recall: 0.9016
  • > F1-score: 0.9065
  • > Support: 7717.0
  • = Precision: 0.7992
  • = Recall: 0.8098
  • = F1-score: 0.8045
  • = Support: 3244.0
  • - Precision: 0.7401
  • - Recall: 0.7950
  • - F1-score: 0.7666
  • - Support: 1322.0
  • Accuracy: 0.8810
  • Macro Avg Precision: 0.8407
  • Macro Avg Recall: 0.8529
  • Macro Avg F1-score: 0.8465
  • Macro Avg Support: 20000.0
  • Weighted Avg Precision: 0.8821
  • Weighted Avg Recall: 0.8810
  • Weighted Avg F1-score: 0.8815
  • Weighted Avg Support: 20000.0
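
This breakdown into per-class, macro-average, and weighted-average rows mirrors the layout of scikit-learn's `classification_report`. As a minimal sketch of how such numbers are produced (the gold and predicted label lists below are placeholders, since the evaluation data is not published with this card):

```python
from sklearn.metrics import classification_report

# The four relation labels reported in the evaluation above.
labels = ["<", ">", "=", "-"]

# Hypothetical gold labels and predictions; the real evaluation set
# (20000 examples) is not included in this card.
y_true = ["<", ">", "=", "-", "<"]
y_pred = ["<", ">", "=", "<", "<"]

# classification_report emits per-class precision/recall/F1/support plus
# the accuracy, macro-average, and weighted-average rows shown above.
print(classification_report(y_true, y_pred, labels=labels, digits=4))
```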

Model description

More information needed

Intended uses & limitations

More information needed
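
In the absence of documented usage, here is a minimal loading sketch. It assumes the checkpoint is used as a causal language model like its SmolLM2-135M base; the repository id is inferred from the card title rather than confirmed, and the prompt format for the four relation labels (<, >, =, -) is likewise undocumented.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository id, inferred from the model name in this card.
model_id = "hugosousa/smol-135-tq-closure-augment-synthetic"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The expected input format is an assumption; replace with the task's
# actual prompt once documented.
inputs = tokenizer("your input text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```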

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 512
  • total_eval_batch_size: 256
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: reduce_lr_on_plateau
  • num_epochs: 30
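
For reference, a sketch of how these values map onto `transformers.TrainingArguments` (the `output_dir` is a placeholder; the 4-GPU launch itself is handled by the launcher, e.g. `accelerate` or `torchrun`, not by these arguments):

```python
from transformers import TrainingArguments

# Per-device batch size 64 across 4 GPUs with 2 accumulation steps gives
# the total train batch size of 64 * 4 * 2 = 512 reported above; eval uses
# no accumulation, hence 64 * 4 = 256.
args = TrainingArguments(
    output_dir="smol-135-tq-closure-augment-synthetic",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="reduce_lr_on_plateau",
    num_train_epochs=30,
)
```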

Training results

| Training Loss | Epoch | Step | Validation Loss | < Precision | < Recall | < F1-score | < Support | > Precision | > Recall | > F1-score | > Support | = Precision | = Recall | = F1-score | = Support | - Precision | - Recall | - F1-score | - Support | Accuracy | Macro Avg Precision | Macro Avg Recall | Macro Avg F1-score | Macro Avg Support | Weighted Avg Precision | Weighted Avg Recall | Weighted Avg F1-score | Weighted Avg Support |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.2065 | 1.0 | 2708 | 0.1948 | 0.9182 | 0.8800 | 0.8987 | 7717.0 | 0.9012 | 0.8923 | 0.8967 | 7717.0 | 0.7478 | 0.8576 | 0.7990 | 3244.0 | 0.7788 | 0.7322 | 0.7548 | 1322.0 | 0.8713 | 0.8365 | 0.8405 | 0.8373 | 20000.0 | 0.8748 | 0.8713 | 0.8722 | 20000.0 |
| 0.1833 | 2.0 | 5416 | 0.1898 | 0.9121 | 0.9051 | 0.9086 | 7717.0 | 0.9113 | 0.9016 | 0.9065 | 7717.0 | 0.7992 | 0.8098 | 0.8045 | 3244.0 | 0.7401 | 0.7950 | 0.7666 | 1322.0 | 0.8810 | 0.8407 | 0.8529 | 0.8465 | 20000.0 | 0.8821 | 0.8810 | 0.8815 | 20000.0 |
| 0.1415 | 3.0 | 8124 | 0.2006 | 0.8913 | 0.9220 | 0.9064 | 7717.0 | 0.9039 | 0.9116 | 0.9077 | 7717.0 | 0.8096 | 0.7747 | 0.7917 | 3244.0 | 0.8018 | 0.6853 | 0.7390 | 1322.0 | 0.8784 | 0.8516 | 0.8234 | 0.8362 | 20000.0 | 0.8770 | 0.8784 | 0.8772 | 20000.0 |
| 0.1136 | 4.0 | 10832 | 0.2063 | 0.9045 | 0.9136 | 0.9090 | 7717.0 | 0.9038 | 0.9106 | 0.9072 | 7717.0 | 0.7968 | 0.8039 | 0.8004 | 3244.0 | 0.7876 | 0.6899 | 0.7355 | 1322.0 | 0.8799 | 0.8482 | 0.8295 | 0.8380 | 20000.0 | 0.8790 | 0.8799 | 0.8792 | 20000.0 |
| 0.1051 | 5.0 | 13540 | 0.2285 | 0.9131 | 0.9079 | 0.9105 | 7717.0 | 0.9138 | 0.9093 | 0.9115 | 7717.0 | 0.7882 | 0.7975 | 0.7928 | 3244.0 | 0.7313 | 0.7557 | 0.7433 | 1322.0 | 0.8804 | 0.8366 | 0.8426 | 0.8395 | 20000.0 | 0.8811 | 0.8804 | 0.8807 | 20000.0 |

The headline evaluation results above correspond to the epoch-2 checkpoint, which has the lowest validation loss (0.1898); from epoch 3 onward validation loss rises while training loss keeps falling. Only the first five of the 30 configured epochs are logged here.

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.21.0
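
To check that a local environment matches these versions, a quick sketch:

```python
import datasets
import tokenizers
import torch
import transformers

# Versions reported in this card: Transformers 4.47.1, PyTorch 2.5.1+cu124,
# Datasets 3.0.1, Tokenizers 0.21.0.
for name, module in [
    ("transformers", transformers),
    ("torch", torch),
    ("datasets", datasets),
    ("tokenizers", tokenizers),
]:
    print(f"{name}: {module.__version__}")
```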