smol-135-tq

This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M on an unknown dataset. It achieves the following results on the evaluation set, with per-class metrics reported for the four labels <, >, =, and -:

  • Loss: 0.1959
  • < Precision: 0.9313
  • < Recall: 0.9555
  • < F1-score: 0.9432
  • < Support: 2808.0
  • > Precision: 0.9305
  • > Recall: 0.9134
  • > F1-score: 0.9218
  • > Support: 1743.0
  • = Precision: 0.8039
  • = Recall: 0.7305
  • = F1-score: 0.7655
  • = Support: 449.0
  • - Precision: 0.0
  • - Recall: 0.0
  • - F1-score: 0.0
  • - Support: 0.0
  • Accuracy: 0.9206
  • Macro Avg Precision: 0.6664
  • Macro Avg Recall: 0.6498
  • Macro Avg F1-score: 0.6576
  • Macro Avg Support: 5000.0
  • Weighted Avg Precision: 0.9196
  • Weighted Avg Recall: 0.9206
  • Weighted Avg F1-score: 0.9198
  • Weighted Avg Support: 5000.0
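
The macro averages sit well below the per-class scores because they are unweighted means over all four labels, including the - label, which has zero support in the evaluation set and therefore scores 0.0. A quick sanity check of the averaging arithmetic (standard classification-report style; the numbers are copied from the list above, shown here for precision):

```python
# Per-class precision and support, copied from the evaluation results above.
precision = {"<": 0.9313, ">": 0.9305, "=": 0.8039, "-": 0.0}
support = {"<": 2808, ">": 1743, "=": 449, "-": 0}

# Macro average: unweighted mean over all labels, zero-support "-" included.
macro = sum(precision.values()) / len(precision)
print(f"macro avg precision: {macro:.4f}")  # 0.6664, matching the report

# Weighted average: mean weighted by each label's support.
total = sum(support.values())
weighted = sum(precision[c] * support[c] for c in precision) / total
print(f"weighted avg precision: {weighted:.4f}")  # 0.9196, matching the report
```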

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 512
  • total_eval_batch_size: 256
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: reduce_lr_on_plateau
  • num_epochs: 30
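
For reference, a minimal sketch of how this configuration maps onto transformers.TrainingArguments. Only the hyperparameter values come from the list above; the output directory is a placeholder, and eval_strategy="epoch" is inferred from the per-epoch validation results in the table that follows.

```python
from transformers import TrainingArguments

# Per-device batch size 64 on 4 GPUs with 2 gradient-accumulation steps
# reproduces the reported total train batch size of 64 * 4 * 2 = 512
# (and total eval batch size of 64 * 4 = 256).
training_args = TrainingArguments(
    output_dir="smol-135-tq",  # placeholder, not taken from the model card
    learning_rate=1e-3,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=2,
    num_train_epochs=30,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="reduce_lr_on_plateau",
    eval_strategy="epoch",  # inferred: validation is reported once per epoch below
)
```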

Training results

| Training Loss | Epoch | Step | Validation Loss | < Precision | < Recall | < F1-score | < Support | > Precision | > Recall | > F1-score | > Support | = Precision | = Recall | = F1-score | = Support | - Precision | - Recall | - F1-score | - Support | Accuracy | Macro Avg Precision | Macro Avg Recall | Macro Avg F1-score | Macro Avg Support | Weighted Avg Precision | Weighted Avg Recall | Weighted Avg F1-score | Weighted Avg Support |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.8652 | 1.0 | 75 | 0.3609 | 0.6438 | 0.9127 | 0.7550 | 2808.0 | 0.7399 | 0.4326 | 0.5460 | 1743.0 | 0.0 | 0.0 | 0.0 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.6634 | 0.3459 | 0.3363 | 0.3253 | 5000.0 | 0.6195 | 0.6634 | 0.6144 | 5000.0 |
| 0.5947 | 2.0 | 150 | 0.2978 | 0.7978 | 0.8597 | 0.8276 | 2808.0 | 0.7472 | 0.6850 | 0.7148 | 1743.0 | 0.5106 | 0.4276 | 0.4655 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.76 | 0.5139 | 0.4931 | 0.5019 | 5000.0 | 0.7543 | 0.76 | 0.7557 | 5000.0 |
| 0.4873 | 3.0 | 225 | 0.2586 | 0.8546 | 0.8600 | 0.8573 | 2808.0 | 0.7672 | 0.7849 | 0.7760 | 1743.0 | 0.6036 | 0.5256 | 0.5619 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8038 | 0.5563 | 0.5426 | 0.5488 | 5000.0 | 0.8016 | 0.8038 | 0.8024 | 5000.0 |
| 0.4009 | 4.0 | 300 | 0.2340 | 0.8798 | 0.8786 | 0.8792 | 2808.0 | 0.8217 | 0.8090 | 0.8153 | 1743.0 | 0.5896 | 0.6303 | 0.6093 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.832 | 0.5728 | 0.5795 | 0.5759 | 5000.0 | 0.8335 | 0.832 | 0.8327 | 5000.0 |
| 0.3242 | 5.0 | 375 | 0.2144 | 0.8869 | 0.9192 | 0.9028 | 2808.0 | 0.8504 | 0.8543 | 0.8523 | 1743.0 | 0.7611 | 0.5746 | 0.6548 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8656 | 0.6246 | 0.5870 | 0.6025 | 5000.0 | 0.8629 | 0.8656 | 0.8629 | 5000.0 |
| 0.3002 | 6.0 | 450 | 0.2057 | 0.9039 | 0.9181 | 0.9110 | 2808.0 | 0.8655 | 0.8675 | 0.8665 | 1743.0 | 0.7332 | 0.6548 | 0.6918 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8768 | 0.6256 | 0.6101 | 0.6173 | 5000.0 | 0.8752 | 0.8768 | 0.8758 | 5000.0 |
| 0.2216 | 7.0 | 525 | 0.1920 | 0.8967 | 0.9402 | 0.9179 | 2808.0 | 0.8881 | 0.8698 | 0.8788 | 1743.0 | 0.7937 | 0.6169 | 0.6942 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8866 | 0.6446 | 0.6067 | 0.6228 | 5000.0 | 0.8845 | 0.8866 | 0.8842 | 5000.0 |
| 0.214 | 8.0 | 600 | 0.2088 | 0.9230 | 0.9220 | 0.9225 | 2808.0 | 0.8693 | 0.8853 | 0.8772 | 1743.0 | 0.7286 | 0.6815 | 0.7043 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8876 | 0.6302 | 0.6222 | 0.6260 | 5000.0 | 0.8868 | 0.8876 | 0.8871 | 5000.0 |
| 0.2029 | 9.0 | 675 | 0.2069 | 0.9010 | 0.9402 | 0.9202 | 2808.0 | 0.8986 | 0.8589 | 0.8783 | 1743.0 | 0.7698 | 0.6927 | 0.7292 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8896 | 0.6423 | 0.6229 | 0.6319 | 5000.0 | 0.8884 | 0.8896 | 0.8884 | 5000.0 |
| 0.2235 | 10.0 | 750 | 0.1974 | 0.9253 | 0.9263 | 0.9258 | 2808.0 | 0.8807 | 0.8933 | 0.8869 | 1743.0 | 0.7601 | 0.7127 | 0.7356 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8956 | 0.6415 | 0.6331 | 0.6371 | 5000.0 | 0.8949 | 0.8956 | 0.8952 | 5000.0 |
| 0.1841 | 11.0 | 825 | 0.1988 | 0.9152 | 0.9384 | 0.9267 | 2808.0 | 0.9093 | 0.8738 | 0.8912 | 1743.0 | 0.7466 | 0.7416 | 0.7441 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8982 | 0.6428 | 0.6385 | 0.6405 | 5000.0 | 0.8980 | 0.8982 | 0.8979 | 5000.0 |
| 0.1704 | 12.0 | 900 | 0.2004 | 0.9334 | 0.9281 | 0.9307 | 2808.0 | 0.8847 | 0.9071 | 0.8958 | 1743.0 | 0.7672 | 0.7194 | 0.7425 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.902 | 0.6463 | 0.6386 | 0.6422 | 5000.0 | 0.9015 | 0.902 | 0.9016 | 5000.0 |
| 0.1639 | 13.0 | 975 | 0.1904 | 0.9387 | 0.9330 | 0.9359 | 2808.0 | 0.8923 | 0.9266 | 0.9091 | 1743.0 | 0.7769 | 0.6904 | 0.7311 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.909 | 0.6520 | 0.6375 | 0.6440 | 5000.0 | 0.9080 | 0.909 | 0.9082 | 5000.0 |
| 0.1808 | 14.0 | 1050 | 0.1972 | 0.9216 | 0.9459 | 0.9336 | 2808.0 | 0.9133 | 0.9002 | 0.9067 | 1743.0 | 0.7975 | 0.7105 | 0.7515 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9088 | 0.6581 | 0.6391 | 0.6479 | 5000.0 | 0.9075 | 0.9088 | 0.9078 | 5000.0 |
| 0.1664 | 15.0 | 1125 | 0.2045 | 0.9275 | 0.9345 | 0.9310 | 2808.0 | 0.8991 | 0.9002 | 0.8997 | 1743.0 | 0.7653 | 0.7261 | 0.7451 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9038 | 0.6480 | 0.6402 | 0.6439 | 5000.0 | 0.9031 | 0.9038 | 0.9034 | 5000.0 |
| 0.1329 | 16.0 | 1200 | 0.1922 | 0.9375 | 0.9402 | 0.9388 | 2808.0 | 0.9054 | 0.9174 | 0.9114 | 1743.0 | 0.7799 | 0.7261 | 0.7520 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.913 | 0.6557 | 0.6459 | 0.6506 | 5000.0 | 0.9122 | 0.913 | 0.9125 | 5000.0 |
| 0.1289 | 17.0 | 1275 | 0.1983 | 0.9451 | 0.9387 | 0.9419 | 2808.0 | 0.9059 | 0.9225 | 0.9142 | 1743.0 | 0.7729 | 0.7506 | 0.7616 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9162 | 0.6560 | 0.6530 | 0.6544 | 5000.0 | 0.9160 | 0.9162 | 0.9161 | 5000.0 |
| 0.1276 | 18.0 | 1350 | 0.1980 | 0.9415 | 0.9405 | 0.9410 | 2808.0 | 0.9156 | 0.9151 | 0.9154 | 1743.0 | 0.7594 | 0.7661 | 0.7627 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.916 | 0.6541 | 0.6554 | 0.6548 | 5000.0 | 0.9161 | 0.916 | 0.9161 | 5000.0 |
| 0.1334 | 19.0 | 1425 | 0.1959 | 0.9313 | 0.9555 | 0.9432 | 2808.0 | 0.9305 | 0.9134 | 0.9218 | 1743.0 | 0.8039 | 0.7305 | 0.7655 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9206 | 0.6664 | 0.6498 | 0.6576 | 5000.0 | 0.9196 | 0.9206 | 0.9198 | 5000.0 |
| 0.1394 | 20.0 | 1500 | 0.2005 | 0.9487 | 0.9352 | 0.9419 | 2808.0 | 0.9068 | 0.9271 | 0.9169 | 1743.0 | 0.7689 | 0.7706 | 0.7697 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9176 | 0.6561 | 0.6582 | 0.6571 | 5000.0 | 0.9180 | 0.9176 | 0.9177 | 5000.0 |
| 0.1305 | 21.0 | 1575 | 0.2037 | 0.9247 | 0.9573 | 0.9407 | 2808.0 | 0.9353 | 0.9036 | 0.9192 | 1743.0 | 0.7897 | 0.7194 | 0.7529 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9172 | 0.6624 | 0.6451 | 0.6532 | 5000.0 | 0.9162 | 0.9172 | 0.9163 | 5000.0 |
| 0.1283 | 22.0 | 1650 | 0.2020 | 0.9366 | 0.9466 | 0.9416 | 2808.0 | 0.9242 | 0.9099 | 0.9170 | 1743.0 | 0.7668 | 0.7617 | 0.7642 | 449.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9172 | 0.6569 | 0.6545 | 0.6557 | 5000.0 | 0.9170 | 0.9172 | 0.9171 | 5000.0 |
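
The card does not say which task head the checkpoint uses, so the snippet below only inspects the stored config rather than assuming one; the architecture name in the comment is an illustrative guess, not verified against the repository.

```python
from transformers import AutoConfig

repo = "hugosousa/smol-135-tq"

# Read the checkpoint's config to see which head it ships with and which
# label mapping (if any) was saved alongside the weights.
config = AutoConfig.from_pretrained(repo)
print(config.architectures)               # e.g. ["LlamaForSequenceClassification"] (unverified guess)
print(getattr(config, "id2label", None))  # label mapping, if one was saved
```

If the printed architecture turns out to be a sequence-classification head, the matching Auto class (AutoModelForSequenceClassification) will load it directly.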

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.21.0
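
A small check, assuming exact version matches are desired, that a local environment lines up with the versions listed above:

```python
# Compare installed package versions against the pins from this model card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.47.1",
    "torch": "2.5.1+cu124",
    "datasets": "3.0.1",
    "tokenizers": "0.21.0",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    print(f"{name}: expected {want}, installed {have}" + ("" if have == want else "  <-- mismatch"))
```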