hugosousa/smol-135-tq-closure-augment-synthetic

+---
+library_name: transformers
+license: apache-2.0
+base_model: HuggingFaceTB/SmolLM2-135M
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: smol-135-tq-closure-augment-synthetic
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# smol-135-tq-closure-augment-synthetic
+This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
+It achieves the following results on the evaluation set:
+- Loss: 0.2285
+- `<` Precision: 0.9131
+- `<` Recall: 0.9079
+- `<` F1-score: 0.9105
+- `<` Support: 7717.0
+- `>` Precision: 0.9138
+- `>` Recall: 0.9093
+- `>` F1-score: 0.9115
+- `>` Support: 7717.0
+- `=` Precision: 0.7882
+- `=` Recall: 0.7975
+- `=` F1-score: 0.7928
+- `=` Support: 3244.0
+- `-` Precision: 0.7313
+- `-` Recall: 0.7557
+- `-` F1-score: 0.7433
+- `-` Support: 1322.0
+- Accuracy: 0.8804
+- Macro Avg Precision: 0.8366
+- Macro Avg Recall: 0.8426
+- Macro Avg F1-score: 0.8395
+- Macro Avg Support: 20000.0
+- Weighted Avg Precision: 0.8811
+- Weighted Avg Recall: 0.8804
+- Weighted Avg F1-score: 0.8807
+- Weighted Avg Support: 20000.0
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.001
+- train_batch_size: 64
+- eval_batch_size: 64
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 512
+- total_eval_batch_size: 256
+- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
+- lr_scheduler_type: reduce_lr_on_plateau
+- num_epochs: 30
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss | < Precision | < Recall | < F1-score | < Support | > Precision | > Recall | > F1-score | > Support | = Precision | = Recall | = F1-score | = Support | - Precision | - Recall | - F1-score | - Support | Accuracy | Macro Avg Precision | Macro Avg Recall | Macro Avg F1-score | Macro Avg Support | Weighted Avg Precision | Weighted Avg Recall | Weighted Avg F1-score | Weighted Avg Support |
+|:-------------:|:-----:|:-----:|:---------------:|:-----------:|:--------:|:----------:|:---------:|:-----------:|:--------:|:----------:|:---------:|:-----------:|:--------:|:----------:|:---------:|:-----------:|:--------:|:----------:|:---------:|:--------:|:-------------------:|:----------------:|:------------------:|:-----------------:|:----------------------:|:-------------------:|:---------------------:|:--------------------:|
+| 0.2065        | 1.0   | 2708  | 0.1948          | 0.9182      | 0.8800   | 0.8987     | 7717.0    | 0.9012      | 0.8923   | 0.8967     | 7717.0    | 0.7478      | 0.8576   | 0.7990     | 3244.0    | 0.7788      | 0.7322   | 0.7548     | 1322.0    | 0.8713   | 0.8365              | 0.8405           | 0.8373             | 20000.0           | 0.8748                 | 0.8713              | 0.8722                | 20000.0              |
+| 0.1833        | 2.0   | 5416  | 0.1898          | 0.9121      | 0.9051   | 0.9086     | 7717.0    | 0.9113      | 0.9016   | 0.9065     | 7717.0    | 0.7992      | 0.8098   | 0.8045     | 3244.0    | 0.7401      | 0.7950   | 0.7666     | 1322.0    | 0.8810   | 0.8407              | 0.8529           | 0.8465             | 20000.0           | 0.8821                 | 0.8810              | 0.8815                | 20000.0              |
+| 0.1415        | 3.0   | 8124  | 0.2006          | 0.8913      | 0.9220   | 0.9064     | 7717.0    | 0.9039      | 0.9116   | 0.9077     | 7717.0    | 0.8096      | 0.7747   | 0.7917     | 3244.0    | 0.8018      | 0.6853   | 0.7390     | 1322.0    | 0.8784   | 0.8516              | 0.8234           | 0.8362             | 20000.0           | 0.8770                 | 0.8784              | 0.8772                | 20000.0              |
+| 0.1136        | 4.0   | 10832 | 0.2063          | 0.9045      | 0.9136   | 0.9090     | 7717.0    | 0.9038      | 0.9106   | 0.9072     | 7717.0    | 0.7968      | 0.8039   | 0.8004     | 3244.0    | 0.7876      | 0.6899   | 0.7355     | 1322.0    | 0.8799   | 0.8482              | 0.8295           | 0.8380             | 20000.0           | 0.8790                 | 0.8799              | 0.8792                | 20000.0              |
+| 0.1051        | 5.0   | 13540 | 0.2285          | 0.9131      | 0.9079   | 0.9105     | 7717.0    | 0.9138      | 0.9093   | 0.9115     | 7717.0    | 0.7882      | 0.7975   | 0.7928     | 3244.0    | 0.7313      | 0.7557   | 0.7433     | 1322.0    | 0.8804   | 0.8366              | 0.8426           | 0.8395             | 20000.0           | 0.8811                 | 0.8804              | 0.8807                | 20000.0              |
+### Framework versions
+- Transformers 4.47.1
+- Pytorch 2.5.1+cu124
+- Datasets 3.0.1
+- Tokenizers 0.21.0
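
The macro and weighted averages reported in the card can be recomputed from the four per-class rows, which is a quick way to check that the table is internally consistent. The sketch below is not part of the original card: it uses only the final-epoch numbers, the helper names are illustrative, and small differences in the fourth decimal place are expected because the inputs are already rounded.

```python
# Per-class scores copied from the final evaluation row of the model card.
# Keys are the four class labels; values are (precision, recall, f1, support).
per_class = {
    "<": (0.9131, 0.9079, 0.9105, 7717),
    ">": (0.9138, 0.9093, 0.9115, 7717),
    "=": (0.7882, 0.7975, 0.7928, 3244),
    "-": (0.7313, 0.7557, 0.7433, 1322),
}


def macro_average(scores, column):
    """Unweighted mean of one metric column across classes."""
    values = [row[column] for row in scores.values()]
    return sum(values) / len(values)


def weighted_average(scores, column):
    """Mean of one metric column, weighted by each class's support."""
    total = sum(row[3] for row in scores.values())
    return sum(row[column] * row[3] for row in scores.values()) / total


total_support = sum(row[3] for row in per_class.values())
macro = [macro_average(per_class, i) for i in range(3)]
weighted = [weighted_average(per_class, i) for i in range(3)]

print("total support:", total_support)
print("macro    P/R/F1:", [round(v, 4) for v in macro])
print("weighted P/R/F1:", [round(v, 4) for v in weighted])
```

For single-label classification, support-weighted recall is the same quantity as accuracy, which is why both appear as 0.8804 in the card.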

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5817a386c370b8c3d14c62b4ddcdedf3652e46d9b8b53dc7d59857ae4e36da73
+oid sha256:2bcde79c40c3c95afaddf2ad7acba7c60d7ffd017906fdc208533dfbb0dc3320
 size 269074456
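
The change above touches only the Git LFS pointer, a three-line text file that records the hash and byte size of the real weights file. A minimal sketch of reading such a pointer follows. The `parse_lfs_pointer` helper is hypothetical, and the 2-bytes-per-parameter figure is an assumption (16-bit weights, ignoring the safetensors header) that the diff does not confirm.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is given in bytes
    return fields


# The new pointer contents, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2bcde79c40c3c95afaddf2ad7acba7c60d7ffd017906fdc208533dfbb0dc3320
size 269074456
"""

pointer = parse_lfs_pointer(NEW_POINTER)

# Assumption: weights stored at 2 bytes per parameter (e.g. bfloat16),
# with the safetensors header treated as negligible.
BYTES_PER_PARAM = 2
approx_params = pointer["size"] // BYTES_PER_PARAM

print(pointer["oid"])
print(f"{pointer['size']} bytes, roughly {approx_params / 1e6:.1f}M parameters")
```

Under that assumption the file size corresponds to roughly 134.5 million parameters, which is in line with a 135M-parameter base model. The unchanged size alongside a new hash suggests the weights were updated without any change to the architecture.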