hugosousa/smol-135-tq-closure-augment

+---
+library_name: transformers
+license: apache-2.0
+base_model: HuggingFaceTB/SmolLM2-135M
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: smol-135-tq-closure-augment
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# smol-135-tq-closure-augment
+This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.1038
+- < Precision: 0.9661
+- < Recall: 0.9714
+- < F1-score: 0.9687
+- < Support: 4865.0
+- > Precision: 0.9688
+- > Recall: 0.9700
+- > F1-score: 0.9694
+- > Support: 4865.0
+- = Precision: 0.8884
+- = Recall: 0.8024
+- = F1-score: 0.8432
+- = Support: 248.0
+- - Precision: 0.4615
+- - Recall: 0.2727
+- - F1-score: 0.3429
+- - Support: 22.0
+- Accuracy: 0.965
+- Macro Avg Precision: 0.8212
+- Macro Avg Recall: 0.7541
+- Macro Avg F1-score: 0.7811
+- Macro Avg Support: 10000.0
+- Weighted Avg Precision: 0.9644
+- Weighted Avg Recall: 0.965
+- Weighted Avg F1-score: 0.9646
+- Weighted Avg Support: 10000.0
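The macro and weighted averages listed above can be recomputed from the per-class rows. The following is a minimal sketch in plain Python, assuming the averages follow the usual classification-report convention (unweighted mean over classes for "macro", support-weighted mean for "weighted"). The dictionary layout and helper names are illustrative and are not part of the model card.

```python
# Per-class scores from the final evaluation, copied from the list above.
per_class = {
    "<": {"precision": 0.9661, "recall": 0.9714, "f1": 0.9687, "support": 4865},
    ">": {"precision": 0.9688, "recall": 0.9700, "f1": 0.9694, "support": 4865},
    "=": {"precision": 0.8884, "recall": 0.8024, "f1": 0.8432, "support": 248},
    "-": {"precision": 0.4615, "recall": 0.2727, "f1": 0.3429, "support": 22},
}


def macro_avg(metric: str) -> float:
    """Unweighted mean over classes: every class counts equally."""
    return sum(c[metric] for c in per_class.values()) / len(per_class)


def weighted_avg(metric: str) -> float:
    """Mean over classes weighted by support (number of evaluation examples)."""
    total = sum(c["support"] for c in per_class.values())
    return sum(c[metric] * c["support"] for c in per_class.values()) / total


for metric in ("precision", "recall", "f1"):
    print(f"{metric}: macro={macro_avg(metric):.4f} weighted={weighted_avg(metric):.4f}")
```

Because the "-" class has only 22 of the 10000 evaluation examples, its low F1-score pulls the macro average down to about 0.78 while barely affecting the weighted average or the accuracy. Small differences in the last digit are expected, since the inputs here are already rounded to four decimals.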
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.001
+- train_batch_size: 64
+- eval_batch_size: 64
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 512
+- total_eval_batch_size: 256
+- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08, no additional optimizer arguments
+- lr_scheduler_type: reduce_lr_on_plateau
+- num_epochs: 30
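The total batch sizes in the list above follow from the per-device sizes, the device count, and the gradient accumulation steps. The following is a small sketch of that arithmetic, assuming `train_batch_size` and `eval_batch_size` are per-device values and that gradient accumulation applies only to training. The function names are hypothetical.

```python
def total_train_batch_size(per_device: int, num_devices: int, grad_accum_steps: int) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device * num_devices * grad_accum_steps


def total_eval_batch_size(per_device: int, num_devices: int) -> int:
    """Number of examples evaluated per forward pass across all devices."""
    return per_device * num_devices


# Values taken from the hyperparameter list above.
total_train = total_train_batch_size(per_device=64, num_devices=4, grad_accum_steps=2)
total_eval = total_eval_batch_size(per_device=64, num_devices=4)
print(total_train, total_eval)
```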
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss | < Precision | < Recall | < F1-score | < Support | > Precision | > Recall | > F1-score | > Support | = Precision | = Recall | = F1-score | = Support | - Precision | - Recall | - F1-score | - Support | Accuracy | Macro Avg Precision | Macro Avg Recall | Macro Avg F1-score | Macro Avg Support | Weighted Avg Precision | Weighted Avg Recall | Weighted Avg F1-score | Weighted Avg Support |
+|:-------------:|:-----:|:-----:|:---------------:|:-----------:|:--------:|:----------:|:---------:|:-----------:|:--------:|:----------:|:---------:|:-----------:|:--------:|:----------:|:---------:|:-----------:|:--------:|:----------:|:---------:|:--------:|:-------------------:|:----------------:|:------------------:|:-----------------:|:----------------------:|:-------------------:|:---------------------:|:--------------------:|
+| 0.4482        | 1.0   | 981   | 0.2402          | 0.8193      | 0.8399   | 0.8295     | 4865.0    | 0.8187      | 0.8436   | 0.8309     | 4865.0    | 0.0         | 0.0      | 0.0        | 248.0     | 0.0         | 0.0      | 0.0        | 22.0      | 0.819    | 0.4095              | 0.4209           | 0.4151             | 10000.0           | 0.7969                 | 0.819               | 0.8078                | 10000.0              |
+| 0.2725        | 2.0   | 1962  | 0.1563          | 0.9366      | 0.9137   | 0.9250     | 4865.0    | 0.8957      | 0.9531   | 0.9235     | 4865.0    | 0.7532      | 0.2339   | 0.3569     | 248.0     | 0.0         | 0.0      | 0.0        | 22.0      | 0.914    | 0.6464              | 0.5252           | 0.5514             | 10000.0           | 0.9101                 | 0.914               | 0.9081                | 10000.0              |
+| 0.2609        | 3.0   | 2943  | 0.1362          | 0.9356      | 0.9464   | 0.9409     | 4865.0    | 0.9464      | 0.9361   | 0.9412     | 4865.0    | 0.6479      | 0.6976   | 0.6718     | 248.0     | 0.0         | 0.0      | 0.0        | 22.0      | 0.9331   | 0.6325              | 0.6450           | 0.6385             | 10000.0           | 0.9316                 | 0.9331              | 0.9323                | 10000.0              |
+| 0.2188        | 4.0   | 3924  | 0.1212          | 0.9452      | 0.9599   | 0.9525     | 4865.0    | 0.9559      | 0.9494   | 0.9527     | 4865.0    | 0.7803      | 0.7016   | 0.7389     | 248.0     | 0.5         | 0.0909   | 0.1538     | 22.0      | 0.9465   | 0.7953              | 0.6755           | 0.6995             | 10000.0           | 0.9453                 | 0.9465              | 0.9455                | 10000.0              |
+| 0.2196        | 5.0   | 4905  | 0.1162          | 0.9540      | 0.9632   | 0.9586     | 4865.0    | 0.9608      | 0.9568   | 0.9588     | 4865.0    | 0.7667      | 0.7419   | 0.7541     | 248.0     | 1.0         | 0.1364   | 0.24       | 22.0      | 0.9528   | 0.9204              | 0.6996           | 0.7279             | 10000.0           | 0.9528                 | 0.9528              | 0.9520                | 10000.0              |
+| 0.2002        | 6.0   | 5886  | 0.1131          | 0.9548      | 0.9630   | 0.9589     | 4865.0    | 0.9539      | 0.9620   | 0.9579     | 4865.0    | 0.8743      | 0.6452   | 0.7425     | 248.0     | 0.5         | 0.0909   | 0.1538     | 22.0      | 0.9527   | 0.8208              | 0.6653           | 0.7033             | 10000.0           | 0.9514                 | 0.9527              | 0.9513                | 10000.0              |
+| 0.2211        | 7.0   | 6867  | 0.1111          | 0.9552      | 0.9718   | 0.9634     | 4865.0    | 0.9694      | 0.9587   | 0.9640     | 4865.0    | 0.8533      | 0.7742   | 0.8118     | 248.0     | 0.4286      | 0.2727   | 0.3333     | 22.0      | 0.959    | 0.8016              | 0.7444           | 0.7682             | 10000.0           | 0.9584                 | 0.959               | 0.9586                | 10000.0              |
+| 0.1976        | 8.0   | 7848  | 0.1137          | 0.9502      | 0.9720   | 0.9610     | 4865.0    | 0.9694      | 0.9496   | 0.9594     | 4865.0    | 0.8075      | 0.7782   | 0.7926     | 248.0     | 0.1667      | 0.1364   | 0.15       | 22.0      | 0.9545   | 0.7234              | 0.7091           | 0.7157             | 10000.0           | 0.9542                 | 0.9545              | 0.9543                | 10000.0              |
+| 0.1912        | 9.0   | 8829  | 0.1070          | 0.9677      | 0.9605   | 0.9641     | 4865.0    | 0.9566      | 0.9694   | 0.9629     | 4865.0    | 0.8475      | 0.8065   | 0.8264     | 248.0     | 1.0         | 0.2273   | 0.3704     | 22.0      | 0.9594   | 0.9429              | 0.7409           | 0.7810             | 10000.0           | 0.9594                 | 0.9594              | 0.9588                | 10000.0              |
+| 0.1777        | 10.0  | 9810  | 0.1077          | 0.9654      | 0.9591   | 0.9623     | 4865.0    | 0.9564      | 0.9704   | 0.9634     | 4865.0    | 0.8829      | 0.7903   | 0.8340     | 248.0     | 0.4444      | 0.1818   | 0.2581     | 22.0      | 0.9587   | 0.8123              | 0.7254           | 0.7544             | 10000.0           | 0.9579                 | 0.9587              | 0.9581                | 10000.0              |
+| 0.1766        | 11.0  | 10791 | 0.1084          | 0.9621      | 0.9659   | 0.9640     | 4865.0    | 0.9633      | 0.9651   | 0.9642     | 4865.0    | 0.8584      | 0.8065   | 0.8316     | 248.0     | 0.4444      | 0.1818   | 0.2581     | 22.0      | 0.9598   | 0.8071              | 0.7298           | 0.7545             | 10000.0           | 0.9590                 | 0.9598              | 0.9592                | 10000.0              |
+| 0.1709        | 12.0  | 11772 | 0.1066          | 0.9623      | 0.9698   | 0.9660     | 4865.0    | 0.9671      | 0.9655   | 0.9663     | 4865.0    | 0.8789      | 0.7903   | 0.8323     | 248.0     | 0.2353      | 0.1818   | 0.2051     | 22.0      | 0.9615   | 0.7609              | 0.7268           | 0.7424             | 10000.0           | 0.9609                 | 0.9615              | 0.9611                | 10000.0              |
+| 0.1805        | 13.0  | 12753 | 0.1076          | 0.9703      | 0.9614   | 0.9658     | 4865.0    | 0.9598      | 0.9727   | 0.9662     | 4865.0    | 0.8636      | 0.7661   | 0.8120     | 248.0     | 0.2333      | 0.3182   | 0.2692     | 22.0      | 0.9606   | 0.7568              | 0.7546           | 0.7533             | 10000.0           | 0.9610                 | 0.9606              | 0.9607                | 10000.0              |
+| 0.1854        | 14.0  | 13734 | 0.1057          | 0.9731      | 0.9581   | 0.9655     | 4865.0    | 0.9585      | 0.9731   | 0.9657     | 4865.0    | 0.8031      | 0.8387   | 0.8205     | 248.0     | 0.4167      | 0.2273   | 0.2941     | 22.0      | 0.9608   | 0.7878              | 0.7493           | 0.7615             | 10000.0           | 0.9605                 | 0.9608              | 0.9605                | 10000.0              |
+| 0.1697        | 15.0  | 14715 | 0.1047          | 0.9674      | 0.9708   | 0.9691     | 4865.0    | 0.9686      | 0.9706   | 0.9696     | 4865.0    | 0.8734      | 0.8065   | 0.8386     | 248.0     | 0.4286      | 0.2727   | 0.3333     | 22.0      | 0.9651   | 0.8095              | 0.7551           | 0.7777             | 10000.0           | 0.9645                 | 0.9651              | 0.9647                | 10000.0              |
+| 0.1747        | 16.0  | 15696 | 0.1061          | 0.9656      | 0.9706   | 0.9681     | 4865.0    | 0.9713      | 0.9671   | 0.9692     | 4865.0    | 0.8110      | 0.8306   | 0.8207     | 248.0     | 0.5         | 0.2727   | 0.3529     | 22.0      | 0.9639   | 0.8120              | 0.7603           | 0.7777             | 10000.0           | 0.9635                 | 0.9639              | 0.9636                | 10000.0              |
+| 0.176         | 17.0  | 16677 | 0.1056          | 0.9697      | 0.9677   | 0.9687     | 4865.0    | 0.9651      | 0.9720   | 0.9686     | 4865.0    | 0.8696      | 0.8065   | 0.8368     | 248.0     | 0.4         | 0.2727   | 0.3243     | 22.0      | 0.9643   | 0.8011              | 0.7547           | 0.7746             | 10000.0           | 0.9637                 | 0.9643              | 0.9640                | 10000.0              |
+| 0.1541        | 18.0  | 17658 | 0.1038          | 0.9661      | 0.9714   | 0.9687     | 4865.0    | 0.9688      | 0.9700   | 0.9694     | 4865.0    | 0.8884      | 0.8024   | 0.8432     | 248.0     | 0.4615      | 0.2727   | 0.3429     | 22.0      | 0.965    | 0.8212              | 0.7541           | 0.7811             | 10000.0           | 0.9644                 | 0.965               | 0.9646                | 10000.0              |
+### Framework versions
+- Transformers 4.47.1
+- Pytorch 2.5.1+cu124
+- Datasets 3.0.1
+- Tokenizers 0.21.0

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:336452b241589cf9a6839b8d3e8f439b2667fd3530e9575718fd4b25aafcbe5b
+oid sha256:a8a18ee292655b324735c08a31207335f04edd842ebeea13cd0dfc5963c7a9e6
 size 269074456
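
The change above touches only the Git LFS pointer: the `oid` differs while the `size` is unchanged. The following is a minimal sketch, using only the Python standard library, of how a downloaded `model.safetensors` could be checked against such a pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are illustrative and are not part of Git LFS or any Hugging Face tooling.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and hash named by the pointer."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer content after this commit, copied from the diff above.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:a8a18ee292655b324735c08a31207335f04edd842ebeea13cd0dfc5963c7a9e6
size 269074456
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

A call such as `matches_pointer("model.safetensors", pointer)` would then confirm that a local copy corresponds to the updated weights rather than the previous ones; the local path is an assumption about where the file was saved.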