---
library_name: transformers
license: apache-2.0
base_model: HuggingFaceTB/SmolLM2-135M
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: smol-135-tq-closure-augment-synthetic
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# smol-135-tq-closure-augment-synthetic

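This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M) on an unspecified dataset (the Trainer recorded no dataset name).
It achieves the following results on the evaluation set, which match the epoch 2 checkpoint (step 5416), the one with the lowest validation loss in the training results table: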
- Loss: 0.1898
- `<` Precision: 0.9121
- `<` Recall: 0.9051
- `<` F1-score: 0.9086
- `<` Support: 7717.0
- `>` Precision: 0.9113
- `>` Recall: 0.9016
- `>` F1-score: 0.9065
- `>` Support: 7717.0
- `=` Precision: 0.7992
- `=` Recall: 0.8098
- `=` F1-score: 0.8045
- `=` Support: 3244.0
- `-` Precision: 0.7401
- `-` Recall: 0.7950
- `-` F1-score: 0.7666
- `-` Support: 1322.0
- Accuracy: 0.8810
- Macro Avg Precision: 0.8407
- Macro Avg Recall: 0.8529
- Macro Avg F1-score: 0.8465
- Macro Avg Support: 20000.0
- Weighted Avg Precision: 0.8821
- Weighted Avg Recall: 0.8810
- Weighted Avg F1-score: 0.8815
- Weighted Avg Support: 20000.0
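
The macro and weighted averages above can be reproduced from the per-class rows. The following is a minimal sketch, assuming the usual definitions (macro = unweighted mean over classes, weighted = mean weighted by class support); the helper names `macro_avg` and `weighted_avg` are illustrative and not part of the training code.

```python
# Per-class evaluation metrics copied from the list above.
per_class = {
    "<": {"precision": 0.9121, "recall": 0.9051, "f1": 0.9086, "support": 7717},
    ">": {"precision": 0.9113, "recall": 0.9016, "f1": 0.9065, "support": 7717},
    "=": {"precision": 0.7992, "recall": 0.8098, "f1": 0.8045, "support": 3244},
    "-": {"precision": 0.7401, "recall": 0.7950, "f1": 0.7666, "support": 1322},
}


def macro_avg(metric: str) -> float:
    """Unweighted mean of a metric over all classes (hypothetical helper)."""
    return sum(c[metric] for c in per_class.values()) / len(per_class)


def weighted_avg(metric: str) -> float:
    """Mean of a metric weighted by each class's support (hypothetical helper)."""
    total = sum(c["support"] for c in per_class.values())
    return sum(c[metric] * c["support"] for c in per_class.values()) / total


total_support = sum(c["support"] for c in per_class.values())

for metric in ("precision", "recall", "f1"):
    print(f"{metric}: macro={macro_avg(metric):.4f} weighted={weighted_avg(metric):.4f}")
```

Because the inputs are already rounded to four decimals, the recomputed values can differ from the reported ones in the last digit. The support-weighted recall equals the overall accuracy, which is why both are reported as 0.8810.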

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 512
- total_eval_batch_size: 256
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: reduce_lr_on_plateau
- num_epochs: 30
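
The total batch sizes follow from the per-device values. A short sketch of the arithmetic, assuming `train_batch_size` and `eval_batch_size` are per-device sizes (the variable names below are illustrative):

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 64
per_device_eval_batch_size = 64
num_devices = 4
gradient_accumulation_steps = 2

# Gradients are accumulated over several forward/backward passes on every
# device before each optimizer step, so all three factors multiply.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)

# Evaluation does not accumulate gradients, so only the device count applies.
total_eval_batch_size = per_device_eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)
```

This gives 512 for training and 256 for evaluation, matching the listed totals.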

### Training results

| Training Loss | Epoch | Step  | Validation Loss | < Precision | < Recall | < F1-score | < Support | > Precision | > Recall | > F1-score | > Support | = Precision | = Recall | = F1-score | = Support | - Precision | - Recall | - F1-score | - Support | Accuracy | Macro Avg Precision | Macro Avg Recall | Macro Avg F1-score | Macro Avg Support | Weighted Avg Precision | Weighted Avg Recall | Weighted Avg F1-score | Weighted Avg Support |
|:-------------:|:-----:|:-----:|:---------------:|:-----------:|:--------:|:----------:|:---------:|:-----------:|:--------:|:----------:|:---------:|:-----------:|:--------:|:----------:|:---------:|:-----------:|:--------:|:----------:|:---------:|:--------:|:-------------------:|:----------------:|:------------------:|:-----------------:|:----------------------:|:-------------------:|:---------------------:|:--------------------:|
| 0.2065        | 1.0   | 2708  | 0.1948          | 0.9182      | 0.8800   | 0.8987     | 7717.0    | 0.9012      | 0.8923   | 0.8967     | 7717.0    | 0.7478      | 0.8576   | 0.7990     | 3244.0    | 0.7788      | 0.7322   | 0.7548     | 1322.0    | 0.8713   | 0.8365              | 0.8405           | 0.8373             | 20000.0           | 0.8748                 | 0.8713              | 0.8722                | 20000.0              |
| 0.1833        | 2.0   | 5416  | 0.1898          | 0.9121      | 0.9051   | 0.9086     | 7717.0    | 0.9113      | 0.9016   | 0.9065     | 7717.0    | 0.7992      | 0.8098   | 0.8045     | 3244.0    | 0.7401      | 0.7950   | 0.7666     | 1322.0    | 0.8810   | 0.8407              | 0.8529           | 0.8465             | 20000.0           | 0.8821                 | 0.8810              | 0.8815                | 20000.0              |
| 0.1415        | 3.0   | 8124  | 0.2006          | 0.8913      | 0.9220   | 0.9064     | 7717.0    | 0.9039      | 0.9116   | 0.9077     | 7717.0    | 0.8096      | 0.7747   | 0.7917     | 3244.0    | 0.8018      | 0.6853   | 0.7390     | 1322.0    | 0.8784   | 0.8516              | 0.8234           | 0.8362             | 20000.0           | 0.8770                 | 0.8784              | 0.8772                | 20000.0              |
| 0.1136        | 4.0   | 10832 | 0.2063          | 0.9045      | 0.9136   | 0.9090     | 7717.0    | 0.9038      | 0.9106   | 0.9072     | 7717.0    | 0.7968      | 0.8039   | 0.8004     | 3244.0    | 0.7876      | 0.6899   | 0.7355     | 1322.0    | 0.8799   | 0.8482              | 0.8295           | 0.8380             | 20000.0           | 0.8790                 | 0.8799              | 0.8792                | 20000.0              |
| 0.1051        | 5.0   | 13540 | 0.2285          | 0.9131      | 0.9079   | 0.9105     | 7717.0    | 0.9138      | 0.9093   | 0.9115     | 7717.0    | 0.7882      | 0.7975   | 0.7928     | 3244.0    | 0.7313      | 0.7557   | 0.7433     | 1322.0    | 0.8804   | 0.8366              | 0.8426           | 0.8395             | 20000.0           | 0.8811                 | 0.8804              | 0.8807                | 20000.0              |
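
To see which row the headline metrics come from, the table can be scanned for the lowest validation loss. This is a minimal sketch using a subset of the columns; the `EpochResult` type is introduced here for illustration only, and it is an assumption that the reported checkpoint was selected by validation loss.

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    """One row of the training results table (illustrative subset of columns)."""
    epoch: int
    step: int
    train_loss: float
    val_loss: float
    accuracy: float
    macro_f1: float


# Values copied from the table above.
results = [
    EpochResult(1, 2708, 0.2065, 0.1948, 0.8713, 0.8373),
    EpochResult(2, 5416, 0.1833, 0.1898, 0.8810, 0.8465),
    EpochResult(3, 8124, 0.1415, 0.2006, 0.8784, 0.8362),
    EpochResult(4, 10832, 0.1136, 0.2063, 0.8799, 0.8380),
    EpochResult(5, 13540, 0.1051, 0.2285, 0.8804, 0.8395),
]

# Row with the lowest validation loss.
best = min(results, key=lambda r: r.val_loss)

# Gap between validation and training loss per epoch; a widening gap is a
# common sign of overfitting.
gaps = {r.epoch: round(r.val_loss - r.train_loss, 4) for r in results}

print(f"best epoch: {best.epoch} (step {best.step}, val loss {best.val_loss})")
print(gaps)
```

Epoch 2 has both the lowest validation loss and the highest accuracy and macro F1. After that point training loss keeps falling while validation loss rises, which suggests the later epochs overfit.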


### Framework versions

- Transformers 4.47.1
- PyTorch 2.5.1+cu124
- Datasets 3.0.1
- Tokenizers 0.21.0