mt5-base-ft-rf-02
This model is a fine-tuned version of google/mt5-base on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4229 (validation loss at step 1600, the last logged evaluation)
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 8
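
The scheduler entry above can be made concrete with a small sketch. It assumes that the linear scheduler decays the learning rate from 5e-05 to zero with no warmup (this card lists no warmup setting) and that training runs for about 1640 optimizer steps (an estimate from the logged step-to-epoch ratio, not a value stated in this card). The names `linear_lr` and `TOTAL_STEPS` are hypothetical and used only for illustration.

```python
BASE_LR = 5e-05      # learning_rate from the list above
TOTAL_STEPS = 1640   # assumption: 8 epochs at roughly 205 steps per epoch


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay.

    With warmup_steps=0 (the assumption here) this is a straight line from
    base_lr at step 0 down to zero at total_steps.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    for step in (0, 400, 800, 1200, 1600):
        print(f"step {step:>4}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate is already close to zero by the last logged evaluation at step 1600.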
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 43.082 | 0.24 | 50 | 37.1069 |
| 34.6827 | 0.49 | 100 | 28.8296 |
| 21.0188 | 0.73 | 150 | 19.9344 |
| 18.3905 | 0.98 | 200 | 12.0120 |
| 14.342 | 1.22 | 250 | 9.2877 |
| 6.2116 | 1.46 | 300 | 6.1602 |
| 6.5474 | 1.71 | 350 | 4.6816 |
| 1.9222 | 1.95 | 400 | 2.6431 |
| 2.0579 | 2.2 | 450 | 1.2741 |
| 1.1028 | 2.44 | 500 | 0.9638 |
| 1.3341 | 2.68 | 550 | 0.8896 |
| 0.6531 | 2.93 | 600 | 0.8461 |
| 0.9805 | 3.17 | 650 | 0.7652 |
| 0.7167 | 3.41 | 700 | 0.7544 |
| 1.0224 | 3.66 | 750 | 0.7493 |
| 0.5367 | 3.9 | 800 | 0.7188 |
| 0.9352 | 4.15 | 850 | 0.6844 |
| 0.4927 | 4.39 | 900 | 0.6595 |
| 0.7141 | 4.63 | 950 | 0.6458 |
| 0.5773 | 4.88 | 1000 | 0.5911 |
| 0.4791 | 5.12 | 1050 | 0.5691 |
| 0.498 | 5.37 | 1100 | 0.5572 |
| 0.4306 | 5.61 | 1150 | 0.5315 |
| 0.334 | 5.85 | 1200 | 0.5123 |
| 0.3783 | 6.1 | 1250 | 0.4970 |
| 0.7719 | 6.34 | 1300 | 0.4774 |
| 0.3732 | 6.59 | 1350 | 0.4591 |
| 0.6203 | 6.83 | 1400 | 0.4482 |
| 0.4669 | 7.07 | 1450 | 0.4434 |
| 0.5568 | 7.32 | 1500 | 0.4307 |
| 0.6352 | 7.56 | 1550 | 0.4257 |
| 1.4137 | 7.8 | 1600 | 0.4229 |
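
The logged values can be read programmatically. The sketch below takes a handful of rows from the training results (not all of them), picks the row with the lowest validation loss, and estimates steps per epoch and training-set size from the last row. The size estimate assumes a single device and no gradient accumulation, neither of which this card states, and the logged epoch values are rounded, so the numbers are approximate. The names `ROWS`, `best_checkpoint`, and `steps_per_epoch` are hypothetical.

```python
# (training_loss, epoch, step, validation_loss): a subset of the logged rows
ROWS = [
    (43.082, 0.24, 50, 37.1069),
    (18.3905, 0.98, 200, 12.0120),
    (1.9222, 1.95, 400, 2.6431),
    (0.5367, 3.9, 800, 0.7188),
    (0.334, 5.85, 1200, 0.5123),
    (1.4137, 7.8, 1600, 0.4229),
]
TRAIN_BATCH_SIZE = 4  # train_batch_size from the hyperparameter list


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    _, epoch, step, _ = rows[-1]
    return step / epoch


if __name__ == "__main__":
    _, epoch, step, val_loss = best_checkpoint(ROWS)
    print(f"lowest validation loss {val_loss} at step {step} (epoch {epoch})")

    spe = steps_per_epoch(ROWS)
    # Assumption: one device, no gradient accumulation.
    approx_examples = round(spe) * TRAIN_BATCH_SIZE
    print(f"about {spe:.1f} steps per epoch, roughly {approx_examples} training examples")
```

Validation loss falls at every logged evaluation, so the lowest value is the final one at step 1600, even though the training loss logged at that step (1.4137) is noticeably higher than in the preceding rows.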
Framework versions
- Transformers 4.31.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.4
- Tokenizers 0.13.3
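
To compare a local environment against the versions listed above, a check along these lines may help. It is a minimal sketch using only the standard library; the function name `check_versions` is hypothetical, and it assumes the packages are installed under the distribution names `transformers`, `torch`, `datasets`, and `tokenizers`.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed pip distribution name.
EXPECTED = {
    "transformers": "4.31.0",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def check_versions(expected=EXPECTED):
    """Map each package to (wanted, installed, matches).

    `installed` is None when the package is not found in the environment.
    """
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, ok) in check_versions().items():
        status = "ok" if ok else "differs"
        print(f"{package}: wanted {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily prevent the model from loading; the check only reports where the environment differs from the one used for training.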