madatnlp/mt5-kormath

This model is a fine-tuned version of google/mt5-small on an unknown dataset. At the final training epoch it reports the following losses (Validation Loss is measured on the evaluation set; epochs are numbered from 0):

  • Train Loss: 0.7119
  • Validation Loss: 1.1299
  • Epoch: 61

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'Adam', 'learning_rate': 0.001, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
  • training_precision: mixed_bfloat16
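
To make the optimizer settings above concrete, here is a minimal sketch of one Adam update on a single scalar parameter, using the listed learning_rate, beta_1, beta_2 and epsilon. It follows the standard Adam update rule with amsgrad off and no decay; it is an illustration only, not the TensorFlow implementation used in training, and it does not model mixed_bfloat16 precision. The function name adam_step and the example parameter and gradient values are hypothetical.

```python
import math

# Hyperparameters copied from the optimizer entry above.
LEARNING_RATE = 0.001
BETA_1 = 0.9
BETA_2 = 0.999
EPSILON = 1e-07


def adam_step(param, grad, m, v, t):
    """Apply one Adam update to a scalar parameter.

    m and v are the running first and second moment estimates,
    and t is the 1-based step counter.
    """
    # Exponential moving averages of the gradient and its square.
    m = BETA_1 * m + (1.0 - BETA_1) * grad
    v = BETA_2 * v + (1.0 - BETA_2) * grad * grad
    # Bias correction folded into the effective step size.
    step_size = LEARNING_RATE * math.sqrt(1.0 - BETA_2 ** t) / (1.0 - BETA_1 ** t)
    param = param - step_size * m / (math.sqrt(v) + EPSILON)
    return param, m, v


# Hypothetical starting point: parameter 1.0 with gradient 0.5.
param, m, v = adam_step(param=1.0, grad=0.5, m=0.0, v=0.0, t=1)
print(param, m, v)
```

On the very first step the update is close to learning_rate times the sign of the gradient, so the parameter moves by roughly 0.001 regardless of the gradient's magnitude.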

Training results

Train Loss Validation Loss Epoch
17.9929 5.9287 0
5.4802 3.9942 1
4.1718 3.2517 2
3.5750 2.9586 3
3.1535 2.4970 4
2.8665 2.4626 5
2.6682 2.3795 6
2.5323 2.2238 7
2.4057 2.0684 8
2.3107 2.2033 9
2.2501 1.8339 10
2.1089 1.9064 11
2.0741 2.0256 12
1.9868 1.8107 13
1.9719 1.7157 14
1.8762 1.6966 15
1.8814 1.6580 16
1.8052 1.6043 17
1.7567 1.6572 18
1.7209 1.5485 19
1.7347 1.6464 20
1.6760 1.5892 21
1.6286 1.5765 22
1.6124 1.7408 23
1.5683 1.4875 24
1.5814 1.4448 25
1.5306 1.4902 26
1.5121 1.5133 27
1.4869 1.4217 28
1.4539 1.5602 29
1.4650 1.4699 30
1.4508 1.4319 31
1.3910 1.5975 32
1.3758 1.4031 33
1.3550 1.4295 34
1.3405 1.3804 35
1.3144 1.4202 36
1.3136 1.5135 37
1.2617 1.4790 38
1.2260 1.4108 39
1.2348 1.3108 40
1.2019 1.1461 41
1.1775 1.2509 42
1.1690 1.2179 43
1.1318 1.2483 44
1.1013 1.0815 45
1.0735 1.2135 46
1.0439 1.1260 47
1.0182 1.1993 48
0.9971 1.0797 49
0.9583 1.2587 50
0.9505 1.0793 51
0.9366 1.0501 52
0.9170 1.1476 53
0.8741 1.0560 54
0.8558 1.0024 55
0.8394 0.9604 56
0.8203 1.2700 57
0.7938 1.1081 58
0.7800 1.0198 59
0.7378 1.1748 60
0.7119 1.1299 61
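
The validation loss in the table above does not decrease monotonically, so the final epoch is not necessarily the best one. The sketch below, which assumes only the validation losses copied from the table (indexed by epoch), finds the epoch with the lowest validation loss and compares it with the final epoch. Variable names are illustrative.

```python
# Validation losses copied from the table above, indexed by epoch (0 to 61).
val_loss = [
    5.9287, 3.9942, 3.2517, 2.9586, 2.4970, 2.4626, 2.3795, 2.2238,
    2.0684, 2.2033, 1.8339, 1.9064, 2.0256, 1.8107, 1.7157, 1.6966,
    1.6580, 1.6043, 1.6572, 1.5485, 1.6464, 1.5892, 1.5765, 1.7408,
    1.4875, 1.4448, 1.4902, 1.5133, 1.4217, 1.5602, 1.4699, 1.4319,
    1.5975, 1.4031, 1.4295, 1.3804, 1.4202, 1.5135, 1.4790, 1.4108,
    1.3108, 1.1461, 1.2509, 1.2179, 1.2483, 1.0815, 1.2135, 1.1260,
    1.1993, 1.0797, 1.2587, 1.0793, 1.0501, 1.1476, 1.0560, 1.0024,
    0.9604, 1.2700, 1.1081, 1.0198, 1.1748, 1.1299,
]

# Epoch with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
final_epoch = len(val_loss) - 1

print(f"best epoch:  {best_epoch} (validation loss {val_loss[best_epoch]:.4f})")
print(f"final epoch: {final_epoch} (validation loss {val_loss[final_epoch]:.4f})")
```

For these numbers the lowest validation loss is 0.9604 at epoch 56, while the final epoch (61) reports 1.1299. The model card does not say which epoch's weights were published.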

Framework versions

  • Transformers 4.18.0
  • TensorFlow 2.8.0
  • Datasets 2.2.0
  • Tokenizers 0.12.1