# mlm-code-mixed-finetuned-final
This model is a fine-tuned version of [sagorsarker/bangla-bert-base](https://huggingface.co/sagorsarker/bangla-bert-base) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 3.2888
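
Because the objective is masked language modeling, the evaluation loss is a cross-entropy and can be read as a perplexity of roughly exp(3.2888) ≈ 26.8. A minimal sketch of the conversion:

```python
import math

eval_loss = 3.2888  # evaluation loss reported above
perplexity = math.exp(eval_loss)  # cross-entropy -> perplexity
print(f"{perplexity:.2f}")  # ≈ 26.80
```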
## Model description
More information needed
## Intended uses & limitations
More information needed
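
Since the model was trained with a masked-language-modeling objective, one natural way to try it is the `fill-mask` pipeline. A minimal sketch, assuming the model is published under the repository id `mlm-code-mixed-finetuned-final` (a placeholder; the mask token follows the BERT convention of the sagorsarker/bangla-bert-base tokenizer, and the example sentence is illustrative):

```python
from transformers import pipeline

# Hypothetical repository id; replace with the actual Hub path of this model.
fill_mask = pipeline("fill-mask", model="mlm-code-mixed-finetuned-final")

# BERT-style tokenizers use [MASK] as the mask token.
for prediction in fill_mask("আমি [MASK] ভালোবাসি।"):
    print(prediction["token_str"], round(prediction["score"], 4))
```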
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 20
- mixed_precision_training: Native AMP
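
As a rough reconstruction, the hyperparameters above map onto a `transformers` `TrainingArguments` configuration like the sketch below; the output directory and the `fp16` flag are assumptions, and the actual training script may have differed:

```python
from transformers import TrainingArguments

# A sketch reconstructing the reported settings; "output_dir" is a placeholder.
# Adam betas=(0.9, 0.999), epsilon=1e-8, and the linear scheduler are the
# Trainer defaults, matching the values listed above.
training_args = TrainingArguments(
    output_dir="mlm-code-mixed-finetuned-final",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=20,
    fp16=True,  # "Native AMP" mixed-precision training
)
```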
### Training results
| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 5.3581        | 1.0   | 595   | 4.6147          |
| 4.3617        | 2.0   | 1190  | 4.3025          |
| 3.992         | 3.0   | 1785  | 4.1354          |
| 3.6837        | 4.0   | 2380  | 3.8904          |
| 3.4246        | 5.0   | 2975  | 3.7318          |
| 3.2778        | 6.0   | 3570  | 3.7400          |
| 3.1339        | 7.0   | 4165  | 3.6844          |
| 2.9912        | 8.0   | 4760  | 3.6205          |
| 2.8621        | 9.0   | 5355  | 3.5747          |
| 2.7373        | 10.0  | 5950  | 3.5430          |
| 2.648         | 11.0  | 6545  | 3.4527          |
| 2.5479        | 12.0  | 7140  | 3.4770          |
| 2.4683        | 13.0  | 7735  | 3.4124          |
| 2.3578        | 14.0  | 8330  | 3.4087          |
| 2.3106        | 15.0  | 8925  | 3.4033          |
| 2.2233        | 16.0  | 9520  | 3.3202          |
| 2.162         | 17.0  | 10115 | 3.3960          |
| 2.0955        | 18.0  | 10710 | 3.3660          |
| 2.021         | 19.0  | 11305 | 3.2778          |
| 2.0122        | 20.0  | 11900 | 3.3029          |
### Framework versions
- Transformers 4.26.1
- Pytorch 1.13.0
- Datasets 2.1.0
- Tokenizers 0.13.2