mms-1b-bsbigcgen-combined-model
This model is a fine-tuned version of facebook/mms-1b-all on the BSBIGCGEN - BEM dataset. It achieves the following results on the evaluation set:
- Loss: 0.4207
- Wer (word error rate): 0.4623
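For readers unfamiliar with the metric, the sketch below shows how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. This is a minimal illustration, assuming the standard definition; the card does not say which library or text normalization produced the 0.4623 figure, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Toy sentences invented for illustration; they are not from the dataset.
print(word_error_rate("the cat sat", "the cat sit"))
```

A corpus-level score is usually obtained by summing errors and reference word counts over all utterances before dividing, so it can differ from the mean of per-utterance scores.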
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 30.0
- mixed_precision_training: Native AMP
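The arithmetic behind these settings can be sketched as follows: the effective batch size is the per-device batch size times the gradient accumulation steps, and a linear schedule ramps the learning rate up over the warmup steps before decaying it to zero. This is a hedged illustration, not the training script. It assumes a single device, and it assumes 697 optimizer steps per epoch, a value inferred from the results table (step 700 is logged at epoch 1.0043), not stated by the card. The total step count also assumes the configured 30 epochs, although the logged table stops at step 3500.

```python
import math

# Values taken from the hyperparameter list.
LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 100
NUM_EPOCHS = 30

# Assumption: training ran on one device.
NUM_DEVICES = 1
# Assumption: inferred from the results table (700 steps ~ 1.0043 epochs).
STEPS_PER_EPOCH = 697

effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES
total_steps = math.ceil(NUM_EPOCHS * STEPS_PER_EPOCH)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - WARMUP_STEPS))
    return LEARNING_RATE * remaining


print(effective_batch_size)  # 8, matching total_train_batch_size
print(total_steps)
for step in (0, 50, 100, 3500, total_steps):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 0.0003 at step 100 and reaches zero only at the final scheduled step, so a run stopped near step 3500 would still be at a comparatively high learning rate.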
Training results
Training Loss | Epoch | Step | Validation Loss | Wer |
---|---|---|---|---|
13.9355 | 0.1435 | 100 | 0.8513 | 0.7488 |
1.4992 | 0.2869 | 200 | 0.5382 | 0.5681 |
1.3313 | 0.4304 | 300 | 0.5200 | 0.5323 |
1.2534 | 0.5739 | 400 | 0.5143 | 0.5271 |
1.1283 | 0.7174 | 500 | 0.5079 | 0.5200 |
1.3508 | 0.8608 | 600 | 0.5003 | 0.5192 |
1.2932 | 1.0043 | 700 | 0.4889 | 0.5369 |
1.1563 | 1.1478 | 800 | 0.4810 | 0.5136 |
1.3196 | 1.2912 | 900 | 0.4783 | 0.5051 |
1.2192 | 1.4347 | 1000 | 0.4746 | 0.5063 |
1.2438 | 1.5782 | 1100 | 0.4722 | 0.5063 |
1.1431 | 1.7217 | 1200 | 0.4712 | 0.5124 |
1.1551 | 1.8651 | 1300 | 0.4713 | 0.5021 |
1.1323 | 2.0086 | 1400 | 0.4661 | 0.5164 |
1.1020 | 2.1521 | 1500 | 0.4579 | 0.4910 |
1.2423 | 2.2956 | 1600 | 0.4684 | 0.4956 |
1.1187 | 2.4390 | 1700 | 0.4470 | 0.4838 |
1.1542 | 2.5825 | 1800 | 0.4421 | 0.4782 |
1.1252 | 2.7260 | 1900 | 0.4362 | 0.4848 |
1.0180 | 2.8694 | 2000 | 0.4483 | 0.4810 |
1.1281 | 3.0129 | 2100 | 0.4357 | 0.4754 |
1.1278 | 3.1564 | 2200 | 0.4405 | 0.4670 |
1.0072 | 3.2999 | 2300 | 0.4450 | 0.4704 |
1.0484 | 3.4433 | 2400 | 0.4355 | 0.4778 |
1.0515 | 3.5868 | 2500 | 0.4268 | 0.4796 |
0.9878 | 3.7303 | 2600 | 0.4359 | 0.4660 |
1.1363 | 3.8737 | 2700 | 0.4255 | 0.4858 |
1.0978 | 4.0172 | 2800 | 0.4171 | 0.4648 |
0.9957 | 4.1607 | 2900 | 0.4241 | 0.4744 |
1.0000 | 4.3042 | 3000 | 0.4156 | 0.4560 |
1.0098 | 4.4476 | 3100 | 0.4154 | 0.4596 |
1.0682 | 4.5911 | 3200 | 0.4186 | 0.4708 |
1.0633 | 4.7346 | 3300 | 0.4255 | 0.4636 |
1.0339 | 4.8780 | 3400 | 0.4206 | 0.4581 |
1.0169 | 5.0215 | 3500 | 0.4207 | 0.4623 |
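The headline metrics at the top of this card match the final logged row (step 3500), which is not the lowest value of either metric in the table. The sketch below shows one way to pick the best evaluation step by each metric. It copies only the last eight rows of the table for brevity; the `EvalRow` type is hypothetical, and the card does not say whether checkpoints from these intermediate steps were saved.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    validation_loss: float
    wer: float


# Last eight rows of the training results table (steps 2800 to 3500).
rows = [
    EvalRow(2800, 0.4171, 0.4648),
    EvalRow(2900, 0.4241, 0.4744),
    EvalRow(3000, 0.4156, 0.4560),
    EvalRow(3100, 0.4154, 0.4596),
    EvalRow(3200, 0.4186, 0.4708),
    EvalRow(3300, 0.4255, 0.4636),
    EvalRow(3400, 0.4206, 0.4581),
    EvalRow(3500, 0.4207, 0.4623),
]

# Lower is better for both metrics.
best_by_wer = min(rows, key=lambda row: row.wer)
best_by_loss = min(rows, key=lambda row: row.validation_loss)

print(best_by_wer)   # step 3000, Wer 0.4560
print(best_by_loss)  # step 3100, validation loss 0.4154
```

The two criteria disagree by one evaluation interval here, which is common when validation loss and word error rate are both noisy from step to step.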
Framework versions
- Transformers 4.47.1
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
Model tree for csikasote/mms-1b-bsbigcgen-combined-model
Base model
facebook/mms-1b-all