mms-1b-bsbigcgen-female-model

This model is a fine-tuned version of facebook/mms-1b-all on the BSBIGCGEN - BEM dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4088
  • WER: 0.4342
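For readers unfamiliar with the metric: WER (word error rate) is the word-level edit distance between reference and hypothesis transcripts, divided by the number of reference words. A minimal illustrative implementation (the card's 0.4342 comes from the actual evaluation pipeline, not this toy sketch):

```python
# Word Error Rate sketch: word-level Levenshtein distance normalized by
# reference length. Illustrative only; assumes a non-empty reference.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat", "the cat sat"))  # 0.0
print(wer("the cat sat", "the bat sat"))  # 1 substitution / 3 words ≈ 0.33
```

So a WER of 0.4342 means roughly 43 word-level errors per 100 reference words.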

Model description

More information needed

Intended uses & limitations

More information needed
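Although the card leaves usage unspecified, a minimal transcription sketch can be given, assuming the standard transformers CTC interface used by MMS checkpoints (Wav2Vec2ForCTC plus AutoProcessor); the repo id is taken from this card, and a silent placeholder stands in for real 16 kHz Bemba audio:

```python
# Hedged sketch, not an official usage example: transcribe 16 kHz audio
# with this checkpoint via the usual transformers CTC recipe for MMS models.
import numpy as np
import torch
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "csikasote/mms-1b-bsbigcgen-female-model"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Placeholder: 1 second of silence. Replace with a real 1-D float32
# waveform sampled at 16 kHz.
waveform = np.zeros(16_000, dtype=np.float32)

inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
ids = torch.argmax(logits, dim=-1)
text = processor.batch_decode(ids)[0]
print(text)
```

Greedy argmax decoding is the simplest option; beam-search decoding with a language model may lower WER further.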

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
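The derived quantities above can be sketched in plain Python: the total train batch size is the per-device batch size times the accumulation steps, and the linear scheduler with 100 warmup steps ramps the learning rate up and then decays it to zero. This is a re-implementation for clarity, not the transformers source, and `total_steps` is illustrative:

```python
# Sketch of how the hyperparameters above combine. Values are copied from
# the list; the schedule re-implements the usual linear warmup + linear
# decay rule for illustration.
learning_rate = 3e-4
train_batch_size = 4
gradient_accumulation_steps = 2
warmup_steps = 100

# Effective (total) train batch size = per-device batch x accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 8

def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup to `learning_rate`, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    return learning_rate * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(lr_at(50, 3000))    # halfway through warmup
print(lr_at(3000, 3000))  # decayed to zero at the final step
```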

Training results

Training Loss   Epoch    Step   Validation Loss   WER
-------------   ------   ----   ---------------   ------
13.8683         0.2841    100   0.8184            0.7100
1.4586          0.5682    200   0.5705            0.5411
1.3937          0.8523    300   0.5288            0.5116
1.1452          1.1364    400   0.5221            0.5031
1.2291          1.4205    500   0.5149            0.4944
1.2051          1.7045    600   0.5003            0.4884
1.1823          1.9886    700   0.4786            0.4857
1.1112          2.2727    800   0.4861            0.4710
1.1922          2.5568    900   0.4690            0.4745
1.0149          2.8409   1000   0.4645            0.4674
1.0106          3.1250   1100   0.4579            0.4630
0.9846          3.4091   1200   0.4369            0.4656
1.1968          3.6932   1300   0.4437            0.4999
0.9886          3.9773   1400   0.4274            0.4520
0.9513          4.2614   1500   0.4267            0.4692
0.9436          4.5455   1600   0.4292            0.5063
1.0578          4.8295   1700   0.4246            0.4846
0.9670          5.1136   1800   0.4542            0.5221
0.9696          5.3977   1900   0.4294            0.4852
0.9668          5.6818   2000   0.4203            0.4642
0.9362          5.9659   2100   0.4160            0.4692
0.8978          6.2500   2200   0.4122            0.4985
0.9527          6.5341   2300   0.4077            0.4569
0.9326          6.8182   2400   0.4215            0.4834
0.8490          7.1023   2500   0.4102            0.4633
0.9871          7.3864   2600   0.4034            0.4672
0.9225          7.6705   2700   0.4108            0.4628
0.8008          7.9545   2800   0.4054            0.4367
0.8907          8.2386   2900   0.4248            0.4813
0.8410          8.5227   3000   0.4088            0.4342
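The headline numbers at the top of the card (loss 0.4088, WER 0.4342) correspond to the final logged step, which also happens to achieve the lowest WER. A small stdlib sketch that scans logged rows for the best WER (only the last few rows of the table are copied here):

```python
# Pick the checkpoint with the lowest WER from the logged evaluation rows.
# Rows are (step, validation_loss, wer), copied from the table above;
# only the last three of the 30 rows are shown for brevity.
rows = [
    (2800, 0.4054, 0.4367),
    (2900, 0.4248, 0.4813),
    (3000, 0.4088, 0.4342),
]
best_step, best_loss, best_wer = min(rows, key=lambda r: r[2])
print(best_step, best_loss, best_wer)  # 3000 0.4088 0.4342
```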

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model details

  • Full model id: csikasote/mms-1b-bsbigcgen-female-model (fine-tuned from facebook/mms-1b-all)
  • Model size: 965M parameters
  • Tensor type: F32 (Safetensors)