asr-africa/mms-1B_all_BIG_C_Bemba_167hr_v1

This model is a fine-tuned version of facebook/mms-1b-all on the BIG_C dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3426
  • WER: 0.4060
  • CER: 0.0753
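
For reference, WER (word error rate) and CER (character error rate) can be computed with the Hugging Face evaluate library. This is an illustrative sketch with placeholder strings, not the evaluation script behind the numbers above; evaluate is assumed to be installed alongside the framework versions listed below.

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Placeholder predictions/references; in practice these would be the
# model's transcriptions and the BIG_C reference transcripts.
predictions = ["this is a sample transcription"]
references = ["this is a simple transcription"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```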

Model description

This model is facebook/mms-1b-all, Meta's Massively Multilingual Speech (MMS) 1B checkpoint (a Wav2Vec2-style CTC model with roughly 965M float32 parameters), fine-tuned for Bemba automatic speech recognition on the BIG_C dataset.

Intended uses & limitations

The model is intended for automatic speech recognition of Bemba. Like other MMS fine-tunes, it expects 16 kHz mono audio; its behavior on domains and recording conditions that differ from the BIG_C data has not been evaluated here. A minimal transcription sketch is shown below.
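
The following is a minimal inference sketch, assuming the standard MMS/Wav2Vec2 CTC interface from transformers; the audio file path is a placeholder, and the exact preprocessing used to produce the results on this card is not documented here.

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "asr-africa/mms-1B_all_BIG_C_Bemba_167hr_v1"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# MMS/Wav2Vec2 models expect 16 kHz mono audio; "audio.wav" is a placeholder.
speech, _ = librosa.load("audio.wav", sr=16_000, mono=True)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding of the most likely token at each frame.
pred_ids = torch.argmax(logits, dim=-1)[0]
print(processor.decode(pred_ids))
```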

Training and evaluation data

The model was fine-tuned and evaluated on the BIG_C Bemba speech corpus; the repository name indicates roughly 167 hours of training audio.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 100
  • mixed_precision_training: Native AMP
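
These settings map onto Hugging Face TrainingArguments roughly as follows. This is a hedged sketch: the output_dir is a placeholder, any arguments not listed above are assumptions, and fp16=True is one way to request the Native AMP mixed precision noted in the list.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mms-1b-bemba",      # placeholder output path
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,  # effective train batch size: 4 * 4 = 16
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=100,
    fp16=True,                      # "Native AMP" mixed precision
)
```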

Training results

| Training Loss | Epoch | Step   | Validation Loss | WER    | CER    |
|:-------------:|:-----:|:------:|:---------------:|:------:|:------:|
| 0.7367        | 1.0   | 5149   | 0.5238          | 0.4604 | 0.1187 |
| 0.5437        | 2.0   | 10298  | 0.5032          | 0.4461 | 0.1163 |
| 0.5201        | 3.0   | 15447  | 0.4849          | 0.4213 | 0.1092 |
| 0.507         | 4.0   | 20596  | 0.4776          | 0.4179 | 0.1109 |
| 0.4967        | 5.0   | 25745  | 0.4725          | 0.4062 | 0.1066 |
| 0.4889        | 6.0   | 30894  | 0.4758          | 0.3988 | 0.1038 |
| 0.4813        | 7.0   | 36043  | 0.4741          | 0.3888 | 0.1035 |
| 0.4754        | 8.0   | 41192  | 0.4625          | 0.3916 | 0.1055 |
| 0.4703        | 9.0   | 46341  | 0.4572          | 0.3853 | 0.1029 |
| 0.4655        | 10.0  | 51490  | 0.4541          | 0.3924 | 0.1030 |
| 0.4608        | 11.0  | 56639  | 0.4562          | 0.3833 | 0.1024 |
| 0.4569        | 12.0  | 61788  | 0.4540          | 0.3842 | 0.1020 |
| 0.4536        | 13.0  | 66937  | 0.4581          | 0.3709 | 0.0989 |
| 0.4498        | 14.0  | 72086  | 0.4572          | 0.3715 | 0.0991 |
| 0.4463        | 15.0  | 77235  | 0.4547          | 0.3696 | 0.0990 |
| 0.4424        | 16.0  | 82384  | 0.4505          | 0.3703 | 0.0990 |
| 0.439         | 17.0  | 87533  | 0.4494          | 0.3683 | 0.0985 |
| 0.4362        | 18.0  | 92682  | 0.4448          | 0.3648 | 0.0983 |
| 0.4326        | 19.0  | 97831  | 0.4528          | 0.3650 | 0.0984 |
| 0.4304        | 20.0  | 102980 | 0.4428          | 0.3660 | 0.0988 |
| 0.4266        | 21.0  | 108129 | 0.4478          | 0.3633 | 0.0976 |
| 0.4238        | 22.0  | 113278 | 0.4528          | 0.3669 | 0.0978 |
| 0.4212        | 23.0  | 118427 | 0.4478          | 0.3619 | 0.0974 |
| 0.4191        | 24.0  | 123576 | 0.4473          | 0.3596 | 0.0974 |
| 0.416         | 25.0  | 128725 | 0.4535          | 0.3601 | 0.0978 |
| 0.4131        | 26.0  | 133874 | 0.4486          | 0.3591 | 0.0977 |
| 0.4111        | 27.0  | 139023 | 0.4458          | 0.3668 | 0.0981 |
| 0.4084        | 28.0  | 144172 | 0.4482          | 0.3590 | 0.0967 |
| 0.4061        | 29.0  | 149321 | 0.4504          | 0.3541 | 0.0962 |
| 0.4031        | 30.0  | 154470 | 0.4442          | 0.3584 | 0.0967 |
| 0.401         | 31.0  | 159619 | 0.4453          | 0.3596 | 0.0971 |
| 0.3985        | 32.0  | 164768 | 0.4479          | 0.3567 | 0.0964 |
| 0.3966        | 33.0  | 169917 | 0.4464          | 0.3556 | 0.0963 |
| 0.3937        | 34.0  | 175066 | 0.4450          | 0.3525 | 0.0962 |
| 0.392         | 35.0  | 180215 | 0.4415          | 0.3514 | 0.0973 |
| 0.3898        | 36.0  | 185364 | 0.4552          | 0.3532 | 0.0959 |
| 0.3879        | 37.0  | 190513 | 0.4543          | 0.3537 | 0.0978 |
| 0.3861        | 38.0  | 195662 | 0.4484          | 0.3608 | 0.0989 |
| 0.3838        | 39.0  | 200811 | 0.4481          | 0.3534 | 0.0968 |
| 0.3808        | 40.0  | 205960 | 0.4566          | 0.3496 | 0.0979 |
| 0.379         | 41.0  | 211109 | 0.4474          | 0.3495 | 0.0969 |
| 0.3766        | 42.0  | 216258 | 0.4513          | 0.3587 | 0.0995 |
| 0.3753        | 43.0  | 221407 | 0.4571          | 0.3480 | 0.0956 |
| 0.3726        | 44.0  | 226556 | 0.4511          | 0.3480 | 0.0963 |
| 0.3706        | 45.0  | 231705 | 0.4587          | 0.3438 | 0.0944 |
| 0.3688        | 46.0  | 236854 | 0.4585          | 0.3461 | 0.0952 |
| 0.3662        | 47.0  | 242003 | 0.4575          | 0.3530 | 0.0978 |
| 0.3645        | 48.0  | 247152 | 0.4485          | 0.3510 | 0.0966 |
| 0.3628        | 49.0  | 252301 | 0.4624          | 0.3482 | 0.0957 |
| 0.3614        | 50.0  | 257450 | 0.4540          | 0.3557 | 0.0965 |
| 0.3589        | 51.0  | 262599 | 0.4652          | 0.3445 | 0.0947 |
| 0.3584        | 52.0  | 267748 | 0.4611          | 0.3489 | 0.0957 |
| 0.3552        | 53.0  | 272897 | 0.4566          | 0.3464 | 0.0952 |
| 0.3549        | 54.0  | 278046 | 0.4590          | 0.3477 | 0.0961 |
| 0.3526        | 55.0  | 283195 | 0.4550          | 0.3549 | 0.0988 |

Framework versions

  • Transformers 4.47.0.dev0
  • Pytorch 2.1.0+cu118
  • Datasets 3.1.0
  • Tokenizers 0.20.3