---
library_name: transformers
language:
  - bem
license: cc-by-nc-4.0
base_model: facebook/mms-1b-all
tags:
  - generated_from_trainer
datasets:
  - BIG_C/Bemba
metrics:
  - wer
model-index:
  - name: facebook/mms-1b-all
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: BIG_C
          type: BIG_C/Bemba
        metrics:
          - name: Wer
            type: wer
            value: 0.40599536838493866
---

facebook/mms-1b-all

This model is a fine-tuned version of facebook/mms-1b-all on the BIG_C Bemba dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3426
  • Wer: 0.4060
  • Cer: 0.0753

Model description

This model is facebook/mms-1b-all, the Massively Multilingual Speech (MMS) 1B automatic speech recognition checkpoint, fine-tuned for Bemba (bem) speech recognition on the BIG_C dataset.

Intended uses & limitations

The model is intended for automatic speech recognition of Bemba (bem). It is released under the CC BY-NC 4.0 license and is therefore limited to non-commercial use, and like other MMS/Wav2Vec2 checkpoints it expects 16 kHz mono audio. A minimal inference sketch is shown below.
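
As a minimal, non-authoritative sketch of inference with this kind of checkpoint (the repository id and audio file name below are placeholders; if the fine-tune keeps the MMS per-language adapters, you may additionally need processor.tokenizer.set_target_lang(...) and model.load_adapter(...)):

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

# Placeholder repo id: substitute the actual Hub id of this fine-tuned checkpoint.
model_id = "your-org/mms-1b-all-bem-big_c"

processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# MMS / Wav2Vec2 models expect 16 kHz mono audio.
speech, _ = librosa.load("example_bemba.wav", sr=16_000, mono=True)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding.
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
print(transcription)
```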

Training and evaluation data

The model was fine-tuned and evaluated on the BIG_C Bemba dataset (BIG_C/Bemba in the card metadata); per-epoch validation scores are listed under Training results below. A rough preprocessing sketch for this model family follows.
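
A rough sketch of the audio/text preprocessing this model family expects (the CSV file name, column names, and Bemba tokenizer/vocabulary setup are assumptions and are not described by this card):

```python
from datasets import load_dataset, Audio
from transformers import AutoProcessor

# Hypothetical local CSV with an "audio" column (file paths) and a "sentence" column (transcripts).
data = load_dataset("csv", data_files={"train": "bigc_bemba_train.csv"})
data = data.cast_column("audio", Audio(sampling_rate=16_000))  # decode and resample to 16 kHz

# Base-model processor; setting up a Bemba vocabulary/tokenizer for fine-tuning is omitted here.
processor = AutoProcessor.from_pretrained("facebook/mms-1b-all")

def prepare(batch):
    audio = batch["audio"]
    # Raw waveform -> model input values; transcript -> CTC label ids.
    batch["input_values"] = processor(audio["array"], sampling_rate=audio["sampling_rate"]).input_values[0]
    batch["labels"] = processor(text=batch["sentence"]).input_ids
    return batch

data = data.map(prepare, remove_columns=data["train"].column_names)
```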

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch mirroring them follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 100
  • mixed_precision_training: Native AMP
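
A minimal sketch of how the values above map onto transformers TrainingArguments (the output_dir value and the fp16 flag standing in for "Native AMP" are assumptions; the other values are taken from the list):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mms-1b-all-bem-big_c",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,      # 4 x 4 = total train batch size 16 (single device)
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=100,
    fp16=True,                          # "Native AMP" mixed precision
)
```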

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer | Cer |
|:---:|:---:|:---:|:---:|:---:|:---:|
| 0.7367 | 1.0 | 5149 | 0.5238 | 0.4604 | 0.1187 |
| 0.5437 | 2.0 | 10298 | 0.5032 | 0.4461 | 0.1163 |
| 0.5201 | 3.0 | 15447 | 0.4849 | 0.4213 | 0.1092 |
| 0.507 | 4.0 | 20596 | 0.4776 | 0.4179 | 0.1109 |
| 0.4967 | 5.0 | 25745 | 0.4725 | 0.4062 | 0.1066 |
| 0.4889 | 6.0 | 30894 | 0.4758 | 0.3988 | 0.1038 |
| 0.4813 | 7.0 | 36043 | 0.4741 | 0.3888 | 0.1035 |
| 0.4754 | 8.0 | 41192 | 0.4625 | 0.3916 | 0.1055 |
| 0.4703 | 9.0 | 46341 | 0.4572 | 0.3853 | 0.1029 |
| 0.4655 | 10.0 | 51490 | 0.4541 | 0.3924 | 0.1030 |
| 0.4608 | 11.0 | 56639 | 0.4562 | 0.3833 | 0.1024 |
| 0.4569 | 12.0 | 61788 | 0.4540 | 0.3842 | 0.1020 |
| 0.4536 | 13.0 | 66937 | 0.4581 | 0.3709 | 0.0989 |
| 0.4498 | 14.0 | 72086 | 0.4572 | 0.3715 | 0.0991 |
| 0.4463 | 15.0 | 77235 | 0.4547 | 0.3696 | 0.0990 |
| 0.4424 | 16.0 | 82384 | 0.4505 | 0.3703 | 0.0990 |
| 0.439 | 17.0 | 87533 | 0.4494 | 0.3683 | 0.0985 |
| 0.4362 | 18.0 | 92682 | 0.4448 | 0.3648 | 0.0983 |
| 0.4326 | 19.0 | 97831 | 0.4528 | 0.3650 | 0.0984 |
| 0.4304 | 20.0 | 102980 | 0.4428 | 0.3660 | 0.0988 |
| 0.4266 | 21.0 | 108129 | 0.4478 | 0.3633 | 0.0976 |
| 0.4238 | 22.0 | 113278 | 0.4528 | 0.3669 | 0.0978 |
| 0.4212 | 23.0 | 118427 | 0.4478 | 0.3619 | 0.0974 |
| 0.4191 | 24.0 | 123576 | 0.4473 | 0.3596 | 0.0974 |
| 0.416 | 25.0 | 128725 | 0.4535 | 0.3601 | 0.0978 |
| 0.4131 | 26.0 | 133874 | 0.4486 | 0.3591 | 0.0977 |
| 0.4111 | 27.0 | 139023 | 0.4458 | 0.3668 | 0.0981 |
| 0.4084 | 28.0 | 144172 | 0.4482 | 0.3590 | 0.0967 |
| 0.4061 | 29.0 | 149321 | 0.4504 | 0.3541 | 0.0962 |
| 0.4031 | 30.0 | 154470 | 0.4442 | 0.3584 | 0.0967 |
| 0.401 | 31.0 | 159619 | 0.4453 | 0.3596 | 0.0971 |
| 0.3985 | 32.0 | 164768 | 0.4479 | 0.3567 | 0.0964 |
| 0.3966 | 33.0 | 169917 | 0.4464 | 0.3556 | 0.0963 |
| 0.3937 | 34.0 | 175066 | 0.4450 | 0.3525 | 0.0962 |
| 0.392 | 35.0 | 180215 | 0.4415 | 0.3514 | 0.0973 |
| 0.3898 | 36.0 | 185364 | 0.4552 | 0.3532 | 0.0959 |
| 0.3879 | 37.0 | 190513 | 0.4543 | 0.3537 | 0.0978 |
| 0.3861 | 38.0 | 195662 | 0.4484 | 0.3608 | 0.0989 |
| 0.3838 | 39.0 | 200811 | 0.4481 | 0.3534 | 0.0968 |
| 0.3808 | 40.0 | 205960 | 0.4566 | 0.3496 | 0.0979 |
| 0.379 | 41.0 | 211109 | 0.4474 | 0.3495 | 0.0969 |
| 0.3766 | 42.0 | 216258 | 0.4513 | 0.3587 | 0.0995 |
| 0.3753 | 43.0 | 221407 | 0.4571 | 0.3480 | 0.0956 |
| 0.3726 | 44.0 | 226556 | 0.4511 | 0.3480 | 0.0963 |
| 0.3706 | 45.0 | 231705 | 0.4587 | 0.3438 | 0.0944 |
| 0.3688 | 46.0 | 236854 | 0.4585 | 0.3461 | 0.0952 |
| 0.3662 | 47.0 | 242003 | 0.4575 | 0.3530 | 0.0978 |
| 0.3645 | 48.0 | 247152 | 0.4485 | 0.3510 | 0.0966 |
| 0.3628 | 49.0 | 252301 | 0.4624 | 0.3482 | 0.0957 |
| 0.3614 | 50.0 | 257450 | 0.4540 | 0.3557 | 0.0965 |
| 0.3589 | 51.0 | 262599 | 0.4652 | 0.3445 | 0.0947 |
| 0.3584 | 52.0 | 267748 | 0.4611 | 0.3489 | 0.0957 |
| 0.3552 | 53.0 | 272897 | 0.4566 | 0.3464 | 0.0952 |
| 0.3549 | 54.0 | 278046 | 0.4590 | 0.3477 | 0.0961 |
| 0.3526 | 55.0 | 283195 | 0.4550 | 0.3549 | 0.0988 |
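
Wer and Cer in the table above are word and character error rates. A minimal sketch of computing them with the Hugging Face evaluate library (the prediction and reference strings below are placeholders, not BIG_C data):

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Placeholder strings; in practice use model transcriptions vs. reference transcripts.
predictions = ["the model transcribed this sentence"]
references = ["the model transcribed this sentence correctly"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```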

Framework versions

  • Transformers 4.47.0.dev0
  • PyTorch 2.1.0+cu118
  • Datasets 3.1.0
  • Tokenizers 0.20.3