mms-1b-bsbigcgen-combined-model

This model is a fine-tuned version of facebook/mms-1b-all on the BSBIGCGEN - BEM dataset. It achieves the following results on the evaluation set:

Model description

More information needed


The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 8
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 30.0
mixed_precision_training: Native AMP
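
As a rough illustration of how these values relate, the sketch below recomputes the effective batch size and evaluates a linear schedule with warmup in plain Python. The names `RunConfig` and `linear_lr` are illustrative and not part of any library. The single-device setup is inferred from 4 x 2 = 8, and the total step count is a hypothetical estimate (about 697 steps per epoch, based on 100 steps being roughly 0.1435 epochs, times 30 epochs), not a value reported by this card.

```python
from dataclasses import dataclass


@dataclass
class RunConfig:
    """Subset of the hyperparameters listed above (illustrative container)."""
    learning_rate: float = 3e-4
    train_batch_size: int = 4
    gradient_accumulation_steps: int = 2
    warmup_steps: int = 100

    @property
    def total_train_batch_size(self) -> int:
        # Assumes a single device: per-device batch size x accumulation steps.
        return self.train_batch_size * self.gradient_accumulation_steps


def linear_lr(step: int, cfg: RunConfig, total_steps: int) -> float:
    """Linear warmup from 0 to the peak rate, then linear decay to 0."""
    if step < cfg.warmup_steps:
        return cfg.learning_rate * step / cfg.warmup_steps
    remaining = max(0, total_steps - step)
    return cfg.learning_rate * remaining / max(1, total_steps - cfg.warmup_steps)


# Hypothetical estimate of the full run length; not stated in the card.
HYPOTHETICAL_TOTAL_STEPS = 20_900

cfg = RunConfig()
print("effective batch size:", cfg.total_train_batch_size)
for step in (0, 50, 100, 10_500, HYPOTHETICAL_TOTAL_STEPS):
    print(step, linear_lr(step, cfg, HYPOTHETICAL_TOTAL_STEPS))
```

With these settings the learning rate reaches its peak of 0.0003 at step 100, which corresponds to the first logged evaluation row.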

Training Loss	Epoch	Step	Validation Loss	WER
13.9355	0.1435	100	0.8513	0.7488
1.4992	0.2869	200	0.5382	0.5681
1.3313	0.4304	300	0.5200	0.5323
1.2534	0.5739	400	0.5143	0.5271
1.1283	0.7174	500	0.5079	0.5200
1.3508	0.8608	600	0.5003	0.5192
1.2932	1.0043	700	0.4889	0.5369
1.1563	1.1478	800	0.4810	0.5136
1.3196	1.2912	900	0.4783	0.5051
1.2192	1.4347	1000	0.4746	0.5063
1.2438	1.5782	1100	0.4722	0.5063
1.1431	1.7217	1200	0.4712	0.5124
1.1551	1.8651	1300	0.4713	0.5021
1.1323	2.0086	1400	0.4661	0.5164
1.102	2.1521	1500	0.4579	0.4910
1.2423	2.2956	1600	0.4684	0.4956
1.1187	2.4390	1700	0.4470	0.4838
1.1542	2.5825	1800	0.4421	0.4782
1.1252	2.7260	1900	0.4362	0.4848
1.018	2.8694	2000	0.4483	0.4810
1.1281	3.0129	2100	0.4357	0.4754
1.1278	3.1564	2200	0.4405	0.4670
1.0072	3.2999	2300	0.4450	0.4704
1.0484	3.4433	2400	0.4355	0.4778
1.0515	3.5868	2500	0.4268	0.4796
0.9878	3.7303	2600	0.4359	0.4660
1.1363	3.8737	2700	0.4255	0.4858
1.0978	4.0172	2800	0.4171	0.4648
0.9957	4.1607	2900	0.4241	0.4744
1.0	4.3042	3000	0.4156	0.4560
1.0098	4.4476	3100	0.4154	0.4596
1.0682	4.5911	3200	0.4186	0.4708
1.0633	4.7346	3300	0.4255	0.4636
1.0339	4.8780	3400	0.4206	0.4581
1.0169	5.0215	3500	0.4207	0.4623
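
One way to read a log like this is to pick the checkpoint with the lowest WER or the lowest validation loss. The sketch below does that for the last eight rows shown above, copied verbatim. The helper name `parse_rows` and the dictionary keys are illustrative assumptions, not part of the training tooling.

```python
# Last eight rows of the log above:
# training loss, epoch, step, validation loss, WER.
LOG_ROWS = """
1.0978 4.0172 2800 0.4171 0.4648
0.9957 4.1607 2900 0.4241 0.4744
1.0 4.3042 3000 0.4156 0.4560
1.0098 4.4476 3100 0.4154 0.4596
1.0682 4.5911 3200 0.4186 0.4708
1.0633 4.7346 3300 0.4255 0.4636
1.0339 4.8780 3400 0.4206 0.4581
1.0169 5.0215 3500 0.4207 0.4623
"""


def parse_rows(text: str) -> list[dict]:
    """Parse whitespace- or tab-separated log rows into dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, wer = line.split()
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "wer": float(wer),
        })
    return rows


rows = parse_rows(LOG_ROWS)
best_by_wer = min(rows, key=lambda r: r["wer"])
best_by_loss = min(rows, key=lambda r: r["val_loss"])

print("lowest WER:", best_by_wer["step"], best_by_wer["wer"])
print("lowest validation loss:", best_by_loss["step"], best_by_loss["val_loss"])
```

In these rows the two criteria disagree slightly: the lowest WER (0.4560) is at step 3000, while the lowest validation loss (0.4154) is at step 3100.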