mms-1b-swagen-balanced-model

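This model is a fine-tuned version of facebook/mms-1b-all on the SWAGEN - SWA dataset. On the evaluation set, the last logged checkpoint (step 1600) reaches a validation loss of 0.2293 and a WER of 0.1945; the lowest logged WER is 0.1892 at step 1400.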

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 8
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
training_steps: 2500
mixed_precision_training: Native AMP
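
The hyperparameters above can be sketched in code. The block below is a minimal illustration, not the original training script: it maps the listed values onto the matching `transformers.TrainingArguments` parameter names, checks the total batch size arithmetic, and reproduces the shape of a linear schedule with warmup. The single-device setting and the reading of "Native AMP" as `fp16` are assumptions, and `effective_batch_size` and `linear_lr` are hypothetical helper names.

```python
# Hyperparameters from this card, keyed by transformers.TrainingArguments
# parameter names. The dict is plain Python so it runs without transformers.
training_kwargs = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "max_steps": 2500,
    "fp16": True,  # assumption: "Native AMP" refers to fp16, not bf16
}

NUM_DEVICES = 1  # assumption: the card does not state the device count


def effective_batch_size(kwargs, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, base_lr=3e-4, warmup_steps=100, total_steps=2500):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


print(effective_batch_size(training_kwargs))  # 4 * 2 * 1 = 8
print(linear_lr(50), linear_lr(100), linear_lr(1300))
```

With one device, 4 examples per step and 2 accumulation steps give the listed total_train_batch_size of 8. Under this schedule the learning rate peaks at 0.0003 at step 100 and would reach zero at step 2500.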

Training Loss	Epoch	Step	Validation Loss	WER
16.6036	0.2392	100	4.9362	1.0067
8.5948	0.4785	200	4.2592	1.0
7.8678	0.7177	300	3.5876	1.0
5.2547	0.9569	400	0.3053	0.2088
0.5538	1.1962	500	0.2696	0.2008
0.5574	1.4354	600	0.2543	0.1941
0.5282	1.6746	700	0.2459	0.1941
0.4837	1.9139	800	0.2387	0.1931
0.4969	2.1531	900	0.2372	0.1996
0.5005	2.3923	1000	0.2337	0.1941
0.4712	2.6316	1100	0.2309	0.1921
0.4783	2.8708	1200	0.2287	0.1902
0.4406	3.1100	1300	0.2316	0.1916
0.463	3.3493	1400	0.2288	0.1892
0.448	3.5885	1500	0.2317	0.1914
0.4567	3.8278	1600	0.2293	0.1945
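
The WER column reports word error rate: the word-level edit distance between the reference and the model output, divided by the number of reference words. Because insertions count as errors, the value can exceed 1, as in the first row of the table (1.0067). The following is a minimal pure-Python sketch of the metric, not the evaluation code used for this model; the function name `wer` and the example sentences are illustrative assumptions.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences, not taken from the dataset.
print(wer("habari ya asubuhi", "habari ya asubuhi"))  # exact match
print(wer("habari ya asubuhi", "habari za asubuhi"))  # one substitution in three words
print(wer("habari", "ha ba ri"))                      # extra words push WER above 1
```

The last call returns 3.0: one substitution plus two insertions against a single reference word, which shows how an untrained output head can score above 1.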