mms-1b-all-swagen-combined-5hrs-42

This model is a fine-tuned version of facebook/mms-1b-all on the SWAGEN - NYA dataset. It achieves the following results on the evaluation set (an example of loading the model for inference follows this list):

  • Loss: 0.3301
  • Wer: 0.2354
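
A minimal inference sketch is shown below. This is a hypothetical example, not taken from the card: the file name `example.wav` is a placeholder, and `librosa` is just one way to load 16 kHz mono audio for the processor.

```python
import torch
import librosa
from transformers import AutoProcessor, AutoModelForCTC

model_id = "csikasote/mms-1b-all-swagen-combined-5hrs-42"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id)

# MMS models expect 16 kHz mono input; "example.wav" is a placeholder path.
audio, _ = librosa.load("example.wav", sr=16_000)
inputs = processor(audio, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])  # greedy CTC decoding
```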

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a code sketch of the corresponding TrainingArguments follows this list):

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
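
For reference, here is a sketch of how these values could map onto Hugging Face `TrainingArguments`. This is a reconstruction under stated assumptions: the `output_dir` is a placeholder, and `fp16=True` is inferred from "Native AMP"; the actual training script is not part of this card.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="mms-1b-all-swagen-combined-5hrs-42",  # placeholder path
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # 8 * 2 = effective batch size of 16
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=30.0,
    fp16=True,  # "Native AMP" mixed-precision training (assumed flag)
)
```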

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer    |
|:-------------:|:-------:|:----:|:---------------:|:------:|
| 7.2606        | 0.4785  | 100  | 3.1592          | 1.0085 |
| 2.7521        | 0.9569  | 200  | 2.3371          | 0.9931 |
| 2.0221        | 1.4354  | 300  | 1.5196          | 0.8476 |
| 1.152         | 1.9139  | 400  | 0.7881          | 0.4487 |
| 0.7455        | 2.3923  | 500  | 0.6427          | 0.3592 |
| 0.6182        | 2.8708  | 600  | 0.5445          | 0.3369 |
| 0.5513        | 3.3493  | 700  | 0.4796          | 0.3189 |
| 0.4933        | 3.8278  | 800  | 0.4499          | 0.3112 |
| 0.441         | 4.3062  | 900  | 0.4219          | 0.2904 |
| 0.4248        | 4.7847  | 1000 | 0.4193          | 0.2819 |
| 0.4207        | 5.2632  | 1100 | 0.3872          | 0.2785 |
| 0.3842        | 5.7416  | 1200 | 0.3788          | 0.2687 |
| 0.3717        | 6.2201  | 1300 | 0.3735          | 0.2650 |
| 0.36          | 6.6986  | 1400 | 0.3771          | 0.2620 |
| 0.3503        | 7.1770  | 1500 | 0.3682          | 0.2526 |
| 0.3373        | 7.6555  | 1600 | 0.3540          | 0.2519 |
| 0.3404        | 8.1340  | 1700 | 0.3481          | 0.2452 |
| 0.3347        | 8.6124  | 1800 | 0.3396          | 0.2484 |
| 0.327         | 9.0909  | 1900 | 0.3536          | 0.2427 |
| 0.3035        | 9.5694  | 2000 | 0.3567          | 0.2420 |
| 0.3232        | 10.0478 | 2100 | 0.3385          | 0.2439 |
| 0.3051        | 10.5263 | 2200 | 0.3420          | 0.2398 |
| 0.3093        | 11.0048 | 2300 | 0.3330          | 0.2331 |
| 0.3013        | 11.4833 | 2400 | 0.3301          | 0.2354 |
| 0.2975        | 11.9617 | 2500 | 0.3385          | 0.2336 |
| 0.2741        | 12.4402 | 2600 | 0.3271          | 0.2305 |
| 0.2794        | 12.9187 | 2700 | 0.3289          | 0.2279 |
| 0.282         | 13.3971 | 2800 | 0.3331          | 0.2284 |
| 0.2711        | 13.8756 | 2900 | 0.3350          | 0.2267 |
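
The reported evaluation results (loss 0.3301, WER 0.2354) match the step-2400 row, which suggests the checkpoint with the lowest validation loss was retained rather than the final one; training also ends near epoch 14 of the configured 30, consistent with early stopping. For readers reproducing the Wer column, a quick illustration with the `evaluate` library follows. This is an assumption, since the card does not state which WER implementation was used:

```python
import evaluate

# Illustrative word error rate (WER) computation; inputs are hypothetical.
wer_metric = evaluate.load("wer")
predictions = ["hello world"]  # model transcript
references = ["hello word"]    # ground-truth transcript
wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")  # 1 substitution / 2 reference words = 0.5000
```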

Framework versions

  • Transformers 4.53.0.dev0
  • PyTorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.0