fl_asr_speech_recognition

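This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set (the figures match the step-14000 row of the training results table below):

  • Loss: 0.2947
  • WER (word error rate): 0.1449
  • CER (character error rate): 0.0451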
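
For readers unfamiliar with the two error metrics above, here is a minimal sketch of how WER and CER are commonly defined: the Levenshtein edit distance between reference and hypothesis, counted over words or characters, divided by the reference length. The card does not say which implementation produced the reported numbers, so the function names and the example sentences below are illustrative assumptions, not the author's evaluation code.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: fewest substitutions, deletions and insertions."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance to an empty hypothesis: i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference token
                curr[j - 1] + 1,     # insert a hypothesis token
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate for one utterance (whitespace tokenization assumed)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate for one utterance (spaces count as characters)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    ref = "the cat sat on the mat"   # made-up example, not from the dataset
    hyp = "the cat sit on mat"
    print(f"WER = {wer(ref, hyp):.4f}")  # 2 word errors / 6 words
    print(f"CER = {cer(ref, hyp):.4f}")  # 5 char errors / 22 chars
```

Corpus-level scores are usually computed by summing errors and reference lengths over all utterances rather than averaging per-utterance rates, so a score reproduced with this sketch may differ slightly depending on that choice and on text normalization.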

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 200
  • mixed_precision_training: Native AMP
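
To make the `lr_scheduler_type: linear` and `lr_scheduler_warmup_steps: 500` settings concrete, the sketch below reproduces the usual shape of a linear schedule with warmup: the rate rises linearly from 0 to `learning_rate` over the warmup steps, then decays linearly to 0 at the end of training. The total of 15,800 steps is an assumption inferred from the results table (1000 steps ≈ 12.66 epochs, so about 79 steps per epoch, times 200 epochs); the function name is hypothetical.

```python
def linear_schedule_lr(
    step: int,
    base_lr: float = 1e-4,       # learning_rate from the card
    warmup_steps: int = 500,     # lr_scheduler_warmup_steps from the card
    total_steps: int = 15_800,   # ASSUMPTION: ~79 steps/epoch * 200 epochs
) -> float:
    """Learning rate at a given optimizer step for linear warmup + linear decay."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 250, 500, 8_150, 15_000, 15_800):
        print(f"step {s:>6}: lr = {linear_schedule_lr(s):.2e}")
```

Under these assumptions the rate peaks at 1e-4 at step 500 and has already fallen to roughly 5e-6 by the last logged step (15000), which is consistent with the very small training-loss changes late in the table.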

Training results

| Training Loss | Epoch    | Step  | Validation Loss | WER    | CER    |
|---------------|----------|-------|-----------------|--------|--------|
| 0.625         | 12.6582  | 1000  | 0.4832          | 0.5625 | 0.1090 |
| 0.3037        | 25.3165  | 2000  | 0.3879          | 0.3665 | 0.0686 |
| 0.2127        | 37.9747  | 3000  | 0.4096          | 0.2926 | 0.0617 |
| 0.1767        | 50.6329  | 4000  | 0.3967          | 0.25   | 0.0552 |
| 0.1238        | 63.2911  | 5000  | 0.3024          | 0.2273 | 0.0529 |
| 0.0868        | 75.9494  | 6000  | 0.3768          | 0.2330 | 0.0487 |
| 0.0823        | 88.6076  | 7000  | 0.2742          | 0.2244 | 0.0420 |
| 0.0696        | 101.2658 | 8000  | 0.2792          | 0.2074 | 0.0383 |
| 0.0496        | 113.9241 | 9000  | 0.3362          | 0.1591 | 0.0359 |
| 0.0413        | 126.5823 | 10000 | 0.3061          | 0.1562 | 0.0400 |
| 0.0286        | 139.2405 | 11000 | 0.3264          | 0.1591 | 0.0406 |
| 0.0294        | 151.8987 | 12000 | 0.3046          | 0.1648 | 0.0424 |
| 0.0183        | 164.5570 | 13000 | 0.3083          | 0.1506 | 0.0400 |
| 0.0159        | 177.2152 | 14000 | 0.2947          | 0.1449 | 0.0451 |
| 0.009         | 189.8734 | 15000 | 0.3198          | 0.1477 | 0.0411 |
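
The logged results above can be compared checkpoint by checkpoint. The sketch below copies the rows verbatim and finds the step with the lowest value of each evaluation metric; it also derives the approximate number of optimizer steps per epoch. The names `Row`, `ROWS` and `best_by` are illustrative and not part of any training script.

```python
from collections import namedtuple

Row = namedtuple("Row", "train_loss epoch step val_loss wer cer")

# Values copied from the training results logged in this card.
ROWS = [
    Row(0.625, 12.6582, 1000, 0.4832, 0.5625, 0.1090),
    Row(0.3037, 25.3165, 2000, 0.3879, 0.3665, 0.0686),
    Row(0.2127, 37.9747, 3000, 0.4096, 0.2926, 0.0617),
    Row(0.1767, 50.6329, 4000, 0.3967, 0.25, 0.0552),
    Row(0.1238, 63.2911, 5000, 0.3024, 0.2273, 0.0529),
    Row(0.0868, 75.9494, 6000, 0.3768, 0.2330, 0.0487),
    Row(0.0823, 88.6076, 7000, 0.2742, 0.2244, 0.0420),
    Row(0.0696, 101.2658, 8000, 0.2792, 0.2074, 0.0383),
    Row(0.0496, 113.9241, 9000, 0.3362, 0.1591, 0.0359),
    Row(0.0413, 126.5823, 10000, 0.3061, 0.1562, 0.0400),
    Row(0.0286, 139.2405, 11000, 0.3264, 0.1591, 0.0406),
    Row(0.0294, 151.8987, 12000, 0.3046, 0.1648, 0.0424),
    Row(0.0183, 164.5570, 13000, 0.3083, 0.1506, 0.0400),
    Row(0.0159, 177.2152, 14000, 0.2947, 0.1449, 0.0451),
    Row(0.009, 189.8734, 15000, 0.3198, 0.1477, 0.0411),
]


def best_by(metric: str) -> Row:
    """Return the logged row with the lowest value of the given metric."""
    return min(ROWS, key=lambda row: getattr(row, metric))


if __name__ == "__main__":
    for metric in ("val_loss", "wer", "cer"):
        row = best_by(metric)
        print(f"lowest {metric}: {getattr(row, metric)} at step {row.step}")
    # lowest val_loss: 0.2742 at step 7000
    # lowest wer: 0.1449 at step 14000
    # lowest cer: 0.0359 at step 9000
    steps_per_epoch = round(ROWS[0].step / ROWS[0].epoch)
    print(f"approx. steps per epoch: {steps_per_epoch}")  # 79
```

The three metrics bottom out at different checkpoints, while training loss keeps falling throughout; the headline evaluation numbers coincide with the lowest-WER row (step 14000), though the card does not state how that checkpoint was selected.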

Framework versions

  • Transformers 4.43.3
  • Pytorch 2.3.1+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1
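
When trying to reproduce the results, it can help to check the local environment against the versions listed above. The following is a small sketch using the standard-library `importlib.metadata`; the distribution names (`torch` for PyTorch in particular) are assumptions about how the packages are published on PyPI, and the helper name `version_mismatches` is hypothetical.

```python
from importlib import metadata

# Versions listed in this card; keys are assumed PyPI distribution names.
PINNED = {
    "transformers": "4.43.3",
    "torch": "2.3.1+cu121",
    "datasets": "2.19.2",
    "tokenizers": "0.19.1",
}


def version_mismatches(pinned=PINNED, lookup=metadata.version):
    """Map each differing package to (wanted, installed); installed is None if absent."""
    mismatches = {}
    for package, wanted in pinned.items():
        try:
            installed = lookup(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in version_mismatches().items():
        print(f"{package}: card lists {wanted}, found {installed}")
```

The comparison is an exact string match, so a CPU-only or differently built PyTorch wheel (for example one without the `+cu121` suffix) is reported as a mismatch even when the base version agrees.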

Model weights

  • Format: Safetensors
  • Model size: 94.4M params
  • Tensor type: F32
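
As a rough guide to storage and memory needs, the parameter count and tensor type reported for this model give a back-of-the-envelope weight size. The sketch below multiplies the rounded 94.4M parameter count by the bytes per element; the half-precision rows are hypothetical conversions shown only for comparison, and the helper name is illustrative.

```python
PARAM_COUNT = 94.4e6  # rounded figure reported for this model

# Bytes per element for common tensor types.
BYTES_PER_ELEMENT = {"F32": 4, "F16": 2, "BF16": 2}


def weight_size_mb(params: float, dtype: str = "F32") -> float:
    """Approximate size of the raw weights in megabytes (10**6 bytes)."""
    return params * BYTES_PER_ELEMENT[dtype] / 1e6


if __name__ == "__main__":
    for dtype in ("F32", "F16"):
        print(f"{dtype}: ~{weight_size_mb(PARAM_COUNT, dtype):.1f} MB")
```

At F32 this comes to roughly 378 MB for the weights alone; the actual file will differ slightly because the parameter count is rounded and the file carries a small header, and inference needs additional memory for activations.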