Whisper Small GA-EN Speech Translation + VAD

This model is a fine-tuned version of openai/whisper-small on the IWSLT-2023, FLEURS, BiteSize, and SpokenWords datasets. It achieves the following results on the evaluation set at the final training step (3000):

  • Loss: 1.7352
  • Bleu: 28.22
  • Chrf: 44.19
  • Wer: 68.5277
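
WER counts insertions as well as substitutions and deletions, so it can exceed 100 when the output is much longer than the reference (the training results report 101.4408 at step 300). The sketch below illustrates the standard word-level definition only. It is not the evaluation code behind the numbers on this card, which does not say which implementation or text normalization was used, and the example sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference length, in percent."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical sentences: four inserted words against a two-word reference.
print(word_error_rate("good morning", "a very good morning to you"))
```

Here the two reference words are both matched, but the four extra words give 4 / 2 = 200% WER.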

Model description

More information needed

Intended uses & limitations

More information needed
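
As a starting point for trying the model, the following is a minimal sketch that loads the checkpoint through the Transformers `pipeline` API. The helper names `build_translator` and `translate_file` and the `audio_path` argument are hypothetical, chosen for illustration. The card does not document generation settings (such as language or task tokens) or the VAD step mentioned in the title, so none are set here.

```python
MODEL_ID = "ymoslem/whisper-small-ga2en-v1.5-r"  # repository name as listed on this page


def build_translator(model_id: str = MODEL_ID):
    """Return a speech pipeline for the checkpoint (downloads weights on first use)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def translate_file(audio_path: str, translator=None) -> str:
    """Run the model on one audio file and return the generated text."""
    translator = translator or build_translator()
    result = translator(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"]
```

Calling `translate_file("clip.wav")` with a path to an Irish audio clip should return the model's English output. Decoding audio files through the pipeline requires ffmpeg to be installed.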

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 3000
  • mixed_precision_training: Native AMP
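
To make the `linear` scheduler concrete, the sketch below computes the learning rate at a given step from the values listed above (`learning_rate` 0.0001, `training_steps` 3000). The card lists no warmup setting, so `warmup_steps=0` is an assumption. The function mirrors the usual linear warmup and decay rule rather than reproducing the training script.

```python
def linear_lr(step: int, base_lr: float = 1e-4, total_steps: int = 3000,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly so the rate reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 1500, 3000):
    print(step, linear_lr(step))
```

Under the no-warmup assumption, the rate starts at 1e-4, is halved by step 1500, and reaches zero at step 3000.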

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Chrf | Wer |
|---|---|---|---|---|---|---|
| 1.9529 | 0.2188 | 100 | 1.7388 | 12.76 | 29.03 | 97.1184 |
| 1.5762 | 0.4376 | 200 | 1.5362 | 15.3 | 33.31 | 98.4241 |
| 1.2624 | 0.6565 | 300 | 1.4346 | 17.94 | 37.2 | 101.4408 |
| 1.0367 | 0.8753 | 400 | 1.4502 | 21.52 | 39.13 | 85.4120 |
| 0.4677 | 1.0941 | 500 | 1.4693 | 23.26 | 40.49 | 78.4331 |
| 0.4284 | 1.3129 | 600 | 1.5163 | 21.31 | 41.41 | 86.0873 |
| 0.4026 | 1.5317 | 700 | 1.4999 | 24.11 | 40.59 | 79.3787 |
| 0.4132 | 1.7505 | 800 | 1.5134 | 27.77 | 43.01 | 70.1936 |
| 0.3701 | 1.9694 | 900 | 1.5368 | 27.74 | 42.61 | 66.0964 |
| 0.1337 | 2.1882 | 1000 | 1.5692 | 27.96 | 43.77 | 64.9257 |
| 0.143 | 2.4070 | 1100 | 1.5516 | 26.06 | 42.12 | 71.3192 |
| 0.144 | 2.6258 | 1200 | 1.5839 | 27.55 | 43.19 | 69.7434 |
| 0.1372 | 2.8446 | 1300 | 1.5510 | 27.93 | 43.07 | 66.1414 |
| 0.0573 | 3.0635 | 1400 | 1.6567 | 26.34 | 41.69 | 72.3998 |
| 0.0554 | 3.2823 | 1500 | 1.6511 | 27.98 | 42.66 | 68.5277 |
| 0.0534 | 3.5011 | 1600 | 1.6732 | 28.29 | 43.2 | 67.1319 |
| 0.0588 | 3.7199 | 1700 | 1.6687 | 27.0 | 43.31 | 70.7789 |
| 0.0486 | 3.9387 | 1800 | 1.6759 | 28.02 | 43.97 | 66.3665 |
| 0.0224 | 4.1575 | 1900 | 1.7597 | 26.86 | 41.81 | 70.5538 |
| 0.0264 | 4.3764 | 2000 | 1.7113 | 27.58 | 43.38 | 70.4638 |
| 0.0233 | 4.5952 | 2100 | 1.7013 | 27.83 | 42.87 | 68.2575 |
| 0.0192 | 4.8140 | 2200 | 1.7351 | 25.39 | 42.09 | 78.0279 |
| 0.0149 | 5.0328 | 2300 | 1.7350 | 27.62 | 43.99 | 70.5538 |
| 0.0086 | 5.2516 | 2400 | 1.7331 | 29.37 | 45.08 | 68.5277 |
| 0.006 | 5.4705 | 2500 | 1.7145 | 29.04 | 44.19 | 66.9968 |
| 0.0064 | 5.6893 | 2600 | 1.7322 | 28.27 | 43.6 | 70.2386 |
| 0.0053 | 5.9081 | 2700 | 1.7239 | 27.86 | 43.78 | 69.6083 |
| 0.0021 | 6.1269 | 2800 | 1.7288 | 28.14 | 44.12 | 68.5727 |
| 0.0016 | 6.3457 | 2900 | 1.7375 | 28.26 | 44.14 | 68.7078 |
| 0.0023 | 6.5646 | 3000 | 1.7352 | 28.22 | 44.19 | 68.5277 |
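
The headline results correspond to the last row (step 3000), but the table shows that different checkpoints are best on different metrics. The sketch below copies the evaluation columns from the table and finds the best step per metric. The `best` helper is hypothetical, and the card does not say which checkpoint was actually published.

```python
# (step, validation_loss, bleu, chrf, wer), copied from the table above
ROWS = [
    (100, 1.7388, 12.76, 29.03, 97.1184), (200, 1.5362, 15.3, 33.31, 98.4241),
    (300, 1.4346, 17.94, 37.2, 101.4408), (400, 1.4502, 21.52, 39.13, 85.4120),
    (500, 1.4693, 23.26, 40.49, 78.4331), (600, 1.5163, 21.31, 41.41, 86.0873),
    (700, 1.4999, 24.11, 40.59, 79.3787), (800, 1.5134, 27.77, 43.01, 70.1936),
    (900, 1.5368, 27.74, 42.61, 66.0964), (1000, 1.5692, 27.96, 43.77, 64.9257),
    (1100, 1.5516, 26.06, 42.12, 71.3192), (1200, 1.5839, 27.55, 43.19, 69.7434),
    (1300, 1.5510, 27.93, 43.07, 66.1414), (1400, 1.6567, 26.34, 41.69, 72.3998),
    (1500, 1.6511, 27.98, 42.66, 68.5277), (1600, 1.6732, 28.29, 43.2, 67.1319),
    (1700, 1.6687, 27.0, 43.31, 70.7789), (1800, 1.6759, 28.02, 43.97, 66.3665),
    (1900, 1.7597, 26.86, 41.81, 70.5538), (2000, 1.7113, 27.58, 43.38, 70.4638),
    (2100, 1.7013, 27.83, 42.87, 68.2575), (2200, 1.7351, 25.39, 42.09, 78.0279),
    (2300, 1.7350, 27.62, 43.99, 70.5538), (2400, 1.7331, 29.37, 45.08, 68.5277),
    (2500, 1.7145, 29.04, 44.19, 66.9968), (2600, 1.7322, 28.27, 43.6, 70.2386),
    (2700, 1.7239, 27.86, 43.78, 69.6083), (2800, 1.7288, 28.14, 44.12, 68.5727),
    (2900, 1.7375, 28.26, 44.14, 68.7078), (3000, 1.7352, 28.22, 44.19, 68.5277),
]


def best(rows, column, higher_is_better):
    """Return the row with the best value in the given column index."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


lowest_loss = best(ROWS, 1, higher_is_better=False)
best_bleu = best(ROWS, 2, higher_is_better=True)
best_chrf = best(ROWS, 3, higher_is_better=True)
lowest_wer = best(ROWS, 4, higher_is_better=False)

print("lowest validation loss at step", lowest_loss[0])
print("best BLEU at step", best_bleu[0])
print("best chrF at step", best_chrf[0])
print("lowest WER at step", lowest_wer[0])
```

Validation loss is lowest at step 300 (1.4346), BLEU and chrF peak at step 2400 (29.37 and 45.08), and WER is lowest at step 1000 (64.9257). Validation loss alone would therefore be a poor checkpoint-selection criterion for this run.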

Framework versions

  • Transformers 4.41.1
  • PyTorch 2.2.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model size: 242M params (Safetensors, tensor type F32)

Model tree for ymoslem/whisper-small-ga2en-v1.5-r

Base model: openai/whisper-small (this model is a fine-tune of it)

Evaluation results

  • Bleu on IWSLT-2023, FLEURS, BiteSize, and SpokenWords
    self-reported
    28.220
  • Wer on IWSLT-2023, FLEURS, BiteSize, and SpokenWords
    self-reported
    68.528