metadata
library_name: transformers
language:
- hu
license: apache-2.0
base_model: openai/whisper-tiny
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: whisper-tiny-finetuned-hu
  results: []
whisper-tiny-finetuned-hu
This model is a fine-tuned version of openai/whisper-tiny on a custom dataset. It achieves the following results on the evaluation set:
- Loss: 0.0418
- Wer: 0.1249
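The Wer value above is reported as a fraction (0.1249 is about 12.5%). As a minimal sketch of how a word error rate is usually defined (word-level edit distance divided by the number of reference words), the following may help when reading the numbers. The function names `edit_distance` and `wer` are illustrative and are not taken from the evaluation script used for this model, which is not documented here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a fraction of the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


# One substitution in a three-word reference.
print(wer("a kutya ugat", "a macska ugat"))
# Three inserted words against a two-word reference give a value above 1.
print(wer("jó napot", "jó napot kívánok szépen önnek"))
```

Because insertions count as errors, the ratio is not capped at 1. This is presumably why some baseline rows in the comparison table exceed 100, assuming that table reports percentages.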
Tests on other datasets, compared against other models (tiny, base, small, medium, large):
model_name | WER | CER | Norm WER | Norm CER | dataset | batch_size | language | runtime |
---|---|---|---|---|---|---|---|---|
openai/whisper-large-v3 | 19.77 | 4.81 | 14.62 | 3.73 | g_fleurs_test_hu | 16 | hu | 617.91 |
openai/whisper-large-v3 | 21.81 | 5.81 | 18.07 | 4.95 | CV_17_0_hu_test | 16 | hu | 5676.63 |
openai/whisper-large-v2 | 24.04 | 6.24 | 19.26 | 5.15 | g_fleurs_test_hu | 16 | hu | 627.70 |
openai/whisper-large-v2 | 25.97 | 6.57 | 21.82 | 5.47 | CV_17_0_hu_test | 16 | hu | 9275.54 |
sarpba/whisper-base-hungarian_v1 | 27.65 | 6.77 | 23.53 | 5.77 | CV_17_0_hu_test | 32 | hu | 460.27 |
openai/whisper-large | 30.13 | 8.93 | 26.20 | 8.04 | CV_17_0_hu_test | 16 | hu | 5909.03 |
---> sarpba/whisper-hu-tiny-finetuned | 30.81 | 7.67 | 26.63 | 6.60 | CV_17_0_hu_test | 32 | hu | 328.25 |
openai/whisper-large | 31.74 | 10.69 | 26.67 | 9.57 | g_fleurs_test_hu | 16 | hu | 711.97 |
openai/whisper-medium | 33.04 | 9.93 | 27.97 | 8.34 | g_fleurs_test_hu | 32 | hu | 450.89 |
sarpba/whisper-base-hungarian_v1 | 37.16 | 11.96 | 30.60 | 10.43 | g_fleurs_test_hu | 32 | hu | 67.86 |
openai/whisper-medium | 34.46 | 9.12 | 30.63 | 8.05 | CV_17_0_hu_test | 32 | hu | 3317.29 |
---> sarpba/whisper-hu-tiny-finetuned | 40.32 | 12.85 | 33.99 | 11.33 | g_fleurs_test_hu | 32 | hu | 51.74 |
openai/whisper-small | 50.07 | 15.69 | 45.54 | 14.40 | g_fleurs_test_hu | 32 | hu | 185.89 |
openai/whisper-small | 55.67 | 16.77 | 52.20 | 15.62 | CV_17_0_hu_test | 32 | hu | 1398.06 |
openai/whisper-base | 89.82 | 40.00 | 86.61 | 37.75 | g_fleurs_test_hu | 32 | hu | 118.69 |
openai/whisper-base | 95.66 | 39.98 | 93.67 | 38.51 | CV_17_0_hu_test | 32 | hu | 779.32 |
openai/whisper-tiny | 108.61 | 58.69 | 106.29 | 55.98 | g_fleurs_test_hu | 32 | hu | 90.65 |
openai/whisper-tiny | 120.86 | 55.10 | 119.12 | 53.19 | CV_17_0_hu_test | 32 | hu | 597.92 |
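The Norm WER and Norm CER columns are lower than the raw ones, which suggests the text was normalized before scoring. The normalizer actually used is not stated in this card, so the sketch below is an assumption: it lowercases the text and strips punctuation, which is a common choice. The `normalize` helper is hypothetical.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace (assumed normalizer)."""
    text = unicodedata.normalize("NFC", text).lower()
    # Replace every character that is neither a word character
    # (letter, digit, underscore) nor whitespace with a space.
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


reference = "Jó napot, kívánok!"
hypothesis = "jó napot kívánok"

# Raw comparison: casing and punctuation make all three words differ.
raw_mismatches = sum(r != h for r, h in zip(reference.split(), hypothesis.split()))

# Normalized comparison: the same transcripts now agree word for word.
norm_mismatches = sum(
    r != h for r, h in zip(normalize(reference).split(), normalize(hypothesis).split())
)

print(raw_mismatches, norm_mismatches)
```

Under this assumption, differences in casing and punctuation count as errors only in the raw columns, while the normalized columns reflect the recognized words themselves.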
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 7e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 2
- mixed_precision_training: Native AMP
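To make the schedule and batch settings above concrete, here is a small sketch of a linear learning-rate schedule with warmup and of the effective batch size. The total step count is not listed among the hyperparameters. `TOTAL_STEPS` is an estimate derived from the training results table (step 2000 corresponds to epoch 0.0902, i.e. roughly 22,170 steps per epoch), so treat it as an assumption. The helper `linear_lr` is illustrative and is not the Trainer's own code.

```python
BASE_LR = 7e-05
WARMUP_STEPS = 500
NUM_EPOCHS = 2
STEPS_PER_EPOCH = 22_170  # assumption: estimated from the results table
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

PER_DEVICE_BATCH = 32
NUM_DEVICES = 2
# Effective batch size, assuming no gradient accumulation.
TOTAL_TRAIN_BATCH = PER_DEVICE_BATCH * NUM_DEVICES


def linear_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total - step) / (total - warmup))


for step in (0, 250, 500, 22_000, 44_000):
    print(step, linear_lr(step))
```

With these settings the learning rate peaks at 7e-05 after 500 steps and then decays toward zero by the end of the second epoch.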
Training results
Training Loss | Epoch | Step | Validation Loss | Wer |
---|---|---|---|---|
0.1078 | 0.0902 | 2000 | 0.1127 | 0.3073 |
0.0889 | 0.1804 | 4000 | 0.0899 | 0.2509 |
0.0766 | 0.2707 | 6000 | 0.0797 | 0.2238 |
0.0743 | 0.3609 | 8000 | 0.0733 | 0.2094 |
0.0691 | 0.4511 | 10000 | 0.0685 | 0.1963 |
0.0646 | 0.5413 | 12000 | 0.0650 | 0.1858 |
0.0602 | 0.6316 | 14000 | 0.0618 | 0.1759 |
0.0586 | 0.7218 | 16000 | 0.0594 | 0.1737 |
0.0553 | 0.8120 | 18000 | 0.0568 | 0.1665 |
0.055 | 0.9022 | 20000 | 0.0552 | 0.1635 |
0.0522 | 0.9925 | 22000 | 0.0531 | 0.1558 |
0.0415 | 1.0827 | 24000 | 0.0523 | 0.1555 |
0.0419 | 1.1729 | 26000 | 0.0512 | 0.1497 |
0.0406 | 1.2631 | 28000 | 0.0496 | 0.1483 |
0.042 | 1.3534 | 30000 | 0.0490 | 0.1464 |
0.0393 | 1.4436 | 32000 | 0.0473 | 0.1397 |
0.0395 | 1.5338 | 34000 | 0.0458 | 0.1373 |
0.0375 | 1.6240 | 36000 | 0.0448 | 0.1343 |
0.0372 | 1.7143 | 38000 | 0.0442 | 0.1328 |
0.036 | 1.8045 | 40000 | 0.0432 | 0.1286 |
0.0358 | 1.8947 | 42000 | 0.0424 | 0.1273 |
0.035 | 1.9849 | 44000 | 0.0418 | 0.1249 |
Framework versions
- Transformers 4.47.0
- Pytorch 2.5.1+cu118
- Datasets 3.1.0
- Tokenizers 0.21.0