# whisper-SER-base-v2
This model is a fine-tuned version of openai/whisper-base on the facebook_voxpopulik_16k_Whisper_Compatible dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5113
- Wer: 31.9908
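
As a quick sanity check, the snippet below loads the checkpoint and transcribes a single 16 kHz audio file. This is a minimal sketch, not part of the original card: the repo id `whisper-SER-base-v2` and the file `sample.wav` are placeholders, and `librosa` is just one convenient way to load audio at the expected sampling rate.

```python
import torch
import librosa  # one option for loading/resampling audio; soundfile + datasets also work
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Placeholder repo id -- substitute the actual Hub path of this checkpoint.
MODEL_ID = "whisper-SER-base-v2"

processor = WhisperProcessor.from_pretrained(MODEL_ID)
model = WhisperForConditionalGeneration.from_pretrained(MODEL_ID)
model.eval()

# Whisper expects 16 kHz mono audio; "sample.wav" is a placeholder file.
audio, _ = librosa.load("sample.wav", sr=16000)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    generated_ids = model.generate(inputs.input_features)

print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```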
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a matching configuration sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 12000
- mixed_precision_training: Native AMP
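
For readers who want to reproduce the setup, here is a sketch of how the values above map onto `Seq2SeqTrainingArguments`. The `output_dir` and the evaluation/logging cadence are assumptions (the results table below logs every 1000 steps); they are not stated on this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-SER-base-v2",  # assumed output path, not from the card
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=1,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=12000,
    fp16=True,              # assumes "Native AMP" means fp16 rather than bf16
    eval_strategy="steps",  # assumed from the per-1000-step rows in the table below
    eval_steps=1000,
    logging_steps=1000,
)
```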
### Training results
| Training Loss | Epoch  | Step  | Validation Loss | Wer     |
|:-------------:|:------:|:-----:|:---------------:|:-------:|
| 0.4753        | 0.4322 | 1000  | 0.4532          | 24.8077 |
| 0.4303        | 0.8643 | 2000  | 0.4212          | 25.1645 |
| 0.2697        | 1.2965 | 3000  | 0.4265          | 27.7174 |
| 0.2267        | 1.7286 | 4000  | 0.4122          | 27.1307 |
| 0.1764        | 2.1608 | 5000  | 0.4505          | 39.1422 |
| 0.2175        | 2.5929 | 6000  | 0.4206          | 26.8770 |
| 0.0845        | 3.0251 | 7000  | 0.4547          | 32.9739 |
| 0.0907        | 3.4572 | 8000  | 0.4707          | 28.8353 |
| 0.0968        | 3.8894 | 9000  | 0.4768          | 32.9660 |
| 0.0495        | 4.3215 | 10000 | 0.5026          | 31.2455 |
| 0.051         | 4.7537 | 11000 | 0.5037          | 32.8312 |
| 0.0668        | 5.1858 | 12000 | 0.5113          | 31.9908 |
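
The Wer column is word error rate, reported in percent. As an illustration of how such a figure can be computed (this is a sketch using the `evaluate` library, which is not listed on this card; its `wer` metric returns a fraction that is scaled by 100 here):

```python
import evaluate

wer_metric = evaluate.load("wer")

# Placeholder transcripts -- in practice these would be the model's outputs
# and the dataset's reference transcriptions.
predictions = ["the cat sat on the mat"]
references = ["the cat sat on a mat"]

# One substitution out of six reference words -> 1/6, i.e. 16.6667%.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```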
### Framework versions
- Transformers 4.48.0
- Pytorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0