metadata

license: apache-2.0
base_model: openai/whisper-medium
tags:
  - generated_from_trainer
datasets:
  - common_voice_17_0
metrics:
  - wer
model-index:
  - name: whisper_medium_finetuning_maior4s_8kh
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common_voice_17_0
          type: common_voice_17_0
          config: pt
          split: None
          args: pt
        metrics:
          - name: Wer
            type: wer
            value: 26.357459011283584

whisper_medium_finetuning_maior4s_8kh

This model is a fine-tuned version of openai/whisper-medium on the Portuguese (pt) configuration of the common_voice_17_0 dataset. At the final checkpoint (step 8000) it achieves the following results on the evaluation set:

Loss: 0.2421
Wer (%): 26.3575
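
For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python illustration, not the evaluation code used for this model. It assumes the value reported above is a percentage and that words are split on whitespace; the function name `word_error_rate` and the example sentences are hypothetical, and any text normalization applied during the actual evaluation is unknown.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length * 100."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical example: one of four reference words is dropped.
print(word_error_rate("o gato preto dorme", "o gato dorme"))  # 25.0
```

A score over a whole evaluation set is usually computed by summing errors and reference words across all utterances before dividing, so it is not simply the mean of per-utterance scores.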

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-06
train_batch_size: 16
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 8000
mixed_precision_training: Native AMP
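
The scheduler settings listed above (linear schedule, 500 warmup steps, 8000 training steps, peak learning rate 1e-06) imply a learning rate that rises linearly from zero and then decays linearly back to zero. The sketch below reproduces that shape in plain Python under the assumption that the standard linear warmup and decay rule was used; the function name `linear_lr` is hypothetical and this is not the training script itself.

```python
BASE_LR = 1e-06        # learning_rate from the list above
WARMUP_STEPS = 500     # lr_scheduler_warmup_steps
TRAINING_STEPS = 8000  # training_steps


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step for linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        # Warmup: scale up linearly from 0 to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Decay: scale down linearly from BASE_LR at the end of warmup to 0 at the last step.
    remaining = max(0, TRAINING_STEPS - step)
    return BASE_LR * remaining / (TRAINING_STEPS - WARMUP_STEPS)


# Learning rate at each evaluation step reported in this card.
for step in range(1000, TRAINING_STEPS + 1, 1000):
    print(step, linear_lr(step))
```

Under this assumption the learning rate peaks at step 500 and has already dropped to 80% of its peak by step 2000.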

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
0.1168	1.0707	1000	0.1729	24.3127
0.087	2.1413	2000	0.1663	18.2013
0.058	3.2120	3000	0.1709	19.9302
0.0499	4.2827	4000	0.1780	21.3661
0.0336	5.3533	5000	0.1948	25.4951
0.029	6.4240	6000	0.2105	27.6541
0.0245	7.4946	7000	0.2315	26.5528
0.0195	8.5653	8000	0.2421	26.3575
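
The rows above can be checked programmatically to see which checkpoint scored best. The sketch below copies the table values into a list and picks the row with the lowest Wer and the row with the lowest validation loss. The variable names are hypothetical, and whether intermediate checkpoints were saved or published is not stated in this card.

```python
# (training_loss, epoch, step, validation_loss, wer), copied from the table above
ROWS = [
    (0.1168, 1.0707, 1000, 0.1729, 24.3127),
    (0.0870, 2.1413, 2000, 0.1663, 18.2013),
    (0.0580, 3.2120, 3000, 0.1709, 19.9302),
    (0.0499, 4.2827, 4000, 0.1780, 21.3661),
    (0.0336, 5.3533, 5000, 0.1948, 25.4951),
    (0.0290, 6.4240, 6000, 0.2105, 27.6541),
    (0.0245, 7.4946, 7000, 0.2315, 26.5528),
    (0.0195, 8.5653, 8000, 0.2421, 26.3575),
]

best_by_wer = min(ROWS, key=lambda row: row[4])
best_by_val_loss = min(ROWS, key=lambda row: row[3])
final = ROWS[-1]

print("lowest Wer:", best_by_wer[4], "at step", best_by_wer[2])
print("lowest validation loss:", best_by_val_loss[3], "at step", best_by_val_loss[2])
print("Wer gap between final and best step:", round(final[4] - best_by_wer[4], 4))
```

Both minima fall at step 2000 (Wer 18.2013, validation loss 0.1663), while the training loss keeps decreasing through step 8000. That pattern is consistent with overfitting after roughly two epochs, although the card does not provide enough information to confirm the cause.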

Framework versions

Transformers 4.41.2
Pytorch 2.2.1
Datasets 2.19.2
Tokenizers 0.19.1