---
library_name: transformers
language:
  - ko
license: mit
base_model: imTak/whisper_large_v3_turbo_Korean2
tags:
  - generated_from_trainer
datasets:
  - imTak/korean-speak-Develop
metrics:
  - wer
model-index:
  - name: Whisper large v3 turbo Korean-Develop
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Develop
          type: imTak/korean-speak-Develop
          args: 'config: ko, split: test'
        metrics:
          - name: Wer
            type: wer
            value: 16.43703941044537
---

# Whisper large v3 turbo Korean-Develop

This model is a fine-tuned version of [imTak/whisper_large_v3_turbo_Korean2](https://huggingface.co/imTak/whisper_large_v3_turbo_Korean2) on the [Develop](https://huggingface.co/datasets/imTak/korean-speak-Develop) dataset. It achieves the following results on the evaluation set:

- Loss: 0.3054
- Wer: 16.4370
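
As a minimal usage sketch, the checkpoint can be loaded through the Transformers `pipeline` API. The repository id below is inferred from the card title and `sample.wav` is a placeholder audio file, so both are assumptions to verify against the actual model page:

```python
# Minimal sketch: Korean transcription with this checkpoint.
# The repo id is inferred from the card title (verify on the model page),
# and "sample.wav" is a placeholder audio file.
import torch
from transformers import pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"

asr = pipeline(
    "automatic-speech-recognition",
    model="imTak/whisper_large_v3_turbo_Korean_Develop",  # assumed repo id
    device=device,
)

result = asr(
    "sample.wav",
    generate_kwargs={"language": "korean", "task": "transcribe"},
)
print(result["text"])
```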

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
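
As a sketch, the fine-tuning dataset named in this card's metadata can be inspected with the Datasets library; the available splits and column names are assumptions to check against the dataset repository:

```python
# Sketch: loading the dataset listed in this card's metadata.
# Split and column names are assumptions; inspect the printed structure.
from datasets import load_dataset

dataset = load_dataset("imTak/korean-speak-Develop")
print(dataset)  # shows available splits and features
```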

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 8000
- mixed_precision_training: Native AMP
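
For illustration, a sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`. The output directory and the 500-step evaluation/save cadence are assumptions (the cadence matches the results table below), and the Adam betas and epsilon listed above are the library defaults, so they are left unset:

```python
# Sketch: the hyperparameters above expressed as Seq2SeqTrainingArguments.
# output_dir and the eval/save cadence are assumptions; Adam betas/epsilon
# match the library defaults and are therefore not set explicitly.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper_large_v3_turbo_korean_develop",  # assumed path
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=1e-5,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=8000,
    seed=42,
    fp16=True,  # "Native AMP" mixed-precision training
    eval_strategy="steps",
    eval_steps=500,  # assumed; matches the 500-step rows in the results table
    save_steps=500,
    predict_with_generate=True,
)
```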

### Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer     |
|:-------------:|:-------:|:----:|:---------------:|:-------:|
| 0.2119        | 1.9455  | 500  | 0.2721          | 22.6690 |
| 0.0714        | 3.8911  | 1000 | 0.2542          | 19.9135 |
| 0.0145        | 5.8366  | 1500 | 0.2417          | 18.5037 |
| 0.0018        | 7.7821  | 2000 | 0.2410          | 16.6453 |
| 0.0263        | 9.7276  | 2500 | 0.2818          | 19.4169 |
| 0.0179        | 11.6732 | 3000 | 0.2806          | 18.5838 |
| 0.008         | 13.6187 | 3500 | 0.2977          | 18.1032 |
| 0.0072        | 15.5642 | 4000 | 0.2920          | 17.8949 |
| 0.0011        | 17.5097 | 4500 | 0.2875          | 16.8376 |
| 0.0024        | 19.4553 | 5000 | 0.3072          | 17.8629 |
| 0.0009        | 21.4008 | 5500 | 0.2943          | 16.8536 |
| 0.0002        | 23.3463 | 6000 | 0.3041          | 16.8055 |
| 0.0001        | 25.2918 | 6500 | 0.2993          | 16.6773 |
| 0.0001        | 27.2374 | 7000 | 0.3016          | 16.4851 |
| 0.0001        | 29.1829 | 7500 | 0.3043          | 16.4050 |
| 0.0001        | 31.1284 | 8000 | 0.3054          | 16.4370 |
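
The Wer values above are percentages. As a minimal sketch, they can be reproduced with the `evaluate` library (the reference and prediction strings here are placeholders):

```python
# Sketch: computing WER as a percentage, as reported in this card.
# The reference/prediction strings are placeholders.
import evaluate

wer_metric = evaluate.load("wer")

references = ["안녕하세요 만나서 반갑습니다"]
predictions = ["안녕하세요 만나서 반갑습니다"]

wer = 100 * wer_metric.compute(references=references, predictions=predictions)
print(f"WER: {wer:.4f}")
```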

### Framework versions

- Transformers 4.45.0
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.20.3