---
license: apache-2.0
datasets:
- openslr/openslr
- google/fleurs
- PhanithLIM/rfi-news-dataset
- seanghay/km-speech-corpus
language:
- km
metrics:
- wer
base_model:
- openai/whisper-small
pipeline_tag: automatic-speech-recognition
widget:
- src: output/1.wav
  example_title: Audio 1
  output:
    text: "ក្នុងរាត្រីកាលដ៏ស្ងប់ស្ងាត់មួយ បានផ្តិតជាប់នៅរូបភាពដ៏សែនសោកសង្រែងជាខ្លាំងចំពោះបុរសចំទង់ម៉ុនាស់"
- src: output/2.wav
  example_title: Audio 2
  output:
    text: "ពុក កុំជាទៅដល់ហើយ!សុំទេវិត្តអាចពុកកុំអោយកើតឯងមុនពេលខ្ញុំទៅដល់!"
---

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) for Khmer automatic speech recognition, trained on the datasets listed in the metadata above.
It achieves the following results on the evaluation set:

- eval_loss: 0.18
- eval_wer: 65.4881 (0.654881)
- eval_runtime: 2738.0001
- eval_samples_per_second: 1.588
- eval_steps_per_second: 0.199
- epoch: 4.0
- step: 4345

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_steps: 1000
- num_epochs: 10

### Framework versions

- Transformers 4.45.2
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
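
### Reading the WER figure

The evaluation results report `eval_wer` both as a percentage (65.4881) and as a fraction (0.654881). As a rough illustration of what that number measures, the sketch below computes a corpus-level word error rate as word-level edit distance divided by the number of reference words. It is a minimal stand-in and not the evaluation script used for this model: the function names `edit_distance` and `wer` are hypothetical, the toy sentences are made up, and splitting on whitespace is an assumption. Khmer is normally written without spaces between words, so the tokenization used during evaluation (not stated in this card) can change the score considerably.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra token in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Corpus-level WER: total word errors / total reference words."""
    errors = 0
    words = 0
    for ref, hyp in zip(references, hypotheses):
        # Assumption: whitespace tokenization; the card does not say
        # how the evaluation set was tokenized.
        ref_tokens = ref.split()
        hyp_tokens = hyp.split()
        errors += edit_distance(ref_tokens, hyp_tokens)
        words += len(ref_tokens)
    if words == 0:
        raise ValueError("references contain no words")
    return errors / words


# Toy English data for illustration only (not from the evaluation set).
refs = ["the cat sat on the mat", "hello world"]
hyps = ["the cat sat on mat", "hello there world"]
score = wer(refs, hyps)
print(f"WER = {score:.4f} ({100 * score:.2f}%)")  # WER = 0.2500 (25.00%)
```

Here one deleted word and one inserted word against eight reference words give 2 / 8 = 0.25. The same fraction-to-percentage relation links the two `eval_wer` values reported above.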