This model is fine-tuned from microsoft/Phi-4-multimodal-instruct on Bingsu/zeroth-korean and google/fleurs for 5 epochs.
It was trained for 960 steps on these datasets for Korean automatic speech recognition (ASR) on an H100 GPU.
As a next step, we will check whether performance scales through additional training on synthetic Korean data derived from the CoVoST2 dataset.
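A minimal inference sketch, following the base model's documented usage pattern (the audio file path is a placeholder):

```python
import soundfile as sf
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "junnei/Phi-4-multimodal-instruct-ko-asr"

processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="cuda",
    trust_remote_code=True,
)

# "sample.wav" is a placeholder path to a Korean speech clip.
audio, sample_rate = sf.read("sample.wav")

# Chat format from the base model's documentation.
prompt = "<|user|><|audio_1|>Transcribe the audio clip into text.<|end|><|assistant|>"

inputs = processor(text=prompt, audios=[(audio, sample_rate)], return_tensors="pt").to(model.device)
generate_ids = model.generate(**inputs, max_new_tokens=256)
# Strip the prompt tokens before decoding.
generate_ids = generate_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(generate_ids, skip_special_tokens=True)[0])
```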
## Evaluation
Evaluation used the following text normalizer and metrics:

```python
from whisper_normalizer.basic import BasicTextNormalizer
from evaluate import load

# Normalizer applied to both predictions and references before scoring.
normalizer = BasicTextNormalizer()

# Character and word error rate metrics from the Hugging Face evaluate library.
cer_metric = load("cer")
wer_metric = load("wer")
```
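For clarity, here is how these objects would typically be applied; the strings below are illustrative placeholders, not samples from the test sets:

```python
# Illustrative only: normalize both hypothesis and reference, then score.
predictions = [normalizer("안녕하세요 반갑습니다")]
references = [normalizer("안녕하세요, 반갑습니다.")]

cer = cer_metric.compute(predictions=predictions, references=references)
wer = wer_metric.compute(predictions=predictions, references=references)
print(f"CER: {cer:.4f}, WER: {wer:.4f}")
```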
| Model | zeroth-test-BLEU | zeroth-test-CER | zeroth-test-WER | fleurs-test-BLEU | fleurs-test-CER | fleurs-test-WER |
|---|---|---|---|---|---|---|
| original | 0.071 | 126.4 | 121.5 | 0.010 | 115.7 | 112.8 |
| finetune (this model) | 94.837 | 1.429 | 2.951 | 67.659 | 7.951 | 18.313 |
Evaluation was done on the following datasets:
- ASR (Automatic Speech Recognition): evaluated with CER (Character Error Rate) on the zeroth-test set (457 samples).
- AST (Automatic Speech Translation): evaluated with BLEU on the fleurs ko <-> en speech translation results (270 samples); see the scoring sketch below.
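BLEU here is reported on a 0-100 scale. A minimal scoring sketch, assuming the sacrebleu implementation from the evaluate library (the sentence pair below is a placeholder, not test data):

```python
from evaluate import load

bleu_metric = load("sacrebleu")

# Placeholder hypothesis/reference pair; real scoring uses the fleurs test split.
predictions = ["The weather is nice today."]
references = [["The weather is nice today."]]

result = bleu_metric.compute(predictions=predictions, references=references)
print(result["score"])  # corpus BLEU on a 0-100 scale
```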
The evaluation script was retrieved from here.
Compared to seastar105/Phi-4-mm-inst-zeroth-kor and daekeun-ml/Phi-4-multimodal-finetune-ko-speech, ASR performance is significantly improved:
| Model | zeroth-test (CER) | fleurs-ko2en (BLEU) | fleurs-ko2en-cot (BLEU) | fleurs-en2ko (BLEU) | fleurs-en2ko-cot (BLEU) |
|---|---|---|---|---|---|
| original | 198.32 | 5.63 | 2.42 | 6.86 | 4.17 |
| finetune (this model) | 1.31 | 7.46 | 6.24 | 12.15 | 8.91 |
| daekeun-ml/Phi-4-multimodal-finetune-ko-speech | 3.80 | 7.03 | 7.04 | 12.50 | 9.54 |
| seastar105/Phi-4-mm-inst-zeroth-kor | 7.02 | 7.07 | 9.19 | 13.08 | 9.35 |
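The -cot columns presumably use chain-of-thought style prompting (transcribe, then translate). As a hedged illustration only (the exact prompts are defined in the evaluation script), the base model's chat format implies templates along these lines:

```python
# Illustrative prompt templates inferred from the base model's chat format.
# These are assumptions; the actual evaluation prompts live in the shared script.
asr_prompt = "<|user|><|audio_1|>Transcribe the audio clip into text.<|end|><|assistant|>"
ast_ko2en_prompt = "<|user|><|audio_1|>Translate the audio to English.<|end|><|assistant|>"
```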