metadata

library_name: transformers
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: whisper-small-es-ja
    results: []
datasets:
  - Marianoleiras/voxpopuli_es-ja
language:
  - es
  - ja
base_model:
  - openai/whisper-small

whisper-small-es-ja

This model is a fine-tuned version of openai/whisper-small, trained on the Marianoleiras/voxpopuli_es-ja dataset for Spanish-to-Japanese and Japanese-to-Spanish speech-to-text (STT) tasks. It builds on OpenAI's Whisper architecture, which was designed for multilingual speech recognition and translation. The model reaches a BLEU score above 21 on both the evaluation and test sets.

It achieves the following results on the evaluation set (these match the step 2750 checkpoint in the training results table):

Loss: 1.1724
Bleu: 22.2850

It achieves the following results on the test set:

Bleu: 21.4557

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
training_steps: 3500
mixed_precision_training: Native AMP
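
To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay from 1e-05 to zero over the 3500 training steps. The function name `linear_lr` and the `warmup_steps` parameter are illustrative assumptions: the card does not list any warmup, so the default of 0 is a guess, not the recorded configuration.

```python
def linear_lr(step, peak_lr=1e-5, total_steps=3500, warmup_steps=0):
    """Learning rate at `step` for linear warmup followed by linear decay to zero.

    peak_lr and total_steps come from the hyperparameters listed above.
    warmup_steps is an assumption: the card does not state a warmup value.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at a few of the logged steps.
for step in (0, 250, 1750, 2750, 3500):
    print(f"step {step:>4}: lr = {linear_lr(step):.3e}")
```

With no warmup, the rate is half its peak at step 1750 and reaches zero at step 3500.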

Training results

Training Loss	Epoch	Step	Bleu	Validation Loss
1.5787	0.3962	250	11.6756	1.5196
1.3535	0.7924	500	16.0514	1.3470
1.0658	1.1886	750	17.7743	1.2533
1.0303	1.5848	1000	19.1894	1.2046
0.9893	1.9810	1250	20.1198	1.1591
0.7569	2.3772	1500	21.0054	1.1546
0.7571	2.7734	1750	21.6425	1.1378
0.5557	3.1696	2000	21.7563	1.1500
0.5612	3.5658	2250	21.1391	1.1395
0.5581	3.9620	2500	22.0412	1.1343
0.4144	4.3582	2750	22.2850	1.1724
0.4114	4.7544	3000	22.1925	1.1681
0.3005	5.1506	3250	21.4948	1.1947
0.2945	5.5468	3500	22.1454	1.1921
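
The table above can be checked programmatically. The following sketch copies the logged rows into Python and picks out the checkpoint with the highest BLEU and the one with the lowest validation loss. The names `Row`, `ROWS`, `best_bleu`, `best_val_loss`, and `steps_per_epoch` are illustrative and not part of the training code.

```python
from collections import namedtuple

# Column order follows the table: Training Loss, Epoch, Step, Bleu, Validation Loss.
Row = namedtuple("Row", "train_loss epoch step bleu val_loss")

ROWS = [
    Row(1.5787, 0.3962, 250, 11.6756, 1.5196),
    Row(1.3535, 0.7924, 500, 16.0514, 1.3470),
    Row(1.0658, 1.1886, 750, 17.7743, 1.2533),
    Row(1.0303, 1.5848, 1000, 19.1894, 1.2046),
    Row(0.9893, 1.9810, 1250, 20.1198, 1.1591),
    Row(0.7569, 2.3772, 1500, 21.0054, 1.1546),
    Row(0.7571, 2.7734, 1750, 21.6425, 1.1378),
    Row(0.5557, 3.1696, 2000, 21.7563, 1.1500),
    Row(0.5612, 3.5658, 2250, 21.1391, 1.1395),
    Row(0.5581, 3.9620, 2500, 22.0412, 1.1343),
    Row(0.4144, 4.3582, 2750, 22.2850, 1.1724),
    Row(0.4114, 4.7544, 3000, 22.1925, 1.1681),
    Row(0.3005, 5.1506, 3250, 21.4948, 1.1947),
    Row(0.2945, 5.5468, 3500, 22.1454, 1.1921),
]

# Checkpoint with the highest BLEU score.
best_bleu = max(ROWS, key=lambda r: r.bleu)
# Checkpoint with the lowest validation loss.
best_val_loss = min(ROWS, key=lambda r: r.val_loss)
# Rough optimizer steps per epoch, estimated from the first logged row.
steps_per_epoch = round(ROWS[0].step / ROWS[0].epoch)

print(f"best BLEU: {best_bleu.bleu} at step {best_bleu.step}")
print(f"lowest validation loss: {best_val_loss.val_loss} at step {best_val_loss.step}")
print(f"approx. steps per epoch: {steps_per_epoch}")
```

The highest BLEU (22.2850, step 2750) matches the reported evaluation results, while the lowest validation loss (1.1343) occurs earlier, at step 2500. One possible reading is that the model begins to overfit after roughly four epochs, since training loss keeps falling while validation loss drifts upward.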

Framework versions

Transformers 4.47.1
PyTorch 2.4.0+cu124
Datasets 3.2.0
Tokenizers 0.21.0