Whisper Small GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-small on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets, together with a copy of each dataset processed with noise reduction and normalization (applied to both the train and test splits). It achieves the following results on the evaluation set:

  • Loss: 1.3339
  • Bleu: 30.66
  • Chrf: 46.99
  • Wer: 65.4660
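
For reference, metrics of this kind can be computed with the Hugging Face `evaluate` library. The snippet below is only a sketch with placeholder predictions and references, not outputs of this model:

```python
import evaluate

# Placeholder data for illustration; not actual model outputs.
predictions = ["the weather is fine today"]
references = [["the weather is good today"]]

bleu = evaluate.load("sacrebleu")  # corpus-level BLEU
chrf = evaluate.load("chrf")       # character n-gram F-score
wer = evaluate.load("wer")         # word error rate

print(bleu.compute(predictions=predictions, references=references)["score"])
print(chrf.compute(predictions=predictions, references=references)["score"])
# WER expects flat string references rather than lists of alternatives.
print(wer.compute(predictions=predictions, references=[r[0] for r in references]))
```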

Model description

This model is a fine-tuned version of openai/whisper-small for translating Irish (GA) speech into English (EN) text.

Intended uses & limitations

The model is intended for Irish-to-English (GA→EN) speech translation. More information is needed on its limitations.
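
A minimal inference sketch using the `transformers` speech pipeline; `audio.wav` is a placeholder for a local Irish audio file:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as a speech pipeline.
pipe = pipeline(
    "automatic-speech-recognition",
    model="ymoslem/whisper-small-ga2en-v4",
)

# Whisper emits English output when generation runs with task="translate".
result = pipe("audio.wav", generate_kwargs={"task": "translate"})
print(result["text"])
```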

Training and evaluation data

The model was trained and evaluated on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets, along with copies of the same data processed with noise reduction and normalization (both the train and test splits).

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 0.01
  • training_steps: 3000
  • mixed_precision_training: Native AMP
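
As a rough sketch, these settings could be expressed with `Seq2SeqTrainingArguments` from `transformers` as below. The `output_dir` is hypothetical, and the 0.01 warmup value is interpreted here as a warmup ratio (about 30 of the 3000 steps), since `warmup_steps` takes an integer. The Adam betas and epsilon match the Trainer defaults, so they are not set explicitly:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-small-ga2en-v4",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.01,  # the card lists 0.01 for warmup; treated as a ratio here
    max_steps=3000,
    fp16=True,          # "Native AMP" mixed-precision training
)
```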

Training results

| Training Loss | Epoch | Step | Bleu  | Chrf  | Validation Loss | Wer      |
|:-------------:|:-----:|:----:|:-----:|:-----:|:---------------:|:--------:|
| 1.41          | 0.07  | 100  | 9.78  | 25.23 | 1.8782          | 96.3980  |
| 1.2436        | 0.13  | 200  | 10.23 | 28.66 | 1.8301          | 125.9343 |
| 1.593         | 0.2   | 300  | 9.53  | 30.7  | 1.7066          | 137.1454 |
| 1.9589        | 0.26  | 400  | 12.08 | 32.94 | 1.5629          | 109.3652 |
| 1.8174        | 0.33  | 500  | 13.73 | 34.5  | 1.5154          | 123.5930 |
| 1.6775        | 0.39  | 600  | 15.8  | 35.68 | 1.5220          | 102.2062 |
| 1.7074        | 0.46  | 700  | 16.62 | 37.96 | 1.4570          | 100.5853 |
| 1.5793        | 0.53  | 800  | 24.5  | 39.91 | 1.4265          | 71.3643  |
| 1.3708        | 0.59  | 900  | 24.35 | 42.26 | 1.3845          | 73.7956  |
| 1.3217        | 0.66  | 1000 | 19.34 | 41.3  | 1.3662          | 87.7533  |
| 1.2572        | 0.72  | 1100 | 21.59 | 41.35 | 1.3529          | 88.4286  |
| 1.1447        | 0.79  | 1200 | 28.39 | 44.99 | 1.3228          | 65.9163  |
| 1.1544        | 0.85  | 1300 | 23.69 | 43.07 | 1.2972          | 80.1891  |
| 1.0291        | 0.92  | 1400 | 29.36 | 45.45 | 1.2828          | 70.9590  |
| 0.9394        | 0.98  | 1500 | 26.44 | 44.0  | 1.2812          | 74.1558  |
| 0.3764        | 1.05  | 1600 | 26.95 | 44.82 | 1.3248          | 73.8406  |
| 0.3338        | 1.12  | 1700 | 26.5  | 44.96 | 1.3212          | 77.3976  |
| 0.3148        | 1.18  | 1800 | 29.57 | 46.31 | 1.3188          | 66.7267  |
| 0.3206        | 1.25  | 1900 | 30.87 | 47.21 | 1.3050          | 64.4755  |
| 0.3069        | 1.31  | 2000 | 30.15 | 46.19 | 1.3053          | 65.6911  |
| 0.3342        | 1.38  | 2100 | 24.14 | 44.12 | 1.3506          | 77.2625  |
| 0.3125        | 1.44  | 2200 | 30.21 | 46.08 | 1.3369          | 63.9802  |
| 0.319         | 1.51  | 2300 | 27.71 | 45.45 | 1.3601          | 69.9235  |
| 0.3067        | 1.58  | 2400 | 26.92 | 45.73 | 1.3473          | 69.3381  |
| 0.2621        | 1.64  | 2500 | 28.36 | 46.14 | 1.3354          | 66.9068  |
| 0.2709        | 1.71  | 2600 | 28.75 | 45.47 | 1.3339          | 65.2859  |
| 0.2644        | 1.77  | 2700 | 28.84 | 47.35 | 1.3100          | 65.8262  |
| 0.2511        | 1.84  | 2800 | 29.41 | 47.31 | 1.3261          | 69.4732  |
| 0.2232        | 1.9   | 2900 | 30.79 | 46.63 | 1.3382          | 64.1153  |
| 0.236         | 1.97  | 3000 | 30.66 | 46.99 | 1.3339          | 65.4660  |

Framework versions

  • Transformers 4.39.3
  • PyTorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2