Whisper Medium GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-medium for Irish-to-English (GA-EN) speech translation, trained on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop datasets. It achieves the following results on the evaluation set:

Loss: 1.0491
Bleu: 27.38
Chrf: 51.97
Wer: 72.1747
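
The WER values in this card appear to be percentages, and early checkpoints in the training log exceed 100. That is possible because inserted words count as errors while the denominator is the reference length only. The card does not say which implementation produced these scores, so the following is only a minimal sketch of the standard word-level definition, with a hypothetical function name and no text normalization:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference length, as a percentage."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Four inserted words against a two-word reference give a WER above 100.
print(word_error_rate("the cat", "the cat sat on the mat"))
```

A hypothesis much longer than its reference, which is common when an under-trained model repeats itself, is one way such scores arise.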

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.02
training_steps: 4000
mixed_precision_training: Native AMP
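
As a rough illustration of what the scheduler settings above imply, the sketch below computes the learning rate at a given step for a linear schedule with warmup. It assumes the warmup length is the warmup ratio times the total step count, rounded up (80 steps here), and that the rate decays linearly to zero at the final step. The names `lr_at`, `PEAK_LR`, `TOTAL_STEPS`, and `WARMUP_STEPS` are illustrative, not taken from the training script.

```python
import math

PEAK_LR = 1e-4       # learning_rate
TOTAL_STEPS = 4000   # training_steps
WARMUP_RATIO = 0.02  # lr_scheduler_warmup_ratio

# Assumption: warmup steps = ceil(total steps * warmup ratio), i.e. 80 here.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay from the peak rate down to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 40, 80, 2040, 4000):
    print(step, lr_at(step))
```

Under these assumptions the rate peaks at step 80 and is back to half the peak at step 2040, the midpoint of the decay phase.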

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Chrf	Wer
2.6534	0.0138	100	2.2446	1.43	15.99	269.1130
2.4519	0.0276	200	2.1941	2.13	18.36	250.5178
2.2928	0.0414	300	2.0086	7.14	25.95	128.3656
2.233	0.0552	400	2.0239	5.61	24.25	134.0837
2.0406	0.0690	500	1.9215	5.64	25.65	183.8361
2.0273	0.0828	600	1.8556	13.41	30.96	83.7010
1.895	0.0966	700	1.8278	7.02	26.82	158.2170
1.9889	0.1103	800	1.7842	12.22	31.62	99.6398
1.8484	0.1241	900	1.7648	10.97	30.45	91.1751
1.7491	0.1379	1000	1.7498	10.0	29.42	109.0050
1.699	0.1517	1100	1.6662	12.53	34.87	109.9054
1.6959	0.1655	1200	1.6287	14.54	34.8	92.3008
1.6682	0.1793	1300	1.5800	13.26	33.5	103.0617
1.6625	0.1931	1400	1.6115	19.71	37.33	75.9118
1.5462	0.2069	1500	1.4993	18.3	39.49	93.7866
1.3834	0.2207	1600	1.4906	20.32	40.87	79.2436
1.39	0.2345	1700	1.4752	17.3	38.16	93.1562
1.5061	0.2483	1800	1.4004	20.11	39.69	81.0446
1.4125	0.2621	1900	1.3854	23.82	42.67	73.3904
1.3181	0.2759	2000	1.3979	20.57	40.87	78.8384
1.283	0.2897	2100	1.3446	17.97	40.47	88.8789
1.2061	0.3034	2200	1.3130	25.12	45.42	73.5254
1.2091	0.3172	2300	1.3274	22.12	43.56	79.8739
1.1264	0.3310	2400	1.2771	22.94	45.96	78.2080
1.0972	0.3448	2500	1.2858	24.38	46.04	75.4615
1.0822	0.3586	2600	1.2376	27.39	48.34	67.6722
1.0316	0.3724	2700	1.2461	28.0	47.61	68.5277
1.165	0.3862	2800	1.1869	26.05	48.13	71.6794
1.025	0.4	2900	1.1716	27.14	47.91	68.7528
0.8978	0.4138	3000	1.1628	28.34	49.15	65.6461
0.9146	0.4276	3100	1.1703	25.81	48.42	71.7244
0.9764	0.4414	3200	1.1526	29.63	51.22	67.3570
0.9455	0.4552	3300	1.1108	25.31	49.73	72.6249
0.9073	0.4690	3400	1.1085	27.7	50.85	72.7150
0.8596	0.4828	3500	1.0927	28.34	52.39	67.9424
0.8241	0.4966	3600	1.1026	29.95	51.37	65.2859
0.8436	0.5103	3700	1.0718	27.18	51.45	71.2292
0.8318	0.5241	3800	1.0678	30.71	53.35	64.3404
0.8262	0.5379	3900	1.0534	27.05	51.94	71.5894
0.8129	0.5517	4000	1.0491	27.38	51.97	72.1747
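
The headline results at the top of this card match the final row (step 4000), which has the lowest validation loss but not the best translation scores. A small sketch of how the log could be scanned for the best checkpoint under each metric follows. Only the last six rows are copied in to keep it short, and the tuple layout and variable names are illustrative.

```python
# (step, validation_loss, bleu, chrf, wer), copied from the last six log rows.
ROWS = [
    (3500, 1.0927, 28.34, 52.39, 67.9424),
    (3600, 1.1026, 29.95, 51.37, 65.2859),
    (3700, 1.0718, 27.18, 51.45, 71.2292),
    (3800, 1.0678, 30.71, 53.35, 64.3404),
    (3900, 1.0534, 27.05, 51.94, 71.5894),
    (4000, 1.0491, 27.38, 51.97, 72.1747),
]

# Lower is better for loss and WER; higher is better for BLEU and chrF.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_bleu = max(ROWS, key=lambda row: row[2])
best_by_chrf = max(ROWS, key=lambda row: row[3])
best_by_wer = min(ROWS, key=lambda row: row[4])

print("lowest validation loss at step", best_by_loss[0])
print("highest BLEU at step", best_by_bleu[0])
print("highest chrF at step", best_by_chrf[0])
print("lowest WER at step", best_by_wer[0])
```

Across the full table, step 3800 has the best BLEU (30.71), chrF (53.35), and WER (64.3404), while step 4000 has the lowest validation loss (1.0491).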

Framework versions

Transformers 4.41.2
Pytorch 2.2.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1

ymoslem/whisper-medium-ga2en-v6.2-r
