Whisper Medium GA-EN Speech Translation, 1 epoch, 10k steps

This model is a fine-tuned version of openai/whisper-medium for Irish-to-English (GA-EN) speech translation, trained on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets. After 10,000 training steps it achieves the following results on the evaluation set:

Loss: 1.3134
Bleu: 36.12
Chrf: 53.74
Wer: 58.3071
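
The Wer figure is a percentage, and in the training log it exceeds 100 at some early checkpoints (for example 175.5516 at step 100). That is possible because inserted words count as errors, so a hypothesis much longer than the reference can accumulate more errors than the reference has words. The sketch below illustrates the standard word-level edit-distance definition of WER. This card does not say which tool or text normalization produced the reported scores, so treat this as an illustration of the metric, not as the scoring script that was used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One substitution in a three-word reference.
print(word_error_rate("the cat sat", "the dog sat"))
# Five inserted words against a two-word reference: WER above 100.
print(word_error_rate("good morning", "good morning to you all my friends"))
```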

Model description

More information needed

Intended uses & limitations

More information needed
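
Until this section is filled in, the following is a minimal sketch of how the checkpoint might be loaded for inference with the Transformers `pipeline` API. The repository name is taken from this page. The helper name `translate_irish_speech` and the audio path in the usage comment are hypothetical, and the card does not document any required generation settings, so none are passed here.

```python
MODEL_ID = "ymoslem/whisper-medium-ga2en-v5.2.1-r"  # repository name from this page


def translate_irish_speech(audio_path: str) -> str:
    """Return the English text the model produces for one Irish audio file.

    Hypothetical helper for illustration; it is not part of the model repository.
    """
    # Imported lazily so that defining this helper does not load Transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path)  # decoding a file path requires ffmpeg to be installed
    return result["text"]


# Example call (assumes a local Irish-language recording exists at this path):
# print(translate_irish_speech("sample_ga.wav"))
```

Building the pipeline inside the function keeps the sketch short. For repeated calls it would be cheaper to construct the pipeline once and reuse it.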

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.02
training_steps: 10000
mixed_precision_training: Native AMP
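
To make the schedule settings concrete, here is a small sketch of what a linear scheduler with a warmup ratio of 0.02 implies over 10,000 steps: the learning rate rises linearly to 0.0001 during the first 200 steps, then decays linearly to zero. The function names are illustrative, and rounding the warmup length up to a whole step is an assumption. This mirrors the general shape of a linear-with-warmup schedule and is not the training code used for this model.

```python
import math

LEARNING_RATE = 1e-4     # learning_rate
TRAINING_STEPS = 10_000  # training_steps
WARMUP_RATIO = 0.02      # lr_scheduler_warmup_ratio


def warmup_steps(total_steps: int = TRAINING_STEPS, ratio: float = WARMUP_RATIO) -> int:
    """Number of warmup steps implied by the ratio (rounded up; an assumption)."""
    return math.ceil(total_steps * ratio)


def linear_lr(step: int,
              peak_lr: float = LEARNING_RATE,
              total_steps: int = TRAINING_STEPS,
              ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup = warmup_steps(total_steps, ratio)
    if step < warmup:
        # Warmup phase: scale from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup)
    # Decay phase: scale from the peak down to 0 at the final step.
    return peak_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup))


for step in (0, 100, 200, 5_100, 10_000):
    print(step, linear_lr(step))
```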

Training results

Training Loss	Epoch	Step	Bleu	Chrf	Validation Loss	Wer
2.6291	0.0109	100	2.33	16.34	2.1971	175.5516
2.6591	0.0219	200	5.57	22.49	2.0357	122.2873
2.5637	0.0328	300	7.67	26.29	1.8690	133.0032
2.2954	0.0438	400	11.2	30.03	1.8062	114.2278
2.3292	0.0547	500	9.85	29.28	1.7421	117.2895
2.1223	0.0657	600	14.56	32.56	1.6739	84.2864
2.2398	0.0766	700	13.86	34.74	1.7187	98.9644
2.002	0.0876	800	15.53	36.64	1.6392	96.7582
1.8611	0.0985	900	15.8	36.32	1.6283	94.3719
1.8498	0.1095	1000	17.58	36.0	1.6102	85.5921
1.7585	0.1204	1100	15.91	36.61	1.6337	100.2251
1.6115	0.1314	1200	22.21	39.94	1.5381	76.8122
1.4415	0.1423	1300	20.36	37.87	1.5864	79.1986
1.5103	0.1533	1400	23.2	41.26	1.4925	75.2364
1.6576	0.1642	1500	18.12	40.49	1.4508	102.9266
1.3429	0.1752	1600	27.88	43.74	1.4399	69.7884
1.2522	0.1861	1700	23.04	43.31	1.4256	77.1724
1.2018	0.1970	1800	21.06	40.39	1.4072	78.6583
1.1945	0.2080	1900	23.0	42.71	1.4222	76.7222
1.1869	0.2189	2000	22.54	42.02	1.3992	75.8667
1.1752	0.2299	2100	20.81	41.07	1.3926	79.5137
1.0281	0.2408	2200	27.24	45.55	1.3633	69.6083
0.894	0.2518	2300	28.6	45.58	1.3287	65.8712
0.9788	0.2627	2400	27.75	46.21	1.3138	69.2931
0.8418	0.2737	2500	27.85	46.17	1.3064	68.3026
0.7559	0.2846	2600	28.44	48.52	1.2903	68.3476
0.8632	0.2956	2700	27.87	46.86	1.2834	68.3476
0.7501	0.3065	2800	28.63	49.25	1.2669	68.5277
0.6953	0.3175	2900	30.46	48.83	1.2615	64.4304
0.7195	0.3284	3000	27.49	47.94	1.2514	71.0941
0.6155	0.3394	3100	30.06	49.64	1.2428	66.5916
0.605	0.3503	3200	31.64	50.27	1.2040	63.8451
0.6349	0.3612	3300	28.96	49.35	1.2077	65.3760
0.4669	0.3722	3400	31.17	48.95	1.2219	64.2503
0.5196	0.3831	3500	30.97	50.13	1.2124	63.8001
0.5141	0.3941	3600	31.97	50.8	1.2026	63.0347
0.4221	0.4050	3700	31.76	51.35	1.1893	63.4399
0.2951	0.4160	3800	32.4	51.08	1.2049	63.1247
0.3898	0.4269	3900	32.15	51.09	1.1906	63.5299
0.4071	0.4379	4000	33.1	51.85	1.1873	62.4043
0.3975	0.4488	4100	29.58	49.33	1.2117	70.3287
0.4206	0.4598	4200	31.69	50.8	1.2150	65.0158
0.2935	0.4707	4300	32.9	50.01	1.2484	62.8546
0.3718	0.4817	4400	31.64	50.55	1.2055	63.8451
0.3722	0.4926	4500	28.16	49.28	1.2200	70.4638
0.2986	0.5036	4600	28.76	49.9	1.2240	68.7528
0.3327	0.5145	4700	29.34	49.67	1.2052	67.5822
0.2489	0.5255	4800	32.52	51.77	1.2083	62.4493
0.3653	0.5364	4900	31.48	51.16	1.2166	63.8451
0.3326	0.5473	5000	33.04	51.71	1.2169	62.4493
0.3045	0.5583	5100	27.45	48.22	1.2460	68.9779
0.3444	0.5692	5200	33.14	50.76	1.2829	62.2692
0.3236	0.5802	5300	28.89	49.37	1.2499	70.3737
0.3004	0.5911	5400	29.89	49.29	1.3165	68.7078
0.3019	0.6021	5500	32.8	49.78	1.2782	62.8095
0.2923	0.6130	5600	31.75	50.26	1.2468	63.3498
0.3237	0.6240	5700	34.4	52.59	1.2511	61.0986
0.2226	0.6349	5800	30.51	50.38	1.2479	63.3498
0.2207	0.6459	5900	32.68	51.97	1.2641	62.1342
0.2017	0.6568	6000	32.47	51.36	1.2640	62.6745
0.201	0.6678	6100	33.6	52.29	1.2774	61.4588
0.203	0.6787	6200	30.27	50.84	1.2670	65.6461
0.1456	0.6897	6300	31.2	51.05	1.2656	63.3048
0.1607	0.7006	6400	30.39	51.04	1.2611	65.8262
0.1933	0.7115	6500	31.78	50.92	1.2545	63.0797
0.1537	0.7225	6600	30.18	50.18	1.2500	64.7006
0.1279	0.7334	6700	33.23	51.0	1.2548	59.8379
0.1189	0.7444	6800	33.51	50.67	1.2594	61.1887
0.1056	0.7553	6900	32.97	51.02	1.2578	61.9991
0.1105	0.7663	7000	32.74	50.83	1.2569	62.0441
0.1183	0.7772	7100	34.07	52.2	1.2590	60.4232
0.1373	0.7882	7200	33.55	50.6	1.2430	61.2787
0.1325	0.7991	7300	32.36	50.39	1.2548	62.3143
0.0907	0.8101	7400	32.28	50.99	1.2578	61.2787
0.0919	0.8210	7500	33.01	51.81	1.2791	60.4683
0.0852	0.8320	7600	32.97	51.56	1.2782	61.5489
0.1223	0.8429	7700	33.57	52.33	1.2638	59.9280
0.0826	0.8539	7800	33.83	52.7	1.2634	60.1531
0.0783	0.8648	7900	33.79	52.31	1.2595	60.1081
0.0986	0.8758	8000	34.33	52.54	1.2608	59.4327
0.1148	0.8867	8100	34.03	52.52	1.2736	59.8829
0.1134	0.8976	8200	34.14	51.64	1.3073	61.5038
0.1166	0.9086	8300	30.51	49.26	1.3385	65.5561
0.0871	0.9195	8400	32.31	51.06	1.3313	62.5394
0.0927	0.9305	8500	28.64	48.43	1.3898	69.3832
0.1012	0.9414	8600	33.12	52.02	1.3144	61.4138
0.0742	0.9524	8700	33.68	51.38	1.3284	61.7740
0.0802	0.9633	8800	34.33	51.38	1.3300	61.4138
0.0799	0.9743	8900	33.72	50.77	1.3328	60.1981
0.0936	0.9852	9000	34.76	51.4	1.3181	60.0630
0.1091	0.9962	9100	35.13	52.6	1.3096	59.9730
0.0427	1.0071	9200	35.49	53.12	1.2905	59.8379
0.0338	1.0181	9300	35.33	52.62	1.3097	60.5133
0.0363	1.0290	9400	35.51	53.06	1.3172	59.6128
0.0319	1.0400	9500	36.82	53.6	1.3166	58.3971
0.0434	1.0509	9600	35.62	53.28	1.3050	59.6578
0.0218	1.0619	9700	35.57	53.28	1.3096	59.5227
0.0316	1.0728	9800	36.14	53.87	1.3162	58.3971
0.0315	1.0837	9900	36.26	54.16	1.3121	58.3521
0.0229	1.0947	10000	36.12	53.74	1.3134	58.3071
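
The log shows that the metrics do not all peak at the same checkpoint: validation loss is lowest at step 4000 (1.1873), BLEU is highest at step 9500 (36.82), chrF is highest at step 9900 (54.16), and WER is lowest at the final step (58.3071). The sketch below shows one way to pick a checkpoint per metric from such a log. It includes only a handful of rows copied from the table, and the `Checkpoint` type and `best_by` helper are illustrative names, not part of any training script for this model.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    step: int
    bleu: float
    chrf: float
    val_loss: float
    wer: float


# A few rows copied from the training log above (not the full table).
CHECKPOINTS = [
    Checkpoint(4000, 33.10, 51.85, 1.1873, 62.4043),
    Checkpoint(9500, 36.82, 53.60, 1.3166, 58.3971),
    Checkpoint(9800, 36.14, 53.87, 1.3162, 58.3971),
    Checkpoint(9900, 36.26, 54.16, 1.3121, 58.3521),
    Checkpoint(10000, 36.12, 53.74, 1.3134, 58.3071),
]


def best_by(metric: str, higher_is_better: bool) -> Checkpoint:
    """Return the checkpoint with the best value for the named metric."""
    pick = max if higher_is_better else min
    return pick(CHECKPOINTS, key=lambda ckpt: getattr(ckpt, metric))


print("best BLEU:", best_by("bleu", higher_is_better=True).step)
print("best chrF:", best_by("chrf", higher_is_better=True).step)
print("lowest validation loss:", best_by("val_loss", higher_is_better=False).step)
print("lowest WER:", best_by("wer", higher_is_better=False).step)
```

Which metric should drive checkpoint selection depends on the intended use. The results reported at the top of this card are those of the final step, 10000.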

Framework versions

Transformers 4.41.2
Pytorch 2.2.0+cu121
Datasets 2.19.2
Tokenizers 0.19.1

ymoslem/whisper-medium-ga2en-v5.2.1-r
