hubert-large-ls960-ft-lg-CV-v1

This model is a fine-tuned version of facebook/hubert-large-ls960-ft on the common_voice_17_0 dataset. It achieves the following results on the evaluation set:

Loss: 0.6251
WER (word error rate): 0.2041
CER (character error rate): 0.0609
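
For readers unfamiliar with the two error metrics, the sketch below shows one common way to compute them: the edit distance between reference and hypothesis, divided by the reference length, counted in words for WER and in characters for CER. This card does not state which implementation produced the numbers above, so the helper names `edit_distance`, `wer`, and `cer` are illustrative assumptions, not the evaluation code actually used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for one utterance: word-level edits / reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate for one utterance: character-level edits / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical transcript pair, not taken from the evaluation set.
example_wer = wer("the cat sat on the mat", "the cat sat on mat")
example_cer = cer("the cat sat on the mat", "the cat sat on mat")
```

Corpus-level scores are usually obtained by summing edits and reference lengths over all utterances before dividing, rather than by averaging per-utterance rates.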

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 8
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
num_epochs: 100
mixed_precision_training: Native AMP
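
The relationship between these values can be checked with a little arithmetic. The sketch below reproduces the total train batch size and illustrates how a linear schedule decays the learning rate over the run. It assumes a single training device and zero warmup steps (no warmup setting is listed above), and it takes the figure of 4442 optimizer steps per epoch from the training results table; `linear_lr` is a hypothetical helper written for illustration, not the scheduler code used in training.

```python
train_batch_size = 8
gradient_accumulation_steps = 2
num_devices = 1  # assumption: consistent with total_train_batch_size = 16

# Number of examples contributing to each optimizer update.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

learning_rate = 3e-4
steps_per_epoch = 4442  # from the training results table (step count at epoch 1.0)
num_epochs = 100
total_steps = steps_per_epoch * num_epochs


def linear_lr(step, base_lr=learning_rate, total=total_steps, warmup=0):
    """Linear warmup followed by linear decay to zero (warmup=0 is an assumption)."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


# Learning rate at the start, the midpoint, and the end of training.
lr_start = linear_lr(0)
lr_mid = linear_lr(total_steps // 2)
lr_end = linear_lr(total_steps)
```

Under these assumptions the learning rate falls from 0.0003 at step 0 to half that value at step 222100 and reaches zero at step 444200.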

Training results

Training Loss	Epoch	Step	Validation Loss	WER	CER
0.5741	1.0	4442	0.4573	0.4271	0.1144
0.3189	2.0	8884	0.3821	0.3444	0.0932
0.2692	3.0	13326	0.3881	0.3310	0.0912
0.2405	4.0	17768	0.3453	0.3136	0.0854
0.2191	5.0	22210	0.3476	0.2931	0.0823
0.2039	6.0	26652	0.3841	0.2880	0.0825
0.1913	7.0	31094	0.3532	0.2869	0.0798
0.18	8.0	35536	0.3727	0.2849	0.0823
0.1708	9.0	39978	0.3410	0.2773	0.0785
0.1624	10.0	44420	0.3604	0.2705	0.0794
0.1552	11.0	48862	0.3589	0.2661	0.0765
0.1485	12.0	53304	0.3614	0.2687	0.0770
0.1418	13.0	57746	0.3500	0.2637	0.0762
0.1358	14.0	62188	0.3713	0.2628	0.0766
0.131	15.0	66630	0.3908	0.2603	0.0758
0.1255	16.0	71072	0.4089	0.2608	0.0758
0.1205	17.0	75514	0.3848	0.2595	0.0742
0.1162	18.0	79956	0.3554	0.2594	0.0739
0.1125	19.0	84398	0.3461	0.2593	0.0742
0.1073	20.0	88840	0.3663	0.2545	0.0729
0.1039	21.0	93282	0.4556	0.2578	0.0743
0.1	22.0	97724	0.4258	0.2504	0.0724
0.0965	23.0	102166	0.4246	0.2545	0.0754
0.0931	24.0	106608	0.4570	0.2603	0.0757
0.0894	25.0	111050	0.4039	0.2488	0.0732
0.0865	26.0	115492	0.4119	0.2510	0.0720
0.083	27.0	119934	0.4227	0.2454	0.0716
0.0805	28.0	124376	0.4424	0.2541	0.0728
0.0777	29.0	128818	0.4061	0.2457	0.0709
0.0762	30.0	133260	0.4114	0.2450	0.0704
0.0724	31.0	137702	0.4599	0.2516	0.0719
0.0711	32.0	142144	0.4311	0.2466	0.0714
0.069	33.0	146586	0.4517	0.2482	0.0717
0.0673	34.0	151028	0.4728	0.2467	0.0712
0.0655	35.0	155470	0.4542	0.2437	0.0713
0.0634	36.0	159912	0.4546	0.2480	0.0713
0.0612	37.0	164354	0.4852	0.2479	0.0718
0.0607	38.0	168796	0.4892	0.2433	0.0705
0.0585	39.0	173238	0.4686	0.2416	0.0702
0.0573	40.0	177680	0.4725	0.2412	0.0710
0.0556	41.0	182122	0.4737	0.2385	0.0696
0.0548	42.0	186564	0.4964	0.2448	0.0704
0.0527	43.0	191006	0.5236	0.2429	0.0706
0.052	44.0	195448	0.5130	0.2415	0.0714
0.0503	45.0	199890	0.4936	0.2375	0.0688
0.0496	46.0	204332	0.5120	0.2336	0.0680
0.048	47.0	208774	0.4964	0.2362	0.0694
0.0473	48.0	213216	0.5200	0.2372	0.0687
0.0465	49.0	217658	0.5433	0.2424	0.0708
0.0447	50.0	222100	0.5008	0.2335	0.0680
0.0444	51.0	226542	0.5024	0.2247	0.0668
0.0431	52.0	230984	0.5003	0.2307	0.0669
0.0423	53.0	235426	0.4892	0.2331	0.0676
0.0403	54.0	239868	0.5495	0.2316	0.0679
0.0406	55.0	244310	0.5193	0.2278	0.0661
0.0391	56.0	248752	0.5961	0.2331	0.0687
0.0389	57.0	253194	0.5227	0.2297	0.0667
0.0379	58.0	257636	0.5506	0.2295	0.0672
0.0366	59.0	262078	0.5725	0.2231	0.0673
0.0357	60.0	266520	0.5493	0.2280	0.0662
0.0357	61.0	270962	0.5355	0.2269	0.0656
0.035	62.0	275404	0.5430	0.2226	0.0653
0.0343	63.0	279846	0.5375	0.2211	0.0644
0.0334	64.0	284288	0.5769	0.2248	0.0668
0.0333	65.0	288730	0.5763	0.2183	0.0642
0.0322	66.0	293172	0.5787	0.2190	0.0653
0.0314	67.0	297614	0.5564	0.2207	0.0642
0.0305	68.0	302056	0.5813	0.2208	0.0666
0.03	69.0	306498	0.5837	0.2217	0.0647
0.0292	70.0	310940	0.5723	0.2238	0.0649
0.0284	71.0	315382	0.5503	0.2218	0.0645
0.0285	72.0	319824	0.5615	0.2187	0.0636
0.0276	73.0	324266	0.5725	0.2178	0.0650
0.0273	74.0	328708	0.5483	0.2187	0.0634
0.027	75.0	333150	0.5627	0.2148	0.0632
0.026	76.0	337592	0.5610	0.2203	0.0655
0.0253	77.0	342034	0.5776	0.2153	0.0635
0.0248	78.0	346476	0.5823	0.2173	0.0643
0.0242	79.0	350918	0.5968	0.2172	0.0639
0.0241	80.0	355360	0.6121	0.2185	0.0647
0.0232	81.0	359802	0.5909	0.2140	0.0648
0.0227	82.0	364244	0.6262	0.2209	0.0663
0.0224	83.0	368686	0.5913	0.2137	0.0645
0.0215	84.0	373128	0.6057	0.2141	0.0642
0.0212	85.0	377570	0.6079	0.2135	0.0635
0.0209	86.0	382012	0.6067	0.2117	0.0639
0.0201	87.0	386454	0.6119	0.2108	0.0638
0.0199	88.0	390896	0.6298	0.2112	0.0638
0.0194	89.0	395338	0.6054	0.2083	0.0620
0.0192	90.0	399780	0.6238	0.2083	0.0634
0.0184	91.0	404222	0.6293	0.2099	0.0630
0.0184	92.0	408664	0.6166	0.2058	0.0611
0.0182	93.0	413106	0.6175	0.2072	0.0618
0.0179	94.0	417548	0.6196	0.2061	0.0610
0.0176	95.0	421990	0.6181	0.2059	0.0614
0.0174	96.0	426432	0.6187	0.2039	0.0606
0.0167	97.0	430874	0.6381	0.2064	0.0615
0.017	98.0	435316	0.6268	0.2049	0.0611
0.0165	99.0	439758	0.6262	0.2041	0.0610
0.0166	100.0	444200	0.6251	0.2041	0.0609
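
In the table above, validation loss and error rates do not bottom out at the same point: the lowest validation loss is at epoch 9 (0.3410), while the lowest WER is at epoch 96 (0.2039), and the headline results correspond to the final epoch. The sketch below, using a handful of rows copied from the table, shows how the choice of selection metric changes which epoch looks best. The card does not say how the published checkpoint was chosen, so this is only an illustration.

```python
# (epoch, validation_loss, wer, cer): selected rows copied from the table above.
rows = [
    (1, 0.4573, 0.4271, 0.1144),
    (9, 0.3410, 0.2773, 0.0785),
    (50, 0.5008, 0.2335, 0.0680),
    (96, 0.6187, 0.2039, 0.0606),
    (100, 0.6251, 0.2041, 0.0609),
]

# Pick the best epoch under each criterion (lower is better for all three).
best_by_loss = min(rows, key=lambda r: r[1])
best_by_wer = min(rows, key=lambda r: r[2])
best_by_cer = min(rows, key=lambda r: r[3])

for name, row in [("loss", best_by_loss), ("WER", best_by_wer), ("CER", best_by_cer)]:
    print(f"best by {name}: epoch {row[0]} (loss={row[1]}, WER={row[2]}, CER={row[3]})")
```

Rising validation loss alongside falling WER is a pattern often seen in long CTC fine-tuning runs, so the error rate is usually the more informative criterion for a speech recognition model.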

Framework versions

Transformers 4.46.3
Pytorch 2.1.0+cu118
Datasets 3.1.0
Tokenizers 0.20.3

Model repository: sulaimank/hubert-large-ls960-ft-lg-CV-v1
