hubert-large-ls960-ft-lg-CV_GRAIN-v1

This model is a fine-tuned version of facebook/hubert-large-ls960-ft (the fine-tuning dataset is not named in this card). It achieves the following results on the evaluation set:

  • Loss: 0.1921
  • WER: 0.0389
  • CER: 0.0143
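
As a usage illustration, the model can be loaded for transcription through the transformers pipeline API. This is a minimal sketch only: the repo id is assumed from this card's title and the audio path is a placeholder, so adjust both to your setup:

```python
from transformers import pipeline

# Repo id assumed from the card title; replace with the actual Hub path if it differs.
asr = pipeline(
    "automatic-speech-recognition",
    model="sulaimank/hubert-large-ls960-ft-lg-CV_GRAIN-v1",
)

# Placeholder file: a 16 kHz mono WAV clip.
result = asr("sample.wav")
print(result["text"])
```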

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch of these settings follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
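
These settings correspond to the standard transformers Trainer setup. Below is a minimal sketch of how they might map onto transformers.TrainingArguments; output_dir is a placeholder, and the exact call used for this run is not given in the card:

```python
from transformers import TrainingArguments

# Sketch only: reproduces the hyperparameters listed above.
# output_dir is a placeholder, not the actual path used for this run.
training_args = TrainingArguments(
    output_dir="hubert-large-ls960-ft-lg-CV_GRAIN-v1",
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 8 * 2 = 16
    optim="adamw_torch",            # AdamW with betas=(0.9, 0.999), epsilon=1e-08
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                      # Native AMP mixed-precision training
)
```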

Training results

| Training Loss | Epoch | Step   | Validation Loss | WER    | CER    |
|:-------------:|:-----:|:------:|:---------------:|:------:|:------:|
| 1.1234        | 1.0   | 1385   | 0.3733          | 0.4333 | 0.0910 |
| 0.5164        | 2.0   | 2770   | 0.2676          | 0.2680 | 0.0634 |
| 0.4223        | 3.0   | 4155   | 0.2327          | 0.2027 | 0.0508 |
| 0.3671        | 4.0   | 5540   | 0.2044          | 0.1743 | 0.0446 |
| 0.3242        | 5.0   | 6925   | 0.1881          | 0.1466 | 0.0393 |
| 0.292         | 6.0   | 8310   | 0.1792          | 0.1307 | 0.0357 |
| 0.2669        | 7.0   | 9695   | 0.1740          | 0.1225 | 0.0341 |
| 0.244         | 8.0   | 11080  | 0.1647          | 0.1120 | 0.0321 |
| 0.2248        | 9.0   | 12465  | 0.1678          | 0.1033 | 0.0305 |
| 0.2111        | 10.0  | 13850  | 0.1653          | 0.0974 | 0.0291 |
| 0.1958        | 11.0  | 15235  | 0.1624          | 0.0910 | 0.0275 |
| 0.1852        | 12.0  | 16620  | 0.1482          | 0.0884 | 0.0266 |
| 0.1718        | 13.0  | 18005  | 0.1580          | 0.0859 | 0.0261 |
| 0.164         | 14.0  | 19390  | 0.1537          | 0.0802 | 0.0246 |
| 0.1531        | 15.0  | 20775  | 0.1525          | 0.0789 | 0.0246 |
| 0.1456        | 16.0  | 22160  | 0.1476          | 0.0761 | 0.0236 |
| 0.1376        | 17.0  | 23545  | 0.1513          | 0.0730 | 0.0232 |
| 0.1329        | 18.0  | 24930  | 0.1508          | 0.0732 | 0.0231 |
| 0.1267        | 19.0  | 26315  | 0.1580          | 0.0719 | 0.0222 |
| 0.123         | 20.0  | 27700  | 0.1538          | 0.0670 | 0.0214 |
| 0.1158        | 21.0  | 29085  | 0.1625          | 0.0677 | 0.0218 |
| 0.1111        | 22.0  | 30470  | 0.1451          | 0.0626 | 0.0205 |
| 0.1049        | 23.0  | 31855  | 0.1652          | 0.0635 | 0.0210 |
| 0.1023        | 24.0  | 33240  | 0.1562          | 0.0650 | 0.0209 |
| 0.0982        | 25.0  | 34625  | 0.1541          | 0.0626 | 0.0203 |
| 0.0954        | 26.0  | 36010  | 0.1545          | 0.0618 | 0.0202 |
| 0.0898        | 27.0  | 37395  | 0.1666          | 0.0598 | 0.0199 |
| 0.0881        | 28.0  | 38780  | 0.1656          | 0.0575 | 0.0196 |
| 0.0857        | 29.0  | 40165  | 0.1611          | 0.0590 | 0.0195 |
| 0.0815        | 30.0  | 41550  | 0.1595          | 0.0584 | 0.0193 |
| 0.0798        | 31.0  | 42935  | 0.1592          | 0.0576 | 0.0193 |
| 0.0784        | 32.0  | 44320  | 0.1586          | 0.0568 | 0.0187 |
| 0.0742        | 33.0  | 45705  | 0.1622          | 0.0568 | 0.0187 |
| 0.0736        | 34.0  | 47090  | 0.1705          | 0.0554 | 0.0187 |
| 0.0721        | 35.0  | 48475  | 0.1570          | 0.0530 | 0.0178 |
| 0.0686        | 36.0  | 49860  | 0.1658          | 0.0543 | 0.0179 |
| 0.0657        | 37.0  | 51245  | 0.1615          | 0.0526 | 0.0179 |
| 0.0647        | 38.0  | 52630  | 0.1646          | 0.0519 | 0.0178 |
| 0.0637        | 39.0  | 54015  | 0.1635          | 0.0515 | 0.0179 |
| 0.0614        | 40.0  | 55400  | 0.1716          | 0.0521 | 0.0175 |
| 0.0601        | 41.0  | 56785  | 0.1701          | 0.0504 | 0.0173 |
| 0.0596        | 42.0  | 58170  | 0.1598          | 0.0514 | 0.0174 |
| 0.0574        | 43.0  | 59555  | 0.1678          | 0.0506 | 0.0176 |
| 0.0564        | 44.0  | 60940  | 0.1679          | 0.0486 | 0.0170 |
| 0.0534        | 45.0  | 62325  | 0.1760          | 0.0490 | 0.0170 |
| 0.0536        | 46.0  | 63710  | 0.1722          | 0.0494 | 0.0170 |
| 0.0516        | 47.0  | 65095  | 0.1635          | 0.0486 | 0.0166 |
| 0.0504        | 48.0  | 66480  | 0.1652          | 0.0489 | 0.0169 |
| 0.0493        | 49.0  | 67865  | 0.1757          | 0.0480 | 0.0169 |
| 0.0491        | 50.0  | 69250  | 0.1734          | 0.0481 | 0.0167 |
| 0.0482        | 51.0  | 70635  | 0.1750          | 0.0479 | 0.0166 |
| 0.0465        | 52.0  | 72020  | 0.1762          | 0.0481 | 0.0166 |
| 0.0452        | 53.0  | 73405  | 0.1695          | 0.0461 | 0.0160 |
| 0.0456        | 54.0  | 74790  | 0.1732          | 0.0464 | 0.0160 |
| 0.0441        | 55.0  | 76175  | 0.1738          | 0.0455 | 0.0161 |
| 0.0438        | 56.0  | 77560  | 0.1771          | 0.0457 | 0.0161 |
| 0.0421        | 57.0  | 78945  | 0.1794          | 0.0452 | 0.0160 |
| 0.0416        | 58.0  | 80330  | 0.1673          | 0.0440 | 0.0157 |
| 0.0401        | 59.0  | 81715  | 0.1871          | 0.0448 | 0.0160 |
| 0.0407        | 60.0  | 83100  | 0.1705          | 0.0448 | 0.0156 |
| 0.0404        | 61.0  | 84485  | 0.1786          | 0.0446 | 0.0157 |
| 0.0379        | 62.0  | 85870  | 0.1760          | 0.0435 | 0.0155 |
| 0.0376        | 63.0  | 87255  | 0.1815          | 0.0445 | 0.0156 |
| 0.0358        | 64.0  | 88640  | 0.1808          | 0.0444 | 0.0158 |
| 0.0361        | 65.0  | 90025  | 0.1775          | 0.0433 | 0.0154 |
| 0.0347        | 66.0  | 91410  | 0.1740          | 0.0438 | 0.0155 |
| 0.0346        | 67.0  | 92795  | 0.1808          | 0.0437 | 0.0155 |
| 0.0343        | 68.0  | 94180  | 0.1774          | 0.0418 | 0.0153 |
| 0.0332        | 69.0  | 95565  | 0.1786          | 0.0408 | 0.0152 |
| 0.0324        | 70.0  | 96950  | 0.1846          | 0.0428 | 0.0155 |
| 0.0322        | 71.0  | 98335  | 0.1801          | 0.0422 | 0.0154 |
| 0.0331        | 72.0  | 99720  | 0.1740          | 0.0408 | 0.0147 |
| 0.0311        | 73.0  | 101105 | 0.1830          | 0.0418 | 0.0152 |
| 0.0299        | 74.0  | 102490 | 0.1874          | 0.0417 | 0.0153 |
| 0.0305        | 75.0  | 103875 | 0.1816          | 0.0411 | 0.0150 |
| 0.0301        | 76.0  | 105260 | 0.1799          | 0.0398 | 0.0146 |
| 0.029         | 77.0  | 106645 | 0.1890          | 0.0408 | 0.0149 |
| 0.0285        | 78.0  | 108030 | 0.1810          | 0.0385 | 0.0146 |
| 0.0286        | 79.0  | 109415 | 0.1874          | 0.0395 | 0.0147 |
| 0.0279        | 80.0  | 110800 | 0.1868          | 0.0399 | 0.0148 |
| 0.0274        | 81.0  | 112185 | 0.1852          | 0.0398 | 0.0147 |
| 0.0265        | 82.0  | 113570 | 0.1890          | 0.0408 | 0.0148 |
| 0.0267        | 83.0  | 114955 | 0.1908          | 0.0402 | 0.0148 |
| 0.0258        | 84.0  | 116340 | 0.1834          | 0.0396 | 0.0146 |
| 0.0268        | 85.0  | 117725 | 0.1945          | 0.0395 | 0.0146 |
| 0.0247        | 86.0  | 119110 | 0.1893          | 0.0397 | 0.0145 |
| 0.0249        | 87.0  | 120495 | 0.1904          | 0.0397 | 0.0145 |
| 0.0254        | 88.0  | 121880 | 0.1880          | 0.0403 | 0.0147 |
| 0.0248        | 89.0  | 123265 | 0.1860          | 0.0393 | 0.0146 |
| 0.0241        | 90.0  | 124650 | 0.1936          | 0.0389 | 0.0146 |
| 0.0232        | 91.0  | 126035 | 0.1922          | 0.0393 | 0.0144 |
| 0.0235        | 92.0  | 127420 | 0.1854          | 0.0390 | 0.0143 |
| 0.0227        | 93.0  | 128805 | 0.1921          | 0.0389 | 0.0143 |
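
The WER and CER columns follow the usual word- and character-level error-rate definitions. As a small sketch, scores like these can be computed with the evaluate library; the strings below are illustrative only, not drawn from the actual evaluation set:

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Illustrative examples only; the card does not include the evaluation data.
references = ["the cat sat on the mat"]
predictions = ["the cat sat on a mat"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```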

Framework versions

  • Transformers 4.46.3
  • PyTorch 2.1.0+cu118
  • Datasets 3.1.0
  • Tokenizers 0.20.3