hubert-large-ls960-ft-lg-CV_GRAIN-v1

This model is a fine-tuned version of facebook/hubert-large-ls960-ft (the fine-tuning dataset is not named in this card). It achieves the following results on the evaluation set:

  • Loss: 0.1921
  • WER: 0.0389
  • CER: 0.0143
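
As a usage illustration, the model can be loaded for transcription through the transformers pipeline API. This is a minimal sketch only: the repo id is assumed from this card's title and the audio path is a placeholder, so adjust both to your setup:

```python
from transformers import pipeline

# Repo id assumed from the card title; replace with the actual Hub path if it differs.
asr = pipeline(
    "automatic-speech-recognition",
    model="sulaimank/hubert-large-ls960-ft-lg-CV_GRAIN-v1",
)

# Placeholder file: a 16 kHz mono WAV clip.
result = asr("sample.wav")
print(result["text"])
```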

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch of these settings follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
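
These settings correspond to the standard transformers Trainer setup. Below is a minimal sketch of how they might map onto transformers.TrainingArguments; output_dir is a placeholder, and the exact call used for this run is not given in the card:

```python
from transformers import TrainingArguments

# Sketch only: reproduces the hyperparameters listed above.
# output_dir is a placeholder, not the actual path used for this run.
training_args = TrainingArguments(
    output_dir="hubert-large-ls960-ft-lg-CV_GRAIN-v1",
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 8 * 2 = 16
    optim="adamw_torch",            # AdamW with betas=(0.9, 0.999), epsilon=1e-08
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                      # Native AMP mixed-precision training
)
```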

Training results

| Training Loss | Epoch | Step   | Validation Loss | WER    | CER    |
|:-------------:|:-----:|:------:|:---------------:|:------:|:------:|
| 1.1234        | 1.0   | 1385   | 0.3733          | 0.4333 | 0.0910 |
| 0.5164        | 2.0   | 2770   | 0.2676          | 0.2680 | 0.0634 |
| 0.4223        | 3.0   | 4155   | 0.2327          | 0.2027 | 0.0508 |
| 0.3671        | 4.0   | 5540   | 0.2044          | 0.1743 | 0.0446 |
| 0.3242        | 5.0   | 6925   | 0.1881          | 0.1466 | 0.0393 |
| 0.292         | 6.0   | 8310   | 0.1792          | 0.1307 | 0.0357 |
| 0.2669        | 7.0   | 9695   | 0.1740          | 0.1225 | 0.0341 |
| 0.244         | 8.0   | 11080  | 0.1647          | 0.1120 | 0.0321 |
| 0.2248        | 9.0   | 12465  | 0.1678          | 0.1033 | 0.0305 |
| 0.2111        | 10.0  | 13850  | 0.1653          | 0.0974 | 0.0291 |
| 0.1958        | 11.0  | 15235  | 0.1624          | 0.0910 | 0.0275 |
| 0.1852        | 12.0  | 16620  | 0.1482          | 0.0884 | 0.0266 |
| 0.1718        | 13.0  | 18005  | 0.1580          | 0.0859 | 0.0261 |
| 0.164         | 14.0  | 19390  | 0.1537          | 0.0802 | 0.0246 |
| 0.1531        | 15.0  | 20775  | 0.1525          | 0.0789 | 0.0246 |
| 0.1456        | 16.0  | 22160  | 0.1476          | 0.0761 | 0.0236 |
| 0.1376        | 17.0  | 23545  | 0.1513          | 0.0730 | 0.0232 |
| 0.1329        | 18.0  | 24930  | 0.1508          | 0.0732 | 0.0231 |
| 0.1267        | 19.0  | 26315  | 0.1580          | 0.0719 | 0.0222 |
| 0.123         | 20.0  | 27700  | 0.1538          | 0.0670 | 0.0214 |
| 0.1158        | 21.0  | 29085  | 0.1625          | 0.0677 | 0.0218 |
| 0.1111        | 22.0  | 30470  | 0.1451          | 0.0626 | 0.0205 |
| 0.1049        | 23.0  | 31855  | 0.1652          | 0.0635 | 0.0210 |
| 0.1023        | 24.0  | 33240  | 0.1562          | 0.0650 | 0.0209 |
| 0.0982        | 25.0  | 34625  | 0.1541          | 0.0626 | 0.0203 |
| 0.0954        | 26.0  | 36010  | 0.1545          | 0.0618 | 0.0202 |
| 0.0898        | 27.0  | 37395  | 0.1666          | 0.0598 | 0.0199 |
| 0.0881        | 28.0  | 38780  | 0.1656          | 0.0575 | 0.0196 |
| 0.0857        | 29.0  | 40165  | 0.1611          | 0.0590 | 0.0195 |
| 0.0815        | 30.0  | 41550  | 0.1595          | 0.0584 | 0.0193 |
| 0.0798        | 31.0  | 42935  | 0.1592          | 0.0576 | 0.0193 |
| 0.0784        | 32.0  | 44320  | 0.1586          | 0.0568 | 0.0187 |
| 0.0742        | 33.0  | 45705  | 0.1622          | 0.0568 | 0.0187 |
| 0.0736        | 34.0  | 47090  | 0.1705          | 0.0554 | 0.0187 |
| 0.0721        | 35.0  | 48475  | 0.1570          | 0.0530 | 0.0178 |
| 0.0686        | 36.0  | 49860  | 0.1658          | 0.0543 | 0.0179 |
| 0.0657        | 37.0  | 51245  | 0.1615          | 0.0526 | 0.0179 |
| 0.0647        | 38.0  | 52630  | 0.1646          | 0.0519 | 0.0178 |
| 0.0637        | 39.0  | 54015  | 0.1635          | 0.0515 | 0.0179 |
| 0.0614        | 40.0  | 55400  | 0.1716          | 0.0521 | 0.0175 |
| 0.0601        | 41.0  | 56785  | 0.1701          | 0.0504 | 0.0173 |
| 0.0596        | 42.0  | 58170  | 0.1598          | 0.0514 | 0.0174 |
| 0.0574        | 43.0  | 59555  | 0.1678          | 0.0506 | 0.0176 |
| 0.0564        | 44.0  | 60940  | 0.1679          | 0.0486 | 0.0170 |
| 0.0534        | 45.0  | 62325  | 0.1760          | 0.0490 | 0.0170 |
| 0.0536        | 46.0  | 63710  | 0.1722          | 0.0494 | 0.0170 |
| 0.0516        | 47.0  | 65095  | 0.1635          | 0.0486 | 0.0166 |
| 0.0504        | 48.0  | 66480  | 0.1652          | 0.0489 | 0.0169 |
| 0.0493        | 49.0  | 67865  | 0.1757          | 0.0480 | 0.0169 |
| 0.0491        | 50.0  | 69250  | 0.1734          | 0.0481 | 0.0167 |
| 0.0482        | 51.0  | 70635  | 0.1750          | 0.0479 | 0.0166 |
| 0.0465        | 52.0  | 72020  | 0.1762          | 0.0481 | 0.0166 |
| 0.0452        | 53.0  | 73405  | 0.1695          | 0.0461 | 0.0160 |
| 0.0456        | 54.0  | 74790  | 0.1732          | 0.0464 | 0.0160 |
| 0.0441        | 55.0  | 76175  | 0.1738          | 0.0455 | 0.0161 |
| 0.0438        | 56.0  | 77560  | 0.1771          | 0.0457 | 0.0161 |
| 0.0421        | 57.0  | 78945  | 0.1794          | 0.0452 | 0.0160 |
| 0.0416        | 58.0  | 80330  | 0.1673          | 0.0440 | 0.0157 |
| 0.0401        | 59.0  | 81715  | 0.1871          | 0.0448 | 0.0160 |
| 0.0407        | 60.0  | 83100  | 0.1705          | 0.0448 | 0.0156 |
| 0.0404        | 61.0  | 84485  | 0.1786          | 0.0446 | 0.0157 |
| 0.0379        | 62.0  | 85870  | 0.1760          | 0.0435 | 0.0155 |
| 0.0376        | 63.0  | 87255  | 0.1815          | 0.0445 | 0.0156 |
| 0.0358        | 64.0  | 88640  | 0.1808          | 0.0444 | 0.0158 |
| 0.0361        | 65.0  | 90025  | 0.1775          | 0.0433 | 0.0154 |
| 0.0347        | 66.0  | 91410  | 0.1740          | 0.0438 | 0.0155 |
| 0.0346        | 67.0  | 92795  | 0.1808          | 0.0437 | 0.0155 |
| 0.0343        | 68.0  | 94180  | 0.1774          | 0.0418 | 0.0153 |
| 0.0332        | 69.0  | 95565  | 0.1786          | 0.0408 | 0.0152 |
| 0.0324        | 70.0  | 96950  | 0.1846          | 0.0428 | 0.0155 |
| 0.0322        | 71.0  | 98335  | 0.1801          | 0.0422 | 0.0154 |
| 0.0331        | 72.0  | 99720  | 0.1740          | 0.0408 | 0.0147 |
| 0.0311        | 73.0  | 101105 | 0.1830          | 0.0418 | 0.0152 |
| 0.0299        | 74.0  | 102490 | 0.1874          | 0.0417 | 0.0153 |
| 0.0305        | 75.0  | 103875 | 0.1816          | 0.0411 | 0.0150 |
| 0.0301        | 76.0  | 105260 | 0.1799          | 0.0398 | 0.0146 |
| 0.029         | 77.0  | 106645 | 0.1890          | 0.0408 | 0.0149 |
| 0.0285        | 78.0  | 108030 | 0.1810          | 0.0385 | 0.0146 |
| 0.0286        | 79.0  | 109415 | 0.1874          | 0.0395 | 0.0147 |
| 0.0279        | 80.0  | 110800 | 0.1868          | 0.0399 | 0.0148 |
| 0.0274        | 81.0  | 112185 | 0.1852          | 0.0398 | 0.0147 |
| 0.0265        | 82.0  | 113570 | 0.1890          | 0.0408 | 0.0148 |
| 0.0267        | 83.0  | 114955 | 0.1908          | 0.0402 | 0.0148 |
| 0.0258        | 84.0  | 116340 | 0.1834          | 0.0396 | 0.0146 |
| 0.0268        | 85.0  | 117725 | 0.1945          | 0.0395 | 0.0146 |
| 0.0247        | 86.0  | 119110 | 0.1893          | 0.0397 | 0.0145 |
| 0.0249        | 87.0  | 120495 | 0.1904          | 0.0397 | 0.0145 |
| 0.0254        | 88.0  | 121880 | 0.1880          | 0.0403 | 0.0147 |
| 0.0248        | 89.0  | 123265 | 0.1860          | 0.0393 | 0.0146 |
| 0.0241        | 90.0  | 124650 | 0.1936          | 0.0389 | 0.0146 |
| 0.0232        | 91.0  | 126035 | 0.1922          | 0.0393 | 0.0144 |
| 0.0235        | 92.0  | 127420 | 0.1854          | 0.0390 | 0.0143 |
| 0.0227        | 93.0  | 128805 | 0.1921          | 0.0389 | 0.0143 |
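
The WER and CER columns follow the usual word- and character-level error-rate definitions. As a small sketch, scores like these can be computed with the evaluate library; the strings below are illustrative only, not drawn from the actual evaluation set:

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Illustrative examples only; the card does not include the evaluation data.
references = ["the cat sat on the mat"]
predictions = ["the cat sat on a mat"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```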

Framework versions

  • Transformers 4.46.3
  • PyTorch 2.1.0+cu118
  • Datasets 3.1.0
  • Tokenizers 0.20.3