w2v-bert-2.0-malayalam_mixeddataset-CV16.0
This model is a fine-tuned version of facebook/w2v-bert-2.0 on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.1616
- Wer: 0.1199
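For readers unfamiliar with the metric, the sketch below shows how a word error rate is typically computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It assumes the reported Wer is this standard definition expressed as a fraction (so 0.1199 would be about 12%); the card does not state which implementation or text normalization was used, and the function name `wer` is only illustrative.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of four reference words.
print(wer("the cat sat down", "the cat sit down"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many insertions.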
Model description
More information needed
Intended uses & limitations
More information needed
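Until this section is filled in, the following is a minimal, untested sketch of how a checkpoint like this one is usually loaded for transcription with the Transformers `pipeline` API. It assumes the model is published under the repository ID shown in the model tree on this page, that a Transformers version with Wav2Vec2-BERT support is installed, and that the input is a 16 kHz-compatible audio file the pipeline can decode (which normally requires ffmpeg). The helper name `transcribe` is hypothetical.

```python
# Repository ID taken from this card's model tree.
MODEL_ID = "Bajiyo/w2v-bert-2.0-malayalam_mixeddataset-CV16.0"


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file with the fine-tuned checkpoint (sketch)."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path)  # returns a dict with a "text" field
    return result["text"]
```

Calling `transcribe("sample.wav")` downloads the weights on first use; building the pipeline once and reusing it is preferable when transcribing many files.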
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10
- mixed_precision_training: Native AMP
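The relationship between some of these values can be sketched in plain Python. The effective batch size is the per-step batch size times the gradient accumulation steps (which gives 32 only if training ran on a single device, an assumption here), and a linear schedule with warmup ramps the learning rate from 0 to its peak over the warmup steps, then decays it linearly to 0. `TOTAL_STEPS` is an estimate inferred from the results table (step 12600 at epoch 9.96, so roughly 1265 steps per epoch), not a value stated in the card.

```python
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 500
TOTAL_STEPS = 12650  # assumption: ~1265 steps/epoch x 10 epochs

# Effective batch size per optimizer update, assuming one device.
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Linear decay from the peak down to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / (TOTAL_STEPS - WARMUP_STEPS)


print(total_train_batch_size)
for step in (0, 250, 500, 6575, TOTAL_STEPS):
    print(step, linear_lr(step))
```

With these values the peak rate of 5e-05 is reached at step 500, shortly after the second evaluation row in the results table.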
Training results
Training Loss | Epoch | Step | Validation Loss | Wer |
---|---|---|---|---|
1.8432 | 0.24 | 300 | 0.4191 | 0.4882 |
0.2257 | 0.47 | 600 | 0.3822 | 0.4822 |
0.183 | 0.71 | 900 | 0.3063 | 0.3934 |
0.1502 | 0.95 | 1200 | 0.2451 | 0.3329 |
0.1235 | 1.19 | 1500 | 0.2359 | 0.3065 |
0.1162 | 1.42 | 1800 | 0.2203 | 0.3011 |
0.1048 | 1.66 | 2100 | 0.2130 | 0.2889 |
0.1005 | 1.9 | 2400 | 0.2066 | 0.2580 |
0.0844 | 2.14 | 2700 | 0.1873 | 0.2585 |
0.076 | 2.37 | 3000 | 0.1846 | 0.2349 |
0.0738 | 2.61 | 3300 | 0.1703 | 0.2326 |
0.0726 | 2.85 | 3600 | 0.1815 | 0.2316 |
0.0643 | 3.08 | 3900 | 0.1655 | 0.2192 |
0.0538 | 3.32 | 4200 | 0.1667 | 0.2274 |
0.0541 | 3.56 | 4500 | 0.1695 | 0.2100 |
0.0549 | 3.8 | 4800 | 0.1782 | 0.2160 |
0.05 | 4.03 | 5100 | 0.1620 | 0.1884 |
0.0387 | 4.27 | 5400 | 0.1714 | 0.2038 |
0.041 | 4.51 | 5700 | 0.1622 | 0.1903 |
0.0376 | 4.74 | 6000 | 0.1553 | 0.1861 |
0.0379 | 4.98 | 6300 | 0.1398 | 0.1913 |
0.0294 | 5.22 | 6600 | 0.1585 | 0.1774 |
0.0271 | 5.46 | 6900 | 0.1541 | 0.1732 |
0.0262 | 5.69 | 7200 | 0.1391 | 0.1670 |
0.0266 | 5.93 | 7500 | 0.1310 | 0.1535 |
0.021 | 6.17 | 7800 | 0.1442 | 0.1563 |
0.0207 | 6.41 | 8100 | 0.1457 | 0.1545 |
0.0192 | 6.64 | 8400 | 0.1476 | 0.1510 |
0.0179 | 6.88 | 8700 | 0.1396 | 0.1535 |
0.0156 | 7.12 | 9000 | 0.1487 | 0.1341 |
0.0113 | 7.35 | 9300 | 0.1536 | 0.1383 |
0.0137 | 7.59 | 9600 | 0.1549 | 0.1438 |
0.0124 | 7.83 | 9900 | 0.1501 | 0.1324 |
0.0108 | 8.07 | 10200 | 0.1463 | 0.1346 |
0.0078 | 8.3 | 10500 | 0.1495 | 0.1301 |
0.0075 | 8.54 | 10800 | 0.1442 | 0.1306 |
0.007 | 8.78 | 11100 | 0.1510 | 0.1289 |
0.0065 | 9.02 | 11400 | 0.1536 | 0.1271 |
0.0034 | 9.25 | 11700 | 0.1580 | 0.1219 |
0.0038 | 9.49 | 12000 | 0.1583 | 0.1207 |
0.0043 | 9.73 | 12300 | 0.1604 | 0.1222 |
0.0039 | 9.96 | 12600 | 0.1616 | 0.1199 |
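One thing the table shows is that validation loss and Wer do not bottom out at the same checkpoint. The short sketch below illustrates this with a handful of rows copied from the table above (step, validation loss, Wer); the choice of rows and the variable names are illustrative only.

```python
# (step, validation_loss, wer) for a few rows copied from the results table.
rows = [
    (6300, 0.1398, 0.1913),
    (7500, 0.1310, 0.1535),
    (9000, 0.1487, 0.1341),
    (12000, 0.1583, 0.1207),
    (12600, 0.1616, 0.1199),
]

# Pick the checkpoint that minimizes each metric.
best_by_loss = min(rows, key=lambda row: row[1])
best_by_wer = min(rows, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("lowest Wer:", best_by_wer)
```

In the full table, validation loss is lowest at step 7500 (0.1310) and drifts upward afterwards, while Wer keeps improving through the final step (0.1199), so the metric used for checkpoint selection matters.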
Framework versions
- Transformers 4.39.3
- Pytorch 2.1.1+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1
Model tree for Bajiyo/w2v-bert-2.0-malayalam_mixeddataset-CV16.0
Base model
facebook/w2v-bert-2.0