byt5-small-finetuned-yiddish-experiment-10

This model is a fine-tuned version of google/byt5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3450
  • CER (character error rate): 0.1505
  • WER (word error rate): 0.4654
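
The card does not document the downstream task, so the snippet below is only a minimal sketch of loading the checkpoint for inference with the Transformers library; the input string is a hypothetical placeholder and the generation settings are illustrative, not the settings used to produce the numbers above.

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "Addaci/byt5-small-finetuned-yiddish-experiment-10"
tokenizer = AutoTokenizer.from_pretrained(model_id)  # resolves to ByT5's byte-level tokenizer
model = T5ForConditionalGeneration.from_pretrained(model_id)

text = "א ביישפיל זאץ"  # hypothetical Yiddish input; the expected input format is undocumented
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```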

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 600
  • num_epochs: 30
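
The hyperparameters above, together with the 100-step evaluation cadence visible in the results table below, map naturally onto the Hugging Face Trainer API. The following is a minimal sketch, assuming a single-device run configured through Seq2SeqTrainingArguments (the actual training script is not included in this card):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="byt5-small-finetuned-yiddish-experiment-10",  # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=8,  # assumes the listed batch size is per device
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=600,
    num_train_epochs=30,
    eval_strategy="steps",  # inferred from the 100-step intervals in the results table
    eval_steps=100,
)
```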

Training results

| Training Loss | Epoch | Step | Validation Loss | CER | WER |
|:---|:---|:---|:---|:---|:---|
| 10.741 | 0.4717 | 100 | 10.9313 | 0.2881 | 0.7176 |
| 7.6063 | 0.9434 | 200 | 10.5495 | 0.2706 | 0.6850 |
| 8.4739 | 1.4151 | 300 | 9.8632 | 0.2572 | 0.6595 |
| 8.3278 | 1.8868 | 400 | 8.9330 | 0.2470 | 0.6396 |
| 8.0051 | 2.3585 | 500 | 7.9314 | 0.2354 | 0.6181 |
| 7.7765 | 2.8302 | 600 | 7.0184 | 0.2308 | 0.6150 |
| 5.6897 | 3.3019 | 700 | 6.0913 | 0.2245 | 0.6094 |
| 5.3547 | 3.7736 | 800 | 5.1003 | 0.2186 | 0.6038 |
| 4.9118 | 4.2453 | 900 | 4.3067 | 0.2174 | 0.6030 |
| 3.9777 | 4.7170 | 1000 | 3.5975 | 0.2130 | 0.5982 |
| 3.5601 | 5.1887 | 1100 | 2.8719 | 0.2098 | 0.5959 |
| 2.821 | 5.6604 | 1200 | 2.2820 | 0.2069 | 0.5919 |
| 2.2335 | 6.1321 | 1300 | 1.7483 | 0.2047 | 0.5887 |
| 1.8581 | 6.6038 | 1400 | 1.3001 | 0.2008 | 0.5823 |
| 1.6247 | 7.0755 | 1500 | 1.1757 | 0.1982 | 0.5744 |
| 1.3292 | 7.5472 | 1600 | 1.1475 | 0.1939 | 0.5688 |
| 1.1853 | 8.0189 | 1700 | 1.0804 | 0.1920 | 0.5688 |
| 1.077 | 8.4906 | 1800 | 0.8688 | 0.1902 | 0.5656 |
| 0.9039 | 8.9623 | 1900 | 0.7849 | 0.1683 | 0.4972 |
| 0.7846 | 9.4340 | 2000 | 0.7405 | 0.1667 | 0.4964 |
| 0.7805 | 9.9057 | 2100 | 0.6959 | 0.1644 | 0.4893 |
| 0.7415 | 10.3774 | 2200 | 0.6571 | 0.1615 | 0.4853 |
| 0.6541 | 10.8491 | 2300 | 0.6114 | 0.1602 | 0.4869 |
| 0.6443 | 11.3208 | 2400 | 0.5624 | 0.1590 | 0.4845 |
| 0.5984 | 11.7925 | 2500 | 0.5103 | 0.1579 | 0.4805 |
| 0.5499 | 12.2642 | 2600 | 0.4620 | 0.1576 | 0.4813 |
| 0.5194 | 12.7358 | 2700 | 0.4317 | 0.1570 | 0.4773 |
| 0.5052 | 13.2075 | 2800 | 0.4088 | 0.1565 | 0.4781 |
| 0.4724 | 13.6792 | 2900 | 0.3981 | 0.1562 | 0.4757 |
| 0.4601 | 14.1509 | 3000 | 0.3827 | 0.1564 | 0.4765 |
| 0.4342 | 14.6226 | 3100 | 0.3803 | 0.1541 | 0.4741 |
| 0.432 | 15.0943 | 3200 | 0.3719 | 0.1556 | 0.4749 |
| 0.4365 | 15.5660 | 3300 | 0.3700 | 0.1550 | 0.4733 |
| 0.4094 | 16.0377 | 3400 | 0.3660 | 0.1538 | 0.4710 |
| 0.4126 | 16.5094 | 3500 | 0.3610 | 0.1538 | 0.4741 |
| 0.3976 | 16.9811 | 3600 | 0.3614 | 0.1534 | 0.4694 |
| 0.3933 | 17.4528 | 3700 | 0.3600 | 0.1522 | 0.4694 |
| 0.4019 | 17.9245 | 3800 | 0.3539 | 0.1513 | 0.4686 |
| 0.3813 | 18.3962 | 3900 | 0.3598 | 0.1522 | 0.4694 |
| 0.3812 | 18.8679 | 4000 | 0.3551 | 0.1519 | 0.4678 |
| 0.382 | 19.3396 | 4100 | 0.3517 | 0.1508 | 0.4670 |
| 0.3887 | 19.8113 | 4200 | 0.3502 | 0.1510 | 0.4678 |
| 0.3756 | 20.2830 | 4300 | 0.3520 | 0.1516 | 0.4686 |
| 0.3761 | 20.7547 | 4400 | 0.3499 | 0.1514 | 0.4670 |
| 0.38 | 21.2264 | 4500 | 0.3480 | 0.1507 | 0.4670 |
| 0.3673 | 21.6981 | 4600 | 0.3484 | 0.1514 | 0.4678 |
| 0.3778 | 22.1698 | 4700 | 0.3472 | 0.1507 | 0.4670 |
| 0.3642 | 22.6415 | 4800 | 0.3475 | 0.1507 | 0.4662 |
| 0.3701 | 23.1132 | 4900 | 0.3468 | 0.1511 | 0.4662 |
| 0.3753 | 23.5849 | 5000 | 0.3460 | 0.1510 | 0.4670 |
| 0.3672 | 24.0566 | 5100 | 0.3458 | 0.1508 | 0.4662 |
| 0.3711 | 24.5283 | 5200 | 0.3453 | 0.1508 | 0.4662 |
| 0.3631 | 25.0 | 5300 | 0.3457 | 0.1507 | 0.4662 |
| 0.3733 | 25.4717 | 5400 | 0.3456 | 0.1508 | 0.4670 |
| 0.3667 | 25.9434 | 5500 | 0.3455 | 0.1508 | 0.4662 |
| 0.3568 | 26.4151 | 5600 | 0.3455 | 0.1507 | 0.4662 |
| 0.3729 | 26.8868 | 5700 | 0.3453 | 0.1508 | 0.4662 |
| 0.3652 | 27.3585 | 5800 | 0.3452 | 0.1507 | 0.4662 |
| 0.3658 | 27.8302 | 5900 | 0.3450 | 0.1505 | 0.4654 |
| 0.3621 | 28.3019 | 6000 | 0.3448 | 0.1507 | 0.4654 |
| 0.3724 | 28.7736 | 6100 | 0.3449 | 0.1508 | 0.4662 |
| 0.3594 | 29.2453 | 6200 | 0.3448 | 0.1508 | 0.4662 |
| 0.3643 | 29.7170 | 6300 | 0.3448 | 0.1508 | 0.4662 |
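
The CER and WER columns above are character and word error rates on the validation set. A minimal sketch of computing these metrics with the Hugging Face evaluate library follows; this is one common implementation (the card does not state which was used), and the strings are placeholders.

```python
import evaluate  # requires: pip install evaluate jiwer

cer_metric = evaluate.load("cer")
wer_metric = evaluate.load("wer")

predictions = ["hypothetical model output"]    # model outputs, one string per example
references = ["hypothetical reference text"]   # ground-truth targets, aligned by index

print("CER:", cer_metric.compute(predictions=predictions, references=references))
print("WER:", wer_metric.compute(predictions=predictions, references=references))
```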

Framework versions

  • Transformers 4.47.0
  • Pytorch 2.5.1+cu121
  • Datasets 2.14.4
  • Tokenizers 0.21.0
