t5-base-p-l-akk-en-20241125-151008

This model was trained from scratch on an unspecified dataset (reported as "None" in the auto-generated card). It achieves the following results on the evaluation set:

  • Loss: 0.4584

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 200
  • eval_batch_size: 200
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2000
  • num_epochs: 200
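As a rough illustration, the linear schedule with 2000 warmup steps described above can be sketched in plain Python. The names and the total-step count are illustrative assumptions: the step count takes the 10362 optimizer steps per epoch from the results table below times the configured 200 epochs, and the actual run used the Transformers linear scheduler rather than this hand-rolled function.

```python
# Illustrative sketch of the learning-rate schedule implied by the
# hyperparameters above: linear warmup over 2000 steps, then linear
# decay to zero. TOTAL_STEPS is an assumption (10362 steps/epoch from
# the results table x the configured 200 epochs).
LEARNING_RATE = 1e-4
WARMUP_STEPS = 2000
TOTAL_STEPS = 200 * 10362

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps."""
    if step < WARMUP_STEPS:
        # linear warmup from 0 up to the peak rate
        return LEARNING_RATE * step / WARMUP_STEPS
    # linear decay from the peak rate down to 0
    return LEARNING_RATE * max(0.0, (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS))

print(lr_at(1000))  # halfway through warmup: 5e-05
print(lr_at(2000))  # peak rate: 0.0001
```

The peak rate is reached exactly at step 2000 and the rate returns to zero at the final step.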

Training results

| Training Loss | Epoch | Step   | Validation Loss |
|---------------|-------|--------|-----------------|
| 0.9667        | 1.0   | 10362  | 0.9112          |
| 0.8355        | 2.0   | 20724  | 0.7909          |
| 0.772         | 3.0   | 31086  | 0.7263          |
| 0.7326        | 4.0   | 41448  | 0.6910          |
| 0.7033        | 5.0   | 51810  | 0.6666          |
| 0.6787        | 6.0   | 62172  | 0.6455          |
| 0.6633        | 7.0   | 72534  | 0.6329          |
| 0.652         | 8.0   | 82896  | 0.6206          |
| 0.6408        | 9.0   | 93258  | 0.6073          |
| 0.6315        | 10.0  | 103620 | 0.6015          |
| 0.6161        | 11.0  | 113982 | 0.5914          |
| 0.6211        | 12.0  | 124344 | 0.5857          |
| 0.6053        | 13.0  | 134706 | 0.5766          |
| 0.6043        | 14.0  | 145068 | 0.5727          |
| 0.5954        | 15.0  | 155430 | 0.5681          |
| 0.59          | 16.0  | 165792 | 0.5649          |
| 0.5844        | 17.0  | 176154 | 0.5628          |
| 0.579         | 18.0  | 186516 | 0.5564          |
| 0.5792        | 19.0  | 196878 | 0.5493          |
| 0.5739        | 20.0  | 207240 | 0.5479          |
| 0.567         | 21.0  | 217602 | 0.5435          |
| 0.5626        | 22.0  | 227964 | 0.5406          |
| 0.5591        | 23.0  | 238326 | 0.5375          |
| 0.5508        | 24.0  | 248688 | 0.5356          |
| 0.5548        | 25.0  | 259050 | 0.5329          |
| 0.5512        | 26.0  | 269412 | 0.5299          |
| 0.5473        | 27.0  | 279774 | 0.5267          |
| 0.5413        | 28.0  | 290136 | 0.5243          |
| 0.5433        | 29.0  | 300498 | 0.5246          |
| 0.5378        | 30.0  | 310860 | 0.5209          |
| 0.5375        | 31.0  | 321222 | 0.5206          |
| 0.5363        | 32.0  | 331584 | 0.5178          |
| 0.528         | 33.0  | 341946 | 0.5143          |
| 0.532         | 34.0  | 352308 | 0.5121          |
| 0.5279        | 35.0  | 362670 | 0.5137          |
| 0.5265        | 36.0  | 373032 | 0.5080          |
| 0.5231        | 37.0  | 383394 | 0.5077          |
| 0.5187        | 38.0  | 393756 | 0.5082          |
| 0.5191        | 39.0  | 404118 | 0.5047          |
| 0.5159        | 40.0  | 414480 | 0.5029          |
| 0.5159        | 41.0  | 424842 | 0.5014          |
| 0.5131        | 42.0  | 435204 | 0.4998          |
| 0.5137        | 43.0  | 445566 | 0.4973          |
| 0.5128        | 44.0  | 455928 | 0.4972          |
| 0.5101        | 45.0  | 466290 | 0.4985          |
| 0.505         | 46.0  | 476652 | 0.4969          |
| 0.5014        | 47.0  | 487014 | 0.4964          |
| 0.4988        | 48.0  | 497376 | 0.4938          |
| 0.5051        | 49.0  | 507738 | 0.4898          |
| 0.4974        | 50.0  | 518100 | 0.4928          |
| 0.4999        | 51.0  | 528462 | 0.4904          |
| 0.4973        | 52.0  | 538824 | 0.4884          |
| 0.4973        | 53.0  | 549186 | 0.4877          |
| 0.4913        | 54.0  | 559548 | 0.4879          |
| 0.4968        | 55.0  | 569910 | 0.4846          |
| 0.4916        | 56.0  | 580272 | 0.4838          |
| 0.4938        | 57.0  | 590634 | 0.4833          |
| 0.4866        | 58.0  | 600996 | 0.4819          |
| 0.4871        | 59.0  | 611358 | 0.4818          |
| 0.4837        | 60.0  | 621720 | 0.4792          |
| 0.4855        | 61.0  | 632082 | 0.4783          |
| 0.4828        | 62.0  | 642444 | 0.4781          |
| 0.4789        | 63.0  | 652806 | 0.4780          |
| 0.4781        | 64.0  | 663168 | 0.4785          |
| 0.4803        | 65.0  | 673530 | 0.4767          |
| 0.4791        | 66.0  | 683892 | 0.4755          |
| 0.4783        | 67.0  | 694254 | 0.4743          |
| 0.4772        | 68.0  | 704616 | 0.4739          |
| 0.4757        | 69.0  | 714978 | 0.4730          |
| 0.4708        | 70.0  | 725340 | 0.4711          |
| 0.4698        | 71.0  | 735702 | 0.4717          |
| 0.4719        | 72.0  | 746064 | 0.4733          |
| 0.4708        | 73.0  | 756426 | 0.4703          |
| 0.4717        | 74.0  | 766788 | 0.4700          |
| 0.4714        | 75.0  | 777150 | 0.4677          |
| 0.4641        | 76.0  | 787512 | 0.4688          |
| 0.4642        | 77.0  | 797874 | 0.4678          |
| 0.4656        | 78.0  | 808236 | 0.4666          |
| 0.4625        | 79.0  | 818598 | 0.4661          |
| 0.4623        | 80.0  | 828960 | 0.4664          |
| 0.4619        | 81.0  | 839322 | 0.4657          |
| 0.4574        | 82.0  | 849684 | 0.4635          |
| 0.4562        | 83.0  | 860046 | 0.4628          |
| 0.4593        | 84.0  | 870408 | 0.4613          |
| 0.4583        | 85.0  | 880770 | 0.4600          |
| 0.4573        | 86.0  | 891132 | 0.4598          |
| 0.4518        | 87.0  | 901494 | 0.4564          |
| 0.4599        | 88.0  | 911856 | 0.4577          |
| 0.4545        | 89.0  | 922218 | 0.4594          |
| 0.4534        | 90.0  | 932580 | 0.4564          |
| 0.449         | 91.0  | 942942 | 0.4564          |
| 0.4523        | 92.0  | 953304 | 0.4584          |
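Since the dataset itself is undocumented, a rough size estimate can be derived from the table: the step counter advances by 10362 per epoch, so with a train batch size of 200 and assuming no gradient accumulation, each epoch covers on the order of two million training examples.

```python
# Back-of-the-envelope dataset-size estimate implied by the table above.
# Assumes one optimizer step per batch (i.e. no gradient accumulation).
steps_per_epoch = 10362      # step counter advances by this much per epoch
train_batch_size = 200       # from the hyperparameters
approx_examples_per_epoch = steps_per_epoch * train_batch_size
print(approx_examples_per_epoch)  # 2072400
```

Note also that the log stops at epoch 92 even though num_epochs was configured as 200.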

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.6.0.dev20241022+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1
Model details

  • Format: Safetensors
  • Model size: 368M params
  • Tensor type: F32