long_t5_4

This model is a fine-tuned version of google/long-t5-tglobal-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5697
  • Rouge1: 0.5326
  • Rouge2: 0.3464
  • RougeL: 0.4843
  • RougeLsum: 0.4839
  • Gen Len: 27.734
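
As a usage illustration, below is a minimal sketch of running this checkpoint for summarization with the transformers library. The repo id `zera09/long_t5_4` and the summarization task are inferred from the card, not confirmed by the author.

```python
# Minimal usage sketch (assumptions: the checkpoint is published as
# "zera09/long_t5_4" and was fine-tuned for summarization).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "zera09/long_t5_4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Replace this with a long input document."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Eval generations averaged ~27.7 tokens, so a small budget is enough.
summary_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```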

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
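
For reproducibility, here is a hedged sketch of how these settings map onto transformers' Seq2SeqTrainingArguments; the output directory and per-epoch evaluation are assumptions, and the Adam betas/epsilon listed above match the Trainer's optimizer defaults.

```python
# A sketch mapping the listed hyperparameters onto Seq2SeqTrainingArguments.
# output_dir and eval_strategy are assumptions; betas=(0.9, 0.999) and
# epsilon=1e-08 above are the Trainer's default Adam settings.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long_t5_4",       # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    eval_strategy="epoch",        # the table below reports one eval per epoch
)
```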

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|:-------------:|:-----:|:------:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 2.4555 | 1.0 | 1000 | 1.7755 | 0.4481 | 0.2666 | 0.402 | 0.4019 | 26.852 |
| 2.2467 | 2.0 | 2000 | 1.7041 | 0.4657 | 0.2823 | 0.421 | 0.4205 | 27.304 |
| 2.1352 | 3.0 | 3000 | 1.6575 | 0.4752 | 0.2902 | 0.4295 | 0.4292 | 27.117 |
| 2.0543 | 4.0 | 4000 | 1.6314 | 0.4788 | 0.295 | 0.4348 | 0.4341 | 26.2045 |
| 2.0058 | 5.0 | 5000 | 1.6106 | 0.4856 | 0.3016 | 0.4415 | 0.441 | 26.3885 |
| 1.9365 | 6.0 | 6000 | 1.5924 | 0.4882 | 0.3037 | 0.4431 | 0.4425 | 26.048 |
| 1.9234 | 7.0 | 7000 | 1.5743 | 0.4882 | 0.3049 | 0.4435 | 0.443 | 26.207 |
| 1.8728 | 8.0 | 8000 | 1.5649 | 0.4925 | 0.3094 | 0.4479 | 0.4474 | 26.4 |
| 1.814 | 9.0 | 9000 | 1.5558 | 0.495 | 0.3113 | 0.4498 | 0.4495 | 26.383 |
| 1.8025 | 10.0 | 10000 | 1.5436 | 0.4966 | 0.3121 | 0.4517 | 0.4512 | 25.8435 |
| 1.7683 | 11.0 | 11000 | 1.5424 | 0.4987 | 0.3143 | 0.4535 | 0.453 | 25.8365 |
| 1.7299 | 12.0 | 12000 | 1.5308 | 0.4994 | 0.3147 | 0.4543 | 0.4537 | 25.718 |
| 1.7308 | 13.0 | 13000 | 1.5245 | 0.5002 | 0.3168 | 0.4554 | 0.4548 | 25.7385 |
| 1.7075 | 14.0 | 14000 | 1.5218 | 0.5028 | 0.3176 | 0.4569 | 0.4564 | 25.87 |
| 1.6969 | 15.0 | 15000 | 1.5171 | 0.5042 | 0.3194 | 0.4586 | 0.4583 | 25.7615 |
| 1.6618 | 16.0 | 16000 | 1.5138 | 0.5073 | 0.3216 | 0.4617 | 0.4609 | 25.772 |
| 1.6658 | 17.0 | 17000 | 1.5089 | 0.5051 | 0.3198 | 0.4602 | 0.4596 | 25.6465 |
| 1.6249 | 18.0 | 18000 | 1.5073 | 0.5052 | 0.3199 | 0.4604 | 0.4599 | 25.4575 |
| 1.6098 | 19.0 | 19000 | 1.5055 | 0.5068 | 0.321 | 0.4619 | 0.4614 | 26.0035 |
| 1.6018 | 20.0 | 20000 | 1.5015 | 0.5098 | 0.3244 | 0.4648 | 0.4644 | 25.4315 |
| 1.5637 | 21.0 | 21000 | 1.5027 | 0.5087 | 0.3243 | 0.4635 | 0.4633 | 26.032 |
| 1.5664 | 22.0 | 22000 | 1.5029 | 0.5118 | 0.3268 | 0.4672 | 0.4668 | 25.6305 |
| 1.561 | 23.0 | 23000 | 1.4968 | 0.5115 | 0.3255 | 0.4667 | 0.4661 | 25.7905 |
| 1.5388 | 24.0 | 24000 | 1.4997 | 0.5112 | 0.3259 | 0.4657 | 0.4653 | 26.007 |
| 1.5173 | 25.0 | 25000 | 1.4981 | 0.5129 | 0.3273 | 0.4683 | 0.4679 | 25.9415 |
| 1.5057 | 26.0 | 26000 | 1.4995 | 0.5134 | 0.3289 | 0.4692 | 0.4687 | 26.128 |
| 1.4967 | 27.0 | 27000 | 1.4973 | 0.5149 | 0.3308 | 0.4704 | 0.4701 | 25.7005 |
| 1.4755 | 28.0 | 28000 | 1.5033 | 0.5155 | 0.3304 | 0.4703 | 0.4699 | 26.4255 |
| 1.4673 | 29.0 | 29000 | 1.4995 | 0.5174 | 0.3319 | 0.4727 | 0.4725 | 25.891 |
| 1.4515 | 30.0 | 30000 | 1.5012 | 0.5158 | 0.3309 | 0.4712 | 0.4709 | 25.668 |
| 1.4502 | 31.0 | 31000 | 1.5021 | 0.518 | 0.3336 | 0.4739 | 0.4737 | 25.8405 |
| 1.4369 | 32.0 | 32000 | 1.4996 | 0.5176 | 0.333 | 0.4732 | 0.4729 | 26.093 |
| 1.4347 | 33.0 | 33000 | 1.5033 | 0.5184 | 0.3334 | 0.4731 | 0.4726 | 26.225 |
| 1.4014 | 34.0 | 34000 | 1.5044 | 0.5185 | 0.3333 | 0.4735 | 0.4733 | 26.1955 |
| 1.399 | 35.0 | 35000 | 1.5061 | 0.5192 | 0.3341 | 0.4733 | 0.473 | 26.5095 |
| 1.3941 | 36.0 | 36000 | 1.5067 | 0.5193 | 0.3343 | 0.4739 | 0.4735 | 26.2715 |
| 1.3646 | 37.0 | 37000 | 1.5060 | 0.5201 | 0.335 | 0.4753 | 0.4751 | 25.932 |
| 1.3677 | 38.0 | 38000 | 1.5046 | 0.5213 | 0.3354 | 0.4757 | 0.4751 | 26.1425 |
| 1.3623 | 39.0 | 39000 | 1.5084 | 0.5202 | 0.3342 | 0.4747 | 0.4743 | 25.9125 |
| 1.3438 | 40.0 | 40000 | 1.5103 | 0.5204 | 0.3356 | 0.4756 | 0.4752 | 26.231 |
| 1.3476 | 41.0 | 41000 | 1.5083 | 0.5203 | 0.3357 | 0.4748 | 0.4746 | 26.4745 |
| 1.3258 | 42.0 | 42000 | 1.5135 | 0.5195 | 0.3349 | 0.4744 | 0.474 | 26.265 |
| 1.3484 | 43.0 | 43000 | 1.5110 | 0.5222 | 0.3375 | 0.4762 | 0.476 | 26.4365 |
| 1.324 | 44.0 | 44000 | 1.5136 | 0.5229 | 0.3386 | 0.4781 | 0.4777 | 26.192 |
| 1.3225 | 45.0 | 45000 | 1.5148 | 0.5233 | 0.3377 | 0.477 | 0.4767 | 26.3725 |
| 1.2867 | 46.0 | 46000 | 1.5160 | 0.5224 | 0.3372 | 0.4762 | 0.4758 | 26.565 |
| 1.296 | 47.0 | 47000 | 1.5170 | 0.5224 | 0.3363 | 0.4757 | 0.4755 | 26.8325 |
| 1.2834 | 48.0 | 48000 | 1.5165 | 0.5227 | 0.3382 | 0.4772 | 0.477 | 26.5355 |
| 1.2908 | 49.0 | 49000 | 1.5216 | 0.5255 | 0.3391 | 0.4784 | 0.4782 | 26.835 |
| 1.2719 | 50.0 | 50000 | 1.5234 | 0.525 | 0.3392 | 0.4779 | 0.4775 | 26.4905 |
| 1.2768 | 51.0 | 51000 | 1.5257 | 0.5262 | 0.34 | 0.4789 | 0.4785 | 27.0845 |
| 1.2703 | 52.0 | 52000 | 1.5216 | 0.5262 | 0.3408 | 0.4798 | 0.4793 | 26.578 |
| 1.2599 | 53.0 | 53000 | 1.5270 | 0.5279 | 0.3409 | 0.4811 | 0.4809 | 26.7485 |
| 1.2502 | 54.0 | 54000 | 1.5250 | 0.5276 | 0.3412 | 0.4797 | 0.4794 | 26.8205 |
| 1.2207 | 55.0 | 55000 | 1.5278 | 0.5259 | 0.3408 | 0.4792 | 0.4789 | 26.477 |
| 1.238 | 56.0 | 56000 | 1.5276 | 0.5281 | 0.3423 | 0.4812 | 0.4809 | 26.2345 |
| 1.2199 | 57.0 | 57000 | 1.5303 | 0.5262 | 0.3413 | 0.4792 | 0.4788 | 26.818 |
| 1.2193 | 58.0 | 58000 | 1.5335 | 0.528 | 0.3421 | 0.4804 | 0.4802 | 27.0625 |
| 1.2075 | 59.0 | 59000 | 1.5330 | 0.5275 | 0.3405 | 0.4793 | 0.4791 | 27.1185 |
| 1.2096 | 60.0 | 60000 | 1.5401 | 0.5283 | 0.3421 | 0.4807 | 0.4805 | 27.2025 |
| 1.2032 | 61.0 | 61000 | 1.5377 | 0.5281 | 0.342 | 0.4806 | 0.4803 | 26.784 |
| 1.2165 | 62.0 | 62000 | 1.5378 | 0.5288 | 0.3423 | 0.4804 | 0.4802 | 27.143 |
| 1.2025 | 63.0 | 63000 | 1.5391 | 0.5275 | 0.3415 | 0.4799 | 0.4797 | 27.172 |
| 1.199 | 64.0 | 64000 | 1.5415 | 0.5303 | 0.3445 | 0.4821 | 0.4819 | 27.1665 |
| 1.1847 | 65.0 | 65000 | 1.5445 | 0.5289 | 0.3432 | 0.4815 | 0.4812 | 27.115 |
| 1.1815 | 66.0 | 66000 | 1.5482 | 0.5286 | 0.3428 | 0.4802 | 0.4801 | 27.408 |
| 1.1828 | 67.0 | 67000 | 1.5468 | 0.5299 | 0.3443 | 0.4823 | 0.4819 | 27.2485 |
| 1.1823 | 68.0 | 68000 | 1.5484 | 0.5297 | 0.3441 | 0.4813 | 0.4809 | 27.3335 |
| 1.1771 | 69.0 | 69000 | 1.5488 | 0.5305 | 0.3441 | 0.4811 | 0.4808 | 27.6115 |
| 1.1748 | 70.0 | 70000 | 1.5475 | 0.5296 | 0.3439 | 0.4814 | 0.4811 | 27.2955 |
| 1.1732 | 71.0 | 71000 | 1.5493 | 0.5304 | 0.3444 | 0.482 | 0.4818 | 27.504 |
| 1.1504 | 72.0 | 72000 | 1.5529 | 0.5305 | 0.3449 | 0.4826 | 0.4824 | 27.313 |
| 1.1497 | 73.0 | 73000 | 1.5528 | 0.5318 | 0.3466 | 0.4838 | 0.4835 | 27.463 |
| 1.1589 | 74.0 | 74000 | 1.5543 | 0.5312 | 0.3452 | 0.4826 | 0.4823 | 27.482 |
| 1.1453 | 75.0 | 75000 | 1.5561 | 0.5309 | 0.3447 | 0.4826 | 0.4822 | 27.5885 |
| 1.1451 | 76.0 | 76000 | 1.5577 | 0.5305 | 0.3445 | 0.4825 | 0.4822 | 27.3815 |
| 1.154 | 77.0 | 77000 | 1.5571 | 0.5303 | 0.3449 | 0.4828 | 0.4822 | 27.3945 |
| 1.152 | 78.0 | 78000 | 1.5572 | 0.5311 | 0.3456 | 0.4832 | 0.4828 | 27.473 |
| 1.1205 | 79.0 | 79000 | 1.5598 | 0.5317 | 0.3458 | 0.4839 | 0.4835 | 27.355 |
| 1.1376 | 80.0 | 80000 | 1.5619 | 0.5325 | 0.347 | 0.4846 | 0.4843 | 27.483 |
| 1.1391 | 81.0 | 81000 | 1.5614 | 0.5321 | 0.3465 | 0.4839 | 0.4835 | 27.7635 |
| 1.1293 | 82.0 | 82000 | 1.5632 | 0.5329 | 0.3472 | 0.4847 | 0.4842 | 27.777 |
| 1.1551 | 83.0 | 83000 | 1.5616 | 0.5323 | 0.3468 | 0.4842 | 0.4837 | 27.7005 |
| 1.1312 | 84.0 | 84000 | 1.5628 | 0.5318 | 0.3459 | 0.4835 | 0.4832 | 27.772 |
| 1.1109 | 85.0 | 85000 | 1.5654 | 0.5327 | 0.3469 | 0.4847 | 0.4843 | 27.5055 |
| 1.1371 | 86.0 | 86000 | 1.5653 | 0.534 | 0.3478 | 0.4856 | 0.4852 | 27.666 |
| 1.1355 | 87.0 | 87000 | 1.5642 | 0.5336 | 0.3481 | 0.4858 | 0.4855 | 27.617 |
| 1.1133 | 88.0 | 88000 | 1.5667 | 0.5333 | 0.3478 | 0.485 | 0.4847 | 27.725 |
| 1.1143 | 89.0 | 89000 | 1.5674 | 0.5329 | 0.3471 | 0.4849 | 0.4845 | 27.781 |
| 1.1203 | 90.0 | 90000 | 1.5673 | 0.5331 | 0.3474 | 0.4851 | 0.4846 | 27.7695 |
| 1.121 | 91.0 | 91000 | 1.5681 | 0.5333 | 0.3471 | 0.4849 | 0.4845 | 27.7595 |
| 1.0999 | 92.0 | 92000 | 1.5680 | 0.533 | 0.347 | 0.4845 | 0.4842 | 27.8525 |
| 1.1179 | 93.0 | 93000 | 1.5691 | 0.533 | 0.3473 | 0.485 | 0.4846 | 27.74 |
| 1.1057 | 94.0 | 94000 | 1.5688 | 0.5333 | 0.3476 | 0.4852 | 0.4847 | 27.6195 |
| 1.1186 | 95.0 | 95000 | 1.5687 | 0.5334 | 0.3474 | 0.4853 | 0.4849 | 27.677 |
| 1.1063 | 96.0 | 96000 | 1.5688 | 0.5329 | 0.3468 | 0.4844 | 0.4842 | 27.6925 |
| 1.0992 | 97.0 | 97000 | 1.5692 | 0.5332 | 0.3471 | 0.4849 | 0.4846 | 27.6885 |
| 1.1114 | 98.0 | 98000 | 1.5696 | 0.5328 | 0.3467 | 0.4845 | 0.4842 | 27.744 |
| 1.1085 | 99.0 | 99000 | 1.5697 | 0.5328 | 0.3468 | 0.4846 | 0.4842 | 27.744 |
| 1.101 | 100.0 | 100000 | 1.5697 | 0.5326 | 0.3464 | 0.4843 | 0.4839 | 27.734 |
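
For reference, the ROUGE columns above are the kind of F1 scores (values in [0, 1]) produced by the evaluate library's rouge metric. A hedged sketch, with placeholder predictions and references:

```python
# Sketch of computing the ROUGE columns with the evaluate library.
# predictions/references are placeholders, not the actual eval data.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["a generated summary"],
    references=["a reference summary"],
)
# scores holds rouge1, rouge2, rougeL, rougeLsum, matching the table columns.
print(scores)
```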

Framework versions

  • Transformers 4.45.1
  • Pytorch 2.2.1
  • Datasets 3.0.1
  • Tokenizers 0.20.0