mt5-base_V25775_V44105_V53874

This model is a fine-tuned version of emilstabil/mt5-base_V25775_V44105 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4133
  • Rouge1: 32.5092
  • Rouge2: 11.7441
  • Rougel: 21.6511
  • Rougelsum: 26.5277
  • Gen Len: 89.5536
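
The card does not include usage code, but the mT5 base and the ROUGE metrics suggest abstractive summarization. A minimal inference sketch under that assumption follows; the generation settings (max_length, num_beams) are illustrative, not taken from the training run:

```python
# Minimal sketch (assumption: summarization checkpoint). Generation settings
# are illustrative; the eval-set Gen Len of ~90 tokens motivates max_length=128.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "emilstabil/mt5-base_V25775_V44105_V53874"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "..."  # input document to summarize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

outputs = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```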

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 3
  • eval_batch_size: 3
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
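
These settings match the standard Hugging Face Trainer setup. A hedged sketch of how they might map onto Seq2SeqTrainingArguments; the training script is not published, so output_dir and the evaluation/generation options below are assumptions inferred from the 500-step evaluation interval in the results table:

```python
from transformers import Seq2SeqTrainingArguments

# Hedged reconstruction: only the values listed above are documented;
# output_dir and the eval settings are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-base_V25775_V44105_V53874",  # illustrative path
    learning_rate=5e-5,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=3,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=40,
    evaluation_strategy="steps",  # the results table logs metrics every 500 steps
    eval_steps=500,
    predict_with_generate=True,   # required for ROUGE and Gen Len during eval
)
```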

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 1.7944 | 0.81 | 500 | 2.1910 | 29.7996 | 11.2696 | 20.869 | 24.5115 | 82.6953 |
| 1.7465 | 1.61 | 1000 | 2.1442 | 29.7799 | 10.6877 | 20.6595 | 24.4396 | 80.9399 |
| 1.7379 | 2.42 | 1500 | 2.1823 | 30.3924 | 11.0181 | 20.9591 | 25.0604 | 87.3562 |
| 1.6977 | 3.23 | 2000 | 2.1876 | 29.3696 | 10.6016 | 20.5417 | 24.0967 | 78.1373 |
| 1.6613 | 4.03 | 2500 | 2.1891 | 29.777 | 10.8733 | 20.5695 | 24.636 | 77.4635 |
| 1.6424 | 4.84 | 3000 | 2.1925 | 30.5398 | 11.2902 | 21.0876 | 25.0424 | 82.794 |
| 1.6131 | 5.65 | 3500 | 2.2061 | 30.4751 | 11.2886 | 21.0148 | 24.9771 | 79.9099 |
| 1.6193 | 6.45 | 4000 | 2.2357 | 30.8465 | 11.0227 | 21.1036 | 25.1891 | 82.6738 |
| 1.5806 | 7.26 | 4500 | 2.2180 | 31.4661 | 11.5008 | 21.4756 | 26.0325 | 86.1202 |
| 1.5742 | 8.06 | 5000 | 2.2132 | 31.3554 | 11.3481 | 21.259 | 25.7304 | 86.2146 |
| 1.5653 | 8.87 | 5500 | 2.2133 | 32.3515 | 11.5784 | 21.9243 | 26.6567 | 90.4635 |
| 1.5532 | 9.68 | 6000 | 2.2253 | 31.1892 | 11.2645 | 21.1858 | 25.8852 | 87.5408 |
| 1.5142 | 10.48 | 6500 | 2.2360 | 30.1483 | 10.9003 | 20.9238 | 24.7488 | 78.4335 |
| 1.5105 | 11.29 | 7000 | 2.2462 | 31.1562 | 11.3171 | 21.3149 | 25.5669 | 85.0 |
| 1.5068 | 12.1 | 7500 | 2.2288 | 30.1954 | 11.2925 | 20.9437 | 24.9113 | 76.6094 |
| 1.483 | 12.9 | 8000 | 2.2445 | 30.4498 | 11.3156 | 21.0888 | 25.0539 | 79.2704 |
| 1.4544 | 13.71 | 8500 | 2.2285 | 31.6744 | 11.7017 | 21.964 | 26.3215 | 85.2146 |
| 1.4833 | 14.52 | 9000 | 2.2336 | 31.2326 | 11.3786 | 21.2688 | 25.5345 | 83.176 |
| 1.4305 | 15.32 | 9500 | 2.2555 | 31.1458 | 11.109 | 21.1361 | 25.4995 | 86.5408 |
| 1.4607 | 16.13 | 10000 | 2.2693 | 31.2104 | 11.5511 | 21.4548 | 25.669 | 84.133 |
| 1.4181 | 16.94 | 10500 | 2.2606 | 32.0839 | 11.4895 | 21.353 | 26.02 | 90.1888 |
| 1.4191 | 17.74 | 11000 | 2.2547 | 32.0803 | 11.6566 | 21.7206 | 26.3547 | 86.5494 |
| 1.4009 | 18.55 | 11500 | 2.2888 | 31.2863 | 11.5981 | 21.498 | 25.6535 | 82.5665 |
| 1.3916 | 19.35 | 12000 | 2.2781 | 31.6163 | 11.2085 | 21.253 | 25.8589 | 90.6009 |
| 1.3915 | 20.16 | 12500 | 2.2871 | 31.398 | 11.2152 | 21.41 | 25.7807 | 84.1245 |
| 1.3778 | 20.97 | 13000 | 2.2808 | 31.9543 | 11.5922 | 21.6471 | 26.1187 | 87.9871 |
| 1.3398 | 21.77 | 13500 | 2.3114 | 32.5911 | 11.7559 | 21.6985 | 26.4832 | 90.2618 |
| 1.3669 | 22.58 | 14000 | 2.3005 | 32.2284 | 11.8151 | 21.8298 | 26.256 | 89.2532 |
| 1.3159 | 23.39 | 14500 | 2.3152 | 32.189 | 11.6752 | 21.6752 | 26.4623 | 89.6524 |
| 1.3231 | 24.19 | 15000 | 2.3172 | 32.2582 | 11.7664 | 21.7995 | 26.5449 | 88.6524 |
| 1.3014 | 25.0 | 15500 | 2.3247 | 32.3611 | 11.6169 | 21.7312 | 26.5212 | 89.176 |
| 1.2752 | 25.81 | 16000 | 2.3349 | 32.0774 | 11.8314 | 21.7343 | 26.6137 | 88.4077 |
| 1.2787 | 26.61 | 16500 | 2.3302 | 31.7149 | 11.4065 | 21.3784 | 26.1065 | 88.1202 |
| 1.2728 | 27.42 | 17000 | 2.3484 | 32.359 | 11.7853 | 21.8351 | 26.4675 | 88.4807 |
| 1.2524 | 28.23 | 17500 | 2.3529 | 32.1259 | 11.8012 | 21.6175 | 26.1721 | 88.4206 |
| 1.236 | 29.03 | 18000 | 2.3635 | 32.0371 | 11.7357 | 21.7101 | 26.387 | 87.5665 |
| 1.2356 | 29.84 | 18500 | 2.3694 | 32.4209 | 11.4981 | 21.558 | 26.5013 | 91.9614 |
| 1.2239 | 30.65 | 19000 | 2.3739 | 32.2042 | 11.6382 | 21.6439 | 26.3635 | 88.5107 |
| 1.2158 | 31.45 | 19500 | 2.3792 | 32.6755 | 11.8155 | 21.7073 | 26.7322 | 89.9871 |
| 1.2084 | 32.26 | 20000 | 2.3922 | 33.1023 | 11.7153 | 21.9296 | 27.1142 | 92.3906 |
| 1.1994 | 33.06 | 20500 | 2.3991 | 32.6802 | 11.4579 | 21.5642 | 26.6404 | 93.0215 |
| 1.2011 | 33.87 | 21000 | 2.3956 | 32.9197 | 11.8239 | 21.8725 | 26.8542 | 92.1803 |
| 1.1993 | 34.68 | 21500 | 2.4024 | 32.1903 | 11.579 | 21.597 | 26.5418 | 91.4335 |
| 1.1688 | 35.48 | 22000 | 2.3975 | 32.4983 | 11.6353 | 21.5989 | 26.5309 | 89.3648 |
| 1.1969 | 36.29 | 22500 | 2.4042 | 32.8631 | 11.8492 | 21.8471 | 26.847 | 90.3433 |
| 1.1595 | 37.1 | 23000 | 2.4141 | 32.708 | 11.7882 | 21.7535 | 26.6902 | 90.6609 |
| 1.1755 | 37.9 | 23500 | 2.4188 | 32.552 | 11.8842 | 21.8309 | 26.8171 | 91.3305 |
| 1.1613 | 38.71 | 24000 | 2.4159 | 32.3059 | 11.6832 | 21.7439 | 26.5204 | 89.7639 |
| 1.1549 | 39.52 | 24500 | 2.4133 | 32.5092 | 11.7441 | 21.6511 | 26.5277 | 89.5536 |
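
The ROUGE values above are on the 0-100 scale that the Hugging Face summarization examples report. A minimal sketch of computing such scores with the evaluate library (an assumption; the card does not publish its metric code):

```python
import evaluate

# Hedged sketch: mirrors the Hugging Face example scripts, which load the
# "rouge" metric and scale its [0, 1] outputs by 100. The card itself does
# not include evaluation code, so this is an assumption.
rouge = evaluate.load("rouge")

predictions = ["a model-generated summary"]        # illustrative
references = ["a human-written reference summary"]  # illustrative

scores = rouge.compute(predictions=predictions, references=references)
print({k: round(v * 100, 4) for k, v in scores.items()})
```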

Framework versions

  • Transformers 4.32.1
  • PyTorch 2.1.0
  • Datasets 2.12.0
  • Tokenizers 0.13.3

Model tree for emilstabil/mt5-base_V25775_V44105_V53874

  • Base model: google/mt5-base