metadata
license: apache-2.0
base_model: emilstabil/mt5-base_V25775_V44105
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mt5-base_V25775_V44105_V53874
  results: []
mt5-base_V25775_V44105_V53874
This model is a fine-tuned version of emilstabil/mt5-base_V25775_V44105. The training dataset is not specified in this card. At the final logged step (24500, epoch 39.52), it achieves the following results on the evaluation set:
- Loss: 2.4133
- Rouge1: 32.5092
- Rouge2: 11.7441
- Rougel: 21.6511
- Rougelsum: 26.5277
- Gen Len: 89.5536
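For readers unfamiliar with the metric, the sketch below illustrates what a ROUGE-1 score measures: the F1 of clipped unigram overlap between a reference text and a generated text. It is a simplified illustration, not the evaluation code behind the numbers above. The function name `rouge1_f` is hypothetical, tokenization is plain lowercased whitespace splitting with no stemming, and the 0 to 100 scale is an assumption based on how the scores above appear to be reported.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1 on a 0-100 scale.

    Assumptions: lowercased whitespace tokens, no stemming. The scores in
    this card were presumably computed with a dedicated ROUGE library,
    so values will not match exactly.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a unigram counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # 5 of 6 unigrams overlap on both sides, so precision = recall = 5/6.
    print(round(rouge1_f("the cat sat on the mat", "the cat lay on the mat"), 2))
```

ROUGE-2 applies the same idea to bigrams, and ROUGE-L replaces n-gram overlap with the longest common subsequence.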
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 3
- eval_batch_size: 3
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 40
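As a rough illustration of the `linear` scheduler setting above, the following sketch computes the learning rate at a given optimizer step. It rests on several assumptions: no warmup (none is listed above), decay to zero at the end of training, and roughly 620 steps per epoch, a figure estimated from the training results table in this card (24500 steps at epoch 39.52). The names `linear_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are hypothetical.

```python
BASE_LR = 5e-5         # learning_rate from the list above
NUM_EPOCHS = 40        # num_epochs from the list above
STEPS_PER_EPOCH = 620  # assumption: estimated from the log (24500 steps ~ 39.52 epochs)
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate under linear decay to zero, with optional linear warmup."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to BASE_LR during warmup.
        return BASE_LR * step / max(1, warmup_steps)
    # Decay linearly from BASE_LR (end of warmup) to 0 (end of training).
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    for step in (0, 6200, 12400, 24500):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate is half its initial value at the midpoint of training and close to zero at the last logged step.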
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
1.7944 | 0.81 | 500 | 2.1910 | 29.7996 | 11.2696 | 20.869 | 24.5115 | 82.6953 |
1.7465 | 1.61 | 1000 | 2.1442 | 29.7799 | 10.6877 | 20.6595 | 24.4396 | 80.9399 |
1.7379 | 2.42 | 1500 | 2.1823 | 30.3924 | 11.0181 | 20.9591 | 25.0604 | 87.3562 |
1.6977 | 3.23 | 2000 | 2.1876 | 29.3696 | 10.6016 | 20.5417 | 24.0967 | 78.1373 |
1.6613 | 4.03 | 2500 | 2.1891 | 29.777 | 10.8733 | 20.5695 | 24.636 | 77.4635 |
1.6424 | 4.84 | 3000 | 2.1925 | 30.5398 | 11.2902 | 21.0876 | 25.0424 | 82.794 |
1.6131 | 5.65 | 3500 | 2.2061 | 30.4751 | 11.2886 | 21.0148 | 24.9771 | 79.9099 |
1.6193 | 6.45 | 4000 | 2.2357 | 30.8465 | 11.0227 | 21.1036 | 25.1891 | 82.6738 |
1.5806 | 7.26 | 4500 | 2.2180 | 31.4661 | 11.5008 | 21.4756 | 26.0325 | 86.1202 |
1.5742 | 8.06 | 5000 | 2.2132 | 31.3554 | 11.3481 | 21.259 | 25.7304 | 86.2146 |
1.5653 | 8.87 | 5500 | 2.2133 | 32.3515 | 11.5784 | 21.9243 | 26.6567 | 90.4635 |
1.5532 | 9.68 | 6000 | 2.2253 | 31.1892 | 11.2645 | 21.1858 | 25.8852 | 87.5408 |
1.5142 | 10.48 | 6500 | 2.2360 | 30.1483 | 10.9003 | 20.9238 | 24.7488 | 78.4335 |
1.5105 | 11.29 | 7000 | 2.2462 | 31.1562 | 11.3171 | 21.3149 | 25.5669 | 85.0 |
1.5068 | 12.1 | 7500 | 2.2288 | 30.1954 | 11.2925 | 20.9437 | 24.9113 | 76.6094 |
1.483 | 12.9 | 8000 | 2.2445 | 30.4498 | 11.3156 | 21.0888 | 25.0539 | 79.2704 |
1.4544 | 13.71 | 8500 | 2.2285 | 31.6744 | 11.7017 | 21.964 | 26.3215 | 85.2146 |
1.4833 | 14.52 | 9000 | 2.2336 | 31.2326 | 11.3786 | 21.2688 | 25.5345 | 83.176 |
1.4305 | 15.32 | 9500 | 2.2555 | 31.1458 | 11.109 | 21.1361 | 25.4995 | 86.5408 |
1.4607 | 16.13 | 10000 | 2.2693 | 31.2104 | 11.5511 | 21.4548 | 25.669 | 84.133 |
1.4181 | 16.94 | 10500 | 2.2606 | 32.0839 | 11.4895 | 21.353 | 26.02 | 90.1888 |
1.4191 | 17.74 | 11000 | 2.2547 | 32.0803 | 11.6566 | 21.7206 | 26.3547 | 86.5494 |
1.4009 | 18.55 | 11500 | 2.2888 | 31.2863 | 11.5981 | 21.498 | 25.6535 | 82.5665 |
1.3916 | 19.35 | 12000 | 2.2781 | 31.6163 | 11.2085 | 21.253 | 25.8589 | 90.6009 |
1.3915 | 20.16 | 12500 | 2.2871 | 31.398 | 11.2152 | 21.41 | 25.7807 | 84.1245 |
1.3778 | 20.97 | 13000 | 2.2808 | 31.9543 | 11.5922 | 21.6471 | 26.1187 | 87.9871 |
1.3398 | 21.77 | 13500 | 2.3114 | 32.5911 | 11.7559 | 21.6985 | 26.4832 | 90.2618 |
1.3669 | 22.58 | 14000 | 2.3005 | 32.2284 | 11.8151 | 21.8298 | 26.256 | 89.2532 |
1.3159 | 23.39 | 14500 | 2.3152 | 32.189 | 11.6752 | 21.6752 | 26.4623 | 89.6524 |
1.3231 | 24.19 | 15000 | 2.3172 | 32.2582 | 11.7664 | 21.7995 | 26.5449 | 88.6524 |
1.3014 | 25.0 | 15500 | 2.3247 | 32.3611 | 11.6169 | 21.7312 | 26.5212 | 89.176 |
1.2752 | 25.81 | 16000 | 2.3349 | 32.0774 | 11.8314 | 21.7343 | 26.6137 | 88.4077 |
1.2787 | 26.61 | 16500 | 2.3302 | 31.7149 | 11.4065 | 21.3784 | 26.1065 | 88.1202 |
1.2728 | 27.42 | 17000 | 2.3484 | 32.359 | 11.7853 | 21.8351 | 26.4675 | 88.4807 |
1.2524 | 28.23 | 17500 | 2.3529 | 32.1259 | 11.8012 | 21.6175 | 26.1721 | 88.4206 |
1.236 | 29.03 | 18000 | 2.3635 | 32.0371 | 11.7357 | 21.7101 | 26.387 | 87.5665 |
1.2356 | 29.84 | 18500 | 2.3694 | 32.4209 | 11.4981 | 21.558 | 26.5013 | 91.9614 |
1.2239 | 30.65 | 19000 | 2.3739 | 32.2042 | 11.6382 | 21.6439 | 26.3635 | 88.5107 |
1.2158 | 31.45 | 19500 | 2.3792 | 32.6755 | 11.8155 | 21.7073 | 26.7322 | 89.9871 |
1.2084 | 32.26 | 20000 | 2.3922 | 33.1023 | 11.7153 | 21.9296 | 27.1142 | 92.3906 |
1.1994 | 33.06 | 20500 | 2.3991 | 32.6802 | 11.4579 | 21.5642 | 26.6404 | 93.0215 |
1.2011 | 33.87 | 21000 | 2.3956 | 32.9197 | 11.8239 | 21.8725 | 26.8542 | 92.1803 |
1.1993 | 34.68 | 21500 | 2.4024 | 32.1903 | 11.579 | 21.597 | 26.5418 | 91.4335 |
1.1688 | 35.48 | 22000 | 2.3975 | 32.4983 | 11.6353 | 21.5989 | 26.5309 | 89.3648 |
1.1969 | 36.29 | 22500 | 2.4042 | 32.8631 | 11.8492 | 21.8471 | 26.847 | 90.3433 |
1.1595 | 37.1 | 23000 | 2.4141 | 32.708 | 11.7882 | 21.7535 | 26.6902 | 90.6609 |
1.1755 | 37.9 | 23500 | 2.4188 | 32.552 | 11.8842 | 21.8309 | 26.8171 | 91.3305 |
1.1613 | 38.71 | 24000 | 2.4159 | 32.3059 | 11.6832 | 21.7439 | 26.5204 | 89.7639 |
1.1549 | 39.52 | 24500 | 2.4133 | 32.5092 | 11.7441 | 21.6511 | 26.5277 | 89.5536 |
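In the table above, validation loss is lowest early in training (2.1442 at step 1000) and rises afterwards, while ROUGE-1 peaks much later (33.1023 at step 20000). Which checkpoint counts as "best" therefore depends on the selection metric. The sketch below shows this on a handful of rows copied from the table. The variable names are hypothetical, and the card does not state which criterion, if any, was used to pick the published weights; the headline results match the last logged step.

```python
# (step, validation_loss, rouge1) for a subset of rows copied from the table above.
ROWS = [
    (500, 2.1910, 29.7996),
    (1000, 2.1442, 29.7799),
    (5500, 2.2133, 32.3515),
    (13500, 2.3114, 32.5911),
    (20000, 2.3922, 33.1023),
    (24500, 2.4133, 32.5092),
]

# Lower validation loss is better; higher ROUGE-1 is better.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge1 = max(ROWS, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss at step", best_by_loss[0])
    print("highest ROUGE-1 at step", best_by_rouge1[0])
```

The two criteria select different checkpoints here, so anyone reusing this training setup may want to decide up front which metric drives checkpoint selection.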
Framework versions
- Transformers 4.32.1
- PyTorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3