emilstabil committed · Commit 6d87c30 · Parent(s): a32f264

End of training

Files changed:
- README.md (+58 -33)
- pytorch_model.bin (+1 -1)

README.md
CHANGED
@@ -6,23 +6,23 @@ tags:
metrics:
- rouge
model-index:
- name: mt5-base_V25775_V44105_V53874
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# mt5-base_V25775_V44105_V53874

This model is a fine-tuned version of [emilstabil/mt5-base_V25775_V44105](https://huggingface.co/emilstabil/mt5-base_V25775_V44105) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 2.4133
- Rouge1: 32.5092
- Rouge2: 11.7441
- Rougel: 21.6511
- Rougelsum: 26.5277
- Gen Len: 89.5536
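For readers who want to try the checkpoint, the sketch below shows a minimal generation call with the `transformers` Auto classes. The repository id `emilstabil/mt5-base_V25775_V44105_V53874` and the decoding settings are assumptions inferred from the model name and the reported Gen Len; the card itself does not document the intended task or generation parameters.

```python
# Minimal inference sketch (assumed Hub id and decoding settings; the card
# does not state the task or generation parameters).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "emilstabil/mt5-base_V25775_V44105_V53874"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Input document to condense goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
output_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```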
## Model description

@@ -47,36 +47,61 @@ The following hyperparameters were used during training:
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 40
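A rough `Seq2SeqTrainingArguments` sketch consistent with the settings listed above is given below. Only the seed, Adam betas/epsilon, the linear scheduler, and the 40 epochs come from this card; the output directory, evaluation interval, and anything else (learning rate, batch sizes) are assumptions.

```python
# Hedged sketch of the training configuration implied above; values not listed
# in the card (output_dir, eval_steps, etc.) are assumptions for illustration.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-base_V25775_V44105_V53874",  # assumed
    seed=42,
    num_train_epochs=40,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",
    eval_steps=500,              # assumed; matches the 500-step rows in the results table
    predict_with_generate=True,  # generation-based metrics (ROUGE, Gen Len) need this
)
```

Setting `predict_with_generate=True` is what allows the Trainer to report the ROUGE and Gen Len columns in the table below.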
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.7944 | 0.81 | 500 | 2.1910 | 29.7996 | 11.2696 | 20.869 | 24.5115 | 82.6953 |
| 1.7465 | 1.61 | 1000 | 2.1442 | 29.7799 | 10.6877 | 20.6595 | 24.4396 | 80.9399 |
| 1.7379 | 2.42 | 1500 | 2.1823 | 30.3924 | 11.0181 | 20.9591 | 25.0604 | 87.3562 |
| 1.6977 | 3.23 | 2000 | 2.1876 | 29.3696 | 10.6016 | 20.5417 | 24.0967 | 78.1373 |
| 1.6613 | 4.03 | 2500 | 2.1891 | 29.777 | 10.8733 | 20.5695 | 24.636 | 77.4635 |
| 1.6424 | 4.84 | 3000 | 2.1925 | 30.5398 | 11.2902 | 21.0876 | 25.0424 | 82.794 |
| 1.6131 | 5.65 | 3500 | 2.2061 | 30.4751 | 11.2886 | 21.0148 | 24.9771 | 79.9099 |
| 1.6193 | 6.45 | 4000 | 2.2357 | 30.8465 | 11.0227 | 21.1036 | 25.1891 | 82.6738 |
| 1.5806 | 7.26 | 4500 | 2.2180 | 31.4661 | 11.5008 | 21.4756 | 26.0325 | 86.1202 |
| 1.5742 | 8.06 | 5000 | 2.2132 | 31.3554 | 11.3481 | 21.259 | 25.7304 | 86.2146 |
| 1.5653 | 8.87 | 5500 | 2.2133 | 32.3515 | 11.5784 | 21.9243 | 26.6567 | 90.4635 |
| 1.5532 | 9.68 | 6000 | 2.2253 | 31.1892 | 11.2645 | 21.1858 | 25.8852 | 87.5408 |
| 1.5142 | 10.48 | 6500 | 2.2360 | 30.1483 | 10.9003 | 20.9238 | 24.7488 | 78.4335 |
| 1.5105 | 11.29 | 7000 | 2.2462 | 31.1562 | 11.3171 | 21.3149 | 25.5669 | 85.0 |
| 1.5068 | 12.1 | 7500 | 2.2288 | 30.1954 | 11.2925 | 20.9437 | 24.9113 | 76.6094 |
| 1.483 | 12.9 | 8000 | 2.2445 | 30.4498 | 11.3156 | 21.0888 | 25.0539 | 79.2704 |
| 1.4544 | 13.71 | 8500 | 2.2285 | 31.6744 | 11.7017 | 21.964 | 26.3215 | 85.2146 |
| 1.4833 | 14.52 | 9000 | 2.2336 | 31.2326 | 11.3786 | 21.2688 | 25.5345 | 83.176 |
| 1.4305 | 15.32 | 9500 | 2.2555 | 31.1458 | 11.109 | 21.1361 | 25.4995 | 86.5408 |
| 1.4607 | 16.13 | 10000 | 2.2693 | 31.2104 | 11.5511 | 21.4548 | 25.669 | 84.133 |
| 1.4181 | 16.94 | 10500 | 2.2606 | 32.0839 | 11.4895 | 21.353 | 26.02 | 90.1888 |
| 1.4191 | 17.74 | 11000 | 2.2547 | 32.0803 | 11.6566 | 21.7206 | 26.3547 | 86.5494 |
| 1.4009 | 18.55 | 11500 | 2.2888 | 31.2863 | 11.5981 | 21.498 | 25.6535 | 82.5665 |
| 1.3916 | 19.35 | 12000 | 2.2781 | 31.6163 | 11.2085 | 21.253 | 25.8589 | 90.6009 |
| 1.3915 | 20.16 | 12500 | 2.2871 | 31.398 | 11.2152 | 21.41 | 25.7807 | 84.1245 |
| 1.3778 | 20.97 | 13000 | 2.2808 | 31.9543 | 11.5922 | 21.6471 | 26.1187 | 87.9871 |
| 1.3398 | 21.77 | 13500 | 2.3114 | 32.5911 | 11.7559 | 21.6985 | 26.4832 | 90.2618 |
| 1.3669 | 22.58 | 14000 | 2.3005 | 32.2284 | 11.8151 | 21.8298 | 26.256 | 89.2532 |
| 1.3159 | 23.39 | 14500 | 2.3152 | 32.189 | 11.6752 | 21.6752 | 26.4623 | 89.6524 |
| 1.3231 | 24.19 | 15000 | 2.3172 | 32.2582 | 11.7664 | 21.7995 | 26.5449 | 88.6524 |
| 1.3014 | 25.0 | 15500 | 2.3247 | 32.3611 | 11.6169 | 21.7312 | 26.5212 | 89.176 |
| 1.2752 | 25.81 | 16000 | 2.3349 | 32.0774 | 11.8314 | 21.7343 | 26.6137 | 88.4077 |
| 1.2787 | 26.61 | 16500 | 2.3302 | 31.7149 | 11.4065 | 21.3784 | 26.1065 | 88.1202 |
| 1.2728 | 27.42 | 17000 | 2.3484 | 32.359 | 11.7853 | 21.8351 | 26.4675 | 88.4807 |
| 1.2524 | 28.23 | 17500 | 2.3529 | 32.1259 | 11.8012 | 21.6175 | 26.1721 | 88.4206 |
| 1.236 | 29.03 | 18000 | 2.3635 | 32.0371 | 11.7357 | 21.7101 | 26.387 | 87.5665 |
| 1.2356 | 29.84 | 18500 | 2.3694 | 32.4209 | 11.4981 | 21.558 | 26.5013 | 91.9614 |
| 1.2239 | 30.65 | 19000 | 2.3739 | 32.2042 | 11.6382 | 21.6439 | 26.3635 | 88.5107 |
| 1.2158 | 31.45 | 19500 | 2.3792 | 32.6755 | 11.8155 | 21.7073 | 26.7322 | 89.9871 |
| 1.2084 | 32.26 | 20000 | 2.3922 | 33.1023 | 11.7153 | 21.9296 | 27.1142 | 92.3906 |
| 1.1994 | 33.06 | 20500 | 2.3991 | 32.6802 | 11.4579 | 21.5642 | 26.6404 | 93.0215 |
| 1.2011 | 33.87 | 21000 | 2.3956 | 32.9197 | 11.8239 | 21.8725 | 26.8542 | 92.1803 |
| 1.1993 | 34.68 | 21500 | 2.4024 | 32.1903 | 11.579 | 21.597 | 26.5418 | 91.4335 |
| 1.1688 | 35.48 | 22000 | 2.3975 | 32.4983 | 11.6353 | 21.5989 | 26.5309 | 89.3648 |
| 1.1969 | 36.29 | 22500 | 2.4042 | 32.8631 | 11.8492 | 21.8471 | 26.847 | 90.3433 |
| 1.1595 | 37.1 | 23000 | 2.4141 | 32.708 | 11.7882 | 21.7535 | 26.6902 | 90.6609 |
| 1.1755 | 37.9 | 23500 | 2.4188 | 32.552 | 11.8842 | 21.8309 | 26.8171 | 91.3305 |
| 1.1613 | 38.71 | 24000 | 2.4159 | 32.3059 | 11.6832 | 21.7439 | 26.5204 | 89.7639 |
| 1.1549 | 39.52 | 24500 | 2.4133 | 32.5092 | 11.7441 | 21.6511 | 26.5277 | 89.5536 |

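The ROUGE columns above appear to be F-measures on a 0-100 scale, as the Trainer's summarization scripts typically report them. The sketch below shows one way such numbers can be computed with the Hugging Face `evaluate` library; the predictions and references are placeholders, since the card does not name the evaluation data.

```python
# Hedged ROUGE sketch; the texts are placeholders, not the actual validation set.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["model generated summary"]  # placeholder
references = ["reference summary"]         # placeholder
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# Scale to 0-100 to match the Rouge1 / Rouge2 / Rougel / Rougelsum columns above.
print({name: round(value * 100, 4) for name, value in scores.items()})
```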
### Framework versions
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:e889e36647186a17a240dc6a41ee395ca4793e56e6d94343040ff9abfea2a65c
 size 2329703026