UrukHan/t5-russian-spell

@@ -15,12 +15,12 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [sberbank-ai/ruT5-base](https://huggingface.co/sberbank-ai/ruT5-base) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.4066
-- Rouge1: 44.2214
-- Rouge2: 21.688
-- Rougel: 44.2793
-- Rougelsum: 44.0781
-- Gen Len: 60.87
 ## Model description
@@ -45,17 +45,32 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 1
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
 |:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
-| 0.2958        | 0.2   | 2500  | 0.4393          | 43.9635 | 21.3982 | 43.9784 | 43.8423   | 61.338  |
-| 0.2427        | 0.4   | 5000  | 0.4460          | 44.609  | 22.1448 | 44.6314 | 44.4817   | 61.028  |
-| 0.5326        | 0.6   | 7500  | 0.4100          | 44.7071 | 21.9365 | 44.7491 | 44.5944   | 60.844  |
-| 0.5262        | 0.8   | 10000 | 0.4066          | 44.2214 | 21.688  | 44.2793 | 44.0781   | 60.87   |
 ### Framework versions

 This model is a fine-tuned version of [sberbank-ai/ruT5-base](https://huggingface.co/sberbank-ai/ruT5-base) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.3676
+- Rouge1: 45.1151
+- Rouge2: 22.4675
+- Rougel: 45.0866
+- Rougelsum: 44.9917
+- Gen Len: 60.922
 ## Model description
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 4
 - mixed_precision_training: Native AMP
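
For readers who want to reproduce the updated run, the hyperparameters listed above can be sketched as keyword arguments. This assumes the card was generated by the Hugging Face `Trainer`, so the key names follow `transformers.Seq2SeqTrainingArguments`. The hunk does not show the learning rate, batch size, or warmup settings, so they are left out here rather than guessed.

```python
# Sketch only: the hyperparameters visible in this hunk, expressed with the
# keyword names used by transformers.Seq2SeqTrainingArguments (an assumption;
# the diff does not show the actual training script).
training_kwargs = {
    "seed": 42,
    "adam_beta1": 0.9,             # Adam betas=(0.9, 0.999)
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 4,         # raised from 1 in this commit
    "fp16": True,                  # "Native AMP" mixed precision; needs a CUDA device
}

# Hypothetical usage ("out" is a placeholder directory, not from the card):
#   from transformers import Seq2SeqTrainingArguments
#   args = Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)

for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```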
 ### Training results
 | Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
 |:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
+| 0.1709        | 0.2   | 2500  | 0.4521          | 44.2702 | 21.5514 | 44.2338 | 44.0689   | 61.032  |
+| 0.1557        | 0.4   | 5000  | 0.4639          | 44.2613 | 21.8757 | 44.2442 | 44.0816   | 60.914  |
+| 0.4276        | 0.6   | 7500  | 0.4139          | 45.2125 | 22.268  | 45.1434 | 44.9713   | 60.494  |
+| 0.4675        | 0.8   | 10000 | 0.4016          | 44.2872 | 22.2143 | 44.27   | 44.0862   | 61.018  |
+| 0.5048        | 1.0   | 12500 | 0.3923          | 44.9732 | 22.2551 | 45.0251 | 44.822    | 60.952  |
+| 0.4362        | 1.21  | 15000 | 0.3920          | 44.8982 | 21.9817 | 44.8949 | 44.7051   | 61.29   |
+| 0.426         | 1.41  | 17500 | 0.3879          | 45.4473 | 22.5263 | 45.4284 | 45.2483   | 60.674  |
+| 0.4174        | 1.61  | 20000 | 0.3832          | 45.4006 | 22.2695 | 45.382  | 45.2161   | 60.92   |
+| 0.4229        | 1.81  | 22500 | 0.3774          | 45.2545 | 22.2894 | 45.2335 | 45.065    | 60.722  |
+| 0.4071        | 2.01  | 25000 | 0.3782          | 45.2875 | 22.4234 | 45.2902 | 45.1445   | 61.138  |
+| 0.3966        | 2.21  | 27500 | 0.3782          | 45.1692 | 22.197  | 45.2311 | 45.0222   | 60.68   |
+| 0.389         | 2.41  | 30000 | 0.3744          | 45.6209 | 22.5031 | 45.6023 | 45.4973   | 60.878  |
+| 0.3896        | 2.61  | 32500 | 0.3718          | 45.2454 | 22.4507 | 45.2479 | 45.1446   | 60.76   |
+| 0.3961        | 2.81  | 35000 | 0.3711          | 45.2779 | 22.4165 | 45.2661 | 45.1617   | 60.984  |
+| 0.3765        | 3.01  | 37500 | 0.3705          | 45.1666 | 22.6603 | 45.0916 | 44.9798   | 60.994  |
+| 0.3757        | 3.22  | 40000 | 0.3709          | 45.1587 | 22.4539 | 45.1129 | 45.0461   | 60.828  |
+| 0.3776        | 3.42  | 42500 | 0.3688          | 45.247  | 22.6266 | 45.2351 | 45.1111   | 60.93   |
+| 0.3691        | 3.62  | 45000 | 0.3693          | 45.3799 | 22.5152 | 45.3839 | 45.2705   | 60.846  |
+| 0.3786        | 3.82  | 47500 | 0.3676          | 45.1151 | 22.4675 | 45.0866 | 44.9917   | 60.922  |
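
A quick way to read the updated table is to check which evaluation step was best on each metric. The sketch below copies the step, validation loss, and Rouge1 columns from the added rows. The helper names are illustrative and not part of the model card.

```python
# (step, validation_loss, rouge1) copied from the added (+) rows of the table.
EVAL_LOG = [
    (2500, 0.4521, 44.2702),
    (5000, 0.4639, 44.2613),
    (7500, 0.4139, 45.2125),
    (10000, 0.4016, 44.2872),
    (12500, 0.3923, 44.9732),
    (15000, 0.3920, 44.8982),
    (17500, 0.3879, 45.4473),
    (20000, 0.3832, 45.4006),
    (22500, 0.3774, 45.2545),
    (25000, 0.3782, 45.2875),
    (27500, 0.3782, 45.1692),
    (30000, 0.3744, 45.6209),
    (32500, 0.3718, 45.2454),
    (35000, 0.3711, 45.2779),
    (37500, 0.3705, 45.1666),
    (40000, 0.3709, 45.1587),
    (42500, 0.3688, 45.2470),
    (45000, 0.3693, 45.3799),
    (47500, 0.3676, 45.1151),
]


def best_by_loss(log):
    """Return the row with the lowest validation loss."""
    return min(log, key=lambda row: row[1])


def best_by_rouge1(log):
    """Return the row with the highest Rouge1 score."""
    return max(log, key=lambda row: row[2])


loss_step, loss_value, _ = best_by_loss(EVAL_LOG)
rouge_step, _, rouge_value = best_by_rouge1(EVAL_LOG)
print(f"lowest validation loss: {loss_value} at step {loss_step}")
print(f"highest Rouge1: {rouge_value} at step {rouge_step}")
```

The lowest validation loss falls on the final logged step (47500), which is the row whose values match the headline evaluation results, while the highest Rouge1 occurs earlier, at step 30000.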
 ### Framework versions