n3wtou/mt5-swatf

@@ -16,12 +16,12 @@ model-index:
       name: xlsum
       type: xlsum
       config: swahili
-      split: test
       args: swahili
     metrics:
     - name: Rouge1
       type: rouge
-      value: 9.6904
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -32,11 +32,11 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the xlsum dataset.
 It achieves the following results on the evaluation set:
 - Loss: nan
-- Rouge1: 9.6904
-- Rouge2: 1.3302
-- Rougel: 8.4948
-- Rougelsum: 8.497
-- Gen Len: 685.8156
 ## Model description
@@ -55,24 +55,31 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
-- train_batch_size: 8
-- eval_batch_size: 4
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 5
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len  |
-|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:--------:|
-| No log        | 1.0   | 188  | nan             | 9.6904 | 1.3302 | 8.4948 | 8.497     | 685.8156 |
-| No log        | 2.0   | 376  | nan             | 9.6904 | 1.3302 | 8.4948 | 8.497     | 685.8156 |
-| 0.0           | 3.0   | 564  | nan             | 9.6904 | 1.3302 | 8.4948 | 8.497     | 685.8156 |
-| 0.0           | 4.0   | 752  | nan             | 9.6904 | 1.3302 | 8.4948 | 8.497     | 685.8156 |
-| 0.0           | 5.0   | 940  | nan             | 9.6904 | 1.3302 | 8.4948 | 8.497     | 685.8156 |
 ### Framework versions

       name: xlsum
       type: xlsum
       config: swahili
+      split: validation
       args: swahili
     metrics:
     - name: Rouge1
       type: rouge
+      value: 9.7053
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the xlsum dataset.
 It achieves the following results on the evaluation set:
 - Loss: nan
+- Rouge1: 9.7053
+- Rouge2: 1.3021
+- Rougel: 8.4306
+- Rougelsum: 8.4159
+- Gen Len: 683.08
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 4e-05
+- train_batch_size: 4
+- eval_batch_size: 3
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 10
 - mixed_precision_training: Native AMP
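
As a rough illustration of what `lr_scheduler_type: linear` implies for the updated `learning_rate: 4e-05`, the sketch below computes a linearly decaying learning rate. The total of 6250 optimizer steps is an assumption inferred from the training log (500 steps per 0.8 epoch, 10 epochs), and zero warmup is also an assumption, since the card lists no warmup setting. `linear_lr` is a hypothetical helper, not the Trainer's own code.

```python
def linear_lr(step, base_lr=4e-05, total_steps=6250, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero.

    total_steps=6250 and warmup_steps=0 are assumptions, not values
    stated in the model card.
    """
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 3125, 6250):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 4e-05, is halved at the midpoint, and reaches zero at the final step.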
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
+| 0.0           | 0.8   | 500  | nan             | 9.7053 | 1.3021 | 8.4306 | 8.4159    | 683.08  |
+| 0.0           | 1.6   | 1000 | nan             | 9.7053 | 1.3021 | 8.4306 | 8.4159    | 683.08  |
+| 0.0           | 2.4   | 1500 | nan             | 9.7053 | 1.3021 | 8.4306 | 8.4159    | 683.08  |
+| 0.0           | 3.2   | 2000 | nan             | 9.7053 | 1.3021 | 8.4306 | 8.4159    | 683.08  |
+| 0.0           | 4.0   | 2500 | nan             | 9.7053 | 1.3021 | 8.4306 | 8.4159    | 683.08  |
+| 0.0           | 4.8   | 3000 | nan             | 9.7053 | 1.3021 | 8.4306 | 8.4159    | 683.08  |
+| 0.0           | 5.6   | 3500 | nan             | 9.7053 | 1.3021 | 8.4306 | 8.4159    | 683.08  |
+| 0.0           | 6.4   | 4000 | nan             | 9.7053 | 1.3021 | 8.4306 | 8.4159    | 683.08  |
+| 0.0           | 7.2   | 4500 | nan             | 9.7053 | 1.3021 | 8.4306 | 8.4159    | 683.08  |
+| 0.0           | 8.0   | 5000 | nan             | 9.7053 | 1.3021 | 8.4306 | 8.4159    | 683.08  |
+| 0.0           | 8.8   | 5500 | nan             | 9.7053 | 1.3021 | 8.4306 | 8.4159    | 683.08  |
+| 0.0           | 9.6   | 6000 | nan             | 9.7053 | 1.3021 | 8.4306 | 8.4159    | 683.08  |
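
The Epoch column in the added rows is consistent with evaluation every 500 steps at roughly 625 optimizer steps per epoch. That figure is inferred from the first row (500 steps at epoch 0.8) rather than stated in the card. A minimal sketch of the conversion, using a hypothetical `epoch_at` helper:

```python
STEPS_PER_EPOCH = 625  # assumption: inferred from 500 steps / 0.8 epoch


def epoch_at(step, steps_per_epoch=STEPS_PER_EPOCH):
    """Fractional epoch reached after `step` optimizer steps, to one decimal."""
    return round(step / steps_per_epoch, 1)


# Evaluation points logged in the table: every 500 steps up to 6000.
logged_steps = list(range(500, 6001, 500))
epochs = [epoch_at(s) for s in logged_steps]

if __name__ == "__main__":
    for s, e in zip(logged_steps, epochs):
        print(s, e)
```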
 ### Framework versions