---
license: apache-2.0
tags:
- summarization
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mt5-small-text-sum-5
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# mt5-small-text-sum-5

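This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) for summarization; the training dataset is not recorded.
The reported evaluation results match the checkpoint at step 8500 (epoch 30.14) in the training results table, which has the highest Rouge1 of the run, rather than the final step:
- Loss: 2.3673
- Rouge1: 21.51
- Rouge2: 6.94
- Rougel: 20.94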
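
As a rough illustration of how a checkpoint like this one is typically used, the sketch below loads a seq2seq model with Transformers and generates a summary. The card does not give the repository id, so `model_id` is a caller-supplied placeholder; the input truncation length (512), beam count (4), and `max_new_tokens` default (64) are assumptions and not settings taken from this training run.

```python
def summarize(text, model_id, max_new_tokens=64):
    """Generate a summary with a fine-tuned mT5 checkpoint.

    model_id is a placeholder: pass the Hub id or local path that actually
    hosts the checkpoint (it is not stated in this card).
    """
    # Imported lazily so the function can be defined without the
    # libraries installed; calling it requires transformers and torch.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncation length is an assumption, not a value from the training run.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

    # Beam search with 4 beams is an illustrative choice.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Because the training data and language coverage are not documented, output quality on any given input should be checked before relying on it.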

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 11
- eval_batch_size: 11
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 40
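
The linear schedule above can be sketched in plain Python. The steps-per-epoch figure of 282 is not stated in the card; it is inferred from the training results table (for example, step 500 is logged at epoch 1.77). No warmup is listed, so `warmup_steps=0` is also an assumption.

```python
LEARNING_RATE = 1e-4      # learning_rate from the card
NUM_EPOCHS = 40           # num_epochs from the card
STEPS_PER_EPOCH = 282     # assumption: inferred from the logged step/epoch pairs
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 11280 optimizer steps


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at a given step under linear decay to zero.

    Mirrors the usual linear-with-warmup shape; warmup_steps=0 is an
    assumption because the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def epoch_at(step, steps_per_epoch=STEPS_PER_EPOCH):
    """Fractional epoch for a step, rounded as in the training log."""
    return round(step / steps_per_epoch, 2)
```

Under these assumptions the rate starts at 1e-4, is halved at step 5640, and reaches zero at step 11280. With a batch size of 11 and no gradient accumulation on a single device, 282 steps per epoch would imply a training set of roughly 3,100 examples.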

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|
| 4.5176        | 1.77  | 500   | 2.6172          | 16.23  | 5.35   | 16.14  |
| 3.073         | 3.55  | 1000  | 2.4755          | 17.77  | 5.53   | 17.67  |
| 2.8478        | 5.32  | 1500  | 2.4330          | 18.56  | 5.28   | 18.32  |
| 2.7152        | 7.09  | 2000  | 2.4423          | 18.31  | 5.1    | 18.14  |
| 2.6003        | 8.87  | 2500  | 2.3905          | 19.46  | 5.52   | 19.17  |
| 2.5218        | 10.64 | 3000  | 2.3660          | 19.58  | 5.93   | 19.07  |
| 2.4172        | 12.41 | 3500  | 2.3595          | 19.89  | 6.42   | 19.5   |
| 2.3841        | 14.18 | 4000  | 2.3564          | 20.38  | 6.67   | 19.99  |
| 2.3049        | 15.96 | 4500  | 2.3730          | 20.21  | 6.41   | 19.79  |
| 2.2596        | 17.73 | 5000  | 2.3532          | 20.27  | 6.38   | 19.95  |
| 2.2155        | 19.5  | 5500  | 2.3539          | 19.6   | 6.41   | 19.24  |
| 2.1657        | 21.28 | 6000  | 2.3511          | 21.13  | 6.19   | 20.79  |
| 2.1343        | 23.05 | 6500  | 2.3378          | 20.59  | 6.45   | 20.18  |
| 2.1032        | 24.82 | 7000  | 2.3510          | 19.91  | 6.28   | 19.6   |
| 2.068         | 26.6  | 7500  | 2.3452          | 19.37  | 6.11   | 19.1   |
| 2.0438        | 28.37 | 8000  | 2.3513          | 20.86  | 6.43   | 20.49  |
| 2.0191        | 30.14 | 8500  | 2.3673          | 21.51  | 6.94   | 20.94  |
| 2.0085        | 31.91 | 9000  | 2.3519          | 20.65  | 6.61   | 20.2   |
| 1.9797        | 33.69 | 9500  | 2.3728          | 21.01  | 6.33   | 20.6   |
| 1.9808        | 35.46 | 10000 | 2.3663          | 21.22  | 6.48   | 20.82  |
| 1.9605        | 37.23 | 10500 | 2.3581          | 20.45  | 6.41   | 20.06  |
| 1.9599        | 39.01 | 11000 | 2.3608          | 21.07  | 6.57   | 20.6   |
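
To make it easier to see which checkpoint the table favors, the following sketch copies the logged rows into a list and picks the best step by validation loss and by Rouge1. The tuple layout and variable names are illustrative only; the numbers are taken from the table above.

```python
# (step, validation_loss, rouge1, rouge2, rougel), copied from the table above
ROWS = [
    (500, 2.6172, 16.23, 5.35, 16.14),
    (1000, 2.4755, 17.77, 5.53, 17.67),
    (1500, 2.4330, 18.56, 5.28, 18.32),
    (2000, 2.4423, 18.31, 5.10, 18.14),
    (2500, 2.3905, 19.46, 5.52, 19.17),
    (3000, 2.3660, 19.58, 5.93, 19.07),
    (3500, 2.3595, 19.89, 6.42, 19.50),
    (4000, 2.3564, 20.38, 6.67, 19.99),
    (4500, 2.3730, 20.21, 6.41, 19.79),
    (5000, 2.3532, 20.27, 6.38, 19.95),
    (5500, 2.3539, 19.60, 6.41, 19.24),
    (6000, 2.3511, 21.13, 6.19, 20.79),
    (6500, 2.3378, 20.59, 6.45, 20.18),
    (7000, 2.3510, 19.91, 6.28, 19.60),
    (7500, 2.3452, 19.37, 6.11, 19.10),
    (8000, 2.3513, 20.86, 6.43, 20.49),
    (8500, 2.3673, 21.51, 6.94, 20.94),
    (9000, 2.3519, 20.65, 6.61, 20.20),
    (9500, 2.3728, 21.01, 6.33, 20.60),
    (10000, 2.3663, 21.22, 6.48, 20.82),
    (10500, 2.3581, 20.45, 6.41, 20.06),
    (11000, 2.3608, 21.07, 6.57, 20.60),
]

# Lowest validation loss and highest Rouge1 do not coincide in this run.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge1 = max(ROWS, key=lambda row: row[2])

print("best validation loss:", best_by_loss)
print("best Rouge1:", best_by_rouge1)
```

The lowest validation loss (2.3378) occurs at step 6500, while the highest Rouge1 (21.51) occurs at step 8500. Validation loss is nearly flat after roughly step 3000, so the later checkpoints differ mainly in ROUGE.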


### Framework versions

- Transformers 4.26.1
- Pytorch 1.13.1+cu116
- Datasets 2.10.1
- Tokenizers 0.13.2