metadata

language:
  - en
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - samsum
metrics:
  - rouge
pipeline_tag: summarization
model-index:
  - name: stacked-summaries/flan-t5-large-samsum
    results:
      - task:
          type: summarization
          name: Summarization
        dataset:
          name: samsum
          type: samsum
          config: samsum
          split: test
        metrics:
          - type: rouge
            value: 49.0095
            name: ROUGE-1
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNGNhY2RhOTg5ZmU4ZGJiMjI1NjUyYWMwYmM2Mzk4MGEwMjk0NDg2OWYxZDdmM2I4NzBmODNiM2JmNTg1MDJhYSIsInZlcnNpb24iOjF9.YinJDLeqzoU_x5uJbGIgq8ZEs36oC3Pzre_vk2juxngBoXCEw54XWjpvVhKKZXeIgc47otucJFtFwAOPEmt9Bw
          - type: rouge
            value: 25.681
            name: ROUGE-2
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNDBmNDc4NGMzZGEzYzMzMTFiNzliNjUyYmY0MzNjMmRlMTk4ZTRmZmUxODE0MmY1MjEzOWQ2MGQxMmZmZmQ5MSIsInZlcnNpb24iOjF9.UmRHCmQR5CR-JklBTY1JnjD_Gqz_qMYwdVXhMMvnAynMwAgXkoJZeoxT--usUfdkbqaQ-mLeEvLw7mgNE-NQAw
          - type: rouge
            value: 41.4474
            name: ROUGE-L
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiODdiM2IxZTU4NTEyMzlmZDEzYTliZWNjMjM1NTAzMjE5MDY1MDZiZDc2YmE2NzUxOWJhMmQ0NTM5MjRjZjQyMSIsInZlcnNpb24iOjF9.PeJ41sirLWf3HTiJXlSMNoleENJT_X2u4VMkgQTmXMmGkbrONTFbUYwO4qjoQkvyjy8pLA2eQ3Fjm5yAvKrTCQ
          - type: rouge
            value: 45.1556
            name: ROUGE-LSUM
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNGFiMGNkZDYxZmVhMDFlNDRlNmQ4YWVlMTk3ODI0ZWQ2MmIzNWFkYjkwOWRlNzkyNGVmYmY5ZTczZDAxYTk3NiIsInZlcnNpb24iOjF9.dsicHh5W4ba8t8eBBcSRUm-HLPlMoRc57XixiOHBCk-82De5u8hH8fsRWbMmaLpobdJ7b3xlIaVfTfMMRoLvBw
          - type: loss
            value: 1.2201015949249268
            name: loss
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMzliODE2YzY2MzMyZDQ4YzdjZmFjNTc2NDU3ZjQwNjYwNTdhZjY1NWViM2VhNDc1MjQzNDkxMDI2MTM5ZjFkYiIsInZlcnNpb24iOjF9.2QdP4Zj2oHCo0HCoGgZy6YdqNJaQ0ri0E2kD7lzYbVmyg35wyGutvRUaXVR6O833gTbsCvM86Gp77qNT9CTyDA
          - type: gen_len
            value: 17.326
            name: gen_len
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMWJlNjhjMmUxNWU3MDZlMTUzOWRmM2UwNmU3MjBjODhmMGYxZTUyMmFmMmE0MmU3ZTVkYWY0MDhkMWQ3NTk2MSIsInZlcnNpb24iOjF9.wFaw7DOpESjPu_uW6liHc4XaTwF36ReLLYd-BBFhnZXemE_lGQxmp0O0Vl2DgZz3SSbXonyS4D01G2hYze8qCA

flan-t5-large-samsum

This model is a fine-tuned version of google/flan-t5-large on the samsum dataset.

It achieves the following results on the evaluation set:

Loss: 1.1754
Rouge1: 54.1595
Rouge2: 29.1081
Rougel: 45.4989
Rougelsum: 49.1026
Gen Len: 28.72

Note: the stacked version of this model is evaluated on a different validation set (the stacked one), whereas this model is evaluated on the standard samsum validation set, so the two sets of scores are not strictly comparable.
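
As a rough illustration of what the ROUGE figures above measure, the sketch below computes a simplified ROUGE-N F1 from n-gram overlap. It is not the scorer used to produce the numbers in this card: it assumes whitespace tokenization and lowercasing, skips stemming, and the function name `rouge_n_f1` is made up for this example. The scores in this card appear to be on a 0-100 scale, so the sketch multiplies by 100.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(prediction, reference, n=1):
    """Simplified ROUGE-N F1 on a 0-100 scale.

    Assumption: whitespace tokens, lowercased, no stemming.
    """
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((pred & ref).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical summaries, not taken from samsum.
reference = "Amanda baked cookies and will bring Jerry some tomorrow."
prediction = "Amanda will bring Jerry some cookies tomorrow."
print(round(rouge_n_f1(prediction, reference, n=1), 2))
print(round(rouge_n_f1(prediction, reference, n=2), 2))
```

ROUGE-1 rewards shared words regardless of order, while ROUGE-2 requires shared word pairs, which is why the Rouge2 figures in this card are consistently lower than the Rouge1 figures.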

Model description

More information needed

Intended uses & limitations

Intended for comparison with the stacked version of this model.
Maximum input length: 1024 tokens.
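
A minimal usage sketch, assuming the Hugging Face `transformers` library (with PyTorch) is installed and that the checkpoint is published under the id `stacked-summaries/flan-t5-large-samsum` listed in this card's metadata. The helper names `format_dialogue` and `summarize_dialogue`, the `max_new_tokens` cap, and the sample dialogue are illustrative choices, not settings from this card. The "Speaker: text" layout is an assumption about how samsum dialogues are formatted.

```python
MODEL_ID = "stacked-summaries/flan-t5-large-samsum"  # id from this card's metadata
MAX_INPUT_TOKENS = 1024  # input limit stated in this card


def format_dialogue(turns):
    """Join (speaker, utterance) pairs into one 'Speaker: text' line per turn.

    Assumption: this mirrors the chat-style layout of samsum dialogues.
    """
    return "\n".join(f"{speaker}: {text}" for speaker, text in turns)


def summarize_dialogue(dialogue, max_new_tokens=64):
    """Summarize one dialogue string; downloads the checkpoint on first call."""
    # Imported here so format_dialogue stays usable without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    # Truncate explicitly so inputs never exceed the stated 1024-token limit.
    inputs = tokenizer(
        dialogue, truncation=True, max_length=MAX_INPUT_TOKENS, return_tensors="pt"
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Hypothetical dialogue, not taken from samsum.
dialogue = format_dialogue([
    ("Amanda", "I baked cookies. Do you want some?"),
    ("Jerry", "Sure!"),
])
print(dialogue)
```

Calling `summarize_dialogue(dialogue)` returns the generated summary as a string. The value `max_new_tokens=64` is an arbitrary cap chosen for the example; longer dialogues may warrant a higher one.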

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 4
seed: 17868
distributed_type: multi-GPU
gradient_accumulation_steps: 16
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.04
num_epochs: 5.0
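
To make the interplay of these settings concrete, here is a small sketch of the effective batch size and of a cosine schedule with linear warmup. The total of 575 optimizer steps is an assumption read off the training results table (roughly 115 steps per epoch over 5 epochs), the device-count factor is assumed to be 1 because 8 × 16 already equals the listed total of 128, and `lr_at_step` is a hypothetical helper that follows the usual warmup-then-cosine shape rather than the trainer's exact code.

```python
import math

LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 16
WARMUP_RATIO = 0.04
TOTAL_STEPS = 575  # assumption: ~115 optimizer steps per epoch x 5 epochs

# Samples consumed per optimizer step (assumes a device-count factor of 1).
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS
warmup_steps = math.ceil(WARMUP_RATIO * TOTAL_STEPS)


def lr_at_step(step):
    """Linear warmup to LEARNING_RATE, then cosine decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    progress = (step - warmup_steps) / max(1, TOTAL_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size, warmup_steps)
for step in (0, warmup_steps, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, f"{lr_at_step(step):.2e}")
```

Under these assumptions the warmup covers only the first couple of dozen optimizer steps, so nearly all of training runs on the decaying part of the cosine curve.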

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
1.2106	0.43	50	1.1889	52.5898	26.9967	43.6944	47.9656	24.5167
1.213	0.87	100	1.1760	52.4279	27.4689	43.7873	48.0581	25.0533
1.0726	1.3	150	1.1731	52.8246	26.9524	43.7429	48.0345	25.55
1.0784	1.74	200	1.1708	53.1291	27.9056	44.2609	48.6883	26.03
1.0215	2.17	250	1.1755	53.1512	27.9475	44.1442	48.4619	27.57
1.0294	2.61	300	1.1711	53.4402	28.0126	44.5454	48.6432	25.9033
1.0016	3.04	350	1.1718	53.9395	28.3087	45.191	49.2773	26.6133
0.9576	3.48	400	1.1741	53.9004	28.3243	45.0911	48.9182	26.33
0.9739	3.91	450	1.1754	53.7049	28.419	44.8946	48.8708	27.2433
0.9505	4.35	500	1.1781	53.7142	28.1758	44.8324	48.9498	26.8667
0.9993	4.78	550	1.1784	53.87	28.2211	44.893	49.1074	27.2167
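
When comparing checkpoints, it can help to scan the table programmatically. The sketch below copies three columns from the table above (step, validation loss, Rouge1) and reports which logged step is best on each. The variable names are illustrative, and picking a checkpoint this way is a suggestion, not a description of how the published weights were selected.

```python
# (step, validation_loss, rouge1) copied from the training results table.
rows = [
    (50, 1.1889, 52.5898),
    (100, 1.1760, 52.4279),
    (150, 1.1731, 52.8246),
    (200, 1.1708, 53.1291),
    (250, 1.1755, 53.1512),
    (300, 1.1711, 53.4402),
    (350, 1.1718, 53.9395),
    (400, 1.1741, 53.9004),
    (450, 1.1754, 53.7049),
    (500, 1.1781, 53.7142),
    (550, 1.1784, 53.8700),
]

# Lower is better for loss, higher is better for Rouge1.
best_loss = min(rows, key=lambda r: r[1])
best_rouge1 = max(rows, key=lambda r: r[2])
print("lowest validation loss:", best_loss)
print("highest Rouge1:", best_rouge1)
```

In this table validation loss bottoms out at step 200 while Rouge1 peaks at step 350, so the two criteria do not pick the same checkpoint.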