---
language:
  - en
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - samsum
metrics:
  - rouge
pipeline_tag: summarization
model-index:
  - name: stacked-summaries/flan-t5-large-samsum
    results:
      - task:
          type: summarization
          name: Summarization
        dataset:
          name: samsum
          type: samsum
          config: samsum
          split: test
        metrics:
          - type: rouge
            value: 49.0095
            name: ROUGE-1
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNGNhY2RhOTg5ZmU4ZGJiMjI1NjUyYWMwYmM2Mzk4MGEwMjk0NDg2OWYxZDdmM2I4NzBmODNiM2JmNTg1MDJhYSIsInZlcnNpb24iOjF9.YinJDLeqzoU_x5uJbGIgq8ZEs36oC3Pzre_vk2juxngBoXCEw54XWjpvVhKKZXeIgc47otucJFtFwAOPEmt9Bw
          - type: rouge
            value: 25.681
            name: ROUGE-2
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNDBmNDc4NGMzZGEzYzMzMTFiNzliNjUyYmY0MzNjMmRlMTk4ZTRmZmUxODE0MmY1MjEzOWQ2MGQxMmZmZmQ5MSIsInZlcnNpb24iOjF9.UmRHCmQR5CR-JklBTY1JnjD_Gqz_qMYwdVXhMMvnAynMwAgXkoJZeoxT--usUfdkbqaQ-mLeEvLw7mgNE-NQAw
          - type: rouge
            value: 41.4474
            name: ROUGE-L
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiODdiM2IxZTU4NTEyMzlmZDEzYTliZWNjMjM1NTAzMjE5MDY1MDZiZDc2YmE2NzUxOWJhMmQ0NTM5MjRjZjQyMSIsInZlcnNpb24iOjF9.PeJ41sirLWf3HTiJXlSMNoleENJT_X2u4VMkgQTmXMmGkbrONTFbUYwO4qjoQkvyjy8pLA2eQ3Fjm5yAvKrTCQ
          - type: rouge
            value: 45.1556
            name: ROUGE-LSUM
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNGFiMGNkZDYxZmVhMDFlNDRlNmQ4YWVlMTk3ODI0ZWQ2MmIzNWFkYjkwOWRlNzkyNGVmYmY5ZTczZDAxYTk3NiIsInZlcnNpb24iOjF9.dsicHh5W4ba8t8eBBcSRUm-HLPlMoRc57XixiOHBCk-82De5u8hH8fsRWbMmaLpobdJ7b3xlIaVfTfMMRoLvBw
          - type: loss
            value: 1.2201015949249268
            name: loss
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMzliODE2YzY2MzMyZDQ4YzdjZmFjNTc2NDU3ZjQwNjYwNTdhZjY1NWViM2VhNDc1MjQzNDkxMDI2MTM5ZjFkYiIsInZlcnNpb24iOjF9.2QdP4Zj2oHCo0HCoGgZy6YdqNJaQ0ri0E2kD7lzYbVmyg35wyGutvRUaXVR6O833gTbsCvM86Gp77qNT9CTyDA
          - type: gen_len
            value: 17.326
            name: gen_len
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMWJlNjhjMmUxNWU3MDZlMTUzOWRmM2UwNmU3MjBjODhmMGYxZTUyMmFmMmE0MmU3ZTVkYWY0MDhkMWQ3NTk2MSIsInZlcnNpb24iOjF9.wFaw7DOpESjPu_uW6liHc4XaTwF36ReLLYd-BBFhnZXemE_lGQxmp0O0Vl2DgZz3SSbXonyS4D01G2hYze8qCA

---

# flan-t5-large-samsum

This model is a fine-tuned version of google/flan-t5-large on the samsum dataset.

It achieves the following results on the evaluation set:

- Loss: 1.1754
- Rouge1: 54.1595
- Rouge2: 29.1081
- Rougel: 45.4989
- Rougelsum: 49.1026
- Gen Len: 28.72

Note: the stacked version of this model is evaluated on a different validation set (the stacked one), while this model is evaluated on the standard samsum validation set.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 4
- seed: 17868
- distributed_type: multi-GPU
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.04
- num_epochs: 5.0

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.2106        | 0.43  | 50   | 1.1889          | 52.5898 | 26.9967 | 43.6944 | 47.9656   | 24.5167 |
| 1.213         | 0.87  | 100  | 1.1760          | 52.4279 | 27.4689 | 43.7873 | 48.0581   | 25.0533 |
| 1.0726        | 1.3   | 150  | 1.1731          | 52.8246 | 26.9524 | 43.7429 | 48.0345   | 25.55   |
| 1.0784        | 1.74  | 200  | 1.1708          | 53.1291 | 27.9056 | 44.2609 | 48.6883   | 26.03   |
| 1.0215        | 2.17  | 250  | 1.1755          | 53.1512 | 27.9475 | 44.1442 | 48.4619   | 27.57   |
| 1.0294        | 2.61  | 300  | 1.1711          | 53.4402 | 28.0126 | 44.5454 | 48.6432   | 25.9033 |
| 1.0016        | 3.04  | 350  | 1.1718          | 53.9395 | 28.3087 | 45.191  | 49.2773   | 26.6133 |
| 0.9576        | 3.48  | 400  | 1.1741          | 53.9004 | 28.3243 | 45.0911 | 48.9182   | 26.33   |
| 0.9739        | 3.91  | 450  | 1.1754          | 53.7049 | 28.419  | 44.8946 | 48.8708   | 27.2433 |
| 0.9505        | 4.35  | 500  | 1.1781          | 53.7142 | 28.1758 | 44.8324 | 48.9498   | 26.8667 |
| 0.9993        | 4.78  | 550  | 1.1784          | 53.87   | 28.2211 | 44.893  | 49.1074   | 27.2167 |