---
language:
- en
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- samsum
metrics:
- rouge
pipeline_tag: summarization
base_model: google/flan-t5-large
model-index:
- name: stacked-summaries/flan-t5-large-samsum
  results:
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: samsum
      type: samsum
      config: samsum
      split: test
    metrics:
    - type: rouge
      value: 49.0095
      name: ROUGE-1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNGNhY2RhOTg5ZmU4ZGJiMjI1NjUyYWMwYmM2Mzk4MGEwMjk0NDg2OWYxZDdmM2I4NzBmODNiM2JmNTg1MDJhYSIsInZlcnNpb24iOjF9.YinJDLeqzoU_x5uJbGIgq8ZEs36oC3Pzre_vk2juxngBoXCEw54XWjpvVhKKZXeIgc47otucJFtFwAOPEmt9Bw
    - type: rouge
      value: 25.681
      name: ROUGE-2
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNDBmNDc4NGMzZGEzYzMzMTFiNzliNjUyYmY0MzNjMmRlMTk4ZTRmZmUxODE0MmY1MjEzOWQ2MGQxMmZmZmQ5MSIsInZlcnNpb24iOjF9.UmRHCmQR5CR-JklBTY1JnjD_Gqz_qMYwdVXhMMvnAynMwAgXkoJZeoxT--usUfdkbqaQ-mLeEvLw7mgNE-NQAw
    - type: rouge
      value: 41.4474
      name: ROUGE-L
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiODdiM2IxZTU4NTEyMzlmZDEzYTliZWNjMjM1NTAzMjE5MDY1MDZiZDc2YmE2NzUxOWJhMmQ0NTM5MjRjZjQyMSIsInZlcnNpb24iOjF9.PeJ41sirLWf3HTiJXlSMNoleENJT_X2u4VMkgQTmXMmGkbrONTFbUYwO4qjoQkvyjy8pLA2eQ3Fjm5yAvKrTCQ
    - type: rouge
      value: 45.1556
      name: ROUGE-LSUM
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNGFiMGNkZDYxZmVhMDFlNDRlNmQ4YWVlMTk3ODI0ZWQ2MmIzNWFkYjkwOWRlNzkyNGVmYmY5ZTczZDAxYTk3NiIsInZlcnNpb24iOjF9.dsicHh5W4ba8t8eBBcSRUm-HLPlMoRc57XixiOHBCk-82De5u8hH8fsRWbMmaLpobdJ7b3xlIaVfTfMMRoLvBw
    - type: loss
      value: 1.2201015949249268
      name: loss
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMzliODE2YzY2MzMyZDQ4YzdjZmFjNTc2NDU3ZjQwNjYwNTdhZjY1NWViM2VhNDc1MjQzNDkxMDI2MTM5ZjFkYiIsInZlcnNpb24iOjF9.2QdP4Zj2oHCo0HCoGgZy6YdqNJaQ0ri0E2kD7lzYbVmyg35wyGutvRUaXVR6O833gTbsCvM86Gp77qNT9CTyDA
    - type: gen_len
      value: 17.326
      name: gen_len
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMWJlNjhjMmUxNWU3MDZlMTUzOWRmM2UwNmU3MjBjODhmMGYxZTUyMmFmMmE0MmU3ZTVkYWY0MDhkMWQ3NTk2MSIsInZlcnNpb24iOjF9.wFaw7DOpESjPu_uW6liHc4XaTwF36ReLLYd-BBFhnZXemE_lGQxmp0O0Vl2DgZz3SSbXonyS4D01G2hYze8qCA
---

# flan-t5-large-samsum

This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on the samsum dataset.

It achieves the following results on the evaluation set:
- Loss: 1.1754
- Rouge1: 54.1595
- Rouge2: 29.1081
- Rougel: 45.4989
- Rougelsum: 49.1026
- Gen Len: 28.72

> Note: the stacked version of this model technically evaluates on a **different** validation set (the stacked one), while this one uses plain `samsum`.

## Model description

This model is [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) fine-tuned for abstractive dialogue summarization on the SAMSum corpus: given a chat-style conversation, it generates a short summary.

## Intended uses & limitations

- Intended for comparison(s) to the [stacked version of this model](https://huggingface.co/stacked-summaries/flan-t5-large-stacked-samsum-1024)
- 1024 token input max
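
For reference, a minimal usage sketch with the `transformers` summarization pipeline (assumes `transformers` and a backend such as PyTorch are installed; the dialogue below is an illustrative example, and weights download on first use):

```python
# Sketch: run this checkpoint for dialogue summarization via the
# Hugging Face pipeline API. Inputs beyond the 1024-token limit
# should be truncated, hence truncation=True.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="stacked-summaries/flan-t5-large-samsum",
)

# Hypothetical SAMSum-style dialogue for illustration.
dialogue = (
    "Anna: Are we still on for lunch tomorrow?\n"
    "Ben: Yes! 12:30 at the usual place?\n"
    "Anna: Perfect, see you there."
)

result = summarizer(dialogue, truncation=True, max_length=64)
print(result[0]["summary_text"])
```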

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 4
- seed: 17868
- distributed_type: multi-GPU
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.04
- num_epochs: 5.0
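
The reported effective batch size follows from the per-device batch size and gradient accumulation steps; a quick pure-Python sanity check:

```python
# Effective (total) train batch size = per-device batch size
# multiplied by the number of gradient accumulation steps.
train_batch_size = 8           # per-device, from the list above
gradient_accumulation_steps = 16

total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 128, matching the value reported above
```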

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.2106        | 0.43  | 50   | 1.1889          | 52.5898 | 26.9967 | 43.6944 | 47.9656   | 24.5167 |
| 1.213         | 0.87  | 100  | 1.1760          | 52.4279 | 27.4689 | 43.7873 | 48.0581   | 25.0533 |
| 1.0726        | 1.3   | 150  | 1.1731          | 52.8246 | 26.9524 | 43.7429 | 48.0345   | 25.55   |
| 1.0784        | 1.74  | 200  | 1.1708          | 53.1291 | 27.9056 | 44.2609 | 48.6883   | 26.03   |
| 1.0215        | 2.17  | 250  | 1.1755          | 53.1512 | 27.9475 | 44.1442 | 48.4619   | 27.57   |
| 1.0294        | 2.61  | 300  | 1.1711          | 53.4402 | 28.0126 | 44.5454 | 48.6432   | 25.9033 |
| 1.0016        | 3.04  | 350  | 1.1718          | 53.9395 | 28.3087 | 45.191  | 49.2773   | 26.6133 |
| 0.9576        | 3.48  | 400  | 1.1741          | 53.9004 | 28.3243 | 45.0911 | 48.9182   | 26.33   |
| 0.9739        | 3.91  | 450  | 1.1754          | 53.7049 | 28.419  | 44.8946 | 48.8708   | 27.2433 |
| 0.9505        | 4.35  | 500  | 1.1781          | 53.7142 | 28.1758 | 44.8324 | 48.9498   | 26.8667 |
| 0.9993        | 4.78  | 550  | 1.1784          | 53.87   | 28.2211 | 44.893  | 49.1074   | 27.2167 |