---
license: apache-2.0
library_name: peft
tags:
- Summarization
- generated_from_trainer
datasets:
- cnn_dailymail
metrics:
- rouge
base_model: google/flan-t5-base
model-index:
- name: flan-t5-base-prompt_tuning-cnn-dailymail
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# flan-t5-base-prompt_tuning-cnn-dailymail

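This model is a PEFT adapter for [google/flan-t5-base](https://huggingface.co/google/flan-t5-base), trained on the cnn_dailymail dataset for summarization; the model name indicates that prompt tuning was used.
At the end of training (epoch 40, step 7520) it achieves the following results on the evaluation set: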
- Loss: 19.3074
- Rouge1: 0.0787
- Rouge2: 0.0088
- Rougel: 0.0609
- Rougelsum: 0.0733
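
Since the card's metadata lists `peft` as the library, the checkpoint is presumably an adapter that is loaded on top of the base model rather than a full set of weights. The following is a minimal sketch of how that might look. The adapter repo id is a hypothetical placeholder (the card does not state it), and the `summarize: ` prefix is an assumption, since the card does not say which prompt format was used during training.

```python
BASE_MODEL_ID = "google/flan-t5-base"  # taken from the card's metadata
# Hypothetical placeholder: replace with the actual adapter repo id or local path.
ADAPTER_ID = "your-username/flan-t5-base-prompt_tuning-cnn-dailymail"


def load_model(adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the PEFT adapter on top of it."""
    # Imported lazily so this file can be read or imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForSeq2SeqLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model


def summarize(text: str, tokenizer, model, max_new_tokens: int = 64) -> str:
    """Generate a summary for a single article."""
    # Assumption: a "summarize: " prefix; the training prompt is not documented in the card.
    inputs = tokenizer(
        "summarize: " + text,
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    # Keyword arguments are used because PEFT model wrappers expect them in generate().
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_model()` followed by `summarize(article_text, tokenizer, model)`. Given the ROUGE scores listed above, the generated summaries should be checked carefully before any use.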

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.03
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 40
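
To make the schedule concrete, here is a small sketch of how a linear decay would play out with these values. It assumes zero warmup steps (the card lists none), a single device, and no gradient accumulation; the total of 7520 steps is the last step in the training results table. The function mirrors the usual linear-decay formula and is an illustration, not the Trainer's own code.

```python
LEARNING_RATE = 0.03
TRAIN_BATCH_SIZE = 8
NUM_EPOCHS = 40
TOTAL_STEPS = 7520   # final step reported in the training results table
WARMUP_STEPS = 0     # assumption: the card does not list any warmup


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < WARMUP_STEPS:
        # Linear ramp-up during warmup (unused when WARMUP_STEPS is 0).
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# 7520 steps over 40 epochs gives 188 optimizer steps per epoch.
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# With batch size 8 this bounds the training examples seen per epoch
# (the last batch of an epoch may be smaller).
max_examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    for step in (0, 1880, 3760, 5640, 7520):
        print(f"step {step:>4}: lr = {linear_lr(step):.5f}")
    print("steps per epoch:", steps_per_epoch)
    print("max examples per epoch:", max_examples_per_epoch)
```

Under these assumptions the learning rate falls from 0.03 at step 0 to 0.015 at the midpoint and reaches 0 at step 7520, and each epoch covers at most 1504 training examples.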

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 20.9307       | 1.0   | 188  | 19.4344         | 0.1471 | 0.0433 | 0.1103 | 0.1337    |
| 20.4274       | 2.0   | 376  | 20.1245         | 0.1199 | 0.0299 | 0.0953 | 0.1135    |
| 20.1641       | 3.0   | 564  | 19.5964         | 0.1178 | 0.024  | 0.0909 | 0.1072    |
| 20.5294       | 4.0   | 752  | 19.2955         | 0.1164 | 0.0213 | 0.0882 | 0.1055    |
| 20.6452       | 5.0   | 940  | 19.4288         | 0.1179 | 0.0239 | 0.0895 | 0.1072    |
| 20.6916       | 6.0   | 1128 | 19.1208         | 0.0997 | 0.0186 | 0.0795 | 0.093     |
| 20.8065       | 7.0   | 1316 | 18.9300         | 0.0865 | 0.0116 | 0.0688 | 0.08      |
| 20.1431       | 8.0   | 1504 | 19.7751         | 0.1118 | 0.0247 | 0.0869 | 0.1023    |
| 20.5281       | 9.0   | 1692 | 20.0590         | 0.1216 | 0.0278 | 0.0923 | 0.1118    |
| 20.1805       | 10.0  | 1880 | 19.3949         | 0.1025 | 0.0145 | 0.0818 | 0.0948    |
| 20.4289       | 11.0  | 2068 | 19.1645         | 0.0844 | 0.0086 | 0.0656 | 0.0753    |
| 20.1469       | 12.0  | 2256 | 19.4850         | 0.0905 | 0.0062 | 0.0697 | 0.0831    |
| 20.9285       | 13.0  | 2444 | 19.3351         | 0.0853 | 0.0077 | 0.067  | 0.0785    |
| 20.1419       | 14.0  | 2632 | 19.1241         | 0.0886 | 0.0097 | 0.0684 | 0.0822    |
| 20.5547       | 15.0  | 2820 | 19.1532         | 0.0897 | 0.0077 | 0.0704 | 0.0804    |
| 19.5719       | 16.0  | 3008 | 19.2346         | 0.0885 | 0.0107 | 0.0659 | 0.0794    |
| 20.3043       | 17.0  | 3196 | 19.3873         | 0.105  | 0.0188 | 0.0829 | 0.0949    |
| 20.5935       | 18.0  | 3384 | 19.3345         | 0.1132 | 0.0203 | 0.0874 | 0.1025    |
| 20.413        | 19.0  | 3572 | 18.8964         | 0.0751 | 0.0065 | 0.0593 | 0.0686    |
| 19.9286       | 20.0  | 3760 | 18.8474         | 0.0813 | 0.0082 | 0.0648 | 0.0725    |
| 19.9246       | 21.0  | 3948 | 19.3425         | 0.0844 | 0.0096 | 0.0694 | 0.0765    |
| 20.4844       | 22.0  | 4136 | 19.4680         | 0.1012 | 0.0143 | 0.0782 | 0.0923    |
| 20.1571       | 23.0  | 4324 | 19.5483         | 0.0808 | 0.0093 | 0.0665 | 0.0762    |
| 20.0099       | 24.0  | 4512 | 18.5052         | 0.056  | 0.0029 | 0.0479 | 0.0516    |
| 19.6279       | 25.0  | 4700 | 18.7629         | 0.0735 | 0.0082 | 0.0603 | 0.0649    |
| 19.303        | 26.0  | 4888 | 19.3608         | 0.1015 | 0.0124 | 0.0766 | 0.0885    |
| 20.8774       | 27.0  | 5076 | 19.3038         | 0.1008 | 0.013  | 0.0807 | 0.0932    |
| 20.1431       | 28.0  | 5264 | 19.3426         | 0.0991 | 0.0156 | 0.078  | 0.0918    |
| 20.4304       | 29.0  | 5452 | 19.3918         | 0.0905 | 0.0102 | 0.0734 | 0.0812    |
| 19.6689       | 30.0  | 5640 | 19.3527         | 0.088  | 0.0105 | 0.0669 | 0.0785    |
| 20.661        | 31.0  | 5828 | 19.4042         | 0.0996 | 0.0149 | 0.0767 | 0.0887    |
| 20.2962       | 32.0  | 6016 | 19.3871         | 0.0758 | 0.0101 | 0.0617 | 0.0702    |
| 20.5865       | 33.0  | 6204 | 19.3255         | 0.0786 | 0.0106 | 0.064  | 0.0733    |
| 21.4763       | 34.0  | 6392 | 19.3113         | 0.0755 | 0.0087 | 0.0623 | 0.0688    |
| 21.3826       | 35.0  | 6580 | 19.3089         | 0.075  | 0.0076 | 0.0609 | 0.0689    |
| 20.8869       | 36.0  | 6768 | 19.3614         | 0.0906 | 0.0143 | 0.0692 | 0.0812    |
| 20.527        | 37.0  | 6956 | 19.3784         | 0.0874 | 0.0099 | 0.0686 | 0.0797    |
| 19.5026       | 38.0  | 7144 | 19.4145         | 0.0888 | 0.0111 | 0.068  | 0.0823    |
| 19.3852       | 39.0  | 7332 | 19.3794         | 0.0815 | 0.0093 | 0.0616 | 0.0742    |
| 20.5347       | 40.0  | 7520 | 19.3074         | 0.0787 | 0.0088 | 0.0609 | 0.0733    |
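
In this table, ROUGE-1 peaks at epoch 1 (0.1471) and validation loss is lowest at epoch 24 (18.5052), so the final-epoch figures are not the best ones observed during training. The sketch below shows one way to find such rows programmatically. It parses a three-row excerpt copied from the table; pasting the full table into `TABLE_EXCERPT` should work the same way. The helper name `parse_rows` is made up for this example.

```python
# Excerpt of the training results table (header, separator, and three rows).
TABLE_EXCERPT = """\
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 20.9307       | 1.0   | 188  | 19.4344         | 0.1471 | 0.0433 | 0.1103 | 0.1337    |
| 20.0099       | 24.0  | 4512 | 18.5052         | 0.056  | 0.0029 | 0.0479 | 0.0516    |
| 20.5347       | 40.0  | 7520 | 19.3074         | 0.0787 | 0.0088 | 0.0609 | 0.0733    |
"""


def parse_rows(table: str) -> list[dict[str, float]]:
    """Turn a Markdown table of numbers into a list of {column: value} dicts."""
    lines = [line.strip() for line in table.strip().splitlines()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for line in lines[2:]:  # skip the header and the alignment row
        cells = [float(cell) for cell in line.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


rows = parse_rows(TABLE_EXCERPT)
best_rouge1 = max(rows, key=lambda r: r["Rouge1"])
lowest_val_loss = min(rows, key=lambda r: r["Validation Loss"])

if __name__ == "__main__":
    print("best Rouge1 at epoch", int(best_rouge1["Epoch"]))
    print("lowest validation loss at epoch", int(lowest_val_loss["Epoch"]))
```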


### Framework versions

- PEFT 0.8.2
- Transformers 4.37.0
- PyTorch 2.1.2
- Datasets 2.1.0
- Tokenizers 0.15.1