---
license: apache-2.0
library_name: peft
tags:
  - Summarization
  - generated_from_trainer
datasets:
  - cnn_dailymail
metrics:
  - rouge
base_model: google/flan-t5-base
model-index:
  - name: flan-t5-base-prompt_tuning-cnn-dailymail
    results: []
---

# flan-t5-base-prompt_tuning-cnn-dailymail

This model is a prompt-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the cnn_dailymail dataset. It achieves the following results on the evaluation set:

- Loss: 19.3074
- Rouge1: 0.0787
- Rouge2: 0.0088
- Rougel: 0.0609
- Rougelsum: 0.0733
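
A minimal inference sketch with PEFT follows. The adapter repo id is an assumption inferred from the model name and may differ from where the adapter is actually hosted, and the `"summarize: "` prefix is a common FLAN-T5 convention rather than a documented choice of this card:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

# Assumed repo id, inferred from the model name above; adjust to the actual location.
model = PeftModel.from_pretrained(
    base_model, "RMWeerasinghe/flan-t5-base-prompt_tuning-cnn-dailymail"
)

article = "..."  # a news article to summarize
# The prompt format used in training is not documented; "summarize: " is assumed.
inputs = tokenizer("summarize: " + article, return_tensors="pt",
                   truncation=True, max_length=512)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```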

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
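
The card does not document the data preparation. As a starting point, the cnn_dailymail dataset named in the metadata can be loaded as below; the `"3.0.0"` configuration is an assumption, since the exact configuration used is not recorded:

```python
from datasets import load_dataset

# "3.0.0" is assumed; the card does not record which configuration was used.
dataset = load_dataset("cnn_dailymail", "3.0.0")

# Articles are the inputs; highlights serve as reference summaries.
example = dataset["train"][0]
print(example["article"][:300])
print(example["highlights"])
```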

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.03
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 40
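
These values map onto a PEFT/Transformers setup roughly as sketched below. The prompt-tuning specifics (number of virtual tokens, initialization) are not reported in this card, so those values are placeholders:

```python
from transformers import Seq2SeqTrainingArguments
from peft import PromptTuningConfig, TaskType

# num_virtual_tokens is NOT reported in this card; 20 is a placeholder.
peft_config = PromptTuningConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    num_virtual_tokens=20,
)
# model = get_peft_model(base_model, peft_config)

# Mirrors the hyperparameters listed above; the Adam betas and epsilon
# shown there are the Transformers defaults.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-prompt_tuning-cnn-dailymail",
    learning_rate=3e-2,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=40,
)
```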

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 20.9307 | 1.0  | 188  | 19.4344 | 0.1471 | 0.0433 | 0.1103 | 0.1337 |
| 20.4274 | 2.0  | 376  | 20.1245 | 0.1199 | 0.0299 | 0.0953 | 0.1135 |
| 20.1641 | 3.0  | 564  | 19.5964 | 0.1178 | 0.024  | 0.0909 | 0.1072 |
| 20.5294 | 4.0  | 752  | 19.2955 | 0.1164 | 0.0213 | 0.0882 | 0.1055 |
| 20.6452 | 5.0  | 940  | 19.4288 | 0.1179 | 0.0239 | 0.0895 | 0.1072 |
| 20.6916 | 6.0  | 1128 | 19.1208 | 0.0997 | 0.0186 | 0.0795 | 0.093  |
| 20.8065 | 7.0  | 1316 | 18.9300 | 0.0865 | 0.0116 | 0.0688 | 0.08   |
| 20.1431 | 8.0  | 1504 | 19.7751 | 0.1118 | 0.0247 | 0.0869 | 0.1023 |
| 20.5281 | 9.0  | 1692 | 20.0590 | 0.1216 | 0.0278 | 0.0923 | 0.1118 |
| 20.1805 | 10.0 | 1880 | 19.3949 | 0.1025 | 0.0145 | 0.0818 | 0.0948 |
| 20.4289 | 11.0 | 2068 | 19.1645 | 0.0844 | 0.0086 | 0.0656 | 0.0753 |
| 20.1469 | 12.0 | 2256 | 19.4850 | 0.0905 | 0.0062 | 0.0697 | 0.0831 |
| 20.9285 | 13.0 | 2444 | 19.3351 | 0.0853 | 0.0077 | 0.067  | 0.0785 |
| 20.1419 | 14.0 | 2632 | 19.1241 | 0.0886 | 0.0097 | 0.0684 | 0.0822 |
| 20.5547 | 15.0 | 2820 | 19.1532 | 0.0897 | 0.0077 | 0.0704 | 0.0804 |
| 19.5719 | 16.0 | 3008 | 19.2346 | 0.0885 | 0.0107 | 0.0659 | 0.0794 |
| 20.3043 | 17.0 | 3196 | 19.3873 | 0.105  | 0.0188 | 0.0829 | 0.0949 |
| 20.5935 | 18.0 | 3384 | 19.3345 | 0.1132 | 0.0203 | 0.0874 | 0.1025 |
| 20.413  | 19.0 | 3572 | 18.8964 | 0.0751 | 0.0065 | 0.0593 | 0.0686 |
| 19.9286 | 20.0 | 3760 | 18.8474 | 0.0813 | 0.0082 | 0.0648 | 0.0725 |
| 19.9246 | 21.0 | 3948 | 19.3425 | 0.0844 | 0.0096 | 0.0694 | 0.0765 |
| 20.4844 | 22.0 | 4136 | 19.4680 | 0.1012 | 0.0143 | 0.0782 | 0.0923 |
| 20.1571 | 23.0 | 4324 | 19.5483 | 0.0808 | 0.0093 | 0.0665 | 0.0762 |
| 20.0099 | 24.0 | 4512 | 18.5052 | 0.056  | 0.0029 | 0.0479 | 0.0516 |
| 19.6279 | 25.0 | 4700 | 18.7629 | 0.0735 | 0.0082 | 0.0603 | 0.0649 |
| 19.303  | 26.0 | 4888 | 19.3608 | 0.1015 | 0.0124 | 0.0766 | 0.0885 |
| 20.8774 | 27.0 | 5076 | 19.3038 | 0.1008 | 0.013  | 0.0807 | 0.0932 |
| 20.1431 | 28.0 | 5264 | 19.3426 | 0.0991 | 0.0156 | 0.078  | 0.0918 |
| 20.4304 | 29.0 | 5452 | 19.3918 | 0.0905 | 0.0102 | 0.0734 | 0.0812 |
| 19.6689 | 30.0 | 5640 | 19.3527 | 0.088  | 0.0105 | 0.0669 | 0.0785 |
| 20.661  | 31.0 | 5828 | 19.4042 | 0.0996 | 0.0149 | 0.0767 | 0.0887 |
| 20.2962 | 32.0 | 6016 | 19.3871 | 0.0758 | 0.0101 | 0.0617 | 0.0702 |
| 20.5865 | 33.0 | 6204 | 19.3255 | 0.0786 | 0.0106 | 0.064  | 0.0733 |
| 21.4763 | 34.0 | 6392 | 19.3113 | 0.0755 | 0.0087 | 0.0623 | 0.0688 |
| 21.3826 | 35.0 | 6580 | 19.3089 | 0.075  | 0.0076 | 0.0609 | 0.0689 |
| 20.8869 | 36.0 | 6768 | 19.3614 | 0.0906 | 0.0143 | 0.0692 | 0.0812 |
| 20.527  | 37.0 | 6956 | 19.3784 | 0.0874 | 0.0099 | 0.0686 | 0.0797 |
| 19.5026 | 38.0 | 7144 | 19.4145 | 0.0888 | 0.0111 | 0.068  | 0.0823 |
| 19.3852 | 39.0 | 7332 | 19.3794 | 0.0815 | 0.0093 | 0.0616 | 0.0742 |
| 20.5347 | 40.0 | 7520 | 19.3074 | 0.0787 | 0.0088 | 0.0609 | 0.0733 |
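
The ROUGE scores above are on a 0-to-1 scale. The card's actual evaluation code is not included; a minimal sketch of computing such scores with the Hugging Face evaluate library, using made-up illustrative strings:

```python
import evaluate

rouge = evaluate.load("rouge")

# Illustrative inputs; in practice these would be model summaries
# and the dataset's reference highlights.
predictions = ["the court ruled on tuesday that the law was unconstitutional"]
references = ["on tuesday the court found the law unconstitutional"]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```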

### Framework versions

- PEFT 0.8.2
- Transformers 4.37.0
- Pytorch 2.1.2
- Datasets 2.1.0
- Tokenizers 0.15.1
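
To reproduce this environment, the versions above can be pinned in a requirements file (torch is the pip package name for PyTorch):

```text
peft==0.8.2
transformers==4.37.0
torch==2.1.2
datasets==2.1.0
tokenizers==0.15.1
```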