---
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: flan-t5-large-extraction-cnndm_4000-all
    results: []
---

# flan-t5-large-extraction-cnndm_4000-all

This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 1.7290
- Rouge1: 35.0775
- Rouge2: 15.2209
- Rougel: 30.1796
- Rougelsum: 30.1599
- Gen Len: 19.0
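
A minimal generation sketch follows. The repository id is an assumption inferred from the uploader and model name, and the input/generation settings are illustrative, not documented settings from this card:

```python
# Minimal inference sketch. The repo id below is an assumption; adjust it
# if the actual repository differs.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Zekunli/flan-t5-large-extraction-cnndm_4000-all"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "..."  # a news article to extract key sentences from
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)
outputs = model.generate(**inputs, max_length=20)  # eval Gen Len was ~19 tokens
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```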

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 24
- seed: 1799
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
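
As a rough illustration, the values above map onto a `Seq2SeqTrainingArguments` configuration along the following lines. The dataset, preprocessing, and evaluation cadence are not documented here, so this is a sketch under stated assumptions, not the exact training script:

```python
# Sketch of a Seq2SeqTrainingArguments setup matching the listed values
# (argument names from Transformers 4.18). Only the hyperparameters shown
# above come from this card; everything marked "assumed" is a guess.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-extraction-cnndm_4000-all",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=24,
    seed=1799,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",  # assumed: the results table evaluates every 200 steps
    eval_steps=200,               # assumed, per the Step column below
    predict_with_generate=True,   # needed to compute ROUGE during evaluation
)
```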

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.1464        | 0.4   | 200  | 1.8323          | 35.2242 | 15.3495 | 30.142  | 30.1331   | 19.0    |
| 1.9817        | 0.8   | 400  | 1.7729          | 34.3798 | 14.7287 | 29.5447 | 29.6052   | 18.986  |
| 1.8842        | 1.2   | 600  | 1.7602          | 34.5807 | 15.1707 | 29.7768 | 29.8081   | 18.986  |
| 1.8129        | 1.6   | 800  | 1.7629          | 34.5103 | 15.231  | 29.9182 | 29.9333   | 19.0    |
| 1.8238        | 2.0   | 1000 | 1.7290          | 35.0775 | 15.2209 | 30.1796 | 30.1599   | 19.0    |
| 1.7199        | 2.4   | 1200 | 1.7354          | 34.6552 | 15.7256 | 30.1894 | 30.2207   | 18.998  |
| 1.7128        | 2.8   | 1400 | 1.7407          | 34.7198 | 15.5771 | 30.0585 | 30.0442   | 19.0    |
| 1.6816        | 3.2   | 1600 | 1.7508          | 34.9611 | 15.5792 | 30.3518 | 30.3638   | 19.0    |
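
The ROUGE columns are standard `rouge` metric F-measures scaled to 0-100. With the Datasets version listed below they can be computed along these lines (a sketch with illustrative inputs, not the card's evaluation script):

```python
# Sketch of computing ROUGE as reported above. Datasets 2.5.1 still ships
# load_metric (later versions moved this to the `evaluate` library); it also
# requires the `rouge_score` pip package.
from datasets import load_metric

rouge = load_metric("rouge")
predictions = ["the cat sat on the mat"]  # model outputs (illustrative)
references = ["the cat lay on the mat"]   # gold summaries (illustrative)

scores = rouge.compute(predictions=predictions, references=references)
# Each entry is an AggregateScore; cards like this report mid F-measure * 100.
print({k: round(v.mid.fmeasure * 100, 4) for k, v in scores.items()})
```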

### Framework versions

- Transformers 4.18.0
- Pytorch 1.10.0+cu111
- Datasets 2.5.1
- Tokenizers 0.12.1