---
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - pszemraj/fleece2instructions
metrics:
  - rouge
model-index:
  - name: flan-t5-xl-instructiongen
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: pszemraj/fleece2instructions
          type: pszemraj/fleece2instructions
          split: validation
        metrics:
          - name: Rouge1
            type: rouge
            value: 65.3267
---

# flan-t5-xl-instructiongen

This model is a fine-tuned version of [google/flan-t5-xl](https://huggingface.co/google/flan-t5-xl) on the [pszemraj/fleece2instructions](https://huggingface.co/datasets/pszemraj/fleece2instructions) dataset. It achieves the following results on the evaluation set:

- Loss: 0.8251
- Rouge1: 65.3267
- Rouge2: 48.8374
- Rougel: 63.4223
- Rougelsum: 63.5355
- Gen Len: 13.6842
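The ROUGE values above are on a 0–100 scale. As a minimal sketch of how such scores can be computed with the `evaluate` library (the prediction/reference strings here are placeholders, not drawn from the dataset):

```python
import evaluate

# Load the ROUGE metric used in this card's evaluation.
rouge = evaluate.load("rouge")

# Placeholder examples; in practice these would be model outputs
# and gold instructions from the validation split.
predictions = ["Summarize the passage in one sentence."]
references = ["Write a one-sentence summary of the passage."]

scores = rouge.compute(predictions=predictions, references=references)
# evaluate returns fractions in [0, 1]; scale to the 0-100 convention.
print({k: round(v * 100, 4) for k, v in scores.items()})
```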

## Model description

More information needed

## Intended uses & limitations

More information needed
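In the absence of documented usage guidance, here is a minimal inference sketch. The model id is assumed from the card title and author namespace, and the input/output convention (raw text in, a candidate instruction out) is inferred from the dataset name, so treat both as assumptions:

```python
from transformers import pipeline

# Model id assumed from this card's title and the author's namespace.
generator = pipeline(
    "text2text-generation",
    model="pszemraj/flan-t5-xl-instructiongen",
)

text = "Large language models can follow natural-language instructions ..."
# Generate an instruction that could plausibly have produced the text above.
result = generator(text, max_length=48, num_beams=4)
print(result[0]["generated_text"])
```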

## Training and evaluation data

More information needed
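The dataset itself can be inspected directly; a short sketch (the `validation` split name comes from this card's metadata, while the column names are not documented here, hence the generic print):

```python
from datasets import load_dataset

# Dataset id from this card; split name from the card's metadata block.
dataset = load_dataset("pszemraj/fleece2instructions")

# Column names are not documented in this card, so just inspect a record.
print(dataset["validation"][0])
```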

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 6e-05
- train_batch_size: 4
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 2.0
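For orientation, these settings map onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the original training script: `output_dir` is a placeholder, and the Adam betas/epsilon shown in the comment are the library defaults, which match the values reported above.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-xl-instructiongen",  # placeholder
    learning_rate=6e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=16,  # 4 x 16 (x world size) -> total batch size 64
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=2.0,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 are the defaults.
)
```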

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.9615        | 1.0   | 362  | 0.8353          | 63.9163 | 47.0456 | 61.9554 | 62.0549   | 13.3737 |
| 0.809         | 2.0   | 724  | 0.8251          | 64.5398 | 47.9107 | 62.5928 | 62.7278   | 13.4763 |

### Framework versions

- Transformers 4.28.0.dev0
- Pytorch 2.0.0.dev20230130+cu118
- Datasets 2.9.0
- Tokenizers 0.12.1