---
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - pszemraj/fleece2instructions
metrics:
  - rouge
model-index:
  - name: flan-t5-xl-instructiongen
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: pszemraj/fleece2instructions
          type: pszemraj/fleece2instructions
          split: validation
        metrics:
          - name: Rouge1
            type: rouge
            value: 65.3267
---

# flan-t5-xl-instructiongen

This model is a fine-tuned version of [google/flan-t5-xl](https://huggingface.co/google/flan-t5-xl) on the [pszemraj/fleece2instructions](https://huggingface.co/datasets/pszemraj/fleece2instructions) dataset. It achieves the following results on the evaluation set:

- Loss: 0.8251
- Rouge1: 65.3267
- Rouge2: 48.8374
- Rougel: 63.4223
- Rougelsum: 63.5355
- Gen Len: 13.6842
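The ROUGE values above are on a 0–100 scale. As a minimal sketch of how such scores can be computed with the `evaluate` library (the prediction/reference strings here are placeholders, not drawn from the dataset):

```python
import evaluate

# Load the ROUGE metric used in this card's evaluation.
rouge = evaluate.load("rouge")

# Placeholder examples; in practice these would be model outputs
# and gold instructions from the validation split.
predictions = ["Summarize the passage in one sentence."]
references = ["Write a one-sentence summary of the passage."]

scores = rouge.compute(predictions=predictions, references=references)
# evaluate returns fractions in [0, 1]; scale to the 0-100 convention.
print({k: round(v * 100, 4) for k, v in scores.items()})
```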

## Model description

More information needed

## Intended uses & limitations

More information needed
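In the absence of documented usage guidance, here is a minimal inference sketch. The model id is assumed from the card title and author namespace, and the input/output convention (raw text in, a candidate instruction out) is inferred from the dataset name, so treat both as assumptions:

```python
from transformers import pipeline

# Model id assumed from this card's title and the author's namespace.
generator = pipeline(
    "text2text-generation",
    model="pszemraj/flan-t5-xl-instructiongen",
)

text = "Large language models can follow natural-language instructions ..."
# Generate an instruction that could plausibly have produced the text above.
result = generator(text, max_length=48, num_beams=4)
print(result[0]["generated_text"])
```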

## Training and evaluation data

More information needed
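The dataset itself can be inspected directly; a short sketch (the `validation` split name comes from this card's metadata, while the column names are not documented here, hence the generic print):

```python
from datasets import load_dataset

# Dataset id from this card; split name from the card's metadata block.
dataset = load_dataset("pszemraj/fleece2instructions")

# Column names are not documented in this card, so just inspect a record.
print(dataset["validation"][0])
```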

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 6e-05
- train_batch_size: 4
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 2.0
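For orientation, these settings map onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the original training script: `output_dir` is a placeholder, and the Adam betas/epsilon shown in the comment are the library defaults, which match the values reported above.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-xl-instructiongen",  # placeholder
    learning_rate=6e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=16,  # 4 x 16 (x world size) -> total batch size 64
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=2.0,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 are the defaults.
)
```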

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.9615        | 1.0   | 362  | 0.8353          | 63.9163 | 47.0456 | 61.9554 | 62.0549   | 13.3737 |
| 0.809         | 2.0   | 724  | 0.8251          | 64.5398 | 47.9107 | 62.5928 | 62.7278   | 13.4763 |

### Framework versions

- Transformers 4.28.0.dev0
- Pytorch 2.0.0.dev20230130+cu118
- Datasets 2.9.0
- Tokenizers 0.12.1