flan-t5-large-mawpnli-calcx-pt

This model is a fine-tuned version of google/flan-t5-large. The auto-generated card does not name the training dataset (it is recorded as None). After the final epoch (epoch 5), it achieves the following results on the evaluation set:

  • Loss: 0.1463
  • Rouge1: 95.656
  • Rouge2: 88.9634
  • Rougel: 95.5287
  • Rougelsum: 95.5315
  • Gen Len: 12.6808
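
To make the ROUGE figures easier to interpret, here is a simplified sketch of ROUGE-N as an F1 score over overlapping n-grams, scaled to 0-100 like the values above. It is an illustration only: it assumes whitespace tokenization and lowercasing, applies no stemming, and is not the metric implementation used to produce the reported numbers. The function name rouge_n_f1 is hypothetical.

```python
from collections import Counter


def rouge_n_f1(prediction: str, reference: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1 on a 0-100 scale (whitespace tokens, no stemming)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # Count the n-grams in each text.
    pred_ngrams = Counter(
        tuple(pred_tokens[i:i + n]) for i in range(len(pred_tokens) - n + 1)
    )
    ref_ngrams = Counter(
        tuple(ref_tokens[i:i + n]) for i in range(len(ref_tokens) - n + 1)
    )
    if not pred_ngrams or not ref_ngrams:
        return 0.0

    # Clipped overlap: each n-gram counts at most as often as it occurs in both.
    overlap = sum((pred_ngrams & ref_ngrams).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred_ngrams.values())
    recall = overlap / sum(ref_ngrams.values())
    return 100 * 2 * precision * recall / (precision + recall)
```

Under this definition, Rouge1 corresponds to n=1 and Rouge2 to n=2. Rougel and Rougelsum are based on the longest common subsequence instead and are not covered by this sketch.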

Model description

More information needed

Intended uses & limitations

More information needed
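
Because the checkpoint is a fine-tune of google/flan-t5-large, it should load through the standard Transformers sequence-to-sequence classes. The following is a minimal sketch, assuming the repository id vishwa27/flan-t5-large-mawpnli-calcx-pt. The expected prompt format is not documented in this card, so any input passed to the hypothetical helper generate_answer is a guess rather than a confirmed format.

```python
# Repository id as shown on the model page.
MODEL_ID = "vishwa27/flan-t5-large-mawpnli-calcx-pt"


def generate_answer(prompt: str, max_new_tokens: int = 32) -> str:
    """Generate one output for a plain-text prompt (prompt format is an assumption)."""
    # Imported lazily so that the module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    # The reported mean generation length is about 13 tokens, so 32 leaves headroom.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate_answer downloads the weights (783M parameters in F32) on first use, so it is better to load the model once and reuse it when processing many inputs.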

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
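
As an illustration of what lr_scheduler_type: linear implies, the sketch below computes the learning rate at a given optimizer step. It assumes 2130 total steps (the final step count in the training results) and zero warmup steps, since no warmup is listed above. The function name linear_lr is hypothetical.

```python
BASE_LR = 5e-05       # learning_rate from the list above
TOTAL_STEPS = 2130    # final step in the training results (5 epochs x 426 steps)


def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = 0) -> float:
    """Learning rate under linear decay to zero, with optional linear warmup."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the rate starts at 5e-05, is about 4e-05 at the end of the first epoch (step 426), and reaches 0 at step 2130.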

Training results

Training Loss  Epoch  Step  Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
No log         1.0    426   0.1577           94.7455  86.4792  94.3396  94.3571    12.5962
0.2457         2.0    852   0.1341           95.3972  88.4764  95.1413  95.1511    12.7207
0.0797         3.0    1278  0.1313           95.4361  88.078   95.221   95.2056    12.5692
0.0573         4.0    1704  0.1483           95.3949  88.188   95.1917  95.2055    12.5974
0.0404         5.0    2130  0.1463           95.656   88.9634  95.5287  95.5315    12.6808
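
The per-epoch results can be read in more than one way: validation loss and ROUGE do not peak at the same epoch. The short sketch below copies the relevant columns from the table and picks out the best epoch by each criterion. The estimate of training-set size assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
# (epoch, step, validation_loss, rouge1) copied from the training results.
results = [
    (1, 426, 0.1577, 94.7455),
    (2, 852, 0.1341, 95.3972),
    (3, 1278, 0.1313, 95.4361),
    (4, 1704, 0.1483, 95.3949),
    (5, 2130, 0.1463, 95.656),
]

# Epoch with the lowest validation loss and epoch with the highest Rouge1.
best_by_loss = min(results, key=lambda row: row[2])
best_by_rouge1 = max(results, key=lambda row: row[3])

# Steps per epoch times train_batch_size gives an upper bound on the
# number of training examples (assumes one device, no gradient accumulation).
steps_per_epoch = results[0][1]
max_train_examples = steps_per_epoch * 8

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[2]})")
print(f"highest Rouge1: epoch {best_by_rouge1[0]} ({best_by_rouge1[3]})")
print(f"training examples: at most {max_train_examples}")
```

Validation loss is lowest at epoch 3 (0.1313), while the reported final checkpoint from epoch 5 has the highest Rouge1 (95.656) with a slightly higher loss (0.1463).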

Framework versions

  • Transformers 4.35.2
  • Pytorch 1.12.1+cu113
  • Datasets 2.15.0
  • Tokenizers 0.15.0

Model size: 783M params (Safetensors, tensor type F32)

Model tree for vishwa27/flan-t5-large-mawpnli-calcx-pt: fine-tuned from google/flan-t5-large