checkpoint-6500-finetuned2-2010-2016

This model was trained from scratch; the training dataset was not recorded (the card lists it as None). It achieves the following results on the evaluation set:

  • Loss: 0.0000
  • Rouge1: 1.0
  • Rouge2: 0.96
  • Rougel: 1.0
  • Rougelsum: 1.0
  • Gen Len: 6.4
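To make the ROUGE figures above easier to read, here is a minimal sketch of how an n-gram F1 score of this kind can be computed. It assumes whitespace tokenization and no stemming, and the function names `ngrams` and `rouge_n_f1` are illustrative; it is not the evaluation code used for this model, which the card does not record.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list (empty if the list is shorter than n)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(prediction, reference, n):
    """Simplified ROUGE-N F1 over whitespace tokens (an assumption, see lead-in)."""
    pred = ngrams(prediction.split(), n)
    ref = ngrams(reference.split(), n)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

In this simplified version, a one-word reference contains no bigrams, so an exact match still scores 0 on ROUGE-2 while scoring 1 on ROUGE-1. That is one possible reason a ROUGE-2 value can sit below a perfect ROUGE-1, though the card does not confirm it for this model.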

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 1
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
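As an illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and a total of 1100 steps, the final step reported in the training results; `linear_lr` is a hypothetical helper, not part of any library.

```python
def linear_lr(step, base_lr=0.0003, total_steps=1100, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not state a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Learning rate at the start, the midpoint, and the end of training.
for step in (0, 550, 1100):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.0003, is halved by step 550 (epoch 25), and reaches zero at step 1100.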

Training results

Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len
No log | 1.0 | 22 | 0.5473 | 0.2919 | 0.2095 | 0.2929 | 0.29 | 6.11
No log | 2.0 | 44 | 0.3837 | 0.3426 | 0.2692 | 0.3434 | 0.3433 | 6.2
No log | 3.0 | 66 | 0.2416 | 0.5336 | 0.4842 | 0.532 | 0.533 | 6.25
No log | 4.0 | 88 | 0.1266 | 0.7338 | 0.6501 | 0.7334 | 0.7306 | 6.28
0.6352 | 5.0 | 110 | 0.0454 | 0.8998 | 0.8382 | 0.9002 | 0.899 | 6.26
0.6352 | 6.0 | 132 | 0.0125 | 0.9864 | 0.9423 | 0.9864 | 0.9868 | 6.39
0.6352 | 7.0 | 154 | 0.0029 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.6352 | 8.0 | 176 | 0.0011 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.6352 | 9.0 | 198 | 0.0013 | 0.9922 | 0.95 | 0.9922 | 0.9922 | 6.43
0.0586 | 10.0 | 220 | 0.0067 | 0.995 | 0.95 | 0.995 | 0.995 | 6.4
0.0586 | 11.0 | 242 | 0.0004 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0586 | 12.0 | 264 | 0.0002 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0586 | 13.0 | 286 | 0.0142 | 0.9983 | 0.956 | 0.9983 | 0.9983 | 6.43
0.0149 | 14.0 | 308 | 0.0001 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0149 | 15.0 | 330 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0149 | 16.0 | 352 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0149 | 17.0 | 374 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0149 | 18.0 | 396 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0064 | 19.0 | 418 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0064 | 20.0 | 440 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0064 | 21.0 | 462 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0064 | 22.0 | 484 | 0.0002 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0048 | 23.0 | 506 | 0.0003 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0048 | 24.0 | 528 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0048 | 25.0 | 550 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0048 | 26.0 | 572 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0048 | 27.0 | 594 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0032 | 28.0 | 616 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0032 | 29.0 | 638 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0032 | 30.0 | 660 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0032 | 31.0 | 682 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0016 | 32.0 | 704 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0016 | 33.0 | 726 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0016 | 34.0 | 748 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0016 | 35.0 | 770 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0016 | 36.0 | 792 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0019 | 37.0 | 814 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0019 | 38.0 | 836 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0019 | 39.0 | 858 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0019 | 40.0 | 880 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0014 | 41.0 | 902 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0014 | 42.0 | 924 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0014 | 43.0 | 946 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0014 | 44.0 | 968 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0014 | 45.0 | 990 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0013 | 46.0 | 1012 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0013 | 47.0 | 1034 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0013 | 48.0 | 1056 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0013 | 49.0 | 1078 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
0.0011 | 50.0 | 1100 | 0.0000 | 1.0 | 0.96 | 1.0 | 1.0 | 6.4
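The step counts in the table, read alongside the reported train_batch_size of 32, allow a rough estimate of the training set size, which the card does not state. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that a final partial batch is kept; none of these are recorded in the card, and `train_size_bounds` is a hypothetical helper.

```python
import math


def train_size_bounds(steps_per_epoch, batch_size):
    """Range of dataset sizes n for which ceil(n / batch_size) == steps_per_epoch."""
    lower = (steps_per_epoch - 1) * batch_size + 1
    upper = steps_per_epoch * batch_size
    return lower, upper


# 1100 total steps over 50 epochs gives 22 optimizer steps per epoch.
steps_per_epoch = 1100 // 50
lower, upper = train_size_bounds(steps_per_epoch, batch_size=32)
print(steps_per_epoch, lower, upper)  # 22 673 704
```

Under those assumptions the training set would hold between 673 and 704 examples. With multiple devices or gradient accumulation the effective batch size, and hence the estimate, would be larger.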

Framework versions

  • Transformers 4.34.1
  • Pytorch 2.0.1
  • Datasets 2.14.6
  • Tokenizers 0.14.1