led-large-16384-govreport

This model is a fine-tuned version of allenai/led-base-16384 on the govreport-summarization dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1142
  • Rouge1: 0.5445
  • Rouge2: 0.2225
  • RougeL: 0.2578
  • RougeLsum: 0.2579
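
As a point of reference, generating a summary with an LED checkpoint typically looks like the sketch below. This is an illustration, not part of the original card: the hub id is a placeholder (the card does not state the full repository path), and the global-attention setup follows the usual LED convention from the Transformers documentation.

```python
# Minimal usage sketch. The hub id below is a placeholder; substitute the
# actual repository path for this checkpoint.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "<user>/led-large-16384-govreport"  # hypothetical hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

document = "..."  # a long government report (up to 16384 tokens)
inputs = tokenizer(document, max_length=16384, truncation=True, return_tensors="pt")

# LED uses sparse (local) attention; the usual convention is to give the
# first token global attention so it can attend to the whole sequence.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_length=512,
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```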

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto Seq2SeqTrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 64
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
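
For readers who want to reproduce this setup, here is a minimal sketch of how the values above might map onto Transformers' Seq2SeqTrainingArguments. The output_dir and predict_with_generate settings are assumptions, not taken from the card; the Adam betas and epsilon listed above are the library defaults, so they are left implicit.

```python
# Sketch (assumption, not from the card) of the listed hyperparameters
# expressed as Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="led-large-16384-govreport",  # hypothetical
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=64,  # effective train batch size: 1 x 64 = 64
    lr_scheduler_type="linear",
    num_train_epochs=100,
    predict_with_generate=True,  # assumption: required to compute ROUGE at eval time
)
```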

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
|---------------|-------|-------|-----------------|--------|--------|--------|-----------|
| 1.8152        | 1.83  | 500   | 1.7956          | 0.5095 | 0.2040 | 0.2382 | 0.2381    |
| 1.6981        | 3.66  | 1000  | 1.7624          | 0.5194 | 0.2107 | 0.2437 | 0.2437    |
| 1.7048        | 5.49  | 1500  | 1.7448          | 0.5253 | 0.2149 | 0.2467 | 0.2467    |
| 1.6469        | 7.32  | 2000  | 1.7416          | 0.5299 | 0.2177 | 0.2499 | 0.2500    |
| 1.6465        | 9.15  | 2500  | 1.7318          | 0.5299 | 0.2160 | 0.2476 | 0.2478    |
| 1.578         | 10.98 | 3000  | 1.7254          | 0.5321 | 0.2192 | 0.2529 | 0.2530    |
| 1.5631        | 12.81 | 3500  | 1.7189          | 0.5309 | 0.2170 | 0.2520 | 0.2520    |
| 1.5641        | 14.63 | 4000  | 1.7152          | 0.5343 | 0.2198 | 0.2550 | 0.2550    |
| 1.4753        | 16.48 | 4500  | 1.7181          | 0.5305 | 0.2179 | 0.2539 | 0.2542    |
| 1.4792        | 18.3  | 5000  | 1.7152          | 0.5375 | 0.2258 | 0.2586 | 0.2588    |
| 1.4206        | 20.13 | 5500  | 1.7142          | 0.5366 | 0.2216 | 0.2555 | 0.2556    |
| 1.4273        | 21.96 | 6000  | 1.7128          | 0.5364 | 0.2232 | 0.2573 | 0.2573    |
| 1.4078        | 23.78 | 6500  | 1.7114          | 0.5344 | 0.2200 | 0.2562 | 0.2563    |
| 1.355         | 25.61 | 7000  | 1.7153          | 0.5354 | 0.2212 | 0.2564 | 0.2564    |
| 1.409         | 27.44 | 7500  | 1.7119          | 0.5363 | 0.2217 | 0.2568 | 0.2570    |
| 1.3817        | 29.26 | 8000  | 1.7166          | 0.5369 | 0.2229 | 0.2582 | 0.2582    |
| 1.3072        | 31.13 | 8500  | 1.7302          | 0.5379 | 0.2249 | 0.2604 | 0.2603    |
| 1.3172        | 32.96 | 9000  | 1.7121          | 0.5377 | 0.2236 | 0.2588 | 0.2587    |
| 1.277         | 34.78 | 9500  | 1.7255          | 0.5368 | 0.2221 | 0.2584 | 0.2583    |
| 1.1849        | 36.61 | 10000 | 1.7438          | 0.5382 | 0.2244 | 0.2611 | 0.2612    |
| 1.1565        | 38.44 | 10500 | 1.7540          | 0.5414 | 0.2258 | 0.2612 | 0.2612    |
| 1.1415        | 40.26 | 11000 | 1.7707          | 0.5401 | 0.2251 | 0.2618 | 0.2618    |
| 1.085         | 42.09 | 11500 | 1.7791          | 0.5401 | 0.2235 | 0.2595 | 0.2595    |
| 1.088         | 43.92 | 12000 | 1.7869          | 0.5422 | 0.2265 | 0.2616 | 0.2615    |
| 1.0678        | 45.74 | 12500 | 1.8058          | 0.5420 | 0.2253 | 0.2607 | 0.2607    |
| 1.0815        | 47.57 | 13000 | 1.8186          | 0.5405 | 0.2248 | 0.2615 | 0.2615    |
| 1.0456        | 49.4  | 13500 | 1.8346          | 0.5430 | 0.2262 | 0.2619 | 0.2618    |
| 0.9553        | 51.22 | 14000 | 1.8449          | 0.5387 | 0.2239 | 0.2614 | 0.2613    |
| 0.958         | 53.05 | 14500 | 1.8716          | 0.5438 | 0.2274 | 0.2618 | 0.2618    |
| 0.9213        | 54.88 | 15000 | 1.8780          | 0.5438 | 0.2249 | 0.2612 | 0.2612    |
| 0.876         | 56.77 | 15500 | 1.8904          | 0.5439 | 0.2253 | 0.2621 | 0.2621    |
| 0.8967        | 58.6  | 16000 | 1.9085          | 0.5439 | 0.2264 | 0.2634 | 0.2633    |
| 0.9138        | 60.43 | 16500 | 1.9089          | 0.5428 | 0.2242 | 0.2597 | 0.2597    |
| 0.848         | 62.25 | 17000 | 1.9153          | 0.5441 | 0.2242 | 0.2600 | 0.2599    |
| 0.7804        | 64.08 | 17500 | 1.9311          | 0.5422 | 0.2241 | 0.2603 | 0.2604    |
| 0.8326        | 65.91 | 18000 | 1.9391          | 0.5446 | 0.2242 | 0.2604 | 0.2602    |
| 0.8164        | 67.73 | 18500 | 1.9607          | 0.5430 | 0.2245 | 0.2607 | 0.2607    |
| 0.8129        | 69.56 | 19000 | 1.9731          | 0.5456 | 0.2277 | 0.2633 | 0.2633    |
| 0.8049        | 71.39 | 19500 | 1.9804          | 0.5433 | 0.2248 | 0.2618 | 0.2619    |
| 0.7605        | 73.21 | 20000 | 2.0060          | 0.5449 | 0.2256 | 0.2607 | 0.2606    |
| 0.7595        | 75.04 | 20500 | 2.0085          | 0.5425 | 0.2227 | 0.2590 | 0.2590    |
| 0.7837        | 76.87 | 21000 | 2.0073          | 0.5441 | 0.2243 | 0.2608 | 0.2609    |
| 0.7458        | 78.69 | 21500 | 2.0210          | 0.5447 | 0.2260 | 0.2619 | 0.2621    |
| 0.7235        | 80.52 | 22000 | 2.0273          | 0.5445 | 0.2253 | 0.2610 | 0.2611    |
| 0.7405        | 82.35 | 22500 | 2.0405          | 0.5438 | 0.2243 | 0.2600 | 0.2599    |
| 0.7323        | 84.17 | 23000 | 2.0385          | 0.5466 | 0.2256 | 0.2607 | 0.2608    |
| 0.7333        | 86.0  | 23500 | 2.0386          | 0.5447 | 0.2248 | 0.2608 | 0.2609    |
| 0.7067        | 87.83 | 24000 | 2.0582          | 0.5449 | 0.2243 | 0.2601 | 0.2600    |
| 0.7073        | 89.65 | 24500 | 2.0615          | 0.5455 | 0.2253 | 0.2604 | 0.2603    |
| 0.6903        | 91.48 | 25000 | 2.0657          | 0.5482 | 0.2273 | 0.2627 | 0.2626    |
| 0.7203        | 93.31 | 25500 | 2.0574          | 0.5452 | 0.2241 | 0.2596 | 0.2597    |
| 0.6765        | 95.13 | 26000 | 2.0692          | 0.5437 | 0.2249 | 0.2608 | 0.2608    |
| 0.6959        | 96.96 | 26500 | 2.0696          | 0.5442 | 0.2246 | 0.2614 | 0.2614    |
| 0.6918        | 98.79 | 27000 | 2.0701          | 0.5444 | 0.2252 | 0.2615 | 0.2615    |
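
The ROUGE columns above, like the summary metrics at the top of the card, are fractions in the 0-1 range. A sketch of computing scores in the same form with the evaluate library is shown below; this is an assumption for illustration, as the card does not state which ROUGE implementation was used.

```python
# Hypothetical evaluation sketch: computes ROUGE scores in the same 0-1
# fraction form as the table above, using the `evaluate` library.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the generated summary ..."]  # model outputs
references = ["the reference summary ..."]   # gold summaries

scores = rouge.compute(predictions=predictions, references=references)
# e.g. {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
print(scores)
```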

Framework versions

  • Transformers 4.30.2
  • Pytorch 1.10.0+cu102
  • Datasets 2.13.1
  • Tokenizers 0.13.3