metadata
license: mit
base_model: facebook/bart-large-xsum
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: text_shortening_model_v41
    results: []

text_shortening_model_v41

This model is a fine-tuned version of facebook/bart-large-xsum. The training dataset is not specified. After the final training epoch (epoch 25), it achieves the following results on the evaluation set:

  • Loss: 3.7205
  • Rouge1: 0.4471
  • Rouge2: 0.2088
  • Rougel: 0.3939
  • Rougelsum: 0.3941
  • Bert precision: 0.8647
  • Bert recall: 0.8624
  • Average word count: 8.6517
  • Max word count: 18
  • Min word count: 4
  • Average token count: 16.5045
  • % shortened texts with length > 12: 5.7057
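The length metrics above are not defined in this card. A minimal sketch of how such statistics could be computed follows, assuming that words are split on whitespace and that "length > 12" refers to word count. The function name `length_stats` and the sample sentences are hypothetical and are not taken from the evaluation set.

```python
def length_stats(texts, max_words=12):
    """Summarise word counts for a list of shortened texts.

    Assumption: words are separated by whitespace. The card does not
    state how the original metrics were tokenised.
    """
    counts = [len(t.split()) for t in texts]
    over_limit = sum(1 for c in counts if c > max_words)
    return {
        "average_word_count": sum(counts) / len(counts),
        "max_word_count": max(counts),
        "min_word_count": min(counts),
        "pct_over_limit": 100.0 * over_limit / len(counts),
    }


# Hypothetical model outputs, for illustration only.
samples = [
    "Council approves new budget for city parks",
    "Storm closes schools across the region",
    "Researchers find that a new vaccine candidate shows strong early "
    "results in a small trial",
]

stats = length_stats(samples)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

The average token count would need the model's tokenizer instead of `str.split`, so it is left out of this sketch.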

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 25
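To illustrate what a linear schedule implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps, since the card lists none, and takes the total of 1825 steps from the final row of the training results (25 epochs of 73 steps). The helper `linear_lr` is hypothetical and only mirrors the usual linear-decay rule; it is not the training code itself.

```python
BASE_LR = 3e-4       # learning_rate from the card
TOTAL_STEPS = 1825   # 25 epochs x 73 steps, from the training results
WARMUP_STEPS = 0     # assumption: no warmup is listed in the card


def linear_lr(step, base_lr=BASE_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    remaining = max(0.0, (total - step) / max(1, total - warmup))
    return base_lr * remaining


for epoch in (1, 10, 25):
    step = epoch * 73
    print(f"epoch {epoch:>2} (step {step:>4}): lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.0003 and reaches zero at the last step, which would be consistent with the very small training losses in the final epochs.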

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bert precision | Bert recall | Average word count | Max word count | Min word count | Average token count | % shortened texts with length > 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2.424 | 1.0 | 73 | 2.2763 | 0.4466 | 0.2286 | 0.3969 | 0.3973 | 0.8628 | 0.8607 | 8.4805 | 17 | 5 | 14.6967 | 3.6036 |
| 1.331 | 2.0 | 146 | 2.1237 | 0.4671 | 0.2385 | 0.4119 | 0.4124 | 0.86 | 0.8702 | 9.7117 | 20 | 4 | 16.7838 | 14.4144 |
| 0.9725 | 3.0 | 219 | 1.9947 | 0.448 | 0.2384 | 0.4004 | 0.4025 | 0.8603 | 0.8627 | 8.8649 | 16 | 5 | 15.8709 | 5.7057 |
| 0.7753 | 4.0 | 292 | 2.2302 | 0.4435 | 0.2201 | 0.3983 | 0.3991 | 0.8653 | 0.8588 | 8.1141 | 16 | 5 | 15.5526 | 1.8018 |
| 0.6017 | 5.0 | 365 | 2.1392 | 0.4293 | 0.2142 | 0.383 | 0.3836 | 0.8593 | 0.8604 | 8.6156 | 17 | 4 | 14.1982 | 3.3033 |
| 0.4911 | 6.0 | 438 | 2.4747 | 0.4166 | 0.1882 | 0.365 | 0.3668 | 0.8582 | 0.8556 | 8.4234 | 14 | 5 | 14.4024 | 3.6036 |
| 0.6947 | 7.0 | 511 | 2.6372 | 0.3894 | 0.1904 | 0.3527 | 0.3534 | 0.8471 | 0.8477 | 8.5165 | 14 | 4 | 16.6607 | 4.2042 |
| 0.5839 | 8.0 | 584 | 2.6038 | 0.3641 | 0.1627 | 0.3272 | 0.3276 | 0.8464 | 0.8402 | 7.7508 | 13 | 4 | 15.2342 | 0.6006 |
| 0.4668 | 9.0 | 657 | 2.7711 | 0.4015 | 0.1904 | 0.3627 | 0.3626 | 0.8537 | 0.8517 | 8.8889 | 17 | 4 | 16.2402 | 3.9039 |
| 0.4539 | 10.0 | 730 | 2.8819 | 0.4 | 0.1903 | 0.353 | 0.3538 | 0.8526 | 0.8519 | 8.6156 | 15 | 5 | 16.1652 | 3.9039 |
| 0.4018 | 11.0 | 803 | 2.8273 | 0.3799 | 0.1764 | 0.3404 | 0.3407 | 0.8432 | 0.8454 | 8.7177 | 17 | 4 | 17.0661 | 3.6036 |
| 0.2764 | 12.0 | 876 | 2.9767 | 0.3888 | 0.1825 | 0.3504 | 0.3509 | 0.8526 | 0.8475 | 8.4354 | 13 | 5 | 16.015 | 2.1021 |
| 0.2338 | 13.0 | 949 | 2.8883 | 0.4184 | 0.202 | 0.3714 | 0.3714 | 0.852 | 0.8585 | 9.3754 | 17 | 5 | 15.8709 | 8.4084 |
| 0.1878 | 14.0 | 1022 | 3.1069 | 0.4302 | 0.1966 | 0.3782 | 0.3791 | 0.8616 | 0.8573 | 8.4324 | 15 | 4 | 16.2492 | 3.3033 |
| 0.1608 | 15.0 | 1095 | 2.8510 | 0.4461 | 0.2151 | 0.392 | 0.3925 | 0.8627 | 0.8625 | 8.7598 | 19 | 4 | 16.1471 | 5.7057 |
| 0.1416 | 16.0 | 1168 | 3.0792 | 0.4246 | 0.1983 | 0.3735 | 0.3735 | 0.8591 | 0.8568 | 8.6637 | 16 | 5 | 16.3303 | 7.5075 |
| 0.1507 | 17.0 | 1241 | 3.2058 | 0.4336 | 0.2016 | 0.379 | 0.3796 | 0.8593 | 0.8589 | 8.9129 | 17 | 5 | 16.6697 | 5.1051 |
| 0.108 | 18.0 | 1314 | 3.0551 | 0.4485 | 0.2248 | 0.4002 | 0.4006 | 0.8645 | 0.8608 | 8.2492 | 14 | 5 | 15.967 | 3.6036 |
| 0.0756 | 19.0 | 1387 | 3.1943 | 0.4439 | 0.2167 | 0.3919 | 0.3925 | 0.8652 | 0.8608 | 8.4865 | 15 | 5 | 15.8919 | 3.9039 |
| 0.104 | 20.0 | 1460 | 3.1156 | 0.4411 | 0.2035 | 0.3894 | 0.3903 | 0.8644 | 0.8612 | 8.5135 | 16 | 5 | 16.4294 | 6.006 |
| 0.0716 | 21.0 | 1533 | 3.4040 | 0.4389 | 0.201 | 0.3824 | 0.3838 | 0.8632 | 0.8614 | 8.7508 | 16 | 4 | 16.5075 | 6.006 |
| 0.0576 | 22.0 | 1606 | 3.4264 | 0.4476 | 0.2104 | 0.3902 | 0.391 | 0.8657 | 0.8629 | 8.5405 | 16 | 4 | 16.4144 | 6.6066 |
| 0.041 | 23.0 | 1679 | 3.5711 | 0.447 | 0.2108 | 0.3931 | 0.393 | 0.8639 | 0.8619 | 8.5976 | 18 | 4 | 16.4264 | 7.2072 |
| 0.0355 | 24.0 | 1752 | 3.6294 | 0.4509 | 0.215 | 0.3981 | 0.3989 | 0.8652 | 0.8632 | 8.6186 | 18 | 4 | 16.4985 | 6.006 |
| 0.0313 | 25.0 | 1825 | 3.7205 | 0.4471 | 0.2088 | 0.3939 | 0.3941 | 0.8647 | 0.8624 | 8.6517 | 18 | 4 | 16.5045 | 5.7057 |
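The results above can be scanned for the best epoch under different criteria. The sketch below copies the epoch, validation loss, and Rouge1 columns from the table and picks the row with the lowest validation loss and the row with the highest Rouge1. The card does not say which checkpoint was kept; the headline metrics at the top match the final row, so treat this only as a way to read the table.

```python
# (epoch, validation_loss, rouge1), copied from the training results.
rows = [
    (1, 2.2763, 0.4466), (2, 2.1237, 0.4671), (3, 1.9947, 0.4480),
    (4, 2.2302, 0.4435), (5, 2.1392, 0.4293), (6, 2.4747, 0.4166),
    (7, 2.6372, 0.3894), (8, 2.6038, 0.3641), (9, 2.7711, 0.4015),
    (10, 2.8819, 0.4000), (11, 2.8273, 0.3799), (12, 2.9767, 0.3888),
    (13, 2.8883, 0.4184), (14, 3.1069, 0.4302), (15, 2.8510, 0.4461),
    (16, 3.0792, 0.4246), (17, 3.2058, 0.4336), (18, 3.0551, 0.4485),
    (19, 3.1943, 0.4439), (20, 3.1156, 0.4411), (21, 3.4040, 0.4389),
    (22, 3.4264, 0.4476), (23, 3.5711, 0.4470), (24, 3.6294, 0.4509),
    (25, 3.7205, 0.4471),
]

best_by_loss = min(rows, key=lambda r: r[1])    # lowest validation loss
best_by_rouge1 = max(rows, key=lambda r: r[2])  # highest Rouge1
final = rows[-1]

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest Rouge1:         epoch {best_by_rouge1[0]} ({best_by_rouge1[2]})")
print(f"final epoch:            epoch {final[0]} (loss {final[1]}, Rouge1 {final[2]})")
```

Validation loss is lowest at epoch 3 and rises to 3.7205 by epoch 25 while training loss keeps falling, a pattern that usually indicates overfitting; Rouge1 at the final epoch is nonetheless close to its early peak.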

Framework versions

  • Transformers 4.33.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.13.3