BART-CNN-Convosumm

Model description

This model is a fine-tuned version of facebook/bart-large-cnn on the arg-filtered Reddit part of the Convosumm dataset. It was trained for use in a multilingual Telegram bot summarizer.

Intended uses & limitations

Expected input: an unstructured set of concatenated messages, without nickname-to-message indexing.
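
The input format described above can be sketched roughly as follows. This is a minimal illustration, not the author's preprocessing: the helper names `flatten_chat` and `summarize`, the whitespace cleanup, and the `max_length`/`min_length` values are assumptions, and the model identifier is taken from the repository name shown on the model page.

```python
import re

MODEL_ID = "Remeris/BART-CNN-Convosumm"  # repository name from the model page


def flatten_chat(messages):
    """Join (nickname, text) pairs into one unstructured string.

    Nicknames are dropped, since the model expects concatenated
    messages without nickname-to-message indexing.
    """
    parts = []
    for _nickname, text in messages:
        cleaned = re.sub(r"\s+", " ", text).strip()  # collapse whitespace (assumed cleanup)
        if cleaned:
            parts.append(cleaned)
    return " ".join(parts)


def summarize(text, max_length=142, min_length=30):
    """Summarize text with the transformers summarization pipeline.

    The generation lengths are illustrative defaults, not values
    documented for this model. The model is downloaded on first use.
    """
    from transformers import pipeline  # lazy import: flatten_chat works without transformers

    summarizer = pipeline("summarization", model=MODEL_ID)
    result = summarizer(text, max_length=max_length, min_length=min_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(flatten_chat(chat))` on a list of `(nickname, text)` tuples would then return a single summary string.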

Training and evaluation data

More information needed

Training procedure

Results were logged with Weights & Biases (Wandb).

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 20
  • total_train_batch_size: 20
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: polynomial
  • lr_scheduler_warmup_steps: 1
  • num_epochs: 7
  • label_smoothing_factor: 0.1
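
As a rough sketch, the hyperparameters listed above could map onto `Seq2SeqTrainingArguments` like this. The original training script is not shown, so `output_dir`, `predict_with_generate=True`, and the single-device setup are assumptions; the listed total_train_batch_size of 20 is consistent with 1 sample per device × 20 accumulation steps on one device.

```python
# Hyperparameters from the list above, keyed by the matching
# Seq2SeqTrainingArguments argument names.
HPARAMS = {
    "learning_rate": 3e-5,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "gradient_accumulation_steps": 20,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "polynomial",
    "warmup_steps": 1,
    "num_train_epochs": 7,
    "label_smoothing_factor": 0.1,
}


def effective_batch_size(hparams, num_devices=1):
    """Samples per optimizer step; num_devices=1 is an assumption."""
    return (
        hparams["per_device_train_batch_size"]
        * hparams["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir="bart-cnn-convosumm"):
    """Build training arguments; output_dir is a hypothetical path."""
    from transformers import Seq2SeqTrainingArguments  # lazy import

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumed, since ROUGE and Gen Len are reported
        **HPARAMS,
    )
```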

Training results

Training Loss  Epoch  Step  Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
6.207          1.0    10    4.2651           32.3341  7.812    20.0411  29.4849    77.38
4.0248         1.99   20    3.9903           36.0787  11.0447  21.3596  33.2903    130.58
3.5933         2.99   30    3.9020           34.2931  11.2036  20.7935  30.8361    140.02
3.3086         3.98   40    3.8712           38.4842  11.9947  23.4913  34.4347    85.78
3.112          4.98   50    3.8700           38.652   11.8315  23.5208  34.5998    76.2
2.9933         5.97   60    3.8809           38.66    12.3337  23.4394  35.1976    83.26
2.834          6.97   70    3.8797           38.6252  12.2556  23.902   34.6324    81.28

It achieves the following results on the evaluation set (50 data points):

  • Loss: 3.8797
  • Rouge1: 38.6252
  • Rouge2: 12.2556
  • Rougel: 23.902
  • Rougelsum: 34.6324
  • Gen Len: 81.28

It achieves the following results on the test set (250 data points):

  • Loss: 3.8343
  • Rouge1: 38.3642
  • Rouge2: 12.2056
  • Rougel: 23.7782
  • Rougelsum: 34.3959
  • Gen Len: 84.132
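
To make the Rouge1 numbers above easier to read, here is a simplified sketch of a ROUGE-1 F1 score on the same 0-100 scale. It is an illustration only: it uses lowercase whitespace tokenization with no stemming, so it will not reproduce the reported scores, which were presumably computed with a standard ROUGE implementation. The function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(reference, candidate):
    """Unigram-overlap F1 between a reference and a candidate summary, scaled to 0-100."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 100 * 2 * precision * recall / (precision + recall)
```

For example, a six-word candidate that shares five unigrams with a six-word reference has precision and recall of 5/6 each, giving a score of about 83.3.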

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.0.0
  • Datasets 2.1.0
  • Tokenizers 0.15.0

Evaluation results

  • Validation ROUGE-1 on Reddit arg-filtered part of Convosumm
    self-reported
    38.625
  • Validation ROUGE-L on Reddit arg-filtered part of Convosumm
    self-reported
    23.902
  • Test ROUGE-1 on Reddit arg-filtered part of Convosumm
    self-reported
    38.364
  • Test ROUGE-L on Reddit arg-filtered part of Convosumm
    self-reported
    23.778