---
license: cc
base_model: facebook/bart-large-cnn
tags:
- generated_from_trainer
datasets:
  - cnn_dailymail
  - Convosumm
widget:
  - text: >
      Can we say that among the Pythagoreans the “body” of the concept was number? What do you mean by "concept body"? shell. What then is hidden behind this shell? Definition of a concept) what definition of a concept is ultimately hidden behind the body in the form of a number? All those that the Pythagoreans indicated. I want to say that numbers were their very concept. They thought in numbers as in concepts. Shape maybe?) you can say yes, but it will need to be developed on a mug. The definitions of thought are subject to numbers. On the one hand, numbers are pure abstraction, which gives initial freedom of thought for the derivation of abstract, embryonic definitions, but then for the derivation, description of reality, more specific concepts, the abstractness of numbers, on the contrary, limits, “leads into the darkness.” One is the object, “in itself”;'
model-index:
  - name: BART-CNN-Convosumm
    results:
      - task:
          name: Abstractive Dialogue Summarization
          type: abstractive-text-summarization
        dataset:
          name: Reddit arg-filtered part of Convosumm
          type: Convosumm
        metrics:
          - name: Validation ROUGE-1
            type: rouge-1
            value: 38.6252
          - name: Validation ROUGE-L
            type: rouge-l
            value: 23.902
          - name: Test ROUGE-1
            type: rouge-1
            value: 38.3642
          - name: Test ROUGE-L
            type: rouge-l
            value: 23.7782
language:
- en
pipeline_tag: summarization
---


# BART-CNN-Convosumm

## Model description

This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn), trained on the arg-filtered Reddit part of the [ConvoSumm](https://github.com/Yale-LILY/ConvoSumm) dataset.
It was trained for use in a [multilingual Telegram-bot summarizer](https://github.com/akaRemeris/XLConvosumm-bot).

## Intended uses & limitations

Expected input: an unstructured set of concatenated messages, without nickname-to-message indexing.
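
The input format described above can be sketched as follows. This is a minimal example, not the bot's actual preprocessing: `MODEL_ID` is a placeholder because this card does not state the Hub repository id, and `concatenate_messages`, `summarize`, and the `max_length` default are hypothetical names and values chosen for illustration.

```python
from typing import Iterable

# Assumption: placeholder id; replace it with the real Hub repository of this model.
MODEL_ID = "your-namespace/BART-CNN-Convosumm"


def concatenate_messages(messages: Iterable[str]) -> str:
    """Join raw chat messages into one string, without nicknames or indices."""
    # Collapse internal whitespace and newlines in every message.
    cleaned = (" ".join(message.split()) for message in messages)
    # Skip messages that are empty after cleaning.
    return " ".join(message for message in cleaned if message)


def summarize(messages: Iterable[str], max_length: int = 128) -> str:
    """Summarize a conversation; max_length is an illustrative value."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    text = concatenate_messages(messages)
    # truncation=True cuts inputs that exceed the model's maximum input length.
    result = summarizer(text, max_length=max_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(["first message", "second message"])` downloads the model on first use, so `MODEL_ID` must point to an existing repository.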

## Training and evaluation data

More information needed

## Training procedure

Training results were logged to [Weights & Biases](https://wandb.ai/remeris/BART-CNN-Convosumm/runs/68syxthd).

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 20
- total_train_batch_size: 20
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 1
- num_epochs: 7
- label_smoothing_factor: 0.1
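
The hyperparameters above could be expressed as `Seq2SeqTrainingArguments` roughly as follows. This is a sketch under assumptions, not the original training script: the use of `Seq2SeqTrainingArguments` (rather than plain `TrainingArguments`), the `output_dir` value, and the single-device setup are not stated in this card, and `effective_batch_size` and `build_training_args` are hypothetical helpers.

```python
# Values copied from the hyperparameter list in this card.
HYPERPARAMS = {
    "learning_rate": 3e-05,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "gradient_accumulation_steps": 20,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "polynomial",
    "warmup_steps": 1,
    "num_train_epochs": 7,
    "label_smoothing_factor": 0.1,
}


def effective_batch_size(hp: dict, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (assumes one device by default)."""
    return (
        hp["per_device_train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir: str = "bart-cnn-convosumm"):
    """Build the arguments object; output_dir is a hypothetical path."""
    # Imported lazily so the dictionary above can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

With a per-device batch size of 1 and 20 accumulation steps on one device, `effective_batch_size(HYPERPARAMS)` gives the reported total train batch size of 20.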

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 6.207         | 1.0   | 10   | 4.2651          | 32.3341 | 7.812   | 20.0411 | 29.4849   | 77.38   |
| 4.0248        | 1.99  | 20   | 3.9903          | 36.0787 | 11.0447 | 21.3596 | 33.2903   | 130.58  |
| 3.5933        | 2.99  | 30   | 3.9020          | 34.2931 | 11.2036 | 20.7935 | 30.8361   | 140.02  |
| 3.3086        | 3.98  | 40   | 3.8712          | 38.4842 | 11.9947 | 23.4913 | 34.4347   | 85.78   |
| 3.112         | 4.98  | 50   | 3.8700          | 38.652  | 11.8315 | 23.5208 | 34.5998   | 76.2    |
| 2.9933        | 5.97  | 60   | 3.8809          | 38.66   | 12.3337 | 23.4394 | 35.1976   | 83.26   |
| 2.834         | 6.97  | 70   | 3.8797          | 38.6252 | 12.2556 | 23.902  | 34.6324   | 81.28   |

After the final epoch, the model achieves the following results on the evaluation set (50 data points):
- Loss: 3.8797
- Rouge1: 38.6252
- Rouge2: 12.2556
- Rougel: 23.902
- Rougelsum: 34.6324
- Gen Len: 81.28

It achieves the following results on the test set (250 data points):
- Loss: 3.8343
- Rouge1: 38.3642
- Rouge2: 12.2056
- Rougel: 23.7782
- Rougelsum: 34.3959
- Gen Len: 84.132
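
For readers unfamiliar with the metric, the sketch below shows what a ROUGE-1 F-score measures: clipped unigram overlap between a generated summary and a reference. It is a simplified illustration that assumes lowercase whitespace tokenization with no stemming. This card does not state which ROUGE implementation produced the numbers above, so this function is not expected to reproduce them exactly. The reported values appear to be scaled by 100.

```python
from collections import Counter


def rouge1_f(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F-score in [0, 1] (whitespace tokens, no stemming)."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

Under this definition, a score of 38.36 on the reported scale corresponds to an F-score of about 0.38.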

### Framework versions

- Transformers 4.35.2
- Pytorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.15.0