yousefg committed · verified
Commit e1e757c · 1 Parent(s): 8f31361

Update README.md

Files changed (1)
  1. README.md +50 -9
README.md CHANGED
@@ -1,12 +1,22 @@
 ---
 library_name: transformers
-tags: []
+tags:
+- lecture
+- college
+- university
+- summarization
+license: mit
+language:
+- en
+metrics:
+- rouge
+pipeline_tag: summarization
 ---
 
 # Model Card for Model ID
 
 <!-- Provide a quick summary of what the model is/does. -->
-
+Academ is a fine-tuned BART model for summarizing academic lectures.
 
 
 ## Model Details
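Editor's note: the new front matter sets `pipeline_tag: summarization`, which makes the checkpoint usable through the high-level `pipeline` API. A minimal sketch, assuming a hypothetical Hub id `yousefg/academ` (substitute the real repo id):

```python
from transformers import pipeline

# "yousefg/academ" is a placeholder id; use the actual Hub repo id.
summarizer = pipeline("summarization", model="yousefg/academ")
result = summarizer("First ten minutes of a lecture transcript...",
                    max_length=250, min_length=50)
print(result[0]["summary_text"])
```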
 
@@ -17,13 +27,13 @@ tags: []
 
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 
-- **Developed by:** [More Information Needed]
+- **Developed by:** Yousef Gamaleldin
 - **Funded by [optional]:** [More Information Needed]
 - **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
+- **Model type:** Summarization
+- **Language(s) (NLP):** English
 - **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
+- **Finetuned from model [optional]:** BART Large
 
 ### Model Sources [optional]
@@ -41,6 +51,8 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 
+
+
 [More Information Needed]
 
 ### Downstream Use [optional]
@@ -65,12 +77,35 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 
 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
 
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information needed for further recommendations.
 
 ## How to Get Started with the Model
 
 Use the code below to get started with the model.
 
+def get_summary(input_ids, attention_mask, context_length):
+    # Summarize a long transcript chunk by chunk, context_length tokens at a time.
+    summaries = []
+    for i in range(0, input_ids.shape[1], context_length):
+        # Python slicing clamps at the sequence end, so no bounds check is needed.
+        input_slice = input_ids[:, i:i + context_length]
+        attention_mask_slice = attention_mask[:, i:i + context_length]
+        summary = model.generate(input_slice, attention_mask=attention_mask_slice,
+                                 max_new_tokens=1654, min_new_tokens=250,
+                                 do_sample=True, renormalize_logits=True)
+        summaries.extend(summary[0].tolist())
+    # Decode the concatenated summary token ids into one string.
+    return tokenizer.decode(summaries, skip_special_tokens=True)
+
+batch = tokenizer(texts, truncation=False)  # texts: the full lecture transcript
+
+input_ids = torch.tensor(batch['input_ids']).unsqueeze(0).to(device)
+attention_mask = torch.tensor(batch['attention_mask']).unsqueeze(0).to(device)
+
+summary = get_summary(input_ids, attention_mask, 1654)
+print(summary)
+
 [More Information Needed]
 
 ## Training Details
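Editor's note: the snippet added above assumes `torch`, `model`, `tokenizer`, `device`, and `texts` are already defined. A plausible setup, again with a placeholder repo id:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder repo id; substitute the actual Hub id of this model.
tokenizer = AutoTokenizer.from_pretrained("yousefg/academ")
model = AutoModelForSeq2SeqLM.from_pretrained("yousefg/academ").to(device)
model.eval()

# The full lecture transcript as one string.
texts = open("lecture_transcript.txt").read()
```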
 
@@ -92,8 +127,11 @@ Use the code below to get started with the model.
 
 #### Training Hyperparameters
 
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
+- **Training regime:** bf16 non-mixed precision <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+- **Learning Rate:** 0.001
+- **Weight Decay:** 0.01
+- **Epochs:** 4
+- **Batch Size:** 16
 #### Speeds, Sizes, Times [optional]
 
 <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
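Editor's note: as a sketch, the listed hyperparameters could map onto `Seq2SeqTrainingArguments` as below. This is not the actual training script, and note that `bf16=True` in `transformers` enables bf16 *mixed* precision, whereas the card states non-mixed:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: wires the hyperparameters listed above into 🤗 training arguments.
args = Seq2SeqTrainingArguments(
    output_dir="academ-checkpoints",  # placeholder path
    learning_rate=1e-3,               # Learning Rate: 0.001
    weight_decay=0.01,                # Weight Decay: 0.01
    num_train_epochs=4,               # Epochs: 4
    per_device_train_batch_size=16,   # Batch Size: 16
    bf16=True,                        # approximation; see note above
)
```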
 
@@ -103,11 +141,13 @@ Use the code below to get started with the model.
 ## Evaluation
 
 <!-- This section describes the evaluation protocols and provides the results. -->
+The evaluation is based on ROUGE-1, modified to discount padding tokens.
 
 ### Testing Data, Factors & Metrics
 
 #### Testing Data
 
+The model's test set contains 289 lectures, mainly from MIT OpenCourseWare.
 <!-- This should link to a Dataset Card if possible. -->
 
 [More Information Needed]
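Editor's note: the card does not spell out how padding is discounted in ROUGE-1. A minimal sketch of the idea, assuming token-id sequences and the tokenizer's pad id: drop pad tokens from both sequences before counting unigram overlap.

```python
from collections import Counter

def rouge1_f1_no_pad(pred_ids, ref_ids, pad_token_id):
    # Discount padding: strip pad tokens before counting unigram overlap.
    pred = [t for t in pred_ids if t != pad_token_id]
    ref = [t for t in ref_ids if t != pad_token_id]
    overlap = sum((Counter(pred) & Counter(ref)).values())
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(ref) if ref else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)  # ROUGE-1 F1
```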
 
@@ -129,6 +169,7 @@ Use the code below to get started with the model.
 [More Information Needed]
 
 #### Summary
+Academ is a summarization model trained on 2307 lectures, mainly from MIT OpenCourseWare. The model has a maximum sequence length of 1654 tokens, 630 more than the 1024 of the original BART Large.
 
 
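Editor's note: the summary implies the positional embeddings were extended from BART Large's 1024 positions to 1654; the card does not say how. One common approach, sketched below as an assumption, copies the pretrained rows into a longer, freshly initialized position table (BART's learned position embeddings carry a 2-row offset, so the tables have `max_position_embeddings + 2` rows):

```python
import torch
from transformers import BartConfig, BartForConditionalGeneration

# Sketch of one way to widen BART's learned position tables from 1024 to 1654.
# Not necessarily how Academ was produced.
src = BartForConditionalGeneration.from_pretrained("facebook/bart-large")
cfg = BartConfig.from_pretrained("facebook/bart-large", max_position_embeddings=1654)
dst = BartForConditionalGeneration(cfg)  # randomly initialized at the new length

src_state = src.state_dict()
with torch.no_grad():
    for name, param in dst.named_parameters():
        old = src_state[name]
        if "embed_positions" in name:
            param[: old.size(0)].copy_(old)  # keep pretrained rows; new rows stay random
        else:
            param.copy_(old)
```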