jakobcassiman committed
Commit 0414d68 · 1 parent: 40a54e2

Create README.md

Files changed (1)
  1. README.md +47 -0
README.md ADDED
@@ -0,0 +1,47 @@
+ ---
+ language:
+ - nl
+ tags:
+ - mbart
+ - bart
+ - summarization
+ datasets:
+ - ml6team/cnn_dailymail_nl
+ ---
+
+ # mbart-large-cc25-cnn-dailymail-nl
+
+ ## Model description
+ Fine-tuned version of [mBART](https://huggingface.co/facebook/mbart-large-cc25) (`facebook/mbart-large-cc25`). We also wrote a blog post about this model [here](https://blog.ml6.eu/).
+
+ ## Intended uses & limitations
+ This model is intended for summarizing Dutch news articles.
+
+ #### How to use
+
+ ```python
+ import transformers
+ undisputed_best_model = transformers.MBartForConditionalGeneration.from_pretrained('ml6team/mbart-large-cc25-cnn-dailymail-nl')
+ tokenizer = transformers.MBartTokenizer.from_pretrained('facebook/mbart-large-cc25')
+ summ_pipeline_mbart = transformers.pipeline(
+     task='summarization',
+     model=undisputed_best_model,
+     tokenizer=tokenizer,
+ )
+ summ_pipeline_mbart.model.config.decoder_start_token_id = tokenizer.lang_code_to_id["nl_XX"]  # start decoding with the Dutch language code
+
+ article = 'Kan je dit even samenvatten alsjeblief.'  # Dutch: "Can you summarize this for me, please."
+ summ_pipeline_mbart(
+     article,
+     do_sample=True,
+     top_p=0.75,
+     top_k=50,
+     # num_beams=4,
+     min_length=50,
+     early_stopping=True,
+     truncation=True,
+ )[0]['summary_text']
+ ```
+
+ ## Training data
+ Fine-tuned [mBART](https://huggingface.co/facebook/mbart-large-cc25) on the [ml6team/cnn_dailymail_nl](https://huggingface.co/datasets/ml6team/cnn_dailymail_nl) dataset.
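
A note on the decoding options in the usage snippet: the call enables sampling (`do_sample=True`, `top_p=0.75`, `top_k=50`) and leaves `num_beams=4` commented out, which suggests beam search was considered as an alternative. The sketch below keeps the two configurations side by side. The helper name `build_generation_kwargs` and its `use_sampling` flag are hypothetical and not part of the README. Placing `early_stopping` only in the beam-search branch is an assumption based on that option affecting beam-based decoding only.

```python
from typing import Any, Dict


def build_generation_kwargs(use_sampling: bool = True) -> Dict[str, Any]:
    """Hypothetical helper: keyword arguments for the summarization call."""
    # Settings shared by both strategies, copied from the README snippet.
    kwargs: Dict[str, Any] = {"min_length": 50, "truncation": True}
    if use_sampling:
        # Top-p / top-k sampling, with the values used in the README.
        kwargs.update(do_sample=True, top_p=0.75, top_k=50)
    else:
        # Beam search, with the num_beams value commented out in the README.
        kwargs.update(do_sample=False, num_beams=4, early_stopping=True)
    return kwargs
```

Assuming the `summ_pipeline_mbart` and `article` objects from the usage snippet, a beam-search call could then be written as `summ_pipeline_mbart(article, **build_generation_kwargs(use_sampling=False))[0]['summary_text']`.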