lbourdois committed on
Commit ff7d040 · 1 Parent(s): e431334

Add multilingual to the language tag

Hi! This PR adds `multilingual` to the `language` tag in the model card metadata to improve referencing.
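
For illustration, here is a minimal sketch of the kind of edit this PR makes to the front matter. The helper name `add_language_tag` is hypothetical and not part of any library. It assumes the `language:` key is followed by one `- item` per line, as in this README, and it searches the whole text rather than only the front matter block. It does not cover the other changes in the diff, such as moving `license` or reordering the metric keys.

```python
def add_language_tag(card_text: str, tag: str = "multilingual") -> str:
    """Return card_text with `tag` appended to the `language:` list.

    Hypothetical helper for illustration only. Assumes the list is written
    as one "- item" per line directly under a "language:" line.
    """
    lines = card_text.split("\n")

    # Locate the "language:" key; leave the text unchanged if it is absent.
    start = next(
        (i for i, line in enumerate(lines) if line.strip() == "language:"),
        None,
    )
    if start is None:
        return card_text

    # Walk over the list items that directly follow the key.
    end = start + 1
    while end < len(lines) and lines[end].lstrip().startswith("- "):
        end += 1

    items = [line.lstrip()[2:].strip() for line in lines[start + 1:end]]
    if tag in items:
        return card_text  # already tagged, nothing to do

    # Reuse the indentation of the last existing item, if there is one.
    indent = ""
    if end > start + 1:
        last = lines[end - 1]
        indent = last[: len(last) - len(last.lstrip())]

    lines.insert(end, f"{indent}- {tag}")
    return "\n".join(lines)


card = "---\nlanguage:\n- de\n- es\ntags:\n- translation\n---\n# opus-mt-tc-big-de-es\n"
updated = add_language_tag(card)
print(updated)
```

Calling the function a second time on its own output returns the text unchanged, so it is safe to run over cards that already carry the tag.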

Files changed (1)
  1. README.md +66 -75
README.md CHANGED
@@ -2,135 +2,126 @@
2
  language:
3
  - de
4
  - es
5
-
 
6
  tags:
7
  - translation
8
  - opus-mt-tc
9
-
10
- license: cc-by-4.0
11
  model-index:
12
  - name: opus-mt-tc-big-de-es
13
  results:
14
  - task:
15
- name: Translation deu-spa
16
  type: translation
17
- args: deu-spa
18
  dataset:
19
  name: flores101-devtest
20
  type: flores_101
21
  args: deu spa devtest
22
  metrics:
23
- - name: BLEU
24
- type: bleu
25
- value: 24.9
26
- - name: chr-F
27
- type: chrf
28
- value: 0.53208
29
  - task:
30
- name: Translation deu-spa
31
  type: translation
32
- args: deu-spa
33
  dataset:
34
  name: news-test2008
35
  type: news-test2008
36
  args: deu-spa
37
  metrics:
38
- - name: BLEU
39
- type: bleu
40
- value: 26.6
41
- - name: chr-F
42
- type: chrf
43
- value: 0.54400
44
  - task:
45
- name: Translation deu-spa
46
  type: translation
47
- args: deu-spa
48
  dataset:
49
  name: tatoeba-test-v2021-08-07
50
  type: tatoeba_mt
51
  args: deu-spa
52
  metrics:
53
- - name: BLEU
54
- type: bleu
55
- value: 50.8
56
- - name: chr-F
57
- type: chrf
58
- value: 0.69105
59
  - task:
60
- name: Translation deu-spa
61
  type: translation
62
- args: deu-spa
63
  dataset:
64
  name: newstest2009
65
  type: wmt-2009-news
66
  args: deu-spa
67
  metrics:
68
- - name: BLEU
69
- type: bleu
70
- value: 25.9
71
- - name: chr-F
72
- type: chrf
73
- value: 0.53934
74
  - task:
75
- name: Translation deu-spa
76
  type: translation
77
- args: deu-spa
78
  dataset:
79
  name: newstest2010
80
  type: wmt-2010-news
81
  args: deu-spa
82
  metrics:
83
- - name: BLEU
84
- type: bleu
85
- value: 33.8
86
- - name: chr-F
87
- type: chrf
88
- value: 0.60102
89
  - task:
90
- name: Translation deu-spa
91
  type: translation
92
- args: deu-spa
93
  dataset:
94
  name: newstest2011
95
  type: wmt-2011-news
96
  args: deu-spa
97
  metrics:
98
- - name: BLEU
99
- type: bleu
100
- value: 31.3
101
- - name: chr-F
102
- type: chrf
103
- value: 0.57133
104
  - task:
105
- name: Translation deu-spa
106
  type: translation
107
- args: deu-spa
108
  dataset:
109
  name: newstest2012
110
  type: wmt-2012-news
111
  args: deu-spa
112
  metrics:
113
- - name: BLEU
114
- type: bleu
115
- value: 32.6
116
- - name: chr-F
117
- type: chrf
118
- value: 0.58119
119
  - task:
120
- name: Translation deu-spa
121
  type: translation
122
- args: deu-spa
123
  dataset:
124
  name: newstest2013
125
  type: wmt-2013-news
126
  args: deu-spa
127
  metrics:
128
- - name: BLEU
129
- type: bleu
130
- value: 32.4
131
- - name: chr-F
132
- type: chrf
133
- value: 0.57559
134
  ---
135
  # opus-mt-tc-big-de-es
136
 
@@ -184,8 +175,8 @@ A short example code:
184
  from transformers import MarianMTModel, MarianTokenizer
185
 
186
  src_text = [
187
- "Ich verstehe nicht, worüber ihr redet.",
188
- "Die Vögel singen in den Bäumen."
189
  ]
190
 
191
  model_name = "pytorch-models/opus-mt-tc-big-de-es"
@@ -197,8 +188,8 @@ for t in translated:
197
  print( tokenizer.decode(t, skip_special_tokens=True) )
198
 
199
  # expected output:
200
- # No entiendo de qué están hablando.
201
- # Los pájaros cantan en los árboles.
202
  ```
203
 
204
  You can also use OPUS-MT models with the transformers pipelines, for example:
@@ -206,9 +197,9 @@ You can also use OPUS-MT models with the transformers pipelines, for example:
206
  ```python
207
  from transformers import pipeline
208
  pipe = pipeline("translation", model="Helsinki-NLP/opus-mt-tc-big-de-es")
209
- print(pipe("Ich verstehe nicht, worüber ihr redet."))
210
 
211
- # expected output: No entiendo de qué están hablando.
212
  ```
213
 
214
  ## Training
@@ -240,7 +231,7 @@ print(pipe("Ich verstehe nicht, worüber ihr redet."))
240
 
241
  ## Citation Information
242
 
243
- * Publications: [OPUS-MT – Building open translation services for the World](https://aclanthology.org/2020.eamt-1.61/) and [The Tatoeba Translation Challenge – Realistic Data Sets for Low Resource and Multilingual MT](https://aclanthology.org/2020.wmt-1.139/) (Please cite these if you use this model.)
244
 
245
  ```
246
  @inproceedings{tiedemann-thottingal-2020-opus,
@@ -270,7 +261,7 @@ print(pipe("Ich verstehe nicht, worüber ihr redet."))
270
 
271
  ## Acknowledgements
272
 
273
- The work is supported by the [European Language Grid](https://www.european-language-grid.eu/) as [pilot project 2866](https://live.european-language-grid.eu/catalogue/#/resource/projects/2866), by the [FoTran project](https://www.helsinki.fi/en/researchgroups/natural-language-understanding-with-cross-lingual-grounding), funded by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 771113), and the [MeMAD project](https://memad.eu/), funded by the European Union's Horizon 2020 Research and Innovation Programme under grant agreement No 780069. We are also grateful for the generous computational resources and IT infrastructure provided by [CSC -- IT Center for Science](https://www.csc.fi/), Finland.
274
 
275
  ## Model conversion info
276
 
 
2
  language:
3
  - de
4
  - es
5
+ - multilingual
6
+ license: cc-by-4.0
7
  tags:
8
  - translation
9
  - opus-mt-tc
 
 
10
  model-index:
11
  - name: opus-mt-tc-big-de-es
12
  results:
13
  - task:
 
14
  type: translation
15
+ name: Translation deu-spa
16
  dataset:
17
  name: flores101-devtest
18
  type: flores_101
19
  args: deu spa devtest
20
  metrics:
21
+ - type: bleu
22
+ value: 24.9
23
+ name: BLEU
24
+ - type: chrf
25
+ value: 0.53208
26
+ name: chr-F
27
  - task:
 
28
  type: translation
29
+ name: Translation deu-spa
30
  dataset:
31
  name: news-test2008
32
  type: news-test2008
33
  args: deu-spa
34
  metrics:
35
+ - type: bleu
36
+ value: 26.6
37
+ name: BLEU
38
+ - type: chrf
39
+ value: 0.544
40
+ name: chr-F
41
  - task:
 
42
  type: translation
43
+ name: Translation deu-spa
44
  dataset:
45
  name: tatoeba-test-v2021-08-07
46
  type: tatoeba_mt
47
  args: deu-spa
48
  metrics:
49
+ - type: bleu
50
+ value: 50.8
51
+ name: BLEU
52
+ - type: chrf
53
+ value: 0.69105
54
+ name: chr-F
55
  - task:
 
56
  type: translation
57
+ name: Translation deu-spa
58
  dataset:
59
  name: newstest2009
60
  type: wmt-2009-news
61
  args: deu-spa
62
  metrics:
63
+ - type: bleu
64
+ value: 25.9
65
+ name: BLEU
66
+ - type: chrf
67
+ value: 0.53934
68
+ name: chr-F
69
  - task:
 
70
  type: translation
71
+ name: Translation deu-spa
72
  dataset:
73
  name: newstest2010
74
  type: wmt-2010-news
75
  args: deu-spa
76
  metrics:
77
+ - type: bleu
78
+ value: 33.8
79
+ name: BLEU
80
+ - type: chrf
81
+ value: 0.60102
82
+ name: chr-F
83
  - task:
 
84
  type: translation
85
+ name: Translation deu-spa
86
  dataset:
87
  name: newstest2011
88
  type: wmt-2011-news
89
  args: deu-spa
90
  metrics:
91
+ - type: bleu
92
+ value: 31.3
93
+ name: BLEU
94
+ - type: chrf
95
+ value: 0.57133
96
+ name: chr-F
97
  - task:
 
98
  type: translation
99
+ name: Translation deu-spa
100
  dataset:
101
  name: newstest2012
102
  type: wmt-2012-news
103
  args: deu-spa
104
  metrics:
105
+ - type: bleu
106
+ value: 32.6
107
+ name: BLEU
108
+ - type: chrf
109
+ value: 0.58119
110
+ name: chr-F
111
  - task:
 
112
  type: translation
113
+ name: Translation deu-spa
114
  dataset:
115
  name: newstest2013
116
  type: wmt-2013-news
117
  args: deu-spa
118
  metrics:
119
+ - type: bleu
120
+ value: 32.4
121
+ name: BLEU
122
+ - type: chrf
123
+ value: 0.57559
124
+ name: chr-F
125
  ---
126
  # opus-mt-tc-big-de-es
127
 
 
175
  from transformers import MarianMTModel, MarianTokenizer
176
 
177
  src_text = [
178
+ "Ich verstehe nicht, worüber ihr redet.",
179
+ "Die Vögel singen in den Bäumen."
180
  ]
181
 
182
  model_name = "pytorch-models/opus-mt-tc-big-de-es"
 
188
  print( tokenizer.decode(t, skip_special_tokens=True) )
189
 
190
  # expected output:
191
+ # No entiendo de qué están hablando.
192
+ # Los pájaros cantan en los árboles.
193
  ```
194
 
195
  You can also use OPUS-MT models with the transformers pipelines, for example:
 
197
  ```python
198
  from transformers import pipeline
199
  pipe = pipeline("translation", model="Helsinki-NLP/opus-mt-tc-big-de-es")
200
+ print(pipe("Ich verstehe nicht, worüber ihr redet."))
201
 
202
+ # expected output: No entiendo de qué están hablando.
203
  ```
204
 
205
  ## Training
 
231
 
232
  ## Citation Information
233
 
234
+ * Publications: [OPUS-MT – Building open translation services for the World](https://aclanthology.org/2020.eamt-1.61/) and [The Tatoeba Translation Challenge – Realistic Data Sets for Low Resource and Multilingual MT](https://aclanthology.org/2020.wmt-1.139/) (Please cite these if you use this model.)
235
 
236
  ```
237
  @inproceedings{tiedemann-thottingal-2020-opus,
 
261
 
262
  ## Acknowledgements
263
 
264
+ The work is supported by the [European Language Grid](https://www.european-language-grid.eu/) as [pilot project 2866](https://live.european-language-grid.eu/catalogue/#/resource/projects/2866), by the [FoTran project](https://www.helsinki.fi/en/researchgroups/natural-language-understanding-with-cross-lingual-grounding), funded by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 771113), and the [MeMAD project](https://memad.eu/), funded by the European Union's Horizon 2020 Research and Innovation Programme under grant agreement No 780069. We are also grateful for the generous computational resources and IT infrastructure provided by [CSC -- IT Center for Science](https://www.csc.fi/), Finland.
265
 
266
  ## Model conversion info
267