metadata
language:
- en
- de
tags:
- translation
- opus-mt
license: cc-by-4.0
model-index:
- name: opus-mt-eng-deu
  results:
  - task:
      name: Translation eng-deu
      type: translation
      args: eng-deu
    dataset:
      name: tatoeba-test-v2021-02-22
      type: tatoeba_mt
      args: eng-deu
    metrics:
    - name: BLEU
      type: bleu
      value: 45.8
Opus Tatoeba English-German
*This model was obtained by running the script convert_marian_to_pytorch.py. The original models were trained by Jörg Tiedemann using the MarianNMT library. See all available MarianMTModel models on the profile of the Helsinki NLP group.
This is the conversion of checkpoint opus-2021-02-22.zip.*
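Because the checkpoint was converted for use as a MarianMTModel, it can presumably be loaded through the `transformers` library. The following is a minimal sketch rather than an official recipe: the `translate` helper and its `model_path` argument are hypothetical names introduced here, and `model_path` must be set to wherever the converted checkpoint actually lives (a local directory or a hub ID), which this card does not state.

```python
def translate(sentences, model_path):
    """Translate a list of English sentences to German.

    `model_path` is an assumption: point it at the converted checkpoint
    (local directory or hub ID). It is not specified in this card.
    """
    # Imported lazily so the helper can be defined without the heavy dependency.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_path)
    model = MarianMTModel.from_pretrained(model_path)

    # Tokenize as a padded batch of PyTorch tensors.
    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)

    # Generate target token IDs and decode them back to text.
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["How are you?"], model_path)` with a valid path should return a list containing one German string. The tokenizer applies the SentencePiece segmentation shipped with the checkpoint, so no separate pre-processing step should be needed.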
eng-deu
source language name: English
target language name: German
OPUS readme: README.md
model: transformer
source language code: en
target language code: de
dataset: opus
release date: 2021-02-22
pre-processing: normalization + SentencePiece (spm32k,spm32k)
download original weights: opus-2021-02-22.zip
Training data:
- deu-eng: Tatoeba-train (86845165)
Validation data:
- deu-eng: Tatoeba-dev, 284809
- total-size-shuffled: 284809
- devset-selected: top 5000 lines of Tatoeba-dev.src.shuffled!
Test data:
- newssyscomb2009.eng-deu: 502/11271
- news-test2008.eng-deu: 2051/47427
- newstest2009.eng-deu: 2525/62816
- newstest2010.eng-deu: 2489/61511
- newstest2011.eng-deu: 3003/72981
- newstest2012.eng-deu: 3003/72886
- newstest2013.eng-deu: 3000/63737
- newstest2014-deen.eng-deu: 3003/62964
- newstest2015-ende.eng-deu: 2169/44260
- newstest2016-ende.eng-deu: 2999/62670
- newstest2017-ende.eng-deu: 3004/61291
- newstest2018-ende.eng-deu: 2998/64276
- newstest2019-ende.eng-deu: 1997/48969
- Tatoeba-test.eng-deu: 10000/83347
test set translations file: test.txt
test set scores file: eval.txt
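The test data entries above pair each set name with two counts. Assuming these are the number of sentences followed by the number of words (the card does not label them), a short sketch like the following can turn them into an average sentence length. The `parse_test_set` helper is a hypothetical name used only for illustration.

```python
def parse_test_set(line):
    """Split '- name: sentences/words' into (name, sentences, words).

    Assumes the two counts are sentences and words, in that order.
    """
    name, counts = line.lstrip("- ").rsplit(": ", 1)
    sentences, words = (int(x) for x in counts.split("/"))
    return name, sentences, words


lines = [
    "- newstest2018-ende.eng-deu: 2998/64276",
    "- Tatoeba-test.eng-deu: 10000/83347",
]

for line in lines:
    name, sentences, words = parse_test_set(line)
    print(f"{name}: {words / sentences:.1f} words per sentence")
```

Under that assumption, the news test sets average roughly 21 words per sentence, while Tatoeba-test averages about 8, which is worth keeping in mind when comparing scores across sets.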
BLEU-scores
| Test set | BLEU |
|---|---|
| newstest2018-ende.eng-deu | 46.4 |
| Tatoeba-test.eng-deu | 45.8 |
| newstest2019-ende.eng-deu | 42.4 |
| newstest2016-ende.eng-deu | 37.9 |
| newstest2015-ende.eng-deu | 32.0 |
| newstest2017-ende.eng-deu | 30.6 |
| newstest2014-deen.eng-deu | 29.6 |
| newstest2013.eng-deu | 27.6 |
| newstest2010.eng-deu | 25.9 |
| news-test2008.eng-deu | 23.9 |
| newstest2012.eng-deu | 23.8 |
| newssyscomb2009.eng-deu | 23.3 |
| newstest2011.eng-deu | 22.9 |
| newstest2009.eng-deu | 22.7 |

chr-F-scores

| Test set | chr-F |
|---|---|
| newstest2018-ende.eng-deu | 0.697 |
| newstest2019-ende.eng-deu | 0.664 |
| Tatoeba-test.eng-deu | 0.655 |
| newstest2016-ende.eng-deu | 0.644 |
| newstest2015-ende.eng-deu | 0.601 |
| newstest2014-deen.eng-deu | 0.595 |
| newstest2017-ende.eng-deu | 0.593 |
| newstest2013.eng-deu | 0.558 |
| newstest2010.eng-deu | 0.55 |
| newssyscomb2009.eng-deu | 0.539 |
| news-test2008.eng-deu | 0.533 |
| newstest2009.eng-deu | 0.533 |
| newstest2012.eng-deu | 0.53 |
| newstest2011.eng-deu | 0.528 |
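For readers unfamiliar with chr-F, the metric is an F-score over character n-grams shared between a system translation and its reference. The sketch below is a simplified illustration of that idea, not the evaluation code used to produce the scores above. The function names are hypothetical, and the choices of n-gram orders 1 to 6, beta = 2, and ignoring spaces are assumptions made for the example.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of order n, ignoring spaces (an assumption)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chr-F: average n-gram precision and recall, then F-beta."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        # Clipped overlap: each n-gram counts at most as often as in both texts.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # beta > 1 weights recall more heavily than precision.
    return (1 + beta**2) * p * r / (beta**2 * p + r)


print(simple_chrf("Guten Morgen", "Guten Morgen"))
print(simple_chrf("Guten Morgen", "Guten Tag"))
```

An exact match scores 1.0 and strings with no shared characters score 0.0, so the values reported above sit on the same 0 to 1 scale. Real evaluation tools differ in details such as whitespace handling and how n-gram orders are combined, so this sketch will not reproduce the published numbers.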