---
license: apache-2.0
library_name: JoeyNMT
task: Machine-translation
tags:
- JoeyNMT
- Machine-translation
language:
- en
- de
- fr
- multilingual
datasets:
- may-ohta/iwslt14
metrics:
- bleu
---
# JoeyNMT: iwslt14 de-en-fr multilingual
This is a JoeyNMT model for multilingual machine translation with language tags, built for demonstration purposes.
The model is trained on IWSLT14 de-en and en-fr parallel data using DDP (distributed data parallel).
Install [JoeyNMT](https://github.com/joeynmt/joeynmt) v2.3:
```
$ pip install git+https://github.com/joeynmt/joeynmt.git
```
## Translation
Torch hub interface:
```python
import torch
iwslt14 = torch.hub.load("joeynmt/joeynmt", "iwslt14_prompt")
translation = iwslt14.translate(
    src=["Hello world!"],  # src sentence
    src_prompt=["<en>"],   # src language code
    trg_prompt=["<de>"],   # trg language code
    beam_size=1,
)
print(translation)  # ["Hallo Welt!"]
```
(See the [Jupyter notebook](https://github.com/joeynmt/joeynmt/blob/main/notebooks/torchhub.ipynb) for details)
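
The call above passes one language tag per source sentence. A minimal sketch of a helper that builds those arguments for a batch is shown below. The function name `build_translate_kwargs` is hypothetical and not part of JoeyNMT. The `<fr>` tag is assumed by analogy with `<en>` and `<de>`, and the supported directions are taken from the evaluation table in this card.

```python
from typing import Dict, List, Sequence, Union

# Directions listed in the evaluation table of this card.
SUPPORTED_DIRECTIONS = {("en", "de"), ("de", "en"), ("en", "fr"), ("fr", "en")}


def build_translate_kwargs(
    sentences: Sequence[str],
    src_lang: str,
    trg_lang: str,
    beam_size: int = 1,
) -> Dict[str, Union[List[str], int]]:
    """Build keyword arguments shaped like the translate() call in this card.

    Hypothetical helper: it only assembles the lists and does not load
    or call the model.
    """
    if (src_lang, trg_lang) not in SUPPORTED_DIRECTIONS:
        raise ValueError(f"unsupported direction: {src_lang}->{trg_lang}")
    n = len(sentences)
    return {
        "src": list(sentences),
        # Assumption: one tag per sentence, written as <lang>.
        "src_prompt": [f"<{src_lang}>"] * n,
        "trg_prompt": [f"<{trg_lang}>"] * n,
        "beam_size": beam_size,
    }


kwargs = build_translate_kwargs(["Hello world!", "Thank you!"], "en", "de")
print(kwargs["src_prompt"], kwargs["trg_prompt"])
```

With the hub model loaded as `iwslt14`, the result could presumably be passed as `iwslt14.translate(**kwargs)`.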
## Training
```
$ python -m joeynmt train iwslt14_prompt/config.yaml --use-ddp --skip-test
```
(See `train.log` for details)
## Evaluation
```
$ git clone https://huggingface.co/may-ohta/iwslt14_prompt
$ python -m joeynmt test iwslt14_prompt/config.yaml --output-path iwslt14_prompt/hyp
```
|direction|bleu|
|:--------|:---|
|en->de|28.88|
|de->en|35.28|
|en->fr|38.86|
|fr->en|40.35|

- beam_size: 5
- beam_alpha: 1.0
- sacrebleu signature: `nrefs:1|case:lc|eff:no|tok:13a|smooth:exp|version:2.4.0`

(See `test.log` for details)
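
The sacrebleu signature records the scoring settings as `key:value` pairs separated by `|`. As a small illustration, the sketch below splits the signature from this card into a dictionary. The function name `parse_signature` is hypothetical and not part of sacrebleu.

```python
from typing import Dict


def parse_signature(signature: str) -> Dict[str, str]:
    """Split a sacrebleu signature string into a {key: value} dict."""
    return dict(item.split(":", 1) for item in signature.split("|"))


settings = parse_signature(
    "nrefs:1|case:lc|eff:no|tok:13a|smooth:exp|version:2.4.0"
)
for key, value in settings.items():
    print(f"{key:8s} {value}")
```

Here `case:lc` means the scores were computed on lowercased text, and `tok:13a` is the sacrebleu 13a tokenizer, so results computed with other settings are not directly comparable.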
## Data Format
We downloaded IWSLT14 de-en and en-fr from [https://wit3.fbk.eu/2014-01](https://wit3.fbk.eu/2014-01) and created `{train|dev|test}.tsv` files in the following format:
|src_prompt|src|trg_prompt|trg|
|:---------|:--|:---------|:--|
|`<en>`|Hello.|`<de>`|Hallo.|
|`<de>`|Vielen Dank!|`<en>`|Thank you!|
(See `test.ref.de-en.tsv`)
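
A minimal sketch of how such a four-column file could be written with the Python standard library follows. The function name `write_prompt_tsv` is hypothetical. It also assumes that the file starts with a header row naming the columns and that whitespace inside sentences can be collapsed safely. The actual preprocessing used for this model may differ.

```python
import io
from typing import Iterable, TextIO, Tuple

COLUMNS = ("src_prompt", "src", "trg_prompt", "trg")


def _clean(text: str) -> str:
    """Collapse whitespace runs (including tabs and newlines) into single spaces."""
    return " ".join(text.split())


def write_prompt_tsv(
    pairs: Iterable[Tuple[str, str]],
    src_lang: str,
    trg_lang: str,
    out: TextIO,
    header: bool = True,
) -> int:
    """Write (source, target) sentence pairs as tab-separated rows with language tags.

    Returns the number of data rows written.
    """
    if header:
        # Assumption: the TSV files carry a header row with the column names.
        out.write("\t".join(COLUMNS) + "\n")
    count = 0
    for src, trg in pairs:
        row = (f"<{src_lang}>", _clean(src), f"<{trg_lang}>", _clean(trg))
        out.write("\t".join(row) + "\n")
        count += 1
    return count


# Reproduce the two example rows from the table above in an in-memory buffer.
buf = io.StringIO()
write_prompt_tsv([("Hello.", "Hallo.")], "en", "de", buf)
write_prompt_tsv([("Vielen Dank!", "Thank you!")], "de", "en", buf, header=False)
print(buf.getvalue())
```

Mixing both directions of a language pair in one file, as the example rows do, lets the tags decide the translation direction for each row.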