Spaces: chinmaydan/S2SCascadeDemo

chinmaydan committed on Sep 7, 2023

Commit

95a3ca6

1 Parent(s): 4b4ab36

Initial commit

This view is limited to 50 files because it contains too many changes.

Files changed (50)

.gitignore +9 -0
12e12d_last.pt +3 -0
LICENSE +21 -0
README.md +162 -11
__pycache__/ConSTtester.cpython-310.pyc +0 -0
__pycache__/ConSTtester.cpython-311.pyc +0 -0
__pycache__/mRASPloader.cpython-310.pyc +0 -0
__pycache__/mRASPloader.cpython-311.pyc +0 -0
app +0 -0
app.py +420 -0
bpe_vocab +0 -0
cfg.txt +0 -0
cfgDefault.txt +1 -0
codes.bpe.32000 +0 -0
data-bin/dict.ar.txt +0 -0
data-bin/dict.en.txt +0 -0
data-bin/dict.es.txt +0 -0
data-bin/dict.fr.txt +0 -0
data-bin/dict.ru.txt +0 -0
data-bin/dict.zh.txt +0 -0
data-bin/preprocess.log +106 -0
data-bin/test.en-ar.ar +1 -0
data-bin/test.en-ar.en +1 -0
data-bin/test.en-en.en +1 -0
data-bin/test.en-es.en +1 -0
data-bin/test.en-es.es +1 -0
data-bin/test.en-fr.en +1 -0
data-bin/test.en-fr.fr +1 -0
data-bin/test.en-ru.en +1 -0
data-bin/test.en-ru.ru +1 -0
data-bin/test.en-zh.en +1 -0
data-bin/test.en-zh.zh +1 -0
data-bin/test.es-en.en +1 -0
data-bin/test.es-en.es +1 -0
data-bin/test.es-ru.es +1 -0
data-bin/test.es-ru.ru +1 -0
data-bin/test.es-zh.es +1 -0
data-bin/test.es-zh.zh +1 -0
data-bin/test.zh-en.en +1 -0
data-bin/test.zh-en.zh +1 -0
docs/img.png +0 -0
eval.sh +166 -0
examples/configs/eval_benchmarks.yml +80 -0
examples/configs/parallel_mono_12e12d_contrastive.yml +44 -0
fairseq +1 -0
hubconf.py +69 -0
input.ar +1 -0
input.en +1 -0
input.es +1 -0
input.fr +1 -0

.gitignore ADDED

	@@ -0,0 +1,9 @@

+12e12d_no_mono.pt
+6e6d_no_mono.pt
+ConST/
+ConSTtester.py
+appCopy.py
+bpe_vocab.1
+codes.bpe.32000.1
+constcaller.py
+requirements.apt.txt

12e12d_last.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f4e378559de68a6b82c7d2e0a48ce4054fbb943305682b6ad389378013e888e2
+size 5586588620
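
The three added lines are a Git LFS pointer, not the checkpoint itself: the repository stores only the spec version, a content hash, and the byte size. A minimal sketch of reading such a pointer, assuming one space-separated `key value` pair per line as shown above (`parse_lfs_pointer` is a hypothetical helper, not part of this commit):

```python
def parse_lfs_pointer(text):
    """Parse the text of a Git LFS pointer file into its fields."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields.get("oid", "").partition(":")
    return {
        "version": fields.get("version"),
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:f4e378559de68a6b82c7d2e0a48ce4054fbb943305682b6ad389378013e888e2
size 5586588620
"""
pointer = parse_lfs_pointer(pointer_text)
# Report the hash algorithm and the checkpoint size in GiB.
print(pointer["hash_algo"], round(pointer["size_bytes"] / 2**30, 2), "GiB")
```

The size field shows the `12e12d_last.pt` checkpoint is about 5.2 GiB, which is worth keeping in mind when the Space has to load it at startup.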

LICENSE ADDED

	@@ -0,0 +1,21 @@

+MIT License
+Copyright (c) Facebook, Inc. and its affiliates.
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

README.md CHANGED

@@ -1,13 +1,164 @@
 ---
-title: S2SCascadeDemo
-emoji: 🏆
-colorFrom: indigo
-colorTo: blue
-sdk: gradio
-sdk_version: 3.42.0
-app_file: app.py
-pinned: false
-license: openrail
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# Contrastive Learning for Many-to-many Multilingual Neural Machine Translation (mCOLT/mRASP2), ACL 2021
+The code for training mCOLT/mRASP2, a multilingual neural machine translation training method, implemented based on [fairseq](https://github.com/pytorch/fairseq).
+**mRASP2**: [paper](https://arxiv.org/abs/2105.09501) [blog](https://medium.com/@panxiao1994/mrasp2-multilingual-nmt-advances-via-contrastive-learning-ac8c4c35d63)
+**mRASP**: [paper](https://www.aclweb.org/anthology/2020.emnlp-main.210.pdf),
+[code](https://github.com/linzehui/mRASP)
 ---
+## News
+We have released two versions; this one is the original. In this implementation:
+- You should first merge all data, prepending a language token to each sentence to indicate its language.
+- AA/RAS must be done offline (before binarization); see [this toolkit](https://github.com/linzehui/mRASP/blob/master/preprocess).
+**New implementation**: https://github.com/PANXiao1994/mRASP2/tree/new_impl
+* Acknowledgement: This work is supported by [Bytedance](https://bytedance.com). We thank [Chengqi](https://github.com/zhaocq-nlp) for uploading all files and checkpoints.
+## Introduction
+mRASP2/mCOLT, representing multilingual Contrastive Learning for Transformer, is a multilingual neural machine translation model that supports complete many-to-many multilingual machine translation. It employs both parallel corpora and multilingual corpora in a unified training framework. For detailed information please refer to the paper.
+![img.png](docs/img.png)
+## Pre-requisite
+```bash
+pip install -r requirements.txt
+# install fairseq
+git clone https://github.com/pytorch/fairseq
+cd fairseq
+pip install --editable ./
+```
+## Training Data and Checkpoints
+We release our preprocessed training data and checkpoints in the following.
+### Dataset
+We merge 32 English-centric language pairs, resulting in 64 directed translation pairs in total. The original 32 language pairs corpus contains about 197M pairs of sentences. We get about 262M pairs of sentences after applying RAS, since we keep both the original sentences and the substituted sentences. We release both the original dataset and dataset after applying RAS.
+| Dataset | #Pair |
+| --- | --- |
+| [32-lang-pairs-TRAIN](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/bin_parallel/download.sh) | 197603294 |
+| [32-lang-pairs-RAS-TRAIN](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/bin_parallel_ras/download.sh) | 262662792 |
+| [mono-split-a](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/bin_mono_split_a/download.sh) | - |
+| [mono-split-b](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/bin_mono_split_b/download.sh) | - |
+| [mono-split-c](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/bin_mono_split_c/download.sh) | - |
+| [mono-split-d](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/bin_mono_split_d/download.sh) | - |
+| [mono-split-e](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/bin_mono_split_e/download.sh) | - |
+| [mono-split-de-fr-en](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/bin_mono_de_fr_en/download.sh) | - |
+| [mono-split-nl-pl-pt](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/bin_mono_nl_pl_pt/download.sh) | - |
+| [32-lang-pairs-DEV-en-centric](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/bin_dev_en_centric/download.sh) | - |
+| [32-lang-pairs-DEV-many-to-many](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/bin_dev_m2m/download.sh) | - |
+| [Vocab](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/bpe_vocab) | - |
+| [BPE Code](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/emnlp2020/mrasp/pretrain/dataset/codes.bpe.32000) | - |
+### Checkpoints & Results
+* **Please note that the provided checkpoints are slightly different from those in the paper.** In the following sections, we report the results of the provided checkpoints.
+#### English-centric Directions
+We report **tokenized BLEU** in the following table. Click the model links to download the checkpoints, which are in PyTorch format. (See eval.sh for details.)
+|Models  | [6e6d-no-mono](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/6e6d_no_mono.pt) | [12e12d-no-mono](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/12e12d_no_mono.pt) | [12e12d](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/12e12d_last.pt) |
+| --- | --- | --- | --- |
+| en2cs/wmt16 | 21.0 | 22.3 | 23.8 |
+| cs2en/wmt16 | 29.6 | 32.4 | 33.2 |
+| en2fr/wmt14 | 42.0 | 43.3 | 43.4 |
+| fr2en/wmt14 | 37.8 | 39.3 | 39.5 |
+| en2de/wmt14 | 27.4 | 29.2 | 29.5 |
+| de2en/wmt14 | 32.2 | 34.9 | 35.2 |
+| en2zh/wmt17 | 33.0 | 34.9 | 34.1 |
+| zh2en/wmt17 | 22.4 | 24.0 | 24.4 |
+| en2ro/wmt16 | 26.6 | 28.1 | 28.7 |
+| ro2en/wmt16 | 36.8 | 39.0 | 39.1 |
+| en2tr/wmt16 | 18.6 | 20.3 | 21.2 |
+| tr2en/wmt16 | 22.2 | 25.5 | 26.1 |
+| en2ru/wmt19 | 17.4 | 18.5 | 19.2 |
+| ru2en/wmt19 | 22.0 | 23.2 | 23.6 |
+| en2fi/wmt17 | 20.2 | 22.1 | 22.9 |
+| fi2en/wmt17 | 26.1 | 29.5 | 29.7 |
+| en2es/wmt13 | 32.8 | 34.1 | 34.6 |
+| es2en/wmt13 | 32.8 | 34.6 | 34.7 |
+| en2it/wmt09 | 28.9 | 30.0 | 30.8 |
+| it2en/wmt09 | 31.4 | 32.7 | 32.8 |
+#### Unsupervised Directions
+We report **tokenized BLEU** in the following table. (check eval.sh for details)
+| | 12e12d |
+| --- | --- |
+| en2pl/wmt20 | 6.2 |
+| pl2en/wmt20 | 13.5 |
+| en2nl/iwslt14 | 8.8 |
+| nl2en/iwslt14 | 27.1 |
+| en2pt/opus100 | 18.9 |
+| pt2en/opus100 | 29.2 |
+#### Zero-shot Directions
+* row: source language
+* column: target language
+We report **[sacreBLEU](https://github.com/mozilla/sacreBLEU)** in the following table.
+| 12e12d  | ar | zh | nl | fr | de | ru |
+| --- | --- | --- | --- | --- | --- | --- |
+| ar | - | 32.5 | 3.2 | 22.8 | 11.2 | 16.7 |
+| zh | 6.5 | - | 1.9 | 32.9 | 7.6 | 23.7 |
+| nl | 1.7 | 8.2 | - | 7.5 | 10.2 | 2.9 |
+| fr | 6.2 | 42.3 | 7.5 | - | 18.9 | 24.4 |
+| de | 4.9 | 21.6 | 9.2 | 24.7 | - | 14.4 |
+| ru | 7.1 | 40.6 | 4.5 | 29.9 | 13.5 | - |
+## Training
+```bash
+export NUM_GPU=4 && bash train_w_mono.sh ${model_config}
+```
+* An example `${model_config}` is given in `${PROJECT_REPO}/examples/configs/parallel_mono_12e12d_contrastive.yml`
+## Inference
+* You must prepend the corresponding language token to the source side before binarizing the test data.
+```bash
+fairseq-generate ${test_path} \
+    --user-dir ${repo_dir}/mcolt \
+    -s ${src} \
+    -t ${tgt} \
+    --skip-invalid-size-inputs-valid-test \
+    --path ${ckpts} \
+    --max-tokens ${batch_size} \
+    --task translation_w_langtok \
+    ${options} \
+    --lang-prefix-tok "LANG_TOK_"`echo "${tgt} " | tr '[a-z]' '[A-Z]'` \
+    --max-source-positions ${max_source_positions} \
+    --max-target-positions ${max_target_positions} \
+    --nbest 1 | grep -E '[S|H|P|T]-[0-9]+' > ${final_res_file}
+python3 ${repo_dir}/scripts/utils.py ${res_file} ${ref_file} || exit 1;
+```
+## Synonym dictionaries
+We use the bilingual synonym dictionaries provided by [MUSE](https://github.com/facebookresearch/MUSE).
+We generate multilingual synonym dictionaries using [this script](https://github.com/linzehui/mRASP/blob/master/preprocess/tools/ras/multi_way_word_graph.py), and apply
+RAS using [this script](https://github.com/linzehui/mRASP/blob/master/preprocess/tools/ras/random_alignment_substitution_w_multi.sh).
+| Description | File | Size |
+| --- | --- | --- |
+| dep=1 | [synonym_dict_raw_dep1](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/synonym_dict_raw_dep1) | 138.0 M |
+| dep=2 | [synonym_dict_raw_dep2](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/synonym_dict_raw_dep2) | 1.6 G |
+| dep=3 | [synonym_dict_raw_dep3](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/acl2021/mrasp2/synonym_dict_raw_dep3) | 2.2 G |
+## Contact
+Please contact me via e-mail `[email protected]` or via [wechat/zhihu](https://fork-ball-95c.notion.site/mRASP2-4e9b3450d5aa4137ae1a2c46d5f3c1fa) or join [the slack group](https://mrasp2.slack.com/join/shared_invite/zt-10k9710mb-MbDHzDboXfls2Omd8cuWqA)!
+## Citation
+Please cite as:
+```
+@inproceedings{mrasp2,
+  title = {Contrastive Learning for Many-to-many Multilingual Neural Machine Translation},
+  author= {Xiao Pan and
+           Mingxuan Wang and
+           Liwei Wu and
+           Lei Li},
+  booktitle = {Proceedings of ACL 2021},
+  year = {2021},
+}
+```
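
The README requires a language token on the source side, and its `fairseq-generate` command builds that token as `LANG_TOK_` followed by the upper-cased language code. A minimal sketch of the same convention in Python, assuming the token is separated from the sentence by a single space (`lang_token` and `prepend_lang_token` are hypothetical helpers, not functions from the repository):

```python
def lang_token(lang_code):
    """Build the language token for a code, e.g. 'en' -> 'LANG_TOK_EN'."""
    return "LANG_TOK_" + lang_code.strip().upper()


def prepend_lang_token(sentence, lang_code):
    """Return the sentence with its language token in front.

    Assumption: token and sentence are joined by one space.
    """
    return f"{lang_token(lang_code)} {sentence.strip()}"


tagged = prepend_lang_token("Hello world", "en")
print(tagged)
```

Applying this to every line before binarization would correspond to the "merge all data" step described under News; the exact preprocessing used by the authors is in the linked toolkit.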

__pycache__/ConSTtester.cpython-310.pyc ADDED

Binary file (1.1 kB).

__pycache__/ConSTtester.cpython-311.pyc ADDED

Binary file (1.03 kB).

__pycache__/mRASPloader.cpython-310.pyc ADDED

Binary file (7.61 kB).

__pycache__/mRASPloader.cpython-311.pyc ADDED

Binary file (14.6 kB).

app ADDED

File without changes

app.py ADDED

	@@ -0,0 +1,420 @@

+# imports
+import os
+import sys
+import gradio as gr
+import whisper
+import torch
+import traceback
+import shutil
+import yaml
+import re
+from pydub import AudioSegment
+from huggingface_hub import snapshot_download
+import json
+import requests
+import wave
+from pynvml import *
+import time
+import mRASPloader
+torch.cuda.empty_cache()
+# TTS header and url
+# The API token is read from the environment rather than committed; the variable name EDENAI_API_KEY is an assumption.
+headers = {"Authorization": "Bearer " + os.environ.get("EDENAI_API_KEY", "")}
+url ="https://api.edenai.run/v2/audio/text_to_speech"
+# the model we are using for ASR, options are small, medium, large and largev2 (large and largev2 don't fit on huggingface cpu)
+model = whisper.load_model("medium")
+# A table to look up all the languages
+language_id_lookup = {
+            "Arabic"    : "ar",
+            "English"   : "en",
+            "Chinese"   : "zh",
+            "Spanish"   : "es",
+            "Russian"   : "ru",
+            "French"    : "fr",
+            "German"    : "de",
+            "Italian"   : "it",
+            "Netherlands": "nl",
+            "Portuguese": "pt",
+            "Romanian"  : "ro",
+            }
+# A lookup table for ConST
+LANG_GEN_SETUPS = {
+    "de": {"beam": 10, "lenpen": 0.7},
+    "es": {"beam": 10, "lenpen": 0.1},
+    "fr": {"beam": 10, "lenpen": 1.0},
+    "it": {"beam": 10, "lenpen": 0.5},
+    "nl": {"beam": 10, "lenpen": 0.4},
+    "pt": {"beam": 10, "lenpen": 0.9},
+    "ro": {"beam": 10, "lenpen": 1.0},
+    "ru": {"beam": 10, "lenpen": 0.3},
+}
+# A lookup table for TTS (edenai)
+lang2voice = {
+            "Arabic"    : ["ar-XA", "MALE"],
+            "English"   : ["en-US", "FEMALE"],
+            "Chinese"   : ["cmn-TW", "MALE"],
+            "Spanish"   : ["es-ES","MALE"],
+            "Russian"   : ["ru-RU", "FEMALE"],
+            "French"    : ["fr-FR", "FEMALE"],
+            "German"    : ["de-DE", "MALE"],
+            "Italian"   : ["it-IT", "FEMALE"],
+            "Netherlands": ["nl-NL", "MALE"],
+            "Portuguese": ["pt-BR", "FEMALE"],
+            "Romanian"  : ["ro-RO", "MALE"],
+            }
+# load whisper
+os.system("pip install git+https://github.com/openai/whisper.git")
+# load mRASP2
+# load ConST
+#os.system("git clone https://github.com/ReneeYe/ConST")
+#os.system("mv ConST ConST_git")
+#os.system('mv -n ConST_git/* ./')
+#os.system("rm -rf ConST_git")
+#os.system("pip3 install --editable ./")
+#os.system("mkdir -p data checkpoint")
+huggingface_model_dir = snapshot_download(repo_id="ReneeYe/ConST_en2x_models")
+print(huggingface_model_dir)
+def restrict_src_options(model_type):
+    if model_type == 'Whisper+mRASP2':
+        return gr.Dropdown.update(visible= True), gr.Dropdown.update(visible= True), gr.Dropdown.update(visible= False), gr.Button.update(visible= True)
+    else:
+        return gr.Dropdown.update(visible= False), gr.Dropdown.update(visible= False), gr.Dropdown.update(visible= True), gr.Button.update(visible= False)
+def switchLang(src_lang, tgt_lang):
+    return tgt_lang, src_lang
+# The predict function. audio, language and mic_audio are all parameters directly passed by gradio
+# which means they are user inputted. They are specified in gr.inputs[] block at the bottom. The
+# gr.outputs[] block will specify the output type.
+def predict(audio, src_language, tgt_language_mRASP, tgt_language_ConST, model_type, mic_audio=None):
+    # checks if mic_audio is used, otherwise feeds model uploaded audio
+    start_predict = time.time()
+    if mic_audio is not None:
+        input_audio = mic_audio
+    elif audio is not None:
+        input_audio = audio
+    else:
+        # the click handler expects three outputs: transcript, translation, audio
+        return "(please provide audio)", "", None
+    transcript = "Undefined"
+    translation = "Undefined"
+    if model_type == 'Whisper+mRASP2':
+        transcript, translation = predictWithmRASP2(input_audio, src_language, tgt_language_mRASP)
+        language = tgt_language_mRASP
+    elif model_type == 'ConST':
+        # keep the ConST result; otherwise translation stays "Undefined"
+        translation = predictWithConST(input_audio, tgt_language_ConST)
+        language = tgt_language_ConST
+    start_tts = time.time()
+    payload={
+    "providers": "google",
+    "language": lang2voice[language][0],
+    "option": lang2voice[language][1],
+    "text": translation,
+    }
+    response = requests.post(url, json=payload, headers=headers)
+    result = json.loads(response.text)
+    os.system('wget -O output.wav "{}"'.format(result['google']['audio_resource_url']))
+    tts_time = time.time() - start_tts
+    print(f"Took {tts_time} to do text to speech")
+    total_time = time.time() - start_predict
+    print(f"Took {total_time} to do entire prediction")
+    return transcript, translation, "output.wav"
+def predictWithmRASP2(input_audio, src_language, tgt_language):
+    print("Called predictWithmRASP2")
+    # Uses the model's preprocessing methods to preprocess audio
+    asr_start = time.time()
+    audio = whisper.load_audio(input_audio)
+    audio = whisper.pad_or_trim(audio)
+    # Calculates the log-Mel spectrogram
+    mel = whisper.log_mel_spectrogram(audio).to(model.device)
+    # if the model is supposed to detect the language, set src_language to None
+    # otherwise set it to the specified language code
+    if(src_language == "Detect Language"):
+        src_language = None
+    else:
+        src_language = language_id_lookup[src_language.split()[0]]
+    tgt_language = language_id_lookup[tgt_language.split()[0]]
+    # Runs the audio through the whisper model and gets the DecodingResult object, which has the features:
+    # audio_features (Tensor), language, language_probs, tokens, text, avg_logprob, no_speech_prob, temperature, compression_ratio
+    # asr
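+    # fp16 decoding only works on GPU, so fall back to fp32 when CUDA is unavailable
+    options = whisper.DecodingOptions(fp16 = torch.cuda.is_available(), language = src_language)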
+    result = whisper.decode(model, mel, options)
+    if src_language is None:
+        src_language = result.language
+    transcript = result.text
+    asr_time = time.time() - asr_start
+    mt_start_time = time.time()
+    # mt
+    with open("input." + src_language, 'w') as w:
+        w.write(result.text)
+    with open("input." + tgt_language, 'w') as w:
+        w.write('LANG_TOK_' + src_language.upper())
+    #os.system("python3 fairseq/fairseq_cli/preprocess.py --dataset-impl raw \
+    #          --srcdict bpe_vocab --tgtdict bpe_vocab --testpref input -s {} -t {}".format( \
+    #    src_language, tgt_language))
+    #previous way of doing it
+    old_way = """os.system("python3 fairseq/fairseq_cli/interactive.py ./data-bin \
+              --user-dir mcolt \
+              -s zh \
+              -t en \
+              --skip-invalid-size-inputs-valid-test \
+              --path {} \
+              --max-tokens 1024 \
+              --task translation_w_langtok \
+              --lang-prefix-tok \"LANG_TOK_{}\" \
+              --max-source-positions 1024 \
+              --max-target-positions 1024 \
+              --nbest 1 \
+              --bpe subword_nmt \
+              --bpe-codes codes.bpe.32000 \
+              --post-process --tokenizer moses \
+              --input input.{} | grep -E '[D]-[0-9]+' > output".format(
+        model_name, tgt_language.upper(), src_language))"""
+    translation = mRASPloader.infer(cfg, models, task, max_positions, tokenizer, bpe, use_cuda, generator, src_dict, tgt_dict, align_dict, start_time, start_id, src_language, tgt_language)
+    translation = (' '.join(translation.split(' ')[1:])).strip()
+    mt_time = time.time() - mt_start_time
+    print(f"Took {mt_time} to do Machine Translation")
+    #print(model_name)
+    #with open("output", 'r') as r:
+    #    translation = "Undefined"
+    #    translation = (' '.join(r.readline().split(' ')[1:])).strip()
+    #    print(translation)
+    # Returns the text
+    print("returning transcript: " + transcript + " and the translation: " + translation)
+    return transcript, translation
+# Helper methods for ConST (as written in https://huggingface.co/spaces/ReneeYe/ConST-speech2text-translator/blob/main/app.py)
+def convert_audio_to_16k_wav(audio_input):
+    sound = AudioSegment.from_file(audio_input)
+    sample_rate = sound.frame_rate
+    num_channels = sound.channels
+    num_frames = int(sound.frame_count())
+    filename = audio_input.split("/")[-1]
+    print("original file is at:", audio_input)
+    if (num_channels > 1) or (sample_rate != 16000): # convert to mono-channel 16k wav
+        if num_channels > 1:
+            sound = sound.set_channels(1)
+        if sample_rate != 16000:
+            sound = sound.set_frame_rate(16000)
+        num_frames = int(sound.frame_count())
+        filename = filename.replace(".wav", "") + "_16k.wav"
+        sound.export(f"data/{filename}", format="wav")
+    else:
+        shutil.copy(audio_input, f'data/{filename}')
+    return filename, num_frames
+def prepare_tsv(file_name, n_frame, language, task="ST"):
+    tgt_lang = language_id_lookup[language]
+    with open("data/test_case.tsv", "w") as f:
+        f.write("id\taudio\tn_frames\ttgt_text\tspeaker\tsrc_lang\ttgt_lang\tsrc_text\n")
+        f.write(f"sample\t{file_name}\t{n_frame}\tThis is in {tgt_lang}.\tspk.1\ten\t{tgt_lang}\tThis is English.\n")
+def get_vocab_and_yaml(language):
+    tgt_lang = language_id_lookup[language]
+    # get: spm_ende.model and spm_ende.txt, and save to data/xxx
+    # if exist, no need to download
+    shutil.copy(os.path.join(huggingface_model_dir, f"vocabulary/spm_en{tgt_lang}.model"), "./data")
+    shutil.copy(os.path.join(huggingface_model_dir, f"vocabulary/spm_en{tgt_lang}.txt"), "./data")
+    # write yaml file
+    abs_path = os.getcwd()
+    # copy the per-language setup so the shared LANG_GEN_SETUPS table is not mutated
+    yaml_dict = dict(LANG_GEN_SETUPS[tgt_lang])
+    yaml_dict["input_channels"] = 1
+    yaml_dict["use_audio_input"] = True
+    yaml_dict["prepend_tgt_lang_tag"] = True
+    yaml_dict["prepend_src_lang_tag"] = True
+    yaml_dict["audio_root"] = os.path.join(abs_path, "data")
+    yaml_dict["vocab_filename"] = f"spm_en{tgt_lang}.txt"
+    yaml_dict["bpe_tokenizer"] = {"bpe": "sentencepiece",
+                                  "sentencepiece_model": os.path.join(abs_path, f"data/spm_en{tgt_lang}.model")}
+    with open("data/config.yaml", "w") as f:
+        yaml.dump(yaml_dict, f)
+def get_model(language):
+    # download models to checkpoint/xxx
+    return os.path.join(huggingface_model_dir, f"models/const_en{language_id_lookup[language]}.pt")
+def generate(model_path):
+    os.system(f"python3 fairseq/fairseq_cli/generate.py data/ --gen-subset test_case --task speech_to_text --prefix-size 1 \
+                 --max-source-positions 4000000 \
+                --config-yaml config.yaml  --path {model_path} | tee temp.txt")
+    print("No problem with 1st line")
+    output = os.popen("grep ^D temp.txt | sort -n -k 2 -t '-' | cut -f 3")
+    return output.read().strip()
+def post_processing(raw_sentence):
+    output_sentence = raw_sentence
+    if ":" in raw_sentence:
+        splited_sent = raw_sentence.split(":")
+        if len(splited_sent) == 2:
+            prefix = splited_sent[0].strip()
+            if len(prefix) <= 3:
+                output_sentence = splited_sent[1].strip()
+            elif ("(" in prefix) and (")" in prefix):
+                bgm = re.findall(r"\(.*?\)", prefix)[0]
+                if len(prefix.replace(bgm, "").strip()) <= 3:
+                    output_sentence = splited_sent[1].strip()
+                elif len(splited_sent[1].strip()) > 8:
+                    output_sentence = splited_sent[1].strip()
+    elif ("(" in raw_sentence) and (")" in raw_sentence):
+        bgm_list = re.findall(r"\(.*?\)", raw_sentence)
+        for bgm in bgm_list:
+            if len(raw_sentence.replace(bgm, "").strip()) > 5:
+                output_sentence = output_sentence.replace(bgm, "").strip()
+        if len(output_sentence) <= 5:
+            output_sentence = raw_sentence
+    return output_sentence
+def remove_temp_files(audio_file):
+    os.remove("temp.txt")
+    os.remove("data/test_case.tsv")
+    os.remove(f"data/{audio_file}")
+def error_output(language):
+    return f"Failed to translate the audio into {language}; you may use the examples I provide."
+# Predicting the translation with ConST
+def predictWithConST(audio_file, language):
+    try:
+        converted_audio_file, n_frame = convert_audio_to_16k_wav(audio_file)
+        prepare_tsv(converted_audio_file, n_frame, language)
+        get_vocab_and_yaml(language)
+        model_path = get_model(language)
+        print("This is the model path: " + model_path)
+        generate_model_path = generate(model_path)
+        print("No problem generating model path")
+        generated_output = post_processing(generate_model_path)
+        print("No problem generating output")
+        remove_temp_files(converted_audio_file)
+        print("No problem removing_temp")
+        return generated_output
+    except Exception:
+        traceback.print_exc()
+        return error_output(language)
+title = "Demo for Speech Translation (Whisper+mRASP2 and ConST)"
+description = """
+<b>How to use:</b> Upload an audio file or record using the microphone. The audio is either transcribed by the OpenAI Whisper model
+and then translated by mRASP2, or passed to ConST, which takes the audio input directly and produces text in the desired language. When using Whisper+mRASP2,
+you can ask the model to detect the language, and it will report the language it detected. ConST only supports translating from English into another language.
+"""
+# The gradio block
+cfg = mRASPloader.createCFG()
+print(cfg)
+models, task, max_positions, tokenizer, bpe, use_cuda, generator, src_dict, tgt_dict, align_dict, start_time, start_id = mRASPloader.loadmRASP2(cfg)
+demo = gr.Blocks()
+with demo:
+    gr.Markdown("# " + title)
+    gr.Markdown("###" + description)
+    with gr.Row():
+        with gr.Column():
+            model_type = gr.Dropdown(['Whisper+mRASP2', 'ConST'], type = "value", value = 'Whisper+mRASP2', label = "Select the model you want to use.")
+            audio_file = gr.Audio(label="Upload Speech", source="upload", type="filepath")
+            src_language = gr.Dropdown(['Arabic',
+                                    'Chinese',
+                                    'English',
+                                    'Spanish',
+                                    'Russian',
+                                    'French',
+                                    'Detect Language'], value = 'English', label="Select the language of input")
+            tgt_language_mRASP = gr.Dropdown(['Arabic',
+                                    'Chinese',
+                                    'English',
+                                    'Spanish',
+                                    'Russian',
+                                    'French'], type="value", value='English', label="Select the language of output")
+            tgt_language_ConST = gr.Dropdown(['German',
+                                              'Spanish',
+                                              'French',
+                                              'Italian',
+                                              'Netherlands',
+                                              'Portuguese',
+                                              'Romanian',
+                                              'Russian'], type = 'value', value='German', label="Select the language of output", visible= False)
+            switch_lang_button = gr.Button("Switch input and output languages")
+            mic_audio = gr.Audio(label="Record Speech", source="microphone", type="filepath")
+            model_type.change(fn = restrict_src_options, inputs=[model_type], outputs=[src_language, tgt_language_mRASP, tgt_language_ConST, switch_lang_button])
+            submit_button = gr.Button("Submit")
+        with gr.Column():
+            transcript = gr.Text(label= "Transcription")
+            translate = gr.Text(label= "Translation")
+            translated_speech = gr.Audio(label="Translation Speech")
+    submit_button.click(fn = predict, inputs=[audio_file, src_language, tgt_language_mRASP, tgt_language_ConST, model_type, mic_audio], outputs=[transcript, translate, translated_speech])
+    switch_lang_button.click(switchLang, [src_language, tgt_language_mRASP], [src_language, tgt_language_mRASP])
+demo.launch(share= True)
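
In `generate()`, the hypotheses are pulled out of the fairseq log with the shell pipeline `grep ^D temp.txt | sort -n -k 2 -t '-' | cut -f 3`. The sketch below does roughly the same in Python, assuming tab-separated lines of the form `D-<id>	<score>	<text>`, which is what `cut -f 3` implies. `extract_hypotheses` is a hypothetical helper, the sample log is invented for illustration, and lines with fewer than three fields are skipped, which is a simplification of what `cut` does:

```python
def extract_hypotheses(generate_output):
    """Return the detokenized hypotheses, ordered by sample id."""
    rows = []
    for line in generate_output.splitlines():
        # Keep only detokenized-hypothesis lines ("D-<id>\t<score>\t<text>").
        if not line.startswith("D-"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue
        sample_id = fields[0].split("-", 1)[1]
        if not sample_id.isdigit():
            continue
        rows.append((int(sample_id), fields[2]))
    # Equivalent of `sort -n -k 2 -t '-'`: numeric sort on the id.
    rows.sort(key=lambda row: row[0])
    return "\n".join(text for _, text in rows).strip()


# Invented sample log, for illustration only.
sample_log = (
    "S-1\tsource b\n"
    "D-1\t-0.5\tzweiter Satz\n"
    "H-0\t-0.2\ttokenized hypothesis\n"
    "D-0\t-0.2\terster Satz\n"
)
print(extract_hypotheses(sample_log))
```

Parsing in Python avoids the intermediate `temp.txt` round trip through `grep`, `sort` and `cut`, at the cost of having to capture the generator's output in the first place.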

bpe_vocab ADDED

The diff for this file is too large to render.

cfg.txt ADDED

Binary file (23.4 kB).

cfgDefault.txt ADDED

	@@ -0,0 +1 @@

+ FairseqConfig(_name=None, common=CommonConfig(_name=None, no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma'), common_eval=CommonEvalConfig(_name=None, path=None, post_process=None, quiet=False, model_overrides='{}', results_path=None), distributed_training=DistributedTrainingConfig(_name=None, distributed_world_size=8, distributed_num_procs=8, distributed_rank=0, distributed_backend='nccl', distributed_init_method=None, distributed_port=-1, device_id=0, distributed_no_spawn=False, ddp_backend='pytorch_ddp', ddp_comm_hook='none', bucket_cap_mb=25, fix_batches_to_gpus=False, find_unused_parameters=False, gradient_as_bucket_view=False, fast_stat_sync=False, heartbeat_timeout=-1, broadcast_buffers=False, slowmo_momentum=None, slowmo_base_algorithm='localsgd', localsgd_frequency=3, nprocs_per_node=8, pipeline_model_parallel=False, pipeline_balance=None, pipeline_devices=None, pipeline_chunks=0, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_checkpoint='never', zero_sharding='none', fp16='${common.fp16}', memory_efficient_fp16='${common.memory_efficient_fp16}', tpu='${common.tpu}', no_reshard_after_forward=False, fp32_reduce_scatter=False, cpu_offload=False, 
use_sharded_state=False, not_fsdp_flatten_parameters=False), dataset=DatasetConfig(_name=None, num_workers=1, skip_invalid_size_inputs_valid_test=False, max_tokens=None, batch_size=None, required_batch_size_multiple=8, required_seq_len_multiple=1, dataset_impl=None, data_buffer_size=10, train_subset='train', valid_subset='valid', combine_valid_subsets=None, ignore_unused_valid_subsets=False, validate_interval=1, validate_interval_updates=0, validate_after_updates=0, fixed_validation_seed=None, disable_validation=False, max_tokens_valid='${dataset.max_tokens}', batch_size_valid='${dataset.batch_size}', max_valid_steps=None, curriculum=0, gen_subset='test', num_shards=1, shard_id=0, grouped_shuffling=False, update_epoch_batch_itr='${dataset.grouped_shuffling}', update_ordered_indices_seed=False), optimization=OptimizationConfig(_name=None, max_epoch=0, max_update=0, stop_time_hours=0, clip_norm=0.0, sentence_avg=False, update_freq=[1], lr=[0.25], stop_min_lr=-1.0, use_bmuf=False, skip_remainder_batch=False, debug_param_names=False), checkpoint=CheckpointConfig(_name=None, save_dir='checkpoints', restore_file='checkpoint_last.pt', continue_once=None, finetune_from_model=None, reset_dataloader=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, optimizer_overrides='{}', save_interval=1, save_interval_updates=0, keep_interval_updates=-1, keep_interval_updates_pattern=-1, keep_last_epochs=-1, keep_best_checkpoints=-1, no_save=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_save_optimizer_state=False, best_checkpoint_metric='loss', maximize_best_checkpoint_metric=False, patience=-1, checkpoint_suffix='', checkpoint_shard_count=1, load_checkpoint_on_all_dp_ranks=False, write_checkpoints_asynchronously=False, model_parallel_size='${common.model_parallel_size}'), bmuf=FairseqBMUFConfig(_name=None, block_lr=1, block_momentum=0.875, global_sync_iter=50, warmup_iterations=500, use_nbm=False, average_sync=False, 
distributed_world_size='${distributed_training.distributed_world_size}'), generation=GenerationConfig(_name=None, beam=5, beam_mt=0, nbest=1, max_len_a=0, max_len_b=200, max_len_a_mt=0, max_len_b_mt=200, min_len=1, match_source_len=False, unnormalized=False, no_early_stop=False, no_beamable_mm=False, lenpen=1, lenpen_mt=1, unkpen=0, replace_unk=None, sacrebleu=False, score_reference=False, prefix_size=0, no_repeat_ngram_size=0, sampling=False, sampling_topk=-1, sampling_topp=-1.0, constraints=None, temperature=1.0, diverse_beam_groups=-1, diverse_beam_strength=0.5, diversity_rate=-1.0, print_alignment=None, print_step=False, lm_path=None, lm_weight=0.0, iter_decode_eos_penalty=0.0, iter_decode_max_iter=10, iter_decode_force_max_iter=False, iter_decode_with_beam=1, iter_decode_with_external_reranker=False, retain_iter_history=False, retain_dropout=False, retain_dropout_modules=None, decoding_format=None, no_seed_provided=False, eos_token=None), eval_lm=EvalLMConfig(_name=None, output_word_probs=False, output_word_stats=False, context_window=0, softmax_batch=9223372036854775807), interactive=InteractiveConfig(_name=None, buffer_size=0, input='-'), model='???', task=None, criterion=None, optimizer=None, lr_scheduler=None, scoring=None, bpe=None, tokenizer=None, ema=EMAConfig(_name=None, store_ema=False, ema_decay=0.9999, ema_start_update=0, ema_seed_model=None, ema_update_freq=1, ema_fp32=False))

codes.bpe.32000 ADDED

The diff for this file is too large to render.

data-bin/dict.ar.txt ADDED

The diff for this file is too large to render.

data-bin/dict.en.txt ADDED

The diff for this file is too large to render.

data-bin/dict.es.txt ADDED

The diff for this file is too large to render.

data-bin/dict.fr.txt ADDED

The diff for this file is too large to render.

data-bin/dict.ru.txt ADDED

The diff for this file is too large to render.

data-bin/dict.zh.txt ADDED

The diff for this file is too large to render.

data-bin/preprocess.log ADDED

	@@ -0,0 +1,106 @@

+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='ru', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='zh', target_lang='en', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='zh', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='zh', target_lang='en', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='zh', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='ru', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='fr', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='zh', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='zh', target_lang='en', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='zh', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='en', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='zh', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='zh', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='zh', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='zh', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='en', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='en', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='es', target_lang='zh', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='zh', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='es', target_lang='en', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='zh', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='es', target_lang='en', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='es', target_lang='en', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='es', target_lang='ru', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='es', target_lang='ru', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='es', target_lang='en', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='es', target_lang='en', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='es', target_lang='en', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='ru', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='ru', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='ru', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='ar', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='zh', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='es', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='ru', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
+Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, aim_repo=None, aim_run_hash=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang='en', target_lang='fr', trainpref=None, validpref=None, testpref='input', align_suffix=None, destdir='data-bin', thresholdtgt=0, thresholdsrc=0, tgtdict='bpe_vocab', srcdict='bpe_vocab', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='raw', joined_dictionary=False, only_source=False, padding_factor=8, workers=1, dict_only=False)
+Wrote preprocessed data to data-bin
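
The `Namespace(...)` dumps above are the arguments `fairseq-preprocess` logs at startup, one per language pair. A minimal sketch of how one such invocation could be assembled from Python, assuming the logged fields map to command-line flags in the usual argparse way (`source_lang` to `--source-lang`, and so on). The helper name `build_preprocess_cmd` is hypothetical, and the command is only built here, not executed:

```python
import shlex


def build_preprocess_cmd(src, tgt, testpref="input", destdir="data-bin",
                         vocab="bpe_vocab"):
    """Assemble a fairseq-preprocess argv list mirroring the logged Namespace.

    Hypothetical helper: the values come from the log above, the function
    itself is not part of the repository.
    """
    return [
        "fairseq-preprocess",
        "--source-lang", src,
        "--target-lang", tgt,
        "--testpref", testpref,    # expects input.<src> and input.<tgt>
        "--destdir", destdir,
        "--srcdict", vocab,        # the same BPE vocabulary on both sides
        "--tgtdict", vocab,
        "--dataset-impl", "raw",   # keep plain-text files rather than binarizing
        "--workers", "1",
    ]


if __name__ == "__main__":
    # Target languages that appear in the log above.
    for tgt in ["ru", "ar", "zh", "es", "fr"]:
        print(shlex.join(build_preprocess_cmd("en", tgt)))
```

Returning a list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting concerns.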

data-bin/test.en-ar.ar ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_EN

data-bin/test.en-ar.en ADDED Viewed

	@@ -0,0 +1 @@


1	+ Hello.

data-bin/test.en-en.en ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_EN

data-bin/test.en-es.en ADDED Viewed

	@@ -0,0 +1 @@


1	+ Hello.

data-bin/test.en-es.es ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_EN

data-bin/test.en-fr.en ADDED Viewed

	@@ -0,0 +1 @@


1	+ Hello.

data-bin/test.en-fr.fr ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_EN

data-bin/test.en-ru.en ADDED Viewed

	@@ -0,0 +1 @@


1	+ Hello.

data-bin/test.en-ru.ru ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_EN

data-bin/test.en-zh.en ADDED Viewed

	@@ -0,0 +1 @@


1	+ Hello.

data-bin/test.en-zh.zh ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_EN

data-bin/test.es-en.en ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_ES

data-bin/test.es-en.es ADDED Viewed

	@@ -0,0 +1 @@


1	+ necesito un carro para ver mi comida

data-bin/test.es-ru.es ADDED Viewed

	@@ -0,0 +1 @@


1	+ Hola, soy Jeff. ¿Cómo estás?

data-bin/test.es-ru.ru ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_ES

data-bin/test.es-zh.es ADDED Viewed

	@@ -0,0 +1 @@


1	+ Hoy vamos a hacer un menú mexicano. Vamos a hacer un apetizer, una entree y un deser. Ahora vamos a hacer el deser, que es la flan. Lo estamos haciendo ahora porque se tiende muy largo. Así que vamos a seguir. Estos son nuestros ingredientes para hacer el flan, nuestro deser. Ahora ponemos el azúcar para que lo caramelizemos y lo ponemos en la flan.

data-bin/test.es-zh.zh ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_ES

data-bin/test.zh-en.en ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_ZH

data-bin/test.zh-en.zh ADDED Viewed

	@@ -0,0 +1 @@


1	+ 天气真好
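
The single-line files above follow a visible pattern: `test.<src>-<tgt>.<src>` holds the sentence to translate, while `test.<src>-<tgt>.<tgt>` holds only a `LANG_TOK_<SRC>` placeholder. A small sketch of that naming convention, inferred from the file names and contents in this diff rather than from any documented format; `write_raw_pair` is a hypothetical helper:

```python
import tempfile
from pathlib import Path


def write_raw_pair(destdir, src, tgt, sentence):
    """Write the two plain-text files for one source/target pair."""
    destdir = Path(destdir)
    destdir.mkdir(parents=True, exist_ok=True)
    prefix = f"test.{src}-{tgt}"
    src_path = destdir / f"{prefix}.{src}"
    tgt_path = destdir / f"{prefix}.{tgt}"
    src_path.write_text(sentence + "\n", encoding="utf-8")
    # Assumption from the files above: the target side is only a placeholder
    # that carries the upper-cased source-language token.
    tgt_path.write_text(f"LANG_TOK_{src.upper()}\n", encoding="utf-8")
    return src_path, tgt_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_raw_pair(tmp, "zh", "en", "天气真好"):
            print(path.name, "->", path.read_text(encoding="utf-8").strip())
```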

docs/img.png ADDED Viewed

eval.sh ADDED Viewed

	@@ -0,0 +1,166 @@

+#!/usr/bin/env bash
+# repo_dir: root directory of the project
+repo_dir="$( cd "$( dirname "$0" )" && pwd )"
+echo "==== Working directory: ====" >&2
+echo "${repo_dir}" >&2
+echo "============================" >&2
+test_config=$1
+source ${repo_dir}/scripts/load_config.sh ${test_config} ${repo_dir}
+model_dir=$2
+choice=$3  # all|best|last
+model_dir=${repo_dir}/model  # note: this overrides the model_dir taken from $2 above
+data_dir=${repo_dir}/data
+res_path=${model_dir}/results
+mkdir -p ${model_dir} ${data_dir} ${res_path}
+testset_name=data_testset_1_name
+testset_path=data_testset_1_path
+testset_ref=data_testset_1_ref
+testset_direc=data_testset_1_direction
+i=1
+testsets=""
+while [[ ! -z ${!testset_path} && ! -z ${!testset_direc} ]]; do
+    dataname=${!testset_name}
+    mkdir -p ${data_dir}/${!testset_direc}/${dataname} ${data_dir}/ref/${!testset_direc}/${dataname}
+    cp ${!testset_path}/* ${data_dir}/${!testset_direc}/${dataname}/
+    cp ${!testset_ref}/* ${data_dir}/ref/${!testset_direc}/${dataname}/
+    if [[ $testsets == "" ]]; then
+        testsets=${!testset_direc}/${dataname}
+    else
+        testsets=${testsets}:${!testset_direc}/${dataname}
+    fi
+    i=$((i+1))
+    testset_name=data_testset_${i}_name
+    testset_path=data_testset_${i}_path
+    testset_ref=data_testset_${i}_ref
+    testset_direc=data_testset_${i}_direction
+done
+IFS=':' read -r -a testset_list <<< ${testsets}
+bleu () {
+    src=$1
+    tgt=$2
+    res_file=$3
+    ref_file=$4
+    if [[ -f ${res_file} ]]; then
+        f_dirname=`dirname ${res_file}`
+        python3 ${repo_dir}/scripts/utils.py ${res_file} ${ref_file} || exit 1;
+        input_file="${f_dirname}/hypo.out.nobpe"
+        output_file="${f_dirname}/hypo.out.nobpe.final"
+        # form command
+        cmd="cat ${input_file}"
+        lang_token="LANG_TOK_"`echo "${tgt} " | tr '[a-z]' '[A-Z]'`
+        if [[ $tgt == "fr" ]]; then
+            cmd=$cmd" | sed -Ee 's/\"([^\"]*)\"/« \1 »/g'"
+        elif [[ $tgt == "zh" ]]; then
+            tokenizer="zh"
+        elif [[ $tgt == "ja" ]]; then
+            tokenizer="ja-mecab"
+        fi
+        [[ -z $tokenizer ]] && tokenizer="none"
+        cmd=$cmd" | sed -e s'|${lang_token} ||g' > ${output_file}"
+        eval $cmd || { echo "$cmd FAILED !"; exit 1; }
+        cat ${output_file} | sacrebleu -l ${src}-${tgt} -tok $tokenizer --short "${f_dirname}/ref.out" | awk '{print $3}'
+    else
+        echo "${res_file} does not exist!" >&2 && exit 1;
+    fi
+}
+# monitor: watch the results directory and score each result file once it is fully written
+# expected layout: ${ckptname}/${direction}/${testname}/orig.txt
+(inotifywait -r -m -e close_write ${res_path} |
+while read path action file; do
+    if [[ "$file" =~ .*txt$ ]]; then
+        tmp_str="${path%/*}"
+        testname="${tmp_str##*/}"
+        tmp_str="${tmp_str%/*}"
+        direction="${tmp_str##*/}"
+        tmp_str="${tmp_str%/*}"
+        ckptname="${tmp_str##*/}"
+        src_lang="${direction%2*}"
+        tgt_lang="${direction##*2}"
+        res_file=$path$file
+        ref_file=${data_dir}/ref/${direction}/${testname}/dev.${tgt_lang}
+        bleuscore=`bleu ${src_lang} ${tgt_lang} ${res_file} ${ref_file}`
+        bleu_str="$(date "+%Y-%m-%d %H:%M:%S")\t${ckptname}\t${direction}/${testname}\t$bleuscore"
+        echo -e ${bleu_str}  # to stdout
+        echo -e ${bleu_str} >> ${model_dir}/summary.log
+    fi
+done) &
+if [[ ${choice} == "all" ]]; then
+    filelist=`ls -la ${model_dir} | sort -k6,7 -r | awk '{print $NF}' | grep .pt$ | tr '\n' ' '`
+elif [[ ${choice} == "best" ]]; then
+    filelist="${model_dir}/checkpoint_best.pt"
+elif [[ ${choice} == "last" ]]; then
+    filelist="${model_dir}/checkpoint_last.pt"
+else
+    echo "invalid choice: '${choice}' (expected all|best|last)" >&2 && exit 2;
+fi
+N=${NUM_GPU}
+#export CUDA_VISIBLE_DEVICES=$(seq -s ',' 0 $(($N - 1)) )
+infer_test () {
+    test_path=$1
+    ckpts=$2
+    gpu=$3
+    final_res_file=$4
+    src=$5
+    tgt=$6
+    gpu_cmd="CUDA_VISIBLE_DEVICES=$gpu "
+    lang_token="LANG_TOK_"`echo "${tgt} " | tr '[a-z]' '[A-Z]'`
+    [[ -z ${max_source_positions} ]] && max_source_positions=1024
+    [[ -z ${max_target_positions} ]] && max_target_positions=1024
+    command=${gpu_cmd}"fairseq-generate ${test_path} \
+    --user-dir ${repo_dir}/mcolt \
+    -s ${src} \
+    -t ${tgt} \
+    --skip-invalid-size-inputs-valid-test \
+    --path ${ckpts} \
+    --max-tokens 1024 \
+    --task translation_w_langtok \
+    ${options} \
+    --lang-prefix-tok ${lang_token} \
+    --max-source-positions ${max_source_positions} \
+    --max-target-positions ${max_target_positions} \
+    --nbest 1 | grep -E '[S|H|P|T]-[0-9]+' > ${final_res_file}
+    "
+    echo "$command"
+}
+export -f infer_test
+i=0
+(for ckpt in ${filelist}
+do
+    for testset in "${testset_list[@]}"
+    do
+        ckptbase=`basename $ckpt`
+        ckptname="${ckptbase%.*}"
+        direction="${testset%/*}"
+        testname="${testset##*/}"
+        src_lang="${direction%2*}"
+        tgt_lang="${direction##*2}"
+        ((i=i%N)); ((i++==0)) && wait
+        test_path=${data_dir}/${testset}
+        echo "-----> "${ckptname}" | "${direction}/$testname" <-----" >&2
+        if [[ ! -d ${res_path}/${ckptname}/${direction}/${testname} ]]; then
+            mkdir -p ${res_path}/${ckptname}/${direction}/${testname}
+        fi
+        final_res_file="${res_path}/${ckptname}/${direction}/${testname}/orig.txt"
+        command=`infer_test ${test_path} ${model_dir}/${ckptname}.pt $((i-1)) ${final_res_file} ${src_lang} ${tgt_lang}`
+        echo "${command}"
+        eval $command &
+    done
+done)
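
The script recovers the checkpoint name, direction and test-set name purely from the result path, and splits directions such as `en2fr` at the last `2`. The same string handling, sketched in Python for illustration; the function names are hypothetical, and unlike the shell version the language token here has no trailing space:

```python
from pathlib import PurePosixPath


def split_direction(direction):
    """Split 'en2fr' into ('en', 'fr').

    Mirrors ${direction%2*} and ${direction##*2}: both cut at the last '2'.
    """
    src, _, tgt = direction.rpartition("2")
    return src, tgt


def lang_token(lang):
    """Build the language token for a language code, e.g. 'fr' gives 'LANG_TOK_FR'."""
    return "LANG_TOK_" + lang.upper()


def parse_result_path(path):
    """Recover (ckptname, direction, testname) from
    .../<ckptname>/<direction>/<testname>/orig.txt
    """
    parts = PurePosixPath(path).parts
    return parts[-4], parts[-3], parts[-2]


if __name__ == "__main__":
    ckpt, direction, test = parse_result_path(
        "model/results/checkpoint_best/en2fr/newstest2014/orig.txt"
    )
    src, tgt = split_direction(direction)
    print(ckpt, test, src, tgt, lang_token(tgt))
```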

examples/configs/eval_benchmarks.yml ADDED Viewed

	@@ -0,0 +1,80 @@

+data_testset_1:
+  direction: en2de
+  name: wmt14
+  path: data/binarized/en_de/en2de/wmt14
+  ref: data/dev/en2de/wmt14
+data_testset_10:
+  direction: ru2en
+  name: newstest2019
+  path: data/binarized/en_ru/ru2en/newstest2019
+  ref: data/dev/ru2en/newstest2019
+data_testset_11:
+  direction: en2fi
+  name: newstest2017
+  path: data/binarized/en_fi/en2fi/newstest2017
+  ref: data/dev/en2fi/newstest2017
+data_testset_12:
+  direction: fi2en
+  name: newstest2017
+  path: data/binarized/en_fi/fi2en/newstest2017
+  ref: data/dev/fi2en/newstest2017
+data_testset_13:
+  direction: en2cs
+  name: newstest2016
+  path: data/binarized/en_cs/en2cs/newstest2016
+  ref: data/dev/en2cs/newstest2016
+data_testset_14:
+  direction: cs2en
+  name: newstest2016
+  path: data/binarized/en_cs/cs2en/newstest2016
+  ref: data/dev/cs2en/newstest2016
+data_testset_15:
+  direction: en2et
+  name: newstest2018
+  path: data/binarized/en_et/en2et/newstest2018
+  ref: data/dev/en2et/newstest2018
+data_testset_16:
+  direction: et2en
+  name: newstest2018
+  path: data/binarized/en_et/et2en/newstest2018
+  ref: data/dev/et2en/newstest2018
+data_testset_2:
+  direction: de2en
+  name: wmt14
+  path: data/binarized/en_de/de2en/wmt14
+  ref: data/dev/de2en/wmt14
+data_testset_3:
+  direction: en2fr
+  name: newstest2014
+  path: data/binarized/en_fr/en2fr/newstest2014
+  ref: data/dev/en2fr/newstest2014
+data_testset_4:
+  direction: fr2en
+  name: newstest2014
+  path: data/binarized/en_fr/fr2en/newstest2014
+  ref: data/dev/fr2en/newstest2014
+data_testset_5:
+  direction: en2ro
+  name: wmt16
+  path: data/binarized/en_ro/en_ro/wmt16
+  ref: data/dev/en_ro/wmt16
+data_testset_6:
+  direction: ro2en
+  name: wmt16
+  path: data/binarized/en_ro/en_ro/wmt16
+  ref: data/dev/en_ro/wmt16
+data_testset_7:
+  direction: en2zh
+  name: wmt17
+  path: data/binarized/en_zh/en2zh/wmt17
+  ref: data/dev/en2zh/wmt17
+data_testset_8:
+  direction: zh2en
+  name: wmt17
+  path: data/binarized/en_zh/zh2en/wmt17
+  ref: data/dev/zh2en/wmt17
+data_testset_9:
+  direction: en2ru
+  name: newstest2019
+  path: data/binarized/en_ru/en2ru/newstest2019
+  ref: data/dev/en2ru/newstest2019
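
`eval.sh` reads these entries through flat variable names, starting from `data_testset_1_name`, `data_testset_1_path` and so on. A sketch of the flattening that this implies, assuming nested keys are joined with underscores; the actual behaviour of `scripts/load_config.sh` is not shown in this diff, so treat this as an illustration only:

```python
def flatten(config, prefix=""):
    """Flatten nested dicts: {'a': {'b': 1}} becomes {'a_b': 1}."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def iter_testsets(flat):
    """Yield test sets in index order until a path or direction is missing."""
    i = 1
    while (f"data_testset_{i}_path" in flat
           and f"data_testset_{i}_direction" in flat):
        yield {
            field: flat.get(f"data_testset_{i}_{field}")
            for field in ("name", "path", "ref", "direction")
        }
        i += 1


if __name__ == "__main__":
    # Two entries copied from the config above.
    config = {
        "data_testset_1": {
            "direction": "en2de", "name": "wmt14",
            "path": "data/binarized/en_de/en2de/wmt14",
            "ref": "data/dev/en2de/wmt14",
        },
        "data_testset_2": {
            "direction": "de2en", "name": "wmt14",
            "path": "data/binarized/en_de/de2en/wmt14",
            "ref": "data/dev/de2en/wmt14",
        },
    }
    for testset in iter_testsets(flatten(config)):
        print(f"{testset['direction']}/{testset['name']}")
```

Because the loop stops at the first missing index, the numbering has to be contiguous from 1; the order of entries in the YAML file itself does not matter.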

examples/configs/parallel_mono_12e12d_contrastive.yml ADDED Viewed

	@@ -0,0 +1,44 @@

+model_dir: model/pretrain/lab/multilingual/l2r/multi_bpe32k/parallel_mono_contrastive_1/transformer_big_t2t_12e12d
+data_1: data/multilingual/bin/merged_deduped_ras
+data_mono_1: data/multilingual/bin/mono_only/splitaa
+data_mono_2: data/multilingual/bin/mono_only/splitab
+data_mono_3: data/multilingual/bin/mono_only/splitac
+data_mono_4: data/multilingual/bin/mono_only/splitad
+data_mono_5: data/multilingual/bin/mono_only/splitae
+data_mono_6: data/multilingual/bin/mono_only/mono_de_fr_en
+data_mono_7: data/multilingual/bin/mono_only/mono_nl_pl_pt
+source_lang: src
+target_lang: trg
+task: translation_w_mono
+parallel_ratio: 0.2
+mono_ratio: 0.07
+arch: transformer_big_t2t_12e12d
+share_all_embeddings: true
+encoder_learned_pos: true
+decoder_learned_pos: true
+max_source_positions: 1024
+max_target_positions: 1024
+dropout: 0.1
+criterion: label_smoothed_cross_entropy_with_contrastive
+contrastive_lambda: 1.0
+temperature: 0.1
+lr: 0.0003
+clip_norm: 10.0
+optimizer: adam
+adam_eps: 1e-06
+weight_decay: 0.01
+warmup_updates: 10000
+label_smoothing: 0.1
+lr_scheduler: polynomial_decay
+min_lr: -1
+max_tokens: 1536
+update_freq: 30
+max_update: 5000000
+no_scale_embedding: true
+layernorm_embedding: true
+save_interval_updates: 2000
+skip_invalid_size_inputs_valid_test: true
+log_interval: 500
+num_workers: 1
+fp16: true
+seed: 33122
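
The batch size implied by this config is easier to read as a single number. In fairseq, `max_tokens` caps the tokens per batch on each GPU and `update_freq` accumulates gradients over that many batches before an optimizer step, so the tokens per update are roughly their product times the GPU count. A small worked calculation; the GPU count of 8 is a hypothetical value, since the config does not state one:

```python
def tokens_per_update(max_tokens, update_freq, num_gpus=1):
    """Upper bound on tokens contributing to one optimizer step."""
    return max_tokens * update_freq * num_gpus


if __name__ == "__main__":
    # Values from the config above: max_tokens=1536, update_freq=30.
    print(tokens_per_update(1536, 30))      # 46080 tokens per GPU
    # Hypothetical 8-GPU run; the real GPU count is not given in the config.
    print(tokens_per_update(1536, 30, 8))   # 368640 tokens
```

This is an upper bound, because batches are rarely filled exactly to `max_tokens`.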

fairseq ADDED Viewed

	@@ -0,0 +1 @@


1	+ Subproject commit 4db264940f281a6f47558d17387b1455d4abd8d9

hubconf.py ADDED Viewed

	@@ -0,0 +1,69 @@

+# Copyright (c) Facebook, Inc. and its affiliates.
+#
+# This source code is licensed under the MIT license found in the
+# LICENSE file in the root directory of this source tree.
+import functools
+import importlib
+from fairseq.hub_utils import (  # noqa; noqa
+    BPEHubInterface as bpe,
+    TokenizerHubInterface as tokenizer,
+)
+from fairseq.models import MODEL_REGISTRY  # noqa
+dependencies = [
+    "dataclasses",
+    "hydra",
+    "numpy",
+    "regex",
+    "requests",
+    "torch",
+]
+# Check for required dependencies and raise a RuntimeError if any are missing.
+missing_deps = []
+for dep in dependencies:
+    try:
+        importlib.import_module(dep)
+    except ImportError:
+        # Hack: the hydra package is provided under the "hydra-core" name in
+        # pypi. We don't want the user mistakenly calling `pip install hydra`
+        # since that will install an unrelated package.
+        if dep == "hydra":
+            dep = "hydra-core"
+        missing_deps.append(dep)
+if len(missing_deps) > 0:
+    raise RuntimeError("Missing dependencies: {}".format(", ".join(missing_deps)))
+# torch.hub doesn't build Cython components, so if they are not found then try
+# to build them here
+try:
+    import fairseq.data.token_block_utils_fast  # noqa
+except ImportError:
+    try:
+        import cython  # noqa
+        import os
+        from setuptools import sandbox
+        sandbox.run_setup(
+            os.path.join(os.path.dirname(__file__), "setup.py"),
+            ["build_ext", "--inplace"],
+        )
+    except ImportError:
+        print(
+            "Unable to build Cython components. Please make sure Cython is "
+            "installed if the torch.hub model you are loading depends on it."
+        )
+# automatically expose models defined in FairseqModel::hub_models
+for _model_type, _cls in MODEL_REGISTRY.items():
+    for model_name in _cls.hub_models().keys():
+        globals()[model_name] = functools.partial(
+            _cls.from_pretrained,
+            model_name,
+        )
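
The final loop of `hubconf.py` turns every name returned by `hub_models()` into a module-level callable by binding that name with `functools.partial`. A self-contained sketch of the same pattern using a toy registry; `ToyModel`, its model names and the `entrypoints` dict are made up for illustration and stand in for fairseq's `MODEL_REGISTRY` and `globals()`:

```python
import functools


class ToyModel:
    """Stand-in for a fairseq model class (hypothetical)."""

    @classmethod
    def hub_models(cls):
        # Maps public model names to where their weights would live.
        return {"toy.small": "path/small", "toy.large": "path/large"}

    @classmethod
    def from_pretrained(cls, model_name, **kwargs):
        return f"{cls.__name__} loaded from {cls.hub_models()[model_name]}"


MODEL_REGISTRY = {"toy": ToyModel}

# Same loop shape as hubconf.py, but writing into a dict instead of globals().
entrypoints = {}
for _model_type, _cls in MODEL_REGISTRY.items():
    for model_name in _cls.hub_models().keys():
        entrypoints[model_name] = functools.partial(
            _cls.from_pretrained,
            model_name,
        )


if __name__ == "__main__":
    print(entrypoints["toy.small"]())
    print(sorted(entrypoints))
```

Using `functools.partial` binds `model_name` at the moment each entry is created, which avoids the late-binding issue a plain `lambda` inside the loop would have.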

input.ar ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_EN

input.en ADDED Viewed

	@@ -0,0 +1 @@


1	+ Why are you such a sussy fucker?

input.es ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_EN

input.fr ADDED Viewed

	@@ -0,0 +1 @@


1	+ LANG_TOK_EN