# Masakhane - Machine Translation for African Languages (Using JoeyNMT)

## Note before beginning:
### - The idea is that you should be able to make minimal changes to this in order to get SOME result for your own translation corpus. 

### - The tl;dr: Go to the **"TODO"** comments which will tell you what to update to get up and running

### - If you actually want to have a clue what you're doing, read the text and peek at the links

### - With 100 epochs, it should take around 7 hours to run in Google Colab

### - Once you've gotten a result for your language, please attach and email your notebook that generated it to masakhanetranslation@gmail.com

### - If you care enough and get a chance, doing a brief background on your language would be amazing. See examples in  [(Martinus, 2019)](https://arxiv.org/abs/1906.05685)

## Retrieve your data & make a parallel corpus

If you are wanting to use the JW300 data referenced on the Masakhane website or in our GitHub repo, you can use `opus-tools` to convert the data into a convenient format. `opus_read` from that package provides a convenient tool for reading the native aligned XML files and to convert them to TMX format. The tool can also be used to fetch relevant files from OPUS on the fly and to filter the data as necessary. [Read the documentation](https://pypi.org/project/opustools-pkg/) for more details.

Once you have your corpus files in TMX format (an xml structure which will include the sentences in your target language and your source language in a single file), we recommend reading them into a pandas dataframe. Thankfully, Jade wrote a silly `tmx2dataframe` package which converts your tmx file to a pandas dataframe. 

In [0]:
"""from google.colab import drive
drive.mount('/content/drive')"""

In [1]:
# TODO: Set your source and target languages. Keep in mind, these traditionally use language codes as found here:
# These will also become the suffix's of all vocab and corpus files used throughout
import os
source_language = "en"
target_language = "tiv" 
lc = False  # If True, lowercase the data.
seed = 42  # Random seed for shuffling.
tag = "jw300-baseline" # Give a unique name to your folder - this is to ensure you don't rewrite any models you've already submitted

os.environ["src"] = source_language # Sets them in bash as well, since we often use bash scripts
os.environ["tgt"] = target_language
os.environ["tag"] = tag


# This will save it to a folder in our gdrive instead!
!mkdir -p "en_tiv/$src-$tgt-$tag"
os.environ["experiment_path"] = "en_tiv/%s-%s-%s" % (source_language, target_language, tag)

In [2]:
!mkdir -p "en_tiv/$src-$tgt-$tag"

In [3]:
!echo $experiment_path

en_tiv/en-tiv-jw300-baseline


In [4]:
# Install opus-tools
! pip install opustools-pkg

[33mYou are using pip version 10.0.1, however version 20.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.[0m


In [5]:
# Downloading our corpus
! opus_read -d JW300 -s $src -t $tgt -wm moses -w jw300.$src jw300.$tgt -q

# extract the corpus file
! gunzip JW300_latest_xml_$src-$tgt.xml.gz


Alignment file /proj/nlpl/data/OPUS/JW300/latest/xml/en-tiv.xml.gz not found. The following files are available for downloading:

   2 MB https://object.pouta.csc.fi/OPUS-JW300/v1/xml/en-tiv.xml.gz
 263 MB https://object.pouta.csc.fi/OPUS-JW300/v1/xml/en.zip
  25 MB https://object.pouta.csc.fi/OPUS-JW300/v1/xml/tiv.zip

 290 MB Total size
./JW300_latest_xml_en-tiv.xml.gz ... 100% of 2 MB
./JW300_latest_xml_en.zip ... 100% of 263 MB
./JW300_latest_xml_tiv.zip ... 100% of 25 MB


In [6]:
# Download the global test set.
! wget https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-any.en
  
# And the specific test set for this language pair.
os.environ["trg"] = target_language 
os.environ["src"] = source_language 

! wget https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-$trg.en 
! mv test.en-$trg.en test.en
! wget https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-$trg.$trg 
! mv test.en-$trg.$trg test.$trg

--2020-02-12 15:17:58--  https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-any.en
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 199.232.24.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|199.232.24.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 277791 (271K) [text/plain]
Saving to: ‘test.en-any.en’


2020-02-12 15:17:58 (26.3 MB/s) - ‘test.en-any.en’ saved [277791/277791]

--2020-02-12 15:17:58--  https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-tiv.en
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 199.232.24.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|199.232.24.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 26584 (26K) [text/plain]
Saving to: ‘test.en-tiv.en’


2020-02-12 15:17:59 (22.7 MB/s) - ‘test.en-tiv.en’ saved [26584/26584]

--2020-02-12 15:17:5

In [7]:
# Read the test data to filter from train and dev splits.
# Store english portion in set for quick filtering checks.
en_test_sents = set()
filter_test_sents = "test.en-any.en"
j = 0
with open(filter_test_sents) as f:
  for line in f:
    en_test_sents.add(line.strip())
    j += 1
print('Loaded {} global test sentences to filter from the training/dev data.'.format(j))

Loaded 3571 global test sentences to filter from the training/dev data.


In [8]:
import pandas as pd

# TMX file to dataframe
source_file = 'jw300.' + source_language
target_file = 'jw300.' + target_language

source = []
target = []
skip_lines = []  # Collect the line numbers of the source portion to skip the same lines for the target portion.
with open(source_file) as f:
    for i, line in enumerate(f):
        # Skip sentences that are contained in the test set.
        if line.strip() not in en_test_sents:
            source.append(line.strip())
        else:
            skip_lines.append(i)             
with open(target_file) as f:
    for j, line in enumerate(f):
        # Only add to corpus if corresponding source was not skipped.
        if j not in skip_lines:
            target.append(line.strip())
    
print('Loaded data and skipped {}/{} lines since contained in test set.'.format(len(skip_lines), i))
    
df = pd.DataFrame(zip(source, target), columns=['source_sentence', 'target_sentence'])
# if you get TypeError: data argument can't be an iterator is because of your zip version run this below
#df = pd.DataFrame(list(zip(source, target)), columns=['source_sentence', 'target_sentence'])
df.head(3)

Loaded data and skipped 2535/209778 lines since contained in test set.


Unnamed: 0,source_sentence,target_sentence
0,Questions From Readers,Mbampin Mba Mbaôron Takerada Ne Ve Pin la
1,How seriously should Christians view an engage...,Gba u ityendezwa i nomsoor vea kwase ve er u v...
2,An engagement to marry is a cause for happines...,Ityendezwa i̱ nomsoor vea kwase ve er ér vea v...


## Pre-processing and export

It is generally a good idea to remove duplicate translations and conflicting translations from the corpus. In practice, these public corpora include some number of these that need to be cleaned.

In addition we will split our data into dev/test/train and export to the filesystem.

In [9]:
# drop duplicate translations
df_pp = df.drop_duplicates()

# drop conflicting translations
# (this is optional and something that you might want to comment out 
# depending on the size of your corpus)
df_pp.drop_duplicates(subset='source_sentence', inplace=True)
df_pp.drop_duplicates(subset='target_sentence', inplace=True)

# Shuffle the data to remove bias in dev set selection.
df_pp = df_pp.sample(frac=1, random_state=seed).reset_index(drop=True)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy


In [10]:
# Install fuzzy wuzzy to remove "almost duplicate" sentences in the
# test and training sets.
! pip install fuzzywuzzy
! pip install python-Levenshtein
import time
from fuzzywuzzy import process
import numpy as np
from os import cpu_count
from functools import partial
from multiprocessing import Pool


# reset the index of the training set after previous filtering
df_pp.reset_index(drop=False, inplace=True)

# Remove samples from the training data set if they "almost overlap" with the
# samples in the test set.

# Filtering function. Adjust pad to narrow down the candidate matches to
# within a certain length of characters of the given sample.
def fuzzfilter(sample, candidates, pad):
  candidates = [x for x in candidates if len(x) <= len(sample)+pad and len(x) >= len(sample)-pad] 
  if len(candidates) > 0:
    return process.extractOne(sample, candidates)[1]
  else:
    return np.nan

[33mYou are using pip version 10.0.1, however version 20.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.[0m
[33mYou are using pip version 10.0.1, however version 20.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.[0m


In [11]:
start_time = time.time()
### iterating over pandas dataframe rows is not recomended, let use multi processing to apply the function

with Pool(cpu_count()-1) as pool:
    scores = pool.map(partial(fuzzfilter, candidates=list(en_test_sents), pad=5), df_pp['source_sentence'])
hours, rem = divmod(time.time() - start_time, 3600)
minutes, seconds = divmod(rem, 60)
print("done in {}h:{}min:{}seconds".format(hours, minutes, seconds))

# Filter out "almost overlapping samples"
df_pp = df_pp.assign(scores=scores)
df_pp = df_pp[df_pp['scores'] < 95]



done in 0.0h:18.0min:34.256463050842285seconds


In [12]:
# This section does the split between train/dev for the parallel corpora then saves them as separate files
# We use 1000 dev test and the given test set.
import csv

# Do the split between dev/train and create parallel corpora
num_dev_patterns = 1000

# Optional: lower case the corpora - this will make it easier to generalize, but without proper casing.
if lc:  # Julia: making lowercasing optional
    df_pp["source_sentence"] = df_pp["source_sentence"].str.lower()
    df_pp["target_sentence"] = df_pp["target_sentence"].str.lower()

# Julia: test sets are already generated
dev = df_pp.tail(num_dev_patterns) # Herman: Error in original
stripped = df_pp.drop(df_pp.tail(num_dev_patterns).index)

with open("train."+source_language, "w") as src_file, open("train."+target_language, "w") as trg_file:
  for index, row in stripped.iterrows():
    src_file.write(row["source_sentence"]+"\n")
    trg_file.write(row["target_sentence"]+"\n")
    
with open("dev."+source_language, "w") as src_file, open("dev."+target_language, "w") as trg_file:
  for index, row in dev.iterrows():
    src_file.write(row["source_sentence"]+"\n")
    trg_file.write(row["target_sentence"]+"\n")

#stripped[["source_sentence"]].to_csv("train."+source_language, header=False, index=False)  # Herman: Added `header=False` everywhere
#stripped[["target_sentence"]].to_csv("train."+target_language, header=False, index=False)  # Julia: Problematic handling of quotation marks.

#dev[["source_sentence"]].to_csv("dev."+source_language, header=False, index=False)
#dev[["target_sentence"]].to_csv("dev."+target_language, header=False, index=False)

# Doublecheck the format below. There should be no extra quotation marks or weird characters.
! head train.*
! head dev.*

==> train.en <==
How we say something can be as important as what we say .
All of them in wisdom you have made . The earth is full of your productions . ”
The surviving righteous ones will see that God’s prophetic word is true .
Take a lesson from the lilies of the field , how they are growing ; they do not toil , nor do they spin ; but I say to you that not even Solomon in all his glory was arrayed as one of these . ” ​ — Matthew 6 : 28 , 29 .
Now that some time has elapsed since that change was introduced , we can ask ourselves : ‘ Am I using the time made available to have a Family Worship evening or to engage in personal study ?
What adjustments did one young couple make , and with what results ?
Such promises are practical because they enable those who put faith in them to face the future with hope and confidence . ​ — Hebrews 11 : 6 .
A step toward good study habits is regularly to set aside time for Bible study .
For some weeks , the prophet Elijah has been a guest of the widow 



---


## Installation of JoeyNMT

JoeyNMT is a simple, minimalist NMT package which is useful for learning and teaching. Check out the documentation for JoeyNMT [here](https://joeynmt.readthedocs.io)  

In [13]:
# Install JoeyNMT
! git clone https://github.com/joeynmt/joeynmt.git
! cd joeynmt; pip3 install .

fatal: destination path 'joeynmt' already exists and is not an empty directory.
Processing /home/ec2-user/SageMaker/masakhane/joeynmt
Collecting torch>=1.1 (from joeynmt==0.0.1)
[?25l  Downloading https://files.pythonhosted.org/packages/24/19/4804aea17cd136f1705a5e98a00618cb8f6ccc375ad8bfa437408e09d058/torch-1.4.0-cp36-cp36m-manylinux1_x86_64.whl (753.4MB)
[K    100% |████████████████████████████████| 753.4MB 64kB/s  eta 0:00:01    21% |██████▊                         | 158.7MB 57.5MB/s eta 0:00:11    33% |██████████▊                     | 253.7MB 46.6MB/s eta 0:00:11    99% |███████████████████████████████▊| 745.9MB 56.9MB/s eta 0:00:01
[?25hCollecting tensorflow>=1.14 (from joeynmt==0.0.1)
[?25l  Downloading https://files.pythonhosted.org/packages/85/d4/c0cd1057b331bc38b65478302114194bd8e1b9c2bbc06e300935c0e93d90/tensorflow-2.1.0-cp36-cp36m-manylinux2010_x86_64.whl (421.8MB)
[K    100% |████████████████████████████████| 421.8MB 116kB/s eta 0:00:01    47% |███████████████▏       

Collecting werkzeug>=0.11.15 (from tensorboard<2.2.0,>=2.1.0->tensorflow>=1.14->joeynmt==0.0.1)
[?25l  Downloading https://files.pythonhosted.org/packages/ba/a5/d6f8a6e71f15364d35678a4ec8a0186f980b3bd2545f40ad51dd26a87fb1/Werkzeug-1.0.0-py2.py3-none-any.whl (298kB)
[K    100% |████████████████████████████████| 307kB 40.9MB/s ta 0:00:01
[?25hCollecting markdown>=2.6.8 (from tensorboard<2.2.0,>=2.1.0->tensorflow>=1.14->joeynmt==0.0.1)
[?25l  Downloading https://files.pythonhosted.org/packages/d5/16/c5a68ef8c62406b3bbd8f49199bbae56feb390746a284c4cf036c687465f/Markdown-3.2-py2.py3-none-any.whl (88kB)
[K    100% |████████████████████████████████| 92kB 37.9MB/s ta 0:00:01
[?25hCollecting google-auth-oauthlib<0.5,>=0.4.1 (from tensorboard<2.2.0,>=2.1.0->tensorflow>=1.14->joeynmt==0.0.1)
  Downloading https://files.pythonhosted.org/packages/7b/b8/88def36e74bee9fce511c9519571f4e485e890093ab7442284f4ffaef60b/google_auth_oauthlib-0.4.1-py2.py3-none-any.whl
Collecting google-auth<2,>=1.6.3 (

# Preprocessing the Data into Subword BPE Tokens

- One of the most powerful improvements for agglutinative languages (a feature of most Bantu languages) is using BPE tokenization [ (Sennrich, 2015) ](https://arxiv.org/abs/1508.07909).

- It was also shown that by optimizing the umber of BPE codes we significantly improve results for low-resourced languages [(Sennrich, 2019)](https://www.aclweb.org/anthology/P19-1021) [(Martinus, 2019)](https://arxiv.org/abs/1906.05685)

- Below we have the scripts for doing BPE tokenization of our data. We use 4000 tokens as recommended by [(Sennrich, 2019)](https://www.aclweb.org/anthology/P19-1021). You do not need to change anything. Simply running the below will be suitable. 

In [14]:
!sudo pip3 install subword-nmt

[33mYou are using pip version 19.0.2, however version 20.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.[0m


In [16]:
# One of the huge boosts in NMT performance was to use a different method of tokenizing. 
# Usually, NMT would tokenize by words. However, using a method called BPE gave amazing boosts to performance

# Do subword NMT
from os import path
os.environ["src"] = source_language # Sets them in bash as well, since we often use bash scripts
os.environ["tgt"] = target_language

# Learn BPEs on the training data.
os.environ["data_path"] = path.join("joeynmt", "data", source_language + target_language) # Herman! 
! subword-nmt learn-joint-bpe-and-vocab --input train.$src train.$tgt -s 4000 -o bpe.codes.4000 --write-vocabulary vocab.$src vocab.$tgt

# Apply BPE splits to the development and test data.
! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$src < train.$src > train.bpe.$src
! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$tgt < train.$tgt > train.bpe.$tgt

! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$src < dev.$src > dev.bpe.$src
! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$tgt < dev.$tgt > dev.bpe.$tgt
! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$src < test.$src > test.bpe.$src
! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$tgt < test.$tgt > test.bpe.$tgt

# Create directory, move everyone we care about to the correct location
! mkdir -p $data_path
! cp train.* $data_path
! cp test.* $data_path
! cp dev.* $data_path
! cp bpe.codes.4000 $data_path
! ls $data_path

# Also move everything we care about to a mounted location in google drive (relevant if running in colab) at gdrive_path
! cp train.* "$experiment_path"
! cp test.* "$experiment_path"
! cp dev.* "$experiment_path"
! cp bpe.codes.4000 "$gexperiment_path"
! ls "$experiment_path"

# Create that vocab using build_vocab
! sudo chmod 777 joeynmt/scripts/build_vocab.py
! joeynmt/scripts/build_vocab.py joeynmt/data/$src$tgt/train.bpe.$src joeynmt/data/$src$tgt/train.bpe.$tgt --output_path joeynmt/data/$src$tgt/vocab.txt

# Some output
! echo "BPE Tiv Sentences"
! tail -n 5 test.bpe.$tgt
! echo "Combined BPE Vocab"
! tail -n 10 joeynmt/data/$src$tgt/vocab.txt  # Herman

bpe.codes.4000	dev.en	     test.bpe.tiv    test.tiv	    train.en
dev.bpe.en	dev.tiv      test.en	     train.bpe.en   train.tiv
dev.bpe.tiv	test.bpe.en  test.en-any.en  train.bpe.tiv
cp: cannot create regular file ‘’: No such file or directory
dev.bpe.en   dev.tiv	   test.en	   train.bpe.en   train.tiv
dev.bpe.tiv  test.bpe.en   test.en-any.en  train.bpe.tiv
dev.en	     test.bpe.tiv  test.tiv	   train.en
BPE Tiv Sentences
Er nan ve yange gba u H@@ u@@ sh@@ ai una lu a ishimataver keng ve una fatyô u civir Aôndo sha mimi ?
Er nan ve i gbe u se lu a ishimataver ve se fatyô u civir Aôndo sha mimi ?
M sôn Yehova mer a wasem me taver ishima me er kwagh u m tsough u eren la .
Hegen ve gema inja , nahan ka m z@@ aan ve inya hanma shighe . ” — Ôr Anzaakaa 29 : 25 .
[ 1 ] ( ikyum@@ hi@@ ange i sha 7 ) I gema ati agen .
Combined BPE Vocab
erus@@
isholi@@
š@@
epher@@
ị
î@@
Ó@@
ö
_
ateu


# Creating the JoeyNMT Config

JoeyNMT requires a yaml config. We provide a template below. We've also set a number of defaults with it, that you may play with!

- We used Transformer architecture 
- We set our dropout to reasonably high: 0.3 (recommended in  [(Sennrich, 2019)](https://www.aclweb.org/anthology/P19-1021))

Things worth playing with:
- The batch size (also recommended to change for low-resourced languages)
- The number of epochs (we've set it at 30 just so it runs in about an hour, for testing purposes)
- The decoder options (beam_size, alpha)
- Evaluation metrics (BLEU versus Crhf4)

In [18]:
# This creates the config file for our JoeyNMT system. It might seem overwhelming so we've provided a couple of useful parameters you'll need to update
# (You can of course play with all the parameters if you'd like!)

name = '%s%s' % (source_language, target_language)
gdrive_path = os.environ["experiment_path"]

# Create the config
config = """
name: "{name}_transformer"

data:
    src: "{source_language}"
    trg: "{target_language}"
    train: "data/{name}/train.bpe"
    dev:   "data/{name}/dev.bpe"
    test:  "data/{name}/test.bpe"
    level: "bpe"
    lowercase: False
    max_sent_length: 100
    src_vocab: "data/{name}/vocab.txt"
    trg_vocab: "data/{name}/vocab.txt"

testing:
    beam_size: 5
    alpha: 1.0

training:
    #load_model: "{experiment_path}/models/{name}_transformer/1.ckpt" # if uncommented, load a pre-trained model from this checkpoint
    random_seed: 42
    optimizer: "adam"
    normalization: "tokens"
    adam_betas: [0.9, 0.999] 
    scheduling: "plateau"           # TODO: try switching from plateau to Noam scheduling
    patience: 5                     # For plateau: decrease learning rate by decrease_factor if validation score has not improved for this many validation rounds.
    learning_rate_factor: 0.5       # factor for Noam scheduler (used with Transformer)
    learning_rate_warmup: 1000      # warmup steps for Noam scheduler (used with Transformer)
    decrease_factor: 0.7
    loss: "crossentropy"
    learning_rate: 0.0003
    learning_rate_min: 0.00000001
    weight_decay: 0.0
    label_smoothing: 0.1
    batch_size: 4096
    batch_type: "token"
    eval_batch_size: 3600
    eval_batch_type: "token"
    batch_multiplier: 1
    early_stopping_metric: "ppl"
    epochs: 150                     # TODO: Decrease for when playing around and checking of working. Around 30 is sufficient to check if its working at all
    validation_freq: 1000          # TODO: Set to at least once per epoch.
    logging_freq: 100
    eval_metric: "bleu"
    model_dir: "models/{name}_transformer"
    overwrite: True               # TODO: Set to True if you want to overwrite possibly existing models. 
    shuffle: True
    use_cuda: True
    max_output_length: 100
    print_valid_sents: [0, 1, 2, 3]
    keep_last_ckpts: 5

model:
    initializer: "xavier"
    bias_initializer: "zeros"
    init_gain: 1.0
    embed_initializer: "xavier"
    embed_init_gain: 1.0
    tied_embeddings: True
    tied_softmax: True
    encoder:
        type: "transformer"
        num_layers: 6
        num_heads: 4             # TODO: Increase to 8 for larger data.
        embeddings:
            embedding_dim: 256   # TODO: Increase to 512 for larger data.
            scale: True
            dropout: 0.2
        # typically ff_size = 4 x hidden_size
        hidden_size: 256         # TODO: Increase to 512 for larger data.
        ff_size: 1024            # TODO: Increase to 2048 for larger data.
        dropout: 0.3
    decoder:
        type: "transformer"
        num_layers: 6
        num_heads: 4              # TODO: Increase to 8 for larger data.
        embeddings:
            embedding_dim: 256    # TODO: Increase to 512 for larger data.
            scale: True
            dropout: 0.2
        # typically ff_size = 4 x hidden_size
        hidden_size: 256         # TODO: Increase to 512 for larger data.
        ff_size: 1024            # TODO: Increase to 2048 for larger data.
        dropout: 0.3
""".format(name=name, experiment_path=os.environ["experiment_path"], source_language=source_language, target_language=target_language)
with open("joeynmt/configs/transformer_{name}.yaml".format(name=name),'w') as f:
    f.write(config)

In [19]:
!conda install pytorch torchvision cudatoolkit=10.1 -c pytorch --yes
!conda install tensorboard --yes
!conda install -c pytorch torchtext --yes
!conda install -c powerai sentencepiece --yes
!conda install -c powerai sacrebleu --yes

Solving environment: done


  current version: 4.5.12
  latest version: 4.8.2

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /home/ec2-user/anaconda3/envs/python3

  added / updated specs: 
    - cudatoolkit=10.1
    - pytorch
    - torchvision


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    certifi-2019.11.28         |           py36_0         156 KB
    ca-certificates-2020.1.1   |                0         132 KB
    openssl-1.0.2u             |       h7b6447c_0         3.1 MB
    pytorch-1.4.0              |py3.6_cuda10.1.243_cudnn7.6.3_0       432.9 MB  pytorch
    torchvision-0.5.0          |       py36_cu101         9.1 MB  pytorch
    ------------------------------------------------------------
                                           Total:       445.4 MB

The following NEW packages will be INSTALLE

sentencepiece-0.1.84 | 3.1 MB    | ##################################### | 100% 
tensorflow-base-2.0. | 100.9 MB  | ##################################### | 100% 
termcolor-1.1.0      | 7 KB      | ##################################### | 100% 
_tflow_select-2.3.0  | 2 KB      | ##################################### | 100% 
google-pasta-0.1.8   | 43 KB     | ##################################### | 100% 
keras-preprocessing- | 36 KB     | ##################################### | 100% 
tensorflow-estimator | 272 KB    | ##################################### | 100% 
tensorflow-2.0.0     | 3 KB      | ##################################### | 100% 
gast-0.2.2           | 138 KB    | ##################################### | 100% 
keras-applications-1 | 33 KB     | ##################################### | 100% 
astor-0.8.0          | 45 KB     | ##################################### | 100% 
opt_einsum-3.1.0     | 54 KB     | ##################################### | 100% 
Preparing transaction: done


# Train the Model

This single line of joeynmt runs the training using the config we made above

In [None]:
# Train the model
# You can press Ctrl-C to stop. And then run the next cell to save your checkpoints! 
!cd joeynmt; python3 -m joeynmt train configs/transformer_$src$tgt.yaml

2020-02-12 15:55:50,862 Hello! This is Joey-NMT.
2020-02-12 15:55:52,201 Total params: 12152320
2020-02-12 15:55:52,202 Trainable parameters: ['decoder.layer_norm.bias', 'decoder.layer_norm.weight', 'decoder.layers.0.dec_layer_norm.bias', 'decoder.layers.0.dec_layer_norm.weight', 'decoder.layers.0.feed_forward.layer_norm.bias', 'decoder.layers.0.feed_forward.layer_norm.weight', 'decoder.layers.0.feed_forward.pwff_layer.0.bias', 'decoder.layers.0.feed_forward.pwff_layer.0.weight', 'decoder.layers.0.feed_forward.pwff_layer.3.bias', 'decoder.layers.0.feed_forward.pwff_layer.3.weight', 'decoder.layers.0.src_trg_att.k_layer.bias', 'decoder.layers.0.src_trg_att.k_layer.weight', 'decoder.layers.0.src_trg_att.output_layer.bias', 'decoder.layers.0.src_trg_att.output_layer.weight', 'decoder.layers.0.src_trg_att.q_layer.bias', 'decoder.layers.0.src_trg_att.q_layer.weight', 'decoder.layers.0.src_trg_att.v_layer.bias', 'decoder.layers.0.src_trg_att.v_layer.weight', 'decoder.layers.0.trg_trg_att.k_l

2020-02-12 15:55:55,584 cfg.name                           : entiv_transformer
2020-02-12 15:55:55,584 cfg.data.src                       : en
2020-02-12 15:55:55,584 cfg.data.trg                       : tiv
2020-02-12 15:55:55,584 cfg.data.train                     : data/entiv/train.bpe
2020-02-12 15:55:55,585 cfg.data.dev                       : data/entiv/dev.bpe
2020-02-12 15:55:55,585 cfg.data.test                      : data/entiv/test.bpe
2020-02-12 15:55:55,585 cfg.data.level                     : bpe
2020-02-12 15:55:55,585 cfg.data.lowercase                 : False
2020-02-12 15:55:55,585 cfg.data.max_sent_length           : 100
2020-02-12 15:55:55,585 cfg.data.src_vocab                 : data/entiv/vocab.txt
2020-02-12 15:55:55,585 cfg.data.trg_vocab                 : data/entiv/vocab.txt
2020-02-12 15:55:55,585 cfg.testing.beam_size              : 5
2020-02-12 15:55:55,585 cfg.testing.alpha                  : 1.0
2020-02-12 15:55:55,585 cfg.training.random_seed           :

2020-02-12 15:58:09,004 Epoch   1 Step:     1100 Batch Loss:     3.681341 Tokens per Sec:    21252, Lr: 0.000300
2020-02-12 15:58:19,066 Epoch   1 Step:     1200 Batch Loss:     3.286341 Tokens per Sec:    22182, Lr: 0.000300
2020-02-12 15:58:29,003 Epoch   1 Step:     1300 Batch Loss:     3.167710 Tokens per Sec:    22510, Lr: 0.000300
2020-02-12 15:58:38,931 Epoch   1 Step:     1400 Batch Loss:     3.563505 Tokens per Sec:    22049, Lr: 0.000300
2020-02-12 15:58:48,935 Epoch   1 Step:     1500 Batch Loss:     3.639344 Tokens per Sec:    22684, Lr: 0.000300
2020-02-12 15:58:58,889 Epoch   1 Step:     1600 Batch Loss:     3.228676 Tokens per Sec:    22208, Lr: 0.000300
2020-02-12 15:59:08,764 Epoch   1 Step:     1700 Batch Loss:     3.194575 Tokens per Sec:    21941, Lr: 0.000300
2020-02-12 15:59:18,664 Epoch   1 Step:     1800 Batch Loss:     3.024248 Tokens per Sec:    22794, Lr: 0.000300
2020-02-12 15:59:28,757 Epoch   1 Step:     1900 Batch Loss:     3.275215 Tokens per Sec:    211

2020-02-12 16:04:19,315 Epoch   2 Step:     4100 Batch Loss:     2.247452 Tokens per Sec:    22159, Lr: 0.000300
2020-02-12 16:04:29,221 Epoch   2 Step:     4200 Batch Loss:     2.779771 Tokens per Sec:    22260, Lr: 0.000300
2020-02-12 16:04:39,354 Epoch   2 Step:     4300 Batch Loss:     2.120739 Tokens per Sec:    21882, Lr: 0.000300
2020-02-12 16:04:49,391 Epoch   2 Step:     4400 Batch Loss:     2.848859 Tokens per Sec:    21879, Lr: 0.000300
2020-02-12 16:04:59,376 Epoch   2 Step:     4500 Batch Loss:     2.804246 Tokens per Sec:    22455, Lr: 0.000300
2020-02-12 16:05:09,519 Epoch   2 Step:     4600 Batch Loss:     2.565914 Tokens per Sec:    21965, Lr: 0.000300
2020-02-12 16:05:19,682 Epoch   2 Step:     4700 Batch Loss:     2.769480 Tokens per Sec:    21692, Lr: 0.000300
2020-02-12 16:05:29,775 Epoch   2 Step:     4800 Batch Loss:     2.822880 Tokens per Sec:    21512, Lr: 0.000300
2020-02-12 16:05:39,817 Epoch   2 Step:     4900 Batch Loss:     2.971874 Tokens per Sec:    217

2020-02-12 16:10:23,420 Epoch   3 Step:     7100 Batch Loss:     2.392289 Tokens per Sec:    22390, Lr: 0.000300
2020-02-12 16:10:33,255 Epoch   3 Step:     7200 Batch Loss:     2.308585 Tokens per Sec:    22352, Lr: 0.000300
2020-02-12 16:10:43,111 Epoch   3 Step:     7300 Batch Loss:     2.353991 Tokens per Sec:    22982, Lr: 0.000300
2020-02-12 16:10:51,099 Epoch   3: total training loss 6184.23
2020-02-12 16:10:51,099 EPOCH 4
2020-02-12 16:10:53,407 Epoch   4 Step:     7400 Batch Loss:     2.594020 Tokens per Sec:    18604, Lr: 0.000300
2020-02-12 16:11:03,264 Epoch   4 Step:     7500 Batch Loss:     2.548170 Tokens per Sec:    22402, Lr: 0.000300
2020-02-12 16:11:13,196 Epoch   4 Step:     7600 Batch Loss:     2.366792 Tokens per Sec:    22935, Lr: 0.000300
2020-02-12 16:11:23,085 Epoch   4 Step:     7700 Batch Loss:     2.471555 Tokens per Sec:    22248, Lr: 0.000300
2020-02-12 16:11:32,991 Epoch   4 Step:     7800 Batch Loss:     2.077398 Tokens per Sec:    22014, Lr: 0.000300
2

2020-02-12 16:16:24,819 Epoch   5 Step:    10100 Batch Loss:     1.982166 Tokens per Sec:    22377, Lr: 0.000300
2020-02-12 16:16:34,695 Epoch   5 Step:    10200 Batch Loss:     2.273010 Tokens per Sec:    22155, Lr: 0.000300
2020-02-12 16:16:44,558 Epoch   5 Step:    10300 Batch Loss:     2.021907 Tokens per Sec:    22818, Lr: 0.000300
2020-02-12 16:16:54,489 Epoch   5 Step:    10400 Batch Loss:     2.374995 Tokens per Sec:    22376, Lr: 0.000300
2020-02-12 16:17:04,457 Epoch   5 Step:    10500 Batch Loss:     2.002748 Tokens per Sec:    22170, Lr: 0.000300
2020-02-12 16:17:14,423 Epoch   5 Step:    10600 Batch Loss:     2.368401 Tokens per Sec:    21915, Lr: 0.000300
2020-02-12 16:17:24,329 Epoch   5 Step:    10700 Batch Loss:     2.217776 Tokens per Sec:    22016, Lr: 0.000300
2020-02-12 16:17:34,280 Epoch   5 Step:    10800 Batch Loss:     1.833345 Tokens per Sec:    22474, Lr: 0.000300
2020-02-12 16:17:44,366 Epoch   5 Step:    10900 Batch Loss:     2.128143 Tokens per Sec:    216

2020-02-12 16:22:25,010 Epoch   6 Step:    13100 Batch Loss:     2.109484 Tokens per Sec:    22544, Lr: 0.000300
2020-02-12 16:22:34,934 Epoch   6 Step:    13200 Batch Loss:     1.922879 Tokens per Sec:    21784, Lr: 0.000300
2020-02-12 16:22:44,877 Epoch   6 Step:    13300 Batch Loss:     2.186911 Tokens per Sec:    22893, Lr: 0.000300
2020-02-12 16:22:54,871 Epoch   6 Step:    13400 Batch Loss:     1.936656 Tokens per Sec:    21614, Lr: 0.000300
2020-02-12 16:23:04,832 Epoch   6 Step:    13500 Batch Loss:     2.064526 Tokens per Sec:    22143, Lr: 0.000300
2020-02-12 16:23:14,760 Epoch   6 Step:    13600 Batch Loss:     2.143647 Tokens per Sec:    22323, Lr: 0.000300
2020-02-12 16:23:24,762 Epoch   6 Step:    13700 Batch Loss:     2.014436 Tokens per Sec:    22092, Lr: 0.000300
2020-02-12 16:23:34,673 Epoch   6 Step:    13800 Batch Loss:     1.766890 Tokens per Sec:    22426, Lr: 0.000300
2020-02-12 16:23:44,490 Epoch   6 Step:    13900 Batch Loss:     1.993079 Tokens per Sec:    219

2020-02-12 16:28:23,312 Epoch   7 Step:    16100 Batch Loss:     2.202749 Tokens per Sec:    22046, Lr: 0.000300
2020-02-12 16:28:33,240 Epoch   7 Step:    16200 Batch Loss:     1.774728 Tokens per Sec:    22458, Lr: 0.000300
2020-02-12 16:28:43,167 Epoch   7 Step:    16300 Batch Loss:     1.727866 Tokens per Sec:    22257, Lr: 0.000300
2020-02-12 16:28:53,155 Epoch   7 Step:    16400 Batch Loss:     1.883956 Tokens per Sec:    22405, Lr: 0.000300
2020-02-12 16:29:03,116 Epoch   7 Step:    16500 Batch Loss:     1.944116 Tokens per Sec:    22219, Lr: 0.000300
2020-02-12 16:29:13,100 Epoch   7 Step:    16600 Batch Loss:     2.092608 Tokens per Sec:    21733, Lr: 0.000300
2020-02-12 16:29:23,135 Epoch   7 Step:    16700 Batch Loss:     2.157441 Tokens per Sec:    21804, Lr: 0.000300
2020-02-12 16:29:33,063 Epoch   7 Step:    16800 Batch Loss:     2.321883 Tokens per Sec:    22356, Lr: 0.000300
2020-02-12 16:29:43,059 Epoch   7 Step:    16900 Batch Loss:     2.048239 Tokens per Sec:    224

2020-02-12 16:34:24,755 Epoch   8 Step:    19100 Batch Loss:     1.704111 Tokens per Sec:    22431, Lr: 0.000300
2020-02-12 16:34:34,710 Epoch   8 Step:    19200 Batch Loss:     2.043770 Tokens per Sec:    22612, Lr: 0.000300
2020-02-12 16:34:44,838 Epoch   8 Step:    19300 Batch Loss:     1.879913 Tokens per Sec:    22152, Lr: 0.000300
2020-02-12 16:34:55,046 Epoch   8 Step:    19400 Batch Loss:     2.035709 Tokens per Sec:    21450, Lr: 0.000300
2020-02-12 16:35:05,213 Epoch   8 Step:    19500 Batch Loss:     1.800448 Tokens per Sec:    21968, Lr: 0.000300
2020-02-12 16:35:15,358 Epoch   8 Step:    19600 Batch Loss:     1.916712 Tokens per Sec:    22098, Lr: 0.000300
2020-02-12 16:35:22,135 Epoch   8: total training loss 4850.83
2020-02-12 16:35:22,135 EPOCH 9
2020-02-12 16:35:25,547 Epoch   9 Step:    19700 Batch Loss:     1.914348 Tokens per Sec:    20597, Lr: 0.000300
2020-02-12 16:35:35,406 Epoch   9 Step:    19800 Batch Loss:     2.093338 Tokens per Sec:    22488, Lr: 0.000300
2

2020-02-12 16:40:23,256 Epoch   9 Step:    22100 Batch Loss:     1.694650 Tokens per Sec:    22721, Lr: 0.000300
2020-02-12 16:40:26,694 Epoch   9: total training loss 4742.62
2020-02-12 16:40:26,694 EPOCH 10
2020-02-12 16:40:33,391 Epoch  10 Step:    22200 Batch Loss:     1.824729 Tokens per Sec:    22062, Lr: 0.000300
2020-02-12 16:40:43,318 Epoch  10 Step:    22300 Batch Loss:     1.830457 Tokens per Sec:    22359, Lr: 0.000300
2020-02-12 16:40:53,228 Epoch  10 Step:    22400 Batch Loss:     1.765829 Tokens per Sec:    21978, Lr: 0.000300
2020-02-12 16:41:03,132 Epoch  10 Step:    22500 Batch Loss:     1.860943 Tokens per Sec:    22552, Lr: 0.000300
2020-02-12 16:41:13,053 Epoch  10 Step:    22600 Batch Loss:     1.998243 Tokens per Sec:    22675, Lr: 0.000300
2020-02-12 16:41:22,940 Epoch  10 Step:    22700 Batch Loss:     1.764805 Tokens per Sec:    22413, Lr: 0.000300
2020-02-12 16:41:32,848 Epoch  10 Step:    22800 Batch Loss:     2.115628 Tokens per Sec:    22279, Lr: 0.000300


2020-02-12 16:46:17,886 Epoch  11 Step:    25100 Batch Loss:     1.992092 Tokens per Sec:    22669, Lr: 0.000300
2020-02-12 16:46:27,791 Epoch  11 Step:    25200 Batch Loss:     1.829686 Tokens per Sec:    22128, Lr: 0.000300
2020-02-12 16:46:37,644 Epoch  11 Step:    25300 Batch Loss:     1.771660 Tokens per Sec:    22004, Lr: 0.000300
2020-02-12 16:46:47,580 Epoch  11 Step:    25400 Batch Loss:     1.828283 Tokens per Sec:    22475, Lr: 0.000300
2020-02-12 16:46:57,458 Epoch  11 Step:    25500 Batch Loss:     1.573067 Tokens per Sec:    21617, Lr: 0.000300
2020-02-12 16:47:07,500 Epoch  11 Step:    25600 Batch Loss:     1.939762 Tokens per Sec:    21885, Lr: 0.000300
2020-02-12 16:47:17,446 Epoch  11 Step:    25700 Batch Loss:     1.876702 Tokens per Sec:    22402, Lr: 0.000300
2020-02-12 16:47:27,453 Epoch  11 Step:    25800 Batch Loss:     1.681799 Tokens per Sec:    22376, Lr: 0.000300
2020-02-12 16:47:37,336 Epoch  11 Step:    25900 Batch Loss:     1.932091 Tokens per Sec:    224

2020-02-12 16:52:15,423 Epoch  12 Step:    28100 Batch Loss:     1.539151 Tokens per Sec:    22240, Lr: 0.000300
2020-02-12 16:52:25,321 Epoch  12 Step:    28200 Batch Loss:     2.161488 Tokens per Sec:    22817, Lr: 0.000300
2020-02-12 16:52:35,323 Epoch  12 Step:    28300 Batch Loss:     1.646714 Tokens per Sec:    22131, Lr: 0.000300
2020-02-12 16:52:45,142 Epoch  12 Step:    28400 Batch Loss:     1.863586 Tokens per Sec:    22631, Lr: 0.000300
2020-02-12 16:52:55,061 Epoch  12 Step:    28500 Batch Loss:     1.740035 Tokens per Sec:    22251, Lr: 0.000300
2020-02-12 16:53:05,019 Epoch  12 Step:    28600 Batch Loss:     1.642973 Tokens per Sec:    22294, Lr: 0.000300
2020-02-12 16:53:15,136 Epoch  12 Step:    28700 Batch Loss:     1.612443 Tokens per Sec:    21869, Lr: 0.000300
2020-02-12 16:53:25,262 Epoch  12 Step:    28800 Batch Loss:     1.866432 Tokens per Sec:    21660, Lr: 0.000300
2020-02-12 16:53:35,381 Epoch  12 Step:    28900 Batch Loss:     1.835360 Tokens per Sec:    217

2020-02-12 16:58:15,541 Epoch  13 Step:    31100 Batch Loss:     1.357395 Tokens per Sec:    22289, Lr: 0.000300
2020-02-12 16:58:25,467 Epoch  13 Step:    31200 Batch Loss:     1.795151 Tokens per Sec:    22656, Lr: 0.000300
2020-02-12 16:58:35,370 Epoch  13 Step:    31300 Batch Loss:     1.561645 Tokens per Sec:    22346, Lr: 0.000300
2020-02-12 16:58:45,341 Epoch  13 Step:    31400 Batch Loss:     1.726090 Tokens per Sec:    22120, Lr: 0.000300
2020-02-12 16:58:55,484 Epoch  13 Step:    31500 Batch Loss:     1.642398 Tokens per Sec:    21676, Lr: 0.000300
2020-02-12 16:59:05,659 Epoch  13 Step:    31600 Batch Loss:     1.837268 Tokens per Sec:    21398, Lr: 0.000300
2020-02-12 16:59:15,825 Epoch  13 Step:    31700 Batch Loss:     2.092063 Tokens per Sec:    22196, Lr: 0.000300
2020-02-12 16:59:25,894 Epoch  13 Step:    31800 Batch Loss:     1.542736 Tokens per Sec:    21513, Lr: 0.000300
2020-02-12 16:59:35,842 Epoch  13 Step:    31900 Batch Loss:     1.935450 Tokens per Sec:    221

2020-02-12 17:04:12,390 Epoch  14 Step:    34100 Batch Loss:     1.889835 Tokens per Sec:    22605, Lr: 0.000300
2020-02-12 17:04:22,256 Epoch  14 Step:    34200 Batch Loss:     1.840223 Tokens per Sec:    22792, Lr: 0.000300
2020-02-12 17:04:32,098 Epoch  14 Step:    34300 Batch Loss:     1.720049 Tokens per Sec:    22295, Lr: 0.000300
2020-02-12 17:04:41,945 Epoch  14 Step:    34400 Batch Loss:     1.546757 Tokens per Sec:    22565, Lr: 0.000300
2020-02-12 17:04:45,193 Epoch  14: total training loss 4325.08
2020-02-12 17:04:45,193 EPOCH 15
2020-02-12 17:04:52,155 Epoch  15 Step:    34500 Batch Loss:     1.768502 Tokens per Sec:    21324, Lr: 0.000300
2020-02-12 17:05:02,101 Epoch  15 Step:    34600 Batch Loss:     1.467681 Tokens per Sec:    22141, Lr: 0.000300
2020-02-12 17:05:12,021 Epoch  15 Step:    34700 Batch Loss:     1.651231 Tokens per Sec:    22103, Lr: 0.000300
2020-02-12 17:05:21,987 Epoch  15 Step:    34800 Batch Loss:     1.513118 Tokens per Sec:    22192, Lr: 0.000300


2020-02-12 17:10:10,405 Epoch  16 Step:    37100 Batch Loss:     1.737724 Tokens per Sec:    21664, Lr: 0.000300
2020-02-12 17:21:09,354 Epoch  18 Step:    42700 Batch Loss:     1.341680 Tokens per Sec:    22421, Lr: 0.000300
2020-02-12 17:21:19,331 Epoch  18 Step:    42800 Batch Loss:     1.568339 Tokens per Sec:    22233, Lr: 0.000300
2020-02-12 17:21:29,473 Epoch  18 Step:    42900 Batch Loss:     1.789617 Tokens per Sec:    21805, Lr: 0.000300
2020-02-12 17:21:39,637 Epoch  18 Step:    43000 Batch Loss:     1.741903 Tokens per Sec:    21691, Lr: 0.000300
2020-02-12 17:21:59,782 Hooray! New best validation result [ppl]!
2020-02-12 17:21:59,783 Saving new checkpoint.
2020-02-12 17:21:59,974 Example #0
2020-02-12 17:21:59,974 	Source:     Since the marriage arrangement instituted by Jehovah is a lasting one , it is vital that couples endeavor to keep the flame of their love ablaze and maintain an atmosphere in which love can grow . ​ — Mark 10 : 6 - 9 .
2020-02-12 17:21:59,974 	Refere

2020-02-12 17:26:07,696 Epoch  19 Step:    45100 Batch Loss:     1.836917 Tokens per Sec:    22059, Lr: 0.000300
2020-02-12 17:26:17,646 Epoch  19 Step:    45200 Batch Loss:     1.467316 Tokens per Sec:    22531, Lr: 0.000300
2020-02-12 17:26:27,623 Epoch  19 Step:    45300 Batch Loss:     1.681551 Tokens per Sec:    22316, Lr: 0.000300
2020-02-12 17:26:37,794 Epoch  19 Step:    45400 Batch Loss:     1.638225 Tokens per Sec:    22062, Lr: 0.000300
2020-02-12 17:26:47,800 Epoch  19 Step:    45500 Batch Loss:     1.648621 Tokens per Sec:    21838, Lr: 0.000300
2020-02-12 17:26:57,727 Epoch  19 Step:    45600 Batch Loss:     1.807967 Tokens per Sec:    21813, Lr: 0.000300
2020-02-12 17:27:07,666 Epoch  19 Step:    45700 Batch Loss:     1.725803 Tokens per Sec:    22564, Lr: 0.000300
2020-02-12 17:27:17,616 Epoch  19 Step:    45800 Batch Loss:     1.618219 Tokens per Sec:    22026, Lr: 0.000300
2020-02-12 17:27:27,727 Epoch  19 Step:    45900 Batch Loss:     1.445169 Tokens per Sec:    214

2020-02-12 17:42:01,191 Epoch  22 Step:    53100 Batch Loss:     1.735585 Tokens per Sec:    22267, Lr: 0.000300
2020-02-12 17:42:11,271 Epoch  22 Step:    53200 Batch Loss:     1.616482 Tokens per Sec:    21461, Lr: 0.000300
2020-02-12 17:42:21,408 Epoch  22 Step:    53300 Batch Loss:     1.763771 Tokens per Sec:    21460, Lr: 0.000300
2020-02-12 17:42:31,417 Epoch  22 Step:    53400 Batch Loss:     1.682316 Tokens per Sec:    21922, Lr: 0.000300
2020-02-12 17:42:41,387 Epoch  22 Step:    53500 Batch Loss:     1.623947 Tokens per Sec:    22414, Lr: 0.000300
2020-02-12 17:42:51,340 Epoch  22 Step:    53600 Batch Loss:     1.663797 Tokens per Sec:    22639, Lr: 0.000300
2020-02-12 17:43:01,235 Epoch  22 Step:    53700 Batch Loss:     1.489120 Tokens per Sec:    22257, Lr: 0.000300
2020-02-12 17:43:11,193 Epoch  22 Step:    53800 Batch Loss:     1.619790 Tokens per Sec:    22105, Lr: 0.000300
2020-02-12 17:43:21,083 Epoch  22 Step:    53900 Batch Loss:     1.598196 Tokens per Sec:    221

2020-02-12 17:47:59,574 Epoch  23 Step:    56100 Batch Loss:     1.637856 Tokens per Sec:    22497, Lr: 0.000300
2020-02-12 17:48:09,625 Epoch  23 Step:    56200 Batch Loss:     1.865298 Tokens per Sec:    22097, Lr: 0.000300
2020-02-12 17:48:19,832 Epoch  23 Step:    56300 Batch Loss:     1.460898 Tokens per Sec:    21439, Lr: 0.000300
2020-02-12 17:48:30,046 Epoch  23 Step:    56400 Batch Loss:     1.146451 Tokens per Sec:    21982, Lr: 0.000300
2020-02-12 17:48:40,108 Epoch  23 Step:    56500 Batch Loss:     1.336150 Tokens per Sec:    21597, Lr: 0.000300
2020-02-12 17:48:44,957 Epoch  23: total training loss 3955.45
2020-02-12 17:48:44,957 EPOCH 24
2020-02-12 17:48:50,541 Epoch  24 Step:    56600 Batch Loss:     1.687715 Tokens per Sec:    21734, Lr: 0.000300
2020-02-12 17:49:00,481 Epoch  24 Step:    56700 Batch Loss:     1.656443 Tokens per Sec:    22469, Lr: 0.000300
2020-02-12 17:49:10,669 Epoch  24 Step:    56800 Batch Loss:     1.551410 Tokens per Sec:    22112, Lr: 0.000300


2020-02-12 17:53:50,207 Epoch  24: total training loss 3936.89
2020-02-12 17:53:50,207 EPOCH 25
2020-02-12 17:53:59,083 Epoch  25 Step:    59100 Batch Loss:     1.634622 Tokens per Sec:    22183, Lr: 0.000300
2020-02-12 17:54:09,165 Epoch  25 Step:    59200 Batch Loss:     1.739247 Tokens per Sec:    21693, Lr: 0.000300
2020-02-12 17:54:19,316 Epoch  25 Step:    59300 Batch Loss:     1.341596 Tokens per Sec:    21416, Lr: 0.000300
2020-02-12 17:54:29,324 Epoch  25 Step:    59400 Batch Loss:     1.790190 Tokens per Sec:    22408, Lr: 0.000300
2020-02-12 17:54:39,219 Epoch  25 Step:    59500 Batch Loss:     1.507016 Tokens per Sec:    22334, Lr: 0.000300
2020-02-12 17:54:49,189 Epoch  25 Step:    59600 Batch Loss:     1.599367 Tokens per Sec:    22329, Lr: 0.000300
2020-02-12 17:54:59,172 Epoch  25 Step:    59700 Batch Loss:     1.695844 Tokens per Sec:    22224, Lr: 0.000300
2020-02-12 17:55:09,456 Epoch  25 Step:    59800 Batch Loss:     1.547237 Tokens per Sec:    21382, Lr: 0.000300


2020-02-12 17:59:57,420 Epoch  26 Step:    62100 Batch Loss:     1.538173 Tokens per Sec:    22215, Lr: 0.000300
2020-02-12 18:00:07,278 Epoch  26 Step:    62200 Batch Loss:     1.745478 Tokens per Sec:    22920, Lr: 0.000300
2020-02-12 18:00:17,139 Epoch  26 Step:    62300 Batch Loss:     1.611246 Tokens per Sec:    22182, Lr: 0.000300
2020-02-12 18:00:27,120 Epoch  26 Step:    62400 Batch Loss:     1.708399 Tokens per Sec:    21687, Lr: 0.000300
2020-02-12 18:00:36,992 Epoch  26 Step:    62500 Batch Loss:     1.464080 Tokens per Sec:    22382, Lr: 0.000300
2020-02-12 18:00:46,971 Epoch  26 Step:    62600 Batch Loss:     1.664190 Tokens per Sec:    22094, Lr: 0.000300
2020-02-12 18:00:56,913 Epoch  26 Step:    62700 Batch Loss:     1.557987 Tokens per Sec:    22267, Lr: 0.000300
2020-02-12 18:01:06,841 Epoch  26 Step:    62800 Batch Loss:     1.668654 Tokens per Sec:    22139, Lr: 0.000300
2020-02-12 18:01:16,817 Epoch  26 Step:    62900 Batch Loss:     1.345859 Tokens per Sec:    221

2020-02-12 18:05:52,022 Epoch  27 Step:    65100 Batch Loss:     1.622702 Tokens per Sec:    21996, Lr: 0.000300
2020-02-12 18:06:02,063 Epoch  27 Step:    65200 Batch Loss:     1.501931 Tokens per Sec:    22085, Lr: 0.000300
2020-02-12 18:06:12,178 Epoch  27 Step:    65300 Batch Loss:     1.629987 Tokens per Sec:    22224, Lr: 0.000300
2020-02-12 18:06:22,409 Epoch  27 Step:    65400 Batch Loss:     1.633025 Tokens per Sec:    21722, Lr: 0.000300
2020-02-12 18:06:32,502 Epoch  27 Step:    65500 Batch Loss:     1.733178 Tokens per Sec:    21072, Lr: 0.000300
2020-02-12 18:06:42,579 Epoch  27 Step:    65600 Batch Loss:     1.822144 Tokens per Sec:    21533, Lr: 0.000300
2020-02-12 18:06:52,713 Epoch  27 Step:    65700 Batch Loss:     1.648252 Tokens per Sec:    21597, Lr: 0.000300
2020-02-12 18:07:02,728 Epoch  27 Step:    65800 Batch Loss:     1.637809 Tokens per Sec:    22236, Lr: 0.000300
2020-02-12 18:07:12,681 Epoch  27 Step:    65900 Batch Loss:     1.693966 Tokens per Sec:    219

2020-02-12 18:11:50,625 Epoch  28 Step:    68100 Batch Loss:     1.531227 Tokens per Sec:    21840, Lr: 0.000300
2020-02-12 18:12:00,501 Epoch  28 Step:    68200 Batch Loss:     1.345454 Tokens per Sec:    22238, Lr: 0.000300
2020-02-12 18:12:10,423 Epoch  28 Step:    68300 Batch Loss:     1.617265 Tokens per Sec:    22208, Lr: 0.000300
2020-02-12 18:12:20,465 Epoch  28 Step:    68400 Batch Loss:     1.731347 Tokens per Sec:    22151, Lr: 0.000300
2020-02-12 18:12:30,397 Epoch  28 Step:    68500 Batch Loss:     1.498353 Tokens per Sec:    22634, Lr: 0.000300
2020-02-12 18:12:40,264 Epoch  28 Step:    68600 Batch Loss:     1.360098 Tokens per Sec:    22784, Lr: 0.000300
2020-02-12 18:12:50,163 Epoch  28 Step:    68700 Batch Loss:     1.640384 Tokens per Sec:    22473, Lr: 0.000300
2020-02-12 18:13:00,026 Epoch  28 Step:    68800 Batch Loss:     1.543844 Tokens per Sec:    22450, Lr: 0.000300
2020-02-12 18:13:04,790 Epoch  28: total training loss 3821.29
2020-02-12 18:13:04,791 EPOCH 29


2020-02-12 18:17:45,955 Epoch  29 Step:    71100 Batch Loss:     1.727092 Tokens per Sec:    22260, Lr: 0.000300
2020-02-12 18:17:55,968 Epoch  29 Step:    71200 Batch Loss:     1.106232 Tokens per Sec:    22144, Lr: 0.000300
2020-02-12 18:18:05,976 Epoch  29 Step:    71300 Batch Loss:     1.410406 Tokens per Sec:    22345, Lr: 0.000300
2020-02-12 18:18:06,381 Epoch  29: total training loss 3800.10
2020-02-12 18:18:06,381 EPOCH 30
2020-02-12 18:18:16,267 Epoch  30 Step:    71400 Batch Loss:     1.575879 Tokens per Sec:    22000, Lr: 0.000300
2020-02-12 18:18:26,311 Epoch  30 Step:    71500 Batch Loss:     1.388709 Tokens per Sec:    21468, Lr: 0.000300
2020-02-12 18:18:36,239 Epoch  30 Step:    71600 Batch Loss:     1.634655 Tokens per Sec:    22447, Lr: 0.000300
2020-02-12 18:18:46,171 Epoch  30 Step:    71700 Batch Loss:     1.814609 Tokens per Sec:    22184, Lr: 0.000300
2020-02-12 18:18:56,108 Epoch  30 Step:    71800 Batch Loss:     1.648347 Tokens per Sec:    22093, Lr: 0.000300


2020-02-12 20:40:35,739 Epoch  59 Step:   143100 Batch Loss:     1.655330 Tokens per Sec:    21997, Lr: 0.000300
2020-02-12 20:40:45,861 Epoch  59 Step:   143200 Batch Loss:     1.374167 Tokens per Sec:    21439, Lr: 0.000300
2020-02-12 20:40:55,964 Epoch  59 Step:   143300 Batch Loss:     1.405934 Tokens per Sec:    22054, Lr: 0.000300
2020-02-12 20:41:05,941 Epoch  59 Step:   143400 Batch Loss:     1.371737 Tokens per Sec:    22278, Lr: 0.000300
2020-02-12 20:41:15,827 Epoch  59 Step:   143500 Batch Loss:     1.304805 Tokens per Sec:    22822, Lr: 0.000300
2020-02-12 20:41:25,682 Epoch  59 Step:   143600 Batch Loss:     1.395527 Tokens per Sec:    22346, Lr: 0.000300
2020-02-12 20:41:35,595 Epoch  59 Step:   143700 Batch Loss:     1.518479 Tokens per Sec:    22494, Lr: 0.000300
2020-02-12 20:41:45,622 Epoch  59 Step:   143800 Batch Loss:     1.373104 Tokens per Sec:    21925, Lr: 0.000300
2020-02-12 20:41:55,535 Epoch  59 Step:   143900 Batch Loss:     1.451918 Tokens per Sec:    226

2020-02-12 20:46:34,109 Epoch  60 Step:   146100 Batch Loss:     1.409818 Tokens per Sec:    22663, Lr: 0.000300
2020-02-12 20:46:43,994 Epoch  60 Step:   146200 Batch Loss:     1.582477 Tokens per Sec:    22008, Lr: 0.000300
2020-02-12 20:46:54,001 Epoch  60 Step:   146300 Batch Loss:     1.423970 Tokens per Sec:    21738, Lr: 0.000300
2020-02-12 20:47:04,026 Epoch  60 Step:   146400 Batch Loss:     1.432156 Tokens per Sec:    22273, Lr: 0.000300
2020-02-12 20:47:14,184 Epoch  60 Step:   146500 Batch Loss:     1.406879 Tokens per Sec:    21858, Lr: 0.000300
2020-02-12 20:47:24,299 Epoch  60 Step:   146600 Batch Loss:     1.477002 Tokens per Sec:    21966, Lr: 0.000300
2020-02-12 20:47:34,425 Epoch  60 Step:   146700 Batch Loss:     1.466704 Tokens per Sec:    22221, Lr: 0.000300
2020-02-12 20:47:44,529 Epoch  60 Step:   146800 Batch Loss:     1.531728 Tokens per Sec:    22174, Lr: 0.000300
2020-02-12 20:47:54,756 Epoch  60 Step:   146900 Batch Loss:     1.330328 Tokens per Sec:    217

2020-02-12 20:52:31,621 Epoch  61 Step:   149100 Batch Loss:     1.441156 Tokens per Sec:    22866, Lr: 0.000300
2020-02-12 20:52:41,491 Epoch  61 Step:   149200 Batch Loss:     1.400879 Tokens per Sec:    22117, Lr: 0.000300
2020-02-12 20:52:51,454 Epoch  61 Step:   149300 Batch Loss:     1.300657 Tokens per Sec:    22400, Lr: 0.000300
2020-02-12 20:53:01,391 Epoch  61 Step:   149400 Batch Loss:     1.552772 Tokens per Sec:    21856, Lr: 0.000300
2020-02-12 20:53:11,395 Epoch  61 Step:   149500 Batch Loss:     1.522980 Tokens per Sec:    21900, Lr: 0.000300
2020-02-12 20:53:21,480 Epoch  61 Step:   149600 Batch Loss:     1.515286 Tokens per Sec:    21542, Lr: 0.000300
2020-02-12 20:53:31,543 Epoch  61 Step:   149700 Batch Loss:     1.347546 Tokens per Sec:    22254, Lr: 0.000300
2020-02-12 20:53:41,618 Epoch  61 Step:   149800 Batch Loss:     1.403254 Tokens per Sec:    22236, Lr: 0.000300
2020-02-12 20:53:51,674 Epoch  61 Step:   149900 Batch Loss:     1.528272 Tokens per Sec:    217

2020-02-12 20:58:28,091 Epoch  62 Step:   152100 Batch Loss:     1.605337 Tokens per Sec:    22337, Lr: 0.000300
2020-02-12 20:58:38,061 Epoch  62 Step:   152200 Batch Loss:     1.457781 Tokens per Sec:    22304, Lr: 0.000300
2020-02-12 20:58:48,192 Epoch  62 Step:   152300 Batch Loss:     1.252030 Tokens per Sec:    21782, Lr: 0.000300
2020-02-12 20:58:58,216 Epoch  62 Step:   152400 Batch Loss:     1.487466 Tokens per Sec:    22369, Lr: 0.000300
2020-02-12 20:59:02,297 Epoch  62: total training loss 3407.62
2020-02-12 20:59:02,298 EPOCH 63
2020-02-12 20:59:08,413 Epoch  63 Step:   152500 Batch Loss:     0.917715 Tokens per Sec:    20793, Lr: 0.000300
2020-02-12 20:59:18,263 Epoch  63 Step:   152600 Batch Loss:     1.154300 Tokens per Sec:    22631, Lr: 0.000300
2020-02-12 20:59:28,117 Epoch  63 Step:   152700 Batch Loss:     0.896696 Tokens per Sec:    21794, Lr: 0.000300
2020-02-12 20:59:37,994 Epoch  63 Step:   152800 Batch Loss:     1.533581 Tokens per Sec:    22516, Lr: 0.000300


2020-02-12 21:04:27,225 Epoch  64 Step:   155100 Batch Loss:     1.198730 Tokens per Sec:    21476, Lr: 0.000300
2020-02-12 21:04:37,239 Epoch  64 Step:   155200 Batch Loss:     1.229787 Tokens per Sec:    22335, Lr: 0.000300
2020-02-12 21:04:47,198 Epoch  64 Step:   155300 Batch Loss:     1.316694 Tokens per Sec:    21947, Lr: 0.000300
2020-02-12 21:04:57,130 Epoch  64 Step:   155400 Batch Loss:     1.465723 Tokens per Sec:    22158, Lr: 0.000300
2020-02-12 21:05:07,238 Epoch  64 Step:   155500 Batch Loss:     1.254294 Tokens per Sec:    21631, Lr: 0.000300
2020-02-12 21:05:17,387 Epoch  64 Step:   155600 Batch Loss:     1.200530 Tokens per Sec:    21566, Lr: 0.000300
2020-02-12 21:05:27,401 Epoch  64 Step:   155700 Batch Loss:     1.324548 Tokens per Sec:    21785, Lr: 0.000300
2020-02-12 21:05:37,339 Epoch  64 Step:   155800 Batch Loss:     1.348824 Tokens per Sec:    22419, Lr: 0.000300
2020-02-12 21:05:47,241 Epoch  64 Step:   155900 Batch Loss:     1.429127 Tokens per Sec:    228

2020-02-12 21:10:25,797 Epoch  65 Step:   158100 Batch Loss:     1.485911 Tokens per Sec:    23178, Lr: 0.000300
2020-02-12 21:10:35,757 Epoch  65 Step:   158200 Batch Loss:     1.432350 Tokens per Sec:    22102, Lr: 0.000300
2020-02-12 21:10:45,728 Epoch  65 Step:   158300 Batch Loss:     1.526295 Tokens per Sec:    22660, Lr: 0.000300
2020-02-12 21:10:55,886 Epoch  65 Step:   158400 Batch Loss:     1.449861 Tokens per Sec:    21815, Lr: 0.000300
2020-02-12 21:11:06,038 Epoch  65 Step:   158500 Batch Loss:     1.459987 Tokens per Sec:    21674, Lr: 0.000300
2020-02-12 21:11:15,960 Epoch  65 Step:   158600 Batch Loss:     1.538805 Tokens per Sec:    22077, Lr: 0.000300
2020-02-12 21:11:25,910 Epoch  65 Step:   158700 Batch Loss:     1.522396 Tokens per Sec:    22325, Lr: 0.000300
2020-02-12 21:11:35,885 Epoch  65 Step:   158800 Batch Loss:     1.183915 Tokens per Sec:    22091, Lr: 0.000300
2020-02-12 21:11:45,882 Epoch  65 Step:   158900 Batch Loss:     1.379419 Tokens per Sec:    215

2020-02-12 21:16:23,644 Epoch  66 Step:   161100 Batch Loss:     1.236913 Tokens per Sec:    22313, Lr: 0.000300
2020-02-12 21:16:33,592 Epoch  66 Step:   161200 Batch Loss:     1.328223 Tokens per Sec:    22132, Lr: 0.000300
2020-02-12 21:16:43,592 Epoch  66 Step:   161300 Batch Loss:     1.279949 Tokens per Sec:    21972, Lr: 0.000300
2020-02-12 21:16:53,629 Epoch  66 Step:   161400 Batch Loss:     1.409196 Tokens per Sec:    22547, Lr: 0.000300
2020-02-12 21:17:03,665 Epoch  66 Step:   161500 Batch Loss:     1.367484 Tokens per Sec:    21781, Lr: 0.000300
2020-02-12 21:17:13,651 Epoch  66 Step:   161600 Batch Loss:     1.233068 Tokens per Sec:    22163, Lr: 0.000300
2020-02-12 21:17:23,613 Epoch  66 Step:   161700 Batch Loss:     1.384800 Tokens per Sec:    21532, Lr: 0.000300
2020-02-12 21:17:33,728 Epoch  66 Step:   161800 Batch Loss:     1.413949 Tokens per Sec:    21887, Lr: 0.000300
2020-02-12 21:17:43,738 Epoch  66 Step:   161900 Batch Loss:     1.314624 Tokens per Sec:    214

2020-02-12 21:22:21,448 Epoch  67 Step:   164100 Batch Loss:     1.479446 Tokens per Sec:    22097, Lr: 0.000300
2020-02-12 21:22:31,338 Epoch  67 Step:   164200 Batch Loss:     1.453567 Tokens per Sec:    22385, Lr: 0.000300
2020-02-12 21:22:41,451 Epoch  67 Step:   164300 Batch Loss:     1.358651 Tokens per Sec:    21795, Lr: 0.000300
2020-02-12 21:22:51,718 Epoch  67 Step:   164400 Batch Loss:     1.157368 Tokens per Sec:    21548, Lr: 0.000300
2020-02-12 21:23:01,888 Epoch  67 Step:   164500 Batch Loss:     1.052091 Tokens per Sec:    21355, Lr: 0.000300
2020-02-12 21:23:11,869 Epoch  67 Step:   164600 Batch Loss:     1.465715 Tokens per Sec:    22545, Lr: 0.000300
2020-02-12 21:23:21,879 Epoch  67 Step:   164700 Batch Loss:     1.388363 Tokens per Sec:    21592, Lr: 0.000300
2020-02-12 21:23:24,985 Epoch  67: total training loss 3381.95
2020-02-12 21:23:24,985 EPOCH 68
2020-02-12 21:23:32,104 Epoch  68 Step:   164800 Batch Loss:     1.100460 Tokens per Sec:    21014, Lr: 0.000300


2020-02-12 21:28:19,756 Epoch  68 Step:   167100 Batch Loss:     1.320471 Tokens per Sec:    22520, Lr: 0.000300
2020-02-12 21:28:28,596 Epoch  68: total training loss 3378.79
2020-02-12 21:28:28,597 EPOCH 69
2020-02-12 21:28:29,860 Epoch  69 Step:   167200 Batch Loss:     1.252517 Tokens per Sec:    19149, Lr: 0.000300
2020-02-12 21:28:39,683 Epoch  69 Step:   167300 Batch Loss:     0.909732 Tokens per Sec:    22025, Lr: 0.000300
2020-02-12 21:28:49,603 Epoch  69 Step:   167400 Batch Loss:     1.272224 Tokens per Sec:    21858, Lr: 0.000300
2020-02-12 21:28:59,517 Epoch  69 Step:   167500 Batch Loss:     1.142943 Tokens per Sec:    21971, Lr: 0.000300
2020-02-12 21:29:09,505 Epoch  69 Step:   167600 Batch Loss:     1.449285 Tokens per Sec:    22401, Lr: 0.000300
2020-02-12 21:29:19,519 Epoch  69 Step:   167700 Batch Loss:     1.524561 Tokens per Sec:    21455, Lr: 0.000300
2020-02-12 21:29:29,495 Epoch  69 Step:   167800 Batch Loss:     1.219512 Tokens per Sec:    22015, Lr: 0.000300


2020-02-12 21:34:16,872 Epoch  70 Step:   170100 Batch Loss:     1.500207 Tokens per Sec:    21710, Lr: 0.000210
2020-02-12 21:34:26,790 Epoch  70 Step:   170200 Batch Loss:     1.456264 Tokens per Sec:    22478, Lr: 0.000210
2020-02-12 21:34:36,775 Epoch  70 Step:   170300 Batch Loss:     1.225434 Tokens per Sec:    22160, Lr: 0.000210
2020-02-12 21:34:46,674 Epoch  70 Step:   170400 Batch Loss:     1.147439 Tokens per Sec:    22633, Lr: 0.000210
2020-02-12 21:34:56,574 Epoch  70 Step:   170500 Batch Loss:     1.492128 Tokens per Sec:    21642, Lr: 0.000210
2020-02-12 21:35:06,698 Epoch  70 Step:   170600 Batch Loss:     1.142891 Tokens per Sec:    21335, Lr: 0.000210
2020-02-12 21:35:16,806 Epoch  70 Step:   170700 Batch Loss:     1.164842 Tokens per Sec:    21830, Lr: 0.000210
2020-02-12 21:35:26,878 Epoch  70 Step:   170800 Batch Loss:     1.104715 Tokens per Sec:    22225, Lr: 0.000210
2020-02-12 21:35:37,001 Epoch  70 Step:   170900 Batch Loss:     1.374452 Tokens per Sec:    214

2020-02-12 21:40:12,345 Epoch  71 Step:   173100 Batch Loss:     1.539012 Tokens per Sec:    21999, Lr: 0.000210
2020-02-12 21:40:22,200 Epoch  71 Step:   173200 Batch Loss:     1.469249 Tokens per Sec:    22888, Lr: 0.000210
2020-02-12 21:40:32,063 Epoch  71 Step:   173300 Batch Loss:     1.316316 Tokens per Sec:    22537, Lr: 0.000210
2020-02-12 21:40:41,952 Epoch  71 Step:   173400 Batch Loss:     1.058633 Tokens per Sec:    21707, Lr: 0.000210
2020-02-12 21:40:51,943 Epoch  71 Step:   173500 Batch Loss:     1.452735 Tokens per Sec:    22243, Lr: 0.000210
2020-02-12 21:41:01,866 Epoch  71 Step:   173600 Batch Loss:     1.478271 Tokens per Sec:    22119, Lr: 0.000210
2020-02-12 21:41:11,714 Epoch  71 Step:   173700 Batch Loss:     1.305297 Tokens per Sec:    21938, Lr: 0.000210
2020-02-12 21:41:21,573 Epoch  71 Step:   173800 Batch Loss:     1.453498 Tokens per Sec:    22531, Lr: 0.000210
2020-02-12 21:41:31,473 Epoch  71 Step:   173900 Batch Loss:     1.310819 Tokens per Sec:    222

2020-02-12 21:46:10,788 Epoch  72 Step:   176100 Batch Loss:     1.370144 Tokens per Sec:    21528, Lr: 0.000210
2020-02-12 21:46:20,968 Epoch  72 Step:   176200 Batch Loss:     1.523032 Tokens per Sec:    21737, Lr: 0.000210
2020-02-12 21:46:31,047 Epoch  72 Step:   176300 Batch Loss:     1.417220 Tokens per Sec:    22094, Lr: 0.000210


In [21]:
!mkdir -p "$experiment_path/models/${src}${tgt}_transformer/"

In [22]:
# Copy the created models from the notebook storage to google drive for persistant storage 
!cp -r joeynmt/models/${src}${tgt}_transformer/* "$experiment_path/models/${src}${tgt}_transformer/"

In [23]:
# Output our validation accuracy
! cat "$experiment_path/models/${src}${tgt}_transformer/validations.txt"

Steps: 1000	Loss: 105431.41406	PPL: 35.21214	bleu: 2.23576	LR: 0.00030000	*
Steps: 2000	Loss: 90648.33594	PPL: 21.37090	bleu: 3.87765	LR: 0.00030000	*
Steps: 3000	Loss: 83058.23438	PPL: 16.53769	bleu: 4.98396	LR: 0.00030000	*
Steps: 4000	Loss: 77589.78125	PPL: 13.74840	bleu: 7.26332	LR: 0.00030000	*
Steps: 5000	Loss: 73523.78906	PPL: 11.98404	bleu: 9.31015	LR: 0.00030000	*
Steps: 6000	Loss: 70898.39062	PPL: 10.96702	bleu: 10.50791	LR: 0.00030000	*
Steps: 7000	Loss: 68224.22656	PPL: 10.01978	bleu: 11.86918	LR: 0.00030000	*
Steps: 8000	Loss: 66128.57031	PPL: 9.33501	bleu: 13.12602	LR: 0.00030000	*
Steps: 9000	Loss: 64338.49219	PPL: 8.78727	bleu: 14.40652	LR: 0.00030000	*
Steps: 10000	Loss: 62378.29688	PPL: 8.22428	bleu: 15.51252	LR: 0.00030000	*
Steps: 11000	Loss: 61080.13672	PPL: 7.87143	bleu: 15.65950	LR: 0.00030000	*
Steps: 12000	Loss: 59583.07812	PPL: 7.48327	bleu: 16.62673	LR: 0.00030000	*
Steps: 13000	Loss: 58631.89453	PPL: 7.24665	bleu: 16.84451	LR: 0.00030000	*
Steps

In [25]:
# Test our model
! cd joeynmt; python3 -m joeynmt test configs/transformer_$src$tgt.yaml

2020-02-13 08:07:47,711 Hello! This is Joey-NMT.
2020-02-13 08:08:17,568  dev bleu:  30.29 [Beam search decoding with beam size = 5 and alpha = 1.0]
2020-02-13 08:08:23,920 test bleu:  44.70 [Beam search decoding with beam size = 5 and alpha = 1.0]
