milmor committed
Commit f54e327 · 1 Parent(s): 7ca117f

Update app.py

Files changed (1)
  1. app.py +3 -3
app.py CHANGED
@@ -60,10 +60,10 @@ Also, we collected 3,000 extra samples from the web to increase the data.
 We employ two training stages using a multilingual T5-small. The advantage of this model is that it can handle different vocabularies and prefixes. T5-small is pre-trained on different tasks and languages (French, Romanian, English, German).
 
 ### Training-stage 1 (learning Spanish)
-In training stage 1, we first introduce Spanish to the model. The goal is to learn a new language rich in data (Spanish) and not lose the previous knowledge. We use the English-Spanish [Anki](https://www.manythings.org/anki/) dataset, which consists of 118.964 text pairs. Next, we train the model till convergence, adding the prefix "Translate Spanish to English: "
+In training stage 1, we first introduce Spanish to the model. The goal is to learn a new language rich in data (Spanish) and not lose the previous knowledge. We use the English-Spanish [Anki](https://www.manythings.org/anki/) dataset, which consists of 118.964 text pairs. The model is trained till convergence, adding the prefix "Translate Spanish to English: "
 
 ### Training-stage 2 (learning Nahuatl)
-We use the pre-trained Spanish-English model to learn Spanish-Nahuatl. Since the amount of Nahuatl pairs is limited, we also add 20,000 samples from the English-Spanish Anki dataset to our dataset. This two-task training avoids overfitting and makes the model more robust.
+We use the pre-trained Spanish-English model to learn Spanish-Nahuatl. Since the amount of Nahuatl pairs is limited, we also add 20,000 samples from the English-Spanish Anki dataset. This two-task training avoids overfitting and makes the model more robust.
 
 ### Training setup
 We train the models on the same datasets for 660k steps using batch size = 16 and a learning rate of 2e-5.
@@ -78,7 +78,7 @@ We evaluate the model on the same 505 validation Nahuatl sentences for a fair comparison.
 | True | 1.31 | 6.18 | 28.21 |
 
 
-The English-Spanish pretraining improves BLEU and Chrf and leads to faster convergence. Is it possible to reproduce the evaluation on the [eval.ipynb](https://github.com/milmor/spanish-nahuatl-translation/blob/main/eval.ipynb) notebook.
+The English-Spanish pretraining improves BLEU and Chrf and leads to faster convergence. The evaluation is available on the [eval.ipynb](https://github.com/milmor/spanish-nahuatl-translation/blob/main/eval.ipynb) notebook.
 
 ## References
 - Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. 2019. Exploring the limits
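The two-task setup described in the diff amounts to prefixing each sentence pair with its task string and mixing the corpora before fine-tuning. A minimal sketch of that data preparation, with made-up sentence pairs standing in for the Anki and Nahuatl corpora; only the English prefix is quoted in the text above, so the Nahuatl prefix string here is an assumption:

```python
import random


def make_examples(pairs, prefix):
    """Turn (source, target) pairs into T5 text-to-text examples."""
    return [{"input": prefix + src, "target": tgt} for src, tgt in pairs]


# Made-up pairs standing in for the 20,000 Anki samples and the
# scarce Spanish-Nahuatl pairs.
es_en = [("hola", "hello")]
es_nah = [("agua", "atl")]

# Stage 2 mixes both tasks, so the small Nahuatl corpus is
# regularized by the larger Spanish-English task.
# "Translate Spanish to Nahuatl: " is a hypothetical prefix.
dataset = (make_examples(es_en, "Translate Spanish to English: ")
           + make_examples(es_nah, "Translate Spanish to Nahuatl: "))
random.shuffle(dataset)
```

Because T5 conditions on the prefix, a single checkpoint can serve both translation directions at inference time by choosing the prefix.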