#### Languages:
- Source language: English
- Target language: isiZulu
#### Model Details:
- Model: Transformer
- Architecture: MarianMT
- Pre-processing: normalization + SentencePiece
#### Pre-trained Model:
- https://huggingface.co/Helsinki-NLP/opus-mt-en-xh
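
A minimal sketch of loading a MarianMT checkpoint with the Hugging Face `transformers` library and translating a batch of English sentences. It defaults to the pre-trained checkpoint linked above; to use the fine-tuned English-isiZulu model, pass its repository id instead (probably `MUNasir/umsuka-en-zu`, but that id is an assumption and is not stated in this README). The helper name `translate` and the `max_new_tokens` value are illustrative choices, not part of the original training setup.

```python
# Checkpoint id taken from the pre-trained model link above.
BASE_CHECKPOINT = "Helsinki-NLP/opus-mt-en-xh"


def translate(sentences, checkpoint=BASE_CHECKPOINT, max_new_tokens=128):
    """Translate a list of English sentences with a MarianMT checkpoint.

    `checkpoint` may be any MarianMT repository id; the id of the fine-tuned
    English-isiZulu model is an assumption and must be supplied by the caller.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    # The tokenizer applies the SentencePiece model shipped with the checkpoint.
    tokenizer = MarianTokenizer.from_pretrained(checkpoint)
    model = MarianMTModel.from_pretrained(checkpoint)

    # Pad to the longest sentence in the batch and truncate over-long inputs.
    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the checkpoint on first use; requires transformers, torch and sentencepiece.
    print(translate(["Good morning, how are you?"]))
```

Running the script needs network access to download the checkpoint. With the default checkpoint the output is isiXhosa, not isiZulu, because that is the language pair the base model was trained on.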
#### Corpus:
- Umsuka English-isiZulu Parallel Corpus (https://zenodo.org/record/5035171#.Yh5NIOhBy3A)
#### Benchmark:
| Benchmark | Train | Test |
|-----------|-------|-------|
| Umsuka | 17.61 | 13.73 |
#### GitHub:
- https://github.com/umair-nasir14/Geographical-Distance-Is-The-New-Hyperparameter
#### Citation:
```
@article{umair2022geographical,
title={Geographical Distance Is The New Hyperparameter: A Case Study Of Finding The Optimal Pre-trained Language For English-isiZulu Machine Translation},
author={Umair Nasir, Muhammad and Amos Mchechesi, Innocent},
journal={arXiv e-prints},
pages={arXiv--2205},
year={2022}
}
```