edaiofficial committed on
Commit 5716dec · 2 Parent(s): a2c106c 5e994ef

Merge branch 'main' of https://huggingface.co/chrisjay/masakhane_benchmarks into main

Files changed (1)
  1. README.md +51 -0
README.md ADDED
@@ -0,0 +1,51 @@
+ ---
+ language: african-languages
+ tags:
+ - african-languages
+ - machine-translation
+ - text
+ license: apache-2.0
+ model-index:
+ - name: Masakhane Benchmark Models
+   results:
+   - task:
+       name: Machine Translation
+       type: machine-translation
+     dataset:
+       name: masakhane benchmarks
+       args: african-languages
+
+ ---
+ # Interacting with the Masakhane Benchmark Models
+
+ I created this demo to make it easy to interact with the [benchmark models on Masakhane](https://github.com/masakhane-io/masakhane-mt/tree/master/benchmarks), which were trained with [JoeyNMT](https://github.com/chrisemezue/joeynmt) (my forked version).
+
+ To access the space, click [here](https://huggingface.co/spaces/chrisjay/masakhane-benchmarks).
+
+ To include your language, all you need to do is:
+ 1. Create a folder named *src-tgt/main* for your language pair, if it does not already exist.
+ 2. Inside the *main* folder, put the following files:
+    1. Model checkpoint. Rename it to `best.ckpt`.
+    2. `config.yaml` file. This is the JoeyNMT config file, which loads the model and pre-processing parameters.
+    3. `src_vocab.txt` file.
+    4. `trg_vocab.txt` file.
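
The folder layout described in the steps above can be checked before uploading. The following is a minimal sketch, not part of the space itself: the helper names `model_dir` and `missing_files` are hypothetical, and the language codes `en` and `sw` are placeholders, since the README does not say which codes to use for *src* and *tgt*.

```python
from pathlib import Path

# File names taken from the steps above.
REQUIRED_FILES = ("best.ckpt", "config.yaml", "src_vocab.txt", "trg_vocab.txt")


def model_dir(root, src, tgt):
    """Return the expected folder for a language pair: <root>/<src>-<tgt>/main."""
    return Path(root) / f"{src}-{tgt}" / "main"


def missing_files(root, src, tgt):
    """List the required files that are absent from the language-pair folder."""
    folder = model_dir(root, src, tgt)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    # "en" and "sw" are placeholder codes (an assumption, not from the README).
    missing = missing_files(".", "en", "sw")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are present.")
```

Run from the repository root, this only confirms that the four files exist under the expected path. It does not verify that the checkpoint matches the config or the vocabularies.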
+
+ The space currently supports these language pairs:
+
+ | Source language | Target language |
+ |:---------------:|:---------------:|
+ | English | Swahili |
+ | English | Afrikaans |
+ | English | Arabic |
+ | Efik | English |
+ | English | Hausa |
+ | English | Igbo |
+ | English | Fon |
+ | English | Twi |
+ | Shona | English |
+ | Swahili | English |
+ | Yoruba | English |
+
+ TO DO:
+ 1. Improve the inference time.
+ 2. Include more languages from the benchmark.