Commit 7116a2a
1 Parent(s): b2c866d
Update README.md
README.md CHANGED
@@ -8,6 +8,17 @@ tags:
 
 This repository consists of the n-gram language models trained on Common Crawl data ([Conneau et al. 2020b](https://aclanthology.org/2020.acl-main.747/), [NLLB_Team et al. 2022](https://arxiv.org/abs/2207.04672)) using [KenLM library](https://github.com/kpu/kenlm).
 
+
+For the following languages, the LMs are not present in the repository (due to the 50GB limit on Hugging Face) and can be downloaded using the links provided below.
+
+Mandarin Chinese (Simplified) - [Download LM](https://dl.fbaipublicfiles.com/mms/lms/cmn-script_simplified/char_20gram.bin)
+
+Japanese - [Download LM](https://dl.fbaipublicfiles.com/mms/lms/jpn/char_20gram.bin)
+
+Thai - [Download LM](https://dl.fbaipublicfiles.com/mms/lms/tha/char_20gram.bin)
+
+Cantonese (Traditional) - [Download LM](https://dl.fbaipublicfiles.com/mms/lms/yue-script_traditional/char_20gram.bin)
+
 ## Table Of Content
 
 - [Example](#example)
@@ -17,10 +28,8 @@ This repository consists of the n-gram language models trained on Common Crawl d
 
 ## Example
 
-
+Check out the code here - https://huggingface.co/spaces/mms-meta/MMS/blob/main/asr.py which uses LMs for decoding the output from ASR models.
 
-TODO
-```
 
 ## Supported Languages
 