Update README.md
README.md
CHANGED
@@ -4,4 +4,19 @@ language:
 - ru
 library_name: nemo
 pipeline_tag: token-classification
----
+---
+
+# Russian G2P token classification model
+
+This is a non-autoregressive model for Russian grapheme-to-phoneme (G2P) conversion based on the BERT architecture. It predicts phonemes in IPA format.
+The initial data was built from the Wiktionary JSON available at https://kaikki.org/dictionary/Russian/index.html
+
+
+## Intended uses & limitations
+
+The input is expected to consist of Cyrillic letters separated by spaces. Each real space should be replaced with an underscore (_).
+Note that the model was trained on single words and some short phrases.
+Although it can accept longer phrases, its accuracy may degrade on them.
+
+### How to use
+
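The input format described under "Intended uses & limitations" (Cyrillic letters separated by spaces, with each real space replaced by an underscore) can be illustrated with a small preprocessing helper. This is only a sketch and is not part of the commit: the function name `to_model_input` is hypothetical, and the helper does not lowercase, strip punctuation, or call the model, since the README does not specify any of that.

```python
def to_model_input(text: str) -> str:
    """Format text as space-separated letters, with "_" marking real spaces.

    Hypothetical helper: it only covers the input layout described in the
    README and performs no normalization beyond collapsing whitespace.
    """
    # Split on any run of whitespace so repeated spaces yield one "_".
    words = text.split()
    # Mark word boundaries with "_" before splitting into letters.
    joined = "_".join(words)
    # Iterating over a string yields its characters one by one.
    return " ".join(joined)


if __name__ == "__main__":
    print(to_model_input("привет мир"))  # п р и в е т _ м и р
```

Because the model was trained on single words and short phrases, it is probably safer to pass short inputs through such a helper than whole sentences.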