bene-ges/ru_g2p_ipa_bert_large

Token Classification

Grapheme-to-Phoneme


bene-ges committed on Mar 25, 2023

Commit fa25fb6 • 1 Parent(s): cbd8ff2

Update README.md

Files changed (1)

README.md +16 -1


@@ -4,4 +4,19 @@ language:
 - ru
 library_name: nemo
 pipeline_tag: token-classification
----

+---
+# Russian G2P token classification model
+This is a non-autoregressive model for Russian grapheme-to-phoneme (G2P) conversion based on the BERT architecture. It predicts phonemes in IPA format.
+The initial data was built from Wiktionary JSON available at https://kaikki.org/dictionary/Russian/index.html
+## Intended uses & limitations
+The input is expected to consist of Cyrillic letters separated by spaces. Each real space should be replaced with an underscore (_).
+Note that the model was trained on single words and some short phrases.
+Though it can accept longer phrases, its accuracy may degrade on them.
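
The input format described above can be illustrated with a minimal preprocessing sketch. This is not part of the commit: the function name `to_g2p_input` is hypothetical, and collapsing runs of whitespace into a single space before conversion is an assumption, since the README does not say how repeated spaces should be handled.

```python
def to_g2p_input(text: str) -> str:
    """Format text as space-separated characters, with real spaces as underscores.

    Hypothetical helper for illustration; not part of the model's repository.
    """
    # Assumption: trim the text and collapse whitespace runs to a single space.
    normalized = " ".join(text.split())
    # Emit every character as its own token; a real space becomes "_".
    return " ".join("_" if ch == " " else ch for ch in normalized)


if __name__ == "__main__":
    print(to_g2p_input("доброе утро"))  # д о б р о е _ у т р о
```

Because the model was trained on single words and short phrases, a helper like this is probably best applied to one word or short phrase at a time rather than to whole sentences.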
+### How to use