bene-ges/en_g2p_cmu_bert_large

bene-ges committed on May 14, 2023

Commit 8bab68f (1 parent: 9a3d45b)

Update README.md

Files changed (1)

README.md +62 -1

@@ -7,4 +7,65 @@ pipeline_tag: token-classification
 tags:
 - G2P
 - Grapheme-to-Phoneme
----

+---
+# English G2P token classification model
+This is a non-autoregressive model for English grapheme-to-phoneme (G2P) conversion based on the BERT architecture. It predicts phonemes in CMU format.
+The initial data was built from CMUdict v0.07.
+## Intended uses & limitations
+The input is expected to contain English words made up of Latin letters and apostrophes, with all characters separated by spaces.
+### How to use
+Install NeMo.
+Download en_g2p.nemo (this model):
+```bash
+git lfs install
+git clone https://huggingface.co/bene-ges/en_g2p_cmu_bert_large
+```
+Run inference:
+```bash
+python ${NEMO_ROOT}/examples/nlp/text_normalization_as_tagging/normalization_as_tagging_infer.py \
+  pretrained_model=en_g2p_cmu_bert_large/en_g2p.nemo \
+  inference.from_file=input.txt \
+  inference.out_file=output.txt \
+  model.max_sequence_len=64 \
+  inference.batch_size=128 \
+  lang=en
+```
+Example of input file:
+```
+g e f f e r t
+p r o s c r i b e d
+p r o m i n e n t l y
+j o c e l y n
+m a r c e c a ' s
+s t a n k o w s k i
+m u f f l e
+```
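An input file in this format can be produced from a plain word list. The following is a minimal sketch, not part of NeMo: the helper names `to_g2p_input` and `write_input_file` are hypothetical, and lowercasing the words is an assumption based on the lowercase examples above.

```python
import re

# Only Latin letters and apostrophes are accepted, as stated in the README.
_ALLOWED = re.compile(r"[a-z']+")


def to_g2p_input(word: str) -> str:
    """Return the word as space-separated characters, e.g. "muffle" -> "m u f f l e".

    Lowercasing is an assumption; the README examples are all lowercase.
    """
    word = word.strip().lower()
    if not _ALLOWED.fullmatch(word):
        raise ValueError(f"unsupported characters in {word!r}")
    return " ".join(word)


def write_input_file(words, path="input.txt"):
    """Write one space-separated word per line (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        for word in words:
            f.write(to_g2p_input(word) + "\n")
```

Calling `write_input_file(["geffert", "marceca's", "muffle"])` would write three lines in the same form as the example input.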
+Example of output file:
+```
+G EH1  F  ER0 T	g e f f e r t	G EH1 <DELETE> F <DELETE> ER0 T	G EH1 <DELETE> F <DELETE> ER0 T	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
+P R OW0 S K R AY1 B  D	p r o s c r i b e d	P R OW0 S K R AY1 B <DELETE> D	P R OW0 S K R AY1 B <DELETE> D	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
+P R AA1 M AH0 N AH0 N T L IY0	p r o m i n e n t l y	P R AA1 M AH0 N AH0 N T L IY0	P R AA1 M AH0 N AH0 N T L IY0	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
+JH AO1 S  L IH0 N	j o c e l y n	JH AO1 S <DELETE> L IH0 N	JH AO1 S <DELETE> L IH0 N	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
+M AA0 R S EH1 K AH0  Z	m a r c e c a ' s	M AA0 R S EH1 K AH0 <DELETE> Z	M AA0 R S EH1 K AH0 <DELETE> Z	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
+S T AH0 NG K AO1 F S K IY0	s t a n k o w s k i	S T AH0 NG K AO1 F S K IY0	S T AH0 NG K AO1 F S K IY0	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
+M AH1  F AH0L	m u f f l e	M AH1 <DELETE> F AH0_L <DELETE>	M AH1 <DELETE> F AH0_L <DELETE>	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
+```
+Note that the correct output tags are in the **third** column; the input is in the second column.
+Tags correspond to input letters one-to-one. If you remove the `<DELETE>` tags and split tags joined by `_` (for example, `AH0_L` stands for `AH0 L`), you should get a CMU-like transcription.
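The post-processing described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than NeMo code: the function names are hypothetical, the columns are assumed to be tab-separated as in the example output, and `_` is assumed to join several phonemes produced by a single letter.

```python
def tags_to_phonemes(tag_column: str) -> str:
    """Turn the third output column into a CMU-like transcription."""
    phonemes = []
    for tag in tag_column.split(" "):
        # Skip letters that produce no phoneme (and any empty fragments).
        if not tag or tag == "<DELETE>":
            continue
        # Assumption: "_" joins multiple phonemes emitted for one letter.
        phonemes.extend(tag.split("_"))
    return " ".join(phonemes)


def read_transcriptions(path="output.txt"):
    """Map each input word to its transcription (hypothetical helper)."""
    result = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            cols = line.rstrip("\n").split("\t")
            if len(cols) < 3:
                continue  # not a result line
            word = cols[1].replace(" ", "")  # second column: input letters
            result[word] = tags_to_phonemes(cols[2])  # third column: tags
    return result
```

For the last example line, `tags_to_phonemes("M AH1 <DELETE> F AH0_L <DELETE>")` gives `M AH1 F AH0 L`.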
+### How to use for TTS