Update README.md
README.md
@@ -7,4 +7,65 @@ pipeline_tag: token-classification
tags:
- G2P
- Grapheme-to-Phoneme
---

# English G2P token classification model

This is a non-autoregressive model for English grapheme-to-phoneme (G2P) conversion based on the BERT architecture. It predicts phonemes in CMU format.
Initial data was built using CMUdict v0.07.

## Intended uses & limitations

The input is expected to contain English words consisting of Latin letters and apostrophes, with all letters separated by spaces.

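As a rough illustration of the input format described above, the following sketch turns a plain word list into space-separated letters. The `spell_out` function name is hypothetical, and lowercasing is an assumption based on the lowercase example inputs, not a documented requirement of the model.

```bash
# Hypothetical helper: spell out each word as space-separated letters.
# Assumes one word per input line, containing only Latin letters and apostrophes.
# Lowercasing is an assumption taken from the lowercase examples in this README.
spell_out() {
  # Insert a space after every character, then drop the trailing space.
  tr '[:upper:]' '[:lower:]' | sed -e 's/./& /g' -e 's/ $//'
}

printf '%s\n' "Jocelyn" "marceca's" | spell_out
# prints:
# j o c e l y n
# m a r c e c a ' s
```

With a hypothetical word list `words.txt`, `spell_out < words.txt > input.txt` would produce a file in the expected layout.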
### How to use

Install NeMo.

Download en_g2p.nemo (this model):
```bash
git lfs install
git clone https://huggingface.co/bene-ges/en_g2p_cmu_bert_large
```

Run inference:

```bash
python ${NEMO_ROOT}/examples/nlp/text_normalization_as_tagging/normalization_as_tagging_infer.py \
  pretrained_model=en_g2p_cmu_bert_large/en_g2p.nemo \
  inference.from_file=input.txt \
  inference.out_file=output.txt \
  model.max_sequence_len=64 \
  inference.batch_size=128 \
  lang=en
```

Example of input file:
```
g e f f e r t
p r o s c r i b e d
p r o m i n e n t l y
j o c e l y n
m a r c e c a ' s
s t a n k o w s k i
m u f f l e
```

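Before running inference, it may help to check that every line of the input file follows this layout. The sketch below is one possible check, not part of NeMo: `check_input` is a hypothetical name, and the pattern assumes that only Latin letters (either case) and apostrophes, separated by single spaces, are valid.

```bash
# Hypothetical helper: report lines that do not look like space-separated letters.
# Assumes only Latin letters (either case) and apostrophes are valid symbols.
check_input() {
  [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
  # -v inverts the match, so grep prints (with line numbers) only malformed lines.
  if grep -nvE "^[A-Za-z']( [A-Za-z'])*\$" "$1"; then
    return 1
  fi
  return 0
}

# Demo on a throwaway file: the second line is not split into letters.
tmp=$(mktemp)
printf '%s\n' "m u f f l e" "muffle" > "$tmp"
check_input "$tmp" || echo "fix the lines listed above before running inference"
rm -f "$tmp"
```

The function returns 0 when every line matches, so it can gate the inference command in a script.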
Example of output file:
```
G EH1 F ER0 T	g e f f e r t	G EH1 <DELETE> F <DELETE> ER0 T	G EH1 <DELETE> F <DELETE> ER0 T	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
P R OW0 S K R AY1 B D	p r o s c r i b e d	P R OW0 S K R AY1 B <DELETE> D	P R OW0 S K R AY1 B <DELETE> D	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
P R AA1 M AH0 N AH0 N T L IY0	p r o m i n e n t l y	P R AA1 M AH0 N AH0 N T L IY0	P R AA1 M AH0 N AH0 N T L IY0	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
JH AO1 S L IH0 N	j o c e l y n	JH AO1 S <DELETE> L IH0 N	JH AO1 S <DELETE> L IH0 N	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
M AA0 R S EH1 K AH0 Z	m a r c e c a ' s	M AA0 R S EH1 K AH0 <DELETE> Z	M AA0 R S EH1 K AH0 <DELETE> Z	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
S T AH0 NG K AO1 F S K IY0	s t a n k o w s k i	S T AH0 NG K AO1 F S K IY0	S T AH0 NG K AO1 F S K IY0	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
M AH1 F AH0L	m u f f l e	M AH1 <DELETE> F AH0_L <DELETE>	M AH1 <DELETE> F AH0_L <DELETE>	PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
```

Note that the correct output tags are in the **third** column; the input is in the second column.
Tags correspond to input letters in a one-to-one fashion. If you remove the `<DELETE>` tags and the `_` characters, you should get a CMU-like transcription.

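The post-processing described above could be sketched as follows. This is an illustration under two assumptions that the README does not state explicitly: that the output columns are tab-separated, and that `_` joins two phonemes predicted for a single letter (so `AH0_L` is read as `AH0 L`). The `tags_to_cmu` name is hypothetical.

```bash
# Hypothetical helper: extract a CMU-like transcription from each output line.
# Assumes tab-separated columns, with the tags in the third column.
# Assumes `_` joins two phonemes on one letter, so it is replaced by a space.
tags_to_cmu() {
  cut -f3 \
    | sed -e 's/<DELETE>//g' -e 's/_/ /g' -e 's/  */ /g' -e 's/^ //' -e 's/ $//'
}

printf 'M AH1 F AH0L\tm u f f l e\tM AH1 <DELETE> F AH0_L <DELETE>\tM AH1 <DELETE> F AH0_L <DELETE>\tPLAIN PLAIN PLAIN PLAIN PLAIN PLAIN\n' | tags_to_cmu
# prints: M AH1 F AH0 L
```

Applied to a whole file, `tags_to_cmu < output.txt` would print one transcription per input word.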
### How to use for TTS
