bene-ges committed on
Commit
9a3d45b
1 Parent(s): 5bf6c55

Update README.md

# English G2P token classification model

This is a non-autoregressive model for English grapheme-to-phoneme (G2P) conversion, based on the BERT architecture. It predicts phonemes in the CMU format.
The initial data was built from CMUdict v0.07.

## Intended uses & limitations

Each input line is expected to contain a single word, written as English letters and apostrophes separated by spaces (for example, `m u f f l e`).

### How to use

Install NeMo.

Download `en_g2p.nemo` (this model):
```bash
git lfs install
git clone https://huggingface.co/bene-ges/en_g2p_cmu_bert_large
```

Run inference:

```bash
python ${NEMO_ROOT}/examples/nlp/text_normalization_as_tagging/normalization_as_tagging_infer.py \
pretrained_model=en_g2p_cmu_bert_large/en_g2p.nemo \
inference.from_file=input.txt \
inference.out_file=output.txt \
model.max_sequence_len=64 \
inference.batch_size=128 \
lang=en
```

Example of input file:
```
g e f f e r t
p r o s c r i b e d
p r o m i n e n t l y
j o c e l y n
m a r c e c a ' s
s t a n k o w s k i
m u f f l e
```
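An input file in this layout can be generated from an ordinary word list. The following is a minimal sketch, not part of NeMo: the helper names `word_to_input_line` and `write_input_file` are hypothetical, and lowercasing the words is an assumption based on the lowercase examples above.

```python
from typing import Iterable

# Characters the model card says the input may contain.
# Restricting to lowercase is an assumption based on the examples.
ALLOWED_CHARS = set("abcdefghijklmnopqrstuvwxyz'")


def word_to_input_line(word: str) -> str:
    """Format one word as space-separated characters."""
    chars = list(word.strip().lower())
    unsupported = [c for c in chars if c not in ALLOWED_CHARS]
    if unsupported:
        raise ValueError(f"unsupported characters in {word!r}: {unsupported}")
    return " ".join(chars)


def write_input_file(words: Iterable[str], path: str = "input.txt") -> None:
    """Write one formatted word per line."""
    with open(path, "w", encoding="utf-8") as f:
        for word in words:
            f.write(word_to_input_line(word) + "\n")
```

For example, `word_to_input_line("marceca's")` returns `m a r c e c a ' s`, matching the fifth line of the sample input.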

Example of output file:
```
G EH1 F ER0 T g e f f e r t G EH1 <DELETE> F <DELETE> ER0 T G EH1 <DELETE> F <DELETE> ER0 T PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
P R OW0 S K R AY1 B D p r o s c r i b e d P R OW0 S K R AY1 B <DELETE> D P R OW0 S K R AY1 B <DELETE> D PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
P R AA1 M AH0 N AH0 N T L IY0 p r o m i n e n t l y P R AA1 M AH0 N AH0 N T L IY0 P R AA1 M AH0 N AH0 N T L IY0 PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
JH AO1 S L IH0 N j o c e l y n JH AO1 S <DELETE> L IH0 N JH AO1 S <DELETE> L IH0 N PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
M AA0 R S EH1 K AH0 Z m a r c e c a ' s M AA0 R S EH1 K AH0 <DELETE> Z M AA0 R S EH1 K AH0 <DELETE> Z PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
S T AH0 NG K AO1 F S K IY0 s t a n k o w s k i S T AH0 NG K AO1 F S K IY0 S T AH0 NG K AO1 F S K IY0 PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
M AH1 F AH0 L m u f f l e M AH1 <DELETE> F AH0_L <DELETE> M AH1 <DELETE> F AH0_L <DELETE> PLAIN PLAIN PLAIN PLAIN PLAIN PLAIN
```

Note that the correct output tags are in the **third** column; the input is in the second column.
Tags correspond to input letters one-to-one. If you remove the `<DELETE>` tags and replace each `_` with a space, you should get a CMU-like transcription.
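The post-processing described above can be sketched in a few lines of Python. This is a sketch under stated assumptions, not NeMo code: the function names are hypothetical, the columns of the output file are assumed to be tab-separated (the delimiter is not stated here), and `_` is treated as joining two phonemes produced for a single letter.

```python
from typing import List, Tuple


def tags_to_phonemes(tags: str) -> List[str]:
    """Turn a space-separated tag string into a list of phonemes."""
    phonemes: List[str] = []
    for tag in tags.split():
        if tag == "<DELETE>":
            continue  # this letter produces no phoneme
        # Assumption: "_" joins several phonemes emitted for one letter.
        phonemes.extend(p for p in tag.split("_") if p)
    return phonemes


def parse_output_line(line: str) -> Tuple[str, List[str]]:
    """Return (word, phonemes) for one output line.

    Assumes tab-separated columns, with the input letters in the
    second column and the output tags in the third.
    """
    columns = line.rstrip("\n").split("\t")
    letters = columns[1].split()
    tags = columns[2].split()
    if len(letters) != len(tags):
        raise ValueError("tags and letters are not aligned one-to-one")
    return "".join(letters), tags_to_phonemes(columns[2])
```

For example, `tags_to_phonemes("M AH1 <DELETE> F AH0_L <DELETE>")` returns `['M', 'AH1', 'F', 'AH0', 'L']`.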

### How to use for TTS

Files changed (1)
README.md (+8 -1)
@@ -1,3 +1,10 @@
  ---
  license: cc-by-4.0
- ---
+ language:
+ - en
+ library_name: nemo
+ pipeline_tag: token-classification
+ tags:
+ - G2P
+ - Grapheme-to-Phoneme
+ ---