Upload 4 files
README.md
CHANGED
@@ -10,13 +10,14 @@ This project provides a Grapheme to Phoneme (G2P) conversion tool that first che
 - **Stress Addition**: The second Transformer model adds stress markers to the phonemes.
 3. **ARPAbet Output**: Outputs phonemes in ARPAbet format.
 4. **Phoneme Integer Indices**: Converts graphemes to phoneme integer indices.
+5. A BPE tokenizer was used, which led to a better translation quality
 
 ## Installation
 
 1. Clone the repository:
 ```sh
-git clone https://github.com/NikiPshg/
-cd
+git clone https://github.com/NikiPshg/Grapheme-to-Phoneme-G2P-with-Stress.git
+cd Grapheme-to-Phoneme-G2P-with-Stress
 ```
 
 2. Install the required dependencies:
@@ -31,15 +32,17 @@ This project provides a Grapheme to Phoneme (G2P) conversion tool that first che
 from G2P_lexicon import g2p_en_lexicon
 
 # Initialize the G2P converter
-
-
+g2p = g2p_en_lexicon()
 # Convert a word to phonemes
 text = "text, numbers, and some strange symbols !№;% 21"
-phonemes =
-['T', 'EH', 'K', 'S', 'T', ' ', ',', ' ',
-'
-'
-'
+phonemes = g2p(text, with_stress=False)
+['T', 'EH', 'K', 'S', 'T', ' ', ',', ' ',
+'N', 'AH', 'M', 'B', 'ER', 'Z',' ', ',', ' ',
+'AE', 'N', 'D', ' ', 'S', 'AH', 'M', ' ',
+'S', 'T', 'R', 'EY', 'N', 'JH',' ',
+'S', 'IH', 'M', 'B', 'AH', 'L', 'Z',' ',
+'T', 'W', 'EH', 'N', 'IY', ' ', 'W', 'AH', 'N']
+
 
 
 
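
The updated README example returns one flat list in which `' '` separates words and punctuation appears as its own token. A minimal sketch of how that list could be regrouped per word is shown below. It assumes the output has exactly the shape shown in the diff; `group_phonemes` is a hypothetical helper and not part of `G2P_lexicon`, and the list is copied from the README so the snippet runs without the package installed.

```python
from typing import List


def group_phonemes(phonemes: List[str]) -> List[List[str]]:
    """Split a flat phoneme list into per-word groups at ' ' separators.

    Hypothetical helper for illustration; not part of G2P_lexicon.
    """
    words: List[List[str]] = []
    current: List[str] = []
    for token in phonemes:
        if token == " ":
            # A space closes the current group (skip empty groups).
            if current:
                words.append(current)
                current = []
        else:
            current.append(token)
    if current:
        words.append(current)
    return words


# Output copied from the README example (with_stress=False).
phonemes = ['T', 'EH', 'K', 'S', 'T', ' ', ',', ' ',
            'N', 'AH', 'M', 'B', 'ER', 'Z', ' ', ',', ' ',
            'AE', 'N', 'D', ' ', 'S', 'AH', 'M', ' ',
            'S', 'T', 'R', 'EY', 'N', 'JH', ' ',
            'S', 'IH', 'M', 'B', 'AH', 'L', 'Z', ' ',
            'T', 'W', 'EH', 'N', 'IY', ' ', 'W', 'AH', 'N']

groups = group_phonemes(phonemes)
for word in groups:
    print("-".join(word))
```

Punctuation tokens such as `','` end up in their own group because the sample output surrounds them with spaces; callers that only want words could filter those groups out.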