requires the tokenizer in tokenizers/

Voice latents pre-computed in voices/
For use in the MRQ Voice Cloning WebUI:

Requires the tokenizer used in training, and code changes to disable the English text cleaners. At minimum, change `english_cleaners` to `basic_cleaners`.
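Why this matters: in the tacotron2 text module these projects vendor, `english_cleaners` transliterates input to ASCII before anything else, which rewrites German special characters, while `basic_cleaners` only lowercases and collapses whitespace. A minimal sketch of the two (simplified; the real `english_cleaners` also expands abbreviations):

```python
import re
from unidecode import unidecode

_whitespace_re = re.compile(r"\s+")

def basic_cleaners(text):
    # Lowercase + collapse whitespace only: umlauts and eszett survive.
    return _whitespace_re.sub(" ", text.lower())

def english_cleaners(text):
    # Transliterates to ASCII first, so German special characters are
    # rewritten before they ever reach a German-trained tokenizer.
    return _whitespace_re.sub(" ", unidecode(text).lower())

print(basic_cleaners("Grüße, Straße"))    # -> grüße, straße
print(english_cleaners("Grüße, Straße"))  # -> grusse, strasse
```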
Code changes (a convenience patch script follows the list):

`modules\tortoise-tts\tortoise\utils\tokenizer.py`
Line 201: replace `txt = english_cleaners(txt)` with `txt = basic_cleaners(txt)`

`modules\tortoise-tts\build\lib\tortoise\utils\tokenizer.py`
Line 201: replace `txt = english_cleaners(txt)` with `txt = basic_cleaners(txt)`

`modules\dlas\dlas\data\audio\paired_voice_audio_dataset.py`
Line 133: replace `return text_to_sequence(txt, ['english_cleaners'])` with `return text_to_sequence(txt, ['basic_cleaners'])`

`modules\dlas\dlas\data\audio\voice_tokenizer.py`
Line 14: replace `from dlas.models.audio.tts.tacotron2.text.cleaners import english_cleaners` with `from dlas.models.audio.tts.tacotron2.text.cleaners import english_cleaners, basic_cleaners`
Line 85: replace `txt = english_cleaners(txt)` with `txt = basic_cleaners(txt)`
Line 134: replace `word = english_cleaners(word)` with `word = basic_cleaners(word)`
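If you would rather script the edits, here is a hedged sketch that applies the same swaps with plain string replacement. The paths are the ones listed above, relative to the WebUI root (written with forward slashes so it also works outside Windows). Run it once, keep backups, and review the result, since forks may have drifted from these exact lines:

```python
from pathlib import Path

# (file, old, new) triples taken from the list above.
# Not idempotent -- run once on an unpatched checkout.
EDITS = [
    ("modules/tortoise-tts/tortoise/utils/tokenizer.py",
     "txt = english_cleaners(txt)", "txt = basic_cleaners(txt)"),
    ("modules/tortoise-tts/build/lib/tortoise/utils/tokenizer.py",
     "txt = english_cleaners(txt)", "txt = basic_cleaners(txt)"),
    ("modules/dlas/dlas/data/audio/paired_voice_audio_dataset.py",
     "return text_to_sequence(txt, ['english_cleaners'])",
     "return text_to_sequence(txt, ['basic_cleaners'])"),
    ("modules/dlas/dlas/data/audio/voice_tokenizer.py",
     "from dlas.models.audio.tts.tacotron2.text.cleaners import english_cleaners",
     "from dlas.models.audio.tts.tacotron2.text.cleaners import english_cleaners, basic_cleaners"),
    ("modules/dlas/dlas/data/audio/voice_tokenizer.py",
     "txt = english_cleaners(txt)", "txt = basic_cleaners(txt)"),
    ("modules/dlas/dlas/data/audio/voice_tokenizer.py",
     "word = english_cleaners(word)", "word = basic_cleaners(word)"),
]

for rel, old, new in EDITS:
    path = Path(rel)
    text = path.read_text(encoding="utf-8")
    if old not in text:
        print(f"skipped (pattern not found): {rel}: {old}")
        continue
    path.write_text(text.replace(old, new), encoding="utf-8")
    print(f"patched: {rel}")
```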
Copy and paste German text into the tokenizer tester on the Utilities tab, and you should see it tokenized with all of the special characters and no [UNK].
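For a scripted spot check outside the WebUI, something like the following should behave the same way (a sketch: `VoiceBpeTokenizer` and its `encode`/`decode` come from tortoise-tts, but the tokenizer filename here is hypothetical; point it at the actual file in tokenizers/):

```python
from tortoise.utils.tokenizer import VoiceBpeTokenizer

# Hypothetical filename; use the tokenizer actually shipped in tokenizers/.
tok = VoiceBpeTokenizer("tokenizers/german.json")

ids = tok.encode("Grüße aus Österreich, schöne Straße!")
print(tok.decode(ids))  # special characters should round-trip with no [UNK]
```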
---
license: other
language: