# Introduction to configurations
## Dataset
```yaml
dataset:  # the dataset part is for training only
  train:
    wav_scp: './train/wav.scp'
    mel_scp: './train/mel.scp'
    dur_scp: './train/dur.scp'
    emb_type1:
      _name: 'pinyin'
      scp: './train/py.scp'
      vocab: 'py.vocab'
    emb_type2:
      _name: 'graphic'
      scp: './train/gp.scp'
      vocab: 'gp.vocab'
    # emb_type3:
    #   _name: 'speaker'
    #   scp: './train/spk.scp'
    #   vocab:  # doesn't need vocab
    emb_type4:
      _name: 'prosody'
      scp: './train/psd.scp'
      vocab:
```
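As a rough sketch of how the commented-out `emb_type3` entry in the example above could be enabled for multi-speaker data, the fragment below uncomments it and keeps the names and paths from the template. It assumes that `./train/spk.scp` exists and that the corresponding speaker embedding is also enabled in the model part of the config, which is not shown here.

```yaml
dataset:
  train:
    # wav_scp, mel_scp, dur_scp and the other emb_type entries stay unchanged
    emb_type3:  # previously commented out
      _name: 'speaker'
      scp: './train/spk.scp'  # assumed to exist for your data
      vocab:  # left empty, as in the commented-out template
```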
## Vocoder
```yaml
vocoder:
  type: VocGan  # choose one of the following
  MelGAN:
    checkpoint: ~/checkpoints/melgan/melgan_ljspeech.pth
    config: ~/checkpoints/melgan/default.yaml
    device: cpu
  VocGan:
    checkpoint: ~/checkpoints/vctk_pretrained_model_3180.pt  # ~/checkpoints/ljspeech_29de09d_4000.pt
    denoise: True
    device: cpu
  HiFiGAN:
    checkpoint: ~/checkpoints/VCTK_V3/generator_v3  # you need to download checkpoint and set the params here
    device: cpu
  Waveglow:
    checkpoint: ~/checkpoints/waveglow_256channels_universal_v5_state_dict.pt
    sigma: 1.0
    denoiser_strength: 0.0  # try 0.1
    device: cpu  # try cpu if out of memory
```
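To switch vocoders, the `type` field is changed to the name of another entry. The fragment below is a minimal sketch that selects HiFiGAN, assuming that `type` must match one of the entry names exactly and that the checkpoint has already been downloaded to the path shown. Whether the unused entries may be deleted is not confirmed here, so it is safest to leave them in place.

```yaml
vocoder:
  type: HiFiGAN  # assumed to match the entry name below
  HiFiGAN:
    checkpoint: ~/checkpoints/VCTK_V3/generator_v3  # download this first
    device: cpu
  # MelGAN, VocGan and Waveglow entries left as they are
```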
## Make your own changes
Two config files are provided in the examples for illustration purposes. You can change the config file if you know what you are doing.
For example, you can remove speaker_emb from the following section, or add a prosody embedding if you have prosody labels (as in the Biaobei dataset).