sulaimank/tts-tacotron2-commonvoice-single-female

speech-synthesis

sulaimank committed on Mar 8

Commit 4a68622 • 1 Parent(s): e981371

Update README.md

Files changed (1)

README.md +3 -39

README.md CHANGED

@@ -40,11 +40,11 @@ from speechbrain.inference.TTS import Tacotron2
 from speechbrain.inference.vocoders import HIFIGAN
 # Intialize TTS (tacotron2) and Vocoder (HiFIGAN)
-tacotron2 = Tacotron2.from_hparams(source="speechbrain/tts-tacotron2-ljspeech", savedir="tmpdir_tts")
-hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder")
 # Running the TTS
-mel_output, mel_length, alignment = tacotron2.encode_text("Mary had a little lamb")
 # Running Vocoder (spectrogram-to-waveform)
 waveforms = hifi_gan.decode_batch(mel_output)
@@ -53,42 +53,6 @@ waveforms = hifi_gan.decode_batch(mel_output)
 torchaudio.save('example_TTS.wav',waveforms.squeeze(1), 22050)
 ```
-If you want to generate multiple sentences in one-shot, you can do in this way:
-```
-from speechbrain.pretrained import Tacotron2
-tacotron2 = Tacotron2.from_hparams(source="speechbrain/TTS_Tacotron2", savedir="tmpdir")
-items = [
-       "A quick brown fox jumped over the lazy dog",
-       "How much wood would a woodchuck chuck?",
-       "Never odd or even"
-     ]
-mel_outputs, mel_lengths, alignments = tacotron2.encode_batch(items)
-```
-### Inference on GPU
-To perform inference on the GPU, add  `run_opts={"device":"cuda"}`  when calling the `from_hparams` method.
-### Training
-The model was trained with SpeechBrain.
-To train it from scratch follow these steps:
-1. Clone SpeechBrain:
-```bash
-git clone https://github.com/speechbrain/speechbrain/
-```
-2. Install it:
-```bash
-cd speechbrain
-pip install -r requirements.txt
-pip install -e .
-```
-3. Run Training:
-```bash
-cd recipes/LJSpeech/TTS/tacotron2/
-python train.py --device=cuda:0 --max_grad_norm=1.0 --data_folder=/your_folder/LJSpeech-1.1 hparams/train.yaml
-```
-You can find our training results (models, logs, etc) [here](https://drive.google.com/drive/folders/1PKju-_Nal3DQqd-n0PsaHK-bVIOlbf26?usp=sharing).
 ### Limitations
 The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.

 from speechbrain.inference.vocoders import HIFIGAN
 # Intialize TTS (tacotron2) and Vocoder (HiFIGAN)
+tacotron2 = Tacotron2.from_hparams(source="Sulaimank/tts-tacotron2-commonvoice-single-female", savedir="tmpdir_tts")
+hifi_gan = HIFIGAN.from_hparams(source="Sulaimank/tts-hifigan-commonvoice-single-female", savedir="tmpdir_vocoder")
 # Running the TTS
+mel_output, mel_length, alignment = tacotron2.encode_text("Obwedda ndowooza wagenze.")
 # Running Vocoder (spectrogram-to-waveform)
 waveforms = hifi_gan.decode_batch(mel_output)
 torchaudio.save('example_TTS.wav',waveforms.squeeze(1), 22050)
 ```
 ### Limitations
 The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.
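
This commit drops the "Inference on GPU" note from the README. As a rough sketch of how the updated snippet could still be run on a GPU, the code below wraps the new checkpoint names in a function and passes `run_opts` to `from_hparams`, as the removed section described. The helper names `pick_run_opts` and `synthesize` are hypothetical, and it is an assumption that the `run_opts` behaviour documented for the original LJSpeech checkpoints applies unchanged to these ones.

```python
def pick_run_opts(use_gpu: bool) -> dict:
    """Build the run_opts dict handed to from_hparams (hypothetical helper)."""
    return {"device": "cuda"} if use_gpu else {"device": "cpu"}


def synthesize(text: str, out_path: str = "example_TTS.wav", use_gpu: bool = False) -> None:
    """Text -> mel spectrogram -> waveform, following the README snippet."""
    # Heavy imports live inside the function so pick_run_opts can be used
    # (and checked) without SpeechBrain installed.
    import torchaudio
    from speechbrain.inference.TTS import Tacotron2
    from speechbrain.inference.vocoders import HIFIGAN

    run_opts = pick_run_opts(use_gpu)

    # Checkpoint names are the ones introduced by this commit.
    tacotron2 = Tacotron2.from_hparams(
        source="Sulaimank/tts-tacotron2-commonvoice-single-female",
        savedir="tmpdir_tts",
        run_opts=run_opts,
    )
    hifi_gan = HIFIGAN.from_hparams(
        source="Sulaimank/tts-hifigan-commonvoice-single-female",
        savedir="tmpdir_vocoder",
        run_opts=run_opts,
    )

    # Running the TTS
    mel_output, mel_length, alignment = tacotron2.encode_text(text)

    # Running the vocoder (spectrogram-to-waveform)
    waveforms = hifi_gan.decode_batch(mel_output)

    # Move the audio back to the CPU before writing it to disk.
    torchaudio.save(out_path, waveforms.squeeze(1).cpu(), 22050)
```

Calling `synthesize("Obwedda ndowooza wagenze.", use_gpu=True)` would then write `example_TTS.wav` at the 22050 Hz sample rate used in the README, provided a CUDA device is available.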