microsoft/speecht5_tts

Ezi committed on Feb 28, 2023

Commit e0cc994 · 1 Parent(s): bc779c1

update: some edits

Files changed (1)

README.md +2 -1

@@ -26,7 +26,7 @@ Leveraging large-scale unlabeled speech and text data, we pre-train SpeechT5 to
 Extensive evaluations show the superiority of the proposed SpeechT5 framework on a wide variety of spoken language processing tasks, including automatic speech recognition, speech synthesis, speech translation, voice conversion, speech enhancement, and speaker identification.
 - **Developed by:** Junyi Ao, Rui Wang, Long Zhou, Chengyi Wang, Shuo Ren, Yu Wu, Shujie Liu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei.
-- **Shared by [optional]:** Mathijs Hollemans
+- **Shared by [optional]:** Matthijs Hollemans
 - **Model type:** text-to-speech
 - **Language(s) (NLP):** [More Information Needed]
 - **License:** [MIT](https://github.com/microsoft/SpeechT5/blob/main/LICENSE
@@ -84,6 +84,7 @@ Use the code below to convert text into a mono 16 kHz speech waveform.
 ```python
 from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan
+from datasets import load_dataset
 import torch
 import soundfile as sf