update: some edits
README.md
CHANGED
@@ -26,7 +26,7 @@ Leveraging large-scale unlabeled speech and text data, we pre-train SpeechT5 to
 Extensive evaluations show the superiority of the proposed SpeechT5 framework on a wide variety of spoken language processing tasks, including automatic speech recognition, speech synthesis, speech translation, voice conversion, speech enhancement, and speaker identification.

 - **Developed by:** Junyi Ao, Rui Wang, Long Zhou, Chengyi Wang, Shuo Ren, Yu Wu, Shujie Liu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei.
-- **Shared by [optional]:**
+- **Shared by [optional]:** Matthijs Hollemans
 - **Model type:** text-to-speech
 - **Language(s) (NLP):** [More Information Needed]
 - **License:** [MIT](https://github.com/microsoft/SpeechT5/blob/main/LICENSE
@@ -84,6 +84,7 @@ Use the code below to convert text into a mono 16 kHz speech waveform.

 ```python
 from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan
+from datasets import load_dataset
 import torch
 import soundfile as sf
