parler-tts/parler-tts-mini-v1

ylacombe committed on Aug 7

Commit 370e6a3 • 1 Parent(s): c68d438

Update README.md

Files changed (1)

README.md +5 -5

@@ -77,11 +77,6 @@ audio_arr = generation.cpu().numpy().squeeze()
 sf.write("parler_tts_out.wav", audio_arr, model.config.sampling_rate)
 ```
-**Tips**:
-* Include the term "very clear audio" to generate the highest quality audio, and "very noisy audio" for high levels of background noise
-* Punctuation can be used to control the prosody of the generations, e.g. use commas to add small breaks in speech
-* The remaining speech features (gender, speaking rate, pitch and reverberation) can be controlled directly through the prompt
 ### 🎯 Using a specific speaker
 To ensure speaker consistency across generations, this checkpoint was also trained on 34 speakers, characterized by name (e.g. Jon, Lea, Gary, Jenna, Mike, Laura).
@@ -110,6 +105,11 @@ audio_arr = generation.cpu().numpy().squeeze()
 sf.write("parler_tts_out.wav", audio_arr, model.config.sampling_rate)
 ```
+**Tips**:
+* Include the term "very clear audio" to generate the highest quality audio, and "very noisy audio" for high levels of background noise
+* Punctuation can be used to control the prosody of the generations, e.g. use commas to add small breaks in speech
+* The remaining speech features (gender, speaking rate, pitch and reverberation) can be controlled directly through the prompt
 ## Motivation
 Parler-TTS is a reproduction of work from the paper [Natural language guidance of high-fidelity text-to-speech with synthetic annotations](https://www.text-description-to-speech.com) by Dan Lyth and Simon King, from Stability AI and Edinburgh University respectively.
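
The tips moved by this commit can be illustrated with a small sketch of how a voice description might be assembled before it is passed to the model. The helper `build_description`, its parameters, and the sentence template are hypothetical and not part of the Parler-TTS API; only the phrases "very clear audio" and "very noisy audio" and the named speaker "Jon" come from the README.

```python
def build_description(
    speaker: str = "A female speaker",   # gender, or a named speaker such as "Jon"
    rate: str = "moderate",              # speaking rate
    pitch: str = "moderate",             # pitch
    reverberation: str = "very little",  # reverberation
    clear: bool = True,                  # True -> "very clear audio", False -> "very noisy audio"
) -> str:
    """Assemble a description string from the features listed in the tips.

    The sentence template is an assumption for illustration; the README only
    states that these features can be controlled through the prompt.
    """
    quality = "very clear audio" if clear else "very noisy audio"
    return (
        f"{speaker} speaks at a {rate} pace with a {pitch} pitch. "
        f"The recording has {reverberation} reverberation and {quality}."
    )


# Description of the voice (hypothetical wording).
description = build_description(speaker="Jon", rate="slightly fast")

# Text to be spoken: the commas add small breaks in speech, per the tips.
prompt = "Hey, how are you doing today? I hope, truly, that you are well."

print(description)
```

The resulting `description` and `prompt` strings would then be tokenized and passed to the model as in the README's usage snippet.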