zonos-longform-unleashed

benjamin-paine committed on Feb 16

Commit

d0d9cc8

1 Parent(s): 4cef777

Update app.py

Files changed (1)

app.py

@@ -24,9 +24,8 @@ State of the art text-to-speech model [[model]](https://huggingface.co/collectio
 ## Unleashed
 Use this space to generate long-form speech up to around ~2 minutes in length. To generate an unlimited length, clone this space and run it locally.
 ### Tips
-- If you are generating more than one chunk of audio, you should supply speaker conditioning. Otherwise, each chunk will have a slightly different voice.
 - When providing prefix audio, include the text of the prefix audio in your speech text to ensure a smooth transition.
-- The cleaner the speaker audio, the better the speaker conditioning will be - however, speaker audio is only sampled at 16kHz, so you do not need to provide high-bitrate speaker audio. Unlike this, however, prefix audio should be high-quality, as it is sampled at the full 44.1kHz.
 - The appropriate range of Speaking Rate and Pitch STD are highly dependent on the speaker audio. Start with the defaults and adjust as needed.
 - Emotion sliders do not completely function intuitively, and require some experimentation to get the desired effect.
 """.strip()
