Model Description

This model is an end-to-end, deep-learning-based Kinyarwanda Text-to-Speech (TTS) system. It was trained with the Coqui TTS library using the YourTTS [1] architecture.

Usage

Install the Coqui TTS library:

pip install TTS

Download the files from this repo, then run:

tts --text "text" --model_path model.pth --config_path config.json --speakers_file_path speakers.pth --speaker_wav conditioning_audio.wav --out_path out.wav

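Here, conditioning_audio.wav is a WAV file used to condition the multi-speaker TTS model through its speaker encoder. You can pass several file paths to --speaker_wav; in that case, the d-vector (speaker embedding) used for synthesis is computed as the average of the d-vectors of the individual files.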
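To illustrate what averaging the d-vectors means, here is a minimal sketch in plain Python. It is not the Coqui TTS implementation: the function name average_d_vectors and the 4-dimensional example embeddings are hypothetical, and real speaker embeddings are produced by the speaker encoder and are much larger. The sketch only shows the element-wise mean that combines several reference clips into one speaker representation.

```python
from typing import List, Sequence


def average_d_vectors(d_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Return the element-wise mean of equally sized speaker embeddings.

    Illustrative helper only (hypothetical name, not part of the TTS library).
    """
    if not d_vectors:
        raise ValueError("at least one d-vector is required")
    dim = len(d_vectors[0])
    if any(len(vec) != dim for vec in d_vectors):
        raise ValueError("all d-vectors must have the same dimension")
    count = len(d_vectors)
    # Average each coordinate across all reference clips.
    return [sum(vec[i] for vec in d_vectors) / count for i in range(dim)]


# Hypothetical embeddings standing in for two conditioning clips.
clip_a = [0.5, -1.0, 2.0, 0.0]
clip_b = [1.5, 1.0, 0.0, 4.0]

speaker_d_vector = average_d_vectors([clip_a, clip_b])
print(speaker_d_vector)
```

Averaging over several clips of the same speaker tends to smooth out clip-specific variation, which is presumably why the command accepts more than one conditioning file.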

References

[1] YourTTS paper

[2] Kinyarwanda TTS: Using a multi-speaker dataset to build a Kinyarwanda TTS model
