nl_NL-mls-medium voice output is weird

#13
by BramNH - opened

Are the nl_NL models verified? The 7432-low and 5809-low models sound weird, but are understandable.
The mls-medium model is just gibberish. Am I doing something wrong? I have tested them by manually installing piper in a Python venv and writing the output to .wav files, and I also installed the Docker container and tested within Home Assistant.

Rhasspy org

Those models were trained from audio books, and so they perform poorly with shorter sentences. They are also VERY sensitive to punctuation. Some things that help:

  1. Always use a period at the end of your phrase
  2. Use 0.333 for noise-scale and noise-scale-w
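
As a rough illustration of the two tips above, here is a minimal Python sketch that adds a trailing period when one is missing and builds a Piper command line with both noise settings at 0.333. The helper names are made up for this example, and it assumes a `piper` executable on your PATH that reads text from stdin and accepts `--model`, `--output_file`, `--noise_scale` and `--noise_w`; check `piper --help` for your install, since flag spelling can differ between builds.

```python
def normalize_phrase(text: str) -> str:
    """Strip whitespace and make sure the phrase ends with sentence punctuation."""
    text = text.strip()
    if text and text[-1] not in ".!?":
        text += "."
    return text


def build_piper_command(model_path: str, output_path: str,
                        noise_scale: float = 0.333,
                        noise_w: float = 0.333) -> list[str]:
    """Build an argument list for the piper CLI (flag names are assumptions)."""
    return [
        "piper",
        "--model", model_path,
        "--output_file", output_path,
        "--noise_scale", str(noise_scale),
        "--noise_w", str(noise_w),
    ]


# Example usage (not run here, needs piper and a downloaded voice):
#   import subprocess
#   text = normalize_phrase("Hallo, hoe gaat het")
#   cmd = build_piper_command("nl_NL-mls-medium.onnx", "out.wav")
#   subprocess.run(cmd, input=text.encode("utf-8"), check=True)
```

The model and output file names in the usage comment are placeholders; substitute the paths from your own setup.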

In the Piper sample generator, we have a method for correcting the short sentence problem that hasn't made it into Piper itself yet. We basically just repeat the phrase over and over, and then pull out the audio of the last spoken instance.
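
To make the idea concrete, here is a rough Python sketch of the repeat-and-extract trick. It is not the sample generator's actual code: the function names, the silence threshold and the minimum gap length are all assumptions picked for illustration, and it works on a plain list of integer PCM samples so it has no dependencies.

```python
def repeat_phrase(phrase: str, times: int = 3) -> str:
    """Repeat a phrase several times, each copy ending in punctuation."""
    phrase = phrase.strip()
    if phrase and phrase[-1] not in ".!?":
        phrase += "."
    return " ".join([phrase] * times)


def last_spoken_segment(samples, threshold=500, min_gap=2000):
    """Return the audio after the last silent gap of at least min_gap samples.

    A sample counts as silent when its absolute value is <= threshold.
    Both values are guesses and would need tuning per voice and sample rate.
    """
    # Ignore trailing silence at the very end of the audio.
    end = len(samples)
    while end > 0 and abs(samples[end - 1]) <= threshold:
        end -= 1

    # Walk backwards until a long enough run of quiet samples is found.
    quiet = 0
    i = end - 1
    while i >= 0:
        if abs(samples[i]) <= threshold:
            quiet += 1
            if quiet >= min_gap:
                # The last spoken instance starts right after this quiet run.
                return samples[i + quiet:end]
        else:
            quiet = 0
        i -= 1

    # No gap found: return everything up to the trailing silence.
    return samples[:end]
```

You would synthesize `repeat_phrase("Hallo")`, load the resulting samples, and keep only what `last_spoken_segment` returns. A pause inside the phrase itself (for example at a comma) could be mistaken for the gap between repetitions, so `min_gap` has to be longer than any such pause.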

What we really need is more Dutch audio datasets with people reading specific Dutch phrases.

I only tested with shorter sentences. Longer sentences do indeed produce something understandable, but very slow.

I assume that retraining the model will not make it perfect and that more Dutch audio datasets are simply required. Could you provide links to where I can help provide these datasets?

For now I will stick to the Belgian Dutch models, those are working fine!

Rhasspy org

If you're interested in contributing (or know someone who is), send me an e-mail at [email protected] and I can get you a login code for the contribution website. Another option is to install Piper Recording Studio locally and record a dataset.

Thanks!

Just ran into the same issue. I've tried them with 0.333 for both noise-scale and noise-scale-w (and verified that the settings are actually being applied!) but no luck. It doesn't produce anything that sounds like Dutch (or any language at all), except for the very long sample sentence.

I will also stick to the nl_BE voices for now, which are working great. Sadly, I don't have any datasets to contribute for nl_NL. I hope to see a working nl_NL voice in the future.