sinhprous/F5TTS-stabilized-LJSpeech

Text-to-Speech

English

sinhprous committed 10 days ago

Commit 4a5b1ab (verified) · 1 Parent(s): bce517f

Update README.md

Files changed (1)

README.md +30 -4


@@ -18,10 +18,11 @@ Source code for duration predictor: https://github.com/sinhprous/F5-TTS/blob/mai
 ## Audio samples
 Outputs from original model was generated using https://huggingface.co/spaces/mrfakename/E2-F5-TTS
-Data - driven AI systems said, "Key data is the key, data is key, data is key, data is the key, and the key to the data is key, the data key is the key to the data that is key to the key". Can you keep up?
-Original model: (skipping words)
 <audio controls>
   <source src="https://huggingface.co/sinhprous/F5TTS-stabilized-LJSpeech/resolve/main/audio_samples/sample_origin_1.wav" type="audio/mp3">
   Your browser does not support the audio element.
@@ -33,10 +34,35 @@ Finetuned model:
   Your browser does not support the audio element.
 </audio>
-Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.
-Call one two three - one two three - one two three four who call one two three - one two three - one two three four who call one two three - one two three - one two three four who call one two three - one two three - one two three four.
 ## License
 This model is released under the Creative Commons Attribution Non Commercial Share Alike 4.0 license, which allows for free usage, modification, and distribution

 ## Audio samples
 Outputs from original model was generated using https://huggingface.co/spaces/mrfakename/E2-F5-TTS
+The original model usually skips words in these hard texts..
+*Data - driven AI systems said, "Key data is the key, data is key, data is key, data is the key, and the key to the data is key, the data key is the key to the data that is key to the key". Can you keep up? *
+Original model:
 <audio controls>
   <source src="https://huggingface.co/sinhprous/F5TTS-stabilized-LJSpeech/resolve/main/audio_samples/sample_origin_1.wav" type="audio/mp3">
   Your browser does not support the audio element.
   Your browser does not support the audio element.
 </audio>
+*Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.*
+Original model:
+<audio controls>
+  <source src="https://huggingface.co/sinhprous/F5TTS-stabilized-LJSpeech/resolve/main/audio_samples/sample_origin_2.wav" type="audio/mp3">
+  Your browser does not support the audio element.
+</audio>
+Finetuned model:
+<audio controls>
+  <source src="https://huggingface.co/sinhprous/F5TTS-stabilized-LJSpeech/resolve/main/audio_samples/sample_aligned_2.wav" type="audio/mp3">
+  Your browser does not support the audio element.
+</audio>
+*Call one two three - one two three - one two three four who call one two three - one two three - one two three four who call one two three - one two three - one two three four who call one two three - one two three - one two three four.*
+Original model:
+<audio controls>
+  <source src="https://huggingface.co/sinhprous/F5TTS-stabilized-LJSpeech/resolve/main/audio_samples/sample_origin_3.wav" type="audio/mp3">
+  Your browser does not support the audio element.
+</audio>
+Finetuned model:
+<audio controls>
+  <source src="https://huggingface.co/sinhprous/F5TTS-stabilized-LJSpeech/resolve/main/audio_samples/sample_aligned_3.wav" type="audio/mp3">
+  Your browser does not support the audio element.
+</audio>
 ## License
 This model is released under the Creative Commons Attribution Non Commercial Share Alike 4.0 license, which allows for free usage, modification, and distribution