Spaces: Respair/Darya_TTS

Respair committed 16 days ago

Commit 797f680 · verified · 1 Parent(s): f7c483f

Update demo.py

Files changed (1)

demo.py CHANGED

@@ -317,11 +317,13 @@ with gr.Blocks() as longform:
                         outputs=[audio_longform],
                         concurrency_limit=4)
-# --- User Guide / Info Tab (Reformatted User Text) ---
-# Convert Markdown-like text to basic HTML for styling
 user_guide_html = f"""
 <div style="background-color: rgba(30, 30, 30, 0.9); color: #f0f0f0; padding: 20px; border-radius: 10px; border: 1px solid #444;">
     <h2 style="border-bottom: 1px solid #555; padding-bottom: 5px;">Quick Notes:</h2>
+    <p> This is run on a single RTX 3090. </p>
+    <p> These networks can only generate natural speech with correct intonations (i.e generating NSFW, non-speech sounds, stutters etc. doesn't work.) </p>
+    <p> I will gradually update here and -> [Github](https://github.com/Respaired/Project_Kalliope) </p>
     <p>Everything in this demo & the repo (coming soon) is experimental. The main idea is just playing around with different things to see what works when you're limited to training on a pair of RTX 3090s.</p>
     <p>The data used for the english model is rough and pretty tough for any TTS model (think debates, real conversations, plus a little bit of cleaner professional performances). It mostly comes from public sources or third parties (no TOS signed). I'll probably write a blog post later with more details.</p>
     <p>So far I focused on English and Russian, more can be covered.</p>
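
Note on the added `[Github](...)` line: Markdown link syntax is not interpreted inside a raw HTML string, so if `user_guide_html` is rendered through an HTML component (an assumption, since the rendering call is not shown in this hunk) the link would likely appear as literal text. A minimal sketch of one possible fix is below. `md_links_to_html` is a hypothetical helper, not part of `demo.py`, and it only handles simple `[label](http...)` links.

```python
import re

# Hypothetical helper (not in demo.py): convert Markdown-style
# [label](url) links into HTML anchors so they render in an HTML string.
_MD_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^\s)]+)\)")


def md_links_to_html(text: str) -> str:
    # \1 is the link label, \2 is the URL captured by the pattern above.
    return _MD_LINK.sub(r'<a href="\2" target="_blank">\1</a>', text)


# The line added in this commit, copied verbatim from the diff.
line = (
    "<p> I will gradually update here and -> "
    "[Github](https://github.com/Respaired/Project_Kalliope) </p>"
)
converted = md_links_to_html(line)
print(converted)
```

Writing the `<a href="...">` tag directly in the string would work just as well; the helper is only useful if more Markdown-style links end up in the HTML block.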