<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width" />
  <title>Next-gen Kaldi WebAssembly with sherpa-onnx for Text-to-speech</title>
  <style>
    h1,div {
      text-align: center;
    }
    textarea {
      width:100%;
    }
    .loading {
      display: none !important;
    }
  </style>
</head>

<body style="font-family: 'Source Sans Pro', sans-serif; background-color: #f9fafb; color: #333; display: flex; flex-direction: column; align-items: center; height: 100vh; margin: 0;">
  <h1>
    Next-gen Kaldi + WebAssembly<br/>
    Text-to-speech Demo with <a href="https://github.com/k2-fsa/sherpa-onnx">sherpa-onnx</a>
  </h1>

  <div style="width: 100%; max-width: 900px; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); flex: 1;">
    <div id="status">Loading...</div>

    <div id="singleAudioContent" class="tab-content loading">
      <label for="speakerId" id="speakerIdLabel">Speaker ID: </label>
      <input type="text" id="speakerId" name="speakerId" value="0" />
      <br/>
      <br/>
      <label for="speed" id="speedLabel">Speed: </label>
      <input type="range" id="speed" name="speed" min="0.4" max="3.5" step="0.1" value="1.0" />
      <span id="speedValue"></span>
      <br/>
      <br/>
      <textarea id="text" rows="10" placeholder="Please enter your text here and click the Generate button"></textarea>
      <br/>
      <br/>
      <button id="generateBtn" disabled>Generate</button>
    </div>

    <section id="sound-clips" style="flex: 1; overflow: auto;">
    </section>
  </div>

  <!-- Footer Section -->
  <div style="width: 100%; max-width: 900px; margin-top: 1.5rem; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); text-align: left; font-size: 0.9rem; color: #6c757d;">
    <h3>Description</h3>
    <ul>
      <li>Everything is <strong>open source</strong>: see the <a href="https://github.com/k2-fsa/sherpa-onnx">code</a>.</li>
      <li>If you have any issues, please either <a href="https://github.com/k2-fsa/sherpa-onnx/issues">file a ticket</a> or contact us via:
        <ul>
          <li><a href="https://k2-fsa.github.io/sherpa/social-groups.html#wechat">WeChat group</a></li>
          <li><a href="https://k2-fsa.github.io/sherpa/social-groups.html#qq">QQ group</a></li>
          <li><a href="https://k2-fsa.github.io/sherpa/social-groups.html#bilibili-b">Bilibili</a></li>
        </ul>
      </li>
    </ul>
    <h3>About This Demo</h3>
    <ul>
      <li><strong>Private and Secure:</strong> All processing runs locally in your browser on a single CPU thread. No server is involved, so your text never leaves your device. You can disconnect from the Internet once this page has loaded.</li>
      <li><strong>Efficient Resource Usage:</strong> No GPU is required, so the GPU stays free for other in-browser workloads such as WebLLM.</li>
    </ul>
    <h3>Latest Update</h3>
    <ul>
      <li>Update UI.</li>
      <li>First working version.</li>
    </ul>

    <h3>Acknowledgement</h3>
    <ul>
      <li>The UI is based on <a href="https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm">https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm</a>.</li>
    </ul>
  </div>


  <script src="app-tts.js"></script>
  <script src="sherpa-onnx-tts.js"></script>
  <script src="sherpa-onnx-wasm-main-tts.js"></script>
</body>
</html>