sriting committed
Commit 184b86a · 1 Parent(s): ee03a71

feat: update link in tech report

Files changed (1)
  1. index.html +13 -9
index.html CHANGED
@@ -10,7 +10,7 @@
  <meta name="keywords" content="latex.css,css library,class-less css,latex css" />
  <meta property="og:title"
  content="MiniMax-Speech Tech Report | Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder" />
- <meta property="og:url" content="https://huggingface.co/spaces/MiniMaxAI/MiniMax-Speech-Tech-Report" />
+ <meta property="og:url" content="https://minimax-ai.github.io/tts_tech_report" />
  <meta property="og:description"
  content=" MiniMax-Speech, an autoregressive Transformer-based Text-to-Speech (TTS) model that generates high-quality speech" />
  <meta property="og:type" content="website" />
@@ -28,9 +28,11 @@
  Encoder</h4>
  <p class="author">
  MiniMax Team <span class="date">May 2025</span><br />
- <a style="font-size: 1.1rem;" target="_blank"
- href="https://huggingface.co/spaces/MiniMaxAI/MiniMax-Speech-Tech-Report/blob/main/MiniMax_Speech.pdf">[Tech
+ <a style="font-size: 1.1rem;" target="_blank" href="https://arxiv.org/abs/2505.07916">[Tech
  Report]</a>
+ <a style="font-size: 1.1rem; margin-left: 1rem;" target="_blank"
+ href="https://huggingface.co/datasets/MiniMaxAI/TTS-Multilingual-Test-Set">[Multilingual Test Set]</a>
+ <a style="font-size: 1.1rem; margin-left: 1rem;" target="_blank" href="https://github.com/MiniMax-AI">[GitHub]</a>
  </p>
  </header>
  
@@ -57,13 +59,16 @@
  control
  via LoRA; text to voice (T2V) by synthesizing timbre features directly from text description; and professional
  voice
- cloning (PVC) by fine-tuning timbre features with additional data. Welcome to visit
- <a href="https://www.minimax.io/audio">MiniMax Audio</a> and
- explore our powerful TTS features.
+ cloning (PVC) by fine-tuning timbre features with additional data.
  </p>
  </div>
  
  <nav role="navigation" class="toc">
+ <h2>Explore MiniMax-Speech</h2>
+ <p>Welcome to visit
+ <a href="https://www.minimax.io/audio">MiniMax Audio</a> and
+ explore our powerful TTS features.
+ </p>
  <h2>Contents</h2>
  <ol>
  <li>
@@ -232,9 +237,8 @@
  features based
  on the text content, whereas OneShot adheres more strictly to the speaker characteristics (prosody, speech
  rate,
- emotions, etc.) demonstrated in the audio prompt (The additional input that OneShot has compared to ZeroShot,
- see
- technical report for details).
+ emotions, etc.). For details of Zero-Shot and One-Shot, refer to the <a
+ href="https://arxiv.org/abs/2505.07916" target="_blank">technical report</a>.
  </p>
  <div class="scroll-wrapper" style="margin-top: 2rem;">
  <table style="width: 100%;">