---
license: bsd
---

# Model Card for FALCO-TTS


This model implements a three-stage, SPEAR-TTS-like architecture that supports zero-shot and cross-lingual speech synthesis.

We trained the model on the MLS (https://openslr.org/94/) and WenetSpeech (https://openslr.org/121/) corpora, using about 20,000 hours of English and Mandarin data.

The model supports automatic code-switching.

## Model Details

|Model                            |Parameters   |Attention     |Output vocab size |
|:---                             |:---         |:---          |:---              |
|text_to_semantic                 |240 M        |Causal        |1,024             |
|semantic_to_acoustic             |370 M        |Causal        |8 × 1,024         |
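
As a rough illustration of how the table above can be read, the sketch below encodes the two listed stages as plain Python records and derives their combined parameter count and the shape of the token grid each stage would emit. This is not the project's code: `StageSpec`, `total_params_millions`, and `output_shape` are hypothetical names, and reading "8 × 1,024" as eight parallel codebooks of 1,024 entries each is an assumption, not something the card states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageSpec:
    """One row of the Model Details table (hypothetical helper type)."""
    name: str
    params_millions: int
    attention: str
    codebooks: int   # assumed: number of parallel output codebooks
    vocab_size: int  # entries per codebook


# Values copied from the table; the codebook split is an assumption.
STAGES = [
    StageSpec("text_to_semantic", 240, "causal", 1, 1024),
    StageSpec("semantic_to_acoustic", 370, "causal", 8, 1024),
]


def total_params_millions(stages):
    """Sum the parameter counts (in millions) of the given stages."""
    return sum(s.params_millions for s in stages)


def output_shape(stage, num_frames):
    """Shape of the token-id grid a stage would emit for num_frames steps."""
    return (num_frames, stage.codebooks)


if __name__ == "__main__":
    print("listed stages, total params (M):", total_params_millions(STAGES))
    for stage in STAGES:
        print(stage.name, output_shape(stage, num_frames=100))
```

Under these assumptions, the two listed stages account for 610 M parameters in total, and the second stage emits eight token ids per frame where the first emits one.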