panda0125 committed on
Commit
1b8e6c5
1 Parent(s): 10dcbd4

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +162 -66
README.md CHANGED
@@ -1,103 +1,199 @@
  ---
- license: mit
- language:
- - ko
- pipeline_tag: text-to-speech
  ---

- # MeloTTS

- MeloTTS is a **high-quality multi-lingual** text-to-speech library by [MyShell.ai](https://myshell.ai). Supported languages include:


- | Model card | Example |
- | --- | --- |
- | [English](https://huggingface.co/myshell-ai/MeloTTS-English-v2) (American) | [Link](https://myshell-public-repo-hosting.s3.amazonaws.com/myshellttsbase/examples/en/EN-US/speed_1.0/sent_000.wav) |
- | [English](https://huggingface.co/myshell-ai/MeloTTS-English-v2) (British) | [Link](https://myshell-public-repo-hosting.s3.amazonaws.com/myshellttsbase/examples/en/EN-BR/speed_1.0/sent_000.wav) |
- | [English](https://huggingface.co/myshell-ai/MeloTTS-English-v2) (Indian) | [Link](https://myshell-public-repo-hosting.s3.amazonaws.com/myshellttsbase/examples/en/EN_INDIA/speed_1.0/sent_000.wav) |
- | [English](https://huggingface.co/myshell-ai/MeloTTS-English-v2) (Australian) | [Link](https://myshell-public-repo-hosting.s3.amazonaws.com/myshellttsbase/examples/en/EN-AU/speed_1.0/sent_000.wav) |
- | [English](https://huggingface.co/myshell-ai/MeloTTS-English-v2) (Default) | [Link](https://myshell-public-repo-hosting.s3.amazonaws.com/myshellttsbase/examples/en/EN-Default/speed_1.0/sent_000.wav) |
- | [Spanish](https://huggingface.co/myshell-ai/MeloTTS-Spanish) | [Link](https://myshell-public-repo-hosting.s3.amazonaws.com/myshellttsbase/examples/es/ES/speed_1.0/sent_000.wav) |
- | [French](https://huggingface.co/myshell-ai/MeloTTS-French) | [Link](https://myshell-public-repo-hosting.s3.amazonaws.com/myshellttsbase/examples/fr/FR/speed_1.0/sent_000.wav) |
- | [Chinese](https://huggingface.co/myshell-ai/MeloTTS-Chinese) (mix EN) | [Link](https://myshell-public-repo-hosting.s3.amazonaws.com/myshellttsbase/examples/zh/ZH/speed_1.0/sent_008.wav) |
- | [Japanese](https://huggingface.co/myshell-ai/MeloTTS-Japanese) | [Link](https://myshell-public-repo-hosting.s3.amazonaws.com/myshellttsbase/examples/jp/JP/speed_1.0/sent_000.wav) |
- | [Korean](https://huggingface.co/myshell-ai/MeloTTS-Korean/) | [Link](https://myshell-public-repo-hosting.s3.amazonaws.com/myshellttsbase/examples/kr/KR/speed_1.0/sent_000.wav) |

- Some other features include:
- - The Chinese speaker supports `mixed Chinese and English`.
- - Fast enough for `CPU real-time inference`.


- ## Usage

- ### Without Installation

- An unofficial [live demo](https://huggingface.co/spaces/mrfakename/MeloTTS) is hosted on Hugging Face Spaces.

- #### Use it on MyShell

- There are hundreds of TTS models on MyShell, much more than MeloTTS. See examples [here](https://github.com/myshell-ai/MeloTTS/blob/main/docs/quick_use.md#use-melotts-without-installation).
- More can be found at the widget center of [MyShell.ai](https://app.myshell.ai/robot-workshop).

- ### Install and Use Locally

- Follow the installation steps [here](https://github.com/myshell-ai/MeloTTS/blob/main/docs/install.md#linux-and-macos-install) before using the following snippet:

- ```python
- from melo.api import TTS

- # Speed is adjustable
- speed = 1.0

- # CPU is sufficient for real-time inference.
- # You can set it manually to 'cpu' or 'cuda' or 'cuda:0' or 'mps'
- device = 'auto' # Will automatically use GPU if available

- # English
- text = "Did you ever hear a folk tale about a giant turtle?"
- model = TTS(language='EN', device=device)
- speaker_ids = model.hps.data.spk2id

- # American accent
- output_path = 'en-us.wav'
- model.tts_to_file(text, speaker_ids['EN-US'], output_path, speed=speed)

- # British accent
- output_path = 'en-br.wav'
- model.tts_to_file(text, speaker_ids['EN-BR'], output_path, speed=speed)

- # Indian accent
- output_path = 'en-india.wav'
- model.tts_to_file(text, speaker_ids['EN_INDIA'], output_path, speed=speed)

- # Australian accent
- output_path = 'en-au.wav'
- model.tts_to_file(text, speaker_ids['EN-AU'], output_path, speed=speed)

- # Default accent
- output_path = 'en-default.wav'
- model.tts_to_file(text, speaker_ids['EN-Default'], output_path, speed=speed)

- ```


- ## Join the Community

- **Open Source AI Grant**

- We are actively sponsoring open-source AI projects. The sponsorship includes GPU resources, funding, and intellectual support (collaboration with top research labs). We welcome both research and engineering projects, as long as the open-source community needs them. Please contact [Zengyi Qin](https://www.qinzy.tech/) if you are interested.

- **Contributing**

- If you find this work useful, please consider contributing to the GitHub [repo](https://github.com/myshell-ai/MeloTTS).

- - Many thanks to [@fakerybakery](https://github.com/fakerybakery) for adding the Web UI and CLI part.

- ## License

- This library is under MIT License, which means it is free for both commercial and non-commercial use.

- ## Acknowledgements

- This implementation is based on [TTS](https://github.com/coqui-ai/TTS), [VITS](https://github.com/jaywalnut310/vits), [VITS2](https://github.com/daniilrobnikov/vits2) and [Bert-VITS2](https://github.com/fishaudio/Bert-VITS2). We appreciate their awesome work.

  ---
+ library_name: transformers
+ tags: []
  ---

+ # Model Card for Model ID

+ <!-- Provide a quick summary of what the model is/does. -->



+ ## Model Details

+ ### Model Description

+ <!-- Provide a longer summary of what this model is. -->

+ This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]

+ ### Model Sources [optional]

+ <!-- Provide the basic links for the model. -->

+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]

+ ## Uses

+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

+ ### Direct Use

+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

+ [More Information Needed]

+ ### Downstream Use [optional]

+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

+ [More Information Needed]

+ ### Out-of-Scope Use

+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

+ [More Information Needed]

+ ## Bias, Risks, and Limitations

+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->

+ [More Information Needed]

+ ### Recommendations

+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

+ ## How to Get Started with the Model

+ Use the code below to get started with the model.

+ [More Information Needed]

+ ## Training Details

+ ### Training Data

+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]