rasenganai committed on
Commit
edaade8
1 Parent(s): 7ad668d

Update README.md

Files changed (1)
  1. README.md +114 -1
README.md CHANGED
@@ -1 +1,114 @@
- # MahaTTS
+ <div align="center">
+
+ <h1>MahaTTS: An Open-Source Large Speech Generation Model in the making</h1>
+ a Dubverse Black initiative <br> <br>
+
+ [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1-eOQqznKWwAfMdusJ_LDtDhjIyAlSMrG?usp=sharing)
+
+ </div>
+
+ ------
+
+ ## Description
+ MahaTTS (Maha means 'Great' in Sanskrit) is a speech generation model inspired by tortoise-tts, except that it uses Seamless M4T wav2vec2 to extract semantic tokens.
+ Because Seamless M4T wav2vec2 is trained on multilingual data, this model is easier to scale to multilingual data.
+
+ <img width="993" alt="Screenshot 2023-11-19 at 11 53 52 PM" src="https://github.com/dubverse-ai/MahaTTS/assets/32906806/7429d3b6-3f19-4bd8-9005-ff9e16a698f8">
+
+
+ ## Features
+ 1. Multilinguality
+ 2. Realistic prosody and intonation
+ 3. Multi-voice capabilities
+
+ ## Current Progress
+ 'Smolie': a model trained on 200 hours of LibriTTS
+
+ ## Installation
+ ```bash
+ pip install git+https://github.com/dubverse-ai/MahaTTS.git
+ ```
+
+ ```bash
+ pip install maha-tts
+ ```
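Review note: a quick post-install check, sketched in Python. It asks the standard library's `importlib.metadata` whether a distribution named `maha-tts` (the name taken from the `pip install maha-tts` line above) is present in the current environment. The helper name `installed_version` is made up for illustration, and it is an assumption that the git-based install registers under the same distribution name.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "maha-tts") -> Optional[str]:
    """Return the installed version of `dist_name`, or None if it is not installed.

    `dist_name` defaults to the name used in the README's pip command; whether
    the git install reports the same name is an assumption.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("maha-tts is not installed in this environment")
    else:
        print(f"maha-tts {version} is installed")
```

This only confirms that pip recorded the package; it does not load any model weights or exercise the TTS code.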
+ ## Roadmap
+ - [x] Smolie - eng
+ - [ ] Smolie - indic
+ - [ ] Optimizations for inference
+
+ ## Some Generated Samples
+ Input text:
+ 0 -> "I seriously laughed so much hahahaha (seals with headphones...) and appreciate both the interviewer and the subject. Major respect for two extraordinary humans - and in this time of gratefulness, I'm thankful for you both and this forum!"
+
+ 1 -> "I freakin love how Elon came to life the moment they started talking about gaming and specifically diablo, you can tell that he didn't want that part of the discussion to end, while Lex to move on to the next subject! Once a true gamer, always a true gamer!"
+
+ 2 -> "hello there! how are you?" (This one didn't work well; the M1 model hallucinated)
+
+ 3 -> "Who doesn't love a good scary story, something to send a chill across your skin in the middle of summer's heat or really, any other time? And this year, we're celebrating the two hundredth birthday of one of the most famous scary stories of all time: Frankenstein."
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/66fc7a08-3e8a-4d63-a3fa-88bc705a172a
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/5acf5a4b-aeb8-4f14-94fe-45811868a886
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/0af2ce6e-4172-4aac-9322-4fd545f1d4ac
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/2d5b0335-d1fc-473a-aea8-c5bb6afbce27
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/a63ba39f-a261-4fe6-8d06-a172a993acc1
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/4355f633-9b27-4290-a284-96d650f5f4b8
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/7c93d81e-02bc-4819-a97b-d48e39ec5689
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/63456535-0b38-429a-a8a0-686cfb6a92c5
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/960aa78c-888f-4f0b-a380-145a87f65a99
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/5027f0eb-3601-468b-9dda-6b436b774741
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/266285e0-a8f3-4784-81dc-f98b0a9c9373
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/68ba18d6-430b-41e7-84e5-e15990064836
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/0f7321a7-efb1-407c-8b8c-69e812865739
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/dcedffe6-d81b-4eff-95c0-cbd00279fdb7
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/8050db3e-7acb-44be-a039-7e0b9e6a9905
+
+
+
+ https://github.com/dubverse-ai/MahaTTS/assets/32906806/6486af1c-2e14-420b-8419-bf5e01fe49a5
+
+