---
Model Type: Text to Speech
Supported Languages: Assamese, Bengali, Bodo, Gujarati, Hindi, Kannada, Malayalam, Manipuri, Marathi, Odia, Punjabi, Rajasthani, Tamil, Telugu, Urdu
---
***Demo: [IITM-TTS Demo](https://iitm-tts.onrender.com) | This may take approximately 30 seconds to load the first time and will go idle after 15 minutes of inactivity.***

# Fastspeech2_HS_Flask_API

This repository contains the Flask API implementation of the Text to Speech Model developed by the Speech Lab at IIT Madras. 
For a comprehensive understanding of the models and inference details, please consult the original repository 
[Fastspeech2_HS](https://github.com/smtiitm/Fastspeech2_HS).

### Table of Contents
- [Setup](#setup)
- [Installation](#installation)
- [Run Flask server](#run-flask-server)
- [Citation for the original repo](#citation-for-the-original-repo)

### Setup
Some of the large files in this repository are stored with Git LFS.
Install Git LFS before cloning by running the following commands:

```
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
git lfs install
```

The entire repository, including the models, has been uploaded to Hugging Face 
"[Fastspeech2_HS_Flask_API](https://huggingface.co/k-m-irfan/Fastspeech2_HS_Flask_API)" due to size restrictions on GitHub for Git LFS. 
To clone the repository from Hugging Face, please use the following command:

```
git clone https://huggingface.co/k-m-irfan/Fastspeech2_HS_Flask_API
```
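
If Git LFS was not installed before cloning, the large files may be present only as small text pointers. The following is a minimal sketch for spotting such files; it relies on the fact that a Git LFS pointer file begins with `version https://git-lfs.github.com/spec/v1`. The helper names and the clone directory name are assumptions made for illustration and are not part of this repository.

```python
from pathlib import Path

# First line of every Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if the file looks like an unfetched Git LFS pointer."""
    with path.open("rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(repo_dir: str) -> list[Path]:
    """List files under repo_dir (outside .git) that are still LFS pointers."""
    root = Path(repo_dir)
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file()
        and ".git" not in p.relative_to(root).parts
        and is_lfs_pointer(p)
    )


if __name__ == "__main__":
    # Assumed clone directory name, taken from the clone URL above.
    for pointer in find_lfs_pointers("Fastspeech2_HS_Flask_API"):
        print(f"not downloaded yet: {pointer}")
```

If any files are listed, running `git lfs pull` inside the cloned directory should fetch the real contents.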

Alternatively, you can download the models from the original repository [Fastspeech2_HS](https://github.com/smtiitm/Fastspeech2_HS) 
and organize the folder structure as specified below. Skip this step if you have already cloned the repository from Hugging Face.

```
models
├── hindi
│   ├── female
│   └── male
├── tamil
│   ├── female
│   └── male
.
.
.
└── marathi
    β”œβ”€β”€ female
    └── male
```
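
When arranging the models by hand, a quick layout check can catch misplaced folders. This is a rough sketch that assumes every language folder is meant to contain both a `female` and a `male` subfolder, as in the tree above; the function name is hypothetical, and some languages may legitimately ship only one voice.

```python
from pathlib import Path

# Voice folders expected under each language, per the layout shown above.
EXPECTED_VOICES = ("female", "male")


def missing_voice_dirs(models_dir: str) -> dict[str, list[str]]:
    """Map each language folder to the voice subfolders it lacks."""
    problems = {}
    for language_dir in sorted(Path(models_dir).iterdir()):
        if not language_dir.is_dir():
            continue
        missing = [v for v in EXPECTED_VOICES if not (language_dir / v).is_dir()]
        if missing:
            problems[language_dir.name] = missing
    return problems


if __name__ == "__main__":
    # Run from the repository root; does nothing if "models" is absent.
    if Path("models").is_dir():
        for language, voices in missing_voice_dirs("models").items():
            print(f"{language}: missing {', '.join(voices)}")
```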

### Installation

Create a virtual environment and activate it:
```
python3 -m venv tts-hs-hifigan
source tts-hs-hifigan/bin/activate
```

Install the required dependencies by running:
```
pip install -r requirements.txt
```

### Run Flask server
Ensure the server application is running correctly before proceeding. Use the following commands and check for any errors:
```
python3 flask_app.py
# OR
gunicorn -w 2 -b 0.0.0.0:5000 flask_app:app --timeout 600
```
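
One way to confirm that the server came up is to test whether something accepts TCP connections on the port. The sketch below assumes the server is reachable on `127.0.0.1` at port `5000`, the port used in the `gunicorn` command above; the function name is hypothetical. It only checks that the port is open and says nothing about whether the models loaded correctly.

```python
import socket


def server_is_listening(
    host: str = "127.0.0.1", port: int = 5000, timeout: float = 2.0
) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    state = "listening" if server_is_listening() else "not reachable"
    print(f"Server on port 5000 is {state}")
```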

If the application is running without any issues, proceed to start the server using the following command:
```
bash start.sh
```

### Citation for the original repo
If you use this Fastspeech2 Model in your research or work, please consider citing:

“
COPYRIGHT
2023, Speech Technology Consortium,
Bhashini, MeiTY and by Hema A Murthy & S Umesh,
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
and
ELECTRICAL ENGINEERING,
IIT MADRAS. ALL RIGHTS RESERVED ”



Shield: [![CC BY 4.0][cc-by-shield]][cc-by]

This work is licensed under a
[Creative Commons Attribution 4.0 International License][cc-by].

[![CC BY 4.0][cc-by-image]][cc-by]

[cc-by]: http://creativecommons.org/licenses/by/4.0/
[cc-by-image]: https://i.creativecommons.org/l/by/4.0/88x31.png
[cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg