Babyloncoder committed on
Commit 60721d2 · verified · 1 Parent(s): 3e3ddd8

Update README.md

Files changed (1)
  1. README.md +16 -1
README.md CHANGED
@@ -9,5 +9,20 @@ app_file: app.py
  pinned: false
  license: mit
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ 1. Libraries and Tools Used:
+ - Transformers: Provides `VitsModel` and `AutoTokenizer`, used here with `facebook/mms-tts-eng`, an English text-to-speech model published by Facebook.
+ - Torch: PyTorch, the tensor library that Transformers uses to run the speech model.
+ - Librosa: An audio-processing library, used here to adjust the pitch of the speech.
+ - Soundfile: Saves the speech output as an audio file.
+ - Tempfile: Creates temporary files for intermediate storage during processing.
+ - Gradio: Builds the web interface for the text-to-speech application.
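
As a rough illustration of how Transformers and Torch fit together here, the sketch below loads `facebook/mms-tts-eng` through `AutoTokenizer` and `VitsModel` and returns a waveform with its sampling rate. The helper name `synthesize`, the `MODEL_ID` constant, and the lazy imports are assumptions made for this example; they are not taken from the Space's `app.py`.

```python
MODEL_ID = "facebook/mms-tts-eng"  # checkpoint named in the README


def synthesize(text: str, model_id: str = MODEL_ID):
    """Return (waveform, sampling_rate) for `text`.

    Hypothetical helper for illustration; the real app may be organized differently.
    """
    # Imported lazily so that merely defining the helper does not require
    # the heavy dependencies to be installed.
    import torch
    from transformers import AutoTokenizer, VitsModel

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = VitsModel.from_pretrained(model_id)

    # Tokenize the text into PyTorch tensors for the model.
    inputs = tokenizer(text, return_tensors="pt")

    # Inference only, so gradients are not needed.
    with torch.no_grad():
        waveform = model(**inputs).waveform  # shape: (batch, samples)

    # Drop the batch dimension and convert to a NumPy array for audio tools.
    return waveform.squeeze().cpu().numpy(), model.config.sampling_rate
```

Calling `synthesize("Hello there")` downloads the checkpoint on first use, so it needs network access and may take a while the first time.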
 
+ 2. Pipeline for Text-to-Speech Conversion:
+ - Text Input: Type the text you want converted into speech.
+ - Tokenization: `AutoTokenizer` converts the text into the token IDs the speech model expects.
+ - Speech Synthesis: `VitsModel`, loaded with the `facebook/mms-tts-eng` checkpoint, takes the tokenized text and generates the speech waveform.
+ - Pitch Adjustment: A pitch value of 0 is the normal, unaltered pitch. This is the default, and the voice sounds as the model produced it.
+   - Negative Pitch Values: Make the voice sound higher, like moving up the notes on a piano, giving a higher, perhaps more youthful or feminine tone.
+   - Positive Pitch Values: Make the voice sound lower, like moving down the notes on a piano, giving a deeper, more resonant tone, often associated with a more masculine or mature voice.
+ - Saving Audio: The pitch-adjusted speech is written to a temporary audio file using `soundfile` and `tempfile`.
+ - Interactive Web Interface: Gradio provides an interface where you enter text, adjust the pitch with a slider, and listen to the speech output.
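
The pitch-adjustment and saving steps above can be sketched as follows. Note that `librosa.effects.pitch_shift` raises the pitch for positive `n_steps`, so to reproduce the slider convention described in this README (negative is higher, positive is lower) the sketch negates the slider value. That sign flip, the assumption that the slider is measured in semitones, and the helper names `slider_to_n_steps`, `frequency_ratio`, and `shift_and_save` are all illustrative guesses, not confirmed details of `app.py`.

```python
import tempfile


def slider_to_n_steps(pitch_value: float) -> float:
    """Map the README's slider value to librosa's `n_steps`.

    Assumption: the slider is in semitones and its sign is the opposite of
    librosa's (README: negative = higher; librosa: positive = higher).
    """
    return -float(pitch_value)


def frequency_ratio(pitch_value: float) -> float:
    """Frequency multiplier implied by a slider value (12 semitones per octave)."""
    return 2.0 ** (slider_to_n_steps(pitch_value) / 12.0)


def shift_and_save(waveform, sample_rate: int, pitch_value: float) -> str:
    """Pitch-shift a 1-D waveform and write it to a temporary WAV file.

    Hypothetical helper; returns the path of the written file.
    """
    # Imported lazily so the pure helpers above work without audio libraries.
    import librosa
    import soundfile as sf

    shifted = librosa.effects.pitch_shift(
        waveform, sr=sample_rate, n_steps=slider_to_n_steps(pitch_value)
    )

    # delete=False keeps the file on disk after the handle is closed,
    # so a web UI can serve it afterwards.
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        path = tmp.name

    sf.write(path, shifted, sample_rate)
    return path
```

Under these assumptions, a slider value of -12 doubles every frequency (one octave up) and +12 halves it (one octave down). The returned file path is the kind of value a Gradio audio output can play back.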