Update README.md
README.md
CHANGED
@@ -9,4 +9,135 @@ app_file: app.py
pinned: false
---

# Video Dubber

A program that automatically dubs any video file into many languages.

This Python script extracts the audio from a video file, transcribes it,
translates it into a different language, generates new audio from the
translated text, and then merges that audio with the original video.
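
The first and last steps of that pipeline (pulling the audio out and putting the dubbed track back) can be sketched with the FFmpeg command line alone. This is an illustration, not the script's own code: the project lists MoviePy for audio extraction, and the function names and the 16 kHz mono setting below are assumptions made for the example.

```python
from typing import List


def extract_audio_cmd(video: str, wav_out: str) -> List[str]:
    """Build an FFmpeg command that saves the video's audio as a mono WAV file."""
    return [
        "ffmpeg", "-y",          # overwrite the output file if it already exists
        "-i", video,             # input video
        "-vn",                   # ignore the video stream
        "-acodec", "pcm_s16le",  # uncompressed 16-bit PCM
        "-ar", "16000",          # 16 kHz sample rate (assumed, suits speech recognition)
        "-ac", "1",              # single (mono) channel
        wav_out,
    ]


def merge_audio_cmd(video: str, dubbed_wav: str, video_out: str) -> List[str]:
    """Build an FFmpeg command that swaps the audio track for the dubbed one."""
    return [
        "ffmpeg", "-y",
        "-i", video,        # input 0: original video
        "-i", dubbed_wav,   # input 1: translated audio
        "-map", "0:v:0",    # take the video stream from input 0
        "-map", "1:a:0",    # take the audio stream from input 1
        "-c:v", "copy",     # keep the video stream as-is, without re-encoding
        "-shortest",        # stop at the end of the shorter stream
        video_out,
    ]
```

Each function only returns an argument list; passing it to `subprocess.run(cmd, check=True)` would execute it, provided FFmpeg is installed.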

## Prerequisites

- Python 3.8 or higher
- [FFmpeg](https://ffmpeg.org/download.html)
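
Both prerequisites can be checked before installing anything else. The helper below is a small sketch and is not part of the repository; `missing_prerequisites` is a hypothetical name, and the check assumes the FFmpeg executable is called `ffmpeg` and is expected on the `PATH`.

```python
import shutil
import sys
from typing import List, Tuple


def missing_prerequisites(min_python: Tuple[int, int] = (3, 8)) -> List[str]:
    """Return a human-readable list of prerequisites that are not satisfied."""
    problems = []
    # sys.version_info compares element-wise against a (major, minor) tuple.
    if sys.version_info < min_python:
        problems.append(
            "Python %d.%d or higher is required" % min_python
        )
    # shutil.which returns None when the executable is not found on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("Missing prerequisite:", problem)
```

An empty list means both checks passed.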

## Technologies Used

- [Google Cloud Text-to-Speech
  API](https://cloud.google.com/text-to-speech): Used to generate the
  audio for the translated text.
- [Google Cloud Translate API](https://cloud.google.com/translate):
  Used to translate the transcribed text into a different language.
- [Whisper ASR](https://www.openai.com/research/whisper/): Used to
  transcribe the audio from the video file.
- [spaCy](https://spacy.io/): Used for natural language processing
  tasks, such as tokenization and syllable counting.
- [PyDub](http://pydub.com/): Used for manipulating audio files.
- [MoviePy](https://zulko.github.io/moviepy/): Used for extracting the
  audio from the video file.

## Installation

1. Clone this repository:

       git clone https://github.com/am-sokolov/videodubber.git

2. Install the required Python packages:

       pip install -r requirements.txt

## Google Cloud Credentials

This script uses Google Cloud's Text-to-Speech and Translate APIs, which
require authentication. Follow these steps to get your credentials:

1. Create a new project in the [Google Cloud
   Console](https://console.cloud.google.com/).
2. Enable the
   [Text-to-Speech](https://cloud.google.com/text-to-speech/docs/quickstart-client-libraries)
   and [Translate](https://cloud.google.com/translate/docs/setup) APIs
   for your project.
3. Create a new service account for your project on the [Service
   Accounts](https://console.cloud.google.com/iam-admin/serviceaccounts)
   page.
4. Create a new JSON key for your service account and download it.
   This is your credentials file.
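
Before running the script, it can help to confirm that the downloaded file really is a service account key. The sketch below does that and then exports the path through `GOOGLE_APPLICATION_CREDENTIALS`, the environment variable that Google Cloud client libraries read by default. Whether `main.py` relies on that variable is not stated here, so treat this as an independent sanity check; `use_service_account_key` is a hypothetical helper name.

```python
import json
import os
from pathlib import Path

# Fields that a service account JSON key is expected to contain.
REQUIRED_KEYS = ("type", "project_id", "private_key", "client_email")


def use_service_account_key(path: str) -> str:
    """Validate a service account key file and export its path.

    Returns the project ID stored in the key file.
    """
    key_path = Path(path).expanduser().resolve()
    with key_path.open(encoding="utf-8") as fh:
        key = json.load(fh)

    missing = [name for name in REQUIRED_KEYS if name not in key]
    if missing:
        raise ValueError("key file is missing fields: " + ", ".join(missing))
    if key["type"] != "service_account":
        raise ValueError("expected a service account key, got: " + str(key["type"]))

    # Google Cloud client libraries pick up this variable automatically.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(key_path)
    return key["project_id"]
```

A `ValueError` here usually means the wrong file was downloaded, for example an OAuth client secret instead of a service account key.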

## Usage

Run the script with the following command:

    python main.py --input <path_to_video_file> --voice <target_voice> --credentials <path_to_credentials_file> --source_language <source_language>

- `<path_to_video_file>`: Path to the source video file.

- `<target_voice>`: Name of the target dubbing voice, chosen from the
  [Google Cloud Text-to-Speech
  Voices](https://cloud.google.com/text-to-speech/docs/voices).
  The default is "es-US-Neural2-B". Recommended voices are:

  - English: "en-US-Neural2-J"
  - Spanish: "es-US-Neural2-B"
  - German: "de-DE-Neural2-D"
  - Italian: "it-IT-Neural2-C"
  - French: "fr-FR-Neural2-D"
  - Russian: "ru-RU-Wavenet-D"
  - Hindi: "hi-IN-Neural2-B"

  Feel free to use any other voice from that list.

- `<path_to_credentials_file>`: Path to the Google Cloud credentials
  JSON file.

- `<source_language>`: Source language, e.g. "english".

Currently, the fully supported source languages are: English, German, French,
Italian, Catalan, Chinese, Croatian, Danish, Dutch, Finnish, Greek,
Japanese, Korean, Lithuanian, Macedonian, Polish, Portuguese, Romanian,
Russian, Spanish, Swedish, Ukrainian.
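
For readers who want to see how such a command line maps onto code, here is a minimal `argparse` sketch of the four flags. It is a guess at the shape of the interface, not a copy of `main.py`: only the flag names and the default voice come from this README, while the choice of which flags are required and the lowercasing of the source language are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the four flags described in the usage section."""
    parser = argparse.ArgumentParser(description="Dub a video into another language.")
    parser.add_argument("--input", required=True,
                        help="path to the source video file")
    parser.add_argument("--voice", default="es-US-Neural2-B",
                        help="Google Cloud Text-to-Speech voice name")
    parser.add_argument("--credentials", required=True,
                        help="path to the Google Cloud credentials JSON file")
    # Assumption: the language is required and compared case-insensitively.
    parser.add_argument("--source_language", required=True, type=str.lower,
                        help='source language, e.g. "english"')
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args([
    "--input", "trump_speech.mp4",
    "--credentials", "path_to_credentials.json",
    "--source_language", "English",
])
print(args.input, args.voice, args.source_language)
```

Because `--voice` is omitted in the demonstration call, the parser falls back to the default voice.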

## Output

The script creates a new video file with the same name as the input
video file, but with "\_translated" appended to the name. This file
contains the original video with the new translated audio track.
Additionally, the script creates a `.wav` audio file, with the same
name as the input video file, that contains only the translated audio.
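
The naming rule can be expressed in a few lines with `pathlib`. This is a sketch of the rule as described, assuming the outputs keep the input's directory and the video keeps its original extension; `output_paths` is a hypothetical helper, not a function from the script.

```python
from pathlib import Path
from typing import Tuple


def output_paths(video: str) -> Tuple[Path, Path]:
    """Return (dubbed video path, translated audio path) for an input video."""
    src = Path(video)
    # Append "_translated" to the file name, keeping the extension.
    dubbed_video = src.with_name(src.stem + "_translated" + src.suffix)
    # Same base name as the input, with a .wav extension.
    translated_audio = src.with_suffix(".wav")
    return dubbed_video, translated_audio


video_path, audio_path = output_paths("trump_speech.mp4")
print(video_path)  # trump_speech_translated.mp4
print(audio_path)  # trump_speech.wav
```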

## Testing

You can test this script with any video that contains narration. For
example, you can use this [free video of US President Donald Trump
speaking at the Young Black Leadership Summit at the White
House](https://www.videvo.net/video/us-president-donald-trump-speaks-to-african-americans-young-black-leadership-summit-at-the-white-house-8/613121/).

Here are the step-by-step instructions for testing:

1. Download the video from the link above.

2. Save the video file in the same directory as the script under the
   name `trump_speech.mp4`.

3. Run the script with the downloaded video file as the input:

       python main.py --input trump_speech.mp4 --voice de-DE-Neural2-B --credentials path_to_credentials.json --source_language english

   Replace `path_to_credentials.json` with the path to your Google
   Cloud credentials JSON file.

4. The script will create a new `.wav` audio file named
   `trump_speech.wav` in the same directory. This file contains the
   translated audio.

5. Listen to the `trump_speech.wav` file to verify that the script
   worked correctly. The audio should be a translation of the original
   speech in the video.
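
Listening is the real test, but a quick automated check can catch an empty or truncated file first. The sketch below reads the duration of a WAV file with Python's standard `wave` module. It assumes the output is plain PCM WAV, which is the only kind that module reads; `wav_duration_seconds` is a hypothetical helper.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()    # total number of audio frames
        rate = wav.getframerate()    # frames per second
    return frames / float(rate)
```

For example, `wav_duration_seconds("trump_speech.wav")` should come out close to the length of the source video if the dubbing covered the whole speech.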

Feel free to replace `de-DE-Neural2-B` with the desired target voice.
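
The command has no separate target-language flag, so the target language presumably follows from the voice name. Google Cloud voice names begin with a language and region code, as in `de-DE-Neural2-B`, and the sketch below splits that prefix off. How `main.py` actually derives the language is not documented here; `voice_language` is a hypothetical helper, and the bare language prefix is not guaranteed to be a valid Translate code for every voice.

```python
from typing import Tuple


def voice_language(voice_name: str) -> Tuple[str, str]:
    """Split a voice name into (language-region code, bare language code)."""
    parts = voice_name.split("-")
    # Expected shape: <language>-<REGION>-<voice type>-<variant>
    if len(parts) < 4:
        raise ValueError("unexpected voice name format: " + voice_name)
    language, region = parts[0], parts[1]
    return language + "-" + region, language


print(voice_language("de-DE-Neural2-B"))
```

The first element is the kind of code Text-to-Speech uses to select a voice, while the second resembles the short codes used for translation targets.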

## License

Alexey Sokolov (c). This project is licensed under the terms of the MIT
license included in this repository.