animatedaliensfans committed on
Commit
3912a59
·
verified ·
1 Parent(s): d6d049b

Update README.md

Files changed (1)
  1. README.md +132 -1
README.md CHANGED
@@ -9,4 +9,135 @@ app_file: app.py
9
  pinned: false
10
  ---
11
 
12
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
12
+ # Video Dubber
13
+
14
+ A program for automatically dubbing video files into many languages.
15
+
16
+ This Python script extracts the audio from a video file, transcribes it,
17
+ translates it into a different language, generates a new audio file with
18
+ the translated text, and then merges it with the original video.
19
+
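The five stages described above can be sketched as a small pipeline. This is only an illustration of the data flow, not the code in `main.py`: the `DubbingPipeline` class and its stage names are hypothetical, each stage is passed in as a plain callable, and the target language is assumed to be the first component of the voice name.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DubbingPipeline:
    """Hypothetical wiring of the stages; every stage is an injected callable."""

    extract_audio: Callable[[str], str]    # video path -> audio path
    transcribe: Callable[[str], str]       # audio path -> source-language text
    translate: Callable[[str, str], str]   # (text, target language) -> translated text
    synthesize: Callable[[str, str], str]  # (text, voice name) -> dubbed audio path
    merge: Callable[[str, str], str]       # (video path, audio path) -> output video path

    def run(self, video_path: str, voice: str) -> str:
        # Assumption: a voice name such as "de-DE-Neural2-D" starts with the language code.
        target_language = voice.split("-")[0]
        audio_path = self.extract_audio(video_path)
        text = self.transcribe(audio_path)
        translated = self.translate(text, target_language)
        dubbed_audio = self.synthesize(translated, voice)
        return self.merge(video_path, dubbed_audio)
```

Keeping the stages as injected callables makes it possible to exercise the flow with stubs, without FFmpeg or any Google Cloud access.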
20
+ ## Prerequisites
21
+
22
+ - Python 3.8 or higher
23
+ - [FFmpeg](https://ffmpeg.org/download.html)
24
+
25
+ ## Technologies Used
26
+
27
+ - [Google Cloud Text-to-Speech
28
+ API](https://cloud.google.com/text-to-speech): Used to generate the
29
+ audio for the translated text.
30
+ - [Google Cloud Translate API](https://cloud.google.com/translate):
31
+ Used to translate the transcribed text into a different language.
32
+ - [Whisper ASR](https://www.openai.com/research/whisper/): Used to
33
+ transcribe the audio from the video file.
34
+ - [spaCy](https://spacy.io/): Used for natural language processing
35
+ tasks, such as tokenization and syllable counting.
36
+ - [PyDub](http://pydub.com/): Used for manipulating audio files.
37
+ - [MoviePy](https://zulko.github.io/moviepy/): Used for extracting the
38
+ audio from the video file.
39
+
40
+ ## Installation
41
+
42
+ 1. Clone this repository:
43
+
44
+ git clone https://github.com/am-sokolov/videodubber.git
45
+
46
+ 2. Install the required Python packages:
47
+
48
+ pip install -r requirements.txt
49
+
50
+ ## Google Cloud Credentials
51
+
52
+ This script uses Google Cloud's Text-to-Speech and Translate APIs, which
53
+ require authentication. Follow these steps to get your credentials:
54
+
55
+ 1. Create a new project in the [Google Cloud
56
+ Console](https://console.cloud.google.com/).
57
+ 2. Enable the
58
+ [Text-to-Speech](https://cloud.google.com/text-to-speech/docs/quickstart-client-libraries)
59
+ and [Translate](https://cloud.google.com/translate/docs/setup) APIs
60
+ for your project.
61
+ 3. Create a new service account for your project in the [Service
62
+ Accounts](https://console.cloud.google.com/iam-admin/serviceaccounts)
63
+ page.
64
+ 4. Create a new JSON key for your service account, and download it.
65
+ This is your credentials file.
66
+
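Before running the script, it can help to sanity-check the downloaded key file. The sketch below is not part of this repository: the helper name `use_service_account_key` is hypothetical. It relies on two facts about Google Cloud: service account key files carry `"type": "service_account"`, and the Google client libraries read the key path from the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. Whether `main.py` uses that variable or reads the file directly is not stated here.

```python
import json
import os
from pathlib import Path


def use_service_account_key(key_path: str) -> str:
    """Validate a service account key file and export its path.

    Returns the project ID stored in the key (empty string if absent).
    """
    path = Path(key_path).expanduser().resolve()
    with path.open(encoding="utf-8") as fh:
        key = json.load(fh)

    # Service account keys downloaded from the console have this "type" value.
    if key.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key")

    # Google Cloud client libraries pick the key up from this variable.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(path)
    return key.get("project_id", "")
```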
67
+ ## Usage
68
+
69
+ Run the script with the following command:
70
+
71
+ python main.py --input <path_to_video_file> --voice <target_voice> --credentials <path_to_credentials_file> --source_language <source_language>
72
+
73
+ - `<path_to_video_file>`: Path to the source video file
74
+
75
+ - `<target_voice>`: Target dubbing voice name from [Google Cloud
76
+ Text-to-Speech
77
+ Voices](https://cloud.google.com/text-to-speech/docs/voices).
78
+ Default is "es-US-Neural2-B". Recommended voices are:
79
+
80
+ - English: "en-US-Neural2-J"
81
+ - Spanish: "es-US-Neural2-B"
82
+ - German: "de-DE-Neural2-D"
83
+ - Italian: "it-IT-Neural2-C"
84
+ - French: "fr-FR-Neural2-D"
85
+ - Russian: "ru-RU-Wavenet-D"
86
+ - Hindi: "hi-IN-Neural2-B"\
87
+ Feel free to use any other voice.
88
+
89
+ - `<path_to_credentials_file>`: Path to the Google Cloud credentials
90
+ JSON file
91
+
92
+ - `<source_language>`: Source language, e.g. "english".
93
+
94
+ Currently, the fully supported source languages are: English, German, French,
95
+ Italian, Catalan, Chinese, Croatian, Danish, Dutch, Finnish, Greek,
96
+ Japanese, Korean, Lithuanian, Macedonian, Polish, Portuguese, Romanian,
97
+ Russian, Spanish, Swedish, Ukrainian.
98
+
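As a rough illustration of the command-line interface described above, the flags could be declared with `argparse` as follows. This is a sketch, not the parser in `main.py`: the function name `build_parser` is hypothetical, and marking `--input`, `--credentials`, and `--source_language` as required is an assumption. Only the default voice comes from the text above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags listed in the Usage section."""
    parser = argparse.ArgumentParser(description="Dub a video into another language.")
    parser.add_argument("--input", required=True,
                        help="Path to the source video file")
    parser.add_argument("--voice", default="es-US-Neural2-B",
                        help="Google Cloud Text-to-Speech voice name")
    parser.add_argument("--credentials", required=True,
                        help="Path to the Google Cloud credentials JSON file")
    # Assumption: the source language is mandatory and passed as a name, e.g. "english".
    parser.add_argument("--source_language", required=True,
                        help='Source language, e.g. "english"')
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.input, args.voice, args.credentials, args.source_language)
```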
99
+ ## Output
100
+
101
+ The script creates a new video file with the same name as the input
102
+ video file, but with `_translated` appended to the name. The new video
103
+ file combines the original video with the new translated audio track.
104
+ Additionally, the script creates a `.wav` audio file with the
105
+ same name as the input video file that contains only the translated audio.
106
+
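The naming rule above can be expressed with `pathlib`. This is a minimal sketch with a hypothetical helper name, `output_paths`, and it assumes both outputs are written next to the input file.

```python
from pathlib import Path
from typing import Tuple


def output_paths(input_video: str) -> Tuple[Path, Path]:
    """Return (dubbed video path, translated audio path) for an input video."""
    src = Path(input_video)
    # Same name with "_translated" appended before the extension.
    dubbed_video = src.with_name(f"{src.stem}_translated{src.suffix}")
    # Same name with the extension swapped for .wav.
    translated_audio = src.with_suffix(".wav")
    return dubbed_video, translated_audio


video, audio = output_paths("trump_speech.mp4")
print(video)  # trump_speech_translated.mp4
print(audio)  # trump_speech.wav
```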
107
+ ## Testing
108
+
109
+ You can test this script with any video that contains narration. For
110
+ example, you can use this [free video of US President Donald Trump
111
+ speaking at the Young Black Leadership Summit at the White
112
+ House](https://www.videvo.net/video/us-president-donald-trump-speaks-to-african-americans-young-black-leadership-summit-at-the-white-house-8/613121/).
113
+
114
+ Here are the step-by-step instructions for testing:
115
+
116
+ 1. Download the video from the link above.
117
+
118
+ 2. Save the video file in the same directory as the script under the
119
+ name `trump_speech.mp4`.
120
+
121
+ 3. Run the script with the downloaded video file as the input. For
122
+ example, if you saved the video as `trump_speech.mp4`, you would
123
+ run:
124
+
125
+ python main.py --input trump_speech.mp4 --voice de-DE-Neural2-B --credentials path_to_credentials.json --source_language english
126
+
127
+ Replace `path_to_credentials.json` with the path to your Google
128
+ Cloud credentials JSON file.
129
+
130
+ 4. The script will create a new `.wav` audio file named
131
+ `trump_speech.wav` in the same directory. This file contains the
132
+ translated audio.
133
+
134
+ 5. Listen to the `trump_speech.wav` file to verify that the script
135
+ worked correctly. The audio should be a translation of the original
136
+ speech in the video.
137
+
138
+ Feel free to replace `de-DE-Neural2-B` with the desired target voice.
139
+
140
+ ## License
141
+
142
+ Copyright (c) Alexey Sokolov. This project is licensed under the terms of the MIT
143
+ license included in this repository.