---
title: Videodubber
emoji: ๐
colorFrom: purple
colorTo: pink
sdk: streamlit
sdk_version: 1.38.0
app_file: app.py
pinned: false
---
# Video Dubber
A program that automatically dubs video files into many languages.
This Python script extracts the audio from a video file, transcribes it,
translates it into a different language, generates a new audio file with
the translated text, and then merges it with the original video.
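
The five steps above form a linear pipeline. The following is a minimal sketch of that data flow only, not the code in `main.py`: the `Stages` container and the `dub` function are hypothetical names, and each stage is passed in as a plain callable so the order of operations can be read (and tested) without Whisper, Google Cloud, or FFmpeg installed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stages:
    """Hypothetical container: one callable per step described above."""
    extract_audio: Callable[[str], str]   # video path -> audio path
    transcribe: Callable[[str], str]      # audio path -> source-language text
    translate: Callable[[str], str]       # source text -> translated text
    synthesize: Callable[[str], str]      # translated text -> dubbed audio path
    merge: Callable[[str, str], str]      # video path, dubbed audio -> output video


def dub(video_path: str, stages: Stages) -> str:
    """Run the stages in the order the description lists them."""
    audio_path = stages.extract_audio(video_path)
    transcript = stages.transcribe(audio_path)
    translated = stages.translate(transcript)
    dubbed_audio = stages.synthesize(translated)
    return stages.merge(video_path, dubbed_audio)
```

In a real run each callable would wrap one of the libraries listed under Technologies Used; stubs are enough to check the wiring.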
## Prerequisites
- Python 3.8 or higher
- [FFmpeg](https://ffmpeg.org/download.html)
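
Both prerequisites can be checked before installing anything else. This is a small sketch using only the standard library; the function name `missing_prerequisites` is illustrative and not part of this repository.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the prerequisites


def missing_prerequisites(python_version=None, which=shutil.which):
    """Return a list of problems; an empty list means both checks passed.

    `python_version` and `which` are injectable only to make testing easy.
    """
    version = tuple(python_version or sys.version_info[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```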
## Technologies Used
- [Google Cloud Text-to-Speech
API](https://cloud.google.com/text-to-speech): Used to generate the
audio for the translated text.
- [Google Cloud Translate API](https://cloud.google.com/translate):
Used to translate the transcribed text into a different language.
- [Whisper ASR](https://www.openai.com/research/whisper/): Used to
transcribe the audio from the video file.
- [spaCy](https://spacy.io/): Used for natural language processing
tasks, such as tokenization and syllable counting.
- [PyDub](http://pydub.com/): Used for manipulating audio files.
- [MoviePy](https://zulko.github.io/moviepy/): Used for extracting the
audio from the video file.
## Installation
1. Clone this repository:

       git clone https://github.com/am-sokolov/videodubber.git

2. Install the required Python packages:

       pip install -r requirements.txt

## Google Cloud Credentials
This script uses Google Cloud's Text-to-Speech and Translate APIs, which
require authentication. Follow these steps to get your credentials:
1. Create a new project in the [Google Cloud
Console](https://console.cloud.google.com/).
2. Enable the
[Text-to-Speech](https://cloud.google.com/text-to-speech/docs/quickstart-client-libraries)
and [Translate](https://cloud.google.com/translate/docs/setup) APIs
for your project.
3. Create a new service account for your project in the [Service
Accounts](https://console.cloud.google.com/iam-admin/serviceaccounts)
page.
4. Create a new JSON key for your service account, and download it.
This is your credentials file.
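
Before running the script, it can help to confirm that the downloaded file really is a service account key. The sketch below is not part of this repository: the function names are illustrative, and the set of checked fields is a subset of what a service account key normally contains. `GOOGLE_APPLICATION_CREDENTIALS` is the standard environment variable read by Google client libraries; whether `main.py` relies on it, rather than only on its `--credentials` argument, is an assumption.

```python
import json
import os
from pathlib import Path

# Fields normally present in a service account key file (subset).
REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email"}


def load_service_account_key(path):
    """Parse the JSON key and raise ValueError if it looks wrong."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"credentials file is missing keys: {sorted(missing)}")
    if data["type"] != "service_account":
        raise ValueError(f"expected a service account key, got {data['type']!r}")
    return data


def export_credentials(path):
    """Validate the key, then expose it through the standard env variable."""
    load_service_account_key(path)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(Path(path).resolve())
```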
## Usage
Run the script with the following command:

    python main.py --input <path_to_video_file> --voice <target_voice> --credentials <path_to_credentials_file> --source_language <source_language>

- `<path_to_video_file>`: Path to the source video file
- `<target_voice>`: Target dubbing voice name from [Google Cloud
  Text-to-Speech
  Voices](https://cloud.google.com/text-to-speech/docs/voices).
  Default is "es-US-Neural2-B". Recommended voices are:
  - English: "en-US-Neural2-J"
  - Spanish: "es-US-Neural2-B"
  - German: "de-DE-Neural2-D"
  - Italian: "it-IT-Neural2-C"
  - French: "fr-FR-Neural2-D"
  - Russian: "ru-RU-Wavenet-D"
  - Hindi: "hi-IN-Neural2-B"

  Any other voice from that list can be used as well.
- `<path_to_credentials_file>`: Path to the Google Cloud credentials
  JSON file
- `<source_language>`: Source language, e.g. "english".
Currently, the fully supported source languages are: English, German, French,
Italian, Catalan, Chinese, Croatian, Danish, Dutch, Finnish, Greek,
Japanese, Korean, Lithuanian, Macedonian, Polish, Portuguese, Romanian,
Russian, Spanish, Swedish, Ukrainian.
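
For readers who want to see how the four flags fit together, here is a sketch of an equivalent `argparse` setup. It is an illustration, not the parser in `main.py`: treating `--input`, `--credentials`, and `--source_language` as required, and accepting language names case-insensitively, are assumptions. `voice_locale` is a hypothetical helper that relies on Google voice names starting with a locale such as `de-DE`.

```python
import argparse

DEFAULT_VOICE = "es-US-Neural2-B"  # default stated in the usage notes

SUPPORTED_SOURCE_LANGUAGES = [
    "english", "german", "french", "italian", "catalan", "chinese",
    "croatian", "danish", "dutch", "finnish", "greek", "japanese",
    "korean", "lithuanian", "macedonian", "polish", "portuguese",
    "romanian", "russian", "spanish", "swedish", "ukrainian",
]


def build_parser():
    parser = argparse.ArgumentParser(description="Dub a video into another language.")
    parser.add_argument("--input", required=True, help="path to the source video file")
    parser.add_argument("--voice", default=DEFAULT_VOICE,
                        help="Google Cloud Text-to-Speech voice name")
    parser.add_argument("--credentials", required=True,
                        help="path to the Google Cloud credentials JSON file")
    # type=str.lower runs before the choices check, so "English" is accepted.
    parser.add_argument("--source_language", required=True, type=str.lower,
                        choices=SUPPORTED_SOURCE_LANGUAGES,
                        help="language spoken in the source video")
    return parser


def voice_locale(voice):
    """Return the locale prefix of a voice name: 'de-DE-Neural2-D' -> 'de-DE'."""
    return "-".join(voice.split("-")[:2])
```

Restricting `--source_language` with `choices` makes an unsupported language fail at startup instead of partway through transcription.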
## Output
The script will create a new video file with the same name as the input
video file, but with "\_translated" appended to the name. The new video
file will have the original video with the new translated audio track.
Additionally, the script creates a `.wav` audio file with the same name
as the input video file; it contains only the translated audio.
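
The naming rule described above can be expressed with `pathlib`. This is a sketch of the rule as stated here, not the code used by the script; `output_paths` is a hypothetical helper, and placing both outputs next to the input file is an assumption.

```python
from pathlib import Path


def output_paths(input_video):
    """Return (translated video path, translated audio path) for an input video."""
    src = Path(input_video)
    # "talk.mp4" -> "talk_translated.mp4" (same directory, same extension)
    video_out = src.with_name(f"{src.stem}_translated{src.suffix}")
    # "talk.mp4" -> "talk.wav"
    audio_out = src.with_suffix(".wav")
    return video_out, audio_out
```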
## Testing
You can test this script with any video that contains narration. For
example, you can use this [free video of US President Donald Trump
speaking at the Young Black Leadership Summit at the White
House](https://www.videvo.net/video/us-president-donald-trump-speaks-to-african-americans-young-black-leadership-summit-at-the-white-house-8/613121/).
Here are the step-by-step instructions for testing:
1. Download the video from the link above.
2. Save the video file in the same directory as the script under the
name `trump_speech.mp4`.
3. Run the script with the downloaded video file as the input. For
   example, if you saved the video as `trump_speech.mp4`, you would
   run:

       python main.py --input trump_speech.mp4 --voice de-DE-Neural2-B --credentials path_to_credentials.json --source_language english

   Replace `path_to_credentials.json` with the path to your Google
   Cloud credentials JSON file.
4. The script will create a new `.wav` audio file named
`trump_speech.wav` in the same directory. This file contains the
translated audio.
5. Listen to the `trump_speech.wav` file to verify that the script
worked correctly. The audio should be a translation of the original
speech in the video.
Feel free to replace `de-DE-Neural2-B` with the desired target voice.
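
Besides listening to the result, a quick sanity check is to compare the duration of the generated `.wav` file with the length of the source video. The sketch below uses the standard-library `wave` module, which reads uncompressed PCM WAV files; that the script writes PCM is an assumption, and `wav_summary` is an illustrative name.

```python
import wave


def wav_summary(path):
    """Return channel count, sample rate, and duration of a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_seconds": frames / rate,
        }


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(wav_summary(sys.argv[1]))
```

A duration far shorter than the video usually means that transcription or synthesis stopped early.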
## License
Alexey Sokolov (c). This project is licensed under the terms of the MIT
license included in this repository.