---
title: Videodubber
emoji: 🏃
colorFrom: purple
colorTo: pink
sdk: streamlit
sdk_version: 1.38.0
app_file: app.py
pinned: false
---

# Video Dubber

The program for automatic dubbing any video file for a lot of languages.

This Python script extracts the audio from a video file, transcribes it,
translates it into a different language, generates a new audio file with
the translated text, and then merges it with the original video.

## Prerequisites

-   Python 3.8 or higher
-   [FFmpeg](https://ffmpeg.org/download.html)

## Technologies Used

-   [Google Cloud Text-to-Speech
    API](https://cloud.google.com/text-to-speech): Used to generate the
    audio for the translated text.
-   [Google Cloud Translate API](https://cloud.google.com/translate):
    Used to translate the transcribed text into a different language.
-   [Whisper ASR](https://www.openai.com/research/whisper/): Used to
    transcribe the audio from the video file.
-   [Spacy](https://spacy.io/): Used for natural language processing
    tasks, such as tokenization and syllable counting.
-   [PyDub](http://pydub.com/): Used for manipulating audio files.
-   [MoviePy](https://zulko.github.io/moviepy/): Used for extracting the
    audio from the video file.

## Installation

1.  Clone this repository:

        git clone https://github.com/am-sokolov/videodubber.git

2.  Install the required Python packages:

        pip install -r requirements.txt

## Google Cloud Credentials

This script uses Google Cloud's Text-to-Speech and Translate APIs, which
require authentication. Follow these steps to get your credentials:

1.  Create a new project in the [Google Cloud
    Console](https://console.cloud.google.com/).
2.  Enable the
    [Text-to-Speech](https://cloud.google.com/text-to-speech/docs/quickstart-client-libraries)
    and [Translate](https://cloud.google.com/translate/docs/setup) APIs
    for your project.
3.  Create a new service account for your project in the [Service
    Accounts](https://console.cloud.google.com/iam-admin/serviceaccounts)
    page.
4.  Create a new JSON key for your service account, and download it.
    This is your credentials file.

## Usage

Run the script with the following command:

    python main.py --input <path_to_video_file> --voice <target_voice> --credentials <path_to_credentials_file> --source_language <source_language>

-   `<path_to_video_file>`: Path to the source video file

-   `<target_voice>`: Target dubbing voice name from [Google Cloud
    Text-to-Speech
    Voices](https://cloud.google.com/text-to-speech/docs/voices).
    Default is "es-US-Neural2-B". Recommended voices are:

    -   English: "en-US-Neural2-J"
    -   Spanish: "es-US-Neural2-B"
    -   German: "de-DE-Neural2-D"
    -   Italian: "it-IT-Neural2-C"
    -   French: "fr-FR-Neural2-D"
    -   Russian: "ru-RU-Wavenet-D"
    -   Hindi: "hi-IN-Neural2-B"\
        But you feel free to use any other voice.

-   `<path_to_credentials_file>`: Path to the Google Cloud credentials
    JSON file

-   `<source_language>`: Source language, e.g. "english".

Now, the fully supported source languages are: English, German, French,
Italian, Catalan, Chinese, Croatian, Danish, Dutch, Finnish, Greek,
Japanese, Korean, Lithuanian, Macedonian, Polish, Portuguese, Romanian,
Russian, Spanish, Swedish, Ukrainian.

## Output

The script will create a new video file with the same name as the input
video file, but with "\_translated" appended to the name. The new video
file will have the original video with the new translated audio track.
Additionaly, the script will create a new `.wav` audio track with the
same name as the input video file contains translation only.

## Testing

You can test this script with any video that contains narration. For
example, you can use this [free video of US President Donald Trump
speaking at the Young Black Leadership Summit at the White
House](https://www.videvo.net/video/us-president-donald-trump-speaks-to-african-americans-young-black-leadership-summit-at-the-white-house-8/613121/).

Here are the step-by-step instructions for testing:

1.  Download the video from the link above.

2.  Save the video file in the same directory as the script under the
    name `trump_speech.mp4`.

3.  Run the script with the downloaded video file as the input. For
    example, if you saved the video as `trump_speech.mp4`, you would
    run:

        python main.py trump_speech.mp4  de-DE-Neural2-B path_to_credentials.json  english

    Replace `path_to_credentials.json` with the path to your Google
    Cloud credentials JSON file.

4.  The script will create a new `.wav` audio file named
    `trump_speech.wav` in the same directory. This file contains the
    translated audio.

5.  Listen to the `trump_speech.wav` file to verify that the script
    worked correctly. The audio should be a translation of the original
    speech in the video.

Feel free to replace `de-DE-Neural2-B` with the desired target voice.

## License

Alexey Sokolov (c). This project is licensed under the terms of the MIT
license included in this repository.