AudioSeal: Proactive Localized Watermarking

AudioCraft provides the training code and models for AudioSeal, a method for speech localized watermarking [Proactive Detection of Voice Cloning with Localized Watermarking][arxiv], with state-of-the-art robustness and detector speed. It jointly trains a generator that embeds a watermark in the audio, and a detector that detects the watermarked fragments in longer audios, even in the presence of editing.

Installation and setup

Make sure to install audiocraft version 1.4.0a1 or later, and with the [wm] extra (see README). Alternatively, you can just install audioseal yourself. To install AudioSeal, follow Installation guidelines in the AudioSeal repo.

NOTE: Since we use AAC augmentation in our training loop, you need to install ffmpeg, or it will not work (See Section "Installation" in README).

Make sure you follow steps for basic training setup before starting.

API

Check the Github repository for more details.

Training

The WatermarkSolver implements the AudioSeal's training pipeline. It joins the generator and detector that wrap audioseal.AudioSealWM and audioseal.AudioSealDetector respectively. For the training recipe, see config/solver/watermark/robustness.yaml.

For illustration, we use the three example audios in datasets, with datasourc definition in dset/audio/example.yaml (Please read DATASET to understand AudioCraft's dataset structure.)

To run the Watermarking training pipeline locally:

dora run solver=watermark/robustness dset=audio/example

you can override model / experiment parameters here directly like:

dora run solver=watermark/robustness dset=audio/example sample_rate=24000

If you want to run in debug mode:

python3 -m pdb -c c -m dora run solver=watermark/robustness dset=audio/example