Music Mixing Style Transfer

This repository includes source code and pre-trained models of the work Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects by Junghyun Koo, Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Stefan Uhlich, Kyogu Lee, and Yuki Mitsufuji.

arXiv Web Supplementary

Pre-trained Models

| Model | Configuration | Training Dataset |
| --- | --- | --- |
| FXencoder (Φp.s.) | Used FX normalization and probability scheduling techniques for training | Trained with MUSDB18 Dataset |
| MixFXcloner | Mixing style converter trained with Φp.s. | Trained with MUSDB18 Dataset |

Installation

pip install -r "requirements.txt"

Inference

Mixing Style Transfer

To run the inference code for mixing style transfer:

  1. Download pre-trained models above and place them under the folder named 'weights' (default)
  2. Prepare input and reference tracks under the folder named 'samples/style_transfer' (default). Target files should be organized as follows:
    "path_to_data_directory"/"song_name_#1"/"input_file_name".wav
    "path_to_data_directory"/"song_name_#1"/"reference_file_name".wav
    ...
    "path_to_data_directory"/"song_name_#n"/"input_file_name".wav
    "path_to_data_directory"/"song_name_#n"/"reference_file_name".wav
  3. Run 'inference/style_transfer.py':
python inference/style_transfer.py \
    --ckpt_path_enc "path_to_checkpoint_of_FXencoder" \
    --ckpt_path_conv "path_to_checkpoint_of_MixFXcloner" \
    --target_dir "path_to_directory_containing_inference_samples"
  4. Outputs will be stored in the same folder as the inference data directory (default)
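
Before running step 3, it can help to confirm that every song folder holds both files. The following is a minimal sketch, not part of this repository: the function name `find_incomplete_songs` is hypothetical, and the file names are passed in by the caller because the names the script expects are not stated here.

```python
from pathlib import Path


def find_incomplete_songs(data_dir, input_name, reference_name):
    """Map each song folder name to the list of expected files it lacks.

    Hypothetical helper: input_name and reference_name are whatever file
    names you use for the input and reference tracks in your own layout.
    """
    missing = {}
    for song_dir in sorted(Path(data_dir).iterdir()):
        if not song_dir.is_dir():
            continue  # skip loose files next to the song folders
        absent = [
            name
            for name in (input_name, reference_name)
            if not (song_dir / name).is_file()
        ]
        if absent:
            missing[song_dir.name] = absent
    return missing
```

An empty dictionary means every song folder contains both an input and a reference file.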

Note: The system accepts stereo WAV files with a 44.1 kHz sampling rate and 16-bit depth. We recommend using audio samples that are not too loud: the system transfers the style better when the loudness of mixture-wise inputs is reduced (while maintaining the overall balance of each instrument).
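
The format requirement in the note above can be checked with Python's standard `wave` module before inference. This is an illustrative sketch, not code from this repository: the names `check_wav_format` and `attenuate` are hypothetical. The second function shows why a single gain applied to the whole mixture lowers loudness without changing the balance between instruments: every sample is scaled by the same factor.

```python
import wave


def check_wav_format(path, channels=2, sample_rate=44100, sample_width_bytes=2):
    """Return a list of format problems; an empty list means the file matches.

    Defaults follow the note above: stereo, 44.1 kHz, 16-bit (2 bytes).
    """
    problems = []
    with wave.open(str(path), "rb") as wav:
        if wav.getnchannels() != channels:
            problems.append(f"expected {channels} channels, found {wav.getnchannels()}")
        if wav.getframerate() != sample_rate:
            problems.append(f"expected {sample_rate} Hz, found {wav.getframerate()} Hz")
        if wav.getsampwidth() != sample_width_bytes:
            problems.append(
                f"expected {sample_width_bytes * 8}-bit, found {wav.getsampwidth() * 8}-bit"
            )
    return problems


def attenuate(samples, gain_db):
    """Scale 16-bit integer samples by one gain (in dB), clipping to the int16 range.

    Using the same gain for every sample of the mixture keeps the relative
    level of each instrument unchanged.
    """
    gain = 10.0 ** (gain_db / 20.0)
    return [max(-32768, min(32767, round(s * gain))) for s in samples]
```

For example, `attenuate(samples, -6.0)` roughly halves the amplitude of a mixture.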

Interpolation With 2 Different Reference Tracks

Inference with two reference tracks works almost the same way as mixing style transfer.

  1. Download pre-trained models above and place them under the folder named 'weights' (default)
  2. Prepare input and 2 reference tracks under the folder named 'samples/style_transfer' (default). Target files should be organized as follows:
    "path_to_data_directory"/"song_name_#1"/"input_track_name".wav
    "path_to_data_directory"/"song_name_#1"/"reference_file_name".wav
    "path_to_data_directory"/"song_name_#1"/"reference_file_name_2interpolate".wav
    ...
    "path_to_data_directory"/"song_name_#n"/"input_track_name".wav
    "path_to_data_directory"/"song_name_#n"/"reference_file_name".wav
    "path_to_data_directory"/"song_name_#n"/"reference_file_name_2interpolate".wav
  3. Run 'inference/style_transfer.py':
python inference/style_transfer.py \
    --ckpt_path_enc "path_to_checkpoint_of_FXencoder" \
    --ckpt_path_conv "path_to_checkpoint_of_MixFXcloner" \
    --target_dir "path_to_directory_containing_inference_samples" \
    --interpolation True \
    --interpolate_segments "number of segments to perform interpolation"
  4. Outputs will be stored in the same folder as the inference data directory (default)

Note: Interpolating between 2 different reference tracks is not covered in the paper, but it suggests a potential for controllable style transfer using the latent space.
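
To illustrate the idea of interpolating in a latent space, here is a minimal sketch of a linear blend between two embedding vectors. It rests on assumptions and is not the logic of 'inference/style_transfer.py': the function name `interpolate_embeddings` is hypothetical, the embeddings are plain lists of floats, and the blend weight is assumed to move evenly from the first reference to the second across the requested number of segments.

```python
def interpolate_embeddings(emb_a, emb_b, num_segments):
    """Return num_segments embeddings blending linearly from emb_a to emb_b.

    Assumption: one blended embedding per segment, with the first segment
    using only emb_a and the last segment using only emb_b.
    """
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same length")

    blended = []
    for i in range(num_segments):
        # Weight of emb_b: 0.0 for the first segment, 1.0 for the last.
        w = i / (num_segments - 1) if num_segments > 1 else 0.0
        blended.append([(1.0 - w) * a + w * b for a, b in zip(emb_a, emb_b)])
    return blended
```

With three segments, the middle segment receives the average of the two reference embeddings.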

Feature Extraction Using FXencoder

This inference code extracts audio effects-related embeddings using our proposed FXencoder. It processes all the .wav files under the target directory.

  1. Download FXencoder's pre-trained model above and place it under the folder named 'weights' (default)
  2. Run 'inference/feature_extraction.py':
python inference/feature_extraction.py \
    --ckpt_path_enc "path_to_checkpoint_of_FXencoder" \
    --target_dir "path_to_directory_containing_inference_samples"
  3. Outputs will be stored in the same folder as the inference data directory (default)
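
To preview which files would be picked up from a target directory, a short listing helper is enough. This sketch is an assumption-laden illustration rather than the repository's code: the name `find_wav_files` is hypothetical, and whether the actual script searches subfolders recursively or matches the extension case-insensitively is not stated here.

```python
from pathlib import Path


def find_wav_files(target_dir):
    """Return sorted paths of all .wav files under target_dir.

    Assumptions: subfolders are searched recursively and the extension
    is matched without regard to case.
    """
    return sorted(
        p
        for p in Path(target_dir).rglob("*")
        if p.is_file() and p.suffix.lower() == ".wav"
    )
```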

Implementation

All the details of our system implementation are under the folder "mixing_style_transfer".

  • FXmanipulator -> mixing_style_transfer/mixing_manipulator/
  • network architectures -> mixing_style_transfer/networks/
  • configuration of each sub-network -> mixing_style_transfer/networks/configs.yaml
  • data loader -> mixing_style_transfer/data_loader/

Citation

Please consider citing the work upon usage.

@article{koo2022music,
  title={Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects},
  author={Koo, Junghyun and Martinez-Ramirez, Marco A and Liao, Wei-Hsiang and Uhlich, Stefan and Lee, Kyogu and Mitsufuji, Yuki},
  journal={arXiv preprint arXiv:2211.02247},
  year={2022}
}