Update README.md
Browse files
README.md
CHANGED
@@ -1,126 +1,9 @@
## Pre-trained Models

| Model | Configuration | Training Dataset |
|-------------|-------------|-------------|
| [FXencoder (Φ<sub>p.s.</sub>)](https://drive.google.com/file/d/1BFABsJRUVgJS5UE5iuM03dbfBjmI9LT5/view?usp=sharing) | Used *FX normalization* and *probability scheduling* techniques for training | Trained with the [MUSDB18](https://sigsep.github.io/datasets/musdb.html) dataset |
| [MixFXcloner](https://drive.google.com/file/d/1Qu8rD7HpTNA1gJUVp2IuaeU_Nue8-VA3/view?usp=sharing) | Mixing style converter trained with Φ<sub>p.s.</sub> | Trained with the [MUSDB18](https://sigsep.github.io/datasets/musdb.html) dataset |

## Installation
```
pip install -r requirements.txt
```

# Inference

## Mixing Style Transfer

To run the inference code for <i>mixing style transfer</i>:
1. Download the pre-trained models above and place them under the folder named 'weights' (default)
2. Prepare input and reference tracks under the folder named 'samples/style_transfer' (default)

Target files should be organized as follows:
```
"path_to_data_directory"/"song_name_#1"/"input_file_name".wav
"path_to_data_directory"/"song_name_#1"/"reference_file_name".wav
...
"path_to_data_directory"/"song_name_#n"/"input_file_name".wav
"path_to_data_directory"/"song_name_#n"/"reference_file_name".wav
```
3. Run 'inference/style_transfer.py'
```
python inference/style_transfer.py \
    --ckpt_path_enc "path_to_checkpoint_of_FXencoder" \
    --ckpt_path_conv "path_to_checkpoint_of_MixFXcloner" \
    --target_dir "path_to_directory_containing_inference_samples"
```
4. Outputs will be stored under the same folder as the inference data directory (default)

*Note: The system accepts stereo, 44.1 kHz, 16-bit WAV files. We recommend using audio samples that are not too loud: the system transfers such samples better, reducing the loudness of mixture-wise inputs while maintaining the overall balance of each instrument.*
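Before running inference, it can help to verify the folder layout and WAV format up front. Below is a minimal pre-flight sketch using only the Python standard library; the default 'samples/style_transfer' path comes from the steps above, while the function name and messages are illustrative, not part of the repository.

```python
import wave
from pathlib import Path

def check_style_transfer_inputs(data_dir="samples/style_transfer"):
    """Check that each song folder holds at least an input and a reference
    WAV, and that every WAV is stereo, 44.1 kHz, 16-bit."""
    problems = []
    for song_dir in sorted(Path(data_dir).iterdir()):
        if not song_dir.is_dir():
            continue
        wavs = sorted(song_dir.glob("*.wav"))
        if len(wavs) < 2:
            problems.append(f"{song_dir}: expected an input and a reference WAV")
        for wav_path in wavs:
            with wave.open(str(wav_path), "rb") as w:
                # getsampwidth() returns bytes per sample, so 2 means 16-bit
                fmt = (w.getnchannels(), w.getframerate(), w.getsampwidth())
                if fmt != (2, 44100, 2):
                    problems.append(f"{wav_path}: not stereo / 44.1 kHz / 16-bit")
    return problems
```

An empty return value means the directory is ready for `inference/style_transfer.py`.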

## Interpolation With 2 Different Reference Tracks

Inference code for <i>interpolating</i> between two reference tracks is almost the same as <i>mixing style transfer</i>.
1. Download the pre-trained models above and place them under the folder named 'weights' (default)
2. Prepare an input and 2 reference tracks under the folder named 'samples/style_transfer' (default)

Target files should be organized as follows:
```
"path_to_data_directory"/"song_name_#1"/"input_track_name".wav
"path_to_data_directory"/"song_name_#1"/"reference_file_name".wav
"path_to_data_directory"/"song_name_#1"/"reference_file_name_2interpolate".wav
...
"path_to_data_directory"/"song_name_#n"/"input_track_name".wav
"path_to_data_directory"/"song_name_#n"/"reference_file_name".wav
"path_to_data_directory"/"song_name_#n"/"reference_file_name_2interpolate".wav
```
3. Run 'inference/style_transfer.py'
```
python inference/style_transfer.py \
    --ckpt_path_enc "path_to_checkpoint_of_FXencoder" \
    --ckpt_path_conv "path_to_checkpoint_of_MixFXcloner" \
    --target_dir "path_to_directory_containing_inference_samples" \
    --interpolation True \
    --interpolate_segments "number of segments to perform interpolation"
```
4. Outputs will be stored under the same folder as the inference data directory (default)

*Note: Interpolating between 2 different reference tracks is not covered in the paper, but it demonstrates the potential for controllable style transfer via the latent space.*
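Conceptually, interpolation blends the two references' FXencoder embeddings across consecutive segments of the input. The sketch below shows one such linear blend; the function name and toy 4-dimensional embeddings are hypothetical, and the actual converter consumes these embeddings internally rather than through this API.

```python
import numpy as np

def interpolate_embeddings(z_ref_a, z_ref_b, num_segments):
    """Linearly blend two reference embeddings across segments:
    segment 0 follows reference A, the last segment reference B."""
    alphas = np.linspace(0.0, 1.0, num_segments)
    return [(1.0 - a) * z_ref_a + a * z_ref_b for a in alphas]

# Toy example: with 3 segments the blend weights are 0.0, 0.5, and 1.0
targets = interpolate_embeddings(np.zeros(4), np.ones(4), num_segments=3)
```

Increasing `--interpolate_segments` would correspond to a finer sweep of blend weights between the two reference styles.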

## Feature Extraction Using *FXencoder*

This inference code extracts audio effects-related embeddings using our proposed <i>FXencoder</i>. It processes all the .wav files under the target directory.

1. Download <i>FXencoder</i>'s pre-trained model above and place it under the folder named 'weights' (default)
2. Run 'inference/feature_extraction.py'
```
python inference/feature_extraction.py \
    --ckpt_path_enc "path_to_checkpoint_of_FXencoder" \
    --target_dir "path_to_directory_containing_inference_samples"
```
3. Outputs will be stored under the same folder as the inference data directory (default)
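The extracted embeddings can then be compared directly, for example to measure how close two mixes are in audio-effects space. A small sketch using cosine similarity follows; it assumes the embeddings have been loaded as NumPy vectors (the on-disk output format of the script is not specified here).

```python
import numpy as np

def cosine_similarity(z_a, z_b):
    """Cosine similarity between two FXencoder embeddings;
    values near 1.0 indicate similar audio-effects characteristics."""
    return float(np.dot(z_a, z_b) / (np.linalg.norm(z_a) * np.linalg.norm(z_b)))
```

For instance, embeddings of two differently mixed versions of the same song could be compared this way to quantify how far apart their mixing styles are.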

# Implementation

All the details of our system implementation are under the folder "mixing_style_transfer".

- <i>FXmanipulator</i> -> mixing_style_transfer/mixing_manipulator/
- network architectures -> mixing_style_transfer/networks/
- configuration of each sub-network -> mixing_style_transfer/networks/configs.yaml
- data loader -> mixing_style_transfer/data_loader/

# Citation

Please consider citing this work upon usage.
```
@article{koo2022music,
  title={Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects},
  author={Koo, Junghyun and Martinez-Ramirez, Marco A and Liao, Wei-Hsiang and Uhlich, Stefan and Lee, Kyogu and Mitsufuji, Yuki},
  journal={arXiv preprint arXiv:2211.02247},
  year={2022}
}
```

---
license: mit
title: Music Mixing Style Transfer Demo
sdk: gradio
emoji: 🎶
pinned: true
colorFrom: black
colorTo: white
---