arxiv:2211.07302

MedleyVox: An Evaluation Dataset for Multiple Singing Voices Separation

Published on Nov 14, 2022

Abstract

Separating a mixture of multiple singing voices into individual voices is a rarely studied problem in music source separation research, and the absence of a benchmark dataset has hindered progress. In this paper, we present an evaluation dataset and provide baseline studies for multiple singing voices separation. First, we introduce MedleyVox, an evaluation dataset for multiple singing voices separation. We specify the problem definition in this dataset by categorizing it into i) unison, ii) duet, iii) main vs. rest, and iv) N-singing separation. Second, to overcome the lack of existing multi-singing datasets for training, we present a strategy for constructing multiple singing mixtures from various single-singing datasets. Third, we propose the improved super-resolution network (iSRNet), which greatly enhances the initial estimates of separation networks. Jointly trained with Conv-TasNet and the multi-singing mixture construction strategy, the proposed iSRNet achieved performance comparable to ideal time-frequency masks on the duet and unison subsets of MedleyVox. Audio samples, the dataset, and code are available on our website (https://github.com/jeonchangbin49/MedleyVox).
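
The abstract does not describe the mixture construction strategy in detail, so the following is only a minimal sketch of the general idea of building a multi-singing training mixture from single-singing recordings. It is not the authors' procedure. The function name mix_voices, the per-voice gains, the peak threshold, and the sine-wave stand-ins for recordings are all assumptions made for illustration.

```python
import numpy as np


def mix_voices(voices, gains_db, peak=0.99):
    """Sum single-voice signals after applying per-voice gains (in dB).

    Hypothetical helper, not taken from the paper. Returns the mixture and
    the scaled sources, so that mixture == sources.sum(axis=0).
    """
    if len(voices) != len(gains_db):
        raise ValueError("need one gain per voice")
    length = min(len(v) for v in voices)  # crop to the shortest signal
    sources = np.stack([
        np.asarray(v[:length], dtype=np.float64) * 10.0 ** (g / 20.0)
        for v, g in zip(voices, gains_db)
    ])
    mixture = sources.sum(axis=0)
    # Rescale mixture and sources together if the mixture would clip,
    # so the sources still add up exactly to the mixture.
    max_abs = np.max(np.abs(mixture))
    if max_abs > peak:
        scale = peak / max_abs
        sources = sources * scale
        mixture = mixture * scale
    return mixture, sources


# Toy stand-ins for two single-singing recordings (1 second at 16 kHz).
sr = 16000
t = np.arange(sr) / sr
voice_a = 0.5 * np.sin(2 * np.pi * 220.0 * t)
voice_b = 0.5 * np.sin(2 * np.pi * 330.0 * t)

mixture, sources = mix_voices([voice_a, voice_b], gains_db=[0.0, -3.0])
```

In a sketch like this, the scaled sources serve as training targets and the mixture as the network input. A real pipeline would draw the voices from actual single-singing datasets and would likely randomize gains and voice combinations.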
