JunzheJosephZhu committed on
Commit 2d978ea · 1 Parent(s): 22c7bc6
change task, add data files
- README.md +2 -2
- create-speaker-mixtures-2345/__MACOSX/._create-speaker-mixtures-2345 +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._activlev.m +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._create_wav_2speakers.m +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._create_wav_3speakers.m +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._create_wav_4speakers.m +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._create_wav_5speakers.m +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._maxfilt.m +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_2_spk_cv.txt +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_2_spk_tr.txt +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_2_spk_tt.txt +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_3_spk_cv.txt +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_3_spk_tr.txt +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_3_spk_tt.txt +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_4_spk_cv.txt +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_4_spk_tr.txt +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_4_spk_tt.txt +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_5_spk_cv.txt +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_5_spk_tr.txt +0 -0
- create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_5_spk_tt.txt +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345.zip +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/activlev.m +345 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/create_wav_2speakers.m +188 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/create_wav_3speakers.m +188 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/create_wav_4speakers.m +214 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/create_wav_5speakers.m +238 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/maxfilt.m +127 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_2_spk_cv.txt +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_2_spk_tr.txt +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_2_spk_tt.txt +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_3_spk_cv.txt +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_3_spk_tr.txt +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_3_spk_tt.txt +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_4_spk_cv.txt +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_4_spk_tr.txt +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_4_spk_tt.txt +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_5_spk_cv.txt +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_5_spk_tr.txt +0 -0
- create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_5_spk_tt.txt +0 -0
README.md
CHANGED
@@ -17,7 +17,7 @@ Demo Page: https://junzhejosephzhu.github.io/Multi-Decoder-DPRNN/
 Original research repo is at https://github.com/JunzheJosephZhu/MultiDecoder-DPRNN
 
 This model was trained by Joseph Zhu using the wsj0-mix-var/Multi-Decoder-DPRNN recipe in Asteroid.
-It was trained on the `
+It was trained on the `sep_count` task of the Wsj0MixVar dataset.
 
 ## Training config:
 ```yaml
@@ -51,7 +51,7 @@ optim:
 data:
   train_dir: "data/{}speakers/wav8k/min/tr"
   valid_dir: "data/{}speakers/wav8k/min/cv"
-  task:
+  task: sep_count
   sample_rate: 8000
   seglen: 4.0
   minlen: 2.0
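The `{}` placeholder in `train_dir` and `valid_dir` presumably stands for the speaker count, since this commit adds generation scripts for 2 to 5 speakers. A minimal Python sketch of how such a template could be expanded, assuming plain `str.format` semantics; the helper name `expand_dirs` is hypothetical and is not taken from the Asteroid recipe:

```python
def expand_dirs(template, speaker_counts=(2, 3, 4, 5)):
    """Fill the `{}` placeholder with each speaker count (assumed meaning)."""
    return [template.format(n) for n in speaker_counts]


# Template copied from the training config shown in the README diff.
train_dirs = expand_dirs("data/{}speakers/wav8k/min/tr")
for path in train_dirs:
    print(path)
```

Each resulting directory would then correspond to the output of one `create_wav_Nspeakers.m` script, provided the generated data is placed under `data/`.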
create-speaker-mixtures-2345/__MACOSX/._create-speaker-mixtures-2345 ADDED Binary file (212 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._activlev.m ADDED Binary file (212 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._create_wav_2speakers.m ADDED Binary file (212 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._create_wav_3speakers.m ADDED Binary file (212 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._create_wav_4speakers.m ADDED Binary file (312 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._create_wav_5speakers.m ADDED Binary file (268 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._maxfilt.m ADDED Binary file (212 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_2_spk_cv.txt ADDED Binary file (212 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_2_spk_tr.txt ADDED Binary file (212 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_2_spk_tt.txt ADDED Binary file (212 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_3_spk_cv.txt ADDED Binary file (212 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_3_spk_tr.txt ADDED Binary file (212 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_3_spk_tt.txt ADDED Binary file (268 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_4_spk_cv.txt ADDED Binary file (594 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_4_spk_tr.txt ADDED Binary file (596 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_4_spk_tt.txt ADDED Binary file (652 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_5_spk_cv.txt ADDED Binary file (368 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_5_spk_tr.txt ADDED Binary file (312 Bytes)
create-speaker-mixtures-2345/__MACOSX/create-speaker-mixtures-2345/._mix_5_spk_tt.txt ADDED Binary file (312 Bytes)
create-speaker-mixtures-2345/create-speaker-mixtures-2345.zip ADDED Binary file (3.66 MB)
create-speaker-mixtures-2345/create-speaker-mixtures-2345/activlev.m
ADDED
@@ -0,0 +1,345 @@
+function [lev,af,fso,vad]=activlev(sp,fs,mode)
+%ACTIVLEV Measure active speech level as in ITU-T P.56 [LEV,AF,FSO]=(sp,FS,MODE)
+%
+%Usage: (1) lev=activlev(s,fs); % speech level in units of power
+%       (2) db=activlev(s,fs,'d'); % speech level in dB
+%       (3) s=activlev(s,fs,'n'); % normalize active level to 0 dB
+%
+%Inputs: sp is the speech signal (with better than 20dB SNR)
+%        FS is the sample frequency in Hz (see also FSO below)
+%        MODE is a combination of the following:
+%            0 - omit high pass filter completely (i.e. include DC)
+%            3 - high pass filter at 30 Hz instead of 200 Hz (but allows mains hum to pass)
+%            4 - high pass filter at 40 Hz instead of 200 Hz (but allows mains hum to pass)
+%            1 - use chebyshev 1 filter
+%            2 - use chebyshev 2 filter (default)
+%            e - use elliptic filter
+%            h - omit low pass filter at 5.5, 12 or 18 kHz
+%            w - use wideband filter frequencies: 70 Hz to 12 kHz
+%            W - use ultra wideband filter frequencies: 30 Hz to 18 kHz
+%            d - give outputs in dB rather than power
+%            n - output a normalized speech signal as the first argument
+%            N - output a normalized filtered speech signal as the first argument
+%            l - give both active and long-term power levels
+%            a - include A-weighting filter
+%            i - include ITU-R-BS.468/ITU-T-J.16 weighting filter
+%            z - do NOT zero-pad the signal by 0.35 s
+%
+%Outputs:
+%    If the "n" option is specified, a speech signal normalized to 0dB will be given as
+%    the first output followed by the other outputs.
+%        LEV gives the speech level in units of power (or dB if mode='d')
+%            if mode='l' is specified, LEV is a row vector with the "long term
+%            level" as its second element (this is just the mean power)
+%        AF is the activity factor (or duty cycle) in the range 0 to 1
+%        FSO is a column vector of intermediate information that allows
+%            you to process a speech signal in chunks. Thus:
+%                fso=fs;
+%                for i=1:inc:nsamp
+%                    [lev,af,fso]=activlev(sp(i:min(i+inc-1,nsamp)),fso,['z' mode]);
+%                end
+%                lev=activlev([],fso)
+%            is equivalent to:
+%                lev=activlev(sp(1:nsamp),fs,mode)
+%            but is much slower. The two methods will not give identical results
+%            because they will use slightly different thresholds. Note you need
+%            the 'z' option for all calls except the last.
+%        VAD is a boolean vector the same length as sp that acts as an approximate voice activity detector
+
+%For completeness we list here the contents of the FSO structure:
+%
+%    ffs : sample frequency
+%    fmd : mode string
+%    nh : hangover time in samples
+%    ae : smoothing filter coefs
+%    abl: HP filter numerator and denominator coefficient
+%    bh : LP filter numerator coefficient
+%    ah : LP filter denominator coefficients
+%    ze : smoothing filter state
+%    zl : HP filter state
+%    zh : LP filter state
+%    zx : hangover max filter state
+%    emax : maximum envelope exponent + 1
+%    ssq : signal sum of squares
+%    ns : number of signal samples
+%    ss : sum of speech samples (not actually used here)
+%    kc : cumulative occupancy counts
+%    aw : weighting filter denominator
+%    bw : weighting filter numerator
+%    zw : weighting filter state
+%
+% This routine implements "Method B" from [1],[2] to calculate the active
+% speech level which is defined to be the speech energy divided by the
+% duration of speech activity. Speech is designated as "active" based on an
+% adaptive threshold applied to the smoothed rectified speech signal. A
+% bandpass filter is first applied to the input speech whose -0.25 dB points
+% are at 200 Hz & 5.5 kHz by default but this can be changed to 70 Hz & 12 kHz
+% or to 30 Hz & 18 kHz by specifying the 'w' or 'W' options; these
+% correspond respectively to Annexes B and C in [2].
+%
+% References:
+% [1] ITU-T. Objective measurement of active speech level. Recommendation P.56, Mar. 1993.
+% [2] ITU-T. Objective measurement of active speech level. Recommendation P.56, Dec. 2011.
+
+% Copyright (C) Mike Brookes 2008-2016
+% Version: $Id: activlev.m 9407 2017-02-07 13:25:55Z dmb $
+%
+% VOICEBOX is a MATLAB toolbox for speech processing.
+% Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
+%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% This program is free software; you can redistribute it and/or modify
+% it under the terms of the GNU General Public License as published by
+% the Free Software Foundation; either version 2 of the License, or
+% (at your option) any later version.
+%
+% This program is distributed in the hope that it will be useful,
+% but WITHOUT ANY WARRANTY; without even the implied warranty of
+% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+% GNU General Public License for more details.
+%
+% You can obtain a copy of the GNU General Public License from
+% http://www.gnu.org/copyleft/gpl.html or by writing to
+% Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+persistent nbin thresh c25zp c15zp e5zp
+if isempty(nbin)
+    nbin=20; % 60 dB range at 3dB per bin
+    thresh=15.9; % threshold in dB
+    % High pass s-domain zeros and poles of filters with passband ripple<0.25dB, stopband<-50dB, w0=1
+    % w0=fzero(@ch2,0.5); [c2z,c2p,k]=cheby2(5,50,w0,'high','s');
+    % function v=ch2(w); [c2z,c2p,k]=cheby2(5,50,w,'high','s'); v= 20*log10(prod(abs(1i-c2z))/prod(abs(1i-c2p)))+0.25;
+    c25zp=[0.37843443673309i 0.23388534441447i; -0.20640255179496+0.73942185906851i -0.54036889596392+0.45698784092898i];
+    c25zp=[[0; -0.66793268833792] c25zp conj(c25zp)];
+    % [c1z,c1p,c1k] = cheby1(5,0.25,1,'high','s');
+    c15zp=[-0.659002835294875+1.195798636925079i -0.123261821596263+0.947463030958881i];
+    c15zp=[zeros(1,5); -2.288586431066945 c15zp conj(c15zp)];
+    % [ez,ep,ek] = ellip(5,0.25,50,1,'high','s')
+    e5zp=[0.406667680649209i 0.613849362744881i; -0.538736390607201+1.130245082677107i -0.092723126159100+0.958193646330194i];
+    e5zp=[[0; -1.964538608244084] e5zp conj(e5zp)];
+    % w=linspace(0.2,2,100);
+    % figure(1); plot(w,20*log10(abs(freqs(real(poly(c15zp(1,:))),real(poly(c15zp(2,:))),w)))); title('Chebyshev 1');
+    % figure(2); plot(w,20*log10(abs(freqs(real(poly(c25zp(1,:))),real(poly(c25zp(2,:))),w)))); title('Chebyshev 2');
+    % figure(3); plot(w,20*log10(abs(freqs(real(poly(e5zp(1,:))),real(poly(e5zp(2,:))),w)))); title('Elliptic');
+end
+
+if ~isstruct(fs) % no state vector given
+    if nargin<3
+        mode=' ';
+    end
+    fso.ffs=fs; % sample frequency
+
+    ti=1/fs;
+    g=exp(-ti/0.03); % pole position for envelope filter
+    fso.ae=[1 -2*g g^2]/(1-g)^2; % envelope filter coefficients (DC gain = 1)
+    fso.ze=zeros(2,1);
+    fso.nh=ceil(0.2/ti)+1; % hangover time in samples
+    fso.zx=-Inf; % initial value for maxfilt()
+    fso.emax=-Inf; % maximum exponent
+    fso.ns=0;
+    fso.ssq=0;
+    fso.ss=0;
+    fso.kc=zeros(nbin,1); % cumulative occupancy counts
+    % s-plane zeros and poles of high pass 5'th order filter -0.25dB at w=1 and -50dB stopband
+    if any(mode=='1')
+        szp=c15zp; % Chebyshev 1
+    elseif any(mode=='e')
+        szp=e5zp; % Elliptic
+    else
+        szp=c25zp; % Chebyshev 2
+    end
+    flh=[200 5500]; % default frequency range +- 0.25 dB
+    if any(mode=='w')
+        flh=[70 12000]; % super-wideband (Annex B of [2])
+    elseif any(mode=='W')
+        flh=[30 18000]; % full band (Annex C of [2])
+    end
+    if any(mode=='3')
+        flh(1)=30; % force a 30 Hz HPF cutoff
+    end
+    if any(mode=='4')
+        flh(1)=40; % force a 40 Hz HPF cutoff
+    end
+    if any(mode=='r') % included for backward compatibility
+        mode=['0h' mode]; % abolish both filters
+    elseif fs<flh(2)*2.2
+        mode=['h' mode]; % abolish lowpass filter at low sample rates
+    end
+    fso.fmd=mode; % save mode flags
+    if all(mode~='0') % implement the HPF as biquads to avoid rounding errors
+        zl=2./(1-szp*tan(flh(1)*pi/fs))-1; % Transform s-domain poles/zeros with bilinear transform
+        abl=[ones(2,1) -zl(:,1) -2*real(zl(:,2:3)) abs(zl(:,2:3)).^2]; % biquad coefficients
+        hfg=(abl*[1 -1 0 0 0 0]').*(abl*[1 0 -1 0 1 0]').*(abl*[1 0 0 -1 0 1]');
+        abl=abl(:,[1 2 1 3 5 1 4 6]); % reorder into biquads
+        abl(1,1:2)= abl(1,1:2)*hfg(2)/hfg(1); % force Nyquist gain to equal 1
+        fso.abl=abl;
+        fso.zl=zeros(5,1); % space for HPF filter state
+    end
+    if all(mode~='h')
+        zh=2./(szp/tan(flh(2)*pi/fs)-1)+1; % Transform s-domain poles/zeros with bilinear transform
+        ah=real(poly(zh(2,:)));
+        bh=real(poly(zh(1,:)));
+        fso.bh=bh*sum(ah)/sum(bh);
+        fso.ah=ah;
+        fso.zh=zeros(5,1);
+    end
+    if any(mode=='a')
+        [fso.bw,fso.aw]=stdspectrum(2,'z',fs);
+        fso.zw=zeros(length(fso.aw)-1,1);
+    elseif any(mode=='i')
+        [fso.bw,fso.aw]=stdspectrum(8,'z',fs);
+        fso.zw=zeros(length(fso.aw)-1,1);
+    end
+else
+    fso=fs; % use existing structure
+end
+md=fso.fmd;
+if nargin<3
+    mode=fso.fmd;
+end
+nsp=length(sp); % original length of speech
+if all(mode~='z')
+    nz=ceil(0.35*fso.ffs); % number of zeros to append
+    sp=[sp(:);zeros(nz,1)];
+else
+    nz=0;
+end
+ns=length(sp);
+if ns % process this speech chunk
+    % apply the input filters to the speech
+    if all(md~='0') % implement the HPF as biquads to avoid rounding errors
+        [sq,fso.zl(1)]=filter(fso.abl(1,1:2),fso.abl(2,1:2),sp(:),fso.zl(1)); % highpass filter: real pole/zero
+        [sq,fso.zl(2:3)]=filter(fso.abl(1,3:5),fso.abl(2,3:5),sq(:),fso.zl(2:3)); % highpass filter: biquad 1
+        [sq,fso.zl(4:5)]=filter(fso.abl(1,6:8),fso.abl(2,6:8),sq(:),fso.zl(4:5)); % highpass filter: biquad 2
+    else
+        sq=sp(:);
+    end
+    if all(md~='h')
+        [sq,fso.zh]=filter(fso.bh,fso.ah,sq(:),fso.zh); % lowpass filter
+    end
+    if any(md=='a') || any(md=='i')
+        [sq,fso.zw]=filter(fso.bw,fso.aw,sq(:),fso.zw); % weighting filter
+    end
+    fso.ns=fso.ns+ns; % count the number of speech samples
+    fso.ss=fso.ss+sum(sq); % sum of speech samples
+    fso.ssq=fso.ssq+sum(sq.*sq); % sum of squared speech samples
+    [s,fso.ze]=filter(1,fso.ae,abs(sq(:)),fso.ze); % envelope filter
+    [qf,qe]=log2(s.^2); % take efficient log2 function, 2^qe is upper limit of bin
+    qe(qf==0)=-Inf; % fix zero values
+    [qe,qk,fso.zx]=maxfilt(qe,1,fso.nh,1,fso.zx); % apply the 0.2 second hangover
+    oemax=fso.emax;
+    fso.emax=max(oemax,max(qe)+1);
+    if fso.emax==-Inf
+        fso.kc(1)=fso.kc(1)+ns;
+    else
+        qe=min(fso.emax-qe,nbin); % force in the range 1:nbin. Bin k has 2^(emax-k-1)<=s^2<=2^(emax-k)
+        wqe=ones(length(qe),1);
+        % below: could use kc=cumsum(accumarray(qe,wqe,nbin)) but unsure about backwards compatibility
+        kc=cumsum(full(sparse(qe,wqe,wqe,nbin,1))); % cumulative occupancy counts
+        esh=fso.emax-oemax; % amount to shift down previous bin counts
+        if esh<nbin-1 % if any of the previous bins are worth keeping
+            kc(esh+1:nbin-1)=kc(esh+1:nbin-1)+fso.kc(1:nbin-esh-1);
+            kc(nbin)=kc(nbin)+sum(fso.kc(nbin-esh:nbin));
+        else
+            kc(nbin)=kc(nbin)+sum(fso.kc); % otherwise just add all old counts into the last (lowest) bin
+        end
+        fso.kc=kc;
+    end
+end
+if fso.ns % now calculate the output values
+    if fso.ssq>0
+        aj=10*log10(fso.ssq*(fso.kc).^(-1));
+        % equivalent to cj=20*log10(sqrt(2).^(fso.emax-(1:nbin)-1));
+        cj=10*log10(2)*(fso.emax-(1:nbin)-1); % lower limit of bin j in dB
+        mj=aj'-cj-thresh;
+        % jj=find(mj*sign(mj(1))<=0); % Find threshold
+        jj=find(mj(1:end-1)<0 & mj(2:end)>=0,1); % find +ve transition through threshold
+        if isempty(jj) % if we never cross the threshold
+            if mj(end)<=0 % if we end up below it
+                jj=length(mj)-1; % take the threshold to be the bottom of the last (lowest) bin
+                jf=1;
+            else % if we are always above it
+                jj=1; % take the threshold to be the bottom of the first (highest) bin
+                jf=0;
+            end
+        else
+            jf=1/(1-mj(jj+1)/mj(jj)); % fractional part of j using linear interpolation
+        end
+        lev=aj(jj)+jf*(aj(jj+1)-aj(jj)); % active level in decibels
+        lp=10.^(lev/10); % active level in power
+        if any(md=='d') % 'd' option -> output in dB
+            lev=[lev 10*log10(fso.ssq/fso.ns)];
+        else % ~'d' option -> output in power
+            lev=[lp fso.ssq/fso.ns];
+        end
+        af=fso.ssq/(fso.ns*lp);
+    else % if all samples are equal to zero
+        af=0;
+        if any(md=='d') % 'd' option -> output in dB
+            lev=[-Inf -Inf]; % active level is -Inf dB (zero power)
+        else % ~'d' option -> output in power
+            lev=[0 0]; % active level is 0 power
+        end
+    end
+    if all(md~='l')
+        lev=lev(1); % only output the first element of lev unless 'l' option
+    end
+end
+if nargout>3
+    vad=maxfilt(s(1:nsp),1,fso.nh,1);
+    vad=vad>(sqrt(lp)/10^(thresh/20));
+end
+if ~nargout
+    vad=maxfilt(s,1,fso.nh,1);
+    vad=vad>(sqrt(lp)/10^(thresh/20));
+    levdb=10*log10(lp);
+    clf;
+    subplot(2,2,[1 2]);
+    tax=(1:ns)/fso.ffs;
+    plot(tax,sp,'-y',tax,s,'-r',tax,(vad>0)*sqrt(lp),'-b');
+    xlabel('Time (s)');
+    title(sprintf('Active Level = %.2g dB, Activity = %.0f%% (ITU-T P.56)',levdb,100*af));
+    axisenlarge([-1 -1 -1.4 -1.05]);
+    if nz>0
+        hold on
+        ylim=get(gca,'ylim');
+        plot(tax(end-nz)*[1 1],ylim,':k');
+        hold off
+    end
+    ylabel('Amplitude');
+    legend('Signal','Smoothed envelope','VAD * Active-Level','Location','SouthEast');
+    subplot(2,2,4);
+    plot(cj,repmat(levdb,nbin,1),'k:',cj,aj(:),'-b',cj,cj,'-r',levdb-thresh*ones(1,2),[levdb-thresh levdb],'-r');
+    xlabel('Threshold (dB)');
+    ylabel('Active Level (dB)');
+    legend('Active Level','Speech>Thresh','Threshold','Location','NorthWest');
+    texthvc(levdb-thresh,levdb-0.5*thresh,sprintf('%.1f dB ',thresh),'rmr');
+    axisenlarge([-1 -1.05]);
+    ylim=get(gca,'ylim');
+    set(gca,'ylim',[levdb-1.2*thresh max(ylim(2),levdb+1.9*thresh)]);
+    kch=filter([1 -1],1,kc);
+    subplot(2,2,3);
+    bar(5*log10(2)+cj(end:-1:1),kch(end:-1:1)*100/kc(end));
+    set(gca,'xlim',[cj(end) cj(1)+10*log10(2)]);
+    ylim=get(gca,'ylim');
+    hold on
+    plot(lev([1 1]),ylim,'k:',lev([1 1])-thresh,ylim,'r:');
+    hold off
+    texthvc(lev(1),ylim(2),sprintf(' Act\n Lev'),'ltk');
+    texthvc(lev(1)-thresh,ylim(2),sprintf('Threshold '),'rtr');
+    xlabel('Frame power (dB)')
+    ylabel('% frames');
+elseif any(md=='n') || any(md=='N') % output normalized speech waveform
+    fsx=fso; % shift along other outputs
+    fso=af;
+    af=lev;
+    if any(md=='n')
+        sq=sp; % 'n' -> use unfiltered speech
+    end
+    if fsx.ns>0 && fsx.ssq>0 % if there has been any non-zero speech
+        lev=sq(1:nsp)/sqrt(lp);
+    else
+        lev=sq(1:nsp);
+    end
+end
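A note on the envelope smoothing in `activlev.m`: the denominator `fso.ae = [1 -2*g g^2]/(1-g)^2` factors as two cascaded one-pole smoothers with a 30 ms time constant, scaled so that the DC gain is 1. The Python sketch below is not part of this commit; the function names are hypothetical and only the standard library is used. It recomputes those coefficients and checks the DC gain numerically, assuming the same direct-form recursion that MATLAB's `filter(1, a, x)` performs with zero initial state.

```python
import math


def envelope_coeffs(fs, tau=0.03):
    """Denominator of the all-pole envelope filter, mirroring fso.ae."""
    g = math.exp(-1.0 / (fs * tau))  # pole position, g = exp(-ti/0.03)
    scale = (1.0 - g) ** 2
    return [1.0 / scale, -2.0 * g / scale, g * g / scale]


def smooth(x, a):
    """All-pole filter y = filter(1, a, x), zero initial state."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        if n >= 1:
            acc -= a[1] * y[n - 1]
        if n >= 2:
            acc -= a[2] * y[n - 2]
        y.append(acc / a[0])
    return y


fs = 8000
a = envelope_coeffs(fs)
dc_gain = 1.0 / sum(a)                # transfer function evaluated at z = 1
step = smooth([1.0] * (fs // 2), a)   # response to a 0.5 s unit step
print(dc_gain, step[-1])
```

Both printed values should be very close to 1, which is what the `(DC gain = 1)` comment in the MATLAB source claims: a constant rectified input is passed through unchanged once the filter has settled.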
create-speaker-mixtures-2345/create-speaker-mixtures-2345/create_wav_2speakers.m
ADDED
@@ -0,0 +1,188 @@
+% create_wav_2speakers.m
+%
+% Create 2-speaker mixtures
+%
+% This script assumes that WSJ0's wv1 sphere files have already
+% been converted to wav files, using the original folder structure
+% under wsj0/, e.g.,
+% 11-1.1/wsj0/si_tr_s/01t/01to030v.wv1 is converted to wav and
+% stored in YOUR_PATH/wsj0/si_tr_s/01t/01to030v.wav, and
+% 11-6.1/wsj0/si_dt_05/050/050a0501.wv1 is converted to wav and
+% stored in YOUR_PATH/wsj0/si_dt_05/050/050a0501.wav.
+% Relevant data from all disks are assumed merged under YOUR_PATH/wsj0/
+%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Copyright (C) 2016 Mitsubishi Electric Research Labs
+% (Jonathan Le Roux, John R. Hershey, Zhuo Chen)
+% Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+
+data_type = {'tr','cv','tt'};
+wsj0root = '/home/joseph/Desktop/WSJ0/'; % YOUR_PATH/, the folder containing wsj0/
+output_dir16k='/home/joseph/Desktop/WSJ0/dataset/2speakers/wav16k';
+output_dir8k='/home/joseph/Desktop/WSJ0/dataset/2speakers/wav8k';
+
+min_max = {'min'};
+
+useaudioread = 0;
+if exist('audioread','file')
+    useaudioread = 1;
+end
+
+for i_mm = 1:length(min_max)
+    for i_type = 1:length(data_type)
+        if ~exist([output_dir16k '/' min_max{i_mm} '/' data_type{i_type}],'dir')
+            mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type}]);
+        end
+        if ~exist([output_dir8k '/' min_max{i_mm} '/' data_type{i_type}],'dir')
+            mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type}]);
+        end
+        status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s1/']); %#ok<NASGU>
+        status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s2/']); %#ok<NASGU>
+        status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/mix/']); %#ok<NASGU>
+        status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s1/']); %#ok<NASGU>
+        status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s2/']); %#ok<NASGU>
+        status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/mix/']);
+
+        TaskFile = ['mix_2_spk_' data_type{i_type} '.txt'];
+        fid=fopen(TaskFile,'r');
+        C=textscan(fid,'%s %f %s %f');
+
+        Source1File = ['mix_2_spk_' min_max{i_mm} '_' data_type{i_type} '_1'];
+        Source2File = ['mix_2_spk_' min_max{i_mm} '_' data_type{i_type} '_2'];
+        MixFile = ['mix_2_spk_' min_max{i_mm} '_' data_type{i_type} '_mix'];
+
+        fid_s1 = fopen(Source1File,'w');
+        fid_s2 = fopen(Source2File,'w');
+        fid_m = fopen(MixFile,'w');
+
+        num_files = length(C{1});
+        fs8k=8000;
+
+        scaling_16k = zeros(num_files,2);
+        scaling_8k = zeros(num_files,2);
+        scaling16bit_16k = zeros(num_files,1);
+        scaling16bit_8k = zeros(num_files,1);
+        fprintf(1,'%s\n',[min_max{i_mm} '_' data_type{i_type}]);
+        for i = 1:num_files
+            [inwav1_dir,invwav1_name,inwav1_ext] = fileparts(C{1}{i});
+            [inwav2_dir,invwav2_name,inwav2_ext] = fileparts(C{3}{i});
+            fprintf(fid_s1,'%s\n',C{1}{i});
+            fprintf(fid_s2,'%s\n',C{3}{i});
+            inwav1_snr = C{2}(i);
+            inwav2_snr = C{4}(i);
+            mix_name = [invwav1_name,'_',num2str(inwav1_snr),'_',invwav2_name,'_',num2str(inwav2_snr)];
+            fprintf(fid_m,'%s\n',mix_name);
+
+            % get input wavs
+            if useaudioread
+                [s1, fs] = audioread([wsj0root C{1}{i}]);
+                s2 = audioread([wsj0root C{3}{i}]);
+            else
+                [s1, fs] = wavread([wsj0root C{1}{i}]); %#ok<*DWVRD>
+                s2 = wavread([wsj0root C{3}{i}]);
+            end
+
+            % resample, normalize 8 kHz file, save scaling factor
+            s1_8k=resample(s1,fs8k,fs);
+            [s1_8k,lev1]=activlev(s1_8k,fs8k,'n'); % y_norm = y /sqrt(lev);
+            s2_8k=resample(s2,fs8k,fs);
+            [s2_8k,lev2]=activlev(s2_8k,fs8k,'n');
+
+            weight_1=10^(inwav1_snr/20);
+            weight_2=10^(inwav2_snr/20);
+
+            s1_8k = weight_1 * s1_8k;
+            s2_8k = weight_2 * s2_8k;
+
+            switch min_max{i_mm}
+                case 'max'
+                    mix_8k_length = max(length(s1_8k),length(s2_8k));
+                    s1_8k = cat(1,s1_8k,zeros(mix_8k_length - length(s1_8k),1));
+                    s2_8k = cat(1,s2_8k,zeros(mix_8k_length - length(s2_8k),1));
+                case 'min'
+                    mix_8k_length = min(length(s1_8k),length(s2_8k));
+                    s1_8k = s1_8k(1:mix_8k_length);
+                    s2_8k = s2_8k(1:mix_8k_length);
+            end
+            mix_8k = s1_8k + s2_8k;
+
+            max_amp_8k = max(cat(1,abs(mix_8k(:)),abs(s1_8k(:)),abs(s2_8k(:))));
+            mix_scaling_8k = 1/max_amp_8k*0.9;
+            s1_8k = mix_scaling_8k * s1_8k;
+            s2_8k = mix_scaling_8k * s2_8k;
+            mix_8k = mix_scaling_8k * mix_8k;
+
+            % apply same gain to 16 kHz file
+            s1_16k = weight_1 * s1 / sqrt(lev1);
+            s2_16k = weight_2 * s2 / sqrt(lev2);
+
+            switch min_max{i_mm}
+                case 'max'
+                    mix_16k_length = max(length(s1_16k),length(s2_16k));
+                    s1_16k = cat(1,s1_16k,zeros(mix_16k_length - length(s1_16k),1));
+                    s2_16k = cat(1,s2_16k,zeros(mix_16k_length - length(s2_16k),1));
+                case 'min'
+                    mix_16k_length = min(length(s1_16k),length(s2_16k));
+                    s1_16k = s1_16k(1:mix_16k_length);
+                    s2_16k = s2_16k(1:mix_16k_length);
+            end
+            mix_16k = s1_16k + s2_16k;
+
+            max_amp_16k = max(cat(1,abs(mix_16k(:)),abs(s1_16k(:)),abs(s2_16k(:))));
+            mix_scaling_16k = 1/max_amp_16k*0.9;
+            s1_16k = mix_scaling_16k * s1_16k;
+            s2_16k = mix_scaling_16k * s2_16k;
+            mix_16k = mix_scaling_16k * mix_16k;
+
+            % save 8 kHz and 16 kHz mixtures, as well as
+            % necessary scaling factors
+
+            scaling_16k(i,1) = weight_1 * mix_scaling_16k/ sqrt(lev1);
+            scaling_16k(i,2) = weight_2 * mix_scaling_16k/ sqrt(lev2);
+            scaling_8k(i,1) = weight_1 * mix_scaling_8k/ sqrt(lev1);
+            scaling_8k(i,2) = weight_2 * mix_scaling_8k/ sqrt(lev2);
+
+            scaling16bit_16k(i) = mix_scaling_16k;
+            scaling16bit_8k(i) = mix_scaling_8k;
+
+            if useaudioread
+                s1_8k = int16(round((2^15)*s1_8k));
+                s2_8k = int16(round((2^15)*s2_8k));
+                mix_8k = int16(round((2^15)*mix_8k));
+                s1_16k = int16(round((2^15)*s1_16k));
+                s2_16k = int16(round((2^15)*s2_16k));
+                mix_16k = int16(round((2^15)*mix_16k));
+                audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s1/' mix_name '.wav'],s1_8k,fs8k);
+                audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s1/' mix_name '.wav'],s1_16k,fs);
+                audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s2/' mix_name '.wav'],s2_8k,fs8k);
+                audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s2/' mix_name '.wav'],s2_16k,fs);
+                audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/mix/' mix_name '.wav'],mix_8k,fs8k);
+                audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/mix/' mix_name '.wav'],mix_16k,fs);
+            else
+                wavwrite(s1_8k,fs8k,[output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s1/' mix_name '.wav']); %#ok<*DWVWR>
|
165 |
+
wavwrite(s1_16k,fs,[output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s1/' mix_name '.wav']);
|
166 |
+
wavwrite(s2_8k,fs8k,[output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s2/' mix_name '.wav']);
|
167 |
+
wavwrite(s2_16k,fs,[output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s2/' mix_name '.wav']);
|
168 |
+
wavwrite(mix_8k,fs8k,[output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/mix/' mix_name '.wav']);
|
169 |
+
wavwrite(mix_16k,fs,[output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/mix/' mix_name '.wav']);
|
170 |
+
end
|
171 |
+
|
172 |
+
if mod(i,10)==0
|
173 |
+
fprintf(1,'.');
|
174 |
+
if mod(i,200)==0
|
175 |
+
fprintf(1,'\n');
|
176 |
+
end
|
177 |
+
end
|
178 |
+
|
179 |
+
end
|
180 |
+
save([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/scaling.mat'],'scaling_8k','scaling16bit_8k');
|
181 |
+
save([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/scaling.mat'],'scaling_16k','scaling16bit_16k');
|
182 |
+
|
183 |
+
fclose(fid);
|
184 |
+
fclose(fid_s1);
|
185 |
+
fclose(fid_s2);
|
186 |
+
fclose(fid_m);
|
187 |
+
end
|
188 |
+
end
|
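The gain logic of the two-speaker script above can be sketched in isolation. This is a minimal illustration, not the authors' code: sine waves stand in for the level-normalized WSJ0 utterances, and the dB values in `snr_db` are hypothetical. It shows the dB-to-amplitude conversion, the 'min' truncation, and the single shared factor that puts the largest sample among the mixture and its sources at 0.9.

```matlab
% Sketch only: synthetic stand-ins for two level-normalized sources (assumption).
fs8k = 8000;
s1 = sin(2*pi*220*(0:7999)'/fs8k);   % 1.00 s stand-in for source 1
s2 = sin(2*pi*330*(0:5999)'/fs8k);   % 0.75 s stand-in for source 2
snr_db = [2.5, -2.5];                % hypothetical per-source levels in dB

% dB to linear amplitude: 20 dB corresponds to a factor of 10
w = 10.^(snr_db/20);
s1 = w(1) * s1;
s2 = w(2) * s2;

% 'min' mode: truncate both sources to the shorter length
n = min(length(s1), length(s2));
s1 = s1(1:n);
s2 = s2(1:n);
mix = s1 + s2;

% One common factor, so the mixture stays the exact sum of its sources
peak = max(abs([mix; s1; s2]));
g = 0.9 / peak;
s1 = g * s1;
s2 = g * s2;
mix = g * mix;
```

Because the same factor `g` is applied to every signal, `mix` remains equal to `s1 + s2` after normalization, which is what lets the saved sources serve as separation targets.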
create-speaker-mixtures-2345/create-speaker-mixtures-2345/create_wav_3speakers.m
ADDED
@@ -0,0 +1,188 @@
|
1 |
+
% create_wav_3speakers.m
|
2 |
+
%
|
3 |
+
% Create 3-speaker mixtures
|
4 |
+
%
|
5 |
+
% This script assumes that WSJ0's wv1 sphere files have already
|
6 |
+
% been converted to wav files, using the original folder structure
|
7 |
+
% under wsj0/, e.g.,
|
8 |
+
% 11-1.1/wsj0/si_tr_s/01t/01to030v.wv1 is converted to wav and
|
9 |
+
% stored in YOUR_PATH/wsj0/si_tr_s/01t/01to030v.wav, and
|
10 |
+
% 11-6.1/wsj0/si_dt_05/050/050a0501.wv1 is converted to wav and
|
11 |
+
% stored in YOUR_PATH/wsj0/si_dt_05/050/050a0501.wav.
|
12 |
+
% Relevant data from all disks are assumed merged under YOUR_PATH/wsj0/
|
13 |
+
%
|
14 |
+
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
15 |
+
% Copyright (C) 2016 Mitsubishi Electric Research Labs
|
16 |
+
% (Jonathan Le Roux, John R. Hershey, Zhuo Chen)
|
17 |
+
% Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
|
18 |
+
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
19 |
+
|
20 |
+
%addpath('./voicebox')
|
21 |
+
data_type = {'tr','cv','tt'};
|
22 |
+
wsj0root = '/home/joseph/Desktop/WSJ0/'; % YOUR_PATH/, the folder containing wsj0/
|
23 |
+
output_dir16k='/home/joseph/Desktop/WSJ0/dataset/3speakers/wav16k';
|
24 |
+
output_dir8k='/home/joseph/Desktop/WSJ0/dataset/3speakers/wav8k';
|
25 |
+
|
26 |
+
min_max = {'min'}; %{'min','max'};
|
27 |
+
|
28 |
+
for i_mm = 1:length(min_max)
|
29 |
+
for i_type = 1:length(data_type)
|
30 |
+
if ~exist([output_dir16k '/' min_max{i_mm} '/' data_type{i_type}],'dir')
|
31 |
+
mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type}]);
|
32 |
+
end
|
33 |
+
if ~exist([output_dir8k '/' min_max{i_mm} '/' data_type{i_type}],'dir')
|
34 |
+
mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type}]);
|
35 |
+
end
|
36 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s1/']); %#ok<NASGU>
|
37 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s2/']); %#ok<NASGU>
|
38 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s3/']); %#ok<NASGU>
|
39 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/mix/']); %#ok<NASGU>
|
40 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s1/']); %#ok<NASGU>
|
41 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s2/']); %#ok<NASGU>
|
42 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s3/']); %#ok<NASGU>
|
43 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/mix/']);
|
44 |
+
|
45 |
+
TaskFile = ['mix_3_spk_' data_type{i_type} '.txt'];
|
46 |
+
fid=fopen(TaskFile,'r');
|
47 |
+
C=textscan(fid,'%s %f %s %f %s %f');
|
48 |
+
|
49 |
+
Source1File = ['mix_3_spk_' min_max{i_mm} '_' data_type{i_type} '_1'];
|
50 |
+
Source2File = ['mix_3_spk_' min_max{i_mm} '_' data_type{i_type} '_2'];
|
51 |
+
Source3File = ['mix_3_spk_' min_max{i_mm} '_' data_type{i_type} '_3'];
|
52 |
+
MixFile = ['mix_3_spk_' min_max{i_mm} '_' data_type{i_type} '_mix'];
|
53 |
+
fid_s1 = fopen(Source1File,'w');
|
54 |
+
fid_s2 = fopen(Source2File,'w');
|
55 |
+
fid_s3 = fopen(Source3File,'w');
|
56 |
+
fid_m = fopen(MixFile,'w');
|
57 |
+
|
58 |
+
num_files = length(C{1});
|
59 |
+
fs8k=8000;
|
60 |
+
|
61 |
+
scaling_16k = zeros(num_files,3);
|
62 |
+
scaling_8k = zeros(num_files,3);
|
63 |
+
scaling16bit_16k = zeros(num_files,1);
|
64 |
+
scaling16bit_8k = zeros(num_files,1);
|
65 |
+
fprintf(1,'%s\n',[min_max{i_mm} '_' data_type{i_type}]);
|
66 |
+
for i = 1:num_files
|
67 |
+
[inwav1_dir,invwav1_name,inwav1_ext] = fileparts(C{1}{i});
|
68 |
+
[inwav2_dir,invwav2_name,inwav2_ext] = fileparts(C{3}{i});
|
69 |
+
[inwav3_dir,invwav3_name,inwav3_ext] = fileparts(C{5}{i});
|
70 |
+
fprintf(fid_s1,'%s\n',C{1}{i});%[inwav1_dir,'/',invwav1_name,inwav1_ext]);
|
71 |
+
fprintf(fid_s2,'%s\n',C{3}{i});%[inwav2_dir,'/',invwav2_name,inwav2_ext]);
|
72 |
+
fprintf(fid_s3,'%s\n',C{5}{i});%[inwav3_dir,'/',invwav3_name,inwav3_ext]);
|
73 |
+
inwav1_snr = C{2}(i);
|
74 |
+
inwav2_snr = C{4}(i);
|
75 |
+
inwav3_snr = C{6}(i);
|
76 |
+
mix_name = [invwav1_name,'_',num2str(inwav1_snr),...
|
77 |
+
'_',invwav2_name,'_',num2str(inwav2_snr),...
|
78 |
+
'_',invwav3_name,'_',num2str(inwav3_snr)];
|
79 |
+
fprintf(fid_m,'%s\n',mix_name);
|
80 |
+
|
81 |
+
% get input wavs
|
82 |
+
[s1, fs] = audioread([wsj0root C{1}{i}]);
|
83 |
+
s2 = audioread([wsj0root C{3}{i}]);
|
84 |
+
s3 = audioread([wsj0root C{5}{i}]);
|
85 |
+
|
86 |
+
% resample, normalize 8 kHz file, save scaling factor
|
87 |
+
s1_8k=resample(s1,fs8k,fs);
|
88 |
+
[s1_8k,lev1]=activlev(s1_8k,fs8k,'n'); % y_norm = y /sqrt(lev);
|
89 |
+
s2_8k=resample(s2,fs8k,fs);
|
90 |
+
[s2_8k,lev2]=activlev(s2_8k,fs8k,'n');
|
91 |
+
s3_8k=resample(s3,fs8k,fs);
|
92 |
+
[s3_8k,lev3]=activlev(s3_8k,fs8k,'n');
|
93 |
+
|
94 |
+
weight_1=10^(inwav1_snr/20);
|
95 |
+
weight_2=10^(inwav2_snr/20);
|
96 |
+
weight_3=10^(inwav3_snr/20);
|
97 |
+
|
98 |
+
s1_8k = weight_1 * s1_8k;
|
99 |
+
s2_8k = weight_2 * s2_8k;
|
100 |
+
s3_8k = weight_3 * s3_8k;
|
101 |
+
|
102 |
+
switch min_max{i_mm}
|
103 |
+
case 'max'
|
104 |
+
mix_8k_length = max([length(s1_8k),length(s2_8k),length(s3_8k)]);
|
105 |
+
s1_8k = cat(1,s1_8k,zeros(mix_8k_length - length(s1_8k),1));
|
106 |
+
s2_8k = cat(1,s2_8k,zeros(mix_8k_length - length(s2_8k),1));
|
107 |
+
s3_8k = cat(1,s3_8k,zeros(mix_8k_length - length(s3_8k),1));
|
108 |
+
case 'min'
|
109 |
+
mix_8k_length = min([length(s1_8k),length(s2_8k),length(s3_8k)]);
|
110 |
+
s1_8k = s1_8k(1:mix_8k_length);
|
111 |
+
s2_8k = s2_8k(1:mix_8k_length);
|
112 |
+
s3_8k = s3_8k(1:mix_8k_length);
|
113 |
+
end
|
114 |
+
mix_8k = s1_8k + s2_8k + s3_8k;
|
115 |
+
|
116 |
+
max_amp_8k = max(cat(1,abs(mix_8k(:)),abs(s1_8k(:)),abs(s2_8k(:)),abs(s3_8k(:))));
|
117 |
+
mix_scaling_8k = 1/max_amp_8k*0.9;
|
118 |
+
s1_8k = mix_scaling_8k * s1_8k;
|
119 |
+
s2_8k = mix_scaling_8k * s2_8k;
|
120 |
+
s3_8k = mix_scaling_8k * s3_8k;
|
121 |
+
mix_8k = mix_scaling_8k * mix_8k;
|
122 |
+
|
123 |
+
% apply same gain to 16 kHz file
|
124 |
+
s1_16k = weight_1 * s1 / sqrt(lev1);
|
125 |
+
s2_16k = weight_2 * s2 / sqrt(lev2);
|
126 |
+
s3_16k = weight_3 * s3 / sqrt(lev3);
|
127 |
+
|
128 |
+
switch min_max{i_mm}
|
129 |
+
case 'max'
|
130 |
+
mix_16k_length = max([length(s1_16k),length(s2_16k),length(s3_16k)]);
|
131 |
+
s1_16k = cat(1,s1_16k,zeros(mix_16k_length - length(s1_16k),1));
|
132 |
+
s2_16k = cat(1,s2_16k,zeros(mix_16k_length - length(s2_16k),1));
|
133 |
+
s3_16k = cat(1,s3_16k,zeros(mix_16k_length - length(s3_16k),1));
|
134 |
+
case 'min'
|
135 |
+
mix_16k_length = min([length(s1_16k),length(s2_16k),length(s3_16k)]);
|
136 |
+
s1_16k = s1_16k(1:mix_16k_length);
|
137 |
+
s2_16k = s2_16k(1:mix_16k_length);
|
138 |
+
s3_16k = s3_16k(1:mix_16k_length);
|
139 |
+
end
|
140 |
+
mix_16k = s1_16k + s2_16k + s3_16k;
|
141 |
+
|
142 |
+
max_amp_16k = max(cat(1,abs(mix_16k(:)),abs(s1_16k(:)),abs(s2_16k(:)),abs(s3_16k(:))));
|
143 |
+
mix_scaling_16k = 1/max_amp_16k*0.9;
|
144 |
+
s1_16k = mix_scaling_16k * s1_16k;
|
145 |
+
s2_16k = mix_scaling_16k * s2_16k;
|
146 |
+
s3_16k = mix_scaling_16k * s3_16k;
|
147 |
+
mix_16k = mix_scaling_16k * mix_16k;
|
148 |
+
|
149 |
+
% save 8 kHz and 16 kHz mixtures, as well as
|
150 |
+
% necessary scaling factors
|
151 |
+
|
152 |
+
scaling_16k(i,1) = weight_1 * mix_scaling_16k/ sqrt(lev1);
|
153 |
+
scaling_16k(i,2) = weight_2 * mix_scaling_16k/ sqrt(lev2);
|
154 |
+
scaling_16k(i,3) = weight_3 * mix_scaling_16k/ sqrt(lev3);
|
155 |
+
scaling_8k(i,1) = weight_1 * mix_scaling_8k/ sqrt(lev1);
|
156 |
+
scaling_8k(i,2) = weight_2 * mix_scaling_8k/ sqrt(lev2);
|
157 |
+
scaling_8k(i,3) = weight_3 * mix_scaling_8k/ sqrt(lev3);
|
158 |
+
|
159 |
+
scaling16bit_16k(i) = mix_scaling_16k;
|
160 |
+
scaling16bit_8k(i) = mix_scaling_8k;
|
161 |
+
|
162 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s1/' mix_name '.wav'], s1_8k,fs8k);
|
163 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s1/' mix_name '.wav'], s1_16k,fs);
|
164 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s2/' mix_name '.wav'], s2_8k,fs8k);
|
165 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s2/' mix_name '.wav'], s2_16k,fs);
|
166 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s3/' mix_name '.wav'], s3_8k,fs8k);
|
167 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s3/' mix_name '.wav'], s3_16k,fs);
|
168 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/mix/' mix_name '.wav'], mix_8k,fs8k);
|
169 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/mix/' mix_name '.wav'], mix_16k,fs);
|
170 |
+
|
171 |
+
if mod(i,10)==0
|
172 |
+
fprintf(1,'.');
|
173 |
+
if mod(i,200)==0
|
174 |
+
fprintf(1,'\n');
|
175 |
+
end
|
176 |
+
end
|
177 |
+
|
178 |
+
end
|
179 |
+
save([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/scaling.mat'],'scaling_8k','scaling16bit_8k');
|
180 |
+
save([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/scaling.mat'],'scaling_16k','scaling16bit_16k');
|
181 |
+
|
182 |
+
fclose(fid);
|
183 |
+
fclose(fid_s1);
|
184 |
+
fclose(fid_s2);
|
185 |
+
fclose(fid_s3);
|
186 |
+
fclose(fid_m);
|
187 |
+
end
|
188 |
+
end
|
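The 'min'/'max' length handling is written out once per source in these scripts. As a hedged sketch of the same idea for any number of sources, the version below keeps the signals in a cell array; the names `sources`, `mode`, and `aligned` are hypothetical and do not appear in the scripts.

```matlab
% Sketch only: align N column-vector sources by padding ('max') or truncating ('min').
sources = {(1:5)', (1:3)', (1:4)'};  % toy signals of different lengths
mode = 'max';                        % 'max' pads with zeros, 'min' truncates

lens = cellfun(@length, sources);
switch mode
    case 'max'
        n = max(lens);
        % append zeros so every source reaches the longest length
        aligned = cellfun(@(s) [s; zeros(n - length(s), 1)], sources, ...
                          'UniformOutput', false);
    case 'min'
        n = min(lens);
        % cut every source down to the shortest length
        aligned = cellfun(@(s) s(1:n), sources, 'UniformOutput', false);
end

% With equal lengths the sources can be stacked as columns and summed
mix = sum([aligned{:}], 2);
```

With `mode = 'max'` the three toy signals are padded to five samples each before summation; switching to `'min'` would truncate them to three samples instead.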
create-speaker-mixtures-2345/create-speaker-mixtures-2345/create_wav_4speakers.m
ADDED
@@ -0,0 +1,214 @@
|
1 |
+
% create_wav_4speakers.m
|
2 |
+
%
|
3 |
+
% Create 4-speaker mixtures
|
4 |
+
%
|
5 |
+
% This script assumes that WSJ0's wv1 sphere files have already
|
6 |
+
% been converted to wav files, using the original folder structure
|
7 |
+
% under wsj0/, e.g.,
|
8 |
+
% 11-1.1/wsj0/si_tr_s/01t/01to030v.wv1 is converted to wav and
|
9 |
+
% stored in YOUR_PATH/wsj0/si_tr_s/01t/01to030v.wav, and
|
10 |
+
% 11-6.1/wsj0/si_dt_05/050/050a0501.wv1 is converted to wav and
|
11 |
+
% stored in YOUR_PATH/wsj0/si_dt_05/050/050a0501.wav.
|
12 |
+
% Relevant data from all disks are assumed merged under YOUR_PATH/wsj0/
|
13 |
+
%
|
14 |
+
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
15 |
+
% Copyright (C) 2016 Mitsubishi Electric Research Labs
|
16 |
+
% (Jonathan Le Roux, John R. Hershey, Zhuo Chen)
|
17 |
+
% Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
|
18 |
+
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
19 |
+
|
20 |
+
%addpath('./voicebox')
|
21 |
+
data_type = {'tr','cv','tt'};
|
22 |
+
wsj0root = '/home/joseph/Desktop/WSJ0/'; % YOUR_PATH/, the folder containing wsj0/
|
23 |
+
output_dir16k='/home/joseph/Desktop/WSJ0/dataset/4speakers/wav16k';
|
24 |
+
output_dir8k='/home/joseph/Desktop/WSJ0/dataset/4speakers/wav8k';
|
25 |
+
|
26 |
+
min_max = {'min'}; %{'min','max'};
|
27 |
+
|
28 |
+
for i_mm = 1:length(min_max)
|
29 |
+
for i_type = 1:length(data_type)
|
30 |
+
if ~exist([output_dir16k '/' min_max{i_mm} '/' data_type{i_type}],'dir')
|
31 |
+
mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type}]);
|
32 |
+
end
|
33 |
+
if ~exist([output_dir8k '/' min_max{i_mm} '/' data_type{i_type}],'dir')
|
34 |
+
mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type}]);
|
35 |
+
end
|
36 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s1/']); %#ok<NASGU>
|
37 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s2/']); %#ok<NASGU>
|
38 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s3/']); %#ok<NASGU>
|
39 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s4/']); %#ok<NASGU>
|
40 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/mix/']); %#ok<NASGU>
|
41 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s1/']); %#ok<NASGU>
|
42 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s2/']); %#ok<NASGU>
|
43 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s3/']); %#ok<NASGU>
|
44 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s4/']); %#ok<NASGU>
|
45 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/mix/']);
|
46 |
+
|
47 |
+
TaskFile = ['mix_4_spk_' data_type{i_type} '.txt'];
|
48 |
+
fid=fopen(TaskFile,'r');
|
49 |
+
C=textscan(fid,'%s %f %s %f %s %f %s %f');
|
50 |
+
|
51 |
+
Source1File = ['mix_4_spk_' min_max{i_mm} '_' data_type{i_type} '_1'];
|
52 |
+
Source2File = ['mix_4_spk_' min_max{i_mm} '_' data_type{i_type} '_2'];
|
53 |
+
Source3File = ['mix_4_spk_' min_max{i_mm} '_' data_type{i_type} '_3'];
|
54 |
+
Source4File = ['mix_4_spk_' min_max{i_mm} '_' data_type{i_type} '_4'];
|
55 |
+
MixFile = ['mix_4_spk_' min_max{i_mm} '_' data_type{i_type} '_mix'];
|
56 |
+
fid_s1 = fopen(Source1File,'w');
|
57 |
+
fid_s2 = fopen(Source2File,'w');
|
58 |
+
fid_s3 = fopen(Source3File,'w');
|
59 |
+
fid_s4 = fopen(Source4File,'w');
|
60 |
+
fid_m = fopen(MixFile,'w');
|
61 |
+
|
62 |
+
num_files = length(C{1});
|
63 |
+
fs8k=8000;
|
64 |
+
|
65 |
+
scaling_16k = zeros(num_files,4); % one column per speaker
|
66 |
+
scaling_8k = zeros(num_files,4); % one column per speaker
|
67 |
+
scaling16bit_16k = zeros(num_files,1);
|
68 |
+
scaling16bit_8k = zeros(num_files,1);
|
69 |
+
fprintf(1,'%s\n',[min_max{i_mm} '_' data_type{i_type}]);
|
70 |
+
for i = 1:num_files
|
71 |
+
[inwav1_dir,invwav1_name,inwav1_ext] = fileparts(C{1}{i});
|
72 |
+
[inwav2_dir,invwav2_name,inwav2_ext] = fileparts(C{3}{i});
|
73 |
+
[inwav3_dir,invwav3_name,inwav3_ext] = fileparts(C{5}{i});
|
74 |
+
[inwav4_dir,invwav4_name,inwav4_ext] = fileparts(C{7}{i});
|
75 |
+
fprintf(fid_s1,'%s\n',C{1}{i});%[inwav1_dir,'/',invwav1_name,inwav1_ext]);
|
76 |
+
fprintf(fid_s2,'%s\n',C{3}{i});%[inwav2_dir,'/',invwav2_name,inwav2_ext]);
|
77 |
+
fprintf(fid_s3,'%s\n',C{5}{i});%[inwav3_dir,'/',invwav3_name,inwav3_ext]);
|
78 |
+
fprintf(fid_s4,'%s\n',C{7}{i});%[inwav4_dir,'/',invwav4_name,inwav4_ext]);
|
79 |
+
inwav1_snr = C{2}(i);
|
80 |
+
inwav2_snr = C{4}(i);
|
81 |
+
inwav3_snr = C{6}(i);
|
82 |
+
inwav4_snr = C{8}(i);
|
83 |
+
mix_name = [invwav1_name,'_',num2str(inwav1_snr),...
|
84 |
+
'_',invwav2_name,'_',num2str(inwav2_snr),...
|
85 |
+
'_',invwav3_name,'_',num2str(inwav3_snr),...
|
86 |
+
'_',invwav4_name,'_',num2str(inwav4_snr)];
|
87 |
+
fprintf(fid_m,'%s\n',mix_name);
|
88 |
+
|
89 |
+
% get input wavs
|
90 |
+
[s1, fs] = audioread([wsj0root C{1}{i}]);
|
91 |
+
s2 = audioread([wsj0root C{3}{i}]);
|
92 |
+
s3 = audioread([wsj0root C{5}{i}]);
|
93 |
+
s4 = audioread([wsj0root C{7}{i}]);
|
94 |
+
|
95 |
+
% resample, normalize 8 kHz file, save scaling factor
|
96 |
+
s1_8k=resample(s1,fs8k,fs);
|
97 |
+
[s1_8k,lev1]=activlev(s1_8k,fs8k,'n'); % y_norm = y /sqrt(lev);
|
98 |
+
s2_8k=resample(s2,fs8k,fs);
|
99 |
+
[s2_8k,lev2]=activlev(s2_8k,fs8k,'n');
|
100 |
+
s3_8k=resample(s3,fs8k,fs);
|
101 |
+
[s3_8k,lev3]=activlev(s3_8k,fs8k,'n');
|
102 |
+
s4_8k=resample(s4,fs8k,fs);
|
103 |
+
[s4_8k,lev4]=activlev(s4_8k,fs8k,'n');
|
104 |
+
|
105 |
+
weight_1=10^(inwav1_snr/20);
|
106 |
+
weight_2=10^(inwav2_snr/20);
|
107 |
+
weight_3=10^(inwav3_snr/20);
|
108 |
+
weight_4=10^(inwav4_snr/20);
|
109 |
+
|
110 |
+
s1_8k = weight_1 * s1_8k;
|
111 |
+
s2_8k = weight_2 * s2_8k;
|
112 |
+
s3_8k = weight_3 * s3_8k;
|
113 |
+
s4_8k = weight_4 * s4_8k;
|
114 |
+
|
115 |
+
switch min_max{i_mm}
|
116 |
+
case 'max'
|
117 |
+
mix_8k_length = max([length(s1_8k),length(s2_8k),length(s3_8k),length(s4_8k)]);
|
118 |
+
s1_8k = cat(1,s1_8k,zeros(mix_8k_length - length(s1_8k),1));
|
119 |
+
s2_8k = cat(1,s2_8k,zeros(mix_8k_length - length(s2_8k),1));
|
120 |
+
s3_8k = cat(1,s3_8k,zeros(mix_8k_length - length(s3_8k),1));
|
121 |
+
s4_8k = cat(1,s4_8k,zeros(mix_8k_length - length(s4_8k),1));
|
122 |
+
|
123 |
+
case 'min'
|
124 |
+
mix_8k_length = min([length(s1_8k),length(s2_8k),length(s3_8k),length(s4_8k)]);
|
125 |
+
s1_8k = s1_8k(1:mix_8k_length);
|
126 |
+
s2_8k = s2_8k(1:mix_8k_length);
|
127 |
+
s3_8k = s3_8k(1:mix_8k_length);
|
128 |
+
s4_8k = s4_8k(1:mix_8k_length);
|
129 |
+
end
|
130 |
+
mix_8k = s1_8k + s2_8k + s3_8k + s4_8k;
|
131 |
+
|
132 |
+
max_amp_8k = max(cat(1,abs(mix_8k(:)),abs(s1_8k(:)),abs(s2_8k(:)),abs(s3_8k(:)),abs(s4_8k(:))));
|
133 |
+
mix_scaling_8k = 1/max_amp_8k*0.9;
|
134 |
+
s1_8k = mix_scaling_8k * s1_8k;
|
135 |
+
s2_8k = mix_scaling_8k * s2_8k;
|
136 |
+
s3_8k = mix_scaling_8k * s3_8k;
|
137 |
+
s4_8k = mix_scaling_8k * s4_8k;
|
138 |
+
mix_8k = mix_scaling_8k * mix_8k;
|
139 |
+
|
140 |
+
% apply same gain to 16 kHz file
|
141 |
+
s1_16k = weight_1 * s1 / sqrt(lev1);
|
142 |
+
s2_16k = weight_2 * s2 / sqrt(lev2);
|
143 |
+
s3_16k = weight_3 * s3 / sqrt(lev3);
|
144 |
+
s4_16k = weight_4 * s4 / sqrt(lev4);
|
145 |
+
|
146 |
+
switch min_max{i_mm}
|
147 |
+
case 'max'
|
148 |
+
mix_16k_length = max([length(s1_16k),length(s2_16k),length(s3_16k),length(s4_16k)]);
|
149 |
+
s1_16k = cat(1,s1_16k,zeros(mix_16k_length - length(s1_16k),1));
|
150 |
+
s2_16k = cat(1,s2_16k,zeros(mix_16k_length - length(s2_16k),1));
|
151 |
+
s3_16k = cat(1,s3_16k,zeros(mix_16k_length - length(s3_16k),1));
|
152 |
+
s4_16k = cat(1,s4_16k,zeros(mix_16k_length - length(s4_16k),1));
|
153 |
+
case 'min'
|
154 |
+
mix_16k_length = min([length(s1_16k),length(s2_16k),length(s3_16k),length(s4_16k)]);
|
155 |
+
s1_16k = s1_16k(1:mix_16k_length);
|
156 |
+
s2_16k = s2_16k(1:mix_16k_length);
|
157 |
+
s3_16k = s3_16k(1:mix_16k_length);
|
158 |
+
s4_16k = s4_16k(1:mix_16k_length);
|
159 |
+
end
|
160 |
+
mix_16k = s1_16k + s2_16k + s3_16k + s4_16k;
|
161 |
+
|
162 |
+
max_amp_16k = max(cat(1,abs(mix_16k(:)),abs(s1_16k(:)),abs(s2_16k(:)),abs(s3_16k(:)),abs(s4_16k(:))));
|
163 |
+
mix_scaling_16k = 1/max_amp_16k*0.9;
|
164 |
+
s1_16k = mix_scaling_16k * s1_16k;
|
165 |
+
s2_16k = mix_scaling_16k * s2_16k;
|
166 |
+
s3_16k = mix_scaling_16k * s3_16k;
|
167 |
+
s4_16k = mix_scaling_16k * s4_16k;
|
168 |
+
mix_16k = mix_scaling_16k * mix_16k;
|
169 |
+
|
170 |
+
% save 8 kHz and 16 kHz mixtures, as well as
|
171 |
+
% necessary scaling factors
|
172 |
+
|
173 |
+
scaling_16k(i,1) = weight_1 * mix_scaling_16k/ sqrt(lev1);
|
174 |
+
scaling_16k(i,2) = weight_2 * mix_scaling_16k/ sqrt(lev2);
|
175 |
+
scaling_16k(i,3) = weight_3 * mix_scaling_16k/ sqrt(lev3);
|
176 |
+
scaling_16k(i,4) = weight_4 * mix_scaling_16k/ sqrt(lev4);
|
177 |
+
scaling_8k(i,1) = weight_1 * mix_scaling_8k/ sqrt(lev1);
|
178 |
+
scaling_8k(i,2) = weight_2 * mix_scaling_8k/ sqrt(lev2);
|
179 |
+
scaling_8k(i,3) = weight_3 * mix_scaling_8k/ sqrt(lev3);
|
180 |
+
scaling_8k(i,4) = weight_4 * mix_scaling_8k/ sqrt(lev4);
|
181 |
+
|
182 |
+
scaling16bit_16k(i) = mix_scaling_16k;
|
183 |
+
scaling16bit_8k(i) = mix_scaling_8k;
|
184 |
+
|
185 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s1/' mix_name '.wav'], s1_8k,fs8k);
|
186 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s1/' mix_name '.wav'], s1_16k,fs);
|
187 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s2/' mix_name '.wav'], s2_8k,fs8k);
|
188 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s2/' mix_name '.wav'], s2_16k,fs);
|
189 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s3/' mix_name '.wav'], s3_8k,fs8k);
|
190 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s3/' mix_name '.wav'], s3_16k,fs);
|
191 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s4/' mix_name '.wav'], s4_8k,fs8k);
|
192 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s4/' mix_name '.wav'], s4_16k,fs);
|
193 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/mix/' mix_name '.wav'], mix_8k,fs8k);
|
194 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/mix/' mix_name '.wav'], mix_16k,fs);
|
195 |
+
|
196 |
+
if mod(i,10)==0
|
197 |
+
fprintf(1,'.');
|
198 |
+
if mod(i,200)==0
|
199 |
+
fprintf(1,'\n');
|
200 |
+
end
|
201 |
+
end
|
202 |
+
|
203 |
+
end
|
204 |
+
save([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/scaling.mat'],'scaling_8k','scaling16bit_8k');
|
205 |
+
save([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/scaling.mat'],'scaling_16k','scaling16bit_16k');
|
206 |
+
|
207 |
+
fclose(fid);
|
208 |
+
fclose(fid_s1);
|
209 |
+
fclose(fid_s2);
|
210 |
+
fclose(fid_s3);
|
211 |
+
fclose(fid_s4);
|
212 |
+
fclose(fid_m);
|
213 |
+
end
|
214 |
+
end
|
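The per-source entries saved in `scaling.mat` combine three gains: the level normalization `1/sqrt(lev)`, the dB weight, and the shared peak factor. A small sketch with made-up numbers (all values below are hypothetical) checks that dividing a written source by its combined factor gives back the original samples, which appears to be the purpose of storing these factors.

```matlab
% Sketch only: hypothetical values, not taken from the dataset.
s = [0.1; -0.2; 0.05];      % original samples of one source
lev = 0.01;                 % active speech level, as returned by activlev
snr_db = 3;                 % relative level in dB
mix_scaling = 0.5;          % shared peak-normalization factor

weight = 10^(snr_db/20);

% Gains applied one after another, as in the 16 kHz path of the scripts
written = mix_scaling * (weight * s / sqrt(lev));

% Single combined factor, as stored per source in scaling_16k
combined = weight * mix_scaling / sqrt(lev);

% Undo the combined gain to get the original samples back
recovered = written / combined;
```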
create-speaker-mixtures-2345/create-speaker-mixtures-2345/create_wav_5speakers.m
ADDED
@@ -0,0 +1,238 @@
|
1 |
+
% create_wav_5speakers.m
|
2 |
+
%
|
3 |
+
% Create 5-speaker mixtures
|
4 |
+
%
|
5 |
+
% This script assumes that WSJ0's wv1 sphere files have already
|
6 |
+
% been converted to wav files, using the original folder structure
|
7 |
+
% under wsj0/, e.g.,
|
8 |
+
% 11-1.1/wsj0/si_tr_s/01t/01to030v.wv1 is converted to wav and
|
9 |
+
% stored in YOUR_PATH/wsj0/si_tr_s/01t/01to030v.wav, and
|
10 |
+
% 11-6.1/wsj0/si_dt_05/050/050a0501.wv1 is converted to wav and
|
11 |
+
% stored in YOUR_PATH/wsj0/si_dt_05/050/050a0501.wav.
|
12 |
+
% Relevant data from all disks are assumed merged under YOUR_PATH/wsj0/
|
13 |
+
%
|
14 |
+
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
15 |
+
% Copyright (C) 2016 Mitsubishi Electric Research Labs
|
16 |
+
% (Jonathan Le Roux, John R. Hershey, Zhuo Chen)
|
17 |
+
% Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
|
18 |
+
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
19 |
+
|
20 |
+
%addpath('./voicebox')
|
21 |
+
data_type = {'tr','cv','tt'};
|
22 |
+
wsj0root = '/home/joseph/Desktop/WSJ0/'; % YOUR_PATH/, the folder containing wsj0/
|
23 |
+
output_dir16k='/home/joseph/Desktop/WSJ0/dataset/5speakers/wav16k';
|
24 |
+
output_dir8k='/home/joseph/Desktop/WSJ0/dataset/5speakers/wav8k';
|
25 |
+
|
26 |
+
min_max = {'min'}; %{'min','max'};
|
27 |
+
|
28 |
+
for i_mm = 1:length(min_max)
|
29 |
+
for i_type = 1:length(data_type)
|
30 |
+
if ~exist([output_dir16k '/' min_max{i_mm} '/' data_type{i_type}],'dir')
|
31 |
+
mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type}]);
|
32 |
+
end
|
33 |
+
if ~exist([output_dir8k '/' min_max{i_mm} '/' data_type{i_type}],'dir')
|
34 |
+
mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type}]);
|
35 |
+
end
|
36 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s1/']); %#ok<NASGU>
|
37 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s2/']); %#ok<NASGU>
|
38 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s3/']); %#ok<NASGU>
|
39 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s4/']); %#ok<NASGU>
|
40 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s5/']); %#ok<NASGU>
|
41 |
+
status = mkdir([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/mix/']); %#ok<NASGU>
|
42 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s1/']); %#ok<NASGU>
|
43 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s2/']); %#ok<NASGU>
|
44 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s3/']); %#ok<NASGU>
|
45 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s4/']); %#ok<NASGU>
|
46 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s5/']); %#ok<NASGU>
|
47 |
+
status = mkdir([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/mix/']);
|
48 |
+
|
49 |
+
TaskFile = ['mix_5_spk_' data_type{i_type} '.txt'];
|
50 |
+
fid=fopen(TaskFile,'r');
|
51 |
+
C=textscan(fid,'%s %f %s %f %s %f %s %f %s %f');
|
52 |
+
|
53 |
+
Source1File = ['mix_5_spk_' min_max{i_mm} '_' data_type{i_type} '_1'];
|
54 |
+
Source2File = ['mix_5_spk_' min_max{i_mm} '_' data_type{i_type} '_2'];
|
55 |
+
Source3File = ['mix_5_spk_' min_max{i_mm} '_' data_type{i_type} '_3'];
|
56 |
+
Source4File = ['mix_5_spk_' min_max{i_mm} '_' data_type{i_type} '_4'];
|
57 |
+
Source5File = ['mix_5_spk_' min_max{i_mm} '_' data_type{i_type} '_5'];
|
58 |
+
MixFile = ['mix_5_spk_' min_max{i_mm} '_' data_type{i_type} '_mix'];
|
59 |
+
fid_s1 = fopen(Source1File,'w');
|
60 |
+
fid_s2 = fopen(Source2File,'w');
|
61 |
+
fid_s3 = fopen(Source3File,'w');
|
62 |
+
fid_s4 = fopen(Source4File,'w');
|
63 |
+
fid_s5 = fopen(Source5File,'w');
|
64 |
+
fid_m = fopen(MixFile,'w');
|
65 |
+
|
66 |
+
num_files = length(C{1});
|
67 |
+
fs8k=8000;
|
68 |
+
|
69 |
+
scaling_16k = zeros(num_files,5); % one column per speaker
|
70 |
+
scaling_8k = zeros(num_files,5); % one column per speaker
|
71 |
+
scaling16bit_16k = zeros(num_files,1);
|
72 |
+
scaling16bit_8k = zeros(num_files,1);
|
73 |
+
fprintf(1,'%s\n',[min_max{i_mm} '_' data_type{i_type}]);
|
74 |
+
for i = 1:num_files
|
75 |
+
[inwav1_dir,invwav1_name,inwav1_ext] = fileparts(C{1}{i});
|
76 |
+
[inwav2_dir,invwav2_name,inwav2_ext] = fileparts(C{3}{i});
|
77 |
+
[inwav3_dir,invwav3_name,inwav3_ext] = fileparts(C{5}{i});
|
78 |
+
[inwav4_dir,invwav4_name,inwav4_ext] = fileparts(C{7}{i});
|
79 |
+
[inwav5_dir,invwav5_name,inwav5_ext] = fileparts(C{9}{i});
|
80 |
+
fprintf(fid_s1,'%s\n',C{1}{i});%[inwav1_dir,'/',invwav1_name,inwav1_ext]);
|
81 |
+
fprintf(fid_s2,'%s\n',C{3}{i});%[inwav2_dir,'/',invwav2_name,inwav2_ext]);
|
82 |
+
fprintf(fid_s3,'%s\n',C{5}{i});%[inwav3_dir,'/',invwav3_name,inwav3_ext]);
|
83 |
+
fprintf(fid_s4,'%s\n',C{7}{i});%[inwav4_dir,'/',invwav4_name,inwav4_ext]);
|
84 |
+
fprintf(fid_s5,'%s\n',C{9}{i});%[inwav5_dir,'/',invwav5_name,inwav5_ext]);
|
85 |
+
inwav1_snr = C{2}(i);
|
86 |
+
inwav2_snr = C{4}(i);
|
87 |
+
inwav3_snr = C{6}(i);
|
88 |
+
inwav4_snr = C{8}(i);
|
89 |
+
inwav5_snr = C{10}(i);
|
90 |
+
mix_name = [invwav1_name,'_',num2str(inwav1_snr),...
|
91 |
+
'_',invwav2_name,'_',num2str(inwav2_snr),...
|
92 |
+
'_',invwav3_name,'_',num2str(inwav3_snr),...
|
93 |
+
'_',invwav4_name,'_',num2str(inwav4_snr),...
|
94 |
+
'_',invwav5_name,'_',num2str(inwav5_snr)];
|
95 |
+
fprintf(fid_m,'%s\n',mix_name);
|
96 |
+
|
97 |
+
% get input wavs
|
98 |
+
[s1, fs] = audioread([wsj0root C{1}{i}]);
|
99 |
+
s2 = audioread([wsj0root C{3}{i}]);
|
100 |
+
s3 = audioread([wsj0root C{5}{i}]);
|
101 |
+
s4 = audioread([wsj0root C{7}{i}]);
|
102 |
+
s5 = audioread([wsj0root C{9}{i}]);
|
103 |
+
|
104 |
+
% resample, normalize 8 kHz file, save scaling factor
|
105 |
+
s1_8k=resample(s1,fs8k,fs);
|
106 |
+
[s1_8k,lev1]=activlev(s1_8k,fs8k,'n'); % y_norm = y /sqrt(lev);
|
107 |
+
s2_8k=resample(s2,fs8k,fs);
|
108 |
+
[s2_8k,lev2]=activlev(s2_8k,fs8k,'n');
|
109 |
+
s3_8k=resample(s3,fs8k,fs);
|
110 |
+
[s3_8k,lev3]=activlev(s3_8k,fs8k,'n');
|
111 |
+
s4_8k=resample(s4,fs8k,fs);
|
112 |
+
[s4_8k,lev4]=activlev(s4_8k,fs8k,'n');
|
113 |
+
s5_8k=resample(s5,fs8k,fs);
|
114 |
+
[s5_8k,lev5]=activlev(s5_8k,fs8k,'n');
|
115 |
+
|
116 |
+
weight_1=10^(inwav1_snr/20);
|
117 |
+
weight_2=10^(inwav2_snr/20);
|
118 |
+
weight_3=10^(inwav3_snr/20);
|
119 |
+
weight_4=10^(inwav4_snr/20);
|
120 |
+
weight_5=10^(inwav5_snr/20);
|
121 |
+
|
122 |
+
s1_8k = weight_1 * s1_8k;
|
123 |
+
s2_8k = weight_2 * s2_8k;
|
124 |
+
s3_8k = weight_3 * s3_8k;
|
125 |
+
s4_8k = weight_4 * s4_8k;
|
126 |
+
s5_8k = weight_5 * s5_8k;
|
127 |
+
|
128 |
+
switch min_max{i_mm}
|
129 |
+
case 'max'
|
130 |
+
mix_8k_length = max([length(s1_8k),length(s2_8k),length(s3_8k),length(s4_8k),length(s5_8k)]);
|
131 |
+
s1_8k = cat(1,s1_8k,zeros(mix_8k_length - length(s1_8k),1));
|
132 |
+
s2_8k = cat(1,s2_8k,zeros(mix_8k_length - length(s2_8k),1));
|
133 |
+
s3_8k = cat(1,s3_8k,zeros(mix_8k_length - length(s3_8k),1));
|
134 |
+
s4_8k = cat(1,s4_8k,zeros(mix_8k_length - length(s4_8k),1));
|
135 |
+
s5_8k = cat(1,s5_8k,zeros(mix_8k_length - length(s5_8k),1));
|
136 |
+
|
137 |
+
case 'min'
|
138 |
+
mix_8k_length = min([length(s1_8k),length(s2_8k),length(s3_8k),length(s4_8k),length(s5_8k)]);
|
139 |
+
s1_8k = s1_8k(1:mix_8k_length);
|
140 |
+
s2_8k = s2_8k(1:mix_8k_length);
|
141 |
+
s3_8k = s3_8k(1:mix_8k_length);
|
142 |
+
s4_8k = s4_8k(1:mix_8k_length);
|
143 |
+
s5_8k = s5_8k(1:mix_8k_length);
|
144 |
+
end
|
145 |
+
mix_8k = s1_8k + s2_8k + s3_8k + s4_8k + s5_8k;
|
146 |
+
|
147 |
+
max_amp_8k = max(cat(1,abs(mix_8k(:)),abs(s1_8k(:)),abs(s2_8k(:)),abs(s3_8k(:)),abs(s4_8k(:)),abs(s5_8k(:))));
|
148 |
+
mix_scaling_8k = 1/max_amp_8k*0.9;
|
149 |
+
s1_8k = mix_scaling_8k * s1_8k;
|
150 |
+
s2_8k = mix_scaling_8k * s2_8k;
|
151 |
+
s3_8k = mix_scaling_8k * s3_8k;
|
152 |
+
s4_8k = mix_scaling_8k * s4_8k;
|
153 |
+
s5_8k = mix_scaling_8k * s5_8k;
|
154 |
+
mix_8k = mix_scaling_8k * mix_8k;
|
155 |
+
|
156 |
+
% apply same gain to 16 kHz file
|
157 |
+
s1_16k = weight_1 * s1 / sqrt(lev1);
|
158 |
+
s2_16k = weight_2 * s2 / sqrt(lev2);
|
159 |
+
s3_16k = weight_3 * s3 / sqrt(lev3);
|
160 |
+
s4_16k = weight_4 * s4 / sqrt(lev4);
|
161 |
+
s5_16k = weight_5 * s5 / sqrt(lev5);
|
162 |
+
|
163 |
+
switch min_max{i_mm}
|
164 |
+
case 'max'
|
165 |
+
mix_16k_length = max([length(s1_16k),length(s2_16k),length(s3_16k),length(s4_16k),length(s5_16k)]);
|
166 |
+
s1_16k = cat(1,s1_16k,zeros(mix_16k_length - length(s1_16k),1));
|
167 |
+
s2_16k = cat(1,s2_16k,zeros(mix_16k_length - length(s2_16k),1));
|
168 |
+
s3_16k = cat(1,s3_16k,zeros(mix_16k_length - length(s3_16k),1));
|
169 |
+
s4_16k = cat(1,s4_16k,zeros(mix_16k_length - length(s4_16k),1));
|
170 |
+
s5_16k = cat(1,s5_16k,zeros(mix_16k_length - length(s5_16k),1));
|
171 |
+
case 'min'
|
172 |
+
mix_16k_length = min([length(s1_16k),length(s2_16k),length(s3_16k),length(s4_16k),length(s5_16k)]);
|
173 |
+
s1_16k = s1_16k(1:mix_16k_length);
|
174 |
+
s2_16k = s2_16k(1:mix_16k_length);
|
175 |
+
s3_16k = s3_16k(1:mix_16k_length);
|
176 |
+
s4_16k = s4_16k(1:mix_16k_length);
|
177 |
+
s5_16k = s5_16k(1:mix_16k_length);
|
178 |
+
end
|
179 |
+
mix_16k = s1_16k + s2_16k + s3_16k + s4_16k + s5_16k;
|
180 |
+
|
181 |
+
max_amp_16k = max(cat(1,abs(mix_16k(:)),abs(s1_16k(:)),abs(s2_16k(:)),abs(s3_16k(:)),abs(s4_16k(:)),abs(s5_16k(:))));
|
182 |
+
mix_scaling_16k = 1/max_amp_16k*0.9;
|
183 |
+
s1_16k = mix_scaling_16k * s1_16k;
|
184 |
+
s2_16k = mix_scaling_16k * s2_16k;
|
185 |
+
s3_16k = mix_scaling_16k * s3_16k;
|
186 |
+
s4_16k = mix_scaling_16k * s4_16k;
|
187 |
+
s5_16k = mix_scaling_16k * s5_16k;
|
188 |
+
mix_16k = mix_scaling_16k * mix_16k;
|
189 |
+
|
190 |
+
% save 8 kHz and 16 kHz mixtures, as well as
|
191 |
+
% necessary scaling factors
|
192 |
+
|
193 |
+
scaling_16k(i,1) = weight_1 * mix_scaling_16k/ sqrt(lev1);
|
194 |
+
scaling_16k(i,2) = weight_2 * mix_scaling_16k/ sqrt(lev2);
|
195 |
+
scaling_16k(i,3) = weight_3 * mix_scaling_16k/ sqrt(lev3);
|
196 |
+
scaling_16k(i,4) = weight_4 * mix_scaling_16k/ sqrt(lev4);
|
197 |
+
scaling_16k(i,5) = weight_5 * mix_scaling_16k/ sqrt(lev5);
|
198 |
+
scaling_8k(i,1) = weight_1 * mix_scaling_8k/ sqrt(lev1);
|
199 |
+
scaling_8k(i,2) = weight_2 * mix_scaling_8k/ sqrt(lev2);
|
200 |
+
scaling_8k(i,3) = weight_3 * mix_scaling_8k/ sqrt(lev3);
|
201 |
+
scaling_8k(i,4) = weight_4 * mix_scaling_8k/ sqrt(lev4);
|
202 |
+
scaling_8k(i,5) = weight_5 * mix_scaling_8k/ sqrt(lev5);
|
203 |
+
|
204 |
+
scaling16bit_16k(i) = mix_scaling_16k;
|
205 |
+
scaling16bit_8k(i) = mix_scaling_8k;
|
206 |
+
|
207 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s1/' mix_name '.wav'], s1_8k,fs8k);
|
208 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s1/' mix_name '.wav'], s1_16k,fs);
|
209 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s2/' mix_name '.wav'], s2_8k,fs8k);
|
210 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s2/' mix_name '.wav'], s2_16k,fs);
|
211 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s3/' mix_name '.wav'], s3_8k,fs8k);
|
212 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s3/' mix_name '.wav'], s3_16k,fs);
|
213 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s4/' mix_name '.wav'], s4_8k,fs8k);
|
214 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s4/' mix_name '.wav'], s4_16k,fs);
|
215 |
+
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/s5/' mix_name '.wav'], s5_8k,fs8k);
|
216 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/s5/' mix_name '.wav'], s5_16k,fs);
audiowrite([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/mix/' mix_name '.wav'], mix_8k,fs8k);
|
217 |
+
audiowrite([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/mix/' mix_name '.wav'], mix_16k,fs);
|
218 |
+
|
219 |
+
if mod(i,10)==0
|
220 |
+
fprintf(1,'.');
|
221 |
+
if mod(i,200)==0
|
222 |
+
fprintf(1,'\n');
|
223 |
+
end
|
224 |
+
end
|
225 |
+
|
226 |
+
end
|
227 |
+
save([output_dir8k '/' min_max{i_mm} '/' data_type{i_type} '/scaling.mat'],'scaling_8k','scaling16bit_8k');
|
228 |
+
save([output_dir16k '/' min_max{i_mm} '/' data_type{i_type} '/scaling.mat'],'scaling_16k','scaling16bit_16k');
|
229 |
+
|
230 |
+
fclose(fid);
|
231 |
+
fclose(fid_s1);
|
232 |
+
fclose(fid_s2);
|
233 |
+
fclose(fid_s3);
|
234 |
+
fclose(fid_s4);
|
235 |
+
fclose(fid_s5);
|
236 |
+
fclose(fid_m);
|
237 |
+
end
|
238 |
+
end
|
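The gain chain used in the script above (dB weights, length alignment, then one shared peak normalization to 0.9) can be sketched in isolation. This is a minimal two-source illustration on synthetic noise, not the script itself: the signals, lengths, and dB values are hypothetical, and the `activlev` level normalization is skipped.

```matlab
% Hypothetical stand-ins for two level-normalized sources (not WSJ0 data).
rng(0);
s1 = randn(8000,1);
s2 = randn(6000,1);
snr_db = [2.5, -2.5];          % assumed relative levels in dB, as a mix list would supply

% Convert dB to linear amplitude gains and apply them.
w  = 10.^(snr_db/20);
s1 = w(1)*s1;
s2 = w(2)*s2;

% 'min' mode: truncate every source to the shortest one.
% ('max' mode would zero-pad to the longest instead.)
n  = min(length(s1), length(s2));
s1 = s1(1:n);
s2 = s2(1:n);
mix = s1 + s2;

% One shared gain so the largest sample across mixture and sources is 0.9.
peak = max(abs([mix; s1; s2]));
g    = 0.9/peak;
s1  = g*s1;
s2  = g*s2;
mix = g*mix;
```

Because the same gain `g` is applied to the sources and the mixture, the scaled sources still sum to the scaled mixture, and none of the signals clips when written as 16-bit audio.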
create-speaker-mixtures-2345/create-speaker-mixtures-2345/maxfilt.m
ADDED
@@ -0,0 +1,127 @@
|
1 |
+
function [y,k,y0]=maxfilt(x,f,n,d,x0)
|
2 |
+
%MAXFILT find max of an exponentially weighted sliding window [Y,K,Y0]=MAXFILT(X,F,N,D,X0)
|
3 |
+
%
|
4 |
+
% Usage: (1) y=maxfilt(x) % maximum filter along first non-singleton dimension
|
5 |
+
% (2) y=maxfilt(x,0.95) % use a forgetting factor of 0.95 (= time const of -1/log(0.95)=19.5 samples)
|
6 |
+
% (3) Two equivalent methods (i.e. you can process x in chunks):
|
7 |
+
% y=maxfilt([u v]); [yu,ku,x0]=maxfilt(u);
|
8 |
+
% yv=maxfilt(v,[],[],[],x0);
|
9 |
+
% y=[yu yv];
|
10 |
+
%
|
11 |
+
% Inputs: X Vector or matrix of input data
|
12 |
+
% F exponential forgetting factor in the range 0 (very forgetful) to 1 (no forgetting)
|
13 |
+
% F=exp(-1/T) gives a time constant of T samples [default = 1]
|
14 |
+
% n Length of sliding window [default = Inf (equivalent to [])]
|
15 |
+
% D Dimension to work along [default = first non-singleton dimension]
|
16 |
+
% X0 Initial values placed in front of the X data
|
17 |
+
%
|
18 |
+
% Outputs: Y Output matrix - same size as X
|
19 |
+
% K Index array: Y=X(K). (Note that these values may be <=0 if input X0 is present)
|
20 |
+
% Y0 Last nn-1 values (used to initialize a subsequent call to
|
21 |
+
% maxfilt()) (or last output if n=Inf)
|
22 |
+
%
|
23 |
+
% This routine calculates y(p)=max(f^r*x(p-r), r=0:n-1) where x(r)=-inf for r<1
|
24 |
+
% y=x(k) on output
|
25 |
+
|
26 |
+
% Example: find all peaks in x that are not exceeded within +-w samples
|
27 |
+
% w=4;m=100;x=rand(m,1);[y,k]=maxfilt(x,1,2*w+1);p=find(((1:m)-k)==w);plot(1:m,x,'-',p-w,x(p-w),'+')
|
28 |
+
|
29 |
+
% Copyright (C) Mike Brookes 2003
|
30 |
+
% Version: $Id: maxfilt.m 4054 2014-01-12 19:11:46Z dmb $
|
31 |
+
%
|
32 |
+
% VOICEBOX is a MATLAB toolbox for speech processing.
|
33 |
+
% Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
|
34 |
+
%
|
35 |
+
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
36 |
+
% This program is free software; you can redistribute it and/or modify
|
37 |
+
% it under the terms of the GNU General Public License as published by
|
38 |
+
% the Free Software Foundation; either version 2 of the License, or
|
39 |
+
% (at your option) any later version.
|
40 |
+
%
|
41 |
+
% This program is distributed in the hope that it will be useful,
|
42 |
+
% but WITHOUT ANY WARRANTY; without even the implied warranty of
|
43 |
+
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
44 |
+
% GNU General Public License for more details.
|
45 |
+
%
|
46 |
+
% You can obtain a copy of the GNU General Public License from
|
47 |
+
% http://www.gnu.org/copyleft/gpl.html or by writing to
|
48 |
+
% Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
|
49 |
+
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
50 |
+
|
51 |
+
s=size(x);
|
52 |
+
if nargin<4 || isempty(d)
|
53 |
+
d=find(s>1,1); % find first non-singleton dimension
|
54 |
+
if isempty(d)
|
55 |
+
d=1;
|
56 |
+
end
|
57 |
+
end
|
58 |
+
if nargin>4 && numel(x0)>0 % initial values specified
|
59 |
+
y=shiftdim(cat(d,x0,x),d-1); % concatenate x0 and x along d
|
60 |
+
nx0=size(x0,d); % number of values added onto front of data
|
61 |
+
else % no initial values, work along dimension d
|
62 |
+
y=shiftdim(x,d-1);
|
63 |
+
nx0=0;
|
64 |
+
end
|
65 |
+
s=size(y);
|
66 |
+
s1=s(1);
|
67 |
+
if nargin<3 || isempty(n)
|
68 |
+
n0=Inf;
|
69 |
+
else
|
70 |
+
n0=max(n,1);
|
71 |
+
end
|
72 |
+
if nargin<2 || isempty(f)
|
73 |
+
f=1;
|
74 |
+
end
|
75 |
+
nn=n0;
|
76 |
+
if nargout>2 % we need to output the tail for next time
|
77 |
+
if n0<Inf
|
78 |
+
ny0=min(s1,nn-1);
|
79 |
+
else
|
80 |
+
ny0=min(s1,1);
|
81 |
+
end
|
82 |
+
sy0=s;
|
83 |
+
sy0(1)=ny0;
|
84 |
+
if ny0<=0 || n0==Inf
|
85 |
+
y0=zeros(sy0);
|
86 |
+
else
|
87 |
+
y0=reshape(y(1+s1-ny0:end,:),sy0);
|
88 |
+
y0=shiftdim(y0,ndims(x)-d+1);
|
89 |
+
end
|
90 |
+
end
|
91 |
+
nn=min(nn,s1); % no point in having nn>s1
|
92 |
+
k=repmat((1:s1)',[1 s(2:end)]);
|
93 |
+
if nn>1
|
94 |
+
j=1;
|
95 |
+
j2=1;
|
96 |
+
while j>0
|
97 |
+
g=f^j;
|
98 |
+
m=find(y(j+1:s1,:)<=g*y(1:s1-j,:));
|
99 |
+
m=m+j*fix((m-1)/(s1-j));
|
100 |
+
y(m+j)=g*y(m);
|
101 |
+
k(m+j)=k(m);
|
102 |
+
j2=j2+j;
|
103 |
+
j=min(j2,nn-j2); % j approximately doubles each iteration
|
104 |
+
end
|
105 |
+
end
|
106 |
+
if nargout==0
|
107 |
+
if nargin<3
|
108 |
+
x=shiftdim(x);
|
109 |
+
else
|
110 |
+
x=shiftdim(x,d-1);
|
111 |
+
end
|
112 |
+
ss=min(prod(s(2:end)),5); % maximum of 5 plots
|
113 |
+
plot(1:s1,reshape(y(nx0+1:end,1:ss),s1,ss),'-r',1:s1,reshape(x(:,1:ss),s1,ss),'-b');
|
114 |
+
else
|
115 |
+
if nargout>2 && n0==Inf && ny0==1 % if n0==Inf, we need to save the final output
|
116 |
+
y0=reshape(y(end,:),sy0);
|
117 |
+
y0=shiftdim(y0,ndims(x)-d+1);
|
118 |
+
end
|
119 |
+
if nx0>0 % pre-data specified, x0
|
120 |
+
s(1)=s(1)-nx0;
|
121 |
+
y=shiftdim(reshape(y(nx0+1:end,:),s),ndims(x)-d+1);
|
122 |
+
k=shiftdim(reshape(k(nx0+1:end,:),s),ndims(x)-d+1)-nx0;
|
123 |
+
else % no pre-data
|
124 |
+
y=shiftdim(y,ndims(x)-d+1);
|
125 |
+
k=shiftdim(k,ndims(x)-d+1);
|
126 |
+
end
|
127 |
+
end
|
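The formula in the `maxfilt` help text, y(p)=max(f^r*x(p-r), r=0:n-1), can be read off directly as a brute-force loop. The sketch below is an illustrative reference for a column vector only. It is not the VOICEBOX implementation (which uses a doubling scheme), it has not been checked against `maxfilt` output, and the input values are made up.

```matlab
% Brute-force evaluation of y(p) = max over r=0..n-1 of f^r * x(p-r).
x = [1 5 2 0 3 1]';            % hypothetical input column vector
f = 0.5;                       % forgetting factor
n = 3;                         % sliding window length

y = zeros(size(x));            % decayed running maximum
k = zeros(size(x));            % index into x of the sample that produced y(p)
for p = 1:numel(x)
    r = (0:min(n-1, p-1))';    % samples before x(1) count as -Inf, so skip them
    [y(p), idx] = max((f.^r) .* x(p - r));
    k(p) = p - r(idx);
end
```

With these inputs the loop gives `y = [1 5 2.5 1.25 3 1.5]'` and `k = [1 2 2 2 5 5]'`. Note that `y` equals `x(k)` only when `f` is 1; for `f<1` each output is the selected sample scaled by `f^(p-k(p))`.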
create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_2_spk_cv.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_2_spk_tr.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_2_spk_tt.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_3_spk_cv.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_3_spk_tr.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_3_spk_tt.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_4_spk_cv.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_4_spk_tr.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_4_spk_tt.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_5_spk_cv.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_5_spk_tr.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
create-speaker-mixtures-2345/create-speaker-mixtures-2345/mix_5_spk_tt.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|