HighCWu committed on
Commit b51a9f3 · 1 Parent(s): b2ce270

init commit.

.gitattributes CHANGED
@@ -25,3 +25,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zstandard filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+*.wav filter=lfs diff=lfs merge=lfs -text
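The added rule routes `*.wav` files through Git LFS (they are stored as pointer files, with `-text` unsetting text conversion). A minimal sketch, in plain Python with a hypothetical helper name, of how such a `.gitattributes` line breaks down:

```python
def parse_gitattributes_line(line):
    """Split a .gitattributes line into (pattern, {attr: value}).

    Illustrative parser: handles 'key=value' attributes and bare
    '-flag' entries (which unset an attribute). Not a full
    implementation of gitattributes semantics.
    """
    pattern, *attrs = line.split()
    parsed = {}
    for attr in attrs:
        if "=" in attr:
            key, value = attr.split("=", 1)
            parsed[key] = value
        elif attr.startswith("-"):
            parsed[attr[1:]] = False  # '-text': unset the 'text' attribute
        else:
            parsed[attr] = True
    return pattern, parsed

pattern, attrs = parse_gitattributes_line(
    "*.wav filter=lfs diff=lfs merge=lfs -text"
)
print(pattern, attrs)
```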
Configs/config.yml ADDED
@@ -0,0 +1,54 @@
+log_dir: "Models/VCTK20"
+save_freq: 2
+epochs: 150
+batch_size: 5
+pretrained_model: ""
+load_only_params: false
+fp16_run: true
+
+train_data: "Data/train_list.txt"
+val_data: "Data/val_list.txt"
+
+F0_path: "Utils/JDC/bst.pd"
+ASR_config: "Utils/ASR/config.yml"
+ASR_path: "Utils/ASR/epoch_00100.pd"
+
+preprocess_params:
+  sr: 24000
+  spect_params:
+    n_fft: 2048
+    win_length: 1200
+    hop_length: 300
+
+model_params:
+  dim_in: 64
+  style_dim: 64
+  latent_dim: 16
+  num_domains: 20
+  max_conv_dim: 512
+  n_repeat: 4
+  w_hpf: 0
+  F0_channel: 256
+
+loss_params:
+  g_loss:
+    lambda_sty: 1.
+    lambda_cyc: 5.
+    lambda_ds: 1.
+    lambda_norm: 1.
+    lambda_asr: 10.
+    lambda_f0: 5.
+    lambda_f0_sty: 0.1
+    lambda_adv: 2.
+    lambda_adv_cls: 0.5
+    norm_bias: 0.5
+  d_loss:
+    lambda_reg: 1.
+    lambda_adv_cls: 0.1
+    lambda_con_reg: 10.
+
+  adv_cls_epoch: 50
+  con_reg_epoch: 30
+
+optimizer_params:
+  lr: 0.0001
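The `preprocess_params` above fix the spectrogram geometry for all training audio. A quick arithmetic check in plain Python (variable names are illustrative) of what these values imply:

```python
# Spectrogram geometry implied by preprocess_params above
# (sr: 24000, n_fft: 2048, win_length: 1200, hop_length: 300).
sr = 24000
n_fft = 2048
win_length = 1200
hop_length = 300

hop_ms = 1000 * hop_length / sr    # time between STFT frames, in ms
frames_per_sec = sr / hop_length   # spectrogram frame rate
n_freq_bins = n_fft // 2 + 1       # bins of a one-sided real STFT

print(hop_ms, frames_per_sec, n_freq_bins)
```

So frames are spaced 12.5 ms apart (80 frames per second), with a 50 ms analysis window zero-padded into a 2048-point FFT.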
Data/train_list.txt ADDED
@@ -0,0 +1,2725 @@
+./Data/p225/22.wav|0
+./Data/p239/163.wav|7
+./Data/p227/144.wav|11
+./Data/p258/67.wav|16
+./Data/p259/74.wav|17
+./Data/p230/103.wav|3
+./Data/p225/7.wav|0
+./Data/p226/67.wav|10
+./Data/p228/27.wav|1
+./Data/p243/23.wav|13
+./Data/p228/3.wav|1
+./Data/p244/61.wav|9
+./Data/p230/178.wav|3
+./Data/p240/120.wav|8
+./Data/p228/154.wav|1
+./Data/p230/146.wav|3
+./Data/p240/149.wav|8
+./Data/p254/56.wav|14
+./Data/p240/81.wav|8
+./Data/p226/27.wav|10
+./Data/p256/59.wav|15
+./Data/p231/74.wav|4
+./Data/p231/9.wav|4
+./Data/p240/1.wav|8
+./Data/p236/135.wav|6
+./Data/p232/85.wav|12
+./Data/p230/69.wav|3
+./Data/p256/35.wav|15
+./Data/p239/6.wav|7
+./Data/p254/70.wav|14
+./Data/p244/135.wav|9
+./Data/p254/114.wav|14
+./Data/p236/117.wav|6
+./Data/p225/78.wav|0
+./Data/p236/66.wav|6
+./Data/p228/155.wav|1
+./Data/p239/83.wav|7
+./Data/p240/22.wav|8
+./Data/p225/2.wav|0
+./Data/p230/75.wav|3
+./Data/p239/17.wav|7
+./Data/p239/147.wav|7
+./Data/p273/125.wav|19
+./Data/p270/17.wav|18
+./Data/p233/74.wav|5
+./Data/p233/112.wav|5
+./Data/p228/1.wav|1
+./Data/p258/125.wav|16
+./Data/p231/56.wav|4
+./Data/p227/101.wav|11
+./Data/p232/108.wav|12
+./Data/p239/82.wav|7
+./Data/p270/160.wav|18
+./Data/p227/58.wav|11
+./Data/p233/111.wav|5
+./Data/p259/146.wav|17
+./Data/p230/77.wav|3
+./Data/p256/104.wav|15
+./Data/p228/140.wav|1
+./Data/p231/143.wav|4
+./Data/p270/20.wav|18
+./Data/p225/62.wav|0
+./Data/p229/11.wav|2
+./Data/p259/66.wav|17
+./Data/p239/53.wav|7
+./Data/p239/23.wav|7
+./Data/p240/115.wav|8
+./Data/p233/41.wav|5
+./Data/p270/61.wav|18
+./Data/p232/50.wav|12
+./Data/p239/56.wav|7
+./Data/p244/117.wav|9
+./Data/p233/6.wav|5
+./Data/p227/31.wav|11
+./Data/p231/134.wav|4
+./Data/p243/12.wav|13
+./Data/p226/14.wav|10
+./Data/p240/45.wav|8
+./Data/p231/91.wav|4
+./Data/p259/163.wav|17
+./Data/p236/41.wav|6
+./Data/p231/66.wav|4
+./Data/p233/122.wav|5
+./Data/p244/8.wav|9
+./Data/p232/41.wav|12
+./Data/p232/101.wav|12
+./Data/p273/70.wav|19
+./Data/p270/168.wav|18
+./Data/p226/80.wav|10
+./Data/p270/178.wav|18
+./Data/p225/39.wav|0
+./Data/p258/29.wav|16
+./Data/p231/46.wav|4
+./Data/p244/41.wav|9
+./Data/p227/115.wav|11
+./Data/p228/14.wav|1
+./Data/p239/116.wav|7
+./Data/p259/13.wav|17
+./Data/p254/51.wav|14
+./Data/p256/36.wav|15
+./Data/p254/108.wav|14
+./Data/p226/46.wav|10
+./Data/p258/39.wav|16
+./Data/p273/106.wav|19
+./Data/p228/104.wav|1
+./Data/p256/1.wav|15
+./Data/p258/109.wav|16
+./Data/p259/123.wav|17
+./Data/p258/99.wav|16
+./Data/p256/61.wav|15
+./Data/p231/17.wav|4
+./Data/p227/74.wav|11
+./Data/p256/21.wav|15
+./Data/p226/1.wav|10
+./Data/p231/129.wav|4
+./Data/p231/79.wav|4
+./Data/p226/74.wav|10
+./Data/p233/69.wav|5
+./Data/p227/44.wav|11
+./Data/p239/141.wav|7
+./Data/p228/28.wav|1
+./Data/p239/101.wav|7
+./Data/p258/106.wav|16
+./Data/p236/49.wav|6
+./Data/p230/46.wav|3
+./Data/p244/118.wav|9
+./Data/p227/25.wav|11
+./Data/p243/25.wav|13
+./Data/p270/96.wav|18
+./Data/p228/130.wav|1
+./Data/p230/62.wav|3
+./Data/p227/114.wav|11
+./Data/p228/116.wav|1
+./Data/p233/40.wav|5
+./Data/p230/147.wav|3
+./Data/p240/71.wav|8
+./Data/p233/29.wav|5
+./Data/p230/13.wav|3
+./Data/p239/63.wav|7
+./Data/p228/19.wav|1
+./Data/p233/63.wav|5
+./Data/p227/50.wav|11
+./Data/p270/6.wav|18
+./Data/p228/106.wav|1
+./Data/p236/86.wav|6
+./Data/p240/129.wav|8
+./Data/p273/21.wav|19
+./Data/p256/83.wav|15
+./Data/p240/85.wav|8
+./Data/p258/43.wav|16
+./Data/p273/41.wav|19
+./Data/p270/192.wav|18
+./Data/p230/134.wav|3
+./Data/p228/44.wav|1
+./Data/p231/102.wav|4
+./Data/p270/73.wav|18
+./Data/p239/153.wav|7
+./Data/p270/36.wav|18
+./Data/p273/145.wav|19
+./Data/p228/107.wav|1
+./Data/p244/12.wav|9
+./Data/p270/181.wav|18
+./Data/p231/35.wav|4
+./Data/p233/132.wav|5
+./Data/p226/19.wav|10
+./Data/p239/91.wav|7
+./Data/p225/66.wav|0
+./Data/p229/40.wav|2
+./Data/p227/48.wav|11
+./Data/p225/44.wav|0
+./Data/p229/108.wav|2
+./Data/p227/47.wav|11
+./Data/p270/78.wav|18
+./Data/p259/77.wav|17
+./Data/p239/51.wav|7
+./Data/p230/58.wav|3
+./Data/p233/81.wav|5
+./Data/p230/150.wav|3
+./Data/p227/141.wav|11
+./Data/p243/121.wav|13
+./Data/p244/11.wav|9
+./Data/p270/63.wav|18
+./Data/p236/81.wav|6
+./Data/p229/89.wav|2
+./Data/p231/83.wav|4
+./Data/p233/33.wav|5
+./Data/p227/107.wav|11
+./Data/p228/48.wav|1
+./Data/p259/44.wav|17
+./Data/p228/131.wav|1
+./Data/p227/20.wav|11
+./Data/p256/98.wav|15
+./Data/p273/45.wav|19
+./Data/p239/137.wav|7
+./Data/p232/93.wav|12
+./Data/p239/38.wav|7
+./Data/p243/161.wav|13
+./Data/p258/3.wav|16
+./Data/p273/132.wav|19
+./Data/p230/15.wav|3
+./Data/p259/155.wav|17
+./Data/p256/125.wav|15
+./Data/p256/7.wav|15
+./Data/p231/58.wav|4
+./Data/p256/75.wav|15
+./Data/p236/116.wav|6
+./Data/p233/10.wav|5
+./Data/p270/39.wav|18
+./Data/p254/48.wav|14
+./Data/p270/121.wav|18
+./Data/p240/139.wav|8
+./Data/p240/121.wav|8
+./Data/p244/13.wav|9
+./Data/p243/148.wav|13
+./Data/p240/125.wav|8
+./Data/p259/14.wav|17
+./Data/p273/139.wav|19
+./Data/p233/123.wav|5
+./Data/p225/34.wav|0
+./Data/p244/10.wav|9
+./Data/p258/4.wav|16
+./Data/p236/111.wav|6
+./Data/p259/59.wav|17
+./Data/p258/85.wav|16
+./Data/p227/73.wav|11
+./Data/p273/107.wav|19
+./Data/p231/94.wav|4
+./Data/p231/34.wav|4
+./Data/p270/200.wav|18
+./Data/p273/74.wav|19
+./Data/p232/45.wav|12
+./Data/p227/37.wav|11
+./Data/p256/101.wav|15
+./Data/p233/64.wav|5
+./Data/p228/52.wav|1
+./Data/p254/104.wav|14
+./Data/p236/103.wav|6
+./Data/p233/73.wav|5
+./Data/p243/146.wav|13
+./Data/p258/60.wav|16
+./Data/p254/8.wav|14
+./Data/p226/64.wav|10
+./Data/p243/63.wav|13
+./Data/p226/70.wav|10
+./Data/p233/37.wav|5
+./Data/p254/78.wav|14
+./Data/p227/123.wav|11
+./Data/p240/55.wav|8
+./Data/p229/126.wav|2
+./Data/p254/54.wav|14
+./Data/p243/163.wav|13
+./Data/p273/79.wav|19
+./Data/p230/9.wav|3
+./Data/p270/50.wav|18
+./Data/p243/64.wav|13
+./Data/p229/100.wav|2
+./Data/p240/100.wav|8
+./Data/p239/139.wav|7
+./Data/p236/65.wav|6
+./Data/p243/97.wav|13
+./Data/p258/37.wav|16
+./Data/p233/9.wav|5
+./Data/p243/10.wav|13
+./Data/p244/53.wav|9
+./Data/p259/162.wav|17
+./Data/p236/131.wav|6
+./Data/p227/134.wav|11
+./Data/p228/5.wav|1
+./Data/p273/18.wav|19
+./Data/p243/115.wav|13
+./Data/p256/113.wav|15
+./Data/p243/103.wav|13
+./Data/p273/133.wav|19
+./Data/p244/90.wav|9
+./Data/p258/45.wav|16
+./Data/p229/16.wav|2
+./Data/p244/25.wav|9
+./Data/p225/95.wav|0
+./Data/p230/18.wav|3
+./Data/p270/196.wav|18
+./Data/p229/58.wav|2
+./Data/p239/125.wav|7
+./Data/p225/27.wav|0
+./Data/p239/13.wav|7
+./Data/p259/121.wav|17
+./Data/p240/12.wav|8
+./Data/p270/40.wav|18
+./Data/p258/17.wav|16
+./Data/p270/123.wav|18
+./Data/p258/93.wav|16
+./Data/p229/43.wav|2
+./Data/p243/152.wav|13
+./Data/p236/11.wav|6
+./Data/p232/38.wav|12
+./Data/p225/9.wav|0
+./Data/p270/128.wav|18
+./Data/p258/22.wav|16
+./Data/p227/113.wav|11
+./Data/p228/128.wav|1
+./Data/p228/56.wav|1
+./Data/p239/19.wav|7
+./Data/p273/134.wav|19
+./Data/p231/144.wav|4
+./Data/p231/16.wav|4
+./Data/p259/141.wav|17
+./Data/p227/9.wav|11
+./Data/p273/114.wav|19
+./Data/p225/94.wav|0
+./Data/p273/42.wav|19
+./Data/p236/72.wav|6
+./Data/p240/58.wav|8
+./Data/p258/78.wav|16
+./Data/p227/129.wav|11
+./Data/p254/9.wav|14
+./Data/p226/43.wav|10
+./Data/p228/103.wav|1
+./Data/p232/114.wav|12
+./Data/p254/80.wav|14
+./Data/p240/144.wav|8
+./Data/p227/55.wav|11
+./Data/p254/2.wav|14
+./Data/p273/101.wav|19
+./Data/p243/67.wav|13
+./Data/p227/94.wav|11
+./Data/p227/121.wav|11
+./Data/p259/153.wav|17
+./Data/p258/40.wav|16
+./Data/p239/32.wav|7
+./Data/p270/83.wav|18
+./Data/p226/103.wav|10
+./Data/p258/18.wav|16
+./Data/p243/155.wav|13
+./Data/p229/117.wav|2
+./Data/p231/127.wav|4
+./Data/p256/30.wav|15
+./Data/p240/90.wav|8
+./Data/p254/133.wav|14
+./Data/p240/51.wav|8
+./Data/p239/105.wav|7
+./Data/p226/85.wav|10
+./Data/p254/31.wav|14
+./Data/p258/57.wav|16
+./Data/p230/95.wav|3
+./Data/p226/52.wav|10
+./Data/p258/79.wav|16
+./Data/p273/49.wav|19
+./Data/p259/82.wav|17
+./Data/p227/126.wav|11
+./Data/p243/158.wav|13
+./Data/p273/130.wav|19
+./Data/p243/7.wav|13
+./Data/p228/137.wav|1
+./Data/p233/103.wav|5
+./Data/p254/37.wav|14
+./Data/p240/39.wav|8
+./Data/p225/28.wav|0
+./Data/p227/139.wav|11
+./Data/p244/130.wav|9
+./Data/p243/22.wav|13
+./Data/p228/69.wav|1
+./Data/p231/64.wav|4
+./Data/p233/129.wav|5
+./Data/p232/68.wav|12
+./Data/p231/87.wav|4
+./Data/p240/83.wav|8
+./Data/p232/55.wav|12
+./Data/p259/54.wav|17
+./Data/p270/125.wav|18
+./Data/p239/169.wav|7
+./Data/p229/39.wav|2
+./Data/p273/110.wav|19
+./Data/p233/93.wav|5
+./Data/p225/79.wav|0
+./Data/p230/140.wav|3
+./Data/p228/36.wav|1
+./Data/p230/4.wav|3
+./Data/p259/88.wav|17
+./Data/p243/86.wav|13
+./Data/p227/90.wav|11
+./Data/p254/83.wav|14
+./Data/p240/150.wav|8
+./Data/p232/118.wav|12
+./Data/p270/35.wav|18
+./Data/p231/126.wav|4
+./Data/p239/59.wav|7
+./Data/p243/47.wav|13
+./Data/p254/105.wav|14
+./Data/p258/72.wav|16
+./Data/p228/72.wav|1
+./Data/p270/136.wav|18
+./Data/p230/51.wav|3
+./Data/p227/67.wav|11
+./Data/p259/151.wav|17
+./Data/p232/66.wav|12
+./Data/p254/40.wav|14
+./Data/p273/12.wav|19
+./Data/p229/130.wav|2
+./Data/p270/156.wav|18
+./Data/p230/177.wav|3
+./Data/p270/169.wav|18
+./Data/p258/1.wav|16
+./Data/p229/103.wav|2
+./Data/p270/127.wav|18
+./Data/p226/72.wav|10
+./Data/p229/99.wav|2
+./Data/p232/8.wav|12
+./Data/p236/1.wav|6
+./Data/p230/85.wav|3
+./Data/p236/99.wav|6
+./Data/p231/139.wav|4
+./Data/p256/67.wav|15
+./Data/p240/38.wav|8
+./Data/p233/16.wav|5
+./Data/p243/13.wav|13
+./Data/p227/86.wav|11
+./Data/p233/110.wav|5
+./Data/p243/77.wav|13
+./Data/p227/77.wav|11
+./Data/p230/7.wav|3
+./Data/p270/175.wav|18
+./Data/p254/38.wav|14
+./Data/p227/71.wav|11
+./Data/p229/104.wav|2
+./Data/p231/101.wav|4
+./Data/p229/105.wav|2
+./Data/p225/49.wav|0
+./Data/p230/137.wav|3
+./Data/p226/42.wav|10
+./Data/p233/92.wav|5
+./Data/p243/58.wav|13
+./Data/p239/45.wav|7
+./Data/p233/135.wav|5
+./Data/p244/89.wav|9
+./Data/p243/166.wav|13
+./Data/p240/59.wav|8
+./Data/p254/86.wav|14
+./Data/p243/60.wav|13
+./Data/p227/19.wav|11
+./Data/p231/45.wav|4
+./Data/p227/140.wav|11
+./Data/p236/129.wav|6
+./Data/p240/67.wav|8
+./Data/p227/61.wav|11
+./Data/p228/77.wav|1
+./Data/p236/52.wav|6
+./Data/p258/33.wav|16
+./Data/p244/104.wav|9
+./Data/p259/84.wav|17
+./Data/p236/127.wav|6
+./Data/p228/150.wav|1
+./Data/p233/85.wav|5
+./Data/p270/147.wav|18
+./Data/p229/83.wav|2
+./Data/p226/68.wav|10
+./Data/p229/94.wav|2
+./Data/p270/46.wav|18
+./Data/p258/129.wav|16
+./Data/p270/191.wav|18
+./Data/p227/106.wav|11
+./Data/p239/136.wav|7
+./Data/p239/14.wav|7
+./Data/p239/71.wav|7
+./Data/p232/74.wav|12
+./Data/p225/75.wav|0
+./Data/p244/143.wav|9
+./Data/p259/173.wav|17
+./Data/p243/140.wav|13
+./Data/p273/48.wav|19
+./Data/p230/111.wav|3
+./Data/p240/94.wav|8
+./Data/p258/20.wav|16
+./Data/p227/52.wav|11
+./Data/p244/4.wav|9
+./Data/p227/109.wav|11
+./Data/p230/55.wav|3
+./Data/p232/92.wav|12
+./Data/p240/75.wav|8
+./Data/p229/82.wav|2
+./Data/p270/103.wav|18
+./Data/p254/87.wav|14
+./Data/p259/38.wav|17
+./Data/p240/147.wav|8
+./Data/p227/111.wav|11
+./Data/p228/2.wav|1
+./Data/p230/82.wav|3
+./Data/p239/33.wav|7
+./Data/p259/65.wav|17
+./Data/p273/102.wav|19
+./Data/p227/116.wav|11
+./Data/p258/61.wav|16
+./Data/p228/68.wav|1
+./Data/p244/116.wav|9
+./Data/p240/9.wav|8
+./Data/p273/64.wav|19
+./Data/p273/9.wav|19
+./Data/p230/8.wav|3
+./Data/p230/172.wav|3
+./Data/p243/32.wav|13
+./Data/p258/117.wav|16
+./Data/p236/43.wav|6
+./Data/p243/29.wav|13
+./Data/p231/86.wav|4
+./Data/p231/6.wav|4
+./Data/p236/166.wav|6
+./Data/p270/174.wav|18
+./Data/p229/123.wav|2
+./Data/p243/132.wav|13
+./Data/p228/91.wav|1
+./Data/p273/100.wav|19
+./Data/p243/61.wav|13
+./Data/p233/14.wav|5
+./Data/p256/5.wav|15
+./Data/p228/135.wav|1
+./Data/p254/21.wav|14
+./Data/p230/96.wav|3
+./Data/p240/142.wav|8
+./Data/p259/63.wav|17
+./Data/p243/37.wav|13
+./Data/p228/136.wav|1
+./Data/p254/126.wav|14
+./Data/p225/51.wav|0
+./Data/p258/9.wav|16
+./Data/p270/85.wav|18
+./Data/p228/149.wav|1
+./Data/p236/152.wav|6
+./Data/p259/124.wav|17
+./Data/p244/1.wav|9
+./Data/p259/104.wav|17
+./Data/p227/64.wav|11
+./Data/p230/70.wav|3
+./Data/p256/122.wav|15
+./Data/p258/30.wav|16
+./Data/p244/54.wav|9
+./Data/p270/198.wav|18
+./Data/p258/15.wav|16
+./Data/p254/52.wav|14
+./Data/p228/85.wav|1
+./Data/p230/1.wav|3
+./Data/p230/71.wav|3
+./Data/p259/147.wav|17
+./Data/p243/68.wav|13
+./Data/p226/79.wav|10
+./Data/p243/123.wav|13
+./Data/p229/85.wav|2
+./Data/p270/5.wav|18
+./Data/p226/12.wav|10
+./Data/p231/82.wav|4
+./Data/p230/120.wav|3
+./Data/p225/31.wav|0
+./Data/p236/130.wav|6
+./Data/p239/111.wav|7
+./Data/p230/60.wav|3
+./Data/p232/121.wav|12
+./Data/p259/27.wav|17
+./Data/p228/65.wav|1
+./Data/p231/92.wav|4
+./Data/p236/160.wav|6
+./Data/p258/145.wav|16
+./Data/p231/20.wav|4
+./Data/p226/47.wav|10
+./Data/p258/110.wav|16
+./Data/p231/93.wav|4
+./Data/p270/30.wav|18
+./Data/p227/97.wav|11
+./Data/p231/31.wav|4
+./Data/p273/55.wav|19
+./Data/p239/12.wav|7
+./Data/p240/63.wav|8
+./Data/p254/57.wav|14
+./Data/p244/35.wav|9
+./Data/p239/127.wav|7
+./Data/p226/130.wav|10
+./Data/p225/83.wav|0
+./Data/p259/56.wav|17
+./Data/p273/85.wav|19
+./Data/p244/129.wav|9
+./Data/p273/83.wav|19
+./Data/p270/45.wav|18
+./Data/p273/23.wav|19
+./Data/p233/15.wav|5
+./Data/p256/34.wav|15
+./Data/p273/38.wav|19
+./Data/p244/73.wav|9
+./Data/p243/43.wav|13
+./Data/p270/26.wav|18
+./Data/p239/87.wav|7
+./Data/p233/120.wav|5
+./Data/p236/14.wav|6
+./Data/p227/5.wav|11
+./Data/p258/104.wav|16
+./Data/p227/45.wav|11
+./Data/p229/35.wav|2
+./Data/p273/36.wav|19
+./Data/p240/82.wav|8
+./Data/p254/20.wav|14
+./Data/p232/128.wav|12
+./Data/p254/47.wav|14
+./Data/p270/102.wav|18
+./Data/p230/41.wav|3
+./Data/p225/23.wav|0
+./Data/p258/38.wav|16
+./Data/p233/137.wav|5
+./Data/p254/94.wav|14
+./Data/p244/122.wav|9
+./Data/p229/51.wav|2
+./Data/p244/96.wav|9
+./Data/p273/119.wav|19
+./Data/p227/80.wav|11
+./Data/p225/1.wav|0
+./Data/p244/80.wav|9
+./Data/p233/108.wav|5
+./Data/p259/119.wav|17
+./Data/p226/8.wav|10
+./Data/p228/60.wav|1
+./Data/p233/71.wav|5
+./Data/p243/168.wav|13
+./Data/p226/136.wav|10
+./Data/p236/110.wav|6
+./Data/p228/23.wav|1
+./Data/p244/137.wav|9
+./Data/p240/33.wav|8
+./Data/p256/13.wav|15
+./Data/p243/6.wav|13
+./Data/p227/30.wav|11
+./Data/p244/28.wav|9
+./Data/p228/24.wav|1
+./Data/p243/147.wav|13
+./Data/p231/39.wav|4
+./Data/p254/93.wav|14
+./Data/p256/29.wav|15
+./Data/p258/119.wav|16
+./Data/p240/69.wav|8
+./Data/p232/102.wav|12
+./Data/p233/101.wav|5
+./Data/p270/29.wav|18
+./Data/p233/47.wav|5
+./Data/p259/17.wav|17
+./Data/p228/94.wav|1
+./Data/p231/21.wav|4
+./Data/p230/143.wav|3
+./Data/p270/204.wav|18
+./Data/p229/71.wav|2
+./Data/p232/13.wav|12
+./Data/p227/127.wav|11
+./Data/p258/105.wav|16
+./Data/p227/112.wav|11
+./Data/p270/59.wav|18
+./Data/p232/47.wav|12
+./Data/p236/112.wav|6
+./Data/p273/115.wav|19
+./Data/p236/20.wav|6
+./Data/p258/115.wav|16
+./Data/p256/24.wav|15
+./Data/p273/76.wav|19
+./Data/p231/3.wav|4
+./Data/p225/56.wav|0
+./Data/p259/150.wav|17
+./Data/p227/70.wav|11
+./Data/p230/2.wav|3
+./Data/p226/22.wav|10
+./Data/p243/127.wav|13
+./Data/p258/31.wav|16
+./Data/p233/89.wav|5
+./Data/p259/64.wav|17
+./Data/p259/96.wav|17
+./Data/p227/57.wav|11
+./Data/p232/132.wav|12
+./Data/p236/46.wav|6
+./Data/p232/70.wav|12
+./Data/p273/138.wav|19
+./Data/p244/99.wav|9
+./Data/p240/18.wav|8
+./Data/p243/145.wav|13
+./Data/p230/125.wav|3
+./Data/p243/49.wav|13
+./Data/p256/71.wav|15
+./Data/p258/133.wav|16
+./Data/p236/50.wav|6
+./Data/p270/122.wav|18
+./Data/p230/25.wav|3
+./Data/p236/124.wav|6
+./Data/p273/35.wav|19
+./Data/p258/98.wav|16
+./Data/p270/51.wav|18
+./Data/p229/121.wav|2
+./Data/p270/15.wav|18
+./Data/p270/193.wav|18
+./Data/p239/138.wav|7
+./Data/p273/108.wav|19
+./Data/p254/139.wav|14
+./Data/p256/23.wav|15
+./Data/p243/84.wav|13
+./Data/p273/93.wav|19
+./Data/p240/21.wav|8
+./Data/p240/109.wav|8
+./Data/p230/76.wav|3
+./Data/p232/61.wav|12
+./Data/p233/48.wav|5
+./Data/p233/133.wav|5
+./Data/p239/28.wav|7
+./Data/p230/149.wav|3
+./Data/p240/46.wav|8
+./Data/p243/74.wav|13
+./Data/p256/88.wav|15
+./Data/p228/61.wav|1
+./Data/p236/87.wav|6
+./Data/p236/2.wav|6
+./Data/p239/159.wav|7
+./Data/p231/44.wav|4
+./Data/p236/161.wav|6
+./Data/p256/19.wav|15
+./Data/p258/5.wav|16
+./Data/p243/83.wav|13
+./Data/p228/30.wav|1
+./Data/p226/65.wav|10
+./Data/p258/127.wav|16
+./Data/p254/60.wav|14
+./Data/p273/97.wav|19
+./Data/p228/50.wav|1
+./Data/p243/135.wav|13
+./Data/p228/111.wav|1
+./Data/p229/7.wav|2
+./Data/p229/3.wav|2
+./Data/p258/11.wav|16
+./Data/p258/6.wav|16
+./Data/p259/148.wav|17
+./Data/p232/30.wav|12
+./Data/p256/70.wav|15
+./Data/p259/160.wav|17
+./Data/p239/113.wav|7
+./Data/p229/109.wav|2
+./Data/p231/29.wav|4
+./Data/p258/25.wav|16
+./Data/p239/148.wav|7
+./Data/p239/78.wav|7
+./Data/p239/107.wav|7
+./Data/p239/99.wav|7
+./Data/p259/32.wav|17
+./Data/p239/11.wav|7
+./Data/p226/139.wav|10
+./Data/p229/88.wav|2
+./Data/p239/9.wav|7
+./Data/p229/26.wav|2
+./Data/p229/128.wav|2
+./Data/p244/119.wav|9
+./Data/p259/76.wav|17
+./Data/p239/129.wav|7
+./Data/p256/115.wav|15
+./Data/p230/102.wav|3
+./Data/p236/42.wav|6
+./Data/p225/16.wav|0
+./Data/p240/140.wav|8
+./Data/p226/36.wav|10
+./Data/p226/78.wav|10
+./Data/p225/37.wav|0
+./Data/p256/51.wav|15
+./Data/p254/112.wav|14
+./Data/p236/24.wav|6
+./Data/p228/164.wav|1
+./Data/p225/63.wav|0
+./Data/p259/25.wav|17
+./Data/p226/133.wav|10
+./Data/p244/107.wav|9
+./Data/p270/32.wav|18
+./Data/p270/56.wav|18
+./Data/p226/62.wav|10
+./Data/p228/95.wav|1
+./Data/p259/112.wav|17
+./Data/p229/114.wav|2
+./Data/p273/16.wav|19
+./Data/p236/60.wav|6
+./Data/p256/128.wav|15
+./Data/p273/144.wav|19
+./Data/p236/142.wav|6
+./Data/p231/130.wav|4
+./Data/p258/7.wav|16
+./Data/p225/96.wav|0
+./Data/p225/91.wav|0
+./Data/p232/115.wav|12
+./Data/p270/157.wav|18
+./Data/p273/104.wav|19
+./Data/p233/136.wav|5
+./Data/p240/78.wav|8
+./Data/p243/17.wav|13
+./Data/p240/62.wav|8
+./Data/p243/48.wav|13
+./Data/p232/29.wav|12
+./Data/p244/42.wav|9
+./Data/p259/93.wav|17
+./Data/p240/136.wav|8
+./Data/p226/117.wav|10
+./Data/p239/131.wav|7
+./Data/p270/54.wav|18
+./Data/p228/98.wav|1
+./Data/p270/166.wav|18
+./Data/p240/145.wav|8
+./Data/p270/14.wav|18
+./Data/p240/43.wav|8
+./Data/p258/107.wav|16
+./Data/p270/167.wav|18
+./Data/p259/62.wav|17
+./Data/p231/65.wav|4
+./Data/p240/5.wav|8
+./Data/p230/50.wav|3
+./Data/p256/3.wav|15
+./Data/p231/27.wav|4
+./Data/p229/27.wav|2
+./Data/p240/96.wav|8
+./Data/p225/82.wav|0
+./Data/p236/125.wav|6
+./Data/p254/71.wav|14
+./Data/p244/138.wav|9
+./Data/p254/89.wav|14
+./Data/p236/91.wav|6
+./Data/p244/38.wav|9
+./Data/p232/116.wav|12
+./Data/p270/11.wav|18
+./Data/p236/162.wav|6
+./Data/p228/127.wav|1
+./Data/p227/96.wav|11
+./Data/p226/98.wav|10
+./Data/p270/155.wav|18
+./Data/p236/143.wav|6
+./Data/p254/77.wav|14
+./Data/p273/26.wav|19
+./Data/p270/1.wav|18
+./Data/p273/51.wav|19
+./Data/p243/21.wav|13
+./Data/p231/68.wav|4
+./Data/p230/169.wav|3
+./Data/p226/56.wav|10
+./Data/p233/79.wav|5
+./Data/p273/58.wav|19
+./Data/p231/70.wav|4
+./Data/p228/42.wav|1
+./Data/p273/141.wav|19
+./Data/p256/91.wav|15
+./Data/p259/70.wav|17
+./Data/p236/69.wav|6
+./Data/p228/16.wav|1
+./Data/p270/44.wav|18
+./Data/p230/16.wav|3
+./Data/p244/97.wav|9
+./Data/p254/42.wav|14
+./Data/p225/53.wav|0
+./Data/p230/59.wav|3
+./Data/p226/140.wav|10
+./Data/p232/7.wav|12
+./Data/p229/47.wav|2
+./Data/p231/13.wav|4
+./Data/p258/49.wav|16
+./Data/p226/92.wav|10
+./Data/p227/81.wav|11
+./Data/p230/162.wav|3
+./Data/p240/20.wav|8
+./Data/p236/88.wav|6
+./Data/p236/79.wav|6
+./Data/p236/39.wav|6
+./Data/p233/97.wav|5
+./Data/p232/96.wav|12
+./Data/p273/82.wav|19
+./Data/p230/123.wav|3
+./Data/p230/126.wav|3
+./Data/p258/75.wav|16
+./Data/p232/78.wav|12
+./Data/p231/48.wav|4
+./Data/p244/110.wav|9
+./Data/p258/71.wav|16
+./Data/p256/116.wav|15
+./Data/p231/63.wav|4
+./Data/p258/26.wav|16
+./Data/p243/18.wav|13
+./Data/p243/55.wav|13
+./Data/p270/162.wav|18
+./Data/p244/33.wav|9
+./Data/p226/77.wav|10
+./Data/p270/98.wav|18
+./Data/p230/121.wav|3
+./Data/p226/94.wav|10
+./Data/p270/84.wav|18
+./Data/p270/53.wav|18
+./Data/p243/124.wav|13
+./Data/p228/86.wav|1
+./Data/p229/25.wav|2
+./Data/p230/68.wav|3
+./Data/p240/29.wav|8
+./Data/p236/63.wav|6
+./Data/p270/129.wav|18
+./Data/p229/79.wav|2
+./Data/p233/102.wav|5
+./Data/p228/34.wav|1
+./Data/p230/163.wav|3
+./Data/p228/64.wav|1
+./Data/p233/115.wav|5
+./Data/p243/88.wav|13
+./Data/p244/14.wav|9
+./Data/p243/174.wav|13
+./Data/p229/74.wav|2
+./Data/p258/27.wav|16
+./Data/p259/86.wav|17
901
+ ./Data/p273/92.wav|19
902
+ ./Data/p239/81.wav|7
903
+ ./Data/p254/109.wav|14
904
+ ./Data/p232/103.wav|12
905
+ ./Data/p230/21.wav|3
906
+ ./Data/p226/10.wav|10
907
+ ./Data/p240/2.wav|8
908
+ ./Data/p256/102.wav|15
909
+ ./Data/p240/127.wav|8
910
+ ./Data/p259/138.wav|17
911
+ ./Data/p254/123.wav|14
912
+ ./Data/p270/92.wav|18
913
+ ./Data/p254/30.wav|14
914
+ ./Data/p273/86.wav|19
915
+ ./Data/p244/106.wav|9
916
+ ./Data/p226/107.wav|10
917
+ ./Data/p240/4.wav|8
918
+ ./Data/p228/97.wav|1
919
+ ./Data/p258/32.wav|16
920
+ ./Data/p232/79.wav|12
921
+ ./Data/p259/154.wav|17
922
+ ./Data/p231/19.wav|4
923
+ ./Data/p259/91.wav|17
924
+ ./Data/p244/45.wav|9
925
+ ./Data/p240/97.wav|8
926
+ ./Data/p259/45.wav|17
927
+ ./Data/p270/197.wav|18
928
+ ./Data/p229/1.wav|2
929
+ ./Data/p259/11.wav|17
930
+ ./Data/p228/29.wav|1
931
+ ./Data/p230/72.wav|3
932
+ ./Data/p228/145.wav|1
933
+ ./Data/p244/71.wav|9
934
+ ./Data/p230/66.wav|3
935
+ ./Data/p226/51.wav|10
936
+ ./Data/p270/10.wav|18
937
+ ./Data/p254/96.wav|14
938
+ ./Data/p256/64.wav|15
939
+ ./Data/p243/65.wav|13
940
+ ./Data/p228/148.wav|1
941
+ ./Data/p243/41.wav|13
942
+ ./Data/p228/57.wav|1
943
+ ./Data/p239/92.wav|7
944
+ ./Data/p256/124.wav|15
945
+ ./Data/p259/116.wav|17
946
+ ./Data/p233/70.wav|5
947
+ ./Data/p227/1.wav|11
948
+ ./Data/p231/59.wav|4
949
+ ./Data/p243/30.wav|13
950
+ ./Data/p254/41.wav|14
951
+ ./Data/p228/123.wav|1
952
+ ./Data/p239/20.wav|7
953
+ ./Data/p229/77.wav|2
954
+ ./Data/p239/132.wav|7
955
+ ./Data/p243/144.wav|13
956
+ ./Data/p227/137.wav|11
957
+ ./Data/p239/134.wav|7
958
+ ./Data/p240/108.wav|8
959
+ ./Data/p256/118.wav|15
960
+ ./Data/p256/126.wav|15
961
+ ./Data/p226/110.wav|10
962
+ ./Data/p236/29.wav|6
963
+ ./Data/p236/74.wav|6
964
+ ./Data/p231/77.wav|4
965
+ ./Data/p256/45.wav|15
966
+ ./Data/p256/39.wav|15
967
+ ./Data/p228/66.wav|1
968
+ ./Data/p232/35.wav|12
969
+ ./Data/p273/37.wav|19
970
+ ./Data/p240/135.wav|8
971
+ ./Data/p236/73.wav|6
972
+ ./Data/p256/38.wav|15
973
+ ./Data/p243/109.wav|13
974
+ ./Data/p227/33.wav|11
975
+ ./Data/p259/87.wav|17
976
+ ./Data/p225/55.wav|0
977
+ ./Data/p243/138.wav|13
978
+ ./Data/p227/3.wav|11
979
+ ./Data/p254/74.wav|14
980
+ ./Data/p254/137.wav|14
981
+ ./Data/p228/43.wav|1
982
+ ./Data/p270/71.wav|18
983
+ ./Data/p243/56.wav|13
984
+ ./Data/p228/119.wav|1
985
+ ./Data/p244/136.wav|9
986
+ ./Data/p259/94.wav|17
987
+ ./Data/p259/120.wav|17
988
+ ./Data/p230/74.wav|3
989
+ ./Data/p227/100.wav|11
990
+ ./Data/p228/143.wav|1
991
+ ./Data/p225/98.wav|0
992
+ ./Data/p256/2.wav|15
993
+ ./Data/p273/146.wav|19
994
+ ./Data/p230/99.wav|3
995
+ ./Data/p243/20.wav|13
996
+ ./Data/p258/96.wav|16
997
+ ./Data/p226/87.wav|10
998
+ ./Data/p240/64.wav|8
999
+ ./Data/p243/114.wav|13
1000
+ ./Data/p273/77.wav|19
1001
+ ./Data/p256/48.wav|15
1002
+ ./Data/p258/120.wav|16
1003
+ ./Data/p240/111.wav|8
1004
+ ./Data/p226/73.wav|10
1005
+ ./Data/p229/15.wav|2
1006
+ ./Data/p270/165.wav|18
1007
+ ./Data/p226/124.wav|10
1008
+ ./Data/p254/53.wav|14
1009
+ ./Data/p239/97.wav|7
1010
+ ./Data/p236/71.wav|6
1011
+ ./Data/p243/66.wav|13
1012
+ ./Data/p230/26.wav|3
1013
+ ./Data/p233/17.wav|5
1014
+ ./Data/p273/143.wav|19
1015
+ ./Data/p229/6.wav|2
1016
+ ./Data/p258/41.wav|16
1017
+ ./Data/p240/10.wav|8
1018
+ ./Data/p244/115.wav|9
1019
+ ./Data/p256/8.wav|15
1020
+ ./Data/p243/133.wav|13
1021
+ ./Data/p236/145.wav|6
1022
+ ./Data/p240/110.wav|8
1023
+ ./Data/p270/100.wav|18
1024
+ ./Data/p230/167.wav|3
1025
+ ./Data/p270/27.wav|18
1026
+ ./Data/p243/149.wav|13
1027
+ ./Data/p228/139.wav|1
1028
+ ./Data/p256/96.wav|15
1029
+ ./Data/p230/61.wav|3
1030
+ ./Data/p258/42.wav|16
1031
+ ./Data/p236/94.wav|6
1032
+ ./Data/p230/42.wav|3
1033
+ ./Data/p270/144.wav|18
1034
+ ./Data/p228/141.wav|1
1035
+ ./Data/p232/4.wav|12
1036
+ ./Data/p229/8.wav|2
1037
+ ./Data/p230/39.wav|3
1038
+ ./Data/p256/47.wav|15
1039
+ ./Data/p229/54.wav|2
1040
+ ./Data/p239/168.wav|7
1041
+ ./Data/p227/7.wav|11
1042
+ ./Data/p227/93.wav|11
1043
+ ./Data/p240/13.wav|8
1044
+ ./Data/p270/172.wav|18
1045
+ ./Data/p243/45.wav|13
1046
+ ./Data/p259/30.wav|17
1047
+ ./Data/p270/116.wav|18
1048
+ ./Data/p240/48.wav|8
1049
+ ./Data/p227/24.wav|11
1050
+ ./Data/p229/80.wav|2
1051
+ ./Data/p233/2.wav|5
1052
+ ./Data/p228/87.wav|1
1053
+ ./Data/p240/105.wav|8
1054
+ ./Data/p239/60.wav|7
1055
+ ./Data/p244/39.wav|9
1056
+ ./Data/p240/124.wav|8
1057
+ ./Data/p259/145.wav|17
1058
+ ./Data/p227/76.wav|11
1059
+ ./Data/p254/58.wav|14
1060
+ ./Data/p230/156.wav|3
1061
+ ./Data/p229/42.wav|2
1062
+ ./Data/p273/68.wav|19
1063
+ ./Data/p228/146.wav|1
1064
+ ./Data/p236/165.wav|6
1065
+ ./Data/p229/34.wav|2
1066
+ ./Data/p239/123.wav|7
1067
+ ./Data/p273/121.wav|19
1068
+ ./Data/p270/176.wav|18
1069
+ ./Data/p258/74.wav|16
1070
+ ./Data/p254/84.wav|14
1071
+ ./Data/p259/157.wav|17
1072
+ ./Data/p258/130.wav|16
1073
+ ./Data/p244/18.wav|9
1074
+ ./Data/p229/59.wav|2
1075
+ ./Data/p229/10.wav|2
1076
+ ./Data/p273/89.wav|19
1077
+ ./Data/p259/23.wav|17
1078
+ ./Data/p256/6.wav|15
1079
+ ./Data/p227/8.wav|11
1080
+ ./Data/p258/59.wav|16
1081
+ ./Data/p232/91.wav|12
1082
+ ./Data/p258/137.wav|16
1083
+ ./Data/p258/122.wav|16
1084
+ ./Data/p230/89.wav|3
1085
+ ./Data/p232/58.wav|12
1086
+ ./Data/p231/11.wav|4
1087
+ ./Data/p273/120.wav|19
1088
+ ./Data/p232/39.wav|12
1089
+ ./Data/p236/44.wav|6
1090
+ ./Data/p254/12.wav|14
1091
+ ./Data/p270/95.wav|18
1092
+ ./Data/p270/153.wav|18
1093
+ ./Data/p230/164.wav|3
1094
+ ./Data/p225/30.wav|0
1095
+ ./Data/p240/126.wav|8
1096
+ ./Data/p230/54.wav|3
1097
+ ./Data/p270/87.wav|18
1098
+ ./Data/p225/14.wav|0
1099
+ ./Data/p231/145.wav|4
1100
+ ./Data/p254/81.wav|14
1101
+ ./Data/p244/55.wav|9
1102
+ ./Data/p259/3.wav|17
1103
+ ./Data/p273/50.wav|19
1104
+ ./Data/p228/84.wav|1
1105
+ ./Data/p244/3.wav|9
1106
+ ./Data/p239/55.wav|7
1107
+ ./Data/p232/5.wav|12
1108
+ ./Data/p229/111.wav|2
1109
+ ./Data/p236/141.wav|6
1110
+ ./Data/p233/54.wav|5
1111
+ ./Data/p240/88.wav|8
1112
+ ./Data/p236/16.wav|6
1113
+ ./Data/p239/154.wav|7
1114
+ ./Data/p240/72.wav|8
1115
+ ./Data/p236/75.wav|6
1116
+ ./Data/p230/166.wav|3
1117
+ ./Data/p231/122.wav|4
1118
+ ./Data/p273/24.wav|19
1119
+ ./Data/p233/30.wav|5
1120
+ ./Data/p226/9.wav|10
1121
+ ./Data/p240/65.wav|8
1122
+ ./Data/p228/80.wav|1
1123
+ ./Data/p232/46.wav|12
1124
+ ./Data/p239/109.wav|7
1125
+ ./Data/p231/67.wav|4
1126
+ ./Data/p233/67.wav|5
1127
+ ./Data/p228/162.wav|1
1128
+ ./Data/p229/134.wav|2
1129
+ ./Data/p239/27.wav|7
1130
+ ./Data/p227/145.wav|11
1131
+ ./Data/p225/67.wav|0
1132
+ ./Data/p232/99.wav|12
1133
+ ./Data/p270/140.wav|18
1134
+ ./Data/p225/70.wav|0
1135
+ ./Data/p259/21.wav|17
1136
+ ./Data/p230/28.wav|3
1137
+ ./Data/p230/80.wav|3
1138
+ ./Data/p243/34.wav|13
1139
+ ./Data/p254/61.wav|14
1140
+ ./Data/p236/58.wav|6
1141
+ ./Data/p239/21.wav|7
1142
+ ./Data/p230/91.wav|3
1143
+ ./Data/p256/68.wav|15
1144
+ ./Data/p225/21.wav|0
1145
+ ./Data/p233/49.wav|5
1146
+ ./Data/p236/114.wav|6
1147
+ ./Data/p228/134.wav|1
1148
+ ./Data/p231/114.wav|4
1149
+ ./Data/p239/18.wav|7
1150
+ ./Data/p227/132.wav|11
1151
+ ./Data/p236/115.wav|6
1152
+ ./Data/p254/99.wav|14
1153
+ ./Data/p243/143.wav|13
1154
+ ./Data/p270/49.wav|18
1155
+ ./Data/p239/152.wav|7
1156
+ ./Data/p232/120.wav|12
1157
+ ./Data/p256/25.wav|15
1158
+ ./Data/p229/116.wav|2
1159
+ ./Data/p239/130.wav|7
1160
+ ./Data/p254/124.wav|14
1161
+ ./Data/p270/118.wav|18
1162
+ ./Data/p244/46.wav|9
1163
+ ./Data/p231/105.wav|4
1164
+ ./Data/p231/115.wav|4
1165
+ ./Data/p239/144.wav|7
1166
+ ./Data/p226/39.wav|10
1167
+ ./Data/p233/78.wav|5
1168
+ ./Data/p227/53.wav|11
1169
+ ./Data/p239/146.wav|7
1170
+ ./Data/p256/77.wav|15
1171
+ ./Data/p259/37.wav|17
1172
+ ./Data/p258/36.wav|16
1173
+ ./Data/p254/13.wav|14
1174
+ ./Data/p229/69.wav|2
1175
+ ./Data/p231/90.wav|4
1176
+ ./Data/p226/84.wav|10
1177
+ ./Data/p259/48.wav|17
1178
+ ./Data/p233/88.wav|5
1179
+ ./Data/p228/153.wav|1
1180
+ ./Data/p254/43.wav|14
1181
+ ./Data/p231/97.wav|4
1182
+ ./Data/p273/44.wav|19
1183
+ ./Data/p233/27.wav|5
1184
+ ./Data/p232/90.wav|12
1185
+ ./Data/p254/36.wav|14
1186
+ ./Data/p232/27.wav|12
1187
+ ./Data/p230/113.wav|3
1188
+ ./Data/p254/130.wav|14
1189
+ ./Data/p254/62.wav|14
1190
+ ./Data/p239/118.wav|7
1191
+ ./Data/p230/109.wav|3
1192
+ ./Data/p227/102.wav|11
1193
+ ./Data/p226/48.wav|10
1194
+ ./Data/p230/175.wav|3
1195
+ ./Data/p231/60.wav|4
1196
+ ./Data/p259/105.wav|17
1197
+ ./Data/p233/28.wav|5
1198
+ ./Data/p229/36.wav|2
1199
+ ./Data/p256/111.wav|15
1200
+ ./Data/p230/133.wav|3
1201
+ ./Data/p233/125.wav|5
1202
+ ./Data/p228/59.wav|1
1203
+ ./Data/p239/58.wav|7
1204
+ ./Data/p273/116.wav|19
1205
+ ./Data/p230/97.wav|3
1206
+ ./Data/p273/88.wav|19
1207
+ ./Data/p228/93.wav|1
1208
+ ./Data/p259/81.wav|17
1209
+ ./Data/p228/144.wav|1
1210
+ ./Data/p230/32.wav|3
1211
+ ./Data/p240/6.wav|8
1212
+ ./Data/p230/17.wav|3
1213
+ ./Data/p259/98.wav|17
1214
+ ./Data/p227/75.wav|11
1215
+ ./Data/p231/26.wav|4
1216
+ ./Data/p231/103.wav|4
1217
+ ./Data/p236/67.wav|6
1218
+ ./Data/p270/107.wav|18
1219
+ ./Data/p226/24.wav|10
1220
+ ./Data/p273/34.wav|19
1221
+ ./Data/p236/90.wav|6
1222
+ ./Data/p256/14.wav|15
1223
+ ./Data/p236/140.wav|6
1224
+ ./Data/p273/39.wav|19
1225
+ ./Data/p270/163.wav|18
1226
+ ./Data/p239/77.wav|7
1227
+ ./Data/p230/148.wav|3
1228
+ ./Data/p273/113.wav|19
1229
+ ./Data/p254/140.wav|14
1230
+ ./Data/p239/46.wav|7
1231
+ ./Data/p243/51.wav|13
1232
+ ./Data/p231/10.wav|4
1233
+ ./Data/p231/104.wav|4
1234
+ ./Data/p270/132.wav|18
1235
+ ./Data/p228/108.wav|1
1236
+ ./Data/p233/39.wav|5
1237
+ ./Data/p259/130.wav|17
1238
+ ./Data/p239/85.wav|7
1239
+ ./Data/p240/37.wav|8
1240
+ ./Data/p270/58.wav|18
1241
+ ./Data/p243/78.wav|13
1242
+ ./Data/p273/61.wav|19
1243
+ ./Data/p230/144.wav|3
1244
+ ./Data/p233/21.wav|5
1245
+ ./Data/p225/35.wav|0
1246
+ ./Data/p228/158.wav|1
1247
+ ./Data/p259/26.wav|17
1248
+ ./Data/p230/33.wav|3
1249
+ ./Data/p258/128.wav|16
1250
+ ./Data/p233/61.wav|5
1251
+ ./Data/p225/97.wav|0
1252
+ ./Data/p259/143.wav|17
1253
+ ./Data/p226/50.wav|10
1254
+ ./Data/p243/71.wav|13
1255
+ ./Data/p230/22.wav|3
1256
+ ./Data/p226/58.wav|10
1257
+ ./Data/p239/110.wav|7
1258
+ ./Data/p258/136.wav|16
1259
+ ./Data/p226/102.wav|10
1260
+ ./Data/p258/88.wav|16
1261
+ ./Data/p233/94.wav|5
1262
+ ./Data/p258/77.wav|16
1263
+ ./Data/p231/2.wav|4
1264
+ ./Data/p273/40.wav|19
1265
+ ./Data/p239/133.wav|7
1266
+ ./Data/p270/33.wav|18
1267
+ ./Data/p254/132.wav|14
1268
+ ./Data/p270/99.wav|18
1269
+ ./Data/p227/84.wav|11
1270
+ ./Data/p226/132.wav|10
1271
+ ./Data/p239/165.wav|7
1272
+ ./Data/p270/23.wav|18
1273
+ ./Data/p270/41.wav|18
1274
+ ./Data/p236/28.wav|6
1275
+ ./Data/p231/76.wav|4
1276
+ ./Data/p231/28.wav|4
1277
+ ./Data/p236/56.wav|6
1278
+ ./Data/p236/146.wav|6
1279
+ ./Data/p244/125.wav|9
1280
+ ./Data/p256/55.wav|15
1281
+ ./Data/p232/40.wav|12
1282
+ ./Data/p239/64.wav|7
1283
+ ./Data/p240/130.wav|8
1284
+ ./Data/p239/41.wav|7
1285
+ ./Data/p240/138.wav|8
1286
+ ./Data/p226/118.wav|10
1287
+ ./Data/p228/62.wav|1
1288
+ ./Data/p244/16.wav|9
1289
+ ./Data/p244/20.wav|9
1290
+ ./Data/p226/125.wav|10
1291
+ ./Data/p270/74.wav|18
1292
+ ./Data/p229/129.wav|2
1293
+ ./Data/p227/142.wav|11
1294
+ ./Data/p228/38.wav|1
1295
+ ./Data/p258/97.wav|16
1296
+ ./Data/p233/77.wav|5
1297
+ ./Data/p232/84.wav|12
1298
+ ./Data/p229/17.wav|2
1299
+ ./Data/p227/18.wav|11
1300
+ ./Data/p239/94.wav|7
1301
+ ./Data/p239/1.wav|7
1302
+ ./Data/p225/52.wav|0
1303
+ ./Data/p270/82.wav|18
1304
+ ./Data/p232/53.wav|12
1305
+ ./Data/p258/51.wav|16
1306
+ ./Data/p258/132.wav|16
1307
+ ./Data/p229/66.wav|2
1308
+ ./Data/p270/19.wav|18
1309
+ ./Data/p227/88.wav|11
1310
+ ./Data/p231/96.wav|4
1311
+ ./Data/p239/72.wav|7
1312
+ ./Data/p225/73.wav|0
1313
+ ./Data/p240/146.wav|8
1314
+ ./Data/p236/97.wav|6
1315
+ ./Data/p227/43.wav|11
1316
+ ./Data/p232/119.wav|12
1317
+ ./Data/p231/53.wav|4
1318
+ ./Data/p239/42.wav|7
1319
+ ./Data/p259/115.wav|17
1320
+ ./Data/p244/105.wav|9
1321
+ ./Data/p256/33.wav|15
1322
+ ./Data/p231/100.wav|4
1323
+ ./Data/p240/8.wav|8
1324
+ ./Data/p256/57.wav|15
1325
+ ./Data/p227/130.wav|11
1326
+ ./Data/p226/30.wav|10
1327
+ ./Data/p233/80.wav|5
1328
+ ./Data/p232/17.wav|12
1329
+ ./Data/p259/167.wav|17
1330
+ ./Data/p227/122.wav|11
1331
+ ./Data/p239/128.wav|7
1332
+ ./Data/p231/133.wav|4
1333
+ ./Data/p273/129.wav|19
1334
+ ./Data/p243/15.wav|13
1335
+ ./Data/p243/44.wav|13
1336
+ ./Data/p259/161.wav|17
1337
+ ./Data/p243/94.wav|13
1338
+ ./Data/p244/62.wav|9
1339
+ ./Data/p270/180.wav|18
1340
+ ./Data/p258/126.wav|16
1341
+ ./Data/p229/137.wav|2
1342
+ ./Data/p233/105.wav|5
1343
+ ./Data/p244/79.wav|9
1344
+ ./Data/p254/46.wav|14
1345
+ ./Data/p240/95.wav|8
1346
+ ./Data/p259/135.wav|17
1347
+ ./Data/p259/52.wav|17
1348
+ ./Data/p229/68.wav|2
1349
+ ./Data/p254/33.wav|14
1350
+ ./Data/p230/83.wav|3
1351
+ ./Data/p256/89.wav|15
1352
+ ./Data/p254/90.wav|14
1353
+ ./Data/p270/182.wav|18
1354
+ ./Data/p226/18.wav|10
1355
+ ./Data/p270/145.wav|18
1356
+ ./Data/p231/128.wav|4
1357
+ ./Data/p239/140.wav|7
1358
+ ./Data/p228/100.wav|1
1359
+ ./Data/p227/49.wav|11
1360
+ ./Data/p240/53.wav|8
1361
+ ./Data/p258/108.wav|16
1362
+ ./Data/p226/83.wav|10
1363
+ ./Data/p270/106.wav|18
1364
+ ./Data/p243/11.wav|13
1365
+ ./Data/p229/12.wav|2
1366
+ ./Data/p228/7.wav|1
1367
+ ./Data/p243/8.wav|13
1368
+ ./Data/p227/128.wav|11
1369
+ ./Data/p230/118.wav|3
1370
+ ./Data/p227/78.wav|11
1371
+ ./Data/p244/30.wav|9
1372
+ ./Data/p231/98.wav|4
1373
+ ./Data/p230/38.wav|3
1374
+ ./Data/p244/47.wav|9
1375
+ ./Data/p270/138.wav|18
1376
+ ./Data/p259/109.wav|17
1377
+ ./Data/p270/112.wav|18
1378
+ ./Data/p227/82.wav|11
1379
+ ./Data/p228/161.wav|1
1380
+ ./Data/p273/127.wav|19
1381
+ ./Data/p232/72.wav|12
1382
+ ./Data/p227/95.wav|11
1383
+ ./Data/p236/105.wav|6
1384
+ ./Data/p239/52.wav|7
1385
+ ./Data/p273/135.wav|19
1386
+ ./Data/p236/136.wav|6
1387
+ ./Data/p228/113.wav|1
1388
+ ./Data/p229/56.wav|2
1389
+ ./Data/p240/34.wav|8
1390
+ ./Data/p230/79.wav|3
1391
+ ./Data/p232/48.wav|12
1392
+ ./Data/p240/101.wav|8
1393
+ ./Data/p229/112.wav|2
1394
+ ./Data/p273/46.wav|19
1395
+ ./Data/p273/27.wav|19
1396
+ ./Data/p239/103.wav|7
1397
+ ./Data/p259/117.wav|17
1398
+ ./Data/p230/37.wav|3
1399
+ ./Data/p233/138.wav|5
1400
+ ./Data/p228/125.wav|1
1401
+ ./Data/p230/115.wav|3
1402
+ ./Data/p240/42.wav|8
1403
+ ./Data/p231/99.wav|4
1404
+ ./Data/p236/54.wav|6
1405
+ ./Data/p233/104.wav|5
1406
+ ./Data/p270/4.wav|18
1407
+ ./Data/p226/122.wav|10
1408
+ ./Data/p230/56.wav|3
1409
+ ./Data/p244/58.wav|9
1410
+ ./Data/p229/133.wav|2
1411
+ ./Data/p270/64.wav|18
1412
+ ./Data/p225/88.wav|0
1413
+ ./Data/p240/104.wav|8
1414
+ ./Data/p244/78.wav|9
1415
+ ./Data/p254/113.wav|14
1416
+ ./Data/p259/144.wav|17
1417
+ ./Data/p236/100.wav|6
1418
+ ./Data/p230/81.wav|3
1419
+ ./Data/p259/53.wav|17
1420
+ ./Data/p239/155.wav|7
1421
+ ./Data/p236/148.wav|6
1422
+ ./Data/p270/8.wav|18
1423
+ ./Data/p225/90.wav|0
1424
+ ./Data/p236/64.wav|6
1425
+ ./Data/p236/159.wav|6
1426
+ ./Data/p232/63.wav|12
1427
+ ./Data/p244/2.wav|9
1428
+ ./Data/p258/28.wav|16
1429
+ ./Data/p259/5.wav|17
1430
+ ./Data/p225/42.wav|0
1431
+ ./Data/p256/49.wav|15
1432
+ ./Data/p233/24.wav|5
1433
+ ./Data/p270/146.wav|18
1434
+ ./Data/p243/131.wav|13
1435
+ ./Data/p229/91.wav|2
1436
+ ./Data/p229/76.wav|2
1437
+ ./Data/p227/22.wav|11
1438
+ ./Data/p244/59.wav|9
1439
+ ./Data/p236/17.wav|6
1440
+ ./Data/p240/32.wav|8
1441
+ ./Data/p232/23.wav|12
1442
+ ./Data/p230/20.wav|3
1443
+ ./Data/p232/111.wav|12
1444
+ ./Data/p230/159.wav|3
1445
+ ./Data/p244/15.wav|9
1446
+ ./Data/p229/86.wav|2
1447
+ ./Data/p240/54.wav|8
1448
+ ./Data/p229/132.wav|2
1449
+ ./Data/p239/126.wav|7
1450
+ ./Data/p240/91.wav|8
1451
+ ./Data/p244/51.wav|9
1452
+ ./Data/p254/19.wav|14
1453
+ ./Data/p244/32.wav|9
1454
+ ./Data/p258/114.wav|16
1455
+ ./Data/p254/106.wav|14
1456
+ ./Data/p243/111.wav|13
1457
+ ./Data/p226/106.wav|10
1458
+ ./Data/p244/26.wav|9
1459
+ ./Data/p225/57.wav|0
1460
+ ./Data/p243/24.wav|13
1461
+ ./Data/p259/127.wav|17
1462
+ ./Data/p256/50.wav|15
1463
+ ./Data/p239/100.wav|7
1464
+ ./Data/p273/10.wav|19
1465
+ ./Data/p229/2.wav|2
1466
+ ./Data/p270/70.wav|18
1467
+ ./Data/p254/95.wav|14
1468
+ ./Data/p256/120.wav|15
1469
+ ./Data/p233/107.wav|5
1470
+ ./Data/p226/90.wav|10
1471
+ ./Data/p258/55.wav|16
1472
+ ./Data/p233/99.wav|5
1473
+ ./Data/p230/6.wav|3
1474
+ ./Data/p273/131.wav|19
1475
+ ./Data/p273/52.wav|19
1476
+ ./Data/p236/158.wav|6
1477
+ ./Data/p232/62.wav|12
1478
+ ./Data/p233/20.wav|5
1479
+ ./Data/p270/90.wav|18
1480
+ ./Data/p240/11.wav|8
1481
+ ./Data/p258/66.wav|16
1482
+ ./Data/p258/65.wav|16
1483
+ ./Data/p270/94.wav|18
1484
+ ./Data/p270/9.wav|18
1485
+ ./Data/p228/82.wav|1
1486
+ ./Data/p236/96.wav|6
1487
+ ./Data/p229/33.wav|2
1488
+ ./Data/p229/19.wav|2
1489
+ ./Data/p239/54.wav|7
1490
+ ./Data/p232/106.wav|12
1491
+ ./Data/p231/138.wav|4
1492
+ ./Data/p230/57.wav|3
1493
+ ./Data/p270/89.wav|18
1494
+ ./Data/p273/95.wav|19
1495
+ ./Data/p231/131.wav|4
1496
+ ./Data/p236/107.wav|6
1497
+ ./Data/p228/122.wav|1
1498
+ ./Data/p226/109.wav|10
1499
+ ./Data/p270/117.wav|18
1500
+ ./Data/p230/110.wav|3
1501
+ ./Data/p270/37.wav|18
1502
+ ./Data/p225/29.wav|0
1503
+ ./Data/p233/8.wav|5
1504
+ ./Data/p227/4.wav|11
1505
+ ./Data/p232/97.wav|12
1506
+ ./Data/p243/14.wav|13
1507
+ ./Data/p254/91.wav|14
1508
+ ./Data/p256/62.wav|15
1509
+ ./Data/p229/110.wav|2
1510
+ ./Data/p233/34.wav|5
1511
+ ./Data/p226/81.wav|10
1512
+ ./Data/p230/29.wav|3
1513
+ ./Data/p240/84.wav|8
1514
+ ./Data/p270/201.wav|18
1515
+ ./Data/p239/157.wav|7
1516
+ ./Data/p270/158.wav|18
1517
+ ./Data/p236/80.wav|6
1518
+ ./Data/p232/54.wav|12
1519
+ ./Data/p239/29.wav|7
1520
+ ./Data/p225/33.wav|0
1521
+ ./Data/p273/7.wav|19
1522
+ ./Data/p273/98.wav|19
1523
+ ./Data/p227/63.wav|11
1524
+ ./Data/p230/174.wav|3
1525
+ ./Data/p270/28.wav|18
1526
+ ./Data/p233/13.wav|5
1527
+ ./Data/p273/99.wav|19
1528
+ ./Data/p229/81.wav|2
1529
+ ./Data/p273/124.wav|19
1530
+ ./Data/p230/129.wav|3
1531
+ ./Data/p259/133.wav|17
1532
+ ./Data/p270/24.wav|18
1533
+ ./Data/p226/35.wav|10
1534
+ ./Data/p236/118.wav|6
1535
+ ./Data/p254/121.wav|14
1536
+ ./Data/p270/120.wav|18
1537
+ ./Data/p231/30.wav|4
1538
+ ./Data/p240/102.wav|8
1539
+ ./Data/p243/53.wav|13
1540
+ ./Data/p230/47.wav|3
1541
+ ./Data/p233/55.wav|5
1542
+ ./Data/p226/11.wav|10
1543
+ ./Data/p239/120.wav|7
1544
+ ./Data/p226/49.wav|10
1545
+ ./Data/p239/44.wav|7
1546
+ ./Data/p244/140.wav|9
1547
+ ./Data/p258/63.wav|16
1548
+ ./Data/p232/52.wav|12
1549
+ ./Data/p273/109.wav|19
1550
+ ./Data/p259/72.wav|17
1551
+ ./Data/p259/164.wav|17
1552
+ ./Data/p256/78.wav|15
1553
+ ./Data/p243/107.wav|13
1554
+ ./Data/p258/62.wav|16
1555
+ ./Data/p239/31.wav|7
1556
+ ./Data/p256/41.wav|15
1557
+ ./Data/p273/63.wav|19
1558
+ ./Data/p258/112.wav|16
1559
+ ./Data/p243/116.wav|13
1560
+ ./Data/p254/29.wav|14
1561
+ ./Data/p229/45.wav|2
1562
+ ./Data/p244/101.wav|9
1563
+ ./Data/p232/34.wav|12
1564
+ ./Data/p243/154.wav|13
1565
+ ./Data/p231/33.wav|4
1566
+ ./Data/p243/35.wav|13
1567
+ ./Data/p236/38.wav|6
1568
+ ./Data/p270/16.wav|18
1569
+ ./Data/p270/187.wav|18
1570
+ ./Data/p239/114.wav|7
1571
+ ./Data/p244/24.wav|9
1572
+ ./Data/p228/75.wav|1
1573
+ ./Data/p226/26.wav|10
1574
+ ./Data/p259/136.wav|17
1575
+ ./Data/p236/147.wav|6
1576
+ ./Data/p239/135.wav|7
1577
+ ./Data/p270/43.wav|18
1578
+ ./Data/p244/132.wav|9
1579
+ ./Data/p243/129.wav|13
1580
+ ./Data/p236/9.wav|6
1581
+ ./Data/p232/109.wav|12
1582
+ ./Data/p225/84.wav|0
1583
+ ./Data/p227/27.wav|11
1584
+ ./Data/p259/8.wav|17
1585
+ ./Data/p259/67.wav|17
1586
+ ./Data/p239/57.wav|7
1587
+ ./Data/p243/69.wav|13
1588
+ ./Data/p231/62.wav|4
1589
+ ./Data/p259/140.wav|17
1590
+ ./Data/p227/66.wav|11
1591
+ ./Data/p230/44.wav|3
1592
+ ./Data/p229/63.wav|2
1593
+ ./Data/p256/4.wav|15
1594
+ ./Data/p258/24.wav|16
1595
+ ./Data/p240/80.wav|8
1596
+ ./Data/p270/72.wav|18
1597
+ ./Data/p240/47.wav|8
1598
+ ./Data/p229/98.wav|2
1599
+ ./Data/p244/111.wav|9
1600
+ ./Data/p231/111.wav|4
1601
+ ./Data/p243/91.wav|13
1602
+ ./Data/p239/36.wav|7
1603
+ ./Data/p259/103.wav|17
1604
+ ./Data/p232/2.wav|12
1605
+ ./Data/p236/3.wav|6
1606
+ ./Data/p236/57.wav|6
1607
+ ./Data/p233/109.wav|5
1608
+ ./Data/p236/122.wav|6
1609
+ ./Data/p270/76.wav|18
1610
+ ./Data/p243/167.wav|13
1611
+ ./Data/p228/20.wav|1
1612
+ ./Data/p243/72.wav|13
1613
+ ./Data/p239/2.wav|7
1614
+ ./Data/p226/21.wav|10
1615
+ ./Data/p256/43.wav|15
1616
+ ./Data/p259/129.wav|17
1617
+ ./Data/p231/15.wav|4
1618
+ ./Data/p231/85.wav|4
1619
+ ./Data/p226/29.wav|10
1620
+ ./Data/p230/131.wav|3
1621
+ ./Data/p259/97.wav|17
1622
+ ./Data/p240/68.wav|8
1623
+ ./Data/p233/84.wav|5
1624
+ ./Data/p236/10.wav|6
1625
+ ./Data/p244/120.wav|9
1626
+ ./Data/p270/18.wav|18
1627
+ ./Data/p231/24.wav|4
1628
+ ./Data/p256/37.wav|15
1629
+ ./Data/p233/11.wav|5
1630
+ ./Data/p230/93.wav|3
1631
+ ./Data/p230/73.wav|3
1632
+ ./Data/p239/66.wav|7
1633
+ ./Data/p230/40.wav|3
1634
+ ./Data/p228/13.wav|1
1635
+ ./Data/p231/49.wav|4
1636
+ ./Data/p270/62.wav|18
1637
+ ./Data/p236/78.wav|6
1638
+ ./Data/p258/73.wav|16
1639
+ ./Data/p236/35.wav|6
1640
+ ./Data/p254/120.wav|14
1641
+ ./Data/p258/53.wav|16
1642
+ ./Data/p227/16.wav|11
1643
+ ./Data/p232/33.wav|12
1644
+ ./Data/p256/42.wav|15
1645
+ ./Data/p233/68.wav|5
1646
+ ./Data/p225/74.wav|0
1647
+ ./Data/p244/127.wav|9
1648
+ ./Data/p243/118.wav|13
1649
+ ./Data/p273/128.wav|19
1650
+ ./Data/p239/7.wav|7
1651
+ ./Data/p243/50.wav|13
1652
+ ./Data/p226/23.wav|10
1653
+ ./Data/p270/199.wav|18
1654
+ ./Data/p254/45.wav|14
1655
+ ./Data/p254/11.wav|14
1656
+ ./Data/p244/66.wav|9
1657
+ ./Data/p270/152.wav|18
1658
+ ./Data/p227/131.wav|11
1659
+ ./Data/p270/38.wav|18
1660
+ ./Data/p229/57.wav|2
1661
+ ./Data/p227/35.wav|11
1662
+ ./Data/p244/7.wav|9
1663
+ ./Data/p226/32.wav|10
1664
+ ./Data/p230/152.wav|3
1665
+ ./Data/p239/161.wav|7
1666
+ ./Data/p256/123.wav|15
1667
+ ./Data/p231/14.wav|4
1668
+ ./Data/p243/38.wav|13
1669
+ ./Data/p229/102.wav|2
1670
+ ./Data/p229/38.wav|2
1671
+ ./Data/p233/116.wav|5
1672
+ ./Data/p254/35.wav|14
1673
+ ./Data/p254/118.wav|14
1674
+ ./Data/p225/15.wav|0
1675
+ ./Data/p230/132.wav|3
1676
+ ./Data/p273/84.wav|19
1677
+ ./Data/p254/122.wav|14
1678
+ ./Data/p273/3.wav|19
1679
+ ./Data/p270/68.wav|18
1680
+ ./Data/p232/42.wav|12
1681
+ ./Data/p225/93.wav|0
1682
+ ./Data/p227/34.wav|11
1683
+ ./Data/p270/22.wav|18
1684
+ ./Data/p231/4.wav|4
1685
+ ./Data/p227/125.wav|11
1686
+ ./Data/p244/95.wav|9
1687
+ ./Data/p236/18.wav|6
1688
+ ./Data/p273/25.wav|19
1689
+ ./Data/p259/169.wav|17
1690
+ ./Data/p233/56.wav|5
1691
+ ./Data/p270/203.wav|18
1692
+ ./Data/p259/41.wav|17
1693
+ ./Data/p233/38.wav|5
1694
+ ./Data/p229/22.wav|2
1695
+ ./Data/p256/17.wav|15
1696
+ ./Data/p270/3.wav|18
1697
+ ./Data/p231/5.wav|4
1698
+ ./Data/p240/60.wav|8
1699
+ ./Data/p227/21.wav|11
1700
+ ./Data/p259/1.wav|17
1701
+ ./Data/p259/4.wav|17
1702
+ ./Data/p232/11.wav|12
1703
+ ./Data/p259/114.wav|17
1704
+ ./Data/p226/45.wav|10
1705
+ ./Data/p236/27.wav|6
1706
+ ./Data/p239/47.wav|7
1707
+ ./Data/p244/85.wav|9
1708
+ ./Data/p243/87.wav|13
1709
+ ./Data/p258/89.wav|16
1710
+ ./Data/p233/57.wav|5
1711
+ ./Data/p228/78.wav|1
1712
+ ./Data/p256/60.wav|15
1713
+ ./Data/p232/83.wav|12
1714
+ ./Data/p232/88.wav|12
1715
+ ./Data/p231/120.wav|4
1716
+ ./Data/p226/101.wav|10
1717
+ ./Data/p236/102.wav|6
1718
+ ./Data/p226/123.wav|10
1719
+ ./Data/p259/85.wav|17
1720
+ ./Data/p227/124.wav|11
1721
+ ./Data/p259/80.wav|17
1722
+ ./Data/p227/10.wav|11
1723
+ ./Data/p233/26.wav|5
1724
+ ./Data/p273/75.wav|19
1725
+ ./Data/p243/73.wav|13
1726
+ ./Data/p244/22.wav|9
1727
+ ./Data/p243/126.wav|13
1728
+ ./Data/p244/108.wav|9
1729
+ ./Data/p243/134.wav|13
1730
+ ./Data/p226/100.wav|10
1731
+ ./Data/p231/123.wav|4
1732
+ ./Data/p228/47.wav|1
1733
+ ./Data/p243/42.wav|13
1734
+ ./Data/p233/131.wav|5
1735
+ ./Data/p273/2.wav|19
1736
+ ./Data/p254/24.wav|14
1737
+ ./Data/p236/123.wav|6
1738
+ ./Data/p240/24.wav|8
1739
+ ./Data/p244/63.wav|9
1740
+ ./Data/p236/149.wav|6
1741
+ ./Data/p236/83.wav|6
1742
+ ./Data/p258/131.wav|16
1743
+ ./Data/p243/120.wav|13
1744
+ ./Data/p259/159.wav|17
1745
+ ./Data/p258/8.wav|16
1746
+ ./Data/p258/34.wav|16
1747
+ ./Data/p243/33.wav|13
1748
+ ./Data/p256/18.wav|15
1749
+ ./Data/p232/73.wav|12
1750
+ ./Data/p244/49.wav|9
1751
+ ./Data/p258/12.wav|16
1752
+ ./Data/p225/18.wav|0
1753
+ ./Data/p258/68.wav|16
1754
+ ./Data/p270/134.wav|18
1755
+ ./Data/p228/54.wav|1
1756
+ ./Data/p236/139.wav|6
1757
+ ./Data/p225/6.wav|0
1758
+ ./Data/p259/57.wav|17
1759
+ ./Data/p243/70.wav|13
1760
+ ./Data/p240/122.wav|8
1761
+ ./Data/p259/69.wav|17
1762
+ ./Data/p258/124.wav|16
1763
+ ./Data/p226/138.wav|10
1764
+ ./Data/p231/51.wav|4
1765
+ ./Data/p259/126.wav|17
1766
+ ./Data/p227/119.wav|11
1767
+ ./Data/p254/136.wav|14
1768
+ ./Data/p240/107.wav|8
1769
+ ./Data/p254/4.wav|14
1770
+ ./Data/p228/117.wav|1
1771
+ ./Data/p244/92.wav|9
1772
+ ./Data/p239/151.wav|7
1773
+ ./Data/p259/131.wav|17
1774
+ ./Data/p273/96.wav|19
1775
+ ./Data/p254/69.wav|14
1776
+ ./Data/p259/16.wav|17
1777
+ ./Data/p244/86.wav|9
1778
+ ./Data/p236/30.wav|6
1779
+ ./Data/p230/34.wav|3
1780
+ ./Data/p230/142.wav|3
1781
+ ./Data/p244/37.wav|9
1782
+ ./Data/p239/40.wav|7
1783
+ ./Data/p232/87.wav|12
1784
+ ./Data/p270/115.wav|18
1785
+ ./Data/p232/124.wav|12
1786
+ ./Data/p233/127.wav|5
1787
+ ./Data/p228/70.wav|1
1788
+ ./Data/p254/66.wav|14
1789
+ ./Data/p232/16.wav|12
1790
+ ./Data/p256/109.wav|15
1791
+ ./Data/p243/169.wav|13
1792
+ ./Data/p228/112.wav|1
1793
+ ./Data/p254/82.wav|14
1794
+ ./Data/p231/119.wav|4
1795
+ ./Data/p236/59.wav|6
1796
+ ./Data/p239/69.wav|7
1797
+ ./Data/p225/12.wav|0
1798
+ ./Data/p232/18.wav|12
1799
+ ./Data/p229/32.wav|2
1800
+ ./Data/p228/126.wav|1
1801
+ ./Data/p270/171.wav|18
1802
+ ./Data/p236/13.wav|6
1803
+ ./Data/p228/12.wav|1
1804
+ ./Data/p228/96.wav|1
1805
+ ./Data/p256/11.wav|15
1806
+ ./Data/p233/83.wav|5
1807
+ ./Data/p256/99.wav|15
1808
+ ./Data/p225/69.wav|0
1809
+ ./Data/p254/7.wav|14
1810
+ ./Data/p227/59.wav|11
1811
+ ./Data/p273/136.wav|19
1812
+ ./Data/p239/3.wav|7
1813
+ ./Data/p256/119.wav|15
1814
+ ./Data/p226/99.wav|10
1815
+ ./Data/p256/56.wav|15
1816
+ ./Data/p243/82.wav|13
1817
+ ./Data/p227/69.wav|11
1818
+ ./Data/p273/29.wav|19
1819
+ ./Data/p233/100.wav|5
1820
+ ./Data/p230/173.wav|3
1821
+ ./Data/p240/132.wav|8
1822
+ ./Data/p239/143.wav|7
1823
+ ./Data/p231/40.wav|4
1824
+ ./Data/p256/10.wav|15
1825
+ ./Data/p229/75.wav|2
1826
+ ./Data/p240/15.wav|8
1827
+ ./Data/p228/102.wav|1
1828
+ ./Data/p270/52.wav|18
1829
+ ./Data/p270/7.wav|18
1830
+ ./Data/p270/164.wav|18
1831
+ ./Data/p233/91.wav|5
1832
+ ./Data/p244/27.wav|9
1833
+ ./Data/p244/48.wav|9
1834
+ ./Data/p239/24.wav|7
1835
+ ./Data/p226/113.wav|10
1836
+ ./Data/p227/72.wav|11
1837
+ ./Data/p270/67.wav|18
1838
+ ./Data/p231/25.wav|4
1839
+ ./Data/p229/120.wav|2
1840
+ ./Data/p273/67.wav|19
1841
+ ./Data/p230/67.wav|3
1842
+ ./Data/p227/120.wav|11
1843
+ ./Data/p239/121.wav|7
1844
+ ./Data/p228/88.wav|1
1845
+ ./Data/p254/15.wav|14
1846
+ ./Data/p270/114.wav|18
1847
+ ./Data/p254/14.wav|14
1848
+ ./Data/p259/75.wav|17
1849
+ ./Data/p236/126.wav|6
1850
+ ./Data/p228/92.wav|1
1851
+ ./Data/p230/127.wav|3
1852
+ ./Data/p229/93.wav|2
1853
+ ./Data/p233/82.wav|5
1854
+ ./Data/p239/122.wav|7
1855
+ ./Data/p229/72.wav|2
1856
+ ./Data/p232/131.wav|12
1857
+ ./Data/p239/67.wav|7
1858
+ ./Data/p225/36.wav|0
1859
+ ./Data/p254/3.wav|14
1860
+ ./Data/p244/109.wav|9
1861
+ ./Data/p230/112.wav|3
1862
+ ./Data/p230/5.wav|3
1863
+ ./Data/p256/87.wav|15
1864
+ ./Data/p232/15.wav|12
1865
+ ./Data/p244/67.wav|9
1866
+ ./Data/p236/48.wav|6
1867
+ ./Data/p232/110.wav|12
1868
+ ./Data/p243/156.wav|13
1869
+ ./Data/p231/140.wav|4
1870
+ ./Data/p239/89.wav|7
1871
+ ./Data/p229/53.wav|2
1872
+ ./Data/p256/97.wav|15
1873
+ ./Data/p256/79.wav|15
1874
+ ./Data/p236/6.wav|6
1875
+ ./Data/p236/106.wav|6
1876
+ ./Data/p227/15.wav|11
1877
+ ./Data/p273/20.wav|19
1878
+ ./Data/p239/49.wav|7
1879
+ ./Data/p254/134.wav|14
1880
+ ./Data/p228/4.wav|1
1881
+ ./Data/p227/117.wav|11
1882
+ ./Data/p259/7.wav|17
1883
+ ./Data/p258/91.wav|16
1884
+ ./Data/p259/128.wav|17
1885
+ ./Data/p236/61.wav|6
1886
+ ./Data/p230/165.wav|3
1887
+ ./Data/p225/20.wav|0
1888
+ ./Data/p232/122.wav|12
1889
+ ./Data/p230/130.wav|3
1890
+ ./Data/p228/58.wav|1
1891
+ ./Data/p227/38.wav|11
1892
+ ./Data/p239/34.wav|7
1893
+ ./Data/p240/137.wav|8
1894
+ ./Data/p258/90.wav|16
1895
+ ./Data/p258/138.wav|16
1896
+ ./Data/p244/124.wav|9
1897
+ ./Data/p239/167.wav|7
1898
+ ./Data/p233/90.wav|5
1899
+ ./Data/p239/172.wav|7
1900
+ ./Data/p254/97.wav|14
1901
+ ./Data/p259/29.wav|17
1902
+ ./Data/p229/92.wav|2
1903
+ ./Data/p227/11.wav|11
1904
+ ./Data/p258/118.wav|16
1905
+ ./Data/p244/69.wav|9
1906
+ ./Data/p232/3.wav|12
1907
+ ./Data/p256/28.wav|15
1908
+ ./Data/p229/49.wav|2
1909
+ ./Data/p236/82.wav|6
1910
+ ./Data/p239/171.wav|7
1911
+ ./Data/p254/127.wav|14
1912
+ ./Data/p259/43.wav|17
1913
+ ./Data/p228/21.wav|1
1914
+ ./Data/p256/74.wav|15
1915
+ ./Data/p226/76.wav|10
1916
+ ./Data/p243/170.wav|13
1917
+ ./Data/p239/39.wav|7
1918
+ ./Data/p233/124.wav|5
1919
+ ./Data/p229/13.wav|2
1920
+ ./Data/p231/71.wav|4
1921
+ ./Data/p229/118.wav|2
1922
+ ./Data/p231/88.wav|4
1923
+ ./Data/p231/55.wav|4
1924
+ ./Data/p270/104.wav|18
1925
+ ./Data/p270/110.wav|18
1926
+ ./Data/p228/41.wav|1
1927
+ ./Data/p258/2.wav|16
1928
+ ./Data/p230/78.wav|3
1929
+ ./Data/p231/80.wav|4
1930
+ ./Data/p243/9.wav|13
1931
+ ./Data/p239/16.wav|7
1932
+ ./Data/p239/76.wav|7
1933
+ ./Data/p226/126.wav|10
1934
+ ./Data/p226/63.wav|10
1935
+ ./Data/p233/46.wav|5
1936
+ ./Data/p270/202.wav|18
1937
+ ./Data/p239/164.wav|7
1938
+ ./Data/p231/22.wav|4
1939
+ ./Data/p259/24.wav|17
1940
+ ./Data/p256/73.wav|15
1941
+ ./Data/p259/10.wav|17
1942
+ ./Data/p232/94.wav|12
1943
+ ./Data/p273/30.wav|19
1944
+ ./Data/p244/29.wav|9
1945
+ ./Data/p226/129.wav|10
1946
+ ./Data/p243/81.wav|13
1947
+ ./Data/p236/121.wav|6
1948
+ ./Data/p228/89.wav|1
1949
+ ./Data/p231/81.wav|4
1950
+ ./Data/p243/57.wav|13
1951
+ ./Data/p236/40.wav|6
1952
+ ./Data/p226/89.wav|10
1953
+ ./Data/p244/44.wav|9
1954
+ ./Data/p254/88.wav|14
1955
+ ./Data/p227/108.wav|11
1956
+ ./Data/p258/123.wav|16
1957
+ ./Data/p233/95.wav|5
1958
+ ./Data/p259/142.wav|17
1959
+ ./Data/p231/73.wav|4
1960
+ ./Data/p258/52.wav|16
1961
+ ./Data/p236/89.wav|6
1962
+ ./Data/p229/67.wav|2
1963
+ ./Data/p258/46.wav|16
1964
+ ./Data/p231/132.wav|4
1965
+ ./Data/p227/41.wav|11
1966
+ ./Data/p256/114.wav|15
1967
+ ./Data/p232/10.wav|12
1968
+ ./Data/p225/46.wav|0
1969
+ ./Data/p231/61.wav|4
1970
+ ./Data/p229/30.wav|2
1971
+ ./Data/p236/101.wav|6
1972
+ ./Data/p256/20.wav|15
1973
+ ./Data/p226/60.wav|10
1974
+ ./Data/p259/18.wav|17
1975
+ ./Data/p236/151.wav|6
1976
+ ./Data/p233/130.wav|5
1977
+ ./Data/p273/91.wav|19
1978
+ ./Data/p225/59.wav|0
1979
+ ./Data/p227/83.wav|11
1980
+ ./Data/p226/127.wav|10
1981
+ ./Data/p270/137.wav|18
1982
+ ./Data/p258/95.wav|16
1983
+ ./Data/p227/42.wav|11
1984
+ ./Data/p230/108.wav|3
1985
+ ./Data/p243/137.wav|13
1986
+ ./Data/p228/157.wav|1
1987
+ ./Data/p243/105.wav|13
1988
+ ./Data/p228/133.wav|1
1989
+ ./Data/p270/93.wav|18
1990
+ ./Data/p256/86.wav|15
1991
+ ./Data/p254/17.wav|14
1992
+ ./Data/p227/135.wav|11
1993
+ ./Data/p228/118.wav|1
1994
+ ./Data/p239/142.wav|7
1995
+ ./Data/p273/137.wav|19
1996
+ ./Data/p259/79.wav|17
1997
+ ./Data/p259/108.wav|17
1998
+ ./Data/p226/15.wav|10
1999
+ ./Data/p231/43.wav|4
2000
+ ./Data/p256/16.wav|15
2001
+ ./Data/p232/20.wav|12
2002
+ ./Data/p258/35.wav|16
2003
+ ./Data/p243/141.wav|13
2004
+ ./Data/p232/104.wav|12
2005
+ ./Data/p259/58.wav|17
2006
+ ./Data/p258/82.wav|16
2007
+ ./Data/p233/76.wav|5
2008
+ ./Data/p270/126.wav|18
2009
+ ./Data/p236/70.wav|6
2010
+ ./Data/p240/49.wav|8
2011
+ ./Data/p256/106.wav|15
2012
+ ./Data/p254/55.wav|14
2013
+ ./Data/p270/2.wav|18
2014
+ ./Data/p270/143.wav|18
2015
+ ./Data/p229/48.wav|2
2016
+ ./Data/p244/6.wav|9
2017
+ ./Data/p233/65.wav|5
2018
+ ./Data/p233/18.wav|5
2019
+ ./Data/p244/87.wav|9
2020
+ ./Data/p236/133.wav|6
2021
+ ./Data/p227/2.wav|11
2022
+ ./Data/p227/17.wav|11
2023
+ ./Data/p273/111.wav|19
2024
+ ./Data/p230/98.wav|3
2025
+ ./Data/p226/120.wav|10
2026
+ ./Data/p226/112.wav|10
2027
+ ./Data/p230/161.wav|3
2028
+ ./Data/p254/79.wav|14
2029
+ ./Data/p230/101.wav|3
2030
+ ./Data/p239/96.wav|7
2031
+ ./Data/p228/159.wav|1
2032
+ ./Data/p230/24.wav|3
2033
+ ./Data/p240/28.wav|8
2034
+ ./Data/p254/125.wav|14
2035
+ ./Data/p259/168.wav|17
2036
+ ./Data/p228/18.wav|1
2037
+ ./Data/p270/88.wav|18
2038
+ ./Data/p270/25.wav|18
2039
+ ./Data/p231/89.wav|4
2040
+ ./Data/p230/14.wav|3
2041
+ ./Data/p254/63.wav|14
2042
+ ./Data/p233/53.wav|5
2043
+ ./Data/p225/54.wav|0
2044
+ ./Data/p243/19.wav|13
2045
+ ./Data/p259/139.wav|17
2046
+ ./Data/p229/87.wav|2
2047
+ ./Data/p232/56.wav|12
2048
+ ./Data/p270/97.wav|18
2049
+ ./Data/p232/95.wav|12
2050
+ ./Data/p232/86.wav|12
2051
+ ./Data/p259/137.wav|17
2052
+ ./Data/p228/147.wav|1
2053
+ ./Data/p273/112.wav|19
2054
+ ./Data/p243/80.wav|13
2055
+ ./Data/p233/72.wav|5
2056
+ ./Data/p233/114.wav|5
2057
+ ./Data/p240/23.wav|8
2058
+ ./Data/p236/164.wav|6
2059
+ ./Data/p236/144.wav|6
2060
+ ./Data/p254/116.wav|14
2061
+ ./Data/p273/105.wav|19
2062
+ ./Data/p239/48.wav|7
2063
+ ./Data/p236/68.wav|6
2064
+ ./Data/p233/87.wav|5
2065
+ ./Data/p239/50.wav|7
2066
+ ./Data/p256/66.wav|15
2067
+ ./Data/p270/159.wav|18
2068
+ ./Data/p273/53.wav|19
2069
+ ./Data/p254/28.wav|14
2070
+ ./Data/p259/28.wav|17
2071
+ ./Data/p227/89.wav|11
2072
+ ./Data/p243/1.wav|13
2073
+ ./Data/p239/61.wav|7
2074
+ ./Data/p226/28.wav|10
2075
+ ./Data/p232/113.wav|12
2076
+ ./Data/p225/38.wav|0
2077
+ ./Data/p236/128.wav|6
2078
+ ./Data/p225/3.wav|0
2079
+ ./Data/p258/83.wav|16
2080
+ ./Data/p270/195.wav|18
2081
+ ./Data/p231/69.wav|4
2082
+ ./Data/p254/49.wav|14
2083
+ ./Data/p226/135.wav|10
2084
+ ./Data/p230/3.wav|3
2085
+ ./Data/p228/124.wav|1
2086
+ ./Data/p233/119.wav|5
2087
+ ./Data/p229/31.wav|2
2088
+ ./Data/p256/54.wav|15
2089
+ ./Data/p258/121.wav|16
2090
+ ./Data/p231/57.wav|4
2091
+ ./Data/p244/84.wav|9
2092
+ ./Data/p244/113.wav|9
2093
+ ./Data/p228/71.wav|1
2094
+ ./Data/p270/86.wav|18
2095
+ ./Data/p254/98.wav|14
2096
+ ./Data/p225/19.wav|0
2097
+ ./Data/p258/21.wav|16
2098
+ ./Data/p259/60.wav|17
2099
+ ./Data/p227/105.wav|11
2100
+ ./Data/p258/142.wav|16
2101
+ ./Data/p230/52.wav|3
2102
+ ./Data/p227/6.wav|11
2103
+ ./Data/p244/139.wav|9
2104
+ ./Data/p226/128.wav|10
2105
+ ./Data/p239/70.wav|7
2106
+ ./Data/p273/28.wav|19
2107
+ ./Data/p230/171.wav|3
2108
+ ./Data/p270/113.wav|18
2109
+ ./Data/p259/19.wav|17
2110
+ ./Data/p225/68.wav|0
2111
+ ./Data/p239/73.wav|7
2112
+ ./Data/p254/44.wav|14
2113
+ ./Data/p240/113.wav|8
2114
+ ./Data/p244/77.wav|9
2115
+ ./Data/p259/49.wav|17
2116
+ ./Data/p225/86.wav|0
2117
+ ./Data/p258/94.wav|16
2118
+ ./Data/p244/17.wav|9
2119
+ ./Data/p227/12.wav|11
2120
+ ./Data/p239/150.wav|7
2121
+ ./Data/p225/10.wav|0
2122
+ ./Data/p230/114.wav|3
2123
+ ./Data/p258/69.wav|16
2124
+ ./Data/p231/117.wav|4
2125
+ ./Data/p244/23.wav|9
2126
+ ./Data/p273/60.wav|19
2127
+ ./Data/p259/156.wav|17
2128
+ ./Data/p239/158.wav|7
2129
+ ./Data/p244/102.wav|9
2130
+ ./Data/p236/85.wav|6
2131
+ ./Data/p259/2.wav|17
2132
+ ./Data/p259/83.wav|17
2133
+ ./Data/p226/40.wav|10
2134
+ ./Data/p270/34.wav|18
2135
+ ./Data/p240/99.wav|8
2136
+ ./Data/p259/95.wav|17
2137
+ ./Data/p240/79.wav|8
2138
+ ./Data/p239/102.wav|7
2139
+ ./Data/p273/57.wav|19
2140
+ ./Data/p243/85.wav|13
2141
+ ./Data/p239/149.wav|7
2142
+ ./Data/p232/28.wav|12
2143
+ ./Data/p254/25.wav|14
2144
+ ./Data/p233/42.wav|5
2145
+ ./Data/p227/39.wav|11
2146
+ ./Data/p270/77.wav|18
2147
+ ./Data/p233/51.wav|5
2148
+ ./Data/p256/100.wav|15
2149
+ ./Data/p258/140.wav|16
2150
+ ./Data/p229/131.wav|2
2151
+ ./Data/p243/52.wav|13
2152
+ ./Data/p258/84.wav|16
2153
+ ./Data/p229/138.wav|2
2154
+ ./Data/p240/61.wav|8
2155
+ ./Data/p254/27.wav|14
2156
+ ./Data/p232/21.wav|12
2157
+ ./Data/p226/38.wav|10
2158
+ ./Data/p230/158.wav|3
2159
+ ./Data/p256/52.wav|15
2160
+ ./Data/p243/95.wav|13
2161
+ ./Data/p243/89.wav|13
2162
+ ./Data/p226/61.wav|10
2163
+ ./Data/p230/117.wav|3
2164
+ ./Data/p230/92.wav|3
2165
+ ./Data/p236/55.wav|6
2166
+ ./Data/p254/18.wav|14
2167
+ ./Data/p254/129.wav|14
2168
+ ./Data/p259/113.wav|17
2169
+ ./Data/p225/25.wav|0
2170
+ ./Data/p240/134.wav|8
2171
+ ./Data/p230/86.wav|3
2172
+ ./Data/p256/84.wav|15
2173
+ ./Data/p228/99.wav|1
2174
+ ./Data/p239/90.wav|7
2175
+ ./Data/p230/155.wav|3
2176
+ ./Data/p228/40.wav|1
2177
+ ./Data/p254/72.wav|14
2178
+ ./Data/p231/38.wav|4
2179
+ ./Data/p225/32.wav|0
2180
+ ./Data/p228/22.wav|1
2181
+ ./Data/p231/7.wav|4
2182
+ ./Data/p254/39.wav|14
2183
+ ./Data/p240/112.wav|8
2184
+ ./Data/p270/183.wav|18
2185
+ ./Data/p270/60.wav|18
2186
+ ./Data/p236/120.wav|6
2187
+ ./Data/p239/145.wav|7
2188
+ ./Data/p240/31.wav|8
2189
+ ./Data/p229/115.wav|2
2190
+ ./Data/p233/121.wav|5
2191
+ ./Data/p228/33.wav|1
2192
+ ./Data/p228/83.wav|1
2193
+ ./Data/p258/58.wav|16
2194
+ ./Data/p239/106.wav|7
2195
+ ./Data/p273/123.wav|19
2196
+ ./Data/p244/50.wav|9
2197
+ ./Data/p229/50.wav|2
2198
+ ./Data/p270/131.wav|18
2199
+ ./Data/p236/8.wav|6
2200
+ ./Data/p244/114.wav|9
2201
+ ./Data/p230/153.wav|3
2202
+ ./Data/p226/53.wav|10
2203
+ ./Data/p240/93.wav|8
2204
+ ./Data/p229/122.wav|2
2205
+ ./Data/p256/90.wav|15
2206
+ ./Data/p231/112.wav|4
2207
+ ./Data/p270/48.wav|18
2208
+ ./Data/p230/36.wav|3
2209
+ ./Data/p230/135.wav|3
2210
+ ./Data/p259/172.wav|17
2211
+ ./Data/p229/55.wav|2
2212
+ ./Data/p244/60.wav|9
2213
+ ./Data/p232/75.wav|12
2214
+ ./Data/p259/68.wav|17
2215
+ ./Data/p233/7.wav|5
2216
+ ./Data/p233/3.wav|5
2217
+ ./Data/p226/141.wav|10
2218
+ ./Data/p254/32.wav|14
2219
+ ./Data/p239/26.wav|7
2220
+ ./Data/p226/119.wav|10
2221
+ ./Data/p239/173.wav|7
2222
+ ./Data/p230/157.wav|3
2223
+ ./Data/p236/157.wav|6
2224
+ ./Data/p226/13.wav|10
2225
+ ./Data/p254/68.wav|14
2226
+ ./Data/p225/87.wav|0
2227
+ ./Data/p231/118.wav|4
2228
+ ./Data/p240/98.wav|8
2229
+ ./Data/p233/5.wav|5
2230
+ ./Data/p227/56.wav|11
2231
+ ./Data/p239/93.wav|7
2232
+ ./Data/p240/25.wav|8
2233
+ ./Data/p243/142.wav|13
2234
+ ./Data/p254/110.wav|14
2235
+ ./Data/p230/138.wav|3
2236
+ ./Data/p226/16.wav|10
2237
+ ./Data/p270/189.wav|18
2238
+ ./Data/p229/95.wav|2
2239
+ ./Data/p231/37.wav|4
2240
+ ./Data/p240/44.wav|8
2241
+ ./Data/p228/46.wav|1
2242
+ ./Data/p236/62.wav|6
2243
+ ./Data/p226/20.wav|10
2244
+ ./Data/p228/105.wav|1
2245
+ ./Data/p258/44.wav|16
2246
+ ./Data/p258/23.wav|16
2247
+ ./Data/p270/108.wav|18
2248
+ ./Data/p243/151.wav|13
2249
+ ./Data/p239/170.wav|7
2250
+ ./Data/p244/100.wav|9
2251
+ ./Data/p258/81.wav|16
2252
+ ./Data/p236/153.wav|6
2253
+ ./Data/p229/5.wav|2
2254
+ ./Data/p256/112.wav|15
2255
+ ./Data/p258/70.wav|16
2256
+ ./Data/p240/57.wav|8
2257
+ ./Data/p244/36.wav|9
2258
+ ./Data/p273/19.wav|19
2259
+ ./Data/p233/75.wav|5
2260
+ ./Data/p259/111.wav|17
2261
+ ./Data/p243/100.wav|13
2262
+ ./Data/p226/86.wav|10
2263
+ ./Data/p256/26.wav|15
2264
+ ./Data/p236/22.wav|6
2265
+ ./Data/p229/124.wav|2
2266
+ ./Data/p229/62.wav|2
2267
+ ./Data/p258/87.wav|16
2268
+ ./Data/p232/22.wav|12
2269
+ ./Data/p259/158.wav|17
2270
+ ./Data/p229/135.wav|2
2271
+ ./Data/p233/118.wav|5
2272
+ ./Data/p236/134.wav|6
2273
+ ./Data/p226/34.wav|10
2274
+ ./Data/p236/93.wav|6
2275
+ ./Data/p243/108.wav|13
2276
+ ./Data/p270/177.wav|18
2277
+ ./Data/p239/30.wav|7
2278
+ ./Data/p273/17.wav|19
2279
+ ./Data/p231/110.wav|4
2280
+ ./Data/p229/119.wav|2
2281
+ ./Data/p243/130.wav|13
2282
+ ./Data/p256/127.wav|15
2283
+ ./Data/p226/105.wav|10
2284
+ ./Data/p229/52.wav|2
2285
+ ./Data/p226/54.wav|10
2286
+ ./Data/p273/87.wav|19
2287
+ ./Data/p270/57.wav|18
2288
+ ./Data/p240/131.wav|8
2289
+ ./Data/p273/117.wav|19
2290
+ ./Data/p240/77.wav|8
2291
+ ./Data/p233/32.wav|5
2292
+ ./Data/p236/25.wav|6
2293
+ ./Data/p227/79.wav|11
2294
+ ./Data/p258/64.wav|16
2295
+ ./Data/p240/92.wav|8
2296
+ ./Data/p244/74.wav|9
2297
+ ./Data/p228/120.wav|1
2298
+ ./Data/p230/45.wav|3
2299
+ ./Data/p225/89.wav|0
2300
+ ./Data/p226/95.wav|10
2301
+ ./Data/p270/80.wav|18
2302
+ ./Data/p226/111.wav|10
2303
+ ./Data/p243/2.wav|13
2304
+ ./Data/p259/6.wav|17
2305
+ ./Data/p227/85.wav|11
2306
+ ./Data/p233/106.wav|5
2307
+ ./Data/p227/14.wav|11
2308
+ ./Data/p231/50.wav|4
2309
+ ./Data/p230/139.wav|3
2310
+ ./Data/p229/70.wav|2
2311
+ ./Data/p258/14.wav|16
2312
+ ./Data/p240/116.wav|8
2313
+ ./Data/p225/64.wav|0
2314
+ ./Data/p225/8.wav|0
2315
+ ./Data/p243/113.wav|13
2316
+ ./Data/p254/102.wav|14
2317
+ ./Data/p270/148.wav|18
2318
+ ./Data/p232/12.wav|12
2319
+ ./Data/p259/22.wav|17
2320
+ ./Data/p273/4.wav|19
2321
+ ./Data/p244/133.wav|9
2322
+ ./Data/p228/101.wav|1
2323
+ ./Data/p273/31.wav|19
2324
+ ./Data/p258/76.wav|16
2325
+ ./Data/p227/146.wav|11
2326
+ ./Data/p231/54.wav|4
2327
+ ./Data/p236/37.wav|6
2328
+ ./Data/p244/82.wav|9
2329
+ ./Data/p225/17.wav|0
2330
+ ./Data/p243/76.wav|13
2331
+ ./Data/p273/140.wav|19
2332
+ ./Data/p239/15.wav|7
2333
+ ./Data/p230/19.wav|3
2334
+ ./Data/p240/117.wav|8
2335
+ ./Data/p244/94.wav|9
2336
+ ./Data/p236/26.wav|6
2337
+ ./Data/p259/99.wav|17
2338
+ ./Data/p225/77.wav|0
2339
+ ./Data/p244/31.wav|9
2340
+ ./Data/p244/98.wav|9
2341
+ ./Data/p243/59.wav|13
2342
+ ./Data/p228/163.wav|1
2343
+ ./Data/p270/141.wav|18
2344
+ ./Data/p230/94.wav|3
2345
+ ./Data/p228/110.wav|1
2346
+ ./Data/p243/160.wav|13
2347
+ ./Data/p239/162.wav|7
2348
+ ./Data/p232/112.wav|12
2349
+ ./Data/p273/54.wav|19
2350
+ ./Data/p259/110.wav|17
2351
+ ./Data/p244/64.wav|9
2352
+ ./Data/p259/170.wav|17
2353
+ ./Data/p230/53.wav|3
2354
+ ./Data/p228/8.wav|1
2355
+ ./Data/p232/80.wav|12
2356
+ ./Data/p273/56.wav|19
2357
+ ./Data/p256/93.wav|15
2358
+ ./Data/p258/50.wav|16
2359
+ ./Data/p231/41.wav|4
2360
+ ./Data/p236/76.wav|6
2361
+ ./Data/p229/65.wav|2
2362
+ ./Data/p243/46.wav|13
2363
+ ./Data/p228/31.wav|1
2364
+ ./Data/p240/89.wav|8
2365
+ ./Data/p240/119.wav|8
2366
+ ./Data/p243/31.wav|13
2367
+ ./Data/p273/122.wav|19
2368
+ ./Data/p236/113.wav|6
2369
+ ./Data/p232/67.wav|12
2370
+ ./Data/p270/188.wav|18
2371
+ ./Data/p256/46.wav|15
2372
+ ./Data/p230/12.wav|3
2373
+ ./Data/p236/156.wav|6
2374
+ ./Data/p243/157.wav|13
2375
+ ./Data/p239/22.wav|7
2376
+ ./Data/p232/107.wav|12
2377
+ ./Data/p229/28.wav|2
2378
+ ./Data/p236/31.wav|6
2379
+ ./Data/p254/50.wav|14
2380
+ ./Data/p232/43.wav|12
2381
+ ./Data/p244/142.wav|9
2382
+ ./Data/p270/186.wav|18
2383
+ ./Data/p258/139.wav|16
2384
+ ./Data/p228/156.wav|1
2385
+ ./Data/p256/12.wav|15
2386
+ ./Data/p256/63.wav|15
2387
+ ./Data/p230/116.wav|3
2388
+ ./Data/p254/131.wav|14
2389
+ ./Data/p243/139.wav|13
2390
+ ./Data/p226/93.wav|10
2391
+ ./Data/p239/98.wav|7
2392
+ ./Data/p256/94.wav|15
2393
+ ./Data/p243/102.wav|13
2394
+ ./Data/p240/3.wav|8
2395
+ ./Data/p258/86.wav|16
2396
+ ./Data/p227/28.wav|11
2397
+ ./Data/p228/49.wav|1
2398
+ ./Data/p270/47.wav|18
2399
+ ./Data/p226/88.wav|10
2400
+ ./Data/p232/36.wav|12
2401
+ ./Data/p259/90.wav|17
2402
+ ./Data/p244/5.wav|9
2403
+ ./Data/p243/122.wav|13
2404
+ ./Data/p254/10.wav|14
2405
+ ./Data/p254/64.wav|14
2406
+ ./Data/p273/47.wav|19
2407
+ ./Data/p243/171.wav|13
2408
+ ./Data/p243/4.wav|13
2409
+ ./Data/p230/128.wav|3
2410
+ ./Data/p229/84.wav|2
2411
+ ./Data/p259/39.wav|17
2412
+ ./Data/p236/5.wav|6
2413
+ ./Data/p225/47.wav|0
2414
+ ./Data/p258/134.wav|16
2415
+ ./Data/p259/12.wav|17
2416
+ ./Data/p244/9.wav|9
2417
+ ./Data/p227/98.wav|11
2418
+ ./Data/p227/65.wav|11
2419
+ ./Data/p226/114.wav|10
2420
+ ./Data/p229/113.wav|2
2421
+ ./Data/p240/16.wav|8
2422
+ ./Data/p227/118.wav|11
2423
+ ./Data/p258/56.wav|16
2424
+ ./Data/p270/150.wav|18
2425
+ ./Data/p256/85.wav|15
2426
+ ./Data/p259/92.wav|17
2427
+ ./Data/p239/84.wav|7
2428
+ ./Data/p240/86.wav|8
2429
+ ./Data/p225/11.wav|0
2430
+ ./Data/p226/25.wav|10
2431
+ ./Data/p270/65.wav|18
2432
+ ./Data/p239/79.wav|7
2433
+ ./Data/p240/76.wav|8
2434
+ ./Data/p270/190.wav|18
2435
+ ./Data/p236/163.wav|6
2436
+ ./Data/p236/36.wav|6
2437
+ ./Data/p240/41.wav|8
2438
+ ./Data/p226/2.wav|10
2439
+ ./Data/p230/104.wav|3
2440
+ ./Data/p243/106.wav|13
2441
+ ./Data/p243/90.wav|13
2442
+ ./Data/p240/27.wav|8
2443
+ ./Data/p240/30.wav|8
2444
+ ./Data/p231/121.wav|4
2445
+ ./Data/p239/8.wav|7
2446
+ ./Data/p230/10.wav|3
2447
+ ./Data/p239/104.wav|7
2448
+ ./Data/p233/96.wav|5
2449
+ ./Data/p236/33.wav|6
2450
+ ./Data/p254/85.wav|14
2451
+ ./Data/p227/26.wav|11
2452
+ ./Data/p233/134.wav|5
2453
+ ./Data/p230/48.wav|3
2454
+ ./Data/p232/59.wav|12
2455
+ ./Data/p239/156.wav|7
2456
+ ./Data/p236/84.wav|6
2457
+ ./Data/p228/63.wav|1
2458
+ ./Data/p229/24.wav|2
2459
+ ./Data/p236/155.wav|6
2460
+ ./Data/p228/138.wav|1
2461
+ ./Data/p270/185.wav|18
2462
+ ./Data/p228/76.wav|1
2463
+ ./Data/p254/115.wav|14
2464
+ ./Data/p231/52.wav|4
2465
+ ./Data/p273/8.wav|19
2466
+ ./Data/p228/132.wav|1
2467
+ ./Data/p273/59.wav|19
2468
+ ./Data/p229/73.wav|2
2469
+ ./Data/p259/152.wav|17
2470
+ ./Data/p230/31.wav|3
2471
+ ./Data/p230/35.wav|3
2472
+ ./Data/p258/80.wav|16
2473
+ ./Data/p225/61.wav|0
2474
+ ./Data/p236/21.wav|6
2475
+ ./Data/p232/127.wav|12
2476
+ ./Data/p256/72.wav|15
2477
+ ./Data/p244/123.wav|9
2478
+ ./Data/p244/141.wav|9
2479
+ ./Data/p270/69.wav|18
2480
+ ./Data/p227/51.wav|11
2481
+ ./Data/p273/11.wav|19
2482
+ ./Data/p243/112.wav|13
2483
+ ./Data/p254/16.wav|14
2484
+ ./Data/p226/3.wav|10
2485
+ ./Data/p231/36.wav|4
2486
+ ./Data/p243/159.wav|13
2487
+ ./Data/p228/55.wav|1
2488
+ ./Data/p229/18.wav|2
2489
+ ./Data/p273/22.wav|19
2490
+ ./Data/p270/101.wav|18
2491
+ ./Data/p227/62.wav|11
2492
+ ./Data/p270/111.wav|18
2493
+ ./Data/p254/73.wav|14
2494
+ ./Data/p256/81.wav|15
2495
+ ./Data/p226/116.wav|10
2496
+ ./Data/p236/154.wav|6
2497
+ ./Data/p233/98.wav|5
2498
+ ./Data/p239/68.wav|7
2499
+ ./Data/p273/69.wav|19
2500
+ ./Data/p236/92.wav|6
2501
+ ./Data/p273/81.wav|19
2502
+ ./Data/p225/43.wav|0
2503
+ ./Data/p230/27.wav|3
2504
+ ./Data/p227/54.wav|11
2505
+ ./Data/p233/113.wav|5
2506
+ ./Data/p236/23.wav|6
2507
+ ./Data/p236/51.wav|6
2508
+ ./Data/p233/50.wav|5
2509
+ ./Data/p225/76.wav|0
2510
+ ./Data/p244/21.wav|9
2511
+ ./Data/p228/53.wav|1
2512
+ ./Data/p240/148.wav|8
2513
+ ./Data/p243/173.wav|13
2514
+ ./Data/p270/105.wav|18
2515
+ ./Data/p227/13.wav|11
2516
+ ./Data/p228/121.wav|1
2517
+ ./Data/p233/128.wav|5
2518
+ ./Data/p256/82.wav|15
2519
+ ./Data/p244/76.wav|9
2520
+ ./Data/p232/9.wav|12
2521
+ ./Data/p239/4.wav|7
2522
+ ./Data/p240/106.wav|8
2523
+ ./Data/p270/81.wav|18
2524
+ ./Data/p225/48.wav|0
2525
+ ./Data/p254/67.wav|14
2526
+ ./Data/p240/66.wav|8
2527
+ ./Data/p259/47.wav|17
2528
+ ./Data/p230/63.wav|3
2529
+ ./Data/p230/141.wav|3
2530
+ ./Data/p231/137.wav|4
2531
+ ./Data/p227/133.wav|11
2532
+ ./Data/p259/100.wav|17
2533
+ ./Data/p259/171.wav|17
2534
+ ./Data/p240/56.wav|8
2535
+ ./Data/p273/126.wav|19
2536
+ ./Data/p256/32.wav|15
2537
+ ./Data/p270/79.wav|18
2538
+ ./Data/p227/46.wav|11
2539
+ ./Data/p228/51.wav|1
2540
+ ./Data/p243/54.wav|13
2541
+ ./Data/p258/141.wav|16
2542
+ ./Data/p226/31.wav|10
2543
+ ./Data/p236/137.wav|6
2544
+ ./Data/p230/30.wav|3
2545
+ ./Data/p236/34.wav|6
2546
+ ./Data/p228/35.wav|1
2547
+ ./Data/p244/56.wav|9
2548
+ ./Data/p230/107.wav|3
2549
+ ./Data/p240/36.wav|8
2550
+ ./Data/p233/62.wav|5
2551
+ ./Data/p239/112.wav|7
2552
+ ./Data/p231/42.wav|4
2553
+ ./Data/p256/9.wav|15
2554
+ ./Data/p227/23.wav|11
2555
+ ./Data/p236/32.wav|6
2556
+ ./Data/p228/67.wav|1
2557
+ ./Data/p225/72.wav|0
2558
+ ./Data/p232/82.wav|12
2559
+ ./Data/p244/68.wav|9
2560
+ ./Data/p230/145.wav|3
2561
+ ./Data/p239/5.wav|7
2562
+ ./Data/p230/154.wav|3
2563
+ ./Data/p232/98.wav|12
2564
+ ./Data/p243/136.wav|13
2565
+ ./Data/p228/115.wav|1
2566
+ ./Data/p226/5.wav|10
2567
+ ./Data/p240/52.wav|8
2568
+ ./Data/p270/170.wav|18
2569
+ ./Data/p243/93.wav|13
2570
+ ./Data/p243/26.wav|13
2571
+ ./Data/p230/136.wav|3
2572
+ ./Data/p226/97.wav|10
2573
+ ./Data/p229/136.wav|2
2574
+ ./Data/p227/136.wav|11
2575
+ ./Data/p236/119.wav|6
2576
+ ./Data/p232/14.wav|12
2577
+ ./Data/p254/138.wav|14
2578
+ ./Data/p240/143.wav|8
2579
+ ./Data/p259/122.wav|17
2580
+ ./Data/p270/205.wav|18
2581
+ ./Data/p254/100.wav|14
2582
+ ./Data/p270/149.wav|18
2583
+ ./Data/p259/9.wav|17
2584
+ ./Data/p226/96.wav|10
2585
+ ./Data/p230/23.wav|3
2586
+ ./Data/p244/72.wav|9
2587
+ ./Data/p259/73.wav|17
2588
+ ./Data/p227/68.wav|11
2589
+ ./Data/p226/75.wav|10
2590
+ ./Data/p236/109.wav|6
2591
+ ./Data/p258/102.wav|16
2592
+ ./Data/p232/44.wav|12
2593
+ ./Data/p243/27.wav|13
2594
+ ./Data/p232/126.wav|12
2595
+ ./Data/p240/14.wav|8
2596
+ ./Data/p226/71.wav|10
2597
+ ./Data/p230/88.wav|3
2598
+ ./Data/p233/45.wav|5
2599
+ ./Data/p244/103.wav|9
2600
+ ./Data/p232/26.wav|12
2601
+ ./Data/p229/101.wav|2
2602
+ ./Data/p229/44.wav|2
2603
+ ./Data/p232/123.wav|12
2604
+ ./Data/p228/129.wav|1
2605
+ ./Data/p273/32.wav|19
2606
+ ./Data/p232/125.wav|12
2607
+ ./Data/p240/103.wav|8
2608
+ ./Data/p254/128.wav|14
2609
+ ./Data/p254/34.wav|14
2610
+ ./Data/p240/19.wav|8
2611
+ ./Data/p232/89.wav|12
2612
+ ./Data/p273/73.wav|19
2613
+ ./Data/p231/109.wav|4
2614
+ ./Data/p270/124.wav|18
2615
+ ./Data/p244/112.wav|9
2616
+ ./Data/p256/117.wav|15
2617
+ ./Data/p244/88.wav|9
2618
+ ./Data/p228/17.wav|1
2619
+ ./Data/p233/86.wav|5
2620
+ ./Data/p254/23.wav|14
2621
+ ./Data/p233/59.wav|5
2622
+ ./Data/p232/25.wav|12
2623
+ ./Data/p231/108.wav|4
2624
+ ./Data/p258/103.wav|16
2625
+ ./Data/p232/69.wav|12
2626
+ ./Data/p230/65.wav|3
2627
+ ./Data/p240/73.wav|8
2628
+ ./Data/p243/125.wav|13
2629
+ ./Data/p256/92.wav|15
2630
+ ./Data/p270/31.wav|18
2631
+ ./Data/p256/44.wav|15
2632
+ ./Data/p236/98.wav|6
2633
+ ./Data/p228/90.wav|1
2634
+ ./Data/p231/125.wav|4
2635
+ ./Data/p232/64.wav|12
2636
+ ./Data/p273/80.wav|19
2637
+ ./Data/p227/32.wav|11
2638
+ ./Data/p226/17.wav|10
2639
+ ./Data/p226/69.wav|10
2640
+ ./Data/p231/142.wav|4
2641
+ ./Data/p225/65.wav|0
2642
+ ./Data/p229/64.wav|2
2643
+ ./Data/p240/70.wav|8
2644
+ ./Data/p225/85.wav|0
2645
+ ./Data/p259/166.wav|17
2646
+ ./Data/p230/119.wav|3
2647
+ ./Data/p258/135.wav|16
2648
+ ./Data/p225/60.wav|0
2649
+ ./Data/p239/74.wav|7
2650
+ ./Data/p233/117.wav|5
2651
+ ./Data/p226/44.wav|10
2652
+ ./Data/p227/103.wav|11
2653
+ ./Data/p228/45.wav|1
2654
+ ./Data/p244/52.wav|9
2655
+ ./Data/p230/168.wav|3
2656
+ ./Data/p259/71.wav|17
2657
+ ./Data/p270/109.wav|18
2658
+ ./Data/p243/164.wav|13
2659
+ ./Data/p243/36.wav|13
2660
+ ./Data/p270/12.wav|18
2661
+ ./Data/p229/125.wav|2
2662
+ ./Data/p259/51.wav|17
2663
+ ./Data/p225/81.wav|0
2664
+ ./Data/p240/133.wav|8
2665
+ ./Data/p270/130.wav|18
2666
+ ./Data/p228/37.wav|1
2667
+ ./Data/p228/39.wav|1
2668
+ ./Data/p240/35.wav|8
2669
+ ./Data/p231/124.wav|4
2670
+ ./Data/p244/121.wav|9
2671
+ ./Data/p270/133.wav|18
2672
+ ./Data/p227/110.wav|11
2673
+ ./Data/p244/134.wav|9
2674
+ ./Data/p254/59.wav|14
2675
+ ./Data/p239/35.wav|7
2676
+ ./Data/p236/150.wav|6
2677
+ ./Data/p227/40.wav|11
2678
+ ./Data/p258/13.wav|16
2679
+ ./Data/p240/123.wav|8
2680
+ ./Data/p231/141.wav|4
2681
+ ./Data/p228/151.wav|1
2682
+ ./Data/p236/45.wav|6
2683
+ ./Data/p273/5.wav|19
2684
+ ./Data/p231/113.wav|4
2685
+ ./Data/p256/103.wav|15
2686
+ ./Data/p227/87.wav|11
2687
+ ./Data/p270/173.wav|18
2688
+ ./Data/p243/104.wav|13
2689
+ ./Data/p240/141.wav|8
2690
+ ./Data/p240/128.wav|8
2691
+ ./Data/p259/50.wav|17
2692
+ ./Data/p231/8.wav|4
2693
+ ./Data/p226/82.wav|10
2694
+ ./Data/p243/110.wav|13
2695
+ ./Data/p243/101.wav|13
2696
+ ./Data/p259/132.wav|17
2697
+ ./Data/p227/99.wav|11
2698
+ ./Data/p259/42.wav|17
2699
+ ./Data/p229/29.wav|2
2700
+ ./Data/p236/104.wav|6
2701
+ ./Data/p259/34.wav|17
2702
+ ./Data/p254/117.wav|14
2703
+ ./Data/p227/29.wav|11
2704
+ ./Data/p258/111.wav|16
2705
+ ./Data/p229/9.wav|2
2706
+ ./Data/p240/26.wav|8
2707
+ ./Data/p259/89.wav|17
2708
+ ./Data/p270/21.wav|18
2709
+ ./Data/p254/101.wav|14
2710
+ ./Data/p259/40.wav|17
2711
+ ./Data/p240/7.wav|8
2712
+ ./Data/p240/114.wav|8
2713
+ ./Data/p230/176.wav|3
2714
+ ./Data/p231/47.wav|4
2715
+ ./Data/p239/37.wav|7
2716
+ ./Data/p232/51.wav|12
2717
+ ./Data/p270/142.wav|18
2718
+ ./Data/p254/6.wav|14
2719
+ ./Data/p225/50.wav|0
2720
+ ./Data/p227/91.wav|11
2721
+ ./Data/p259/149.wav|17
2722
+ ./Data/p259/125.wav|17
2723
+ ./Data/p229/107.wav|2
2724
+ ./Data/p228/10.wav|1
2725
+ ./Data/p231/107.wav|4
Data/val_list.txt ADDED
@@ -0,0 +1,303 @@
1
+ ./Data/p270/13.wav|18
2
+ ./Data/p273/94.wav|19
3
+ ./Data/p229/97.wav|2
4
+ ./Data/p232/117.wav|12
5
+ ./Data/p226/55.wav|10
6
+ ./Data/p259/102.wav|17
7
+ ./Data/p226/7.wav|10
8
+ ./Data/p254/26.wav|14
9
+ ./Data/p239/115.wav|7
10
+ ./Data/p239/86.wav|7
11
+ ./Data/p229/106.wav|2
12
+ ./Data/p244/43.wav|9
13
+ ./Data/p270/179.wav|18
14
+ ./Data/p273/6.wav|19
15
+ ./Data/p258/101.wav|16
16
+ ./Data/p273/62.wav|19
17
+ ./Data/p228/11.wav|1
18
+ ./Data/p273/103.wav|19
19
+ ./Data/p230/49.wav|3
20
+ ./Data/p233/23.wav|5
21
+ ./Data/p230/122.wav|3
22
+ ./Data/p239/80.wav|7
23
+ ./Data/p226/4.wav|10
24
+ ./Data/p240/50.wav|8
25
+ ./Data/p243/28.wav|13
26
+ ./Data/p236/95.wav|6
27
+ ./Data/p244/126.wav|9
28
+ ./Data/p244/40.wav|9
29
+ ./Data/p239/108.wav|7
30
+ ./Data/p273/72.wav|19
31
+ ./Data/p254/92.wav|14
32
+ ./Data/p231/116.wav|4
33
+ ./Data/p231/32.wav|4
34
+ ./Data/p243/117.wav|13
35
+ ./Data/p256/121.wav|15
36
+ ./Data/p243/3.wav|13
37
+ ./Data/p226/91.wav|10
38
+ ./Data/p256/53.wav|15
39
+ ./Data/p254/75.wav|14
40
+ ./Data/p243/150.wav|13
41
+ ./Data/p231/95.wav|4
42
+ ./Data/p228/81.wav|1
43
+ ./Data/p226/33.wav|10
44
+ ./Data/p232/71.wav|12
45
+ ./Data/p236/4.wav|6
46
+ ./Data/p236/132.wav|6
47
+ ./Data/p254/119.wav|14
48
+ ./Data/p236/7.wav|6
49
+ ./Data/p227/104.wav|11
50
+ ./Data/p226/59.wav|10
51
+ ./Data/p233/35.wav|5
52
+ ./Data/p231/23.wav|4
53
+ ./Data/p273/71.wav|19
54
+ ./Data/p240/74.wav|8
55
+ ./Data/p259/33.wav|17
56
+ ./Data/p259/118.wav|17
57
+ ./Data/p273/15.wav|19
58
+ ./Data/p226/115.wav|10
59
+ ./Data/p236/19.wav|6
60
+ ./Data/p226/57.wav|10
61
+ ./Data/p229/14.wav|2
62
+ ./Data/p243/98.wav|13
63
+ ./Data/p243/79.wav|13
64
+ ./Data/p231/12.wav|4
65
+ ./Data/p230/170.wav|3
66
+ ./Data/p228/114.wav|1
67
+ ./Data/p254/103.wav|14
68
+ ./Data/p256/108.wav|15
69
+ ./Data/p256/58.wav|15
70
+ ./Data/p229/23.wav|2
71
+ ./Data/p270/151.wav|18
72
+ ./Data/p259/36.wav|17
73
+ ./Data/p230/64.wav|3
74
+ ./Data/p226/134.wav|10
75
+ ./Data/p230/84.wav|3
76
+ ./Data/p270/91.wav|18
77
+ ./Data/p230/160.wav|3
78
+ ./Data/p236/15.wav|6
79
+ ./Data/p225/45.wav|0
80
+ ./Data/p239/62.wav|7
81
+ ./Data/p256/107.wav|15
82
+ ./Data/p258/144.wav|16
83
+ ./Data/p229/37.wav|2
84
+ ./Data/p226/108.wav|10
85
+ ./Data/p225/92.wav|0
86
+ ./Data/p227/138.wav|11
87
+ ./Data/p230/151.wav|3
88
+ ./Data/p229/90.wav|2
89
+ ./Data/p244/131.wav|9
90
+ ./Data/p231/1.wav|4
91
+ ./Data/p243/40.wav|13
92
+ ./Data/p226/131.wav|10
93
+ ./Data/p226/121.wav|10
94
+ ./Data/p270/119.wav|18
95
+ ./Data/p225/4.wav|0
96
+ ./Data/p243/39.wav|13
97
+ ./Data/p233/1.wav|5
98
+ ./Data/p239/117.wav|7
99
+ ./Data/p259/101.wav|17
100
+ ./Data/p228/73.wav|1
101
+ ./Data/p273/78.wav|19
102
+ ./Data/p256/22.wav|15
103
+ ./Data/p244/65.wav|9
104
+ ./Data/p240/17.wav|8
105
+ ./Data/p258/47.wav|16
106
+ ./Data/p239/95.wav|7
107
+ ./Data/p243/119.wav|13
108
+ ./Data/p259/106.wav|17
109
+ ./Data/p233/22.wav|5
110
+ ./Data/p232/60.wav|12
111
+ ./Data/p270/55.wav|18
112
+ ./Data/p230/87.wav|3
113
+ ./Data/p270/139.wav|18
114
+ ./Data/p225/5.wav|0
115
+ ./Data/p243/128.wav|13
116
+ ./Data/p258/10.wav|16
117
+ ./Data/p230/100.wav|3
118
+ ./Data/p239/43.wav|7
119
+ ./Data/p232/57.wav|12
120
+ ./Data/p256/27.wav|15
121
+ ./Data/p232/130.wav|12
122
+ ./Data/p243/153.wav|13
123
+ ./Data/p258/92.wav|16
124
+ ./Data/p232/81.wav|12
125
+ ./Data/p256/65.wav|15
126
+ ./Data/p259/107.wav|17
127
+ ./Data/p239/10.wav|7
128
+ ./Data/p233/4.wav|5
129
+ ./Data/p259/165.wav|17
130
+ ./Data/p225/41.wav|0
131
+ ./Data/p229/61.wav|2
132
+ ./Data/p227/36.wav|11
133
+ ./Data/p243/62.wav|13
134
+ ./Data/p259/31.wav|17
135
+ ./Data/p231/75.wav|4
136
+ ./Data/p233/31.wav|5
137
+ ./Data/p273/66.wav|19
138
+ ./Data/p226/6.wav|10
139
+ ./Data/p243/162.wav|13
140
+ ./Data/p229/21.wav|2
141
+ ./Data/p230/11.wav|3
142
+ ./Data/p231/84.wav|4
143
+ ./Data/p273/118.wav|19
144
+ ./Data/p227/92.wav|11
145
+ ./Data/p256/110.wav|15
146
+ ./Data/p230/105.wav|3
147
+ ./Data/p239/75.wav|7
148
+ ./Data/p229/78.wav|2
149
+ ./Data/p254/111.wav|14
150
+ ./Data/p232/24.wav|12
151
+ ./Data/p233/19.wav|5
152
+ ./Data/p233/52.wav|5
153
+ ./Data/p258/143.wav|16
154
+ ./Data/p254/135.wav|14
155
+ ./Data/p232/37.wav|12
156
+ ./Data/p244/81.wav|9
157
+ ./Data/p270/161.wav|18
158
+ ./Data/p233/43.wav|5
159
+ ./Data/p240/40.wav|8
160
+ ./Data/p244/70.wav|9
161
+ ./Data/p254/1.wav|14
162
+ ./Data/p229/96.wav|2
163
+ ./Data/p243/99.wav|13
164
+ ./Data/p259/20.wav|17
165
+ ./Data/p233/66.wav|5
166
+ ./Data/p239/88.wav|7
167
+ ./Data/p225/71.wav|0
168
+ ./Data/p227/143.wav|11
169
+ ./Data/p228/142.wav|1
170
+ ./Data/p231/135.wav|4
171
+ ./Data/p254/107.wav|14
172
+ ./Data/p233/36.wav|5
173
+ ./Data/p232/19.wav|12
174
+ ./Data/p258/113.wav|16
175
+ ./Data/p243/96.wav|13
176
+ ./Data/p273/90.wav|19
177
+ ./Data/p225/13.wav|0
178
+ ./Data/p228/32.wav|1
179
+ ./Data/p229/60.wav|2
180
+ ./Data/p273/14.wav|19
181
+ ./Data/p239/25.wav|7
182
+ ./Data/p256/31.wav|15
183
+ ./Data/p225/40.wav|0
184
+ ./Data/p273/43.wav|19
185
+ ./Data/p270/206.wav|18
186
+ ./Data/p244/19.wav|9
187
+ ./Data/p244/83.wav|9
188
+ ./Data/p259/134.wav|17
189
+ ./Data/p244/91.wav|9
190
+ ./Data/p225/80.wav|0
191
+ ./Data/p227/60.wav|11
192
+ ./Data/p244/128.wav|9
193
+ ./Data/p256/80.wav|15
194
+ ./Data/p256/15.wav|15
195
+ ./Data/p244/34.wav|9
196
+ ./Data/p256/69.wav|15
197
+ ./Data/p228/15.wav|1
198
+ ./Data/p232/65.wav|12
199
+ ./Data/p273/65.wav|19
200
+ ./Data/p239/124.wav|7
201
+ ./Data/p259/15.wav|17
202
+ ./Data/p226/137.wav|10
203
+ ./Data/p243/75.wav|13
204
+ ./Data/p258/16.wav|16
205
+ ./Data/p232/6.wav|12
206
+ ./Data/p231/106.wav|4
207
+ ./Data/p228/6.wav|1
208
+ ./Data/p243/172.wav|13
209
+ ./Data/p236/77.wav|6
210
+ ./Data/p256/95.wav|15
211
+ ./Data/p256/76.wav|15
212
+ ./Data/p239/119.wav|7
213
+ ./Data/p236/108.wav|6
214
+ ./Data/p243/92.wav|13
215
+ ./Data/p232/129.wav|12
216
+ ./Data/p230/124.wav|3
217
+ ./Data/p228/9.wav|1
218
+ ./Data/p232/100.wav|12
219
+ ./Data/p254/5.wav|14
220
+ ./Data/p273/1.wav|19
221
+ ./Data/p236/47.wav|6
222
+ ./Data/p240/87.wav|8
223
+ ./Data/p229/127.wav|2
224
+ ./Data/p228/152.wav|1
225
+ ./Data/p225/24.wav|0
226
+ ./Data/p229/20.wav|2
227
+ ./Data/p233/12.wav|5
228
+ ./Data/p259/46.wav|17
229
+ ./Data/p231/72.wav|4
230
+ ./Data/p254/65.wav|14
231
+ ./Data/p231/18.wav|4
232
+ ./Data/p270/66.wav|18
233
+ ./Data/p233/44.wav|5
234
+ ./Data/p233/126.wav|5
235
+ ./Data/p233/58.wav|5
236
+ ./Data/p273/142.wav|19
237
+ ./Data/p228/26.wav|1
238
+ ./Data/p230/106.wav|3
239
+ ./Data/p228/109.wav|1
240
+ ./Data/p232/76.wav|12
241
+ ./Data/p226/37.wav|10
242
+ ./Data/p226/66.wav|10
243
+ ./Data/p270/75.wav|18
244
+ ./Data/p229/4.wav|2
245
+ ./Data/p239/166.wav|7
246
+ ./Data/p228/79.wav|1
247
+ ./Data/p230/43.wav|3
248
+ ./Data/p258/100.wav|16
249
+ ./Data/p244/93.wav|9
250
+ ./Data/p256/105.wav|15
251
+ ./Data/p236/12.wav|6
252
+ ./Data/p270/154.wav|18
253
+ ./Data/p244/75.wav|9
254
+ ./Data/p239/160.wav|7
255
+ ./Data/p239/174.wav|7
256
+ ./Data/p225/26.wav|0
257
+ ./Data/p232/49.wav|12
258
+ ./Data/p258/19.wav|16
259
+ ./Data/p273/13.wav|19
260
+ ./Data/p232/32.wav|12
261
+ ./Data/p270/42.wav|18
262
+ ./Data/p270/194.wav|18
263
+ ./Data/p259/174.wav|17
264
+ ./Data/p236/53.wav|6
265
+ ./Data/p232/77.wav|12
266
+ ./Data/p240/118.wav|8
267
+ ./Data/p239/175.wav|7
268
+ ./Data/p225/58.wav|0
269
+ ./Data/p232/1.wav|12
270
+ ./Data/p243/5.wav|13
271
+ ./Data/p229/41.wav|2
272
+ ./Data/p233/60.wav|5
273
+ ./Data/p236/138.wav|6
274
+ ./Data/p258/54.wav|16
275
+ ./Data/p254/22.wav|14
276
+ ./Data/p254/76.wav|14
277
+ ./Data/p228/25.wav|1
278
+ ./Data/p259/61.wav|17
279
+ ./Data/p270/135.wav|18
280
+ ./Data/p231/136.wav|4
281
+ ./Data/p232/105.wav|12
282
+ ./Data/p259/35.wav|17
283
+ ./Data/p244/57.wav|9
284
+ ./Data/p226/104.wav|10
285
+ ./Data/p258/48.wav|16
286
+ ./Data/p229/139.wav|2
287
+ ./Data/p239/65.wav|7
288
+ ./Data/p228/74.wav|1
289
+ ./Data/p233/25.wav|5
290
+ ./Data/p243/16.wav|13
291
+ ./Data/p243/165.wav|13
292
+ ./Data/p229/46.wav|2
293
+ ./Data/p226/41.wav|10
294
+ ./Data/p228/160.wav|1
295
+ ./Data/p230/90.wav|3
296
+ ./Data/p270/184.wav|18
297
+ ./Data/p259/55.wav|17
298
+ ./Data/p232/31.wav|12
299
+ ./Data/p231/78.wav|4
300
+ ./Data/p259/78.wav|17
301
+ ./Data/p273/33.wav|19
302
+ ./Data/p256/40.wav|15
303
+ ./Data/p258/116.wav|16
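Each entry in `Data/train_list.txt` and `Data/val_list.txt` above follows a `wav_path|speaker_index` format. As a minimal sketch (the helper name `parse_list` is ours, not part of this repo), the lists can be parsed like this:

```python
def parse_list(lines):
    """Parse 'wav_path|speaker_index' lines into (path, index) pairs.

    Hypothetical helper for illustration; the repo's own data loader
    may differ.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # split on the last '|' so paths containing '|' would still work
        path, speaker = line.rsplit("|", 1)
        pairs.append((path, int(speaker)))
    return pairs

pairs = parse_list(["./Data/p270/13.wav|18", "./Data/p273/94.wav|19"])
# pairs == [("./Data/p270/13.wav", 18), ("./Data/p273/94.wav", 19)]
```

The integer after `|` is the domain (speaker) index, matching `num_domains: 20` in `Configs/config.yml`.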
Demo/inference.ipynb ADDED
@@ -0,0 +1,471 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {
6
+ "id": "HwaQq4GRU_Nw"
7
+ },
8
+ "source": [
9
+ "# StarGANv2-VC Demo (VCTK 20 Speakers)"
10
+ ]
11
+ },
12
+ {
13
+ "cell_type": "markdown",
14
+ "metadata": {
15
+ "id": "hCpoXuZeGKAn"
16
+ },
17
+ "source": [
18
+ "### Utils"
19
+ ]
20
+ },
21
+ {
22
+ "cell_type": "code",
23
+ "execution_count": null,
24
+ "metadata": {},
25
+ "outputs": [],
26
+ "source": [
27
+ "%cd .."
28
+ ]
29
+ },
30
+ {
31
+ "cell_type": "code",
32
+ "execution_count": null,
33
+ "metadata": {
34
+ "colab": {
35
+ "base_uri": "https://localhost:8080/"
36
+ },
37
+ "executionInfo": {
38
+ "elapsed": 24923,
39
+ "status": "ok",
40
+ "timestamp": 1613984920200,
41
+ "user": {
42
+ "displayName": "Yinghao Li",
43
+ "photoUrl": "",
44
+ "userId": "12798981472803960591"
45
+ },
46
+ "user_tz": 300
47
+ },
48
+ "id": "3on9IjGhVGTP",
49
+ "outputId": "63a799f8-564d-48c2-fb0f-e66c0cd9fdb8"
50
+ },
51
+ "outputs": [],
52
+ "source": [
53
+ "# load packages\n",
54
+ "import random\n",
55
+ "import yaml\n",
56
+ "from munch import Munch\n",
57
+ "import numpy as np\n",
58
+ "import paddle\n",
59
+ "from paddle import nn\n",
60
+ "import paddle.nn.functional as F\n",
61
+ "import paddleaudio\n",
62
+ "import librosa\n",
63
+ "\n",
64
+ "from starganv2vc_paddle.Utils.ASR.models import ASRCNN\n",
65
+ "from starganv2vc_paddle.Utils.JDC.model import JDCNet\n",
66
+ "from starganv2vc_paddle.models import Generator, MappingNetwork, StyleEncoder\n",
67
+ "\n",
68
+ "%matplotlib inline"
69
+ ]
70
+ },
71
+ {
72
+ "cell_type": "code",
73
+ "execution_count": null,
74
+ "metadata": {},
75
+ "outputs": [],
76
+ "source": [
77
+ "# Source: http://speech.ee.ntu.edu.tw/~jjery2243542/resource/model/is18/en_speaker_used.txt\n",
78
+ "# Source: https://github.com/jjery2243542/voice_conversion\n",
79
+ "\n",
80
+ "speakers = [225,228,229,230,231,233,236,239,240,244,226,227,232,243,254,256,258,259,270,273]\n",
81
+ "\n",
82
+ "to_mel = paddleaudio.features.MelSpectrogram(\n",
83
+ " n_mels=80, n_fft=2048, win_length=1200, hop_length=300)\n",
84
+ "to_mel.fbank_matrix[:] = paddle.load('starganv2vc_paddle/fbank_matrix.pd')['fbank_matrix']\n",
85
+ "mean, std = -4, 4\n",
86
+ "\n",
87
+ "def preprocess(wave):\n",
88
+ " wave_tensor = paddle.to_tensor(wave).astype(paddle.float32)\n",
89
+ " mel_tensor = to_mel(wave_tensor)\n",
90
+ " mel_tensor = (paddle.log(1e-5 + mel_tensor.unsqueeze(0)) - mean) / std\n",
91
+ " return mel_tensor\n",
92
+ "\n",
93
+ "def build_model(model_params={}):\n",
94
+ " args = Munch(model_params)\n",
95
+ " generator = Generator(args.dim_in, args.style_dim, args.max_conv_dim, w_hpf=args.w_hpf, F0_channel=args.F0_channel)\n",
96
+ " mapping_network = MappingNetwork(args.latent_dim, args.style_dim, args.num_domains, hidden_dim=args.max_conv_dim)\n",
97
+ " style_encoder = StyleEncoder(args.dim_in, args.style_dim, args.num_domains, args.max_conv_dim)\n",
98
+ " \n",
99
+ " nets_ema = Munch(generator=generator,\n",
100
+ " mapping_network=mapping_network,\n",
101
+ " style_encoder=style_encoder)\n",
102
+ "\n",
103
+ " return nets_ema\n",
104
+ "\n",
105
+ "def compute_style(speaker_dicts):\n",
106
+ " reference_embeddings = {}\n",
107
+ " for key, (path, speaker) in speaker_dicts.items():\n",
108
+ " if path == \"\":\n",
109
+ " label = paddle.to_tensor([speaker], dtype=paddle.int64)\n",
110
+ " latent_dim = starganv2.mapping_network.shared[0].weight.shape[0]\n",
111
+ " ref = starganv2.mapping_network(paddle.randn([1, latent_dim]), label)\n",
112
+ " else:\n",
113
+ " wave, sr = librosa.load(path, sr=24000)\n",
114
+ " audio, index = librosa.effects.trim(wave, top_db=30)\n",
115
+ " if sr != 24000:\n",
116
+ " wave = librosa.resample(wave, sr, 24000)\n",
117
+ " mel_tensor = preprocess(wave)\n",
118
+ "\n",
119
+ " with paddle.no_grad():\n",
120
+ " label = paddle.to_tensor([speaker], dtype=paddle.int64)\n",
121
+ " ref = starganv2.style_encoder(mel_tensor.unsqueeze(1), label)\n",
122
+ " reference_embeddings[key] = (ref, label)\n",
123
+ " \n",
124
+ " return reference_embeddings"
125
+ ]
126
+ },
127
+ {
128
+ "cell_type": "markdown",
129
+ "metadata": {},
130
+ "source": [
131
+ "### Load models"
132
+ ]
133
+ },
134
+ {
135
+ "cell_type": "code",
136
+ "execution_count": null,
137
+ "metadata": {},
138
+ "outputs": [],
139
+ "source": [
140
+ "# load F0 model\n",
141
+ "\n",
142
+ "F0_model = JDCNet(num_class=1, seq_len=192)\n",
143
+ "params = paddle.load(\"Models/bst.pd\")['net']\n",
144
+ "F0_model.set_state_dict(params)\n",
145
+ "_ = F0_model.eval()"
146
+ ]
147
+ },
148
+ {
149
+ "cell_type": "code",
150
+ "execution_count": null,
151
+ "metadata": {
152
+ "executionInfo": {
153
+ "elapsed": 43003,
154
+ "status": "ok",
155
+ "timestamp": 1613984938321,
156
+ "user": {
157
+ "displayName": "Yinghao Li",
158
+ "photoUrl": "",
159
+ "userId": "12798981472803960591"
160
+ },
161
+ "user_tz": 300
162
+ },
163
+ "id": "NZA3ot-oF5t-"
164
+ },
165
+ "outputs": [],
166
+ "source": [
167
+ "# load vocoder\n",
168
+ "\n",
169
+ "import yaml\n",
170
+ "import paddle\n",
171
+ "\n",
172
+ "from yacs.config import CfgNode\n",
173
+ "from paddlespeech.t2s.models.parallel_wavegan import PWGGenerator\n",
174
+ "\n",
175
+ "with open('Vocoder/config.yml') as f:\n",
176
+ " voc_config = CfgNode(yaml.safe_load(f))\n",
177
+ "voc_config[\"generator_params\"].pop(\"upsample_net\")\n",
178
+ "voc_config[\"generator_params\"][\"upsample_scales\"] = voc_config[\"generator_params\"].pop(\"upsample_params\")[\"upsample_scales\"]\n",
179
+ "vocoder = PWGGenerator(**voc_config[\"generator_params\"])\n",
180
+ "vocoder.remove_weight_norm()\n",
181
+ "vocoder.eval()\n",
182
+ "vocoder.set_state_dict(paddle.load('Vocoder/checkpoint-400000steps.pd'))"
183
+ ]
184
+ },
185
+ {
186
+ "cell_type": "code",
187
+ "execution_count": null,
188
+ "metadata": {
189
+ "colab": {
190
+ "base_uri": "https://localhost:8080/"
191
+ },
192
+ "executionInfo": {
193
+ "elapsed": 24462,
194
+ "status": "ok",
195
+ "timestamp": 1613985522414,
196
+ "user": {
197
+ "displayName": "Yinghao Li",
198
+ "photoUrl": "",
199
+ "userId": "12798981472803960591"
200
+ },
201
+ "user_tz": 300
202
+ },
203
+ "id": "Ou4367LCyefA",
204
+ "outputId": "19c61f6f-f39a-43b9-9275-09418c2aebb4"
205
+ },
206
+ "outputs": [],
207
+ "source": [
208
+ "# load starganv2\n",
209
+ "\n",
210
+ "model_path = 'Models/vc_ema.pd'\n",
211
+ "\n",
212
+ "with open('Models/config.yml') as f:\n",
213
+ " starganv2_config = yaml.safe_load(f)\n",
214
+ "starganv2 = build_model(model_params=starganv2_config[\"model_params\"])\n",
215
+ "params = paddle.load(model_path)\n",
216
+ "params = params['model_ema']\n",
217
+ "_ = [starganv2[key].set_state_dict(params[key]) for key in starganv2]\n",
218
+ "_ = [starganv2[key].eval() for key in starganv2]"
222
+ ]
223
+ },
224
+ {
225
+ "cell_type": "markdown",
226
+ "metadata": {},
227
+ "source": [
228
+ "### Conversion"
229
+ ]
230
+ },
231
+ {
232
+ "cell_type": "code",
233
+ "execution_count": null,
234
+ "metadata": {},
235
+ "outputs": [],
236
+ "source": [
237
+ "# load input wave\n",
238
+ "selected_speakers = [273, 259, 258, 243, 254, 244, 236, 233, 230, 228]\n",
239
+ "k = random.choice(selected_speakers)\n",
240
+ "wav_path = 'Demo/VCTK-corpus/p' + str(k) + '/p' + str(k) + '_023.wav'\n",
241
+ "audio, source_sr = librosa.load(wav_path, sr=24000)\n",
242
+ "audio = audio / np.max(np.abs(audio))\n",
243
+ "audio = audio.astype(np.float32)"
244
+ ]
245
+ },
246
+ {
247
+ "cell_type": "markdown",
248
+ "metadata": {},
249
+ "source": [
250
+ "#### Convert by style encoder"
251
+ ]
252
+ },
253
+ {
254
+ "cell_type": "code",
255
+ "execution_count": null,
256
+ "metadata": {},
257
+ "outputs": [],
258
+ "source": [
259
+ "# with reference, using style encoder\n",
260
+ "speaker_dicts = {}\n",
261
+ "for s in selected_speakers:\n",
262
+ " k = s\n",
263
+ " speaker_dicts['p' + str(s)] = ('Demo/VCTK-corpus/p' + str(k) + '/p' + str(k) + '_023.wav', speakers.index(s))\n",
264
+ "\n",
265
+ "reference_embeddings = compute_style(speaker_dicts)"
266
+ ]
267
+ },
268
+ {
269
+ "cell_type": "code",
270
+ "execution_count": null,
271
+ "metadata": {
272
+ "colab": {
273
+ "base_uri": "https://localhost:8080/",
274
+ "height": 333
275
+ },
276
+ "executionInfo": {
277
+ "elapsed": 1424,
278
+ "status": "ok",
279
+ "timestamp": 1613986299525,
280
+ "user": {
281
+ "displayName": "Yinghao Li",
282
+ "photoUrl": "",
283
+ "userId": "12798981472803960591"
284
+ },
285
+ "user_tz": 300
286
+ },
287
+ "id": "T5tahObUyN-d",
288
+ "outputId": "f4f38742-2235-4f59-cb2a-5008912cd870",
289
+ "scrolled": true
290
+ },
291
+ "outputs": [],
292
+ "source": [
293
+ "# conversion \n",
294
+ "import time\n",
295
+ "start = time.time()\n",
296
+ " \n",
297
+ "source = preprocess(audio)\n",
298
+ "keys = []\n",
299
+ "converted_samples = {}\n",
300
+ "reconstructed_samples = {}\n",
301
+ "converted_mels = {}\n",
302
+ "\n",
303
+ "for key, (ref, _) in reference_embeddings.items():\n",
304
+ " with paddle.no_grad():\n",
305
+ " f0_feat = F0_model.get_feature_GAN(source.unsqueeze(1))\n",
306
+ " out = starganv2.generator(source.unsqueeze(1), ref, F0=f0_feat)\n",
307
+ " \n",
308
+ " c = out.transpose([0,1,3,2]).squeeze()\n",
309
+ " y_out = vocoder.inference(c)\n",
310
+ " y_out = y_out.reshape([-1])\n",
311
+ "\n",
312
+ " if key not in speaker_dicts or speaker_dicts[key][0] == \"\":\n",
313
+ " recon = None\n",
314
+ " else:\n",
315
+ " wave, sr = librosa.load(speaker_dicts[key][0], sr=24000)\n",
316
+ " mel = preprocess(wave)\n",
317
+ " c = mel.transpose([0,2,1]).squeeze()\n",
318
+ " recon = vocoder.inference(c)\n",
319
+ " recon = recon.reshape([-1]).numpy()\n",
320
+ "\n",
321
+ " converted_samples[key] = y_out.numpy()\n",
322
+ " reconstructed_samples[key] = recon\n",
323
+ "\n",
324
+ " converted_mels[key] = out\n",
325
+ " \n",
326
+ " keys.append(key)\n",
327
+ "end = time.time()\n",
328
+ "print('total processing time: %.3f sec' % (end - start) )\n",
329
+ "\n",
330
+ "import IPython.display as ipd\n",
331
+ "for key, wave in converted_samples.items():\n",
332
+ " print('Converted: %s' % key)\n",
333
+ " display(ipd.Audio(wave, rate=24000))\n",
334
+ " print('Reference (vocoder): %s' % key)\n",
335
+ " if reconstructed_samples[key] is not None:\n",
336
+ " display(ipd.Audio(reconstructed_samples[key], rate=24000))\n",
337
+ "\n",
338
+ "print('Original (vocoder):')\n",
339
+ "wave, sr = librosa.load(wav_path, sr=24000)\n",
340
+ "mel = preprocess(wave)\n",
341
+ "c = mel.transpose([0,2,1]).squeeze()\n",
342
+ "with paddle.no_grad():\n",
343
+ " recon = vocoder.inference(c)\n",
344
+ " recon = recon.reshape([-1]).numpy()\n",
345
+ "display(ipd.Audio(recon, rate=24000))\n",
346
+ "print('Original:')\n",
347
+ "display(ipd.Audio(wav_path, rate=24000))"
348
+ ]
349
+ },
350
+ {
351
+ "cell_type": "markdown",
352
+ "metadata": {
353
+ "id": "SWh3o9hvGvJt"
354
+ },
355
+ "source": [
356
+ "#### Convert by mapping network"
357
+ ]
358
+ },
359
+ {
360
+ "cell_type": "code",
361
+ "execution_count": null,
362
+ "metadata": {},
363
+ "outputs": [],
364
+ "source": [
365
+ "# no reference, using mapping network\n",
366
+ "speaker_dicts = {}\n",
367
+ "selected_speakers = [273, 259, 258, 243, 254, 244, 236, 233, 230, 228]\n",
368
+ "for s in selected_speakers:\n",
369
+ " k = s\n",
370
+ " speaker_dicts['p' + str(s)] = ('', speakers.index(s))\n",
371
+ "\n",
372
+ "reference_embeddings = compute_style(speaker_dicts)"
373
+ ]
374
+ },
375
+ {
376
+ "cell_type": "code",
377
+ "execution_count": null,
378
+ "metadata": {
379
+ "scrolled": true
380
+ },
381
+ "outputs": [],
382
+ "source": [
383
+ "# conversion \n",
384
+ "import time\n",
385
+ "start = time.time()\n",
386
+ " \n",
387
+ "source = preprocess(audio)\n",
388
+ "keys = []\n",
389
+ "converted_samples = {}\n",
390
+ "reconstructed_samples = {}\n",
391
+ "converted_mels = {}\n",
392
+ "\n",
393
+ "for key, (ref, _) in reference_embeddings.items():\n",
394
+ " with paddle.no_grad():\n",
395
+ " f0_feat = F0_model.get_feature_GAN(source.unsqueeze(1))\n",
396
+ " out = starganv2.generator(source.unsqueeze(1), ref, F0=f0_feat)\n",
397
+ " \n",
398
+ " c = out.transpose([0,1,3,2]).squeeze()\n",
399
+ " y_out = vocoder.inference(c)\n",
400
+ " y_out = y_out.reshape([-1])\n",
401
+ "\n",
402
+ " if key not in speaker_dicts or speaker_dicts[key][0] == \"\":\n",
403
+ " recon = None\n",
404
+ " else:\n",
405
+ " wave, sr = librosa.load(speaker_dicts[key][0], sr=24000)\n",
406
+ " mel = preprocess(wave)\n",
407
+ " c = mel.transpose([0,2,1]).squeeze()\n",
408
+ " recon = vocoder.inference(c)\n",
409
+ " recon = recon.reshape([-1]).numpy()\n",
410
+ "\n",
411
+ " converted_samples[key] = y_out.numpy()\n",
412
+ " reconstructed_samples[key] = recon\n",
413
+ "\n",
414
+ " converted_mels[key] = out\n",
415
+ " \n",
416
+ " keys.append(key)\n",
417
+ "end = time.time()\n",
418
+ "print('total processing time: %.3f sec' % (end - start) )\n",
419
+ "\n",
420
+ "import IPython.display as ipd\n",
421
+ "for key, wave in converted_samples.items():\n",
422
+ " print('Converted: %s' % key)\n",
423
+ " display(ipd.Audio(wave, rate=24000))\n",
424
+ " print('Reference (vocoder): %s' % key)\n",
425
+ " if reconstructed_samples[key] is not None:\n",
426
+ " display(ipd.Audio(reconstructed_samples[key], rate=24000))\n",
427
+ "\n",
428
+ "print('Original (vocoder):')\n",
429
+ "wave, sr = librosa.load(wav_path, sr=24000)\n",
430
+ "mel = preprocess(wave)\n",
431
+ "c = mel.transpose([0,2,1]).squeeze()\n",
432
+ "with paddle.no_grad():\n",
433
+ " recon = vocoder.inference(c)\n",
434
+ " recon = recon.reshape([-1]).numpy()\n",
435
+ "display(ipd.Audio(recon, rate=24000))\n",
436
+ "print('Original:')\n",
437
+ "display(ipd.Audio(wav_path, rate=24000))"
438
+ ]
439
+ }
440
+ ],
441
+ "metadata": {
442
+ "accelerator": "GPU",
443
+ "colab": {
444
+ "collapsed_sections": [
445
+ "hCpoXuZeGKAn"
446
+ ],
447
+ "name": "Starganv2_vc.ipynb",
448
+ "provenance": [],
449
+ "toc_visible": true
450
+ },
451
+ "kernelspec": {
452
+ "display_name": "Python 3",
453
+ "language": "python",
454
+ "name": "python3"
455
+ },
456
+ "language_info": {
457
+ "codemirror_mode": {
458
+ "name": "ipython",
459
+ "version": 3
460
+ },
461
+ "file_extension": ".py",
462
+ "mimetype": "text/x-python",
463
+ "name": "python",
464
+ "nbconvert_exporter": "python",
465
+ "pygments_lexer": "ipython3",
466
+ "version": "3.7.10"
467
+ }
468
+ },
469
+ "nbformat": 4,
470
+ "nbformat_minor": 1
471
+ }
LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2022 Wu Hecong
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
README.md CHANGED
@@ -1,8 +1,8 @@
1
  ---
2
- title: Starganv2vc Paddle
3
- emoji: 🐨
4
- colorFrom: blue
5
- colorTo: indigo
6
  sdk: gradio
7
  sdk_version: 2.9.4
8
  app_file: app.py
@@ -10,4 +10,75 @@ pinned: false
10
  license: mit
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces#reference
1
  ---
2
+ title: StarGANv2 Voice Conversion on PaddlePaddle
3
+ emoji: 🗣️
4
+ colorFrom: green
5
+ colorTo: blue
6
  sdk: gradio
7
  sdk_version: 2.9.4
8
  app_file: app.py
 
10
  license: mit
11
  ---
12
 
13
+ # StarGANv2-VC-Paddle
14
+ [![Baidu AI Studio](https://img.shields.io/static/v1?label=Baidu&message=AI%20Studio%20Free%20A100&color=blue)](https://aistudio.baidu.com/aistudio/projectdetail/3955253)
15
+ [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/HighCWu/starganv2vc-paddle)
16
+
17
+ A PaddlePaddle implementation of [StarGANv2-VC](https://github.com/yl4579/StarGANv2-VC).
18
+
19
+ Download pretrained models [here](https://aistudio.baidu.com/aistudio/datasetdetail/145012).
20
+
21
+ Get started with a free V100/A100 in [AI Studio](https://aistudio.baidu.com/aistudio/projectdetail/3955253), or try it quickly in [Hugging Face Spaces](https://huggingface.co/spaces/HighCWu/starganv2vc-paddle).
22
+
23
+ ---
24
+
25
+ Original PyTorch Repo [README](https://github.com/yl4579/StarGANv2-VC) 👇
26
+
27
+ ---
28
+
29
+
30
+ # StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
31
+
32
+ ### Yinghao Aaron Li, Ali Zare, Nima Mesgarani
33
+
34
+ > We present an unsupervised non-parallel many-to-many voice conversion (VC) method using a generative adversarial network (GAN) called StarGAN v2. Using a combination of adversarial source classifier loss and perceptual loss, our model significantly outperforms previous VC models. Although our model is trained only with 20 English speakers, it generalizes to a variety of voice conversion tasks, such as any-to-many, cross-lingual, and singing conversion. Using a style encoder, our framework can also convert plain reading speech into stylistic speech, such as emotional and falsetto speech. Subjective and objective evaluation experiments on a non-parallel many-to-many voice conversion task revealed that our model produces natural sounding voices, close to the sound quality of state-of-the-art text-to-speech (TTS) based voice conversion methods without the need for text labels. Moreover, our model is completely convolutional and with a faster-than-real-time vocoder such as Parallel WaveGAN can perform real-time voice conversion.
35
+
36
+ Paper: https://arxiv.org/abs/2107.10394
37
+
38
+ Audio samples: https://starganv2-vc.github.io/
39
+
40
+ ## Pre-requisites
41
+ 1. Python >= 3.7
42
+ 2. Clone this repository:
43
+ ```bash
44
+ git clone https://github.com/yl4579/StarGANv2-VC.git
45
+ cd StarGANv2-VC
46
+ ```
47
+ 3. Install python requirements:
48
+ ```bash
49
+ pip install SoundFile torchaudio munch parallel_wavegan torch pydub
50
+ ```
51
+ 4. Download and extract the [VCTK dataset](https://datashare.ed.ac.uk/handle/10283/3443)
52
+ and use [VCTK.ipynb](https://github.com/yl4579/StarGANv2-VC/blob/main/Data/VCTK.ipynb) to prepare the data (downsample to 24 kHz etc.). You can also [download the dataset](https://drive.google.com/file/d/1t7QQbu4YC_P1mv9puA_KgSomSFDsSzD6/view?usp=sharing) we have prepared, unzip it to the `Data` folder, and use the provided `config.yml` to reproduce our models.
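The 24 kHz downsampling step above can be sketched in plain Python. This is a minimal linear-interpolation illustration only, not the repo's actual preprocessing (VCTK.ipynb / librosa handles the real resampling):

```python
def resample_linear(wave, orig_sr, target_sr):
    """Naive linear-interpolation resampler (illustrative only --
    use librosa or the provided VCTK.ipynb for real data prep)."""
    n_out = int(len(wave) * target_sr / orig_sr)
    out = []
    for i in range(n_out):
        pos = i * orig_sr / target_sr      # fractional index into the source
        lo = int(pos)
        hi = min(lo + 1, len(wave) - 1)
        frac = pos - lo
        out.append(wave[lo] * (1.0 - frac) + wave[hi] * frac)
    return out

# half a second at 48 kHz becomes 12000 samples at 24 kHz
print(len(resample_linear([0.0] * 24000, 48000, 24000)))  # 12000
```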
53
+
54
+ ## Training
55
+ ```bash
56
+ python train.py --config_path ./Configs/config.yml
57
+ ```
58
+ Please specify the training and validation data in `config.yml` file. Change `num_domains` to the number of speakers in the dataset. The data list format needs to be `filename.wav|speaker_number`, see [train_list.txt](https://github.com/yl4579/StarGANv2-VC/blob/main/Data/train_list.txt) as an example.
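The `filename.wav|speaker_number` list format described above can be read with a few lines of Python (the file paths in this example are hypothetical):

```python
def parse_data_list(lines):
    """Parse `filename.wav|speaker_number` entries into (path, speaker) pairs.

    The speaker number is the 0-based domain index; paths below are made up.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path, speaker = line.split('|')
        pairs.append((path, int(speaker)))
    return pairs

sample = ["Data/p225/p225_003.wav|0", "Data/p228/p228_003.wav|1"]
print(parse_data_list(sample))
# [('Data/p225/p225_003.wav', 0), ('Data/p228/p228_003.wav', 1)]
```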
59
+
60
+ Checkpoints and Tensorboard logs will be saved at `log_dir`. To speed up training, you may want to make `batch_size` as large as your GPU RAM allows. Note, however, that `batch_size = 5` already takes around 10 GB of GPU RAM.
61
+
62
+ ## Inference
63
+
64
+ Please refer to [inference.ipynb](https://github.com/yl4579/StarGANv2-VC/blob/main/Demo/inference.ipynb) for details.
65
+
66
+ The pretrained StarGANv2 and ParallelWaveGAN on VCTK corpus can be downloaded at [StarGANv2 Link](https://drive.google.com/file/d/1nzTyyl-9A1Hmqya2Q_f2bpZkUoRjbZsY/view?usp=sharing) and [ParallelWaveGAN Link](https://drive.google.com/file/d/1q8oSAzwkqi99oOGXDZyLypCiz0Qzn3Ab/view?usp=sharing). Please unzip them to `Models` and `Vocoder` respectively and run each cell in the notebook.
67
+
68
+ ## ASR & F0 Models
69
+
70
+ The pretrained F0 and ASR models are provided under the `Utils` folder. Both the F0 and ASR models are trained with melspectrograms preprocessed using [meldataset.py](https://github.com/yl4579/StarGANv2-VC/blob/main/meldataset.py), and both models are trained on speech data only.
71
+
72
+ The ASR model is trained on English corpus, but it appears to work when training StarGANv2 models in other languages such as Japanese. The F0 model also appears to work with singing data. For the best performance, however, training your own ASR and F0 models is encouraged for non-English and non-speech data.
73
+
74
+ You can edit [meldataset.py](https://github.com/yl4579/StarGANv2-VC/blob/main/meldataset.py) to use your own melspectrogram preprocessing, but the provided pretrained models will then no longer work; you will need to train your own ASR and F0 models with the new preprocessing. For example, you may refer to [Diamondfan/CTC_pytorch](https://github.com/Diamondfan/CTC_pytorch) and [keums/melodyExtraction_JDC](https://github.com/keums/melodyExtraction_JDC) to train the ASR and F0 models respectively.
75
+
76
+ ## References
77
+ - [clovaai/stargan-v2](https://github.com/clovaai/stargan-v2)
78
+ - [kan-bayashi/ParallelWaveGAN](https://github.com/kan-bayashi/ParallelWaveGAN)
79
+ - [tosaka-m/japanese_realtime_tts](https://github.com/tosaka-m/japanese_realtime_tts)
80
+ - [keums/melodyExtraction_JDC](https://github.com/keums/melodyExtraction_JDC)
81
+ - [Diamondfan/CTC_pytorch](https://github.com/Diamondfan/CTC_pytorch)
82
+
83
+ ## Acknowledgement
84
+ The author would like to thank [@tosaka-m](https://github.com/tosaka-m) for his great repository and valuable discussions.
Utils/ASR/config.yml ADDED
@@ -0,0 +1,28 @@
1
+ log_dir: "logs"
2
+ save_freq: 20
3
+ epochs: 180
4
+ batch_size: 48
5
+ pretrained_model: ""
6
+ train_data: "asr_train_list.txt"
7
+ val_data: "asr_val_list.txt"
8
+
9
+ dataset_params:
10
+ data_augmentation: true
11
+
12
+ preprocess_parasm:
13
+ sr: 24000
14
+ spect_params:
15
+ n_fft: 2048
16
+ win_length: 1200
17
+ hop_length: 300
18
+ mel_params:
19
+ n_mels: 80
20
+
21
+ model_params:
22
+ input_dim: 80
23
+ hidden_dim: 256
24
+ n_token: 80
25
+ token_embedding_dim: 256
26
+
27
+ optimizer_params:
28
+ lr: 0.0005
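A training script can load this config with `yaml.safe_load`; a minimal sketch with a few of the fields above inlined as a string (PyYAML is already a dependency of the training code):

```python
import yaml  # PyYAML

# Inline copy of a few fields from the config above, showing how a
# training script can read them into plain nested dicts.
config_text = """
batch_size: 48
model_params:
  input_dim: 80
  hidden_dim: 256
  n_token: 80
"""
config = yaml.safe_load(config_text)
print(config["model_params"]["input_dim"])  # 80
```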
app.py ADDED
@@ -0,0 +1,151 @@
1
+ import os
2
+ os.system("pip install gradio==2.9b24")
3
+
4
+ import gradio as gr
5
+
6
+
7
+ vocoder_url = 'https://bj.bcebos.com/v1/ai-studio-online/e46d52315a504f1fa520528582a8422b6fa7006463844b84b8a2c3d21cc314db?/Vocoder.zip'
8
+ models_url = 'https://bj.bcebos.com/v1/ai-studio-online/6c081f29caad483ebd4cded087ee6ddbfc8dca8fb89d4ab69d44253ce5525e32?/Models.zip'
9
+
10
+ from io import BytesIO
11
+ from zipfile import ZipFile
12
+ from urllib.request import urlopen
13
+
14
+
15
+ if not (os.path.isdir('Vocoder') and os.path.isdir('Models')):
16
+ for url in [vocoder_url, models_url]:
17
+ resp = urlopen(url)
18
+ zipfile = ZipFile(BytesIO(resp.read()))
19
+ zipfile.extractall()
20
+
21
+
22
+ import random
23
+ import yaml
24
+ from munch import Munch
25
+ import numpy as np
26
+ import paddle
27
+ from paddle import nn
28
+ import paddle.nn.functional as F
29
+ import paddleaudio
30
+ import librosa
31
+
32
+ from starganv2vc_paddle.Utils.JDC.model import JDCNet
33
+ from starganv2vc_paddle.models import Generator, MappingNetwork, StyleEncoder
34
+
35
+
36
+ speakers = [225,228,229,230,231,233,236,239,240,244,226,227,232,243,254,256,258,259,270,273]
37
+
38
+ to_mel = paddleaudio.features.MelSpectrogram(
39
+ n_mels=80, n_fft=2048, win_length=1200, hop_length=300)
40
+ to_mel.fbank_matrix[:] = paddle.load('starganv2vc_paddle/fbank_matrix.pd')['fbank_matrix']
41
+ mean, std = -4, 4
42
+
43
+ def preprocess(wave):
44
+ wave_tensor = paddle.to_tensor(wave).astype(paddle.float32)
45
+ mel_tensor = to_mel(wave_tensor)
46
+ mel_tensor = (paddle.log(1e-5 + mel_tensor.unsqueeze(0)) - mean) / std
47
+ return mel_tensor
48
+
49
+ def build_model(model_params={}):
50
+ args = Munch(model_params)
51
+ generator = Generator(args.dim_in, args.style_dim, args.max_conv_dim, w_hpf=args.w_hpf, F0_channel=args.F0_channel)
52
+ mapping_network = MappingNetwork(args.latent_dim, args.style_dim, args.num_domains, hidden_dim=args.max_conv_dim)
53
+ style_encoder = StyleEncoder(args.dim_in, args.style_dim, args.num_domains, args.max_conv_dim)
54
+
55
+ nets_ema = Munch(generator=generator,
56
+ mapping_network=mapping_network,
57
+ style_encoder=style_encoder)
58
+
59
+ return nets_ema
60
+
61
+ def compute_style(speaker_dicts):
62
+ reference_embeddings = {}
63
+ for key, (path, speaker) in speaker_dicts.items():
64
+ if path == "":
65
+ label = paddle.to_tensor([speaker], dtype=paddle.int64)
66
+ latent_dim = starganv2.mapping_network.shared[0].weight.shape[0]
67
+ ref = starganv2.mapping_network(paddle.randn([1, latent_dim]), label)
68
+ else:
69
+ wave, sr = librosa.load(path, sr=24000)
70
+ audio, index = librosa.effects.trim(wave, top_db=30)
71
+ if sr != 24000:
72
+ wave = librosa.resample(wave, sr, 24000)
73
+ mel_tensor = preprocess(wave)
74
+
75
+ with paddle.no_grad():
76
+ label = paddle.to_tensor([speaker], dtype=paddle.int64)
77
+ ref = starganv2.style_encoder(mel_tensor.unsqueeze(1), label)
78
+ reference_embeddings[key] = (ref, label)
79
+
80
+ return reference_embeddings
81
+
82
+ F0_model = JDCNet(num_class=1, seq_len=192)
83
+ params = paddle.load("Models/bst.pd")['net']
84
+ F0_model.set_state_dict(params)
85
+ _ = F0_model.eval()
86
+
87
+ import yaml
88
+ import paddle
89
+
90
+ from yacs.config import CfgNode
91
+ from paddlespeech.t2s.models.parallel_wavegan import PWGGenerator
92
+
93
+ with open('Vocoder/config.yml') as f:
94
+ voc_config = CfgNode(yaml.safe_load(f))
95
+ voc_config["generator_params"].pop("upsample_net")
96
+ voc_config["generator_params"]["upsample_scales"] = voc_config["generator_params"].pop("upsample_params")["upsample_scales"]
97
+ vocoder = PWGGenerator(**voc_config["generator_params"])
98
+ vocoder.remove_weight_norm()
99
+ vocoder.eval()
100
+ vocoder.set_state_dict(paddle.load('Vocoder/checkpoint-400000steps.pd'))
101
+
102
+ model_path = 'Models/vc_ema.pd'
103
+
104
+ with open('Models/config.yml') as f:
105
+ starganv2_config = yaml.safe_load(f)
106
+ starganv2 = build_model(model_params=starganv2_config["model_params"])
107
+ params = paddle.load(model_path)
108
+ params = params['model_ema']
109
+ _ = [starganv2[key].set_state_dict(params[key]) for key in starganv2]
110
+ _ = [starganv2[key].eval() for key in starganv2]
114
+
115
+ # Compute speakers' styles under the Demo directory
116
+ speaker_dicts = {}
117
+ selected_speakers = [273, 259, 258, 243, 254, 244, 236, 233, 230, 228]
118
+ for s in selected_speakers:
119
+ k = s
120
+ speaker_dicts['p' + str(s)] = ('Demo/VCTK-corpus/p' + str(k) + '/p' + str(k) + '_023.wav', speakers.index(s))
121
+
122
+ reference_embeddings = compute_style(speaker_dicts)
123
+
124
+ examples = [['Demo/VCTK-corpus/p243/p243_023.wav', 'p236'], ['Demo/VCTK-corpus/p236/p236_023.wav', 'p243']]
125
+
126
+
127
+ def app(wav_path, speaker_id):
128
+ audio, _ = librosa.load(wav_path, sr=24000)
129
+ audio = audio / np.max(np.abs(audio))
130
+ audio = audio.astype(np.float32)
131
+ source = preprocess(audio)
132
+ ref = reference_embeddings[speaker_id][0]
133
+
134
+ with paddle.no_grad():
135
+ f0_feat = F0_model.get_feature_GAN(source.unsqueeze(1))
136
+ out = starganv2.generator(source.unsqueeze(1), ref, F0=f0_feat)
137
+
138
+ c = out.transpose([0,1,3,2]).squeeze()
139
+ y_out = vocoder.inference(c)
140
+ y_out = y_out.reshape([-1])
141
+
142
+ return (24000, y_out.numpy())
143
+
144
+ title="StarGANv2 Voice Conversion"
145
+ description="Gradio demo for StarGANv2 voice conversion on PaddlePaddle."
146
+
147
+ iface = gr.Interface(app, [gr.inputs.Audio(source="microphone", type="filepath"),
148
+ gr.inputs.Radio(list(speaker_dicts.keys()), type="value", default='p228', label='speaker id')],
149
+ "audio", title=title, description=description, examples=examples)
150
+
151
+ iface.launch()
convert_parallel_wavegan_weights_to_paddle.ipynb ADDED
@@ -0,0 +1,177 @@
1
+ {
2
+ "nbformat": 4,
3
+ "nbformat_minor": 0,
4
+ "metadata": {
5
+ "colab": {
6
+ "name": "ParallelWaveGAN to paddle.ipynb",
7
+ "provenance": [],
8
+ "collapsed_sections": [],
9
+ "private_outputs": true
10
+ },
11
+ "kernelspec": {
12
+ "name": "python3",
13
+ "display_name": "Python 3"
14
+ },
15
+ "language_info": {
16
+ "name": "python"
17
+ },
18
+ "accelerator": "GPU"
19
+ },
20
+ "cells": [
21
+ {
22
+ "cell_type": "code",
23
+ "execution_count": null,
24
+ "metadata": {
25
+ "id": "gZNDsJweNp1L"
26
+ },
27
+ "outputs": [],
28
+ "source": [
29
+ "!pip install parallel_wavegan paddlepaddle-gpu==2.2.2 \"paddlespeech<1\" pytest-runner"
30
+ ]
31
+ },
32
+ {
33
+ "cell_type": "code",
34
+ "source": [
35
+ "!gdown https://drive.google.com/uc?id=1q8oSAzwkqi99oOGXDZyLypCiz0Qzn3Ab\n",
36
+ "!unzip -qq Vocoder.zip"
37
+ ],
38
+ "metadata": {
39
+ "id": "HqA0VNKEOGfv"
40
+ },
41
+ "execution_count": null,
42
+ "outputs": []
43
+ },
44
+ {
45
+ "cell_type": "code",
46
+ "source": [
47
+ "# load torch vocoder\n",
48
+ "import torch\n",
49
+ "from parallel_wavegan.utils import load_model\n",
50
+ "\n",
51
+ "device = 'cuda' if torch.cuda.is_available() else 'cpu'\n",
52
+ "\n",
53
+ "vocoder_torch = load_model(\"Vocoder/checkpoint-400000steps.pkl\").to(device).eval()\n",
54
+ "vocoder_torch.remove_weight_norm()\n",
55
+ "_ = vocoder_torch.eval()"
56
+ ],
57
+ "metadata": {
58
+ "id": "9F0yA_dyPOVe"
59
+ },
60
+ "execution_count": null,
61
+ "outputs": []
62
+ },
63
+ {
64
+ "cell_type": "code",
65
+ "source": [
66
+ "import yaml\n",
67
+ "import paddle\n",
68
+ "\n",
69
+ "from yacs.config import CfgNode\n",
70
+ "from paddlespeech.s2t.utils.dynamic_import import dynamic_import\n",
71
+ "from paddlespeech.t2s.models.parallel_wavegan import PWGGenerator\n",
72
+ "\n",
73
+ "with open('Vocoder/config.yml') as f:\n",
74
+ " voc_config = CfgNode(yaml.safe_load(f))\n",
75
+ "voc_config[\"generator_params\"].pop(\"upsample_net\")\n",
76
+ "voc_config[\"generator_params\"][\"upsample_scales\"] = voc_config[\"generator_params\"].pop(\"upsample_params\")[\"upsample_scales\"]\n",
77
+ "vocoder_paddle = PWGGenerator(**voc_config[\"generator_params\"])\n",
78
+ "vocoder_paddle.remove_weight_norm()\n",
79
+ "vocoder_paddle.eval()\n",
80
+ "\n",
81
+ "\n",
82
+ "@paddle.no_grad()\n",
83
+ "def convert_weights(torch_model, paddle_model):\n",
84
+ " _ = torch_model.eval()\n",
85
+ " _ = paddle_model.eval()\n",
86
+ " dense_layers = []\n",
87
+ " for name, layer in torch_model.named_modules():\n",
88
+ " if isinstance(layer, torch.nn.Linear):\n",
89
+ " dense_layers.append(name)\n",
90
+ " torch_state_dict = torch_model.state_dict()\n",
91
+ " for name, param in paddle_model.named_parameters():\n",
92
+ " name = name.replace('._mean', '.running_mean')\n",
93
+ " name = name.replace('._variance', '.running_var')\n",
94
+ " name = name.replace('.scale', '.weight')\n",
95
+ " target_param = torch_state_dict[name].detach().cpu().numpy()\n",
96
+ " if '.'.join(name.split('.')[:-1]) in dense_layers:\n",
97
+ " if len(param.shape) == 2:\n",
98
+ " target_param = target_param.transpose((1,0))\n",
99
+ " param.set_value(paddle.to_tensor(target_param))\n",
100
+ "\n",
101
+ "convert_weights(vocoder_torch, vocoder_paddle)"
102
+ ],
103
+ "metadata": {
104
+ "id": "ch2uVW8OdKN0"
105
+ },
106
+ "execution_count": null,
107
+ "outputs": []
108
+ },
109
+ {
110
+ "cell_type": "code",
111
+ "source": [
112
+ "import os\n",
113
+ "import librosa\n",
114
+ "import torchaudio\n",
115
+ "import paddleaudio\n",
116
+ "import numpy as np\n",
117
+ "import IPython.display as ipd\n",
118
+ "\n",
119
+ "\n",
120
+ "to_mel = torchaudio.transforms.MelSpectrogram(\n",
121
+ " n_mels=80, n_fft=2048, win_length=1200, hop_length=300)\n",
122
+ "fb = to_mel.mel_scale.fb.detach().cpu().numpy().transpose([1,0])\n",
123
+ "to_mel = paddleaudio.features.MelSpectrogram(\n",
124
+ " n_mels=80, n_fft=2048, win_length=1200, hop_length=300)\n",
125
+ "to_mel.fbank_matrix[:] = fb\n",
126
+ "mean, std = -4, 4\n",
127
+ "\n",
128
+ "device = 'cuda' if torch.cuda.is_available() else 'cpu'\n",
129
+ "\n",
130
+ "def preprocess(wave):\n",
131
+ " wave_tensor = paddle.to_tensor(wave).astype(paddle.float32)\n",
132
+ " mel_tensor = 2*to_mel(wave_tensor)\n",
133
+ " mel_tensor = (paddle.log(1e-5 + mel_tensor.unsqueeze(0)) - mean) / std\n",
134
+ " return mel_tensor\n",
135
+ "\n",
136
+ "if not os.path.exists('p228_023.wav'):\n",
137
+ " !wget https://github.com/yl4579/StarGANv2-VC/raw/main/Demo/VCTK-corpus/p228/p228_023.wav\n",
138
+ "audio, source_sr = librosa.load('p228_023.wav', sr=24000)\n",
139
+ "audio = audio / np.max(np.abs(audio))\n",
140
+ "audio.dtype = np.float32\n",
141
+ "mel = preprocess(audio)\n",
142
+ "\n",
143
+ "import numpy as np\n",
144
+ "with torch.no_grad():\n",
145
+ " with paddle.no_grad():\n",
146
+ " c = mel.transpose([0, 2, 1]).squeeze()\n",
147
+ " recon_paddle = vocoder_paddle.inference(c)\n",
148
+ " recon_paddle = recon_paddle.reshape([-1]).numpy()\n",
149
+ " recon_torch = vocoder_torch.inference(torch.from_numpy(c.numpy()).to(device))\n",
150
+ " recon_torch = recon_torch.view(-1).cpu().numpy()\n",
151
+ " print(np.mean((recon_paddle - recon_torch)**2))\n",
152
+ "\n",
153
+ "print('Paddle recon:')\n",
154
+ "display(ipd.Audio(recon_paddle, rate=24000))\n",
155
+ "print('Torch recon:')\n",
156
+ "display(ipd.Audio(recon_torch, rate=24000))"
157
+ ],
158
+ "metadata": {
159
+ "id": "Q9dK5j1CleJM"
160
+ },
161
+ "execution_count": null,
162
+ "outputs": []
163
+ },
164
+ {
165
+ "cell_type": "code",
166
+ "source": [
167
+ "paddle.save(vocoder_paddle.state_dict(), 'checkpoint-400000steps.pd')\n",
168
+ "paddle.save({ 'fbank_matrix': to_mel.fbank_matrix }, 'fbank_matrix.pd')"
169
+ ],
170
+ "metadata": {
171
+ "id": "HwaLd_Eq3JrH"
172
+ },
173
+ "execution_count": null,
174
+ "outputs": []
175
+ }
176
+ ]
177
+ }
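The `convert_weights` cell above hinges on a naming convention: Paddle's BatchNorm buffers are called `._mean`/`._variance` where PyTorch uses `.running_mean`/`.running_var`, and Paddle norm layers call the affine weight `.scale`. A framework-free sketch of just that remapping (the module paths below are hypothetical, for illustration only):

```python
def paddle_to_torch_name(name: str) -> str:
    """Map a Paddle parameter name to its PyTorch counterpart,
    mirroring the replace() calls in convert_weights above."""
    name = name.replace('._mean', '.running_mean')
    name = name.replace('._variance', '.running_var')
    name = name.replace('.scale', '.weight')
    return name

# Hypothetical BatchNorm parameter names:
print(paddle_to_torch_name('blocks.0.bn._mean'))      # blocks.0.bn.running_mean
print(paddle_to_torch_name('blocks.0.bn._variance'))  # blocks.0.bn.running_var
print(paddle_to_torch_name('blocks.0.bn.scale'))      # blocks.0.bn.weight
```

Anything not caught by these three rules is assumed to share the same name in both frameworks, which is why the lookup into `torch_state_dict` can use the remapped name directly.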
convert_starganv2_vc_weights_to_paddle.ipynb ADDED
@@ -0,0 +1,236 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "nbformat": 4,
3
+ "nbformat_minor": 0,
4
+ "metadata": {
5
+ "colab": {
6
+ "name": "starganv2_vc_weights_converter.ipynb",
7
+ "private_outputs": true,
8
+ "provenance": [],
9
+ "collapsed_sections": []
10
+ },
11
+ "kernelspec": {
12
+ "name": "python3",
13
+ "display_name": "Python 3"
14
+ },
15
+ "language_info": {
16
+ "name": "python"
17
+ },
18
+ "accelerator": "GPU"
19
+ },
20
+ "cells": [
21
+ {
22
+ "cell_type": "code",
23
+ "execution_count": null,
24
+ "metadata": {
25
+ "id": "CA5i7YAlagUA"
26
+ },
27
+ "outputs": [],
28
+ "source": [
29
+ "!git clone https://github.com/yl4579/StarGANv2-VC\n",
30
+ "!pip install SoundFile torchaudio munch\n",
31
+ "!git clone https://github.com/HighCWu/starganv2vc-paddle\n",
32
+ "!cd starganv2vc-paddle && pip install paddlepaddle-gpu==2.2.2 paddleaudio munch pydub\n",
33
+ "!cp -r starganv2vc-paddle/starganv2vc_paddle StarGANv2-VC/"
34
+ ]
35
+ },
36
+ {
37
+ "cell_type": "code",
38
+ "source": [
39
+ "!gdown https://drive.google.com/uc?id=1nzTyyl-9A1Hmqya2Q_f2bpZkUoRjbZsY"
40
+ ],
41
+ "metadata": {
42
+ "id": "ac4g4L1Bbx1t"
43
+ },
44
+ "execution_count": null,
45
+ "outputs": []
46
+ },
47
+ {
48
+ "cell_type": "code",
49
+ "source": [
50
+ "!unzip -qq Models.zip\n",
51
+ "!rm -rf Models.zip\n",
52
+ "!mv Models StarGANv2-VC/Models"
53
+ ],
54
+ "metadata": {
55
+ "id": "EJ3vG_RvcOD8"
56
+ },
57
+ "execution_count": null,
58
+ "outputs": []
59
+ },
60
+ {
61
+ "cell_type": "code",
62
+ "source": [
63
+ "%cd StarGANv2-VC"
64
+ ],
65
+ "metadata": {
66
+ "id": "rKovh1Egi4mJ"
67
+ },
68
+ "execution_count": null,
69
+ "outputs": []
70
+ },
71
+ {
72
+ "cell_type": "code",
73
+ "source": [
74
+ "import os\n",
75
+ "import yaml\n",
76
+ "import numpy as np\n",
77
+ "import torch\n",
78
+ "import warnings\n",
79
+ "warnings.simplefilter('ignore')\n",
80
+ "\n",
81
+ "from munch import Munch\n",
82
+ "\n",
83
+ "from models import build_model\n",
84
+ "\n",
85
+ "from Utils.ASR.models import ASRCNN\n",
86
+ "from Utils.JDC.model import JDCNet\n",
87
+ "\n",
88
+ "torch.backends.cudnn.benchmark = True #\n",
89
+ "\n",
90
+ "def main(config_path):\n",
91
+ " config = yaml.safe_load(open(config_path))\n",
92
+ " \n",
93
+ " device = config.get('device', 'cpu')\n",
94
+ "\n",
95
+ " # load pretrained ASR model\n",
96
+ " ASR_config = config.get('ASR_config', False)\n",
97
+ " ASR_path = config.get('ASR_path', False)\n",
98
+ " with open(ASR_config) as f:\n",
99
+ " ASR_config = yaml.safe_load(f)\n",
100
+ " ASR_model_config = ASR_config['model_params']\n",
101
+ " ASR_model = ASRCNN(**ASR_model_config)\n",
102
+ " params = torch.load(ASR_path, map_location='cpu')['model']\n",
103
+ " ASR_model.load_state_dict(params)\n",
104
+ " ASR_model.to(device)\n",
105
+ " _ = ASR_model.eval()\n",
106
+ " \n",
107
+ " # load pretrained F0 model\n",
108
+ " F0_path = config.get('F0_path', False)\n",
109
+ " F0_model = JDCNet(num_class=1, seq_len=192)\n",
110
+ " params = torch.load(F0_path, map_location='cpu')['net']\n",
111
+ " F0_model.load_state_dict(params)\n",
112
+ " F0_model.to(device)\n",
113
+ " \n",
114
+ " # build model\n",
115
+ " _, model_ema = build_model(Munch(config['model_params']), F0_model, ASR_model)\n",
116
+ " pretrained_path = 'Models/epoch_00150.pth'# config.get('pretrained_model', False)\n",
117
+ " params = torch.load(pretrained_path, map_location='cpu')['model_ema']\n",
118
+ " [model_ema[key].load_state_dict(state_dict) for key, state_dict in params.items()]\n",
119
+ " _ = [model_ema[key].to(device) for key in model_ema]\n",
120
+ "\n",
121
+ " return ASR_model, F0_model, model_ema\n",
122
+ "\n",
123
+ "ASR_model_torch, F0_model_torch, model_ema_torch = main('./Models/config.yml')\n"
124
+ ],
125
+ "metadata": {
126
+ "id": "UpMuk5kni67B"
127
+ },
128
+ "execution_count": null,
129
+ "outputs": []
130
+ },
131
+ {
132
+ "cell_type": "code",
133
+ "source": [
134
+ "import os\n",
135
+ "import yaml\n",
136
+ "import numpy as np\n",
137
+ "import paddle\n",
138
+ "import warnings\n",
139
+ "warnings.simplefilter('ignore')\n",
140
+ "\n",
141
+ "from munch import Munch\n",
142
+ "\n",
143
+ "from starganv2vc_paddle.models import build_model\n",
144
+ "\n",
145
+ "from starganv2vc_paddle.Utils.ASR.models import ASRCNN\n",
146
+ "from starganv2vc_paddle.Utils.JDC.model import JDCNet\n",
147
+ "\n",
148
+ "@paddle.no_grad()\n",
149
+ "def convert_weights(torch_model, paddle_model):\n",
150
+ " _ = torch_model.eval()\n",
151
+ " _ = paddle_model.eval()\n",
152
+ " dense_layers = []\n",
153
+ " for name, layer in torch_model.named_modules():\n",
154
+ " if isinstance(layer, torch.nn.Linear):\n",
155
+ " dense_layers.append(name)\n",
156
+ " torch_state_dict = torch_model.state_dict()\n",
157
+ " for name, param in paddle_model.named_parameters():\n",
158
+ " name = name.replace('._mean', '.running_mean')\n",
159
+ " name = name.replace('._variance', '.running_var')\n",
160
+ " name = name.replace('.scale', '.weight')\n",
161
+ " target_param = torch_state_dict[name].detach().cpu().numpy()\n",
162
+ " if '.'.join(name.split('.')[:-1]) in dense_layers:\n",
163
+ " if len(param.shape) == 2:\n",
164
+ " target_param = target_param.transpose((1,0))\n",
165
+ " param.set_value(paddle.to_tensor(target_param))\n",
166
+ "\n",
167
+ "@torch.no_grad()\n",
168
+ "@paddle.no_grad()\n",
169
+ "def main(config_path):\n",
170
+ " config = yaml.safe_load(open(config_path))\n",
171
+ " \n",
172
+ " ASR_config = config.get('ASR_config', False)\n",
173
+ " with open(ASR_config) as f:\n",
174
+ " ASR_config = yaml.safe_load(f)\n",
175
+ " ASR_model_config = ASR_config['model_params']\n",
176
+ " ASR_model = ASRCNN(**ASR_model_config)\n",
177
+ " _ = ASR_model.eval()\n",
178
+ " convert_weights(ASR_model_torch, ASR_model)\n",
179
+ "\n",
180
+ " F0_model = JDCNet(num_class=1, seq_len=192)\n",
181
+ " _ = F0_model.eval()\n",
182
+ " convert_weights(F0_model_torch, F0_model)\n",
183
+ " \n",
184
+ " # build model\n",
185
+ " model, model_ema = build_model(Munch(config['model_params']), F0_model, ASR_model)\n",
186
+ "\n",
187
+ " asr_input = paddle.randn([2, 80, 192])\n",
188
+ " asr_output = ASR_model(asr_input)\n",
189
+ " asr_output_torch = ASR_model_torch(torch.from_numpy(asr_input.numpy()).cuda())\n",
190
+ " print('ASR model input:', asr_input.shape, 'output:', asr_output.shape)\n",
191
+ " print('Error:', (asr_output_torch.cpu().numpy() - asr_output.numpy()).mean())\n",
192
+ " mel_input = paddle.randn([2, 1, 192, 512])\n",
193
+ " f0_output = F0_model(mel_input)\n",
194
+ " f0_output_torch = F0_model_torch(torch.from_numpy(mel_input.numpy()).cuda())\n",
195
+ " print('F0 model input:', mel_input.shape, 'output:', [t.shape for t in f0_output])\n",
196
+ " # print('Error:', (t_dict2['output'].cpu().numpy() - t_dict1['output'].numpy()).mean())\n",
197
+ " print('Error:', [(t1.cpu().numpy() - t2.numpy()).mean() for t1, t2 in zip(f0_output_torch, f0_output)])\n",
198
+ "\n",
199
+ " _ = [convert_weights(model_ema_torch[k], model_ema[k]) for k in model_ema.keys()]\n",
200
+ " label = paddle.to_tensor([0,0], dtype=paddle.int64)\n",
201
+ " latent_dim = model_ema.mapping_network.shared[0].weight.shape[0]\n",
202
+ " latent_style = paddle.randn([2, latent_dim])\n",
203
+ " ref = model_ema.mapping_network(latent_style, label)\n",
204
+ " ref_torch = model_ema_torch.mapping_network(torch.from_numpy(latent_style.numpy()).cuda(), torch.from_numpy(label.numpy()).cuda())\n",
205
+ " print('Error of mapping network:', (ref_torch.cpu().numpy() - ref.numpy()).mean())\n",
206
+ " mel_input2 = paddle.randn([2, 1, 192, 512])\n",
207
+ " style_ref = model_ema.style_encoder(mel_input2, label)\n",
208
+ " style_ref_torch = model_ema_torch.style_encoder(torch.from_numpy(mel_input2.numpy()).cuda(), torch.from_numpy(label.numpy()).cuda())\n",
209
+ " print('StyleGANv2-VC encoder inputs:', mel_input2.shape, 'output:', style_ref.shape, 'should has the same shape as the ref:', ref.shape)\n",
210
+ " print('Error of style encoder:', (style_ref_torch.cpu().numpy() - style_ref.numpy()).mean())\n",
211
+ " f0_feat = F0_model.get_feature_GAN(mel_input)\n",
212
+ " f0_feat_torch = F0_model_torch.get_feature_GAN(torch.from_numpy(mel_input.numpy()).cuda())\n",
213
+ " print('Error of f0 feat:', (f0_feat_torch.cpu().numpy() - f0_feat.numpy()).mean())\n",
214
+ " out = model_ema.generator(mel_input, style_ref, F0=f0_feat)\n",
215
+ " out_torch = model_ema_torch.generator(torch.from_numpy(mel_input.numpy()).cuda(), torch.from_numpy(style_ref.numpy()).cuda(), F0=torch.from_numpy(f0_feat.numpy()).cuda())\n",
216
+ " print('StyleGANv2-VC inputs:', label.shape, latent_style.shape, mel_input.shape, 'output:', out.shape)\n",
217
+ " print('Error:', (out_torch.cpu().numpy() - out.numpy()).mean())\n",
218
+ "\n",
219
+ " paddle.save({'model': ASR_model.state_dict()}, 'ASR.pd')\n",
220
+ " paddle.save({ 'net': F0_model.state_dict()}, 'F0.pd')\n",
221
+ " model_ema_dict = {key: model.state_dict() for key, model in model_ema.items()}\n",
222
+ " \n",
223
+ " paddle.save({ 'model_ema': model_ema_dict }, 'VC.pd')\n",
224
+ "\n",
225
+ " return 0\n",
226
+ "\n",
227
+ "main('./Models/config.yml')\n"
228
+ ],
229
+ "metadata": {
230
+ "id": "PnuApVuyIIyd"
231
+ },
232
+ "execution_count": null,
233
+ "outputs": []
234
+ }
235
+ ]
236
+ }
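Besides renaming, `convert_weights` in the notebook above transposes `nn.Linear` weights: PyTorch stores them as `[out_features, in_features]` while Paddle uses `[in_features, out_features]`. The transpose rule on its own, sketched with plain nested lists so no framework is needed:

```python
def transpose2d(mat):
    """Transpose a 2-D weight stored as a list of rows,
    i.e. the target_param.transpose((1, 0)) step in convert_weights."""
    return [list(row) for row in zip(*mat)]

# A torch-style [out=2, in=3] weight becomes a paddle-style [in=3, out=2] one.
w_torch = [[1, 2, 3],
           [4, 5, 6]]
w_paddle = transpose2d(w_torch)
print(w_paddle)  # [[1, 4], [2, 5], [3, 6]]
```

This is also why the notebook first records which modules are `torch.nn.Linear`: convolution weights share the same layout in both frameworks and must not be transposed.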
prepare_data.ipynb ADDED
@@ -0,0 +1,179 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "execution_count": null,
6
+ "id": "347ace04",
7
+ "metadata": {},
8
+ "outputs": [],
9
+ "source": [
10
+ "import os\n",
11
+ "\n",
12
+ "# VCTK Corpus Path\n",
13
+ "__CORPUSPATH__ = os.path.expanduser(\"~/data/VCTK-Corpus\") \n",
14
+ "\n",
15
+ "# output path\n",
16
+ "__OUTPATH__ = \"./Data\""
17
+ ]
18
+ },
19
+ {
20
+ "cell_type": "code",
21
+ "execution_count": null,
22
+ "id": "4ce9eb2e",
23
+ "metadata": {},
24
+ "outputs": [],
25
+ "source": [
26
+ "import os\n",
27
+ "from scipy.io import wavfile\n",
28
+ "from pydub import AudioSegment\n",
29
+ "\n",
30
+ "from pydub import AudioSegment\n",
31
+ "from pydub.silence import split_on_silence\n",
32
+ "import os\n",
33
+ "\n",
34
+ "def split(sound):\n",
35
+ " dBFS = sound.dBFS\n",
36
+ " chunks = split_on_silence(sound,\n",
37
+ " min_silence_len = 100,\n",
38
+ " silence_thresh = dBFS-16,\n",
39
+ " keep_silence = 100\n",
40
+ " )\n",
41
+ " return chunks\n",
42
+ "\n",
43
+ "def combine(_src):\n",
44
+ " audio = AudioSegment.empty()\n",
45
+ " for i,filename in enumerate(os.listdir(_src)):\n",
46
+ " if filename.endswith('.wav'):\n",
47
+ " filename = os.path.join(_src, filename)\n",
48
+ " audio += AudioSegment.from_wav(filename)\n",
49
+ " return audio\n",
50
+ "\n",
51
+ "def save_chunks(chunks, directory):\n",
52
+ " if not os.path.exists(directory):\n",
53
+ " os.makedirs(directory)\n",
54
+ " counter = 0\n",
55
+ "\n",
56
+ " target_length = 5 * 1000\n",
57
+ " output_chunks = [chunks[0]]\n",
58
+ " for chunk in chunks[1:]:\n",
59
+ " if len(output_chunks[-1]) < target_length:\n",
60
+ " output_chunks[-1] += chunk\n",
61
+ " else:\n",
62
+ " # if the last output chunk is longer than the target length,\n",
63
+ " # we can start a new one\n",
64
+ " output_chunks.append(chunk)\n",
65
+ "\n",
66
+ " for chunk in output_chunks:\n",
67
+ " chunk = chunk.set_frame_rate(24000)\n",
68
+ " chunk = chunk.set_channels(1)\n",
69
+ " counter = counter + 1\n",
70
+ " chunk.export(os.path.join(directory, str(counter) + '.wav'), format=\"wav\")"
71
+ ]
72
+ },
73
+ {
74
+ "cell_type": "code",
75
+ "execution_count": null,
76
+ "id": "769a7f62",
77
+ "metadata": {},
78
+ "outputs": [],
79
+ "source": [
80
+ "# Source: http://speech.ee.ntu.edu.tw/~jjery2243542/resource/model/is18/en_speaker_used.txt\n",
81
+ "# Source: https://github.com/jjery2243542/voice_conversion\n",
82
+ "\n",
83
+ "speakers = [225,228,229,230,231,233,236,239,240,244,226,227,232,243,254,256,258,259,270,273]"
84
+ ]
85
+ },
86
+ {
87
+ "cell_type": "code",
88
+ "execution_count": null,
89
+ "id": "9302fb6a",
90
+ "metadata": {},
91
+ "outputs": [],
92
+ "source": [
93
+ "# downsample to 24 kHz\n",
94
+ "\n",
95
+ "for p in speakers:\n",
96
+ " directory = __OUTPATH__ + '/p' + str(p)\n",
97
+ " if not os.path.exists(directory):\n",
98
+ " audio = combine(__CORPUSPATH__ + '/wav48/p' + str(p))\n",
99
+ " chunks = split(audio)\n",
100
+ " save_chunks(chunks, directory)"
101
+ ]
102
+ },
103
+ {
104
+ "cell_type": "code",
105
+ "execution_count": null,
106
+ "id": "4b0ca022",
107
+ "metadata": {},
108
+ "outputs": [],
109
+ "source": [
110
+ "# get all speakers\n",
111
+ "\n",
112
+ "data_list = []\n",
113
+ "for path, subdirs, files in os.walk(__OUTPATH__):\n",
114
+ " for name in files:\n",
115
+ " if name.endswith(\".wav\"):\n",
116
+ " speaker = int(path.split('/')[-1].replace('p', ''))\n",
117
+ " if speaker in speakers:\n",
118
+ " data_list.append({\"Path\": os.path.join(path, name), \"Speaker\": int(speakers.index(speaker)) + 1})\n",
119
+ " \n",
120
+ "import pandas as pd\n",
121
+ "\n",
122
+ "data_list = pd.DataFrame(data_list)\n",
123
+ "data_list = data_list.sample(frac=1)\n",
124
+ "\n",
125
+ "import random\n",
126
+ "\n",
127
+ "split_idx = round(len(data_list) * 0.1)\n",
128
+ "\n",
129
+ "test_data = data_list[:split_idx]\n",
130
+ "train_data = data_list[split_idx:]"
131
+ ]
132
+ },
133
+ {
134
+ "cell_type": "code",
135
+ "execution_count": null,
136
+ "id": "88df2a45",
137
+ "metadata": {},
138
+ "outputs": [],
139
+ "source": [
140
+ "# write to file \n",
141
+ "\n",
142
+ "file_str = \"\"\n",
143
+ "for index, k in train_data.iterrows():\n",
144
+ " file_str += k['Path'] + \"|\" +str(k['Speaker'] - 1)+ '\\n'\n",
145
+ "text_file = open(__OUTPATH__ + \"/train_list.txt\", \"w\")\n",
146
+ "text_file.write(file_str)\n",
147
+ "text_file.close()\n",
148
+ "\n",
149
+ "file_str = \"\"\n",
150
+ "for index, k in test_data.iterrows():\n",
151
+ " file_str += k['Path'] + \"|\" + str(k['Speaker'] - 1) + '\\n'\n",
152
+ "text_file = open(__OUTPATH__ + \"/val_list.txt\", \"w\")\n",
153
+ "text_file.write(file_str)\n",
154
+ "text_file.close()"
155
+ ]
156
+ }
157
+ ],
158
+ "metadata": {
159
+ "kernelspec": {
160
+ "display_name": "Python 3",
161
+ "language": "python",
162
+ "name": "python3"
163
+ },
164
+ "language_info": {
165
+ "codemirror_mode": {
166
+ "name": "ipython",
167
+ "version": 3
168
+ },
169
+ "file_extension": ".py",
170
+ "mimetype": "text/x-python",
171
+ "name": "python",
172
+ "nbconvert_exporter": "python",
173
+ "pygments_lexer": "ipython3",
174
+ "version": "3.7.10"
175
+ }
176
+ },
177
+ "nbformat": 4,
178
+ "nbformat_minor": 5
179
+ }
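`save_chunks` in the notebook above merges silence-split segments greedily until each output clip reaches the 5-second target. The same policy applied to bare durations (in milliseconds), as a minimal sketch:

```python
def merge_chunks(durations_ms, target_ms=5000):
    """Greedily merge consecutive chunk durations: keep appending to the
    current output chunk until it is at least target_ms long, then start
    a new one (the same policy save_chunks applies to AudioSegments)."""
    merged = [durations_ms[0]]
    for d in durations_ms[1:]:
        if merged[-1] < target_ms:
            merged[-1] += d
        else:
            merged.append(d)
    return merged

print(merge_chunks([3000, 1500, 2000, 4000, 800]))  # [6500, 4800]
```

Note the asymmetry of the policy: a chunk may overshoot the target (6500 ms above), since merging stops only after the threshold is crossed.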
requirements.txt ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ paddlepaddle-gpu>=2.2.2
2
+ paddlespeech==0.2.0
3
+ visualdl
4
+ munch
5
+ pydub
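requirements.txt pins only `paddlepaddle-gpu>=2.2.2`, while `starganv2vc_paddle/Utils/ASR/layers.py` below gates its Xavier-gain initialization on Paddle >= 2.3 by parsing `paddle.__version__`. The version-parsing trick in isolation:

```python
def major_minor(version: str) -> float:
    """Collapse 'X.Y.Z' to the float X.Y, as layers.py does with
    paddle.__version__ to gate initializer APIs added in 2.3."""
    return float('.'.join(version.split('.')[:2]))

print(major_minor('2.2.2'))          # 2.2
print(major_minor('2.3.0') >= 2.3)   # True
```

With the `>=2.2.2` pin, either branch of that gate can be taken at runtime, so the code has to work with and without the 2.3 initializer path.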
starganv2vc_paddle/LICENSE ADDED
@@ -0,0 +1,21 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ MIT License
2
+
3
+ Copyright (c) 2021 Aaron (Yinghao) Li
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
starganv2vc_paddle/Utils/ASR/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+
starganv2vc_paddle/Utils/ASR/layers.py ADDED
@@ -0,0 +1,359 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import math
2
+ import paddle
3
+ from paddle import nn
4
+ from typing import Optional, Any
5
+ from paddle import Tensor
6
+ import paddle.nn.functional as F
7
+ import paddleaudio
8
+ import paddleaudio.functional as audio_F
9
+
10
+ import random
11
+ random.seed(0)
12
+
13
+
14
+ def _get_activation_fn(activ):
15
+ if activ == 'relu':
16
+ return nn.ReLU()
17
+ elif activ == 'lrelu':
18
+ return nn.LeakyReLU(0.2)
19
+ elif activ == 'swish':
20
+ return nn.Swish()
21
+ else:
22
+ raise RuntimeError('Unexpected activ type %s, expected [relu, lrelu, swish]' % activ)
23
+
24
+ class LinearNorm(paddle.nn.Layer):
25
+ def __init__(self, in_dim, out_dim, bias=True, w_init_gain='linear'):
26
+ super(LinearNorm, self).__init__()
27
+ self.linear_layer = paddle.nn.Linear(in_dim, out_dim, bias_attr=bias)
28
+
29
+ if float('.'.join(paddle.__version__.split('.')[:2])) >= 2.3:
30
+ gain = paddle.nn.initializer.calculate_gain(w_init_gain)
31
+ paddle.nn.initializer.XavierUniform()(self.linear_layer.weight)
32
+ self.linear_layer.weight.set_value(gain * self.linear_layer.weight)
33
+
34
+ def forward(self, x):
35
+ return self.linear_layer(x)
36
+
37
+
38
+ class ConvNorm(paddle.nn.Layer):
39
+ def __init__(self, in_channels, out_channels, kernel_size=1, stride=1,
40
+ padding=None, dilation=1, bias=True, w_init_gain='linear', param=None):
41
+ super(ConvNorm, self).__init__()
42
+ if padding is None:
43
+ assert(kernel_size % 2 == 1)
44
+ padding = int(dilation * (kernel_size - 1) / 2)
45
+
46
+ self.conv = paddle.nn.Conv1D(in_channels, out_channels,
47
+ kernel_size=kernel_size, stride=stride,
48
+ padding=padding, dilation=dilation,
49
+ bias_attr=bias)
50
+
51
+ if float('.'.join(paddle.__version__.split('.')[:2])) >= 2.3:
52
+ gain = paddle.nn.initializer.calculate_gain(w_init_gain, param=param)
53
+ paddle.nn.initializer.XavierUniform()(self.conv.weight)
54
+ self.conv.weight.set_value(gain * self.conv.weight)
55
+
56
+ def forward(self, signal):
57
+ conv_signal = self.conv(signal)
58
+ return conv_signal
59
+
60
+ class CausualConv(nn.Layer):
61
+ def __init__(self, in_channels, out_channels, kernel_size=1, stride=1, padding=1, dilation=1, bias=True, w_init_gain='linear', param=None):
62
+ super(CausualConv, self).__init__()
63
+ if padding is None:
64
+ assert(kernel_size % 2 == 1)
65
+ self.padding = int(dilation * (kernel_size - 1) / 2) * 2
66
+ else:
67
+ self.padding = padding * 2
68
+ self.conv = nn.Conv1D(in_channels, out_channels,
69
+ kernel_size=kernel_size, stride=stride,
70
+ padding=self.padding,
71
+ dilation=dilation,
72
+ bias_attr=bias)
73
+
74
+ if float('.'.join(paddle.__version__.split('.')[:2])) >= 2.3:
75
+ gain = paddle.nn.initializer.calculate_gain(w_init_gain, param=param)
76
+ paddle.nn.initializer.XavierUniform()(self.conv.weight)
77
+ self.conv.weight.set_value(gain * self.conv.weight)
78
+
79
+ def forward(self, x):
80
+ x = self.conv(x)
81
+ x = x[:, :, :-self.padding]
82
+ return x
83
+
84
+ class CausualBlock(nn.Layer):
85
+ def __init__(self, hidden_dim, n_conv=3, dropout_p=0.2, activ='lrelu'):
86
+ super(CausualBlock, self).__init__()
87
+ self.blocks = nn.LayerList([
88
+ self._get_conv(hidden_dim, dilation=3**i, activ=activ, dropout_p=dropout_p)
89
+ for i in range(n_conv)])
90
+
91
+ def forward(self, x):
92
+ for block in self.blocks:
93
+ res = x
94
+ x = block(x)
95
+ x += res
96
+ return x
97
+
98
+ def _get_conv(self, hidden_dim, dilation, activ='lrelu', dropout_p=0.2):
99
+ layers = [
100
+ CausualConv(hidden_dim, hidden_dim, kernel_size=3, padding=dilation, dilation=dilation),
101
+ _get_activation_fn(activ),
102
+ nn.BatchNorm1D(hidden_dim),
103
+ nn.Dropout(p=dropout_p),
104
+ CausualConv(hidden_dim, hidden_dim, kernel_size=3, padding=1, dilation=1),
105
+ _get_activation_fn(activ),
106
+ nn.Dropout(p=dropout_p)
107
+ ]
108
+ return nn.Sequential(*layers)
109
+
110
+ class ConvBlock(nn.Layer):
111
+ def __init__(self, hidden_dim, n_conv=3, dropout_p=0.2, activ='relu'):
112
+ super().__init__()
113
+ self._n_groups = 8
114
+ self.blocks = nn.LayerList([
115
+ self._get_conv(hidden_dim, dilation=3**i, activ=activ, dropout_p=dropout_p)
116
+ for i in range(n_conv)])
117
+
118
+
119
+ def forward(self, x):
120
+ for block in self.blocks:
121
+ res = x
122
+ x = block(x)
123
+ x += res
124
+ return x
125
+
126
+ def _get_conv(self, hidden_dim, dilation, activ='relu', dropout_p=0.2):
127
+ layers = [
128
+ ConvNorm(hidden_dim, hidden_dim, kernel_size=3, padding=dilation, dilation=dilation),
129
+ _get_activation_fn(activ),
130
+ nn.GroupNorm(num_groups=self._n_groups, num_channels=hidden_dim),
131
+ nn.Dropout(p=dropout_p),
132
+ ConvNorm(hidden_dim, hidden_dim, kernel_size=3, padding=1, dilation=1),
133
+ _get_activation_fn(activ),
134
+ nn.Dropout(p=dropout_p)
135
+ ]
136
+ return nn.Sequential(*layers)
137
+
138
+ class LocationLayer(nn.Layer):
139
+ def __init__(self, attention_n_filters, attention_kernel_size,
140
+ attention_dim):
141
+ super(LocationLayer, self).__init__()
142
+ padding = int((attention_kernel_size - 1) / 2)
143
+ self.location_conv = ConvNorm(2, attention_n_filters,
144
+ kernel_size=attention_kernel_size,
145
+ padding=padding, bias=False, stride=1,
146
+ dilation=1)
147
+ self.location_dense = LinearNorm(attention_n_filters, attention_dim,
148
+ bias=False, w_init_gain='tanh')
149
+
150
+ def forward(self, attention_weights_cat):
151
+ processed_attention = self.location_conv(attention_weights_cat)
152
+ processed_attention = processed_attention.transpose([0, 2, 1])
153
+ processed_attention = self.location_dense(processed_attention)
154
+ return processed_attention
155
+
156
+
157
+ class Attention(nn.Layer):
158
+ def __init__(self, attention_rnn_dim, embedding_dim, attention_dim,
159
+ attention_location_n_filters, attention_location_kernel_size):
160
+ super(Attention, self).__init__()
161
+ self.query_layer = LinearNorm(attention_rnn_dim, attention_dim,
162
+ bias=False, w_init_gain='tanh')
163
+ self.memory_layer = LinearNorm(embedding_dim, attention_dim, bias=False,
164
+ w_init_gain='tanh')
165
+ self.v = LinearNorm(attention_dim, 1, bias=False)
166
+ self.location_layer = LocationLayer(attention_location_n_filters,
167
+ attention_location_kernel_size,
168
+ attention_dim)
169
+ self.score_mask_value = -float("inf")
170
+
171
+ def get_alignment_energies(self, query, processed_memory,
172
+ attention_weights_cat):
173
+ """
174
+ PARAMS
175
+ ------
176
+ query: decoder output (batch, n_mel_channels * n_frames_per_step)
177
+ processed_memory: processed encoder outputs (B, T_in, attention_dim)
178
+ attention_weights_cat: cumulative and prev. att weights (B, 2, max_time)
179
+ RETURNS
180
+ -------
181
+ alignment (batch, max_time)
182
+ """
183
+
184
+ processed_query = self.query_layer(query.unsqueeze(1))
185
+ processed_attention_weights = self.location_layer(attention_weights_cat)
186
+ energies = self.v(paddle.tanh(
187
+ processed_query + processed_attention_weights + processed_memory))
188
+
189
+ energies = energies.squeeze(-1)
190
+ return energies
191
+
192
+ def forward(self, attention_hidden_state, memory, processed_memory,
193
+ attention_weights_cat, mask):
194
+ """
195
+ PARAMS
196
+ ------
197
+ attention_hidden_state: attention rnn last output
198
+ memory: encoder outputs
199
+ processed_memory: processed encoder outputs
200
+ attention_weights_cat: previous and cumulative attention weights
201
+ mask: binary mask for padded data
202
+ """
203
+ alignment = self.get_alignment_energies(
204
+ attention_hidden_state, processed_memory, attention_weights_cat)
205
+
206
+ if mask is not None:
207
+ alignment = paddle.where(mask, paddle.full(alignment.shape, self.score_mask_value, alignment.dtype), alignment)
208
+
209
+ attention_weights = F.softmax(alignment, axis=1)
210
+ attention_context = paddle.bmm(attention_weights.unsqueeze(1), memory)
211
+ attention_context = attention_context.squeeze(1)
212
+
213
+ return attention_context, attention_weights
214
+
215
+
216
+ class ForwardAttentionV2(nn.Layer):
217
+ def __init__(self, attention_rnn_dim, embedding_dim, attention_dim,
218
+ attention_location_n_filters, attention_location_kernel_size):
219
+ super(ForwardAttentionV2, self).__init__()
220
+ self.query_layer = LinearNorm(attention_rnn_dim, attention_dim,
221
+ bias=False, w_init_gain='tanh')
222
+ self.memory_layer = LinearNorm(embedding_dim, attention_dim, bias=False,
223
+ w_init_gain='tanh')
224
+ self.v = LinearNorm(attention_dim, 1, bias=False)
225
+ self.location_layer = LocationLayer(attention_location_n_filters,
226
+ attention_location_kernel_size,
227
+ attention_dim)
228
+ self.score_mask_value = -float(1e20)
229
+
230
+ def get_alignment_energies(self, query, processed_memory,
231
+ attention_weights_cat):
232
+ """
233
+ PARAMS
234
+ ------
235
+ query: decoder output (batch, n_mel_channels * n_frames_per_step)
236
+ processed_memory: processed encoder outputs (B, T_in, attention_dim)
237
+ attention_weights_cat: prev. and cumulative att weights (B, 2, max_time)
238
+ RETURNS
239
+ -------
240
+ alignment (batch, max_time)
241
+ """
242
+
243
+ processed_query = self.query_layer(query.unsqueeze(1))
244
+ processed_attention_weights = self.location_layer(attention_weights_cat)
245
+ energies = self.v(paddle.tanh(
246
+ processed_query + processed_attention_weights + processed_memory))
247
+
248
+ energies = energies.squeeze(-1)
249
+ return energies
250
+
251
+ def forward(self, attention_hidden_state, memory, processed_memory,
252
+ attention_weights_cat, mask, log_alpha):
253
+ """
254
+ PARAMS
255
+ ------
256
+ attention_hidden_state: attention rnn last output
257
+ memory: encoder outputs
258
+ processed_memory: processed encoder outputs
259
+ attention_weights_cat: previous and cumulative attention weights
260
+ mask: binary mask for padded data
261
+ """
262
+ log_energy = self.get_alignment_energies(
263
+ attention_hidden_state, processed_memory, attention_weights_cat)
264
+
265
+ #log_energy =
266
+
267
+ if mask is not None:
268
+ log_energy[:] = paddle.where(mask, paddle.full(log_energy.shape, self.score_mask_value, log_energy.dtype), log_energy)
269
+
270
+ #attention_weights = F.softmax(alignment, dim=1)
271
+
272
+ #content_score = log_energy.unsqueeze(1) #[B, MAX_TIME] -> [B, 1, MAX_TIME]
273
+ #log_alpha = log_alpha.unsqueeze(2) #[B, MAX_TIME] -> [B, MAX_TIME, 1]
274
+
275
+ #log_total_score = log_alpha + content_score
276
+
277
+ #previous_attention_weights = attention_weights_cat[:,0,:]
278
+
279
+ log_alpha_shift_padded = []
280
+ max_time = log_energy.shape[1]
281
+ for sft in range(2):
282
+ shifted = log_alpha[:,:max_time-sft]
283
+ shift_padded = F.pad(shifted, (sft,0), 'constant', self.score_mask_value)
284
+ log_alpha_shift_padded.append(shift_padded.unsqueeze(2))
285
+
286
+ biased = paddle.logsumexp(paddle.concat(log_alpha_shift_padded, 2), 2)
287
+
288
+ log_alpha_new = biased + log_energy
289
+
290
+ attention_weights = F.softmax(log_alpha_new, axis=1)
291
+
292
+ attention_context = paddle.bmm(attention_weights.unsqueeze(1), memory)
293
+ attention_context = attention_context.squeeze(1)
294
+
295
+ return attention_context, attention_weights, log_alpha_new
296
+
297
+
298
+ class PhaseShuffle2D(nn.Layer):
299
+ def __init__(self, n=2):
300
+ super(PhaseShuffle2D, self).__init__()
301
+ self.n = n
302
+ self.random = random.Random(1)
303
+
304
+ def forward(self, x, move=None):
305
+ # x.size = (B, C, M, L)
306
+ if move is None:
307
+ move = self.random.randint(-self.n, self.n)
308
+
309
+ if move == 0:
310
+ return x
311
+ else:
312
+ left = x[:, :, :, :move]
313
+ right = x[:, :, :, move:]
314
+ shuffled = paddle.concat([right, left], axis=3)
315
+ return shuffled
316
+
317
+ class PhaseShuffle1D(nn.Layer):
318
+ def __init__(self, n=2):
319
+ super(PhaseShuffle1D, self).__init__()
320
+ self.n = n
321
+ self.random = random.Random(1)
322
+
323
+ def forward(self, x, move=None):
324
+ # x.size = (B, C, L)
325
+ if move is None:
326
+ move = self.random.randint(-self.n, self.n)
327
+
328
+ if move == 0:
329
+ return x
330
+ else:
331
+ left = x[:, :, :move]
332
+ right = x[:, :, move:]
333
+ shuffled = paddle.concat([right, left], axis=2)
334
+
335
+ return shuffled
336
+
337
+ class MFCC(nn.Layer):
338
+ def __init__(self, n_mfcc=40, n_mels=80):
339
+ super(MFCC, self).__init__()
340
+ self.n_mfcc = n_mfcc
341
+ self.n_mels = n_mels
342
+ self.norm = 'ortho'
343
+ dct_mat = audio_F.create_dct(self.n_mfcc, self.n_mels, self.norm)
344
+ self.register_buffer('dct_mat', dct_mat)
345
+
346
+ def forward(self, mel_specgram):
347
+ if len(mel_specgram.shape) == 2:
348
+ mel_specgram = mel_specgram.unsqueeze(0)
349
+ unsqueezed = True
350
+ else:
351
+ unsqueezed = False
352
+ # (channel, n_mels, time).tranpose(...) dot (n_mels, n_mfcc)
353
+ # -> (channel, time, n_mfcc).tranpose(...)
354
+ mfcc = paddle.matmul(mel_specgram.transpose([0, 2, 1]), self.dct_mat).transpose([0, 2, 1])
355
+
356
+ # unpack batch
357
+ if unsqueezed:
358
+ mfcc = mfcc.squeeze(0)
359
+ return mfcc
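The `MFCC` layer above is just a matrix multiply with a precomputed DCT basis. A minimal NumPy sketch of the same computation, assuming `paddleaudio.functional.create_dct` follows the usual orthonormal DCT-II convention (as in torchaudio; `create_dct` and `mfcc` below are illustrative stand-ins, not the library functions):

```python
import numpy as np

def create_dct(n_mfcc, n_mels):
    # Orthonormal DCT-II basis, shape (n_mels, n_mfcc),
    # matching the norm='ortho' convention used above.
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)[:, None]
    dct = np.cos(np.pi / n_mels * (n + 0.5) * k)  # (n_mfcc, n_mels)
    dct[0] *= 1.0 / np.sqrt(2)
    dct *= np.sqrt(2.0 / n_mels)
    return dct.T  # (n_mels, n_mfcc)

def mfcc(mel_specgram, dct_mat):
    # (channel, n_mels, time) x (n_mels, n_mfcc) -> (channel, n_mfcc, time),
    # the same transpose-matmul-transpose as MFCC.forward
    return np.einsum('cmt,mk->ckt', mel_specgram, dct_mat)

mat = create_dct(40, 80)
out = mfcc(np.random.rand(1, 80, 50), mat)
print(out.shape)  # (1, 40, 50)
```

The truncated basis stays orthonormal (`mat.T @ mat` is the 40x40 identity), which is why the layer can cache it once in `register_buffer`.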
starganv2vc_paddle/Utils/ASR/models.py ADDED
@@ -0,0 +1,187 @@
+ import math
+ import paddle
+ from paddle import nn
+ from paddle.nn import TransformerEncoder
+ import paddle.nn.functional as F
+ from .layers import MFCC, Attention, LinearNorm, ConvNorm, ConvBlock
+
+ class ASRCNN(nn.Layer):
+    def __init__(self,
+                 input_dim=80,
+                 hidden_dim=256,
+                 n_token=35,
+                 n_layers=6,
+                 token_embedding_dim=256,
+    ):
+        super().__init__()
+        self.n_token = n_token
+        self.n_down = 1
+        self.to_mfcc = MFCC()
+        self.init_cnn = ConvNorm(input_dim//2, hidden_dim, kernel_size=7, padding=3, stride=2)
+        self.cnns = nn.Sequential(
+            *[nn.Sequential(
+                ConvBlock(hidden_dim),
+                nn.GroupNorm(num_groups=1, num_channels=hidden_dim)
+            ) for n in range(n_layers)])
+        self.projection = ConvNorm(hidden_dim, hidden_dim // 2)
+        self.ctc_linear = nn.Sequential(
+            LinearNorm(hidden_dim//2, hidden_dim),
+            nn.ReLU(),
+            LinearNorm(hidden_dim, n_token))
+        self.asr_s2s = ASRS2S(
+            embedding_dim=token_embedding_dim,
+            hidden_dim=hidden_dim//2,
+            n_token=n_token)
+
+    def forward(self, x, src_key_padding_mask=None, text_input=None):
+        x = self.to_mfcc(x)
+        x = self.init_cnn(x)
+        x = self.cnns(x)
+        x = self.projection(x)
+        x = x.transpose([0, 2, 1])
+        ctc_logit = self.ctc_linear(x)
+        if text_input is not None:
+            _, s2s_logit, s2s_attn = self.asr_s2s(x, src_key_padding_mask, text_input)
+            return ctc_logit, s2s_logit, s2s_attn
+        else:
+            return ctc_logit
+
+    def get_feature(self, x):
+        x = self.to_mfcc(x.squeeze(1))
+        x = self.init_cnn(x)
+        x = self.cnns(x)
+        x = self.projection(x)
+        return x
+
+    def length_to_mask(self, lengths):
+        mask = paddle.arange(lengths.max()).unsqueeze(0).expand((lengths.shape[0], -1)).astype(lengths.dtype)
+        mask = paddle.greater_than(mask+1, lengths.unsqueeze(1))
+        return mask
+
+    def get_future_mask(self, out_length, unmask_future_steps=0):
+        """
+        Args:
+            out_length (int): returned mask shape is (out_length, out_length).
+            unmask_future_steps (int): unmasking future step size.
+        Return:
+            mask (paddle.BoolTensor): future-timestep mask; mask[i, j] = True if i > j + unmask_future_steps else False
+        """
+        index_tensor = paddle.arange(out_length).unsqueeze(0).expand([out_length, -1])
+        mask = paddle.greater_than(index_tensor, index_tensor.T + unmask_future_steps)
+        return mask
+
+ class ASRS2S(nn.Layer):
+    def __init__(self,
+                 embedding_dim=256,
+                 hidden_dim=512,
+                 n_location_filters=32,
+                 location_kernel_size=63,
+                 n_token=40):
+        super(ASRS2S, self).__init__()
+        self.embedding = nn.Embedding(n_token, embedding_dim)
+        val_range = math.sqrt(6 / hidden_dim)
+        nn.initializer.Uniform(-val_range, val_range)(self.embedding.weight)
+
+        self.decoder_rnn_dim = hidden_dim
+        self.project_to_n_symbols = nn.Linear(self.decoder_rnn_dim, n_token)
+        self.attention_layer = Attention(
+            self.decoder_rnn_dim,
+            hidden_dim,
+            hidden_dim,
+            n_location_filters,
+            location_kernel_size
+        )
+        self.decoder_rnn = nn.LSTMCell(self.decoder_rnn_dim + embedding_dim, self.decoder_rnn_dim)
+        self.project_to_hidden = nn.Sequential(
+            LinearNorm(self.decoder_rnn_dim * 2, hidden_dim),
+            nn.Tanh())
+        self.sos = 1
+        self.eos = 2
+
+    def initialize_decoder_states(self, memory, mask):
+        """
+        memory.shape = (B, L, H) = (Batchsize, Maxtimestep, Hiddendim)
+        """
+        B, L, H = memory.shape
+        self.decoder_hidden = paddle.zeros((B, self.decoder_rnn_dim)).astype(memory.dtype)
+        self.decoder_cell = paddle.zeros((B, self.decoder_rnn_dim)).astype(memory.dtype)
+        self.attention_weights = paddle.zeros((B, L)).astype(memory.dtype)
+        self.attention_weights_cum = paddle.zeros((B, L)).astype(memory.dtype)
+        self.attention_context = paddle.zeros((B, H)).astype(memory.dtype)
+        self.memory = memory
+        self.processed_memory = self.attention_layer.memory_layer(memory)
+        self.mask = mask
+        self.unk_index = 3
+        self.random_mask = 0.1
+
+    def forward(self, memory, memory_mask, text_input):
+        """
+        memory.shape = (B, L, H) = (Batchsize, Maxtimestep, Hiddendim)
+        memory_mask.shape = (B, L, )
+        text_input.shape = (B, T)
+        """
+        self.initialize_decoder_states(memory, memory_mask)
+        # text random mask
+        random_mask = (paddle.rand(text_input.shape) < self.random_mask)
+        _text_input = text_input.clone()
+        _text_input[:] = paddle.where(random_mask, paddle.full(_text_input.shape, self.unk_index, _text_input.dtype), _text_input)
+        decoder_inputs = self.embedding(_text_input).transpose([1, 0, 2]) # -> [T, B, channel]
+        start_embedding = self.embedding(
+            paddle.to_tensor([self.sos]*decoder_inputs.shape[1], dtype=paddle.int64))
+        decoder_inputs = paddle.concat((start_embedding.unsqueeze(0), decoder_inputs), axis=0)
+
+        hidden_outputs, logit_outputs, alignments = [], [], []
+        while len(hidden_outputs) < decoder_inputs.shape[0]:
+
+            decoder_input = decoder_inputs[len(hidden_outputs)]
+            hidden, logit, attention_weights = self.decode(decoder_input)
+            hidden_outputs += [hidden]
+            logit_outputs += [logit]
+            alignments += [attention_weights]
+
+        hidden_outputs, logit_outputs, alignments = \
+            self.parse_decoder_outputs(
+                hidden_outputs, logit_outputs, alignments)
+
+        return hidden_outputs, logit_outputs, alignments
+
+    def decode(self, decoder_input):
+        cell_input = paddle.concat((decoder_input, self.attention_context), -1)
+        # paddle.nn.LSTMCell returns (outputs, (h, c))
+        _, (self.decoder_hidden, self.decoder_cell) = self.decoder_rnn(
+            cell_input,
+            (self.decoder_hidden, self.decoder_cell))
+
+        attention_weights_cat = paddle.concat(
+            (self.attention_weights.unsqueeze(1),
+             self.attention_weights_cum.unsqueeze(1)), axis=1)
+
+        self.attention_context, self.attention_weights = self.attention_layer(
+            self.decoder_hidden,
+            self.memory,
+            self.processed_memory,
+            attention_weights_cat,
+            self.mask)
+
+        self.attention_weights_cum += self.attention_weights
+
+        hidden_and_context = paddle.concat((self.decoder_hidden, self.attention_context), -1)
+        hidden = self.project_to_hidden(hidden_and_context)
+
+        # dropout to increase generalization
+        logit = self.project_to_n_symbols(F.dropout(hidden, 0.5, training=self.training))
+
+        return hidden, logit, self.attention_weights
+
+    def parse_decoder_outputs(self, hidden, logit, alignments):
+        # -> [B, T_out + 1, max_time]
+        alignments = paddle.stack(alignments).transpose([1, 0, 2])
+        # [T_out + 1, B, n_symbols] -> [B, T_out + 1, n_symbols]
+        logit = paddle.stack(logit).transpose([1, 0, 2])
+        hidden = paddle.stack(hidden).transpose([1, 0, 2])
+
+        return hidden, logit, alignments
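`ASRCNN.length_to_mask` builds the padding mask consumed by the attention layer: position `t` of row `i` becomes True once `t + 1` exceeds `lengths[i]`. The same semantics in a standalone NumPy sketch:

```python
import numpy as np

def length_to_mask(lengths):
    # mask[i, t] is True for padded positions, i.e. t + 1 > lengths[i],
    # mirroring ASRCNN.length_to_mask above
    idx = np.arange(lengths.max())[None, :]
    return (idx + 1) > lengths[:, None]

print(length_to_mask(np.array([2, 4])))
# [[False False  True  True]
#  [False False False False]]
```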
starganv2vc_paddle/Utils/JDC/__init__.py ADDED
@@ -0,0 +1 @@
+
starganv2vc_paddle/Utils/JDC/model.py ADDED
@@ -0,0 +1,174 @@
+ """
+ Implementation of model from:
+ Kum et al. - "Joint Detection and Classification of Singing Voice Melody Using
+ Convolutional Recurrent Neural Networks" (2019)
+ Link: https://www.semanticscholar.org/paper/Joint-Detection-and-Classification-of-Singing-Voice-Kum-Nam/60a2ad4c7db43bace75805054603747fcd062c0d
+ """
+ import paddle
+ from paddle import nn
+
+ class JDCNet(nn.Layer):
+    """
+    Joint Detection and Classification Network model for singing voice melody.
+    """
+    def __init__(self, num_class=722, seq_len=31, leaky_relu_slope=0.01):
+        super().__init__()
+        self.seq_len = seq_len  # 31
+        self.num_class = num_class
+
+        # input = (b, 1, 31, 513), b = batch size
+        self.conv_block = nn.Sequential(
+            nn.Conv2D(in_channels=1, out_channels=64, kernel_size=3, padding=1, bias_attr=False),  # out: (b, 64, 31, 513)
+            nn.BatchNorm2D(num_features=64),
+            nn.LeakyReLU(leaky_relu_slope),
+            nn.Conv2D(64, 64, 3, padding=1, bias_attr=False),  # (b, 64, 31, 513)
+        )
+
+        # res blocks
+        self.res_block1 = ResBlock(in_channels=64, out_channels=128)  # (b, 128, 31, 128)
+        self.res_block2 = ResBlock(in_channels=128, out_channels=192)  # (b, 192, 31, 32)
+        self.res_block3 = ResBlock(in_channels=192, out_channels=256)  # (b, 256, 31, 8)
+
+        # pool block
+        self.pool_block = nn.Sequential(
+            nn.BatchNorm2D(num_features=256),
+            nn.LeakyReLU(leaky_relu_slope),
+            nn.MaxPool2D(kernel_size=(1, 4)),  # (b, 256, 31, 2)
+            nn.Dropout(p=0.5),
+        )
+
+        # maxpool layers (for auxiliary network inputs)
+        # in = (b, 128, 31, 513) from conv_block, out = (b, 128, 31, 2)
+        self.maxpool1 = nn.MaxPool2D(kernel_size=(1, 40))
+        # in = (b, 128, 31, 128) from res_block1, out = (b, 128, 31, 2)
+        self.maxpool2 = nn.MaxPool2D(kernel_size=(1, 20))
+        # in = (b, 128, 31, 32) from res_block2, out = (b, 128, 31, 2)
+        self.maxpool3 = nn.MaxPool2D(kernel_size=(1, 10))
+
+        # in = (b, 640, 31, 2), out = (b, 256, 31, 2)
+        self.detector_conv = nn.Sequential(
+            nn.Conv2D(640, 256, 1, bias_attr=False),
+            nn.BatchNorm2D(256),
+            nn.LeakyReLU(leaky_relu_slope),
+            nn.Dropout(p=0.5),
+        )
+
+        # input: (b, 31, 512) - resized from (b, 256, 31, 2)
+        self.bilstm_classifier = nn.LSTM(
+            input_size=512, hidden_size=256,
+            time_major=False, direction='bidirectional')  # (b, 31, 512)
+
+        # input: (b, 31, 512) - resized from (b, 256, 31, 2)
+        self.bilstm_detector = nn.LSTM(
+            input_size=512, hidden_size=256,
+            time_major=False, direction='bidirectional')  # (b, 31, 512)
+
+        # input: (b * 31, 512)
+        self.classifier = nn.Linear(in_features=512, out_features=self.num_class)  # (b * 31, num_class)
+
+        # input: (b * 31, 512)
+        self.detector = nn.Linear(in_features=512, out_features=2)  # (b * 31, 2) - binary classifier
+
+        # initialize weights
+        self.apply(self.init_weights)
+
+    def get_feature_GAN(self, x):
+        seq_len = x.shape[-2]
+        x = x.astype(paddle.float32).transpose([0, 1, 3, 2] if len(x.shape) == 4 else [0, 2, 1])
+
+        convblock_out = self.conv_block(x)
+
+        resblock1_out = self.res_block1(convblock_out)
+        resblock2_out = self.res_block2(resblock1_out)
+        resblock3_out = self.res_block3(resblock2_out)
+        poolblock_out = self.pool_block[0](resblock3_out)
+        poolblock_out = self.pool_block[1](poolblock_out)
+
+        return poolblock_out.transpose([0, 1, 3, 2] if len(poolblock_out.shape) == 4 else [0, 2, 1])
+
+    def forward(self, x):
+        """
+        Returns:
+            classification_prediction, detection_prediction
+            sizes: (b, 31, 722), (b, 31, 2)
+        """
+        ###############################
+        # forward pass for classifier #
+        ###############################
+        x = x.astype(paddle.float32).transpose([0, 1, 3, 2] if len(x.shape) == 4 else [0, 2, 1])
+
+        convblock_out = self.conv_block(x)
+
+        resblock1_out = self.res_block1(convblock_out)
+        resblock2_out = self.res_block2(resblock1_out)
+        resblock3_out = self.res_block3(resblock2_out)
+
+        poolblock_out = self.pool_block[0](resblock3_out)
+        poolblock_out = self.pool_block[1](poolblock_out)
+        GAN_feature = poolblock_out.transpose([0, 1, 3, 2] if len(poolblock_out.shape) == 4 else [0, 2, 1])
+        poolblock_out = self.pool_block[2](poolblock_out)
+
+        # (b, 256, 31, 2) => (b, 31, 256, 2) => (b, 31, 512)
+        classifier_out = poolblock_out.transpose([0, 2, 1, 3]).reshape((-1, self.seq_len, 512))
+        classifier_out, _ = self.bilstm_classifier(classifier_out)  # ignore the hidden states
+
+        classifier_out = classifier_out.reshape((-1, 512))  # (b * 31, 512)
+        classifier_out = self.classifier(classifier_out)
+        classifier_out = classifier_out.reshape((-1, self.seq_len, self.num_class))  # (b, 31, num_class)
+
+        # sizes: (b, 31, 722), (b, 31, 2)
+        # classifier output consists of predicted pitch classes per frame
+        # detector output consists of: (isvoice, notvoice) estimates per frame
+        return paddle.abs(classifier_out.squeeze()), GAN_feature, poolblock_out
+
+    @staticmethod
+    def init_weights(m):
+        if isinstance(m, nn.Linear):
+            nn.initializer.KaimingUniform()(m.weight)
+            if m.bias is not None:
+                nn.initializer.Constant(0)(m.bias)
+        elif isinstance(m, nn.Conv2D):
+            nn.initializer.XavierNormal()(m.weight)
+        elif isinstance(m, nn.LSTM) or isinstance(m, nn.LSTMCell):
+            for p in m.parameters():
+                if len(p.shape) >= 2 and float('.'.join(paddle.__version__.split('.')[:2])) >= 2.3:
+                    nn.initializer.Orthogonal()(p)
+                else:
+                    nn.initializer.Normal()(p)
+
+
+ class ResBlock(nn.Layer):
+    def __init__(self, in_channels: int, out_channels: int, leaky_relu_slope=0.01):
+        super().__init__()
+        self.downsample = in_channels != out_channels
+
+        # BN / LReLU / MaxPool layer before the conv layer - see Figure 1b in the paper
+        self.pre_conv = nn.Sequential(
+            nn.BatchNorm2D(num_features=in_channels),
+            nn.LeakyReLU(leaky_relu_slope),
+            nn.MaxPool2D(kernel_size=(1, 2)),  # apply downsampling on the y axis only
+        )
+
+        # conv layers
+        self.conv = nn.Sequential(
+            nn.Conv2D(in_channels=in_channels, out_channels=out_channels,
+                      kernel_size=3, padding=1, bias_attr=False),
+            nn.BatchNorm2D(out_channels),
+            nn.LeakyReLU(leaky_relu_slope),
+            nn.Conv2D(out_channels, out_channels, 3, padding=1, bias_attr=False),
+        )
+
+        # 1 x 1 convolution layer to match the feature dimensions
+        self.conv1by1 = None
+        if self.downsample:
+            self.conv1by1 = nn.Conv2D(in_channels, out_channels, 1, bias_attr=False)
+
+    def forward(self, x):
+        x = self.pre_conv(x)
+        if self.downsample:
+            x = self.conv(x) + self.conv1by1(x)
+        else:
+            x = self.conv(x) + x
+        return x
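The `ResBlock` above is pre-activation style: `pre_conv` halves the frequency axis with `MaxPool2D((1, 2))`, then the output sums the 3x3 conv path with a shortcut that is 1x1-projected only when channel counts differ. The shape bookkeeping, traced with a hypothetical helper (shapes only, no tensors):

```python
def resblock_shapes(b, c_in, c_out, h, w):
    # Trace shapes through one ResBlock (illustrative helper, not repo code):
    # pre_conv's MaxPool2D(kernel_size=(1, 2)) halves the last axis only;
    # the 3x3 convs (padding=1) keep spatial size and map c_in -> c_out,
    # and the 1x1 shortcut matches channels when c_in != c_out.
    return (b, c_out, h, w // 2)

print(resblock_shapes(1, 64, 128, 31, 512))   # (1, 128, 31, 256)
print(resblock_shapes(2, 128, 192, 31, 256))  # (2, 192, 31, 128)
```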
starganv2vc_paddle/Utils/__init__.py ADDED
@@ -0,0 +1 @@
+
starganv2vc_paddle/fbank_matrix.pd ADDED
Binary file (328 kB). View file
 
starganv2vc_paddle/losses.py ADDED
@@ -0,0 +1,215 @@
+ #coding:utf-8
+
+ import os
+ import paddle
+
+ from paddle import nn
+ from munch import Munch
+ from starganv2vc_paddle.transforms import build_transforms
+
+ import paddle.nn.functional as F
+ import numpy as np
+
+ def compute_d_loss(nets, args, x_real, y_org, y_trg, z_trg=None, x_ref=None, use_r1_reg=True, use_adv_cls=False, use_con_reg=False):
+    args = Munch(args)
+
+    assert (z_trg is None) != (x_ref is None)
+    # with real audios
+    x_real.stop_gradient = False
+    out = nets.discriminator(x_real, y_org)
+    loss_real = adv_loss(out, 1)
+
+    # R1 regularization (https://arxiv.org/abs/1801.04406v4)
+    if use_r1_reg:
+        loss_reg = r1_reg(out, x_real)
+    else:
+        loss_reg = paddle.to_tensor([0.], dtype=paddle.float32)
+
+    # consistency regularization (bCR-GAN: https://arxiv.org/abs/2002.04724)
+    loss_con_reg = paddle.to_tensor([0.], dtype=paddle.float32)
+    if use_con_reg:
+        t = build_transforms()
+        out_aug = nets.discriminator(t(x_real).detach(), y_org)
+        loss_con_reg += F.smooth_l1_loss(out, out_aug)
+
+    # with fake audios
+    with paddle.no_grad():
+        if z_trg is not None:
+            s_trg = nets.mapping_network(z_trg, y_trg)
+        else:  # x_ref is not None
+            s_trg = nets.style_encoder(x_ref, y_trg)
+
+        F0 = nets.f0_model.get_feature_GAN(x_real)
+        x_fake = nets.generator(x_real, s_trg, masks=None, F0=F0)
+    out = nets.discriminator(x_fake, y_trg)
+    loss_fake = adv_loss(out, 0)
+    if use_con_reg:
+        out_aug = nets.discriminator(t(x_fake).detach(), y_trg)
+        loss_con_reg += F.smooth_l1_loss(out, out_aug)
+
+    # adversarial classifier loss
+    if use_adv_cls:
+        out_de = nets.discriminator.classifier(x_fake)
+        loss_real_adv_cls = F.cross_entropy(out_de[y_org != y_trg], y_org[y_org != y_trg])
+
+        if use_con_reg:
+            out_de_aug = nets.discriminator.classifier(t(x_fake).detach())
+            loss_con_reg += F.smooth_l1_loss(out_de, out_de_aug)
+    else:
+        loss_real_adv_cls = paddle.zeros([1]).mean()
+
+    loss = loss_real + loss_fake + args.lambda_reg * loss_reg + \
+           args.lambda_adv_cls * loss_real_adv_cls + \
+           args.lambda_con_reg * loss_con_reg
+
+    return loss, Munch(real=loss_real.item(),
+                       fake=loss_fake.item(),
+                       reg=loss_reg.item(),
+                       real_adv_cls=loss_real_adv_cls.item(),
+                       con_reg=loss_con_reg.item())
+
+ def compute_g_loss(nets, args, x_real, y_org, y_trg, z_trgs=None, x_refs=None, use_adv_cls=False):
+    args = Munch(args)
+
+    assert (z_trgs is None) != (x_refs is None)
+    if z_trgs is not None:
+        z_trg, z_trg2 = z_trgs
+    if x_refs is not None:
+        x_ref, x_ref2 = x_refs
+
+    # compute style vectors
+    if z_trgs is not None:
+        s_trg = nets.mapping_network(z_trg, y_trg)
+    else:
+        s_trg = nets.style_encoder(x_ref, y_trg)
+
+    # compute ASR/F0 features (real)
+    with paddle.no_grad():
+        F0_real, GAN_F0_real, cyc_F0_real = nets.f0_model(x_real)
+        ASR_real = nets.asr_model.get_feature(x_real)
+
+    # adversarial loss
+    x_fake = nets.generator(x_real, s_trg, masks=None, F0=GAN_F0_real)
+    out = nets.discriminator(x_fake, y_trg)
+    loss_adv = adv_loss(out, 1)
+
+    # compute ASR/F0 features (fake)
+    F0_fake, GAN_F0_fake, _ = nets.f0_model(x_fake)
+    ASR_fake = nets.asr_model.get_feature(x_fake)
+
+    # norm consistency loss
+    x_fake_norm = log_norm(x_fake)
+    x_real_norm = log_norm(x_real)
+    loss_norm = ((paddle.nn.ReLU()(paddle.abs(x_fake_norm - x_real_norm) - args.norm_bias))**2).mean()
+
+    # F0 loss
+    loss_f0 = f0_loss(F0_fake, F0_real)
+
+    # style F0 loss (style initialization)
+    if x_refs is not None and args.lambda_f0_sty > 0 and not use_adv_cls:
+        F0_sty, _, _ = nets.f0_model(x_ref)
+        loss_f0_sty = F.l1_loss(compute_mean_f0(F0_fake), compute_mean_f0(F0_sty))
+    else:
+        loss_f0_sty = paddle.zeros([1]).mean()
+
+    # ASR loss
+    loss_asr = F.smooth_l1_loss(ASR_fake, ASR_real)
+
+    # style reconstruction loss
+    s_pred = nets.style_encoder(x_fake, y_trg)
+    loss_sty = paddle.mean(paddle.abs(s_pred - s_trg))
+
+    # diversity sensitive loss
+    if z_trgs is not None:
+        s_trg2 = nets.mapping_network(z_trg2, y_trg)
+    else:
+        s_trg2 = nets.style_encoder(x_ref2, y_trg)
+    x_fake2 = nets.generator(x_real, s_trg2, masks=None, F0=GAN_F0_real)
+    x_fake2 = x_fake2.detach()
+    _, GAN_F0_fake2, _ = nets.f0_model(x_fake2)
+    loss_ds = paddle.mean(paddle.abs(x_fake - x_fake2))
+    loss_ds += F.smooth_l1_loss(GAN_F0_fake, GAN_F0_fake2.detach())
+
+    # cycle-consistency loss
+    s_org = nets.style_encoder(x_real, y_org)
+    x_rec = nets.generator(x_fake, s_org, masks=None, F0=GAN_F0_fake)
+    loss_cyc = paddle.mean(paddle.abs(x_rec - x_real))
+    # F0 loss in cycle-consistency loss
+    if args.lambda_f0 > 0:
+        _, _, cyc_F0_rec = nets.f0_model(x_rec)
+        loss_cyc += F.smooth_l1_loss(cyc_F0_rec, cyc_F0_real)
+    if args.lambda_asr > 0:
+        ASR_recon = nets.asr_model.get_feature(x_rec)
+        loss_cyc += F.smooth_l1_loss(ASR_recon, ASR_real)
+
+    # adversarial classifier loss
+    if use_adv_cls:
+        out_de = nets.discriminator.classifier(x_fake)
+        loss_adv_cls = F.cross_entropy(out_de[y_org != y_trg], y_trg[y_org != y_trg])
+    else:
+        loss_adv_cls = paddle.zeros([1]).mean()
+
+    loss = args.lambda_adv * loss_adv + args.lambda_sty * loss_sty \
+           - args.lambda_ds * loss_ds + args.lambda_cyc * loss_cyc \
+           + args.lambda_norm * loss_norm \
+           + args.lambda_asr * loss_asr \
+           + args.lambda_f0 * loss_f0 \
+           + args.lambda_f0_sty * loss_f0_sty \
+           + args.lambda_adv_cls * loss_adv_cls
+
+    return loss, Munch(adv=loss_adv.item(),
+                       sty=loss_sty.item(),
+                       ds=loss_ds.item(),
+                       cyc=loss_cyc.item(),
+                       norm=loss_norm.item(),
+                       asr=loss_asr.item(),
+                       f0=loss_f0.item(),
+                       adv_cls=loss_adv_cls.item())
+
+ # for norm consistency loss
+ def log_norm(x, mean=-4, std=4, axis=2):
+    """
+    normalized log mel -> mel -> norm -> log(norm)
+    """
+    x = paddle.log(paddle.exp(x * std + mean).norm(axis=axis))
+    return x
+
+ # for adversarial loss
+ def adv_loss(logits, target):
+    assert target in [1, 0]
+    if len(logits.shape) > 1:
+        logits = logits.reshape([-1])
+    targets = paddle.full_like(logits, fill_value=target)
+    logits = logits.clip(min=-10, max=10)  # prevent nan
+    loss = F.binary_cross_entropy_with_logits(logits, targets)
+    return loss
+
+ # for R1 regularization loss
+ def r1_reg(d_out, x_in):
+    # zero-centered gradient penalty for real images
+    batch_size = x_in.shape[0]
+    grad_dout = paddle.grad(
+        outputs=d_out.sum(), inputs=x_in,
+        create_graph=True, retain_graph=True, only_inputs=True
+    )[0]
+    grad_dout2 = grad_dout.pow(2)
+    assert (grad_dout2.shape == x_in.shape)
+    reg = 0.5 * grad_dout2.reshape((batch_size, -1)).sum(1).mean(0)
+    return reg
+
+ # for F0 consistency loss
+ def compute_mean_f0(f0):
+    f0_mean = f0.mean(-1)
+    f0_mean = f0_mean.expand((f0.shape[-1], f0_mean.shape[0])).transpose((1, 0))  # (B, M)
+    return f0_mean
+
+ def f0_loss(x_f0, y_f0):
+    """
+    x.shape = (B, 1, M, L): predict
+    y.shape = (B, 1, M, L): target
+    """
+    # compute the mean
+    x_mean = compute_mean_f0(x_f0)
+    y_mean = compute_mean_f0(y_f0)
+    loss = F.l1_loss(x_f0 / x_mean, y_f0 / y_mean)
+    return loss
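`adv_loss` above is plain binary cross-entropy with logits against a constant real (1) or fake (0) target, with logits clipped to [-10, 10] for stability. The arithmetic alone, as a NumPy sketch (not the training code):

```python
import numpy as np

def adv_loss(logits, target):
    # BCE-with-logits against an all-real (1) or all-fake (0) target,
    # with the same clipping as the paddle version above
    logits = np.clip(logits.reshape(-1), -10, 10)
    targets = np.full_like(logits, float(target))
    # numerically stable form: max(x, 0) - x*t + log(1 + exp(-|x|))
    loss = np.maximum(logits, 0) - logits * targets + np.log1p(np.exp(-np.abs(logits)))
    return loss.mean()

print(adv_loss(np.array([0.0, 0.0]), 1))  # log(2) ≈ 0.6931
```

A confident discriminator output with the matching target (e.g. logit 10, target 1) drives the loss toward zero, while the clip keeps extreme logits from producing overflow in the exponential.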
starganv2vc_paddle/meldataset.py ADDED
@@ -0,0 +1,155 @@
+ #coding: utf-8
+
+ import os
+ import time
+ import random
+ import paddle
+ import paddleaudio
+
+ import numpy as np
+ import soundfile as sf
+ import paddle.nn.functional as F
+
+ from paddle import nn
+ from paddle.io import DataLoader
+
+ import logging
+ logger = logging.getLogger(__name__)
+ logger.setLevel(logging.DEBUG)
+
+ np.random.seed(1)
+ random.seed(1)
+
+ SPECT_PARAMS = {
+    "n_fft": 2048,
+    "win_length": 1200,
+    "hop_length": 300
+ }
+ MEL_PARAMS = {
+    "n_mels": 80,
+    "n_fft": 2048,
+    "win_length": 1200,
+    "hop_length": 300
+ }
+
+ class MelDataset(paddle.io.Dataset):
+    def __init__(self,
+                 data_list,
+                 sr=24000,
+                 validation=False,
+                 ):
+        _data_list = [l[:-1].split('|') for l in data_list]
+        self.data_list = [(path, int(label)) for path, label in _data_list]
+        self.data_list_per_class = {
+            target: [(path, label) for path, label in self.data_list if label == target] \
+            for target in list(set([label for _, label in self.data_list]))}
+
+        self.sr = sr
+        self.to_melspec = paddleaudio.features.MelSpectrogram(**MEL_PARAMS)
+        self.to_melspec.fbank_matrix[:] = paddle.load(os.path.dirname(__file__) + '/fbank_matrix.pd')['fbank_matrix']
+
+        self.mean, self.std = -4, 4
+        self.validation = validation
+        self.max_mel_length = 192
+
+    def __len__(self):
+        return len(self.data_list)
+
+    def __getitem__(self, idx):
+        with paddle.fluid.dygraph.guard(paddle.CPUPlace()):
+            data = self.data_list[idx]
+            mel_tensor, label = self._load_data(data)
+            ref_data = random.choice(self.data_list)
+            ref_mel_tensor, ref_label = self._load_data(ref_data)
+            ref2_data = random.choice(self.data_list_per_class[ref_label])
+            ref2_mel_tensor, _ = self._load_data(ref2_data)
+            return mel_tensor, label, ref_mel_tensor, ref2_mel_tensor, ref_label
+
+    def _load_data(self, data):
+        wave_tensor, label = self._load_tensor(data)
+
+        if not self.validation:  # random scale for robustness
+            random_scale = 0.5 + 0.5 * np.random.random()
+            wave_tensor = random_scale * wave_tensor
+
+        mel_tensor = self.to_melspec(wave_tensor)
+        mel_tensor = (paddle.log(1e-5 + mel_tensor) - self.mean) / self.std
+        mel_length = mel_tensor.shape[1]
+        if mel_length > self.max_mel_length:
+            random_start = np.random.randint(0, mel_length - self.max_mel_length)
+            mel_tensor = mel_tensor[:, random_start:random_start + self.max_mel_length]
+
+        return mel_tensor, label
+
+    def _preprocess(self, wave_tensor):
+        mel_tensor = self.to_melspec(wave_tensor)
+        mel_tensor = (paddle.log(1e-5 + mel_tensor) - self.mean) / self.std
+        return mel_tensor
+
+    def _load_tensor(self, data):
+        wave_path, label = data
+        label = int(label)
+        wave, sr = sf.read(wave_path)
+        wave_tensor = paddle.to_tensor(wave).astype(paddle.float32)
+        return wave_tensor, label
+
+ class Collater(object):
+    """
+    Args:
+        adaptive_batch_size (bool): if true, decrease batch size when long data comes.
+    """
+
+    def __init__(self, return_wave=False):
+        self.text_pad_index = 0
+        self.return_wave = return_wave
+        self.max_mel_length = 192
+        self.mel_length_step = 16
+        self.latent_dim = 16
+
+    def __call__(self, batch):
+        batch_size = len(batch)
+        nmels = batch[0][0].shape[0]
+        mels = paddle.zeros((batch_size, nmels, self.max_mel_length)).astype(paddle.float32)
+        labels = paddle.zeros((batch_size)).astype(paddle.int64)
+        ref_mels = paddle.zeros((batch_size, nmels, self.max_mel_length)).astype(paddle.float32)
+        ref2_mels = paddle.zeros((batch_size, nmels, self.max_mel_length)).astype(paddle.float32)
+        ref_labels = paddle.zeros((batch_size)).astype(paddle.int64)
+
+        for bid, (mel, label, ref_mel, ref2_mel, ref_label) in enumerate(batch):
+            mel_size = mel.shape[1]
+            mels[bid, :, :mel_size] = mel
+
+            ref_mel_size = ref_mel.shape[1]
+            ref_mels[bid, :, :ref_mel_size] = ref_mel
+
+            ref2_mel_size = ref2_mel.shape[1]
+            ref2_mels[bid, :, :ref2_mel_size] = ref2_mel
+
+            labels[bid] = label
+            ref_labels[bid] = ref_label
+
+        z_trg = paddle.randn((batch_size, self.latent_dim))
+        z_trg2 = paddle.randn((batch_size, self.latent_dim))
+
+        mels, ref_mels, ref2_mels = mels.unsqueeze(1), ref_mels.unsqueeze(1), ref2_mels.unsqueeze(1)
+        return mels, labels, ref_mels, ref2_mels, ref_labels, z_trg, z_trg2
+
+ def build_dataloader(path_list,
+                     validation=False,
+                     batch_size=4,
+                     num_workers=1,
+                     collate_config={},
+                     dataset_config={}):
+    dataset = MelDataset(path_list, validation=validation)
+    collate_fn = Collater(**collate_config)
+    data_loader = DataLoader(dataset,
+                             batch_size=batch_size,
+                             shuffle=(not validation),
+                             num_workers=num_workers,
+                             drop_last=(not validation),
+                             collate_fn=collate_fn)
+
+    return data_loader
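`Collater.__call__` zero-pads every spectrogram in the batch to `max_mel_length` on the time axis before stacking. The padding step alone, sketched in NumPy (`collate_mels` is an illustrative helper, not repo code):

```python
import numpy as np

def collate_mels(mels, max_mel_length=192):
    # Zero-pad each (n_mels, T) spectrogram to a fixed time length and
    # stack them, as Collater.__call__ above does for mels/ref_mels
    n_mels = mels[0].shape[0]
    out = np.zeros((len(mels), n_mels, max_mel_length), dtype=np.float32)
    for i, m in enumerate(mels):
        out[i, :, :m.shape[1]] = m
    return out

batch = [np.ones((80, 100), dtype=np.float32), np.ones((80, 192), dtype=np.float32)]
print(collate_mels(batch).shape)  # (2, 80, 192)
```

Because `MelDataset` already crops every clip to at most 192 frames, the padded tensor never truncates data; short clips simply carry trailing zeros.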
starganv2vc_paddle/models.py ADDED
@@ -0,0 +1,391 @@
+ """
+ StarGAN v2
+ Copyright (c) 2020-present NAVER Corp.
+ This work is licensed under the Creative Commons Attribution-NonCommercial
+ 4.0 International License. To view a copy of this license, visit
+ http://creativecommons.org/licenses/by-nc/4.0/ or send a letter to
+ Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
+ """
+ import os
+ import os.path as osp
+
+ import copy
+ import math
+
+ from munch import Munch
+ import numpy as np
+ import paddle
+ import paddle.nn as nn
+ import paddle.nn.functional as F
+
+ class DownSample(nn.Layer):
+    def __init__(self, layer_type):
+        super().__init__()
+        self.layer_type = layer_type
+
+    def forward(self, x):
+        if self.layer_type == 'none':
+            return x
+        elif self.layer_type == 'timepreserve':
+            return F.avg_pool2d(x, (2, 1))
+        elif self.layer_type == 'half':
+            return F.avg_pool2d(x, 2)
+        else:
+            raise RuntimeError('Got unexpected downsample type %s, expected one of [none, timepreserve, half]' % self.layer_type)
+
+
+ class UpSample(nn.Layer):
+    def __init__(self, layer_type):
+        super().__init__()
+        self.layer_type = layer_type
+
+    def forward(self, x):
+        if self.layer_type == 'none':
+            return x
+        elif self.layer_type == 'timepreserve':
+            return F.interpolate(x, scale_factor=(2, 1), mode='nearest')
+        elif self.layer_type == 'half':
+            return F.interpolate(x, scale_factor=2, mode='nearest')
+        else:
+            raise RuntimeError('Got unexpected upsample type %s, expected one of [none, timepreserve, half]' % self.layer_type)
+
+
+ class ResBlk(nn.Layer):
+    def __init__(self, dim_in, dim_out, actv=nn.LeakyReLU(0.2),
+                 normalize=False, downsample='none'):
+        super().__init__()
+        self.actv = actv
+        self.normalize = normalize
+        self.downsample = DownSample(downsample)
+        self.learned_sc = dim_in != dim_out
+        self._build_weights(dim_in, dim_out)
+
+    def _build_weights(self, dim_in, dim_out):
+        self.conv1 = nn.Conv2D(dim_in, dim_in, 3, 1, 1)
+        self.conv2 = nn.Conv2D(dim_in, dim_out, 3, 1, 1)
+        if self.normalize:
+            self.norm1 = nn.InstanceNorm2D(dim_in)
+            self.norm2 = nn.InstanceNorm2D(dim_in)
+        if self.learned_sc:
+            self.conv1x1 = nn.Conv2D(dim_in, dim_out, 1, 1, 0, bias_attr=False)
+
+    def _shortcut(self, x):
+        if self.learned_sc:
+            x = self.conv1x1(x)
+        if self.downsample:
+            x = self.downsample(x)
+        return x
+
+    def _residual(self, x):
+        if self.normalize:
+            x = self.norm1(x)
+        x = self.actv(x)
+        x = self.conv1(x)
+        x = self.downsample(x)
+        if self.normalize:
+            x = self.norm2(x)
+        x = self.actv(x)
+        x = self.conv2(x)
+        return x
+
+    def forward(self, x):
+        x = self._shortcut(x) + self._residual(x)
+        return x / math.sqrt(2)  # unit variance
+
+ class AdaIN(nn.Layer):
+    def __init__(self, style_dim, num_features):
+        super().__init__()
+        self.norm = nn.InstanceNorm2D(num_features, weight_attr=False, bias_attr=False)
+        self.fc = nn.Linear(style_dim, num_features*2)
+
+    def forward(self, x, s):
+        if len(s.shape) == 1:
+            s = s[None]
+        h = self.fc(s)
+        h = h.reshape((h.shape[0], h.shape[1], 1, 1))
+        gamma, beta = paddle.split(h, 2, axis=1)
+        return (1 + gamma) * self.norm(x) + beta
+
+
+ class AdainResBlk(nn.Layer):
+    def __init__(self, dim_in, dim_out, style_dim=64, w_hpf=0,
+                 actv=nn.LeakyReLU(0.2), upsample='none'):
+        super().__init__()
+        self.w_hpf = w_hpf
+        self.actv = actv
+        self.upsample = UpSample(upsample)
49
+ else:
50
+ raise RuntimeError('Got unexpected upsampletype %s, expected is [none, timepreserve, half]' % self.layer_type)
51
+
52
+
53
+ class ResBlk(nn.Layer):
54
+ def __init__(self, dim_in, dim_out, actv=nn.LeakyReLU(0.2),
55
+ normalize=False, downsample='none'):
56
+ super().__init__()
57
+ self.actv = actv
58
+ self.normalize = normalize
59
+ self.downsample = DownSample(downsample)
60
+ self.learned_sc = dim_in != dim_out
61
+ self._build_weights(dim_in, dim_out)
62
+
63
+ def _build_weights(self, dim_in, dim_out):
64
+ self.conv1 = nn.Conv2D(dim_in, dim_in, 3, 1, 1)
65
+ self.conv2 = nn.Conv2D(dim_in, dim_out, 3, 1, 1)
66
+ if self.normalize:
67
+ self.norm1 = nn.InstanceNorm2D(dim_in)
68
+ self.norm2 = nn.InstanceNorm2D(dim_in)
69
+ if self.learned_sc:
70
+ self.conv1x1 = nn.Conv2D(dim_in, dim_out, 1, 1, 0, bias_attr=False)
71
+
72
+ def _shortcut(self, x):
73
+ if self.learned_sc:
74
+ x = self.conv1x1(x)
75
+ if self.downsample:
76
+ x = self.downsample(x)
77
+ return x
78
+
79
+ def _residual(self, x):
80
+ if self.normalize:
81
+ x = self.norm1(x)
82
+ x = self.actv(x)
83
+ x = self.conv1(x)
84
+ x = self.downsample(x)
85
+ if self.normalize:
86
+ x = self.norm2(x)
87
+ x = self.actv(x)
88
+ x = self.conv2(x)
89
+ return x
90
+
91
+ def forward(self, x):
92
+ x = self._shortcut(x) + self._residual(x)
93
+ return x / math.sqrt(2) # unit variance
94
+
95
+ class AdaIN(nn.Layer):
96
+ def __init__(self, style_dim, num_features):
97
+ super().__init__()
98
+ self.norm = nn.InstanceNorm2D(num_features, weight_attr=False, bias_attr=False)
99
+ self.fc = nn.Linear(style_dim, num_features*2)
100
+
101
+ def forward(self, x, s):
102
+ if len(s.shape) == 1:
103
+ s = s[None]
104
+ h = self.fc(s)
105
+ h = h.reshape((h.shape[0], h.shape[1], 1, 1))
106
+ gamma, beta = paddle.split(h, 2, axis=1)
107
+ return (1 + gamma) * self.norm(x) + beta
108
+
109
+
110
+ class AdainResBlk(nn.Layer):
111
+ def __init__(self, dim_in, dim_out, style_dim=64, w_hpf=0,
112
+ actv=nn.LeakyReLU(0.2), upsample='none'):
113
+ super().__init__()
114
+ self.w_hpf = w_hpf
115
+ self.actv = actv
116
+ self.upsample = UpSample(upsample)
117
+ self.learned_sc = dim_in != dim_out
118
+ self._build_weights(dim_in, dim_out, style_dim)
119
+
120
+ def _build_weights(self, dim_in, dim_out, style_dim=64):
121
+ self.conv1 = nn.Conv2D(dim_in, dim_out, 3, 1, 1)
122
+ self.conv2 = nn.Conv2D(dim_out, dim_out, 3, 1, 1)
123
+ self.norm1 = AdaIN(style_dim, dim_in)
124
+ self.norm2 = AdaIN(style_dim, dim_out)
125
+ if self.learned_sc:
126
+ self.conv1x1 = nn.Conv2D(dim_in, dim_out, 1, 1, 0, bias_attr=False)
127
+
128
+ def _shortcut(self, x):
129
+ x = self.upsample(x)
130
+ if self.learned_sc:
131
+ x = self.conv1x1(x)
132
+ return x
133
+
134
+ def _residual(self, x, s):
135
+ x = self.norm1(x, s)
136
+ x = self.actv(x)
137
+ x = self.upsample(x)
138
+ x = self.conv1(x)
139
+ x = self.norm2(x, s)
140
+ x = self.actv(x)
141
+ x = self.conv2(x)
142
+ return x
143
+
144
+ def forward(self, x, s):
145
+ out = self._residual(x, s)
146
+ if self.w_hpf == 0:
147
+ out = (out + self._shortcut(x)) / math.sqrt(2)
148
+ return out
149
+
150
+
151
+ class HighPass(nn.Layer):
152
+ def __init__(self, w_hpf):
153
+ super(HighPass, self).__init__()
154
+ self.filter = paddle.to_tensor([[-1, -1, -1],
155
+ [-1, 8., -1],
156
+ [-1, -1, -1]]) / w_hpf
157
+
158
+ def forward(self, x):
159
+ filter = self.filter.unsqueeze(0).unsqueeze(1).tile([x.shape[1], 1, 1, 1])
160
+ return F.conv2d(x, filter, padding=1, groups=x.shape[1])
161
+
162
+
163
+ class Generator(nn.Layer):
164
+ def __init__(self, dim_in=48, style_dim=48, max_conv_dim=48*8, w_hpf=1, F0_channel=0):
165
+ super().__init__()
166
+
167
+ self.stem = nn.Conv2D(1, dim_in, 3, 1, 1)
168
+ self.encode = nn.LayerList()
169
+ self.decode = nn.LayerList()
170
+ self.to_out = nn.Sequential(
171
+ nn.InstanceNorm2D(dim_in),
172
+ nn.LeakyReLU(0.2),
173
+ nn.Conv2D(dim_in, 1, 1, 1, 0))
174
+ self.F0_channel = F0_channel
175
+ # down/up-sampling blocks
176
+ repeat_num = 4 #int(np.log2(img_size)) - 4
177
+ if w_hpf > 0:
178
+ repeat_num += 1
179
+
180
+ for lid in range(repeat_num):
181
+ if lid in [1, 3]:
182
+ _downtype = 'timepreserve'
183
+ else:
184
+ _downtype = 'half'
185
+
186
+ dim_out = min(dim_in*2, max_conv_dim)
187
+ self.encode.append(
188
+ ResBlk(dim_in, dim_out, normalize=True, downsample=_downtype))
189
+ (self.decode.insert if lid else lambda i, sublayer: self.decode.append(sublayer))(
190
+ 0, AdainResBlk(dim_out, dim_in, style_dim,
191
+ w_hpf=w_hpf, upsample=_downtype)) # stack-like
192
+ dim_in = dim_out
193
+
194
+ # bottleneck blocks (encoder)
195
+ for _ in range(2):
196
+ self.encode.append(
197
+ ResBlk(dim_out, dim_out, normalize=True))
198
+
199
+ # F0 blocks
200
+ if F0_channel != 0:
201
+ self.decode.insert(
202
+ 0, AdainResBlk(dim_out + int(F0_channel / 2), dim_out, style_dim, w_hpf=w_hpf))
203
+
204
+ # bottleneck blocks (decoder)
205
+ for _ in range(2):
206
+ self.decode.insert(
207
+ 0, AdainResBlk(dim_out + int(F0_channel / 2), dim_out + int(F0_channel / 2), style_dim, w_hpf=w_hpf))
208
+
209
+ if F0_channel != 0:
210
+ self.F0_conv = nn.Sequential(
211
+ ResBlk(F0_channel, int(F0_channel / 2), normalize=True, downsample="half"),
212
+ )
213
+
214
+
215
+ if w_hpf > 0:
216
+ self.hpf = HighPass(w_hpf)
217
+
218
+ def forward(self, x, s, masks=None, F0=None):
219
+ x = self.stem(x)
220
+ cache = {}
221
+ for block in self.encode:
222
+ if (masks is not None) and (x.shape[2] in [32, 64, 128]):
223
+ cache[x.shape[2]] = x
224
+ x = block(x)
225
+
226
+ if F0 is not None:
227
+ F0 = self.F0_conv(F0)
228
+ F0 = F.adaptive_avg_pool2d(F0, [x.shape[-2], x.shape[-1]])
229
+ x = paddle.concat([x, F0], axis=1)
230
+
231
+ for block in self.decode:
232
+ x = block(x, s)
233
+ if (masks is not None) and (x.shape[2] in [32, 64, 128]):
234
+ mask = masks[0] if x.shape[2] in [32] else masks[1]
235
+ mask = F.interpolate(mask, size=x.shape[2], mode='bilinear')
236
+ x = x + self.hpf(mask * cache[x.shape[2]])
237
+
238
+ return self.to_out(x)
239
+
240
+
241
+ class MappingNetwork(nn.Layer):
242
+ def __init__(self, latent_dim=16, style_dim=48, num_domains=2, hidden_dim=384):
243
+ super().__init__()
244
+ layers = []
245
+ layers += [nn.Linear(latent_dim, hidden_dim)]
246
+ layers += [nn.ReLU()]
247
+ for _ in range(3):
248
+ layers += [nn.Linear(hidden_dim, hidden_dim)]
249
+ layers += [nn.ReLU()]
250
+ self.shared = nn.Sequential(*layers)
251
+
252
+ self.unshared = nn.LayerList()
253
+ for _ in range(num_domains):
254
+ self.unshared.extend([nn.Sequential(nn.Linear(hidden_dim, hidden_dim),
255
+ nn.ReLU(),
256
+ nn.Linear(hidden_dim, hidden_dim),
257
+ nn.ReLU(),
258
+ nn.Linear(hidden_dim, hidden_dim),
259
+ nn.ReLU(),
260
+ nn.Linear(hidden_dim, style_dim))])
261
+
262
+ def forward(self, z, y):
263
+ h = self.shared(z)
264
+ out = []
265
+ for layer in self.unshared:
266
+ out += [layer(h)]
267
+ out = paddle.stack(out, axis=1) # (batch, num_domains, style_dim)
268
+ idx = paddle.arange(y.shape[0])
269
+ s = out[idx, y] # (batch, style_dim)
270
+ return s
271
+
272
+
273
+ class StyleEncoder(nn.Layer):
274
+ def __init__(self, dim_in=48, style_dim=48, num_domains=2, max_conv_dim=384):
275
+ super().__init__()
276
+ blocks = []
277
+ blocks += [nn.Conv2D(1, dim_in, 3, 1, 1)]
278
+
279
+ repeat_num = 4
280
+ for _ in range(repeat_num):
281
+ dim_out = min(dim_in*2, max_conv_dim)
282
+ blocks += [ResBlk(dim_in, dim_out, downsample='half')]
283
+ dim_in = dim_out
284
+
285
+ blocks += [nn.LeakyReLU(0.2)]
286
+ blocks += [nn.Conv2D(dim_out, dim_out, 5, 1, 0)]
287
+ blocks += [nn.AdaptiveAvgPool2D(1)]
288
+ blocks += [nn.LeakyReLU(0.2)]
289
+ self.shared = nn.Sequential(*blocks)
290
+
291
+ self.unshared = nn.LayerList()
292
+ for _ in range(num_domains):
293
+ self.unshared.append(nn.Linear(dim_out, style_dim))
294
+
295
+ def forward(self, x, y):
296
+ h = self.shared(x)
297
+
298
+ h = h.reshape((h.shape[0], -1))
299
+ out = []
300
+
301
+ for layer in self.unshared:
302
+ out += [layer(h)]
303
+
304
+ out = paddle.stack(out, axis=1) # (batch, num_domains, style_dim)
305
+ idx = paddle.arange(y.shape[0])
306
+ s = out[idx, y] # (batch, style_dim)
307
+ return s
308
+
309
+ class Discriminator(nn.Layer):
310
+ def __init__(self, dim_in=48, num_domains=2, max_conv_dim=384, repeat_num=4):
311
+ super().__init__()
312
+
313
+ # real/fake discriminator
314
+ self.dis = Discriminator2D(dim_in=dim_in, num_domains=num_domains,
315
+ max_conv_dim=max_conv_dim, repeat_num=repeat_num)
316
+ # adversarial classifier
317
+ self.cls = Discriminator2D(dim_in=dim_in, num_domains=num_domains,
318
+ max_conv_dim=max_conv_dim, repeat_num=repeat_num)
319
+ self.num_domains = num_domains
320
+
321
+ def forward(self, x, y):
322
+ return self.dis(x, y)
323
+
324
+ def classifier(self, x):
325
+ return self.cls.get_feature(x)
326
+
327
+
328
+ class LinearNorm(paddle.nn.Layer):
329
+ def __init__(self, in_dim, out_dim, bias=True, w_init_gain='linear'):
330
+ super(LinearNorm, self).__init__()
331
+ self.linear_layer = paddle.nn.Linear(in_dim, out_dim, bias_attr=bias)
332
+
333
+ if float('.'.join(paddle.__version__.split('.')[:2])) >= 2.3:
334
+ gain = paddle.nn.initializer.calculate_gain(w_init_gain)
335
+ paddle.nn.initializer.XavierUniform()(self.linear_layer.weight)
336
+ self.linear_layer.weight.set_value(gain*self.linear_layer.weight)
337
+
338
+ def forward(self, x):
339
+ return self.linear_layer(x)
340
+
341
+ class Discriminator2D(nn.Layer):
342
+ def __init__(self, dim_in=48, num_domains=2, max_conv_dim=384, repeat_num=4):
343
+ super().__init__()
344
+ blocks = []
345
+ blocks += [nn.Conv2D(1, dim_in, 3, 1, 1)]
346
+
347
+ for lid in range(repeat_num):
348
+ dim_out = min(dim_in*2, max_conv_dim)
349
+ blocks += [ResBlk(dim_in, dim_out, downsample='half')]
350
+ dim_in = dim_out
351
+
352
+ blocks += [nn.LeakyReLU(0.2)]
353
+ blocks += [nn.Conv2D(dim_out, dim_out, 5, 1, 0)]
354
+ blocks += [nn.LeakyReLU(0.2)]
355
+ blocks += [nn.AdaptiveAvgPool2D(1)]
356
+ blocks += [nn.Conv2D(dim_out, num_domains, 1, 1, 0)]
357
+ self.main = nn.Sequential(*blocks)
358
+
359
+ def get_feature(self, x):
360
+ out = self.main(x)
361
+ out = out.reshape((out.shape[0], -1)) # (batch, num_domains)
362
+ return out
363
+
364
+ def forward(self, x, y):
365
+ out = self.get_feature(x)
366
+ idx = paddle.arange(y.shape[0])
367
+ out = out[idx, y] # (batch)
368
+ return out
369
+
370
+
371
+ def build_model(args, F0_model, ASR_model):
372
+ generator = Generator(args.dim_in, args.style_dim, args.max_conv_dim, w_hpf=args.w_hpf, F0_channel=args.F0_channel)
373
+ mapping_network = MappingNetwork(args.latent_dim, args.style_dim, args.num_domains, hidden_dim=args.max_conv_dim)
374
+ style_encoder = StyleEncoder(args.dim_in, args.style_dim, args.num_domains, args.max_conv_dim)
375
+ discriminator = Discriminator(args.dim_in, args.num_domains, args.max_conv_dim, args.n_repeat)
376
+ generator_ema = copy.deepcopy(generator)
377
+ mapping_network_ema = copy.deepcopy(mapping_network)
378
+ style_encoder_ema = copy.deepcopy(style_encoder)
379
+
380
+ nets = Munch(generator=generator,
381
+ mapping_network=mapping_network,
382
+ style_encoder=style_encoder,
383
+ discriminator=discriminator,
384
+ f0_model=F0_model,
385
+ asr_model=ASR_model)
386
+
387
+ nets_ema = Munch(generator=generator_ema,
388
+ mapping_network=mapping_network_ema,
389
+ style_encoder=style_encoder_ema)
390
+
391
+ return nets, nets_ema
starganv2vc_paddle/optimizers.py ADDED
@@ -0,0 +1,80 @@
+ #coding:utf-8
+ import os, sys
+ import os.path as osp
+ import numpy as np
+ import paddle
+ from paddle import nn
+ from paddle.optimizer import Optimizer
+ from functools import reduce
+ from paddle.optimizer import AdamW
+
+ class MultiOptimizer:
+     def __init__(self, optimizers={}, schedulers={}):
+         self.optimizers = optimizers
+         self.schedulers = schedulers
+         self.keys = list(optimizers.keys())
+
+     def get_lr(self):
+         return max([self.optimizers[key].get_lr()
+                     for key in self.keys])
+
+     def state_dict(self):
+         state_dicts = [(key, self.optimizers[key].state_dict())
+                        for key in self.keys]
+         return state_dicts
+
+     def set_state_dict(self, state_dict):
+         for key, val in state_dict:
+             try:
+                 self.optimizers[key].set_state_dict(val)
+             except Exception:
+                 print("Unloaded %s" % key)
+
+     def step(self, key=None, scaler=None):
+         keys = [key] if key is not None else self.keys
+         _ = [self._step(key, scaler) for key in keys]
+
+     def _step(self, key, scaler=None):
+         if scaler is not None:
+             scaler.step(self.optimizers[key])
+             scaler.update()
+         else:
+             self.optimizers[key].step()
+
+     def clear_grad(self, key=None):
+         if key is not None:
+             self.optimizers[key].clear_grad()
+         else:
+             _ = [self.optimizers[key].clear_grad() for key in self.keys]
+
+     def scheduler(self, *args, key=None):
+         if key is not None:
+             self.schedulers[key].step(*args)
+         else:
+             _ = [self.schedulers[key].step(*args) for key in self.keys]
+
+ def define_scheduler(params):
+     print(params)
+     # scheduler = paddle.optim.lr_scheduler.OneCycleLR(
+     #     max_lr=params.get('max_lr', 2e-4),
+     #     epochs=params.get('epochs', 200),
+     #     steps_per_epoch=params.get('steps_per_epoch', 1000),
+     #     pct_start=params.get('pct_start', 0.0),
+     #     div_factor=1,
+     #     final_div_factor=1)
+     scheduler = paddle.optimizer.lr.CosineAnnealingDecay(
+         learning_rate=params.get('max_lr', 2e-4),
+         T_max=10)
+
+     return scheduler
+
+ def build_optimizer(parameters_dict, scheduler_params_dict):
+     schedulers = dict([(key, define_scheduler(params))
+                        for key, params in scheduler_params_dict.items()])
+
+     optim = dict([(key, AdamW(parameters=parameters_dict[key], learning_rate=sch, weight_decay=1e-4, beta1=0.1, beta2=0.99, epsilon=1e-9))
+                   for key, sch in schedulers.items()])
+
+     multi_optim = MultiOptimizer(optim, schedulers)
+     return multi_optim
starganv2vc_paddle/trainer.py ADDED
@@ -0,0 +1,276 @@
+ # -*- coding: utf-8 -*-
+
+ import os
+ import os.path as osp
+ import sys
+ import time
+ from collections import defaultdict
+
+ import numpy as np
+ import paddle
+ from paddle import nn
+ from PIL import Image
+ from tqdm import tqdm
+
+ from starganv2vc_paddle.losses import compute_d_loss, compute_g_loss
+
+ import logging
+ logger = logging.getLogger(__name__)
+ logger.setLevel(logging.DEBUG)
+
+ class Trainer(object):
+     def __init__(self,
+                  args,
+                  model=None,
+                  model_ema=None,
+                  optimizer=None,
+                  scheduler=None,
+                  config={},
+                  logger=logger,
+                  train_dataloader=None,
+                  val_dataloader=None,
+                  initial_steps=0,
+                  initial_epochs=0,
+                  fp16_run=False
+                  ):
+         self.args = args
+         self.steps = initial_steps
+         self.epochs = initial_epochs
+         self.model = model
+         self.model_ema = model_ema
+         self.optimizer = optimizer
+         self.scheduler = scheduler
+         self.train_dataloader = train_dataloader
+         self.val_dataloader = val_dataloader
+         self.config = config
+         self.finish_train = False
+         self.logger = logger
+         self.fp16_run = fp16_run
+
+     def save_checkpoint(self, checkpoint_path):
+         """Save checkpoint.
+         Args:
+             checkpoint_path (str): Checkpoint path to be saved.
+         """
+         state_dict = {
+             "optimizer": self.optimizer.state_dict(),
+             "steps": self.steps,
+             "epochs": self.epochs,
+             "model": {key: self.model[key].state_dict() for key in self.model}
+         }
+         if self.model_ema is not None:
+             state_dict['model_ema'] = {key: self.model_ema[key].state_dict() for key in self.model_ema}
+
+         if not os.path.exists(os.path.dirname(checkpoint_path)):
+             os.makedirs(os.path.dirname(checkpoint_path))
+         paddle.save(state_dict, checkpoint_path)
+
+     def load_checkpoint(self, checkpoint_path, load_only_params=False):
+         """Load checkpoint.
+
+         Args:
+             checkpoint_path (str): Checkpoint path to be loaded.
+             load_only_params (bool): Whether to load only model parameters.
+
+         """
+         state_dict = paddle.load(checkpoint_path)
+         if state_dict["model"] is not None:
+             for key in self.model:
+                 self._load(state_dict["model"][key], self.model[key])
+
+         if self.model_ema is not None:
+             for key in self.model_ema:
+                 self._load(state_dict["model_ema"][key], self.model_ema[key])
+
+         if not load_only_params:
+             self.steps = state_dict["steps"]
+             self.epochs = state_dict["epochs"]
+             self.optimizer.set_state_dict(state_dict["optimizer"])
+
+     def _load(self, states, model, force_load=True):
+         model_states = model.state_dict()
+         for key, val in states.items():
+             try:
+                 if key not in model_states:
+                     continue
+                 if isinstance(val, nn.Parameter):
+                     val = val.clone().detach()
+
+                 if val.shape != model_states[key].shape:
+                     self.logger.info("%s does not have same shape" % key)
+                     print(val.shape, model_states[key].shape)
+                     if not force_load:
+                         continue
+
+                     min_shape = np.minimum(np.array(val.shape), np.array(model_states[key].shape))
+                     slices = [slice(0, min_index) for min_index in min_shape]
+                     model_states[key][slices][:] = val[slices]
+                 else:
+                     model_states[key][:] = val
+             except Exception:
+                 self.logger.info("not exist: %s" % key)
+                 print("not exist", key)
+
+     @staticmethod
+     def get_gradient_norm(model):
+         total_norm = 0
+         for p in model.parameters():
+             param_norm = p.grad.norm(2)
+             total_norm += param_norm.item() ** 2
+
+         total_norm = np.sqrt(total_norm)
+         return total_norm
+
+     @staticmethod
+     def length_to_mask(lengths):
+         mask = paddle.arange(lengths.max()).unsqueeze(0).expand([lengths.shape[0], -1]).astype(lengths.dtype)
+         mask = paddle.greater_than(mask+1, lengths.unsqueeze(1))
+         return mask
+
+     def _get_lr(self):
+         return self.optimizer.get_lr()
+
+     @staticmethod
+     def moving_average(model, model_test, beta=0.999):
+         for param, param_test in zip(model.parameters(), model_test.parameters()):
+             param_test.set_value(param + beta * (param_test - param))
+
+     def _train_epoch(self):
+         """Train model one epoch."""
+         self.epochs += 1
+
+         train_losses = defaultdict(list)
+         _ = [self.model[k].train() for k in self.model]
+         scaler = paddle.amp.GradScaler() if self.fp16_run else None
+
+         use_con_reg = (self.epochs >= self.args.con_reg_epoch)
+         use_adv_cls = (self.epochs >= self.args.adv_cls_epoch)
+
+         for train_steps_per_epoch, batch in enumerate(tqdm(self.train_dataloader, desc="[train]"), 1):
+
+             ### load data
+             x_real, y_org, x_ref, x_ref2, y_trg, z_trg, z_trg2 = batch
+
+             # train the discriminator (by random reference)
+             self.optimizer.clear_grad()
+             if scaler is not None:
+                 with paddle.amp.auto_cast():
+                     d_loss, d_losses_latent = compute_d_loss(self.model, self.args.d_loss, x_real, y_org, y_trg, z_trg=z_trg, use_adv_cls=use_adv_cls, use_con_reg=use_con_reg)
+                 scaler.scale(d_loss).backward()
+             else:
+                 d_loss, d_losses_latent = compute_d_loss(self.model, self.args.d_loss, x_real, y_org, y_trg, z_trg=z_trg, use_adv_cls=use_adv_cls, use_con_reg=use_con_reg)
+                 d_loss.backward()
+             self.optimizer.step('discriminator', scaler=scaler)
+
+             # train the discriminator (by target reference)
+             self.optimizer.clear_grad()
+             if scaler is not None:
+                 with paddle.amp.auto_cast():
+                     d_loss, d_losses_ref = compute_d_loss(self.model, self.args.d_loss, x_real, y_org, y_trg, x_ref=x_ref, use_adv_cls=use_adv_cls, use_con_reg=use_con_reg)
+                 scaler.scale(d_loss).backward()
+             else:
+                 d_loss, d_losses_ref = compute_d_loss(self.model, self.args.d_loss, x_real, y_org, y_trg, x_ref=x_ref, use_adv_cls=use_adv_cls, use_con_reg=use_con_reg)
+                 d_loss.backward()
+
+             self.optimizer.step('discriminator', scaler=scaler)
+
+             # train the generator (by random reference)
+             self.optimizer.clear_grad()
+             if scaler is not None:
+                 with paddle.amp.auto_cast():
+                     g_loss, g_losses_latent = compute_g_loss(
+                         self.model, self.args.g_loss, x_real, y_org, y_trg, z_trgs=[z_trg, z_trg2], use_adv_cls=use_adv_cls)
+                 scaler.scale(g_loss).backward()
+             else:
+                 g_loss, g_losses_latent = compute_g_loss(
+                     self.model, self.args.g_loss, x_real, y_org, y_trg, z_trgs=[z_trg, z_trg2], use_adv_cls=use_adv_cls)
+                 g_loss.backward()
+
+             self.optimizer.step('generator', scaler=scaler)
+             self.optimizer.step('mapping_network', scaler=scaler)
+             self.optimizer.step('style_encoder', scaler=scaler)
+
+             # train the generator (by target reference)
+             self.optimizer.clear_grad()
+             if scaler is not None:
+                 with paddle.amp.auto_cast():
+                     g_loss, g_losses_ref = compute_g_loss(
+                         self.model, self.args.g_loss, x_real, y_org, y_trg, x_refs=[x_ref, x_ref2], use_adv_cls=use_adv_cls)
+                 scaler.scale(g_loss).backward()
+             else:
+                 g_loss, g_losses_ref = compute_g_loss(
+                     self.model, self.args.g_loss, x_real, y_org, y_trg, x_refs=[x_ref, x_ref2], use_adv_cls=use_adv_cls)
+                 g_loss.backward()
+             self.optimizer.step('generator', scaler=scaler)
+
+             # compute moving average of network parameters
+             self.moving_average(self.model.generator, self.model_ema.generator, beta=0.999)
+             self.moving_average(self.model.mapping_network, self.model_ema.mapping_network, beta=0.999)
+             self.moving_average(self.model.style_encoder, self.model_ema.style_encoder, beta=0.999)
+             self.optimizer.scheduler()
+
+             for key in d_losses_latent:
+                 train_losses["train/%s" % key].append(d_losses_latent[key])
+             for key in g_losses_latent:
+                 train_losses["train/%s" % key].append(g_losses_latent[key])
+
+         train_losses = {key: np.mean(value) for key, value in train_losses.items()}
+         return train_losses
+
+     @paddle.no_grad()
+     def _eval_epoch(self):
+         """Evaluate model one epoch."""
+         use_adv_cls = (self.epochs >= self.args.adv_cls_epoch)
+
+         eval_losses = defaultdict(list)
+         eval_images = defaultdict(list)
+         _ = [self.model[k].eval() for k in self.model]
+         for eval_steps_per_epoch, batch in enumerate(tqdm(self.val_dataloader, desc="[eval]"), 1):
+
+             ### load data
+             x_real, y_org, x_ref, x_ref2, y_trg, z_trg, z_trg2 = batch
+
+             # evaluate the discriminator losses
+             d_loss, d_losses_latent = compute_d_loss(
+                 self.model, self.args.d_loss, x_real, y_org, y_trg, z_trg=z_trg, use_r1_reg=False, use_adv_cls=use_adv_cls)
+             d_loss, d_losses_ref = compute_d_loss(
+                 self.model, self.args.d_loss, x_real, y_org, y_trg, x_ref=x_ref, use_r1_reg=False, use_adv_cls=use_adv_cls)
+
+             # evaluate the generator losses
+             g_loss, g_losses_latent = compute_g_loss(
+                 self.model, self.args.g_loss, x_real, y_org, y_trg, z_trgs=[z_trg, z_trg2], use_adv_cls=use_adv_cls)
+             g_loss, g_losses_ref = compute_g_loss(
+                 self.model, self.args.g_loss, x_real, y_org, y_trg, x_refs=[x_ref, x_ref2], use_adv_cls=use_adv_cls)
+
+             for key in d_losses_latent:
+                 eval_losses["eval/%s" % key].append(d_losses_latent[key])
+             for key in g_losses_latent:
+                 eval_losses["eval/%s" % key].append(g_losses_latent[key])
+
+             # if eval_steps_per_epoch % 10 == 0:
+             #     # generate x_fake
+             #     s_trg = self.model_ema.style_encoder(x_ref, y_trg)
+             #     F0 = self.model.f0_model.get_feature_GAN(x_real)
+             #     x_fake = self.model_ema.generator(x_real, s_trg, masks=None, F0=F0)
+             #     # generate x_recon
+             #     s_real = self.model_ema.style_encoder(x_real, y_org)
+             #     F0_fake = self.model.f0_model.get_feature_GAN(x_fake)
+             #     x_recon = self.model_ema.generator(x_fake, s_real, masks=None, F0=F0_fake)
+
+             #     eval_images['eval/image'].append(
+             #         ([x_real[0, 0].numpy(),
+             #           x_fake[0, 0].numpy(),
+             #           x_recon[0, 0].numpy()]))
+
+         eval_losses = {key: np.mean(value) for key, value in eval_losses.items()}
+         eval_losses.update(eval_images)
+         return eval_losses
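The EMA update in `Trainer.moving_average` above, `param + beta * (param_test - param)`, is algebraically the standard exponential moving average `beta*param_test + (1-beta)*param`: with `beta=0.999`, each step moves the EMA weight 0.1% of the way toward the live weight. A scalar sketch:

```python
def ema_update(param, param_test, beta=0.999):
    """One moving_average step on a single scalar weight."""
    # param + beta*(param_test - param)  ==  beta*param_test + (1-beta)*param
    return param + beta * (param_test - param)

p, p_ema = 1.0, 0.0          # live weight vs. its EMA shadow
p_ema = ema_update(p, p_ema)  # moves 0.1% of the gap: ~0.001
print(p_ema)
```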
starganv2vc_paddle/transforms.py ADDED
@@ -0,0 +1,120 @@
+ # -*- coding: utf-8 -*-
+
+ import numpy as np
+ import paddle
+ from paddle import nn
+ import paddle.nn.functional as F
+ import paddleaudio
+ import paddleaudio.functional as audio_F
+ import random
+
+ ## 1. RandomTimeStrech
+
+ class TimeStrech(nn.Layer):
+     def __init__(self, scale):
+         super(TimeStrech, self).__init__()
+         self.scale = scale
+
+     def forward(self, x):
+         mel_size = x.shape[-1]
+
+         x = F.interpolate(x, scale_factor=(1, self.scale), align_corners=False,
+                           mode='bilinear').squeeze()
+
+         if x.shape[-1] < mel_size:
+             noise_length = (mel_size - x.shape[-1])
+             random_pos = random.randint(0, x.shape[-1]) - noise_length
+             if random_pos < 0:
+                 random_pos = 0
+             noise = x[..., random_pos:random_pos + noise_length]
+             x = paddle.concat([x, noise], axis=-1)
+         else:
+             x = x[..., :mel_size]
+
+         return x.unsqueeze(1)
+
+ ## 2. PitchShift
+ class PitchShift(nn.Layer):
+     def __init__(self, shift):
+         super(PitchShift, self).__init__()
+         self.shift = shift
+
+     def forward(self, x):
+         if len(x.shape) == 2:
+             x = x.unsqueeze(0)
+         x = x.squeeze()
+         mel_size = x.shape[1]
+         shift_scale = (mel_size + self.shift) / mel_size
+         x = F.interpolate(x.unsqueeze(1), scale_factor=(shift_scale, 1.), align_corners=False,
+                           mode='bilinear').squeeze(1)
+
+         x = x[:, :mel_size]
+         if x.shape[1] < mel_size:
+             pad_size = mel_size - x.shape[1]
+             x = paddle.concat([x, paddle.zeros([x.shape[0], pad_size, x.shape[2]])], axis=1)
+         x = x.squeeze()
+         return x.unsqueeze(1)
+
+ ## 3. ShiftBias
+ class ShiftBias(nn.Layer):
+     def __init__(self, bias):
+         super(ShiftBias, self).__init__()
+         self.bias = bias
+
+     def forward(self, x):
+         return x + self.bias
+
+ ## 4. Scaling
+ class SpectScaling(nn.Layer):
+     def __init__(self, scale):
+         super(SpectScaling, self).__init__()
+         self.scale = scale
+
+     def forward(self, x):
+         return x * self.scale
+
+ ## 5. Time Flip
+ class TimeFlip(nn.Layer):
+     def __init__(self, length):
+         super(TimeFlip, self).__init__()
+         self.length = round(length)
+
+     def forward(self, x):
+         if self.length > 1:
+             start = np.random.randint(0, x.shape[-1] - self.length)
+             x_ret = x.clone()
+             x_ret[..., start:start + self.length] = paddle.flip(x[..., start:start + self.length], axis=[-1])
+             x = x_ret
+         return x
+
+ class PhaseShuffle2D(nn.Layer):
+     def __init__(self, n=2):
+         super(PhaseShuffle2D, self).__init__()
+         self.n = n
+         self.random = random.Random(1)
+
+     def forward(self, x, move=None):
+         # x.shape = (B, C, M, L)
+         if move is None:
+             move = self.random.randint(-self.n, self.n)
+
+         if move == 0:
+             return x
+         else:
+             left = x[:, :, :, :move]
+             right = x[:, :, :, move:]
+             shuffled = paddle.concat([right, left], axis=3)
+
+         return shuffled
+
+ def build_transforms():
+     transforms = [
+         lambda M: TimeStrech(1 + (np.random.random()-0.5)*M*0.2),
+         lambda M: SpectScaling(1 + (np.random.random()-1)*M*0.1),
+         lambda M: PhaseShuffle2D(192),
+     ]
+     N, M = len(transforms), np.random.random()
+     composed = nn.Sequential(
+         *[trans(M) for trans in np.random.choice(transforms, N)]
+     )
+     return composed
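`PhaseShuffle2D` above is a circular rotation of the last (time) axis by `move` frames: the concatenation of `x[..., move:]` and `x[..., :move]` rotates the frame order, for positive and negative `move` alike. A pure-Python sketch on a plain list standing in for the time axis (the helper name `phase_shuffle` is ours):

```python
def phase_shuffle(frames, move):
    """Rotate the time axis by `move` frames, as PhaseShuffle2D.forward does
    with paddle.concat([right, left], axis=3). Works for negative move too,
    via Python's negative-index slicing."""
    if move == 0:
        return frames
    return frames[move:] + frames[:move]

print(phase_shuffle([0, 1, 2, 3, 4], 2))   # [2, 3, 4, 0, 1]
print(phase_shuffle([0, 1, 2], -1))        # [2, 0, 1]
```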
test_arch.py ADDED
@@ -0,0 +1,65 @@
+ #!/usr/bin/env python3
+ #coding:utf-8
+
+ import os
+ import yaml
+ import paddle
+ import click
+ import warnings
+ warnings.simplefilter('ignore')
+
+ from munch import Munch
+
+ from starganv2vc_paddle.models import build_model
+
+ from starganv2vc_paddle.Utils.ASR.models import ASRCNN
+ from starganv2vc_paddle.Utils.JDC.model import JDCNet
+
+
+ @click.command()
+ @click.option('-p', '--config_path', default='Configs/config.yml', type=str)
+ def main(config_path):
+     config = yaml.safe_load(open(config_path))
+
+     # load ASR model
+     ASR_config = config.get('ASR_config', False)
+     with open(ASR_config) as f:
+         ASR_config = yaml.safe_load(f)
+     ASR_model_config = ASR_config['model_params']
+     ASR_model = ASRCNN(**ASR_model_config)
+     _ = ASR_model.eval()
+
+     # load F0 model
+     F0_model = JDCNet(num_class=1, seq_len=192)
+     _ = F0_model.eval()
+
+     # build model
+     _, model_ema = build_model(Munch(config['model_params']), F0_model, ASR_model)
+
+     asr_input = paddle.randn([4, 80, 192])
+     print('ASR model input:', asr_input.shape, 'output:', ASR_model(asr_input).shape)
+     mel_input = paddle.randn([4, 1, 192, 512])
+     print('F0 model input:', mel_input.shape, 'output:', [t.shape for t in F0_model(mel_input)])
+
+     _ = [v.eval() for v in model_ema.values()]
+     label = paddle.to_tensor([0, 1, 2, 3], dtype=paddle.int64)
+     latent_dim = model_ema.mapping_network.shared[0].weight.shape[0]
+     latent_style = paddle.randn([4, latent_dim])
+     ref = model_ema.mapping_network(latent_style, label)
+     mel_input2 = paddle.randn([4, 1, 192, 512])
+     style_ref = model_ema.style_encoder(mel_input2, label)
+     print('StarGANv2-VC encoder inputs:', mel_input2.shape, 'output:', style_ref.shape, 'should have the same shape as the ref:', ref.shape)
+     f0_feat = F0_model.get_feature_GAN(mel_input)
+     out = model_ema.generator(mel_input, style_ref, F0=f0_feat)
+     print('StarGANv2-VC inputs:', label.shape, latent_style.shape, mel_input.shape, 'output:', out.shape)
+
+     paddle.save({k: v.state_dict() for k, v in model_ema.items()}, 'test_arch.pd')
+     file_size = os.path.getsize('test_arch.pd') / float(1024*1024)
+     print(f'Main models occupied {file_size:.2f} MB')
+     os.remove('test_arch.pd')
+
+     return 0
+
+ if __name__ == "__main__":
+     main()
train.py ADDED
@@ -0,0 +1,149 @@
+ #!/usr/bin/env python3
+ #coding:utf-8
+
+ import os
+ import os.path as osp
+ import re
+ import sys
+ import yaml
+ import shutil
+ import numpy as np
+ import paddle
+ import click
+ import warnings
+ warnings.simplefilter('ignore')
+
+ from functools import reduce
+ from munch import Munch
+
+ from starganv2vc_paddle.meldataset import build_dataloader
+ from starganv2vc_paddle.optimizers import build_optimizer
+ from starganv2vc_paddle.models import build_model
+ from starganv2vc_paddle.trainer import Trainer
+ from visualdl import LogWriter
+
+ from starganv2vc_paddle.Utils.ASR.models import ASRCNN
+ from starganv2vc_paddle.Utils.JDC.model import JDCNet
+
+ import logging
+ from logging import StreamHandler
+ logger = logging.getLogger(__name__)
+ logger.setLevel(logging.DEBUG)
+ handler = StreamHandler()
+ handler.setLevel(logging.DEBUG)
+ logger.addHandler(handler)
+
+
+ @click.command()
+ @click.option('-p', '--config_path', default='Configs/config.yml', type=str)
+ def main(config_path):
+     config = yaml.safe_load(open(config_path))
+
+     log_dir = config['log_dir']
+     if not osp.exists(log_dir): os.makedirs(log_dir, exist_ok=True)
+     shutil.copy(config_path, osp.join(log_dir, osp.basename(config_path)))
+     writer = LogWriter(log_dir + "/visualdl")
+
+     # write logs
+     file_handler = logging.FileHandler(osp.join(log_dir, 'train.log'))
+     file_handler.setLevel(logging.DEBUG)
+     file_handler.setFormatter(logging.Formatter('%(levelname)s:%(asctime)s: %(message)s'))
+     logger.addHandler(file_handler)
+
+     batch_size = config.get('batch_size', 10)
+     epochs = config.get('epochs', 1000)
+     save_freq = config.get('save_freq', 20)
+     train_path = config.get('train_data', None)
+     val_path = config.get('val_data', None)
+     stage = config.get('stage', 'star')
+     fp16_run = config.get('fp16_run', False)
+
+     # load data
+     train_list, val_list = get_data_path_list(train_path, val_path)
+     train_dataloader = build_dataloader(train_list,
+                                         batch_size=batch_size,
+                                         num_workers=4)
+     val_dataloader = build_dataloader(val_list,
+                                       batch_size=batch_size,
+                                       validation=True,
+                                       num_workers=2)
+
+     # load pretrained ASR model
+     ASR_config = config.get('ASR_config', False)
+     ASR_path = config.get('ASR_path', False)
75
+ with open(ASR_config) as f:
76
+ ASR_config = yaml.safe_load(f)
77
+ ASR_model_config = ASR_config['model_params']
78
+ ASR_model = ASRCNN(**ASR_model_config)
79
+ params = paddle.load(ASR_path)['model']
80
+ ASR_model.set_state_dict(params)
81
+ _ = ASR_model.eval()
82
+
83
+ # load pretrained F0 model
84
+ F0_path = config.get('F0_path', False)
85
+ F0_model = JDCNet(num_class=1, seq_len=192)
86
+ params = paddle.load(F0_path)['net']
87
+ F0_model.set_state_dict(params)
88
+
89
+ # build model
90
+ model, model_ema = build_model(Munch(config['model_params']), F0_model, ASR_model)
91
+
92
+ scheduler_params = {
93
+ "max_lr": float(config['optimizer_params'].get('lr', 2e-4)),
94
+ "pct_start": float(config['optimizer_params'].get('pct_start', 0.0)),
95
+ "epochs": epochs,
96
+ "steps_per_epoch": len(train_dataloader),
97
+ }
98
+
99
+ scheduler_params_dict = {key: scheduler_params.copy() for key in model}
100
+ scheduler_params_dict['mapping_network']['max_lr'] = 2e-6
101
+ optimizer = build_optimizer({key: model[key].parameters() for key in model},
102
+ scheduler_params_dict=scheduler_params_dict)
103
+
104
+ trainer = Trainer(args=Munch(config['loss_params']), model=model,
105
+ model_ema=model_ema,
106
+ optimizer=optimizer,
107
+ train_dataloader=train_dataloader,
108
+ val_dataloader=val_dataloader,
109
+ logger=logger,
110
+ fp16_run=fp16_run)
111
+
112
+ if config.get('pretrained_model', '') != '':
113
+ trainer.load_checkpoint(config['pretrained_model'],
114
+ load_only_params=config.get('load_only_params', True))
115
+
116
+ for _ in range(1, epochs+1):
117
+ epoch = trainer.epochs
118
+ train_results = trainer._train_epoch()
119
+ eval_results = trainer._eval_epoch()
120
+ results = train_results.copy()
121
+ results.update(eval_results)
122
+ logger.info('--- epoch %d ---' % epoch)
123
+ for key, value in results.items():
124
+ if isinstance(value, float):
125
+ logger.info('%-15s: %.4f' % (key, value))
126
+ writer.add_scalar(key, value, epoch)
127
+ else:
128
+ for v in value:
129
+ writer.add_histogram('eval_spec', v, epoch)
130
+ if (epoch % save_freq) == 0:
131
+ trainer.save_checkpoint(osp.join(log_dir, 'epoch_%05d.pd' % epoch))
132
+
133
+ return 0
134
+
135
+ def get_data_path_list(train_path=None, val_path=None):
136
+ if train_path is None:
137
+ train_path = "Data/train_list.txt"
138
+ if val_path is None:
139
+ val_path = "Data/val_list.txt"
140
+
141
+ with open(train_path, 'r') as f:
142
+ train_list = f.readlines()
143
+ with open(val_path, 'r') as f:
144
+ val_list = f.readlines()
145
+
146
+ return train_list, val_list
147
+
148
+ if __name__=="__main__":
149
+ main()
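One detail of train.py worth noting: `scheduler_params_dict` is built with `dict.copy()` per module, so each module gets an independent parameter dict and lowering `mapping_network`'s `max_lr` leaves the others untouched. A minimal sketch of that pattern (the module names here are illustrative, not the repo's exact model keys):

```python
# Shared scheduler defaults, as built from the config in train.py.
scheduler_params = {"max_lr": 1e-4, "epochs": 150}

# One independent shallow copy per module; names are illustrative.
modules = ["generator", "mapping_network", "style_encoder"]
scheduler_params_dict = {key: scheduler_params.copy() for key in modules}

# Override only the mapping network's learning rate.
scheduler_params_dict["mapping_network"]["max_lr"] = 2e-6

print(scheduler_params_dict["generator"]["max_lr"])        # unchanged: 0.0001
print(scheduler_params_dict["mapping_network"]["max_lr"])  # 2e-06
```

Without the `.copy()`, all keys would share one dict and the override would silently change every module's learning rate.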