Upload 5 files
- log_YGD_gesture_seg_3spk/config.yaml +40 -0
- log_YGD_gesture_seg_3spk/last_best_checkpoint.pt +3 -0
- log_YGD_gesture_seg_3spk/last_checkpoint.pt +3 -0
- log_YGD_gesture_seg_3spk/log_2024-09-26(17:14:29).txt +181 -0
- log_YGD_gesture_seg_3spk/tensorboard/events.out.tfevents.1727342114.bach-gpu011024008016.na620.62208.0 +3 -0
log_YGD_gesture_seg_3spk/config.yaml
ADDED
@@ -0,0 +1,40 @@
+## Config file
+
+# Log
+seed: 777
+use_cuda: 1 # 1 for True, 0 for False
+
+# dataset
+speaker_no: 3
+mix_lst_path: ./data/YGD/mixture_data_list_3mix.csv
+audio_direc: /mnt/nas_sg/wulanchabu/zexu.pan/datasets/gesture_TED/audio_clean/
+reference_direc: /mnt/nas_sg/wulanchabu/zexu.pan/datasets/gesture_TED/visual/gesture_embedding/
+audio_sr: 16000
+ref_sr: 15
+
+# dataloader
+num_workers: 4
+batch_size: 4 # two GPU training with a total effective batch size of 16
+accu_grad: 1
+effec_batch_size: 16 # per GPU, only used if accu_grad is set to 1, must be multiple times of batch size
+max_length: 10 # truncate the utterances in dataloader, in seconds
+
+# network settings
+init_from: None # 'None' or a log name 'log_2024-07-22(18:12:13)'
+causal: 0 # 1 for True, 0 for False
+network_reference:
+  cue: gesture # lip or speech or gesture or EEG
+network_audio:
+  backbone: seg
+  N: 256
+  L: 40
+  B: 64
+  H: 128
+  K: 100
+  R: 6
+
+# optimizer
+loss_type: sisdr # "snr", "sisdr", "hybrid"
+init_learning_rate: 0.0005
+max_epoch: 200
+clip_grad_norm: 5
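The values above are plain YAML, so they can be read back with a short script. Below is a minimal, hypothetical loader using PyYAML (the training log further down shows the actual script parses the file with yamlargparse); the attribute-style wrapper is only an illustration, not the project's API.

```python
# Minimal sketch: load the YAML config above into a namespace-like object.
# Assumes PyYAML is installed; the repo's training script uses yamlargparse instead.
from types import SimpleNamespace
import yaml

def load_config(path="log_YGD_gesture_seg_3spk/config.yaml"):
    with open(path) as f:
        cfg = yaml.safe_load(f)  # nested dicts for network_reference / network_audio

    def wrap(d):
        # Wrap nested dicts so values read as cfg.network_audio.N, cfg.network_reference.cue, ...
        return SimpleNamespace(**{k: wrap(v) if isinstance(v, dict) else v
                                  for k, v in d.items()})

    return wrap(cfg)

if __name__ == "__main__":
    cfg = load_config()
    print(cfg.speaker_no, cfg.network_audio.N, cfg.loss_type)  # 3 256 sisdr
```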
log_YGD_gesture_seg_3spk/last_best_checkpoint.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8728490c9875ca6c0d34c35529c6080582406eabd69f9ebf4930d730500a4c1a
+size 53036661
log_YGD_gesture_seg_3spk/last_checkpoint.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d9085aece16f5634a9e8b4a55cd5e57b14bd3e67818db83fa2d3049da5b3deb6
+size 53036661
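Both checkpoint files are committed as Git LFS pointers: the three lines in each diff are the pointer text, not the weights (~53 MB each). A minimal way to inspect a checkpoint after `git lfs pull` is sketched below; the dict layout printed is whatever the training script saved, so no key names are assumed.

```python
# Sketch: inspect a downloaded checkpoint (run `git lfs pull` first so the .pt
# file contains real weights rather than the LFS pointer text above).
import torch

ckpt = torch.load("log_YGD_gesture_seg_3spk/last_best_checkpoint.pt",
                  map_location="cpu")

# The exact contents depend on the training script, so just report what is there.
if isinstance(ckpt, dict):
    for key, value in ckpt.items():
        print(key, type(value).__name__)
else:
    print(type(ckpt).__name__)
```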
log_YGD_gesture_seg_3spk/log_2024-09-26(17:14:29).txt
ADDED
@@ -0,0 +1,181 @@
+## Config file
+
+# Log
+seed: 777
+use_cuda: 1 # 1 for True, 0 for False
+
+# dataset
+speaker_no: 3
+mix_lst_path: ./data/YGD/mixture_data_list_3mix.csv
+audio_direc: /mnt/nas_sg/mit_sg/zexu.pan/datasets/gesture_TED/audio_clean/
+reference_direc: /mnt/nas_sg/mit_sg/zexu.pan/datasets/gesture_TED/visual/gesture_embedding/
+audio_sr: 16000
+visual_sr: 15
+
+# dataloader
+num_workers: 4
+batch_size: 4 # two GPU training with a total effective batch size of 16
+accu_grad: 1
+effec_batch_size: 16 # per GPU, only used if accu_grad is set to 1, must be multiple times of batch size
+max_length: 10 # truncate the utterances in dataloader, in seconds
+
+# network settings
+init_from: None # 'None' or a log name 'log_2024-07-22(18:12:13)'
+causal: 0 # 1 for True, 0 for False
+network_reference:
+  cue: gesture # lip or speech or gesture or EEG
+network_audio:
+  backbone: seg
+  N: 256
+  L: 40
+  B: 64
+  H: 128
+  K: 100
+  R: 6
+
+# optimizer
+loss_type: sisdr # "snr", "sisdr", "hybrid"
+init_learning_rate: 0.0005
+max_epoch: 200
+clip_grad_norm: 5
+WARNING:torch.distributed.run:
+*****************************************
+Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
+*****************************************
+started on checkpoints/log_2024-09-26(17:14:29)
+
+namespace(seed=777, use_cuda=1, config=[<yamlargparse.Path object at 0x7f1e36ef0ee0>], checkpoint_dir='checkpoints/log_2024-09-26(17:14:29)', train_from_last_checkpoint=0, loss_type='sisdr', init_learning_rate=0.0005, max_epoch=200, clip_grad_norm=5.0, batch_size=4, accu_grad=1, effec_batch_size=16, max_length=10, num_workers=4, causal=0, network_reference=namespace(cue='gesture'), network_audio=namespace(backbone='seg', N=256, L=40, B=64, H=128, K=100, R=6), init_from='None', mix_lst_path='./data/YGD/mixture_data_list_3mix.csv', audio_direc='/mnt/nas_sg/mit_sg/zexu.pan/datasets/gesture_TED/audio_clean/', reference_direc='/mnt/nas_sg/mit_sg/zexu.pan/datasets/gesture_TED/visual/gesture_embedding/', speaker_no=3, audio_sr=16000, visual_sr=15, local_rank=0, distributed=True, world_size=2, device=device(type='cuda'))
+network_wrapper(
+  (sep_network): seg(
+    (encoder): Encoder(
+      (conv1d_U): Conv1d(1, 256, kernel_size=(40,), stride=(20,), bias=False)
+    )
+    (separator): rnn(
+      (layer_norm): GroupNorm(1, 256, eps=1e-08, affine=True)
+      (bottleneck_conv1x1): Conv1d(256, 64, kernel_size=(1,), stride=(1,), bias=False)
+      (dual_rnn): ModuleList(
+        (0-5): 6 x Dual_RNN_Block(
+          (intra_rnn): LSTM(64, 128, batch_first=True, bidirectional=True)
+          (inter_rnn): LSTM(64, 128, batch_first=True, bidirectional=True)
+          (intra_norm): GroupNorm(1, 64, eps=1e-08, affine=True)
+          (inter_norm): GroupNorm(1, 64, eps=1e-08, affine=True)
+          (intra_linear): Linear(in_features=256, out_features=64, bias=True)
+          (inter_linear): Linear(in_features=256, out_features=64, bias=True)
+        )
+      )
+      (prelu): PReLU(num_parameters=1)
+      (mask_conv1x1): Conv1d(64, 256, kernel_size=(1,), stride=(1,), bias=False)
+      (visual_net): LSTM(30, 128, num_layers=5, batch_first=True, dropout=0.3, bidirectional=True)
+      (av_conv): Conv1d(320, 64, kernel_size=(1,), stride=(1,), bias=False)
+    )
+    (decoder): Decoder(
+      (basis_signals): Linear(in_features=256, out_features=40, bias=False)
+    )
+  )
+)
+
+Total number of parameters: 4401921
+
+
+Total number of trainable parameters: 4401921
+
+Start new training from scratch
+Could not load symbol cublasGetSmCountTarget from libcublas.so.11. Error: /mnt/nas_sg/mit_sg/zexu.pan/applications/miniconda3/envs/ss/lib/python3.9/site-packages/torch/lib/../../../../libcublas.so.11: undefined symbol: cublasGetSmCountTarget
+[W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
+Could not load symbol cublasGetSmCountTarget from libcublas.so.11. Error: /mnt/nas_sg/mit_sg/zexu.pan/applications/miniconda3/envs/ss/lib/python3.9/site-packages/torch/lib/../../../../libcublas.so.11: undefined symbol: cublasGetSmCountTarget
+[W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
+Train Summary | End of Epoch 1 | Time 8980.03s | Train Loss 2.753
+Valid Summary | End of Epoch 1 | Time 80.87s | Valid Loss 2.009
+Test Summary | End of Epoch 1 | Time 58.72s | Test Loss 2.009
+Fund new best model, dict saved
+Train Summary | End of Epoch 2 | Time 9114.77s | Train Loss 1.592
+Valid Summary | End of Epoch 2 | Time 81.36s | Valid Loss 1.080
+Test Summary | End of Epoch 2 | Time 48.11s | Test Loss 1.080
+Fund new best model, dict saved
+Train Summary | End of Epoch 3 | Time 8942.33s | Train Loss 0.577
+Valid Summary | End of Epoch 3 | Time 78.41s | Valid Loss 0.390
+Test Summary | End of Epoch 3 | Time 51.14s | Test Loss 0.390
+Fund new best model, dict saved
+Train Summary | End of Epoch 4 | Time 8948.55s | Train Loss -0.289
+Valid Summary | End of Epoch 4 | Time 85.34s | Valid Loss -0.136
+Test Summary | End of Epoch 4 | Time 50.37s | Test Loss -0.136
+Fund new best model, dict saved
+Train Summary | End of Epoch 5 | Time 8968.08s | Train Loss -0.978
+Valid Summary | End of Epoch 5 | Time 82.28s | Valid Loss -0.504
+Test Summary | End of Epoch 5 | Time 50.15s | Test Loss -0.504
+Fund new best model, dict saved
+Train Summary | End of Epoch 6 | Time 8948.92s | Train Loss -1.597
+Valid Summary | End of Epoch 6 | Time 93.25s | Valid Loss -0.469
+Test Summary | End of Epoch 6 | Time 54.30s | Test Loss -0.469
+Train Summary | End of Epoch 7 | Time 9007.14s | Train Loss -2.154
+Valid Summary | End of Epoch 7 | Time 95.28s | Valid Loss -0.870
+Test Summary | End of Epoch 7 | Time 53.72s | Test Loss -0.870
+Fund new best model, dict saved
+Train Summary | End of Epoch 8 | Time 8878.79s | Train Loss -2.667
+Valid Summary | End of Epoch 8 | Time 103.83s | Valid Loss -0.692
+Test Summary | End of Epoch 8 | Time 51.33s | Test Loss -0.692
+Train Summary | End of Epoch 9 | Time 8925.52s | Train Loss -3.144
+Valid Summary | End of Epoch 9 | Time 103.27s | Valid Loss -0.810
+Test Summary | End of Epoch 9 | Time 51.32s | Test Loss -0.810
+Train Summary | End of Epoch 10 | Time 8940.42s | Train Loss -3.612
+Valid Summary | End of Epoch 10 | Time 112.05s | Valid Loss -1.067
+Test Summary | End of Epoch 10 | Time 54.05s | Test Loss -1.067
+Fund new best model, dict saved
+Train Summary | End of Epoch 11 | Time 8990.64s | Train Loss -4.026
+Valid Summary | End of Epoch 11 | Time 107.08s | Valid Loss -0.921
+Test Summary | End of Epoch 11 | Time 54.22s | Test Loss -0.921
+Train Summary | End of Epoch 12 | Time 8973.45s | Train Loss -4.425
+Valid Summary | End of Epoch 12 | Time 112.20s | Valid Loss -0.684
+Test Summary | End of Epoch 12 | Time 52.61s | Test Loss -0.684
+Train Summary | End of Epoch 13 | Time 8975.87s | Train Loss -4.821
+Valid Summary | End of Epoch 13 | Time 108.89s | Valid Loss -0.950
+Test Summary | End of Epoch 13 | Time 53.24s | Test Loss -0.950
+Train Summary | End of Epoch 14 | Time 9038.47s | Train Loss -5.181
+Valid Summary | End of Epoch 14 | Time 106.35s | Valid Loss -0.874
+Test Summary | End of Epoch 14 | Time 54.29s | Test Loss -0.874
+Train Summary | End of Epoch 15 | Time 9017.49s | Train Loss -5.511
+Valid Summary | End of Epoch 15 | Time 111.92s | Valid Loss -1.061
+Test Summary | End of Epoch 15 | Time 51.66s | Test Loss -1.061
+reload weights and optimizer from last best checkpoint
+Learning rate adjusted to: 0.000250
+Train Summary | End of Epoch 16 | Time 9008.05s | Train Loss -4.883
+Valid Summary | End of Epoch 16 | Time 117.85s | Valid Loss -1.140
+Test Summary | End of Epoch 16 | Time 56.46s | Test Loss -1.140
+Fund new best model, dict saved
+Train Summary | End of Epoch 17 | Time 9021.77s | Train Loss -5.422
+Valid Summary | End of Epoch 17 | Time 100.85s | Valid Loss -0.781
+Test Summary | End of Epoch 17 | Time 53.69s | Test Loss -0.781
+Train Summary | End of Epoch 18 | Time 9041.29s | Train Loss -5.799
+Valid Summary | End of Epoch 18 | Time 103.86s | Valid Loss -0.656
+Test Summary | End of Epoch 18 | Time 52.03s | Test Loss -0.656
+Train Summary | End of Epoch 19 | Time 9018.85s | Train Loss -6.112
+Valid Summary | End of Epoch 19 | Time 100.50s | Valid Loss -0.758
+Test Summary | End of Epoch 19 | Time 51.89s | Test Loss -0.758
+Train Summary | End of Epoch 20 | Time 9036.90s | Train Loss -6.391
+Valid Summary | End of Epoch 20 | Time 105.47s | Valid Loss -0.480
+Test Summary | End of Epoch 20 | Time 55.45s | Test Loss -0.480
+Train Summary | End of Epoch 21 | Time 9015.76s | Train Loss -6.648
+Valid Summary | End of Epoch 21 | Time 99.11s | Valid Loss -0.962
+Test Summary | End of Epoch 21 | Time 51.96s | Test Loss -0.962
+reload weights and optimizer from last best checkpoint
+Learning rate adjusted to: 0.000125
+Train Summary | End of Epoch 22 | Time 9050.62s | Train Loss -5.796
+Valid Summary | End of Epoch 22 | Time 101.54s | Valid Loss -0.840
+Test Summary | End of Epoch 22 | Time 56.05s | Test Loss -0.840
+Train Summary | End of Epoch 23 | Time 9014.94s | Train Loss -6.156
+Valid Summary | End of Epoch 23 | Time 104.00s | Valid Loss -0.683
+Test Summary | End of Epoch 23 | Time 51.18s | Test Loss -0.683
+Train Summary | End of Epoch 24 | Time 9046.79s | Train Loss -6.420
+Valid Summary | End of Epoch 24 | Time 106.82s | Valid Loss -0.820
+Test Summary | End of Epoch 24 | Time 51.79s | Test Loss -0.820
+Train Summary | End of Epoch 25 | Time 9018.59s | Train Loss -6.645
+Valid Summary | End of Epoch 25 | Time 98.66s | Valid Loss -0.555
+Test Summary | End of Epoch 25 | Time 55.48s | Test Loss -0.555
+Train Summary | End of Epoch 26 | Time 9076.83s | Train Loss -6.849
+Valid Summary | End of Epoch 26 | Time 102.13s | Valid Loss -0.612
+Test Summary | End of Epoch 26 | Time 54.27s | Test Loss -0.612
+No imporvement for 10 epochs, early stopping.
+Start evaluation
+Avg SISNR:i tensor([4.8980], device='cuda:0')
+Avg SNRi: 5.567126948555473
+Avg STOIi: 0.09230865127134166
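The log shows the training schedule in action: when the validation loss stops improving, the run reloads the last best checkpoint and halves the learning rate (0.0005 → 0.000250 → 0.000125), and it stops after 10 epochs without a new best. A rough sketch of that control flow is below; the halving patience of 5 epochs is inferred from this run, and all five callables are hypothetical stand-ins for the repo's own training and checkpoint helpers (log wording, including its typos, is copied verbatim).

```python
def train_loop(train_one_epoch, validate, save_best, restore_best, set_lr,
               init_lr=0.0005, max_epoch=200, lr_patience=5, stop_patience=10):
    # Sketch of the schedule visible in the log above, not the repo's actual code.
    best_val, since_best, lr = float("inf"), 0, init_lr
    for epoch in range(1, max_epoch + 1):
        train_one_epoch(epoch)
        val_loss = validate(epoch)
        if val_loss < best_val:
            best_val, since_best = val_loss, 0
            save_best()
            print("Fund new best model, dict saved")      # wording as in the log
        else:
            since_best += 1
        if since_best == lr_patience:
            restore_best()        # "reload weights and optimizer from last best checkpoint"
            lr *= 0.5
            set_lr(lr)
            print(f"Learning rate adjusted to: {lr:.6f}")
        if since_best >= stop_patience:
            print(f"No imporvement for {stop_patience} epochs, early stopping.")
            break
```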
log_YGD_gesture_seg_3spk/tensorboard/events.out.tfevents.1727342114.bach-gpu011024008016.na620.62208.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:71582401494f98c65bf1dde18ddec260e7dbd45a93ee45b068564b7f0e5219d1
+size 3788
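The events file is also an LFS pointer. Assuming the tensorboard Python package is installed and the file has been pulled, the logged scalars can be read back without launching the TensorBoard UI; the sketch below uses the public EventAccumulator reader, and the tag names it prints are whatever the training script logged (none are assumed here).

```python
# Sketch: list the scalar curves stored in the events file (after `git lfs pull`).
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

ea = EventAccumulator("log_YGD_gesture_seg_3spk/tensorboard")
ea.Reload()                      # parse the events.out.tfevents.* file(s)
for tag in ea.Tags()["scalars"]:
    events = ea.Scalars(tag)     # each event has .step, .wall_time, .value
    print(tag, [(e.step, round(e.value, 3)) for e in events])
```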