alibabasglab committed (verified)
Commit 46d5c4e · 1 Parent(s): c2c8f3c

Upload 6 files

checkpoints/.DS_Store ADDED
Binary file (6.15 kB).
 
checkpoints/log_YGD_gesture_seg_3spk/config.yaml ADDED
@@ -0,0 +1,40 @@
+ ## Config file
+
+ # Log
+ seed: 777
+ use_cuda: 1 # 1 for True, 0 for False
+
+ # dataset
+ speaker_no: 3
+ mix_lst_path: ./data/YGD/mixture_data_list_3mix.csv
+ audio_direc: /mnt/nas_sg/wulanchabu/zexu.pan/datasets/gesture_TED/audio_clean/
+ reference_direc: /mnt/nas_sg/wulanchabu/zexu.pan/datasets/gesture_TED/visual/gesture_embedding/
+ audio_sr: 16000
+ ref_sr: 15
+
+ # dataloader
+ num_workers: 4
+ batch_size: 4 # two GPU training with a total effective batch size of 16
+ accu_grad: 1
+ effec_batch_size: 16 # per GPU, only used if accu_grad is set to 1, must be multiple times of batch size
+ max_length: 10 # truncate the utterances in dataloader, in seconds
+
+ # network settings
+ init_from: None # 'None' or a log name 'log_2024-07-22(18:12:13)'
+ causal: 0 # 1 for True, 0 for False
+ network_reference:
+   cue: gesture # lip or speech or gesture or EEG
+ network_audio:
+   backbone: seg
+   N: 256
+   L: 40
+   B: 64
+   H: 128
+   K: 100
+   R: 6
+
+ # optimizer
+ loss_type: sisdr # "snr", "sisdr", "hybrid"
+ init_learning_rate: 0.0005
+ max_epoch: 200
+ clip_grad_norm: 5
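
For reference, the block below is a minimal sketch of how a config file like the one above could be loaded into a namespace for training. It uses plain PyYAML rather than the yamlargparse parser that the training log further down suggests the project actually uses, and the `load_config` helper name is illustrative, not part of this repository.

```python
# Minimal sketch (not the repository's entry point): parse the uploaded
# config.yaml with PyYAML and expose top-level keys as attributes.
import yaml
from types import SimpleNamespace

def load_config(path="checkpoints/log_YGD_gesture_seg_3spk/config.yaml"):
    with open(path) as f:
        cfg = yaml.safe_load(f)      # nested sections stay plain dicts
    return SimpleNamespace(**cfg)    # attribute access: cfg.batch_size, cfg.loss_type, ...

cfg = load_config()
print(cfg.speaker_no, cfg.audio_sr)  # 3 16000
print(cfg.network_audio)             # {'backbone': 'seg', 'N': 256, 'L': 40, 'B': 64, 'H': 128, 'K': 100, 'R': 6}
```

With accu_grad enabled, the effective batch size of 16 is presumably reached by gradient accumulation on top of the per-step batch_size of 4, rather than by a larger per-step batch.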
checkpoints/log_YGD_gesture_seg_3spk/last_best_checkpoint.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8728490c9875ca6c0d34c35529c6080582406eabd69f9ebf4930d730500a4c1a
+ size 53036661
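
The .pt entries in this commit are Git LFS pointer files (version line, sha256 oid, byte size), not the ~53 MB checkpoints themselves; the actual weights are fetched with `git lfs pull`. As a rough sketch of how such a checkpoint might be inspected once downloaded (nothing about its internal layout is assumed here):

```python
# Sketch only: inspect the separation checkpoint after `git lfs pull` has
# replaced the pointer file with the real ~53 MB payload.
import torch

ckpt = torch.load(
    "checkpoints/log_YGD_gesture_seg_3spk/last_best_checkpoint.pt",
    map_location="cpu",   # no GPU needed just to look at the contents
)
if isinstance(ckpt, dict):
    for key in ckpt:
        print(key)        # e.g. model/optimizer state dicts, epoch counters, ...
else:
    print(type(ckpt))
```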
checkpoints/log_YGD_gesture_seg_3spk/last_checkpoint.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d9085aece16f5634a9e8b4a55cd5e57b14bd3e67818db83fa2d3049da5b3deb6
+ size 53036661
checkpoints/log_YGD_gesture_seg_3spk/log_2024-09-26(17:14:29).txt ADDED
@@ -0,0 +1,181 @@
+ ## Config file
+
+ # Log
+ seed: 777
+ use_cuda: 1 # 1 for True, 0 for False
+
+ # dataset
+ speaker_no: 3
+ mix_lst_path: ./data/YGD/mixture_data_list_3mix.csv
+ audio_direc: /mnt/nas_sg/mit_sg/zexu.pan/datasets/gesture_TED/audio_clean/
+ reference_direc: /mnt/nas_sg/mit_sg/zexu.pan/datasets/gesture_TED/visual/gesture_embedding/
+ audio_sr: 16000
+ visual_sr: 15
+
+ # dataloader
+ num_workers: 4
+ batch_size: 4 # two GPU training with a total effective batch size of 16
+ accu_grad: 1
+ effec_batch_size: 16 # per GPU, only used if accu_grad is set to 1, must be multiple times of batch size
+ max_length: 10 # truncate the utterances in dataloader, in seconds
+
+ # network settings
+ init_from: None # 'None' or a log name 'log_2024-07-22(18:12:13)'
+ causal: 0 # 1 for True, 0 for False
+ network_reference:
+   cue: gesture # lip or speech or gesture or EEG
+ network_audio:
+   backbone: seg
+   N: 256
+   L: 40
+   B: 64
+   H: 128
+   K: 100
+   R: 6
+
+ # optimizer
+ loss_type: sisdr # "snr", "sisdr", "hybrid"
+ init_learning_rate: 0.0005
+ max_epoch: 200
+ clip_grad_norm: 5
+ WARNING:torch.distributed.run:
+ *****************************************
+ Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
+ *****************************************
+ started on checkpoints/log_2024-09-26(17:14:29)
+
+ namespace(seed=777, use_cuda=1, config=[<yamlargparse.Path object at 0x7f1e36ef0ee0>], checkpoint_dir='checkpoints/log_2024-09-26(17:14:29)', train_from_last_checkpoint=0, loss_type='sisdr', init_learning_rate=0.0005, max_epoch=200, clip_grad_norm=5.0, batch_size=4, accu_grad=1, effec_batch_size=16, max_length=10, num_workers=4, causal=0, network_reference=namespace(cue='gesture'), network_audio=namespace(backbone='seg', N=256, L=40, B=64, H=128, K=100, R=6), init_from='None', mix_lst_path='./data/YGD/mixture_data_list_3mix.csv', audio_direc='/mnt/nas_sg/mit_sg/zexu.pan/datasets/gesture_TED/audio_clean/', reference_direc='/mnt/nas_sg/mit_sg/zexu.pan/datasets/gesture_TED/visual/gesture_embedding/', speaker_no=3, audio_sr=16000, visual_sr=15, local_rank=0, distributed=True, world_size=2, device=device(type='cuda'))
+ network_wrapper(
+   (sep_network): seg(
+     (encoder): Encoder(
+       (conv1d_U): Conv1d(1, 256, kernel_size=(40,), stride=(20,), bias=False)
+     )
+     (separator): rnn(
+       (layer_norm): GroupNorm(1, 256, eps=1e-08, affine=True)
+       (bottleneck_conv1x1): Conv1d(256, 64, kernel_size=(1,), stride=(1,), bias=False)
+       (dual_rnn): ModuleList(
+         (0-5): 6 x Dual_RNN_Block(
+           (intra_rnn): LSTM(64, 128, batch_first=True, bidirectional=True)
+           (inter_rnn): LSTM(64, 128, batch_first=True, bidirectional=True)
+           (intra_norm): GroupNorm(1, 64, eps=1e-08, affine=True)
+           (inter_norm): GroupNorm(1, 64, eps=1e-08, affine=True)
+           (intra_linear): Linear(in_features=256, out_features=64, bias=True)
+           (inter_linear): Linear(in_features=256, out_features=64, bias=True)
+         )
+       )
+       (prelu): PReLU(num_parameters=1)
+       (mask_conv1x1): Conv1d(64, 256, kernel_size=(1,), stride=(1,), bias=False)
+       (visual_net): LSTM(30, 128, num_layers=5, batch_first=True, dropout=0.3, bidirectional=True)
+       (av_conv): Conv1d(320, 64, kernel_size=(1,), stride=(1,), bias=False)
+     )
+     (decoder): Decoder(
+       (basis_signals): Linear(in_features=256, out_features=40, bias=False)
+     )
+   )
+ )
+
+ Total number of parameters: 4401921
+
+
+ Total number of trainable parameters: 4401921
+
+ Start new training from scratch
+ Could not load symbol cublasGetSmCountTarget from libcublas.so.11. Error: /mnt/nas_sg/mit_sg/zexu.pan/applications/miniconda3/envs/ss/lib/python3.9/site-packages/torch/lib/../../../../libcublas.so.11: undefined symbol: cublasGetSmCountTarget
+ [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
+ Could not load symbol cublasGetSmCountTarget from libcublas.so.11. Error: /mnt/nas_sg/mit_sg/zexu.pan/applications/miniconda3/envs/ss/lib/python3.9/site-packages/torch/lib/../../../../libcublas.so.11: undefined symbol: cublasGetSmCountTarget
+ [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
+ Train Summary | End of Epoch 1 | Time 8980.03s | Train Loss 2.753
+ Valid Summary | End of Epoch 1 | Time 80.87s | Valid Loss 2.009
+ Test Summary | End of Epoch 1 | Time 58.72s | Test Loss 2.009
+ Fund new best model, dict saved
+ Train Summary | End of Epoch 2 | Time 9114.77s | Train Loss 1.592
+ Valid Summary | End of Epoch 2 | Time 81.36s | Valid Loss 1.080
+ Test Summary | End of Epoch 2 | Time 48.11s | Test Loss 1.080
+ Fund new best model, dict saved
+ Train Summary | End of Epoch 3 | Time 8942.33s | Train Loss 0.577
+ Valid Summary | End of Epoch 3 | Time 78.41s | Valid Loss 0.390
+ Test Summary | End of Epoch 3 | Time 51.14s | Test Loss 0.390
+ Fund new best model, dict saved
+ Train Summary | End of Epoch 4 | Time 8948.55s | Train Loss -0.289
+ Valid Summary | End of Epoch 4 | Time 85.34s | Valid Loss -0.136
+ Test Summary | End of Epoch 4 | Time 50.37s | Test Loss -0.136
+ Fund new best model, dict saved
+ Train Summary | End of Epoch 5 | Time 8968.08s | Train Loss -0.978
+ Valid Summary | End of Epoch 5 | Time 82.28s | Valid Loss -0.504
+ Test Summary | End of Epoch 5 | Time 50.15s | Test Loss -0.504
+ Fund new best model, dict saved
+ Train Summary | End of Epoch 6 | Time 8948.92s | Train Loss -1.597
+ Valid Summary | End of Epoch 6 | Time 93.25s | Valid Loss -0.469
+ Test Summary | End of Epoch 6 | Time 54.30s | Test Loss -0.469
+ Train Summary | End of Epoch 7 | Time 9007.14s | Train Loss -2.154
+ Valid Summary | End of Epoch 7 | Time 95.28s | Valid Loss -0.870
+ Test Summary | End of Epoch 7 | Time 53.72s | Test Loss -0.870
+ Fund new best model, dict saved
+ Train Summary | End of Epoch 8 | Time 8878.79s | Train Loss -2.667
+ Valid Summary | End of Epoch 8 | Time 103.83s | Valid Loss -0.692
+ Test Summary | End of Epoch 8 | Time 51.33s | Test Loss -0.692
+ Train Summary | End of Epoch 9 | Time 8925.52s | Train Loss -3.144
+ Valid Summary | End of Epoch 9 | Time 103.27s | Valid Loss -0.810
+ Test Summary | End of Epoch 9 | Time 51.32s | Test Loss -0.810
+ Train Summary | End of Epoch 10 | Time 8940.42s | Train Loss -3.612
+ Valid Summary | End of Epoch 10 | Time 112.05s | Valid Loss -1.067
+ Test Summary | End of Epoch 10 | Time 54.05s | Test Loss -1.067
+ Fund new best model, dict saved
+ Train Summary | End of Epoch 11 | Time 8990.64s | Train Loss -4.026
+ Valid Summary | End of Epoch 11 | Time 107.08s | Valid Loss -0.921
+ Test Summary | End of Epoch 11 | Time 54.22s | Test Loss -0.921
+ Train Summary | End of Epoch 12 | Time 8973.45s | Train Loss -4.425
+ Valid Summary | End of Epoch 12 | Time 112.20s | Valid Loss -0.684
+ Test Summary | End of Epoch 12 | Time 52.61s | Test Loss -0.684
+ Train Summary | End of Epoch 13 | Time 8975.87s | Train Loss -4.821
+ Valid Summary | End of Epoch 13 | Time 108.89s | Valid Loss -0.950
+ Test Summary | End of Epoch 13 | Time 53.24s | Test Loss -0.950
+ Train Summary | End of Epoch 14 | Time 9038.47s | Train Loss -5.181
+ Valid Summary | End of Epoch 14 | Time 106.35s | Valid Loss -0.874
+ Test Summary | End of Epoch 14 | Time 54.29s | Test Loss -0.874
+ Train Summary | End of Epoch 15 | Time 9017.49s | Train Loss -5.511
+ Valid Summary | End of Epoch 15 | Time 111.92s | Valid Loss -1.061
+ Test Summary | End of Epoch 15 | Time 51.66s | Test Loss -1.061
+ reload weights and optimizer from last best checkpoint
+ Learning rate adjusted to: 0.000250
+ Train Summary | End of Epoch 16 | Time 9008.05s | Train Loss -4.883
+ Valid Summary | End of Epoch 16 | Time 117.85s | Valid Loss -1.140
+ Test Summary | End of Epoch 16 | Time 56.46s | Test Loss -1.140
+ Fund new best model, dict saved
+ Train Summary | End of Epoch 17 | Time 9021.77s | Train Loss -5.422
+ Valid Summary | End of Epoch 17 | Time 100.85s | Valid Loss -0.781
+ Test Summary | End of Epoch 17 | Time 53.69s | Test Loss -0.781
+ Train Summary | End of Epoch 18 | Time 9041.29s | Train Loss -5.799
+ Valid Summary | End of Epoch 18 | Time 103.86s | Valid Loss -0.656
+ Test Summary | End of Epoch 18 | Time 52.03s | Test Loss -0.656
+ Train Summary | End of Epoch 19 | Time 9018.85s | Train Loss -6.112
+ Valid Summary | End of Epoch 19 | Time 100.50s | Valid Loss -0.758
+ Test Summary | End of Epoch 19 | Time 51.89s | Test Loss -0.758
+ Train Summary | End of Epoch 20 | Time 9036.90s | Train Loss -6.391
+ Valid Summary | End of Epoch 20 | Time 105.47s | Valid Loss -0.480
+ Test Summary | End of Epoch 20 | Time 55.45s | Test Loss -0.480
+ Train Summary | End of Epoch 21 | Time 9015.76s | Train Loss -6.648
+ Valid Summary | End of Epoch 21 | Time 99.11s | Valid Loss -0.962
+ Test Summary | End of Epoch 21 | Time 51.96s | Test Loss -0.962
+ reload weights and optimizer from last best checkpoint
+ Learning rate adjusted to: 0.000125
+ Train Summary | End of Epoch 22 | Time 9050.62s | Train Loss -5.796
+ Valid Summary | End of Epoch 22 | Time 101.54s | Valid Loss -0.840
+ Test Summary | End of Epoch 22 | Time 56.05s | Test Loss -0.840
+ Train Summary | End of Epoch 23 | Time 9014.94s | Train Loss -6.156
+ Valid Summary | End of Epoch 23 | Time 104.00s | Valid Loss -0.683
+ Test Summary | End of Epoch 23 | Time 51.18s | Test Loss -0.683
+ Train Summary | End of Epoch 24 | Time 9046.79s | Train Loss -6.420
+ Valid Summary | End of Epoch 24 | Time 106.82s | Valid Loss -0.820
+ Test Summary | End of Epoch 24 | Time 51.79s | Test Loss -0.820
+ Train Summary | End of Epoch 25 | Time 9018.59s | Train Loss -6.645
+ Valid Summary | End of Epoch 25 | Time 98.66s | Valid Loss -0.555
+ Test Summary | End of Epoch 25 | Time 55.48s | Test Loss -0.555
+ Train Summary | End of Epoch 26 | Time 9076.83s | Train Loss -6.849
+ Valid Summary | End of Epoch 26 | Time 102.13s | Valid Loss -0.612
+ Test Summary | End of Epoch 26 | Time 54.27s | Test Loss -0.612
+ No imporvement for 10 epochs, early stopping.
+ Start evaluation
+ Avg SISNR:i tensor([4.8980], device='cuda:0')
+ Avg SNRi: 5.567126948555473
+ Avg STOIi: 0.09230865127134166
checkpoints/log_YGD_gesture_seg_3spk/tensorboard/events.out.tfevents.1727342114.bach-gpu011024008016.na620.62208.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:71582401494f98c65bf1dde18ddec260e7dbd45a93ee45b068564b7f0e5219d1
+ size 3788
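
As with the checkpoints, the events file is stored via Git LFS. Once fetched, the logged scalars can be read back with TensorBoard's EventAccumulator; the sketch below is generic, and the scalar tag names are whatever the training script actually wrote, which this commit does not reveal.

```python
# Sketch: read scalar curves from the TensorBoard event file above
# (after `git lfs pull`). Tag names are not assumed; they are listed first.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

logdir = "checkpoints/log_YGD_gesture_seg_3spk/tensorboard"
ea = EventAccumulator(logdir)
ea.Reload()                                  # parse events.out.tfevents.*
print(ea.Tags()["scalars"])                  # discover which scalars were logged
for tag in ea.Tags()["scalars"]:
    points = ea.Scalars(tag)                 # each point has .wall_time, .step, .value
    print(tag, [(p.step, round(p.value, 3)) for p in points[:5]])
```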