dobbersc committed · Commit a544e36 · verified · 1 Parent(s): c89209e

Add remaining log
models/en2de/word_word2vec_embeddings_without_attention/log.txt CHANGED
@@ -0,0 +1,183 @@
+ 2024-07-29 13:39:12,882 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 13:39:12,882 Training Model
+ 2024-07-29 13:39:12,882 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 13:39:12,882 Translator(
+   (encoder): EncoderLSTM(
+     (embedding): Embedding(14303, 300, padding_idx=14298)
+     (dropout): Dropout(p=0.1, inplace=False)
+     (lstm): LSTM(300, 512, batch_first=True, bidirectional=True)
+   )
+   (decoder): DecoderLSTM(
+     (embedding): Embedding(22834, 300, padding_idx=22829)
+     (dropout): Dropout(p=0.1, inplace=False)
+     (lstm): LSTM(300, 1024, batch_first=True)
+     (hidden2vocab): Linear(in_features=1024, out_features=22834, bias=True)
+     (log_softmax): LogSoftmax(dim=-1)
+   )
+ )
+ 2024-07-29 13:39:12,882 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 13:39:12,882 Training Hyperparameters:
+ 2024-07-29 13:39:12,882 - max_epochs: 10
+ 2024-07-29 13:39:12,882 - learning_rate: 0.001
+ 2024-07-29 13:39:12,882 - batch_size: 128
+ 2024-07-29 13:39:12,882 - patience: 5
+ 2024-07-29 13:39:12,882 - scheduler_patience: 3
+ 2024-07-29 13:39:12,882 - teacher_forcing_ratio: 0.5
+ 2024-07-29 13:39:12,882 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 13:39:12,882 Computational Parameters:
+ 2024-07-29 13:39:12,882 - num_workers: 4
+ 2024-07-29 13:39:12,882 - device: device(type='cuda', index=0)
+ 2024-07-29 13:39:12,882 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 13:39:12,882 Dataset Splits:
+ 2024-07-29 13:39:12,882 - train: 133623 data points
+ 2024-07-29 13:39:12,883 - dev: 19090 data points
+ 2024-07-29 13:39:12,883 - test: 38179 data points
+ 2024-07-29 13:39:12,883 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 13:39:12,883 EPOCH 1
+ 2024-07-29 13:40:05,065 batch 104/1044 - loss 6.44635533 - lr 0.0010 - time 52.18s
+ 2024-07-29 13:40:54,623 batch 208/1044 - loss 6.12962256 - lr 0.0010 - time 101.74s
+ 2024-07-29 13:41:43,929 batch 312/1044 - loss 5.93635781 - lr 0.0010 - time 151.05s
+ 2024-07-29 13:42:31,843 batch 416/1044 - loss 5.79898563 - lr 0.0010 - time 198.96s
+ 2024-07-29 13:43:16,643 batch 520/1044 - loss 5.68921758 - lr 0.0010 - time 243.76s
+ 2024-07-29 13:44:06,699 batch 624/1044 - loss 5.59986271 - lr 0.0010 - time 293.82s
+ 2024-07-29 13:44:54,263 batch 728/1044 - loss 5.52647748 - lr 0.0010 - time 341.38s
+ 2024-07-29 13:45:42,455 batch 832/1044 - loss 5.46140777 - lr 0.0010 - time 389.57s
+ 2024-07-29 13:46:30,596 batch 936/1044 - loss 5.40276114 - lr 0.0010 - time 437.71s
+ 2024-07-29 13:47:19,411 batch 1040/1044 - loss 5.35238057 - lr 0.0010 - time 486.53s
+ 2024-07-29 13:47:21,172 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 13:47:21,173 EPOCH 1 DONE
+ 2024-07-29 13:47:34,037 TRAIN Loss: 5.3503
+ 2024-07-29 13:47:34,037 DEV Loss: 5.6631
+ 2024-07-29 13:47:34,037 DEV Perplexity: 288.0468
+ 2024-07-29 13:47:34,037 New best score!
+ 2024-07-29 13:47:34,038 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 13:47:34,038 EPOCH 2
+ 2024-07-29 13:48:19,442 batch 104/1044 - loss 4.75314385 - lr 0.0010 - time 45.40s
+ 2024-07-29 13:49:09,391 batch 208/1044 - loss 4.72889426 - lr 0.0010 - time 95.35s
+ 2024-07-29 13:50:00,631 batch 312/1044 - loss 4.71117107 - lr 0.0010 - time 146.59s
+ 2024-07-29 13:50:52,513 batch 416/1044 - loss 4.70529533 - lr 0.0010 - time 198.47s
+ 2024-07-29 13:51:41,603 batch 520/1044 - loss 4.68753760 - lr 0.0010 - time 247.56s
+ 2024-07-29 13:52:32,194 batch 624/1044 - loss 4.67994702 - lr 0.0010 - time 298.16s
+ 2024-07-29 13:53:18,653 batch 728/1044 - loss 4.67035193 - lr 0.0010 - time 344.61s
+ 2024-07-29 13:54:08,061 batch 832/1044 - loss 4.65910866 - lr 0.0010 - time 394.02s
+ 2024-07-29 13:54:55,509 batch 936/1044 - loss 4.64630038 - lr 0.0010 - time 441.47s
+ 2024-07-29 13:55:41,903 batch 1040/1044 - loss 4.63526248 - lr 0.0010 - time 487.87s
+ 2024-07-29 13:55:44,152 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 13:55:44,153 EPOCH 2 DONE
+ 2024-07-29 13:55:57,276 TRAIN Loss: 4.6352
+ 2024-07-29 13:55:57,276 DEV Loss: 5.3919
+ 2024-07-29 13:55:57,276 DEV Perplexity: 219.6304
+ 2024-07-29 13:55:57,277 New best score!
+ 2024-07-29 13:55:57,278 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 13:55:57,278 EPOCH 3
+ 2024-07-29 13:56:44,945 batch 104/1044 - loss 4.32507801 - lr 0.0010 - time 47.67s
+ 2024-07-29 13:57:33,185 batch 208/1044 - loss 4.34560976 - lr 0.0010 - time 95.91s
+ 2024-07-29 13:58:20,859 batch 312/1044 - loss 4.35252990 - lr 0.0010 - time 143.58s
+ 2024-07-29 13:59:11,009 batch 416/1044 - loss 4.34997149 - lr 0.0010 - time 193.73s
+ 2024-07-29 13:59:59,141 batch 520/1044 - loss 4.34624985 - lr 0.0010 - time 241.86s
+ 2024-07-29 14:00:45,968 batch 624/1044 - loss 4.34042853 - lr 0.0010 - time 288.69s
+ 2024-07-29 14:01:35,933 batch 728/1044 - loss 4.33867311 - lr 0.0010 - time 338.66s
+ 2024-07-29 14:02:24,301 batch 832/1044 - loss 4.33741166 - lr 0.0010 - time 387.02s
+ 2024-07-29 14:03:12,821 batch 936/1044 - loss 4.33510092 - lr 0.0010 - time 435.54s
+ 2024-07-29 14:04:06,006 batch 1040/1044 - loss 4.33377559 - lr 0.0010 - time 488.73s
+ 2024-07-29 14:04:07,768 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 14:04:07,769 EPOCH 3 DONE
+ 2024-07-29 14:04:20,667 TRAIN Loss: 4.3334
+ 2024-07-29 14:04:20,668 DEV Loss: 5.3567
+ 2024-07-29 14:04:20,668 DEV Perplexity: 212.0222
+ 2024-07-29 14:04:20,668 New best score!
+ 2024-07-29 14:04:20,669 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 14:04:20,669 EPOCH 4
+ 2024-07-29 14:05:12,412 batch 104/1044 - loss 4.10891296 - lr 0.0010 - time 51.74s
+ 2024-07-29 14:05:56,410 batch 208/1044 - loss 4.11914104 - lr 0.0010 - time 95.74s
+ 2024-07-29 14:06:44,425 batch 312/1044 - loss 4.11985568 - lr 0.0010 - time 143.76s
+ 2024-07-29 14:07:34,889 batch 416/1044 - loss 4.12697194 - lr 0.0010 - time 194.22s
+ 2024-07-29 14:08:25,932 batch 520/1044 - loss 4.12800773 - lr 0.0010 - time 245.26s
+ 2024-07-29 14:09:14,874 batch 624/1044 - loss 4.13386618 - lr 0.0010 - time 294.20s
+ 2024-07-29 14:10:02,875 batch 728/1044 - loss 4.13324166 - lr 0.0010 - time 342.21s
+ 2024-07-29 14:10:51,328 batch 832/1044 - loss 4.13408759 - lr 0.0010 - time 390.66s
+ 2024-07-29 14:11:40,322 batch 936/1044 - loss 4.13373322 - lr 0.0010 - time 439.65s
+ 2024-07-29 14:12:26,532 batch 1040/1044 - loss 4.13221874 - lr 0.0010 - time 485.86s
+ 2024-07-29 14:12:28,774 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 14:12:28,775 EPOCH 4 DONE
+ 2024-07-29 14:12:41,718 TRAIN Loss: 4.1323
+ 2024-07-29 14:12:41,719 DEV Loss: 5.3784
+ 2024-07-29 14:12:41,719 DEV Perplexity: 216.6765
+ 2024-07-29 14:12:41,719 No improvement for 1 epoch(s)
+ 2024-07-29 14:12:41,719 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 14:12:41,719 EPOCH 5
+ 2024-07-29 14:13:29,471 batch 104/1044 - loss 3.94119389 - lr 0.0010 - time 47.75s
+ 2024-07-29 14:14:17,243 batch 208/1044 - loss 3.94173051 - lr 0.0010 - time 95.52s
+ 2024-07-29 14:15:07,687 batch 312/1044 - loss 3.95529621 - lr 0.0010 - time 145.97s
+ 2024-07-29 14:15:56,554 batch 416/1044 - loss 3.96082954 - lr 0.0010 - time 194.83s
+ 2024-07-29 14:16:44,483 batch 520/1044 - loss 3.97343106 - lr 0.0010 - time 242.76s
+ 2024-07-29 14:17:32,350 batch 624/1044 - loss 3.97887683 - lr 0.0010 - time 290.63s
+ 2024-07-29 14:18:22,605 batch 728/1044 - loss 3.98052316 - lr 0.0010 - time 340.89s
+ 2024-07-29 14:19:11,900 batch 832/1044 - loss 3.98124155 - lr 0.0010 - time 390.18s
+ 2024-07-29 14:20:00,077 batch 936/1044 - loss 3.98377551 - lr 0.0010 - time 438.36s
+ 2024-07-29 14:20:50,266 batch 1040/1044 - loss 3.98429667 - lr 0.0010 - time 488.55s
+ 2024-07-29 14:20:52,438 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 14:20:52,439 EPOCH 5 DONE
+ 2024-07-29 14:21:05,495 TRAIN Loss: 3.9845
+ 2024-07-29 14:21:05,496 DEV Loss: 5.3922
+ 2024-07-29 14:21:05,496 DEV Perplexity: 219.6797
+ 2024-07-29 14:21:05,496 No improvement for 2 epoch(s)
+ 2024-07-29 14:21:05,496 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 14:21:05,496 EPOCH 6
+ 2024-07-29 14:21:52,704 batch 104/1044 - loss 3.79832284 - lr 0.0010 - time 47.21s
+ 2024-07-29 14:22:41,571 batch 208/1044 - loss 3.80723104 - lr 0.0010 - time 96.08s
+ 2024-07-29 14:23:30,438 batch 312/1044 - loss 3.81664630 - lr 0.0010 - time 144.94s
+ 2024-07-29 14:24:17,826 batch 416/1044 - loss 3.81986886 - lr 0.0010 - time 192.33s
+ 2024-07-29 14:25:06,881 batch 520/1044 - loss 3.82626900 - lr 0.0010 - time 241.39s
+ 2024-07-29 14:25:52,624 batch 624/1044 - loss 3.83213730 - lr 0.0010 - time 287.13s
+ 2024-07-29 14:26:43,197 batch 728/1044 - loss 3.84186704 - lr 0.0010 - time 337.70s
+ 2024-07-29 14:27:33,111 batch 832/1044 - loss 3.84894093 - lr 0.0010 - time 387.62s
+ 2024-07-29 14:28:23,137 batch 936/1044 - loss 3.85404221 - lr 0.0010 - time 437.64s
+ 2024-07-29 14:29:12,179 batch 1040/1044 - loss 3.85523346 - lr 0.0010 - time 486.68s
+ 2024-07-29 14:29:13,750 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 14:29:13,751 EPOCH 6 DONE
+ 2024-07-29 14:29:26,691 TRAIN Loss: 3.8555
+ 2024-07-29 14:29:26,691 DEV Loss: 5.3926
+ 2024-07-29 14:29:26,691 DEV Perplexity: 219.7775
+ 2024-07-29 14:29:26,691 No improvement for 3 epoch(s)
+ 2024-07-29 14:29:26,691 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 14:29:26,691 EPOCH 7
+ 2024-07-29 14:30:18,839 batch 104/1044 - loss 3.69032830 - lr 0.0010 - time 52.15s
+ 2024-07-29 14:31:06,645 batch 208/1044 - loss 3.69084198 - lr 0.0010 - time 99.95s
+ 2024-07-29 14:31:57,776 batch 312/1044 - loss 3.68775470 - lr 0.0010 - time 151.09s
+ 2024-07-29 14:32:42,614 batch 416/1044 - loss 3.69797963 - lr 0.0010 - time 195.92s
+ 2024-07-29 14:33:31,288 batch 520/1044 - loss 3.70423176 - lr 0.0010 - time 244.60s
+ 2024-07-29 14:34:19,800 batch 624/1044 - loss 3.71455426 - lr 0.0010 - time 293.11s
+ 2024-07-29 14:35:07,090 batch 728/1044 - loss 3.72059076 - lr 0.0010 - time 340.40s
+ 2024-07-29 14:35:57,356 batch 832/1044 - loss 3.72933647 - lr 0.0010 - time 390.67s
+ 2024-07-29 14:36:48,582 batch 936/1044 - loss 3.73725470 - lr 0.0010 - time 441.89s
+ 2024-07-29 14:37:36,154 batch 1040/1044 - loss 3.74537229 - lr 0.0010 - time 489.46s
+ 2024-07-29 14:37:38,563 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 14:37:38,564 EPOCH 7 DONE
+ 2024-07-29 14:37:51,548 TRAIN Loss: 3.7451
+ 2024-07-29 14:37:51,548 DEV Loss: 5.3882
+ 2024-07-29 14:37:51,548 DEV Perplexity: 218.8039
+ 2024-07-29 14:37:51,549 No improvement for 4 epoch(s)
+ 2024-07-29 14:37:51,549 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 14:37:51,549 EPOCH 8
+ 2024-07-29 14:38:39,248 batch 104/1044 - loss 3.55457444 - lr 0.0001 - time 47.70s
+ 2024-07-29 14:39:27,858 batch 208/1044 - loss 3.55389387 - lr 0.0001 - time 96.31s
+ 2024-07-29 14:40:20,207 batch 312/1044 - loss 3.53512270 - lr 0.0001 - time 148.66s
+ 2024-07-29 14:41:12,977 batch 416/1044 - loss 3.52695438 - lr 0.0001 - time 201.43s
+ 2024-07-29 14:42:01,643 batch 520/1044 - loss 3.51997281 - lr 0.0001 - time 250.09s
+ 2024-07-29 14:42:47,457 batch 624/1044 - loss 3.51754866 - lr 0.0001 - time 295.91s
+ 2024-07-29 14:43:37,541 batch 728/1044 - loss 3.51377626 - lr 0.0001 - time 345.99s
+ 2024-07-29 14:44:24,756 batch 832/1044 - loss 3.51298037 - lr 0.0001 - time 393.21s
+ 2024-07-29 14:45:08,901 batch 936/1044 - loss 3.51079810 - lr 0.0001 - time 437.35s
+ 2024-07-29 14:45:58,849 batch 1040/1044 - loss 3.51072362 - lr 0.0001 - time 487.30s
+ 2024-07-29 14:46:00,541 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 14:46:00,542 EPOCH 8 DONE
+ 2024-07-29 14:46:13,531 TRAIN Loss: 3.5108
+ 2024-07-29 14:46:13,531 DEV Loss: 5.4292
+ 2024-07-29 14:46:13,531 DEV Perplexity: 227.9716
+ 2024-07-29 14:46:13,531 No improvement for 5 epoch(s)
+ 2024-07-29 14:46:13,531 Patience reached: Terminating model training due to early stopping
+ 2024-07-29 14:46:13,531 ----------------------------------------------------------------------------------------------------
+ 2024-07-29 14:46:13,531 Finished Training
+ 2024-07-29 14:46:38,572 TEST Perplexity: 212.0491
+ 2024-07-29 14:49:34,773 TEST BLEU = 30.03 82.6/62.6/26.2/6.2 (BP = 0.992 ratio = 0.992 hyp_len = 132 ref_len = 133)
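
The perplexity values in this log track the reported cross-entropy losses: each DEV/TEST perplexity equals exp of the corresponding loss, up to the log's four-decimal rounding. A minimal sketch checking that relation against a few values from the log (assuming the losses are natural-log cross-entropies; `perplexity` is an illustrative helper, not part of the training code):

```python
import math

def perplexity(cross_entropy_loss: float) -> float:
    """Per-token perplexity from a mean cross-entropy loss (natural log base)."""
    return math.exp(cross_entropy_loss)

# (DEV loss, DEV perplexity) pairs taken from epochs 1-3 of the log above.
for loss, reported in [(5.6631, 288.0468), (5.3919, 219.6304), (5.3567, 212.0222)]:
    # Small tolerance: the logged losses are rounded to four decimals.
    assert abs(perplexity(loss) - reported) < 0.5
```

This also explains why early stopping keys off a single score here: minimizing dev loss and minimizing dev perplexity are the same objective, since exp is monotonic.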