2024-07-29 04:43:06,934 ----------------------------------------------------------------------------------------------------
2024-07-29 04:43:06,934 Training Model
2024-07-29 04:43:06,934 ----------------------------------------------------------------------------------------------------
2024-07-29 04:43:06,934 Translator(
  (encoder): EncoderLSTM(
    (embedding): Embedding(114, 300, padding_idx=0)
    (dropout): Dropout(p=0.1, inplace=False)
    (lstm): LSTM(300, 512, batch_first=True)
  )
  (decoder): DecoderLSTM(
    (embedding): Embedding(112, 300, padding_idx=0)
    (dropout): Dropout(p=0.1, inplace=False)
    (lstm): LSTM(300, 512, batch_first=True)
    (attention): DotProductAttention(
      (softmax): Softmax(dim=-1)
      (combined2hidden): Sequential(
        (0): Linear(in_features=1024, out_features=512, bias=True)
        (1): ReLU()
      )
    )
    (hidden2vocab): Linear(in_features=512, out_features=112, bias=True)
    (log_softmax): LogSoftmax(dim=-1)
  )
)
2024-07-29 04:43:06,934 ----------------------------------------------------------------------------------------------------
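The printed module tree above can be reconstructed roughly as the following PyTorch definition. This is a hypothetical sketch: the class and attribute names, layer sizes, and dropout come from the log, but the forward logic (dot-product scoring, context/state concatenation) is assumed, not shown in the log.

```python
import torch
import torch.nn as nn

class DotProductAttention(nn.Module):
    """Matches the printed tree: softmax over scores, then
    Linear(1024 -> 512) + ReLU combining context and decoder state."""
    def __init__(self, hidden_size=512):
        super().__init__()
        self.softmax = nn.Softmax(dim=-1)
        self.combined2hidden = nn.Sequential(
            nn.Linear(2 * hidden_size, hidden_size),
            nn.ReLU(),
        )

    def forward(self, decoder_state, encoder_outputs):
        # decoder_state: (batch, 1, hidden); encoder_outputs: (batch, src_len, hidden)
        scores = torch.bmm(decoder_state, encoder_outputs.transpose(1, 2))
        weights = self.softmax(scores)                 # (batch, 1, src_len)
        context = torch.bmm(weights, encoder_outputs)  # (batch, 1, hidden)
        combined = torch.cat([context, decoder_state], dim=-1)
        return self.combined2hidden(combined)

class EncoderLSTM(nn.Module):
    def __init__(self, vocab_size=114, embed_dim=300, hidden_size=512):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.dropout = nn.Dropout(p=0.1)
        self.lstm = nn.LSTM(embed_dim, hidden_size, batch_first=True)

    def forward(self, src):
        embedded = self.dropout(self.embedding(src))
        return self.lstm(embedded)  # (outputs, (h_n, c_n))

class DecoderLSTM(nn.Module):
    def __init__(self, vocab_size=112, embed_dim=300, hidden_size=512):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.dropout = nn.Dropout(p=0.1)
        self.lstm = nn.LSTM(embed_dim, hidden_size, batch_first=True)
        self.attention = DotProductAttention(hidden_size)
        self.hidden2vocab = nn.Linear(hidden_size, vocab_size)
        self.log_softmax = nn.LogSoftmax(dim=-1)

    def forward(self, tgt_step, hidden, encoder_outputs):
        # One decoding step: tgt_step is (batch, 1) token ids.
        embedded = self.dropout(self.embedding(tgt_step))
        output, hidden = self.lstm(embedded, hidden)
        attended = self.attention(output, encoder_outputs)
        return self.log_softmax(self.hidden2vocab(attended)), hidden

class Translator(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = EncoderLSTM()
        self.decoder = DecoderLSTM()
```

Note the asymmetric vocabularies (114 source vs. 112 target tokens) and that both LSTMs are single-layer and unidirectional, so the encoder's final `(h, c)` can seed the decoder directly.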
2024-07-29 04:43:06,934 Training Hyperparameters:
2024-07-29 04:43:06,934  - max_epochs: 10
2024-07-29 04:43:06,934  - learning_rate: 0.001
2024-07-29 04:43:06,934  - batch_size: 128
2024-07-29 04:43:06,934  - patience: 5
2024-07-29 04:43:06,934  - scheduler_patience: 3
2024-07-29 04:43:06,934  - teacher_forcing_ratio: 0.5
2024-07-29 04:43:06,934 ----------------------------------------------------------------------------------------------------
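The `patience` / `scheduler_patience` pair above, together with the LR drop from 0.0010 to 0.0001 visible at epoch 6 and the early stop after five non-improving epochs, is consistent with the following control loop. This is a sketch of one plausible implementation (using `ReduceLROnPlateau` and a simple counter), not code taken from the run; the stand-in model and the hard-coded dev losses are illustrative.

```python
import torch

model = torch.nn.Linear(4, 2)  # stand-in model, for illustration only
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# scheduler_patience = 3: LR is cut by 10x after more than 3 bad epochs
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=3)

best_dev_loss, epochs_without_improvement = float("inf"), 0
# dev losses as reported in the log, epochs 1-6 (rounded)
for epoch, dev_loss in enumerate([3.57, 3.84, 3.85, 4.20, 4.09, 4.20], start=1):
    scheduler.step(dev_loss)
    if dev_loss < best_dev_loss:
        best_dev_loss, epochs_without_improvement = dev_loss, 0
    else:
        epochs_without_improvement += 1
    if epochs_without_improvement >= 5:  # patience = 5
        break  # early stopping, as in the log after epoch 6
```

With these numbers the scheduler reduces the LR after epoch 5 (the fourth consecutive non-improving epoch), so epoch 6 trains at 0.0001, and the loop terminates after epoch 6 — matching the logged trajectory.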
2024-07-29 04:43:06,934 Computational Parameters:
2024-07-29 04:43:06,934  - num_workers: 4
2024-07-29 04:43:06,934  - device: device(type='cuda', index=0)
2024-07-29 04:43:06,934 ----------------------------------------------------------------------------------------------------
2024-07-29 04:43:06,934 Dataset Splits:
2024-07-29 04:43:06,934  - train: 133623 data points
2024-07-29 04:43:06,934  - dev: 19090 data points
2024-07-29 04:43:06,934  - test: 38179 data points
2024-07-29 04:43:06,935 ----------------------------------------------------------------------------------------------------
2024-07-29 04:43:06,935 EPOCH 1
2024-07-29 04:46:03,502 batch 104/1044 - loss 2.83783023 - lr 0.0010 - time 176.57s
2024-07-29 04:48:58,358 batch 208/1044 - loss 2.67827428 - lr 0.0010 - time 351.42s
2024-07-29 04:52:09,047 batch 312/1044 - loss 2.59119082 - lr 0.0010 - time 542.11s
2024-07-29 04:55:23,591 batch 416/1044 - loss 2.52991555 - lr 0.0010 - time 736.66s
2024-07-29 04:58:24,345 batch 520/1044 - loss 2.48547669 - lr 0.0010 - time 917.41s
2024-07-29 05:01:11,473 batch 624/1044 - loss 2.44637715 - lr 0.0010 - time 1084.54s
2024-07-29 05:04:20,046 batch 728/1044 - loss 2.41217192 - lr 0.0010 - time 1273.11s
2024-07-29 05:07:28,110 batch 832/1044 - loss 2.37809223 - lr 0.0010 - time 1461.18s
2024-07-29 05:10:38,372 batch 936/1044 - loss 2.34602575 - lr 0.0010 - time 1651.44s
2024-07-29 05:13:32,549 batch 1040/1044 - loss 2.31563680 - lr 0.0010 - time 1825.61s
2024-07-29 05:13:39,106 ----------------------------------------------------------------------------------------------------
2024-07-29 05:13:39,108 EPOCH 1 DONE
2024-07-29 05:14:26,303 TRAIN Loss:       2.3144
2024-07-29 05:14:26,303 DEV   Loss:       3.5700
2024-07-29 05:14:26,303 DEV   Perplexity: 35.5166
2024-07-29 05:14:26,303 New best score!
2024-07-29 05:14:26,305 ----------------------------------------------------------------------------------------------------
2024-07-29 05:14:26,305 EPOCH 2
2024-07-29 05:17:25,271 batch 104/1044 - loss 2.02556723 - lr 0.0010 - time 178.97s
2024-07-29 05:20:25,054 batch 208/1044 - loss 2.00942771 - lr 0.0010 - time 358.75s
2024-07-29 05:23:12,883 batch 312/1044 - loss 1.99176520 - lr 0.0010 - time 526.58s
2024-07-29 05:26:08,804 batch 416/1044 - loss 1.97854575 - lr 0.0010 - time 702.50s
2024-07-29 05:29:14,936 batch 520/1044 - loss 1.97086978 - lr 0.0010 - time 888.63s
2024-07-29 05:32:21,237 batch 624/1044 - loss 1.95995870 - lr 0.0010 - time 1074.93s
2024-07-29 05:35:20,854 batch 728/1044 - loss 1.95067503 - lr 0.0010 - time 1254.55s
2024-07-29 05:38:34,956 batch 832/1044 - loss 1.94326082 - lr 0.0010 - time 1448.65s
2024-07-29 05:41:48,006 batch 936/1044 - loss 1.93362772 - lr 0.0010 - time 1641.70s
2024-07-29 05:44:42,067 batch 1040/1044 - loss 1.92524348 - lr 0.0010 - time 1815.76s
2024-07-29 05:44:50,207 ----------------------------------------------------------------------------------------------------
2024-07-29 05:44:50,210 EPOCH 2 DONE
2024-07-29 05:45:37,466 TRAIN Loss:       1.9249
2024-07-29 05:45:37,466 DEV   Loss:       3.8374
2024-07-29 05:45:37,466 DEV   Perplexity: 46.4067
2024-07-29 05:45:37,466 No improvement for 1 epoch(s)
2024-07-29 05:45:37,466 ----------------------------------------------------------------------------------------------------
2024-07-29 05:45:37,466 EPOCH 3
2024-07-29 05:48:43,560 batch 104/1044 - loss 1.82380688 - lr 0.0010 - time 186.09s
2024-07-29 05:51:53,714 batch 208/1044 - loss 1.82825828 - lr 0.0010 - time 376.25s
2024-07-29 05:55:08,715 batch 312/1044 - loss 1.82657076 - lr 0.0010 - time 571.25s
2024-07-29 05:58:07,203 batch 416/1044 - loss 1.82265144 - lr 0.0010 - time 749.74s
2024-07-29 06:00:58,968 batch 520/1044 - loss 1.81858461 - lr 0.0010 - time 921.50s
2024-07-29 06:03:59,822 batch 624/1044 - loss 1.80977892 - lr 0.0010 - time 1102.36s
2024-07-29 06:07:08,066 batch 728/1044 - loss 1.80312389 - lr 0.0010 - time 1290.60s
2024-07-29 06:10:01,948 batch 832/1044 - loss 1.79834272 - lr 0.0010 - time 1464.48s
2024-07-29 06:12:49,654 batch 936/1044 - loss 1.79244394 - lr 0.0010 - time 1632.19s
2024-07-29 06:15:41,378 batch 1040/1044 - loss 1.78895096 - lr 0.0010 - time 1803.91s
2024-07-29 06:15:47,180 ----------------------------------------------------------------------------------------------------
2024-07-29 06:15:47,183 EPOCH 3 DONE
2024-07-29 06:16:34,306 TRAIN Loss:       1.7889
2024-07-29 06:16:34,306 DEV   Loss:       3.8489
2024-07-29 06:16:34,306 DEV   Perplexity: 46.9422
2024-07-29 06:16:34,307 No improvement for 2 epoch(s)
2024-07-29 06:16:34,307 ----------------------------------------------------------------------------------------------------
2024-07-29 06:16:34,307 EPOCH 4
2024-07-29 06:19:47,695 batch 104/1044 - loss 1.72615880 - lr 0.0010 - time 193.39s
2024-07-29 06:22:47,789 batch 208/1044 - loss 1.72849645 - lr 0.0010 - time 373.48s
2024-07-29 06:25:49,316 batch 312/1044 - loss 1.72645533 - lr 0.0010 - time 555.01s
2024-07-29 06:28:43,932 batch 416/1044 - loss 1.72066385 - lr 0.0010 - time 729.63s
2024-07-29 06:31:56,479 batch 520/1044 - loss 1.71717779 - lr 0.0010 - time 922.17s
2024-07-29 06:34:57,754 batch 624/1044 - loss 1.71594436 - lr 0.0010 - time 1103.45s
2024-07-29 06:37:51,089 batch 728/1044 - loss 1.71165972 - lr 0.0010 - time 1276.78s
2024-07-29 06:40:52,402 batch 832/1044 - loss 1.70951752 - lr 0.0010 - time 1458.10s
2024-07-29 06:43:46,624 batch 936/1044 - loss 1.70553106 - lr 0.0010 - time 1632.32s
2024-07-29 06:46:41,386 batch 1040/1044 - loss 1.70329877 - lr 0.0010 - time 1807.08s
2024-07-29 06:46:48,093 ----------------------------------------------------------------------------------------------------
2024-07-29 06:46:48,095 EPOCH 4 DONE
2024-07-29 06:47:35,218 TRAIN Loss:       1.7032
2024-07-29 06:47:35,219 DEV   Loss:       4.1957
2024-07-29 06:47:35,219 DEV   Perplexity: 66.3981
2024-07-29 06:47:35,219 No improvement for 3 epoch(s)
2024-07-29 06:47:35,219 ----------------------------------------------------------------------------------------------------
2024-07-29 06:47:35,219 EPOCH 5
2024-07-29 06:50:45,524 batch 104/1044 - loss 1.64844567 - lr 0.0010 - time 190.31s
2024-07-29 06:53:48,606 batch 208/1044 - loss 1.64985944 - lr 0.0010 - time 373.39s
2024-07-29 06:56:52,667 batch 312/1044 - loss 1.65055201 - lr 0.0010 - time 557.45s
2024-07-29 06:59:51,714 batch 416/1044 - loss 1.65345511 - lr 0.0010 - time 736.50s
2024-07-29 07:02:52,445 batch 520/1044 - loss 1.65111495 - lr 0.0010 - time 917.23s
2024-07-29 07:06:00,096 batch 624/1044 - loss 1.65081866 - lr 0.0010 - time 1104.88s
2024-07-29 07:09:16,066 batch 728/1044 - loss 1.64957887 - lr 0.0010 - time 1300.85s
2024-07-29 07:12:15,087 batch 832/1044 - loss 1.64832800 - lr 0.0010 - time 1479.87s
2024-07-29 07:15:10,030 batch 936/1044 - loss 1.64612010 - lr 0.0010 - time 1654.81s
2024-07-29 07:18:02,140 batch 1040/1044 - loss 1.64496474 - lr 0.0010 - time 1826.92s
2024-07-29 07:18:08,591 ----------------------------------------------------------------------------------------------------
2024-07-29 07:18:08,594 EPOCH 5 DONE
2024-07-29 07:18:55,835 TRAIN Loss:       1.6448
2024-07-29 07:18:55,835 DEV   Loss:       4.0923
2024-07-29 07:18:55,835 DEV   Perplexity: 59.8790
2024-07-29 07:18:55,835 No improvement for 4 epoch(s)
2024-07-29 07:18:55,835 ----------------------------------------------------------------------------------------------------
2024-07-29 07:18:55,835 EPOCH 6
2024-07-29 07:21:53,160 batch 104/1044 - loss 1.58821843 - lr 0.0001 - time 177.32s
2024-07-29 07:24:44,349 batch 208/1044 - loss 1.59108787 - lr 0.0001 - time 348.51s
2024-07-29 07:27:37,622 batch 312/1044 - loss 1.58441215 - lr 0.0001 - time 521.79s
2024-07-29 07:30:43,750 batch 416/1044 - loss 1.58090937 - lr 0.0001 - time 707.91s
2024-07-29 07:33:54,621 batch 520/1044 - loss 1.58090223 - lr 0.0001 - time 898.79s
2024-07-29 07:36:52,832 batch 624/1044 - loss 1.58009594 - lr 0.0001 - time 1077.00s
2024-07-29 07:40:09,071 batch 728/1044 - loss 1.57836947 - lr 0.0001 - time 1273.24s
2024-07-29 07:43:11,085 batch 832/1044 - loss 1.57711583 - lr 0.0001 - time 1455.25s
2024-07-29 07:46:18,514 batch 936/1044 - loss 1.57624354 - lr 0.0001 - time 1642.68s
2024-07-29 07:49:05,093 batch 1040/1044 - loss 1.57536047 - lr 0.0001 - time 1809.26s
2024-07-29 07:49:11,696 ----------------------------------------------------------------------------------------------------
2024-07-29 07:49:11,699 EPOCH 6 DONE
2024-07-29 07:49:59,010 TRAIN Loss:       1.5752
2024-07-29 07:49:59,010 DEV   Loss:       4.1991
2024-07-29 07:49:59,010 DEV   Perplexity: 66.6274
2024-07-29 07:49:59,010 No improvement for 5 epoch(s)
2024-07-29 07:49:59,010 Patience reached: Terminating model training due to early stopping
2024-07-29 07:49:59,010 ----------------------------------------------------------------------------------------------------
2024-07-29 07:49:59,010 Finished Training
2024-07-29 07:51:31,366 TEST  Perplexity: 35.5327
2024-07-29 08:02:43,738 TEST  BLEU = 4.47 45.6/8.8/2.0/0.5 (BP = 1.000 ratio = 1.000 hyp_len = 103 ref_len = 103)
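The final metrics above can be cross-checked from the log's own numbers. Perplexity as reported here is the exponential of the mean per-token cross-entropy (natural log), and the BLEU line follows the sacrebleu-style format `BLEU p1/p2/p3/p4 (BP ...)`, where the score is the brevity penalty times the geometric mean of the four n-gram precisions. A small verification sketch (the formulas are standard; the exact tooling used by the run is assumed):

```python
from math import exp, log

# Perplexity = exp(cross-entropy). Epoch-1 dev loss from the log:
dev_loss = 3.5700
perplexity = exp(dev_loss)  # close to the logged "DEV Perplexity: 35.5166"

# BLEU from the logged n-gram precisions 45.6/8.8/2.0/0.5 (in %),
# with brevity penalty 1.0 since hyp_len == ref_len == 103:
precisions = [45.6, 8.8, 2.0, 0.5]
bp = 1.0
bleu = bp * exp(sum(log(p) for p in precisions) / 4)
# close to the logged "TEST BLEU = 4.47" (the printed precisions are rounded)
```

The high dev/test perplexity against a steadily falling train loss (2.31 down to 1.58) indicates the model was overfitting from epoch 2 onward, which is what triggered both the LR reduction and the early stop.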