2024-07-29 04:43:06,934 ----------------------------------------------------------------------------------------------------
2024-07-29 04:43:06,934 Training Model
2024-07-29 04:43:06,934 ----------------------------------------------------------------------------------------------------
2024-07-29 04:43:06,934 Translator(
  (encoder): EncoderLSTM(
    (embedding): Embedding(114, 300, padding_idx=0)
    (dropout): Dropout(p=0.1, inplace=False)
    (lstm): LSTM(300, 512, batch_first=True)
  )
  (decoder): DecoderLSTM(
    (embedding): Embedding(112, 300, padding_idx=0)
    (dropout): Dropout(p=0.1, inplace=False)
    (lstm): LSTM(300, 512, batch_first=True)
    (attention): DotProductAttention(
      (softmax): Softmax(dim=-1)
      (combined2hidden): Sequential(
        (0): Linear(in_features=1024, out_features=512, bias=True)
        (1): ReLU()
      )
    )
    (hidden2vocab): Linear(in_features=512, out_features=112, bias=True)
    (log_softmax): LogSoftmax(dim=-1)
  )
)
2024-07-29 04:43:06,934 ----------------------------------------------------------------------------------------------------
2024-07-29 04:43:06,934 Training Hyperparameters:
2024-07-29 04:43:06,934  - max_epochs: 10
2024-07-29 04:43:06,934  - learning_rate: 0.001
2024-07-29 04:43:06,934  - batch_size: 128
2024-07-29 04:43:06,934  - patience: 5
2024-07-29 04:43:06,934  - scheduler_patience: 3
2024-07-29 04:43:06,934  - teacher_forcing_ratio: 0.5
2024-07-29 04:43:06,934 ----------------------------------------------------------------------------------------------------
2024-07-29 04:43:06,934 Computational Parameters:
2024-07-29 04:43:06,934  - num_workers: 4
2024-07-29 04:43:06,934  - device: device(type='cuda', index=0)
2024-07-29 04:43:06,934 ----------------------------------------------------------------------------------------------------
2024-07-29 04:43:06,934 Dataset Splits:
2024-07-29 04:43:06,934  - train: 133623 data points
2024-07-29 04:43:06,934  - dev: 19090 data points
2024-07-29 04:43:06,934  - test: 38179 data points
2024-07-29 04:43:06,935 ----------------------------------------------------------------------------------------------------
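The layer shapes in the module dump above fix the model's parameter count. A quick pure-Python tally (the per-layer formulas assume standard PyTorch parameter layouts: `Embedding` = V×D, a single-layer `LSTM` = 4H×(D+H) weights plus two 4H bias vectors, `Linear` = out×in + out; the totals are not stated in the log itself):

```python
# Parameter tally for the Translator printed in the log above,
# using the shapes shown in the module dump.

def embedding(vocab, dim):
    # weight: vocab x dim
    return vocab * dim

def lstm(input_size, hidden):
    # weight_ih: 4H x input, weight_hh: 4H x H, bias_ih + bias_hh: 2 * 4H
    gates = 4 * hidden
    return gates * input_size + gates * hidden + 2 * gates

def linear(n_in, n_out):
    # weight: n_out x n_in, bias: n_out
    return n_in * n_out + n_out

encoder_params = embedding(114, 300) + lstm(300, 512)
decoder_params = (
    embedding(112, 300)
    + lstm(300, 512)
    + linear(1024, 512)  # attention combined2hidden
    + linear(512, 112)   # hidden2vocab
)
total = encoder_params + decoder_params
print(f"total trainable parameters: {total:,}")  # ~4.0M
```

So the network is small by translation-model standards, roughly 4M trainable parameters, most of them in the two LSTMs.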
2024-07-29 04:43:06,935 EPOCH 1
2024-07-29 04:46:03,502 batch 104/1044 - loss 2.83783023 - lr 0.0010 - time 176.57s
2024-07-29 04:48:58,358 batch 208/1044 - loss 2.67827428 - lr 0.0010 - time 351.42s
2024-07-29 04:52:09,047 batch 312/1044 - loss 2.59119082 - lr 0.0010 - time 542.11s
2024-07-29 04:55:23,591 batch 416/1044 - loss 2.52991555 - lr 0.0010 - time 736.66s
2024-07-29 04:58:24,345 batch 520/1044 - loss 2.48547669 - lr 0.0010 - time 917.41s
2024-07-29 05:01:11,473 batch 624/1044 - loss 2.44637715 - lr 0.0010 - time 1084.54s
2024-07-29 05:04:20,046 batch 728/1044 - loss 2.41217192 - lr 0.0010 - time 1273.11s
2024-07-29 05:07:28,110 batch 832/1044 - loss 2.37809223 - lr 0.0010 - time 1461.18s
2024-07-29 05:10:38,372 batch 936/1044 - loss 2.34602575 - lr 0.0010 - time 1651.44s
2024-07-29 05:13:32,549 batch 1040/1044 - loss 2.31563680 - lr 0.0010 - time 1825.61s
2024-07-29 05:13:39,106 ----------------------------------------------------------------------------------------------------
2024-07-29 05:13:39,108 EPOCH 1 DONE
2024-07-29 05:14:26,303 TRAIN Loss: 2.3144
2024-07-29 05:14:26,303 DEV Loss: 3.5700
2024-07-29 05:14:26,303 DEV Perplexity: 35.5166
2024-07-29 05:14:26,303 New best score!
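The logged DEV Perplexity is simply the exponential of the logged DEV Loss, i.e. perplexity = exp(average per-token cross-entropy). A one-line check against the epoch 1 figures:

```python
import math

# Epoch 1 DEV Loss from the log; its exponential should reproduce
# the logged DEV Perplexity of 35.5166 (up to rounding of the loss).
dev_loss = 3.5700
print(math.exp(dev_loss))  # ~35.52
```

This also explains why dev perplexity later balloons (e.g. 66.40 at epoch 4) from seemingly modest dev-loss increases: the exponential amplifies them.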
2024-07-29 05:14:26,305 ----------------------------------------------------------------------------------------------------
2024-07-29 05:14:26,305 EPOCH 2
2024-07-29 05:17:25,271 batch 104/1044 - loss 2.02556723 - lr 0.0010 - time 178.97s
2024-07-29 05:20:25,054 batch 208/1044 - loss 2.00942771 - lr 0.0010 - time 358.75s
2024-07-29 05:23:12,883 batch 312/1044 - loss 1.99176520 - lr 0.0010 - time 526.58s
2024-07-29 05:26:08,804 batch 416/1044 - loss 1.97854575 - lr 0.0010 - time 702.50s
2024-07-29 05:29:14,936 batch 520/1044 - loss 1.97086978 - lr 0.0010 - time 888.63s
2024-07-29 05:32:21,237 batch 624/1044 - loss 1.95995870 - lr 0.0010 - time 1074.93s
2024-07-29 05:35:20,854 batch 728/1044 - loss 1.95067503 - lr 0.0010 - time 1254.55s
2024-07-29 05:38:34,956 batch 832/1044 - loss 1.94326082 - lr 0.0010 - time 1448.65s
2024-07-29 05:41:48,006 batch 936/1044 - loss 1.93362772 - lr 0.0010 - time 1641.70s
2024-07-29 05:44:42,067 batch 1040/1044 - loss 1.92524348 - lr 0.0010 - time 1815.76s
2024-07-29 05:44:50,207 ----------------------------------------------------------------------------------------------------
2024-07-29 05:44:50,210 EPOCH 2 DONE
2024-07-29 05:45:37,466 TRAIN Loss: 1.9249
2024-07-29 05:45:37,466 DEV Loss: 3.8374
2024-07-29 05:45:37,466 DEV Perplexity: 46.4067
2024-07-29 05:45:37,466 No improvement for 1 epoch(s)
2024-07-29 05:45:37,466 ----------------------------------------------------------------------------------------------------
2024-07-29 05:45:37,466 EPOCH 3
2024-07-29 05:48:43,560 batch 104/1044 - loss 1.82380688 - lr 0.0010 - time 186.09s
2024-07-29 05:51:53,714 batch 208/1044 - loss 1.82825828 - lr 0.0010 - time 376.25s
2024-07-29 05:55:08,715 batch 312/1044 - loss 1.82657076 - lr 0.0010 - time 571.25s
2024-07-29 05:58:07,203 batch 416/1044 - loss 1.82265144 - lr 0.0010 - time 749.74s
2024-07-29 06:00:58,968 batch 520/1044 - loss 1.81858461 - lr 0.0010 - time 921.50s
2024-07-29 06:03:59,822 batch 624/1044 - loss 1.80977892 - lr 0.0010 - time 1102.36s
2024-07-29 06:07:08,066 batch 728/1044 - loss 1.80312389 - lr 0.0010 - time 1290.60s
2024-07-29 06:10:01,948 batch 832/1044 - loss 1.79834272 - lr 0.0010 - time 1464.48s
2024-07-29 06:12:49,654 batch 936/1044 - loss 1.79244394 - lr 0.0010 - time 1632.19s
2024-07-29 06:15:41,378 batch 1040/1044 - loss 1.78895096 - lr 0.0010 - time 1803.91s
2024-07-29 06:15:47,180 ----------------------------------------------------------------------------------------------------
2024-07-29 06:15:47,183 EPOCH 3 DONE
2024-07-29 06:16:34,306 TRAIN Loss: 1.7889
2024-07-29 06:16:34,306 DEV Loss: 3.8489
2024-07-29 06:16:34,306 DEV Perplexity: 46.9422
2024-07-29 06:16:34,307 No improvement for 2 epoch(s)
2024-07-29 06:16:34,307 ----------------------------------------------------------------------------------------------------
2024-07-29 06:16:34,307 EPOCH 4
2024-07-29 06:19:47,695 batch 104/1044 - loss 1.72615880 - lr 0.0010 - time 193.39s
2024-07-29 06:22:47,789 batch 208/1044 - loss 1.72849645 - lr 0.0010 - time 373.48s
2024-07-29 06:25:49,316 batch 312/1044 - loss 1.72645533 - lr 0.0010 - time 555.01s
2024-07-29 06:28:43,932 batch 416/1044 - loss 1.72066385 - lr 0.0010 - time 729.63s
2024-07-29 06:31:56,479 batch 520/1044 - loss 1.71717779 - lr 0.0010 - time 922.17s
2024-07-29 06:34:57,754 batch 624/1044 - loss 1.71594436 - lr 0.0010 - time 1103.45s
2024-07-29 06:37:51,089 batch 728/1044 - loss 1.71165972 - lr 0.0010 - time 1276.78s
2024-07-29 06:40:52,402 batch 832/1044 - loss 1.70951752 - lr 0.0010 - time 1458.10s
2024-07-29 06:43:46,624 batch 936/1044 - loss 1.70553106 - lr 0.0010 - time 1632.32s
2024-07-29 06:46:41,386 batch 1040/1044 - loss 1.70329877 - lr 0.0010 - time 1807.08s
2024-07-29 06:46:48,093 ----------------------------------------------------------------------------------------------------
2024-07-29 06:46:48,095 EPOCH 4 DONE
2024-07-29 06:47:35,218 TRAIN Loss: 1.7032
2024-07-29 06:47:35,219 DEV Loss: 4.1957
2024-07-29 06:47:35,219 DEV Perplexity: 66.3981
2024-07-29 06:47:35,219 No improvement for 3 epoch(s)
2024-07-29 06:47:35,219 ----------------------------------------------------------------------------------------------------
2024-07-29 06:47:35,219 EPOCH 5
2024-07-29 06:50:45,524 batch 104/1044 - loss 1.64844567 - lr 0.0010 - time 190.31s
2024-07-29 06:53:48,606 batch 208/1044 - loss 1.64985944 - lr 0.0010 - time 373.39s
2024-07-29 06:56:52,667 batch 312/1044 - loss 1.65055201 - lr 0.0010 - time 557.45s
2024-07-29 06:59:51,714 batch 416/1044 - loss 1.65345511 - lr 0.0010 - time 736.50s
2024-07-29 07:02:52,445 batch 520/1044 - loss 1.65111495 - lr 0.0010 - time 917.23s
2024-07-29 07:06:00,096 batch 624/1044 - loss 1.65081866 - lr 0.0010 - time 1104.88s
2024-07-29 07:09:16,066 batch 728/1044 - loss 1.64957887 - lr 0.0010 - time 1300.85s
2024-07-29 07:12:15,087 batch 832/1044 - loss 1.64832800 - lr 0.0010 - time 1479.87s
2024-07-29 07:15:10,030 batch 936/1044 - loss 1.64612010 - lr 0.0010 - time 1654.81s
2024-07-29 07:18:02,140 batch 1040/1044 - loss 1.64496474 - lr 0.0010 - time 1826.92s
2024-07-29 07:18:08,591 ----------------------------------------------------------------------------------------------------
2024-07-29 07:18:08,594 EPOCH 5 DONE
2024-07-29 07:18:55,835 TRAIN Loss: 1.6448
2024-07-29 07:18:55,835 DEV Loss: 4.0923
2024-07-29 07:18:55,835 DEV Perplexity: 59.8790
2024-07-29 07:18:55,835 No improvement for 4 epoch(s)
2024-07-29 07:18:55,835 ----------------------------------------------------------------------------------------------------
2024-07-29 07:18:55,835 EPOCH 6
2024-07-29 07:21:53,160 batch 104/1044 - loss 1.58821843 - lr 0.0001 - time 177.32s
2024-07-29 07:24:44,349 batch 208/1044 - loss 1.59108787 - lr 0.0001 - time 348.51s
2024-07-29 07:27:37,622 batch 312/1044 - loss 1.58441215 - lr 0.0001 - time 521.79s
2024-07-29 07:30:43,750 batch 416/1044 - loss 1.58090937 - lr 0.0001 - time 707.91s
2024-07-29 07:33:54,621 batch 520/1044 - loss 1.58090223 - lr 0.0001 - time 898.79s
2024-07-29 07:36:52,832 batch 624/1044 - loss 1.58009594 - lr 0.0001 - time 1077.00s
2024-07-29 07:40:09,071 batch 728/1044 - loss 1.57836947 - lr 0.0001 - time 1273.24s
2024-07-29 07:43:11,085 batch 832/1044 - loss 1.57711583 - lr 0.0001 - time 1455.25s
2024-07-29 07:46:18,514 batch 936/1044 - loss 1.57624354 - lr 0.0001 - time 1642.68s
2024-07-29 07:49:05,093 batch 1040/1044 - loss 1.57536047 - lr 0.0001 - time 1809.26s
2024-07-29 07:49:11,696 ----------------------------------------------------------------------------------------------------
2024-07-29 07:49:11,699 EPOCH 6 DONE
2024-07-29 07:49:59,010 TRAIN Loss: 1.5752
2024-07-29 07:49:59,010 DEV Loss: 4.1991
2024-07-29 07:49:59,010 DEV Perplexity: 66.6274
2024-07-29 07:49:59,010 No improvement for 5 epoch(s)
2024-07-29 07:49:59,010 Patience reached: Terminating model training due to early stopping
2024-07-29 07:49:59,010 ----------------------------------------------------------------------------------------------------
2024-07-29 07:49:59,010 Finished Training
2024-07-29 07:51:31,366 TEST Perplexity: 35.5327
2024-07-29 08:02:43,738 TEST BLEU = 4.47 45.6/8.8/2.0/0.5 (BP = 1.000 ratio = 1.000 hyp_len = 103 ref_len = 103)
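The final TEST BLEU line can be sanity-checked by hand: corpus BLEU is the brevity penalty times the geometric mean of the four modified n-gram precisions. Recomputing from the reported figures (the precisions in the log are rounded to one decimal, so the result only agrees to about two decimal places):

```python
import math

# Figures from the log line:
# TEST BLEU = 4.47  45.6/8.8/2.0/0.5  (BP = 1.000 ...)
precisions = [45.6, 8.8, 2.0, 0.5]  # modified 1- to 4-gram precisions (%)
bp = 1.000                          # brevity penalty (hyp_len == ref_len)

# BLEU = BP * exp(mean of log precisions)
bleu = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))
print(round(bleu, 2))  # ~4.48, vs the reported 4.47
```

The very low 3- and 4-gram precisions are what drag BLEU down to 4.47 despite a decent 45.6% unigram precision, consistent with the dev loss diverging after epoch 1 while the train loss kept falling (overfitting; the best checkpoint, from epoch 1, is the one evaluated on test).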