2024-07-29 10:41:51,295 ----------------------------------------------------------------------------------------------------
2024-07-29 10:41:51,295 Training Model
2024-07-29 10:41:51,295 ----------------------------------------------------------------------------------------------------
2024-07-29 10:41:51,295 Translator(
  (encoder): EncoderLSTM(
    (embedding): Embedding(14303, 300, padding_idx=0)
    (dropout): Dropout(p=0.1, inplace=False)
    (lstm): LSTM(300, 512, batch_first=True, bidirectional=True)
  )
  (decoder): DecoderLSTM(
    (embedding): Embedding(22834, 300, padding_idx=0)
    (dropout): Dropout(p=0.1, inplace=False)
    (lstm): LSTM(300, 1024, batch_first=True)
    (hidden2vocab): Linear(in_features=1024, out_features=22834, bias=True)
    (log_softmax): LogSoftmax(dim=-1)
  )
)
2024-07-29 10:41:51,295 ----------------------------------------------------------------------------------------------------
2024-07-29 10:41:51,295 Training Hyperparameters:
2024-07-29 10:41:51,295  - max_epochs: 10
2024-07-29 10:41:51,295  - learning_rate: 0.001
2024-07-29 10:41:51,295  - batch_size: 128
2024-07-29 10:41:51,295  - patience: 5
2024-07-29 10:41:51,295  - scheduler_patience: 3
2024-07-29 10:41:51,295  - teacher_forcing_ratio: 0.5
2024-07-29 10:41:51,295 ----------------------------------------------------------------------------------------------------
2024-07-29 10:41:51,295 Computational Parameters:
2024-07-29 10:41:51,295  - num_workers: 4
2024-07-29 10:41:51,295  - device: device(type='cuda', index=0)
2024-07-29 10:41:51,295 ----------------------------------------------------------------------------------------------------
2024-07-29 10:41:51,295 Dataset Splits:
2024-07-29 10:41:51,295  - train: 133623 data points
2024-07-29 10:41:51,295  - dev: 19090 data points
2024-07-29 10:41:51,296  - test: 38179 data points
2024-07-29 10:41:51,296 ----------------------------------------------------------------------------------------------------
2024-07-29 10:41:51,296 EPOCH 1
2024-07-29 10:42:43,980 batch 104/1044 - loss 6.56599054 - lr 0.0010 - time 52.68s
2024-07-29 10:43:35,196 batch 208/1044 - loss 6.28009422 - lr 0.0010 - time 103.90s
2024-07-29 10:44:25,168 batch 312/1044 - loss 6.11249907 - lr 0.0010 - time 153.87s
2024-07-29 10:45:15,557 batch 416/1044 - loss 5.99013720 - lr 0.0010 - time 204.26s
2024-07-29 10:46:02,970 batch 520/1044 - loss 5.89236221 - lr 0.0010 - time 251.67s
2024-07-29 10:46:51,664 batch 624/1044 - loss 5.81345889 - lr 0.0010 - time 300.37s
2024-07-29 10:47:42,555 batch 728/1044 - loss 5.74780520 - lr 0.0010 - time 351.26s
2024-07-29 10:48:33,483 batch 832/1044 - loss 5.69103370 - lr 0.0010 - time 402.19s
2024-07-29 10:49:22,573 batch 936/1044 - loss 5.63910694 - lr 0.0010 - time 451.28s
2024-07-29 10:50:14,318 batch 1040/1044 - loss 5.59255154 - lr 0.0010 - time 503.02s
2024-07-29 10:50:16,234 ----------------------------------------------------------------------------------------------------
2024-07-29 10:50:16,235 EPOCH 1 DONE
2024-07-29 10:50:29,064 TRAIN Loss:       5.5908
2024-07-29 10:50:29,064 DEV   Loss:       5.7897
2024-07-29 10:50:29,064 DEV   Perplexity: 326.8995
2024-07-29 10:50:29,064 New best score!
2024-07-29 10:50:29,065 ----------------------------------------------------------------------------------------------------
2024-07-29 10:50:29,065 EPOCH 2
2024-07-29 10:51:19,942 batch 104/1044 - loss 5.02739687 - lr 0.0010 - time 50.88s
2024-07-29 10:52:09,803 batch 208/1044 - loss 5.01800949 - lr 0.0010 - time 100.74s
2024-07-29 10:53:04,478 batch 312/1044 - loss 5.00509294 - lr 0.0010 - time 155.41s
2024-07-29 10:53:53,594 batch 416/1044 - loss 4.98731034 - lr 0.0010 - time 204.53s
2024-07-29 10:54:43,356 batch 520/1044 - loss 4.97219816 - lr 0.0010 - time 254.29s
2024-07-29 10:55:33,584 batch 624/1044 - loss 4.96074294 - lr 0.0010 - time 304.52s
2024-07-29 10:56:24,225 batch 728/1044 - loss 4.94472581 - lr 0.0010 - time 355.16s
2024-07-29 10:57:14,355 batch 832/1044 - loss 4.93236568 - lr 0.0010 - time 405.29s
2024-07-29 10:58:06,416 batch 936/1044 - loss 4.91768116 - lr 0.0010 - time 457.35s
2024-07-29 10:58:55,326 batch 1040/1044 - loss 4.90590350 - lr 0.0010 - time 506.26s
2024-07-29 10:58:57,793 ----------------------------------------------------------------------------------------------------
2024-07-29 10:58:57,794 EPOCH 2 DONE
2024-07-29 10:59:10,716 TRAIN Loss:       4.9057
2024-07-29 10:59:10,716 DEV   Loss:       5.7132
2024-07-29 10:59:10,716 DEV   Perplexity: 302.8460
2024-07-29 10:59:10,716 New best score!
2024-07-29 10:59:10,717 ----------------------------------------------------------------------------------------------------
2024-07-29 10:59:10,717 EPOCH 3
2024-07-29 11:00:00,967 batch 104/1044 - loss 4.61237117 - lr 0.0010 - time 50.25s
2024-07-29 11:00:50,203 batch 208/1044 - loss 4.62395983 - lr 0.0010 - time 99.49s
2024-07-29 11:01:40,773 batch 312/1044 - loss 4.61521491 - lr 0.0010 - time 150.06s
2024-07-29 11:02:39,135 batch 416/1044 - loss 4.61224452 - lr 0.0010 - time 208.42s
2024-07-29 11:03:30,115 batch 520/1044 - loss 4.60275617 - lr 0.0010 - time 259.40s
2024-07-29 11:04:16,479 batch 624/1044 - loss 4.59871728 - lr 0.0010 - time 305.76s
2024-07-29 11:05:07,213 batch 728/1044 - loss 4.59086315 - lr 0.0010 - time 356.50s
2024-07-29 11:05:57,731 batch 832/1044 - loss 4.58489406 - lr 0.0010 - time 407.01s
2024-07-29 11:06:46,315 batch 936/1044 - loss 4.57758889 - lr 0.0010 - time 455.60s
2024-07-29 11:07:34,550 batch 1040/1044 - loss 4.56970717 - lr 0.0010 - time 503.83s
2024-07-29 11:07:36,805 ----------------------------------------------------------------------------------------------------
2024-07-29 11:07:36,806 EPOCH 3 DONE
2024-07-29 11:07:49,727 TRAIN Loss:       4.5697
2024-07-29 11:07:49,728 DEV   Loss:       5.5772
2024-07-29 11:07:49,728 DEV   Perplexity: 264.3216
2024-07-29 11:07:49,728 New best score!
2024-07-29 11:07:49,729 ----------------------------------------------------------------------------------------------------
2024-07-29 11:07:49,729 EPOCH 4
2024-07-29 11:08:37,495 batch 104/1044 - loss 4.31793029 - lr 0.0010 - time 47.77s
2024-07-29 11:09:28,126 batch 208/1044 - loss 4.31731233 - lr 0.0010 - time 98.40s
2024-07-29 11:10:17,772 batch 312/1044 - loss 4.31730254 - lr 0.0010 - time 148.04s
2024-07-29 11:11:11,397 batch 416/1044 - loss 4.31653301 - lr 0.0010 - time 201.67s
2024-07-29 11:12:01,968 batch 520/1044 - loss 4.32179287 - lr 0.0010 - time 252.24s
2024-07-29 11:12:52,660 batch 624/1044 - loss 4.32694693 - lr 0.0010 - time 302.93s
2024-07-29 11:13:42,592 batch 728/1044 - loss 4.32466568 - lr 0.0010 - time 352.86s
2024-07-29 11:14:33,972 batch 832/1044 - loss 4.32141261 - lr 0.0010 - time 404.24s
2024-07-29 11:15:25,165 batch 936/1044 - loss 4.31979928 - lr 0.0010 - time 455.44s
2024-07-29 11:16:12,909 batch 1040/1044 - loss 4.31766206 - lr 0.0010 - time 503.18s
2024-07-29 11:16:14,722 ----------------------------------------------------------------------------------------------------
2024-07-29 11:16:14,723 EPOCH 4 DONE
2024-07-29 11:16:27,582 TRAIN Loss:       4.3179
2024-07-29 11:16:27,583 DEV   Loss:       5.5061
2024-07-29 11:16:27,583 DEV   Perplexity: 246.1892
2024-07-29 11:16:27,583 New best score!
2024-07-29 11:16:27,584 ----------------------------------------------------------------------------------------------------
2024-07-29 11:16:27,584 EPOCH 5
2024-07-29 11:17:19,390 batch 104/1044 - loss 4.10156497 - lr 0.0010 - time 51.81s
2024-07-29 11:18:08,736 batch 208/1044 - loss 4.09617276 - lr 0.0010 - time 101.15s
2024-07-29 11:18:57,718 batch 312/1044 - loss 4.10813874 - lr 0.0010 - time 150.13s
2024-07-29 11:19:53,482 batch 416/1044 - loss 4.11702962 - lr 0.0010 - time 205.90s
2024-07-29 11:20:43,773 batch 520/1044 - loss 4.11525546 - lr 0.0010 - time 256.19s
2024-07-29 11:21:34,723 batch 624/1044 - loss 4.11790551 - lr 0.0010 - time 307.14s
2024-07-29 11:22:23,462 batch 728/1044 - loss 4.12154044 - lr 0.0010 - time 355.88s
2024-07-29 11:23:11,115 batch 832/1044 - loss 4.12138260 - lr 0.0010 - time 403.53s
2024-07-29 11:24:00,337 batch 936/1044 - loss 4.12506736 - lr 0.0010 - time 452.75s
2024-07-29 11:24:50,964 batch 1040/1044 - loss 4.12429898 - lr 0.0010 - time 503.38s
2024-07-29 11:24:52,983 ----------------------------------------------------------------------------------------------------
2024-07-29 11:24:52,984 EPOCH 5 DONE
2024-07-29 11:25:05,723 TRAIN Loss:       4.1247
2024-07-29 11:25:05,723 DEV   Loss:       5.4289
2024-07-29 11:25:05,723 DEV   Perplexity: 227.8912
2024-07-29 11:25:05,723 New best score!
2024-07-29 11:25:05,724 ----------------------------------------------------------------------------------------------------
2024-07-29 11:25:05,724 EPOCH 6
2024-07-29 11:25:59,338 batch 104/1044 - loss 3.89071036 - lr 0.0010 - time 53.61s
2024-07-29 11:26:50,131 batch 208/1044 - loss 3.91066583 - lr 0.0010 - time 104.41s
2024-07-29 11:27:39,605 batch 312/1044 - loss 3.92001536 - lr 0.0010 - time 153.88s
2024-07-29 11:28:26,705 batch 416/1044 - loss 3.91852045 - lr 0.0010 - time 200.98s
2024-07-29 11:29:21,163 batch 520/1044 - loss 3.92671625 - lr 0.0010 - time 255.44s
2024-07-29 11:30:09,942 batch 624/1044 - loss 3.93454336 - lr 0.0010 - time 304.22s
2024-07-29 11:31:02,918 batch 728/1044 - loss 3.94077764 - lr 0.0010 - time 357.19s
2024-07-29 11:31:53,528 batch 832/1044 - loss 3.94676249 - lr 0.0010 - time 407.80s
2024-07-29 11:32:41,961 batch 936/1044 - loss 3.95203299 - lr 0.0010 - time 456.24s
2024-07-29 11:33:31,394 batch 1040/1044 - loss 3.95468071 - lr 0.0010 - time 505.67s
2024-07-29 11:33:33,260 ----------------------------------------------------------------------------------------------------
2024-07-29 11:33:33,261 EPOCH 6 DONE
2024-07-29 11:33:46,131 TRAIN Loss:       3.9546
2024-07-29 11:33:46,131 DEV   Loss:       5.4532
2024-07-29 11:33:46,131 DEV   Perplexity: 233.4940
2024-07-29 11:33:46,131 No improvement for 1 epoch(s)
2024-07-29 11:33:46,131 ----------------------------------------------------------------------------------------------------
2024-07-29 11:33:46,131 EPOCH 7
2024-07-29 11:34:36,037 batch 104/1044 - loss 3.75202120 - lr 0.0010 - time 49.91s
2024-07-29 11:35:23,211 batch 208/1044 - loss 3.76428310 - lr 0.0010 - time 97.08s
2024-07-29 11:36:15,737 batch 312/1044 - loss 3.76220069 - lr 0.0010 - time 149.61s
2024-07-29 11:37:06,599 batch 416/1044 - loss 3.76866076 - lr 0.0010 - time 200.47s
2024-07-29 11:37:57,102 batch 520/1044 - loss 3.78008501 - lr 0.0010 - time 250.97s
2024-07-29 11:38:48,470 batch 624/1044 - loss 3.78899940 - lr 0.0010 - time 302.34s
2024-07-29 11:39:37,561 batch 728/1044 - loss 3.79675758 - lr 0.0010 - time 351.43s
2024-07-29 11:40:26,884 batch 832/1044 - loss 3.80079628 - lr 0.0010 - time 400.75s
2024-07-29 11:41:15,559 batch 936/1044 - loss 3.80748021 - lr 0.0010 - time 449.43s
2024-07-29 11:42:04,567 batch 1040/1044 - loss 3.81313193 - lr 0.0010 - time 498.44s
2024-07-29 11:42:06,383 ----------------------------------------------------------------------------------------------------
2024-07-29 11:42:06,384 EPOCH 7 DONE
2024-07-29 11:42:19,326 TRAIN Loss:       3.8132
2024-07-29 11:42:19,326 DEV   Loss:       5.4459
2024-07-29 11:42:19,326 DEV   Perplexity: 231.8110
2024-07-29 11:42:19,326 No improvement for 2 epoch(s)
2024-07-29 11:42:19,326 ----------------------------------------------------------------------------------------------------
2024-07-29 11:42:19,326 EPOCH 8
2024-07-29 11:43:11,878 batch 104/1044 - loss 3.64161499 - lr 0.0010 - time 52.55s
2024-07-29 11:44:02,812 batch 208/1044 - loss 3.66078973 - lr 0.0010 - time 103.49s
2024-07-29 11:44:54,581 batch 312/1044 - loss 3.66940373 - lr 0.0010 - time 155.26s
2024-07-29 11:45:42,502 batch 416/1044 - loss 3.67283917 - lr 0.0010 - time 203.18s
2024-07-29 11:46:32,748 batch 520/1044 - loss 3.67896443 - lr 0.0010 - time 253.42s
2024-07-29 11:47:19,611 batch 624/1044 - loss 3.68378819 - lr 0.0010 - time 300.28s
2024-07-29 11:48:12,844 batch 728/1044 - loss 3.68957532 - lr 0.0010 - time 353.52s
2024-07-29 11:49:01,503 batch 832/1044 - loss 3.69448218 - lr 0.0010 - time 402.18s
2024-07-29 11:49:51,030 batch 936/1044 - loss 3.70412089 - lr 0.0010 - time 451.70s
2024-07-29 11:50:42,780 batch 1040/1044 - loss 3.70785985 - lr 0.0010 - time 503.45s
2024-07-29 11:50:44,516 ----------------------------------------------------------------------------------------------------
2024-07-29 11:50:44,517 EPOCH 8 DONE
2024-07-29 11:50:57,332 TRAIN Loss:       3.7082
2024-07-29 11:50:57,332 DEV   Loss:       5.4909
2024-07-29 11:50:57,332 DEV   Perplexity: 242.4722
2024-07-29 11:50:57,332 No improvement for 3 epoch(s)
2024-07-29 11:50:57,332 ----------------------------------------------------------------------------------------------------
2024-07-29 11:50:57,332 EPOCH 9
2024-07-29 11:51:48,335 batch 104/1044 - loss 3.51649693 - lr 0.0010 - time 51.00s
2024-07-29 11:52:36,237 batch 208/1044 - loss 3.53223671 - lr 0.0010 - time 98.90s
2024-07-29 11:53:24,958 batch 312/1044 - loss 3.54294675 - lr 0.0010 - time 147.63s
2024-07-29 11:54:15,450 batch 416/1044 - loss 3.55195141 - lr 0.0010 - time 198.12s
2024-07-29 11:55:05,536 batch 520/1044 - loss 3.55877144 - lr 0.0010 - time 248.20s
2024-07-29 11:56:00,052 batch 624/1044 - loss 3.56457711 - lr 0.0010 - time 302.72s
2024-07-29 11:56:51,205 batch 728/1044 - loss 3.57586517 - lr 0.0010 - time 353.87s
2024-07-29 11:57:39,310 batch 832/1044 - loss 3.58044331 - lr 0.0010 - time 401.98s
2024-07-29 11:58:31,929 batch 936/1044 - loss 3.58557119 - lr 0.0010 - time 454.60s
2024-07-29 11:59:20,796 batch 1040/1044 - loss 3.59259140 - lr 0.0010 - time 503.46s
2024-07-29 11:59:22,537 ----------------------------------------------------------------------------------------------------
2024-07-29 11:59:22,537 EPOCH 9 DONE
2024-07-29 11:59:35,430 TRAIN Loss:       3.5929
2024-07-29 11:59:35,430 DEV   Loss:       5.5051
2024-07-29 11:59:35,430 DEV   Perplexity: 245.9499
2024-07-29 11:59:35,430 No improvement for 4 epoch(s)
2024-07-29 11:59:35,430 ----------------------------------------------------------------------------------------------------
2024-07-29 11:59:35,430 EPOCH 10
2024-07-29 12:00:26,025 batch 104/1044 - loss 3.40724007 - lr 0.0001 - time 50.59s
2024-07-29 12:01:13,195 batch 208/1044 - loss 3.39549031 - lr 0.0001 - time 97.76s
2024-07-29 12:02:02,790 batch 312/1044 - loss 3.38177447 - lr 0.0001 - time 147.36s
2024-07-29 12:02:54,277 batch 416/1044 - loss 3.37592916 - lr 0.0001 - time 198.85s
2024-07-29 12:03:45,463 batch 520/1044 - loss 3.37005038 - lr 0.0001 - time 250.03s
2024-07-29 12:04:33,863 batch 624/1044 - loss 3.37062476 - lr 0.0001 - time 298.43s
2024-07-29 12:05:26,246 batch 728/1044 - loss 3.37177335 - lr 0.0001 - time 350.82s
2024-07-29 12:06:15,720 batch 832/1044 - loss 3.37051947 - lr 0.0001 - time 400.29s
2024-07-29 12:07:04,090 batch 936/1044 - loss 3.36859261 - lr 0.0001 - time 448.66s
2024-07-29 12:07:56,434 batch 1040/1044 - loss 3.36913985 - lr 0.0001 - time 501.00s
2024-07-29 12:07:58,775 ----------------------------------------------------------------------------------------------------
2024-07-29 12:07:58,776 EPOCH 10 DONE
2024-07-29 12:08:11,643 TRAIN Loss:       3.3688
2024-07-29 12:08:11,644 DEV   Loss:       5.5078
2024-07-29 12:08:11,644 DEV   Perplexity: 246.6117
2024-07-29 12:08:11,644 No improvement for 5 epoch(s)
2024-07-29 12:08:11,644 Patience reached: Terminating model training due to early stopping
2024-07-29 12:08:11,644 ----------------------------------------------------------------------------------------------------
2024-07-29 12:08:11,644 Finished Training
2024-07-29 12:08:36,837 TEST  Perplexity: 227.3162
2024-07-29 12:11:57,875 TEST  BLEU = 12.94 77.1/52.4/11.1/0.6 (BP = 1.000 ratio = 1.000 hyp_len = 83 ref_len = 83)