|
2024-07-29 10:41:51,295 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:41:51,295 Training Model |
|
2024-07-29 10:41:51,295 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:41:51,295 Translator( |
|
  (encoder): EncoderLSTM( |
|
    (embedding): Embedding(14303, 300, padding_idx=0) |
|
    (dropout): Dropout(p=0.1, inplace=False) |
|
    (lstm): LSTM(300, 512, batch_first=True, bidirectional=True) |
|
  ) |
|
  (decoder): DecoderLSTM( |
|
    (embedding): Embedding(22834, 300, padding_idx=0) |
|
    (dropout): Dropout(p=0.1, inplace=False) |
|
    (lstm): LSTM(300, 1024, batch_first=True) |
|
    (hidden2vocab): Linear(in_features=1024, out_features=22834, bias=True) |
|
    (log_softmax): LogSoftmax(dim=-1) |
|
  ) |
|
) |
|
2024-07-29 10:41:51,295 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:41:51,295 Training Hyperparameters: |
|
2024-07-29 10:41:51,295 - max_epochs: 10 |
|
2024-07-29 10:41:51,295 - learning_rate: 0.001 |
|
2024-07-29 10:41:51,295 - batch_size: 128 |
|
2024-07-29 10:41:51,295 - patience: 5 |
|
2024-07-29 10:41:51,295 - scheduler_patience: 3 |
|
2024-07-29 10:41:51,295 - teacher_forcing_ratio: 0.5 |
|
2024-07-29 10:41:51,295 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:41:51,295 Computational Parameters: |
|
2024-07-29 10:41:51,295 - num_workers: 4 |
|
2024-07-29 10:41:51,295 - device: device(type='cuda', index=0) |
|
2024-07-29 10:41:51,295 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:41:51,295 Dataset Splits: |
|
2024-07-29 10:41:51,295 - train: 133623 data points |
|
2024-07-29 10:41:51,295 - dev: 19090 data points |
|
2024-07-29 10:41:51,296 - test: 38179 data points |
|
2024-07-29 10:41:51,296 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:41:51,296 EPOCH 1 |
|
2024-07-29 10:42:43,980 batch 104/1044 - loss 6.56599054 - lr 0.0010 - time 52.68s |
|
2024-07-29 10:43:35,196 batch 208/1044 - loss 6.28009422 - lr 0.0010 - time 103.90s |
|
2024-07-29 10:44:25,168 batch 312/1044 - loss 6.11249907 - lr 0.0010 - time 153.87s |
|
2024-07-29 10:45:15,557 batch 416/1044 - loss 5.99013720 - lr 0.0010 - time 204.26s |
|
2024-07-29 10:46:02,970 batch 520/1044 - loss 5.89236221 - lr 0.0010 - time 251.67s |
|
2024-07-29 10:46:51,664 batch 624/1044 - loss 5.81345889 - lr 0.0010 - time 300.37s |
|
2024-07-29 10:47:42,555 batch 728/1044 - loss 5.74780520 - lr 0.0010 - time 351.26s |
|
2024-07-29 10:48:33,483 batch 832/1044 - loss 5.69103370 - lr 0.0010 - time 402.19s |
|
2024-07-29 10:49:22,573 batch 936/1044 - loss 5.63910694 - lr 0.0010 - time 451.28s |
|
2024-07-29 10:50:14,318 batch 1040/1044 - loss 5.59255154 - lr 0.0010 - time 503.02s |
|
2024-07-29 10:50:16,234 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:50:16,235 EPOCH 1 DONE |
|
2024-07-29 10:50:29,064 TRAIN Loss: 5.5908 |
|
2024-07-29 10:50:29,064 DEV Loss: 5.7897 |
|
2024-07-29 10:50:29,064 DEV Perplexity: 326.8995 |
|
2024-07-29 10:50:29,064 New best score! |
|
2024-07-29 10:50:29,065 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:50:29,065 EPOCH 2 |
|
2024-07-29 10:51:19,942 batch 104/1044 - loss 5.02739687 - lr 0.0010 - time 50.88s |
|
2024-07-29 10:52:09,803 batch 208/1044 - loss 5.01800949 - lr 0.0010 - time 100.74s |
|
2024-07-29 10:53:04,478 batch 312/1044 - loss 5.00509294 - lr 0.0010 - time 155.41s |
|
2024-07-29 10:53:53,594 batch 416/1044 - loss 4.98731034 - lr 0.0010 - time 204.53s |
|
2024-07-29 10:54:43,356 batch 520/1044 - loss 4.97219816 - lr 0.0010 - time 254.29s |
|
2024-07-29 10:55:33,584 batch 624/1044 - loss 4.96074294 - lr 0.0010 - time 304.52s |
|
2024-07-29 10:56:24,225 batch 728/1044 - loss 4.94472581 - lr 0.0010 - time 355.16s |
|
2024-07-29 10:57:14,355 batch 832/1044 - loss 4.93236568 - lr 0.0010 - time 405.29s |
|
2024-07-29 10:58:06,416 batch 936/1044 - loss 4.91768116 - lr 0.0010 - time 457.35s |
|
2024-07-29 10:58:55,326 batch 1040/1044 - loss 4.90590350 - lr 0.0010 - time 506.26s |
|
2024-07-29 10:58:57,793 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:58:57,794 EPOCH 2 DONE |
|
2024-07-29 10:59:10,716 TRAIN Loss: 4.9057 |
|
2024-07-29 10:59:10,716 DEV Loss: 5.7132 |
|
2024-07-29 10:59:10,716 DEV Perplexity: 302.8460 |
|
2024-07-29 10:59:10,716 New best score! |
|
2024-07-29 10:59:10,717 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:59:10,717 EPOCH 3 |
|
2024-07-29 11:00:00,967 batch 104/1044 - loss 4.61237117 - lr 0.0010 - time 50.25s |
|
2024-07-29 11:00:50,203 batch 208/1044 - loss 4.62395983 - lr 0.0010 - time 99.49s |
|
2024-07-29 11:01:40,773 batch 312/1044 - loss 4.61521491 - lr 0.0010 - time 150.06s |
|
2024-07-29 11:02:39,135 batch 416/1044 - loss 4.61224452 - lr 0.0010 - time 208.42s |
|
2024-07-29 11:03:30,115 batch 520/1044 - loss 4.60275617 - lr 0.0010 - time 259.40s |
|
2024-07-29 11:04:16,479 batch 624/1044 - loss 4.59871728 - lr 0.0010 - time 305.76s |
|
2024-07-29 11:05:07,213 batch 728/1044 - loss 4.59086315 - lr 0.0010 - time 356.50s |
|
2024-07-29 11:05:57,731 batch 832/1044 - loss 4.58489406 - lr 0.0010 - time 407.01s |
|
2024-07-29 11:06:46,315 batch 936/1044 - loss 4.57758889 - lr 0.0010 - time 455.60s |
|
2024-07-29 11:07:34,550 batch 1040/1044 - loss 4.56970717 - lr 0.0010 - time 503.83s |
|
2024-07-29 11:07:36,805 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:07:36,806 EPOCH 3 DONE |
|
2024-07-29 11:07:49,727 TRAIN Loss: 4.5697 |
|
2024-07-29 11:07:49,728 DEV Loss: 5.5772 |
|
2024-07-29 11:07:49,728 DEV Perplexity: 264.3216 |
|
2024-07-29 11:07:49,728 New best score! |
|
2024-07-29 11:07:49,729 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:07:49,729 EPOCH 4 |
|
2024-07-29 11:08:37,495 batch 104/1044 - loss 4.31793029 - lr 0.0010 - time 47.77s |
|
2024-07-29 11:09:28,126 batch 208/1044 - loss 4.31731233 - lr 0.0010 - time 98.40s |
|
2024-07-29 11:10:17,772 batch 312/1044 - loss 4.31730254 - lr 0.0010 - time 148.04s |
|
2024-07-29 11:11:11,397 batch 416/1044 - loss 4.31653301 - lr 0.0010 - time 201.67s |
|
2024-07-29 11:12:01,968 batch 520/1044 - loss 4.32179287 - lr 0.0010 - time 252.24s |
|
2024-07-29 11:12:52,660 batch 624/1044 - loss 4.32694693 - lr 0.0010 - time 302.93s |
|
2024-07-29 11:13:42,592 batch 728/1044 - loss 4.32466568 - lr 0.0010 - time 352.86s |
|
2024-07-29 11:14:33,972 batch 832/1044 - loss 4.32141261 - lr 0.0010 - time 404.24s |
|
2024-07-29 11:15:25,165 batch 936/1044 - loss 4.31979928 - lr 0.0010 - time 455.44s |
|
2024-07-29 11:16:12,909 batch 1040/1044 - loss 4.31766206 - lr 0.0010 - time 503.18s |
|
2024-07-29 11:16:14,722 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:16:14,723 EPOCH 4 DONE |
|
2024-07-29 11:16:27,582 TRAIN Loss: 4.3179 |
|
2024-07-29 11:16:27,583 DEV Loss: 5.5061 |
|
2024-07-29 11:16:27,583 DEV Perplexity: 246.1892 |
|
2024-07-29 11:16:27,583 New best score! |
|
2024-07-29 11:16:27,584 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:16:27,584 EPOCH 5 |
|
2024-07-29 11:17:19,390 batch 104/1044 - loss 4.10156497 - lr 0.0010 - time 51.81s |
|
2024-07-29 11:18:08,736 batch 208/1044 - loss 4.09617276 - lr 0.0010 - time 101.15s |
|
2024-07-29 11:18:57,718 batch 312/1044 - loss 4.10813874 - lr 0.0010 - time 150.13s |
|
2024-07-29 11:19:53,482 batch 416/1044 - loss 4.11702962 - lr 0.0010 - time 205.90s |
|
2024-07-29 11:20:43,773 batch 520/1044 - loss 4.11525546 - lr 0.0010 - time 256.19s |
|
2024-07-29 11:21:34,723 batch 624/1044 - loss 4.11790551 - lr 0.0010 - time 307.14s |
|
2024-07-29 11:22:23,462 batch 728/1044 - loss 4.12154044 - lr 0.0010 - time 355.88s |
|
2024-07-29 11:23:11,115 batch 832/1044 - loss 4.12138260 - lr 0.0010 - time 403.53s |
|
2024-07-29 11:24:00,337 batch 936/1044 - loss 4.12506736 - lr 0.0010 - time 452.75s |
|
2024-07-29 11:24:50,964 batch 1040/1044 - loss 4.12429898 - lr 0.0010 - time 503.38s |
|
2024-07-29 11:24:52,983 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:24:52,984 EPOCH 5 DONE |
|
2024-07-29 11:25:05,723 TRAIN Loss: 4.1247 |
|
2024-07-29 11:25:05,723 DEV Loss: 5.4289 |
|
2024-07-29 11:25:05,723 DEV Perplexity: 227.8912 |
|
2024-07-29 11:25:05,723 New best score! |
|
2024-07-29 11:25:05,724 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:25:05,724 EPOCH 6 |
|
2024-07-29 11:25:59,338 batch 104/1044 - loss 3.89071036 - lr 0.0010 - time 53.61s |
|
2024-07-29 11:26:50,131 batch 208/1044 - loss 3.91066583 - lr 0.0010 - time 104.41s |
|
2024-07-29 11:27:39,605 batch 312/1044 - loss 3.92001536 - lr 0.0010 - time 153.88s |
|
2024-07-29 11:28:26,705 batch 416/1044 - loss 3.91852045 - lr 0.0010 - time 200.98s |
|
2024-07-29 11:29:21,163 batch 520/1044 - loss 3.92671625 - lr 0.0010 - time 255.44s |
|
2024-07-29 11:30:09,942 batch 624/1044 - loss 3.93454336 - lr 0.0010 - time 304.22s |
|
2024-07-29 11:31:02,918 batch 728/1044 - loss 3.94077764 - lr 0.0010 - time 357.19s |
|
2024-07-29 11:31:53,528 batch 832/1044 - loss 3.94676249 - lr 0.0010 - time 407.80s |
|
2024-07-29 11:32:41,961 batch 936/1044 - loss 3.95203299 - lr 0.0010 - time 456.24s |
|
2024-07-29 11:33:31,394 batch 1040/1044 - loss 3.95468071 - lr 0.0010 - time 505.67s |
|
2024-07-29 11:33:33,260 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:33:33,261 EPOCH 6 DONE |
|
2024-07-29 11:33:46,131 TRAIN Loss: 3.9546 |
|
2024-07-29 11:33:46,131 DEV Loss: 5.4532 |
|
2024-07-29 11:33:46,131 DEV Perplexity: 233.4940 |
|
2024-07-29 11:33:46,131 No improvement for 1 epoch(s) |
|
2024-07-29 11:33:46,131 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:33:46,131 EPOCH 7 |
|
2024-07-29 11:34:36,037 batch 104/1044 - loss 3.75202120 - lr 0.0010 - time 49.91s |
|
2024-07-29 11:35:23,211 batch 208/1044 - loss 3.76428310 - lr 0.0010 - time 97.08s |
|
2024-07-29 11:36:15,737 batch 312/1044 - loss 3.76220069 - lr 0.0010 - time 149.61s |
|
2024-07-29 11:37:06,599 batch 416/1044 - loss 3.76866076 - lr 0.0010 - time 200.47s |
|
2024-07-29 11:37:57,102 batch 520/1044 - loss 3.78008501 - lr 0.0010 - time 250.97s |
|
2024-07-29 11:38:48,470 batch 624/1044 - loss 3.78899940 - lr 0.0010 - time 302.34s |
|
2024-07-29 11:39:37,561 batch 728/1044 - loss 3.79675758 - lr 0.0010 - time 351.43s |
|
2024-07-29 11:40:26,884 batch 832/1044 - loss 3.80079628 - lr 0.0010 - time 400.75s |
|
2024-07-29 11:41:15,559 batch 936/1044 - loss 3.80748021 - lr 0.0010 - time 449.43s |
|
2024-07-29 11:42:04,567 batch 1040/1044 - loss 3.81313193 - lr 0.0010 - time 498.44s |
|
2024-07-29 11:42:06,383 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:42:06,384 EPOCH 7 DONE |
|
2024-07-29 11:42:19,326 TRAIN Loss: 3.8132 |
|
2024-07-29 11:42:19,326 DEV Loss: 5.4459 |
|
2024-07-29 11:42:19,326 DEV Perplexity: 231.8110 |
|
2024-07-29 11:42:19,326 No improvement for 2 epoch(s) |
|
2024-07-29 11:42:19,326 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:42:19,326 EPOCH 8 |
|
2024-07-29 11:43:11,878 batch 104/1044 - loss 3.64161499 - lr 0.0010 - time 52.55s |
|
2024-07-29 11:44:02,812 batch 208/1044 - loss 3.66078973 - lr 0.0010 - time 103.49s |
|
2024-07-29 11:44:54,581 batch 312/1044 - loss 3.66940373 - lr 0.0010 - time 155.26s |
|
2024-07-29 11:45:42,502 batch 416/1044 - loss 3.67283917 - lr 0.0010 - time 203.18s |
|
2024-07-29 11:46:32,748 batch 520/1044 - loss 3.67896443 - lr 0.0010 - time 253.42s |
|
2024-07-29 11:47:19,611 batch 624/1044 - loss 3.68378819 - lr 0.0010 - time 300.28s |
|
2024-07-29 11:48:12,844 batch 728/1044 - loss 3.68957532 - lr 0.0010 - time 353.52s |
|
2024-07-29 11:49:01,503 batch 832/1044 - loss 3.69448218 - lr 0.0010 - time 402.18s |
|
2024-07-29 11:49:51,030 batch 936/1044 - loss 3.70412089 - lr 0.0010 - time 451.70s |
|
2024-07-29 11:50:42,780 batch 1040/1044 - loss 3.70785985 - lr 0.0010 - time 503.45s |
|
2024-07-29 11:50:44,516 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:50:44,517 EPOCH 8 DONE |
|
2024-07-29 11:50:57,332 TRAIN Loss: 3.7082 |
|
2024-07-29 11:50:57,332 DEV Loss: 5.4909 |
|
2024-07-29 11:50:57,332 DEV Perplexity: 242.4722 |
|
2024-07-29 11:50:57,332 No improvement for 3 epoch(s) |
|
2024-07-29 11:50:57,332 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:50:57,332 EPOCH 9 |
|
2024-07-29 11:51:48,335 batch 104/1044 - loss 3.51649693 - lr 0.0010 - time 51.00s |
|
2024-07-29 11:52:36,237 batch 208/1044 - loss 3.53223671 - lr 0.0010 - time 98.90s |
|
2024-07-29 11:53:24,958 batch 312/1044 - loss 3.54294675 - lr 0.0010 - time 147.63s |
|
2024-07-29 11:54:15,450 batch 416/1044 - loss 3.55195141 - lr 0.0010 - time 198.12s |
|
2024-07-29 11:55:05,536 batch 520/1044 - loss 3.55877144 - lr 0.0010 - time 248.20s |
|
2024-07-29 11:56:00,052 batch 624/1044 - loss 3.56457711 - lr 0.0010 - time 302.72s |
|
2024-07-29 11:56:51,205 batch 728/1044 - loss 3.57586517 - lr 0.0010 - time 353.87s |
|
2024-07-29 11:57:39,310 batch 832/1044 - loss 3.58044331 - lr 0.0010 - time 401.98s |
|
2024-07-29 11:58:31,929 batch 936/1044 - loss 3.58557119 - lr 0.0010 - time 454.60s |
|
2024-07-29 11:59:20,796 batch 1040/1044 - loss 3.59259140 - lr 0.0010 - time 503.46s |
|
2024-07-29 11:59:22,537 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:59:22,537 EPOCH 9 DONE |
|
2024-07-29 11:59:35,430 TRAIN Loss: 3.5929 |
|
2024-07-29 11:59:35,430 DEV Loss: 5.5051 |
|
2024-07-29 11:59:35,430 DEV Perplexity: 245.9499 |
|
2024-07-29 11:59:35,430 No improvement for 4 epoch(s) |
|
2024-07-29 11:59:35,430 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 11:59:35,430 EPOCH 10 |
|
2024-07-29 12:00:26,025 batch 104/1044 - loss 3.40724007 - lr 0.0001 - time 50.59s |
|
2024-07-29 12:01:13,195 batch 208/1044 - loss 3.39549031 - lr 0.0001 - time 97.76s |
|
2024-07-29 12:02:02,790 batch 312/1044 - loss 3.38177447 - lr 0.0001 - time 147.36s |
|
2024-07-29 12:02:54,277 batch 416/1044 - loss 3.37592916 - lr 0.0001 - time 198.85s |
|
2024-07-29 12:03:45,463 batch 520/1044 - loss 3.37005038 - lr 0.0001 - time 250.03s |
|
2024-07-29 12:04:33,863 batch 624/1044 - loss 3.37062476 - lr 0.0001 - time 298.43s |
|
2024-07-29 12:05:26,246 batch 728/1044 - loss 3.37177335 - lr 0.0001 - time 350.82s |
|
2024-07-29 12:06:15,720 batch 832/1044 - loss 3.37051947 - lr 0.0001 - time 400.29s |
|
2024-07-29 12:07:04,090 batch 936/1044 - loss 3.36859261 - lr 0.0001 - time 448.66s |
|
2024-07-29 12:07:56,434 batch 1040/1044 - loss 3.36913985 - lr 0.0001 - time 501.00s |
|
2024-07-29 12:07:58,775 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 12:07:58,776 EPOCH 10 DONE |
|
2024-07-29 12:08:11,643 TRAIN Loss: 3.3688 |
|
2024-07-29 12:08:11,644 DEV Loss: 5.5078 |
|
2024-07-29 12:08:11,644 DEV Perplexity: 246.6117 |
|
2024-07-29 12:08:11,644 No improvement for 5 epoch(s) |
|
2024-07-29 12:08:11,644 Patience reached: Terminating model training due to early stopping |
|
2024-07-29 12:08:11,644 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 12:08:11,644 Finished Training |
|
2024-07-29 12:08:36,837 TEST Perplexity: 227.3162 |
|
2024-07-29 12:11:57,875 TEST BLEU = 12.94 77.1/52.4/11.1/0.6 (BP = 1.000 ratio = 1.000 hyp_len = 83 ref_len = 83) |
|
|