|
2024-07-29 09:42:42,423 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 09:42:42,423 Training Model |
|
2024-07-29 09:42:42,423 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 09:42:42,423 Translator( |
|
(encoder): EncoderLSTM( |
|
(embedding): Embedding(22834, 300, padding_idx=0) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(lstm): LSTM(300, 512, batch_first=True) |
|
) |
|
(decoder): DecoderLSTM( |
|
(embedding): Embedding(14303, 300, padding_idx=0) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(lstm): LSTM(300, 512, batch_first=True) |
|
(attention): DotProductAttention( |
|
(softmax): Softmax(dim=-1) |
|
(combined2hidden): Sequential( |
|
(0): Linear(in_features=1024, out_features=512, bias=True) |
|
(1): ReLU() |
|
) |
|
) |
|
(hidden2vocab): Linear(in_features=512, out_features=14303, bias=True) |
|
(log_softmax): LogSoftmax(dim=-1) |
|
) |
|
) |
|
2024-07-29 09:42:42,423 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 09:42:42,423 Training Hyperparameters: |
|
2024-07-29 09:42:42,423 - max_epochs: 10 |
|
2024-07-29 09:42:42,423 - learning_rate: 0.001 |
|
2024-07-29 09:42:42,423 - batch_size: 128 |
|
2024-07-29 09:42:42,423 - patience: 5 |
|
2024-07-29 09:42:42,423 - scheduler_patience: 3 |
|
2024-07-29 09:42:42,423 - teacher_forcing_ratio: 0.5 |
|
2024-07-29 09:42:42,423 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 09:42:42,423 Computational Parameters: |
|
2024-07-29 09:42:42,423 - num_workers: 4 |
|
2024-07-29 09:42:42,423 - device: device(type='cuda', index=0) |
|
2024-07-29 09:42:42,423 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 09:42:42,423 Dataset Splits: |
|
2024-07-29 09:42:42,423 - train: 133623 data points |
|
2024-07-29 09:42:42,423 - dev: 19090 data points |
|
2024-07-29 09:42:42,423 - test: 38179 data points |
|
2024-07-29 09:42:42,424 ---------------------------------------------------------------------------------------------------- |
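The batch counter in the epochs below runs to 1044, which matches the logged train split and batch size if the final, smaller batch is kept. A minimal sketch of that arithmetic follows. The variable names are illustrative, and the assumption that the last partial batch is not dropped is inferred from the numbers, not stated in the log.

```python
import math

train_size = 133623  # "train: 133623 data points" in the log
batch_size = 128     # "batch_size: 128" in the log

# Assumption: the last, partially filled batch is kept rather than dropped.
batches_per_epoch = math.ceil(train_size / batch_size)

# Size of that final batch under the same assumption.
last_batch_size = train_size - (batches_per_epoch - 1) * batch_size

# The log prints progress at batches 104, 208, ..., 1040, which is
# consistent with reporting every tenth of an epoch (an inference).
log_every = batches_per_epoch // 10

print(batches_per_epoch, last_batch_size, log_every)
```

Under these assumptions the result agrees with the `batch .../1044` counter and the 104-batch reporting interval seen in the log.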
|
2024-07-29 09:42:42,424 EPOCH 1 |
|
2024-07-29 09:43:17,154 batch 104/1044 - loss 6.42034669 - lr 0.0010 - time 34.73s |
|
2024-07-29 09:43:54,707 batch 208/1044 - loss 6.17063731 - lr 0.0010 - time 72.28s |
|
2024-07-29 09:44:32,541 batch 312/1044 - loss 6.00517355 - lr 0.0010 - time 110.12s |
|
2024-07-29 09:45:10,795 batch 416/1044 - loss 5.87077612 - lr 0.0010 - time 148.37s |
|
2024-07-29 09:45:47,671 batch 520/1044 - loss 5.75463560 - lr 0.0010 - time 185.25s |
|
2024-07-29 09:46:24,949 batch 624/1044 - loss 5.65824632 - lr 0.0010 - time 222.53s |
|
2024-07-29 09:47:03,552 batch 728/1044 - loss 5.56939856 - lr 0.0010 - time 261.13s |
|
2024-07-29 09:47:40,687 batch 832/1044 - loss 5.49128213 - lr 0.0010 - time 298.26s |
|
2024-07-29 09:48:18,127 batch 936/1044 - loss 5.41966415 - lr 0.0010 - time 335.70s |
|
2024-07-29 09:48:55,119 batch 1040/1044 - loss 5.35489992 - lr 0.0010 - time 372.70s |
|
2024-07-29 09:48:56,680 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 09:48:56,681 EPOCH 1 DONE |
|
2024-07-29 09:49:06,113 TRAIN Loss: 5.3525 |
|
2024-07-29 09:49:06,114 DEV Loss: 5.5692 |
|
2024-07-29 09:49:06,114 DEV Perplexity: 262.2315 |
|
2024-07-29 09:49:06,114 New best score! |
|
2024-07-29 09:49:06,115 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 09:49:06,115 EPOCH 2 |
|
2024-07-29 09:49:41,222 batch 104/1044 - loss 4.62738995 - lr 0.0010 - time 35.11s |
|
2024-07-29 09:50:17,864 batch 208/1044 - loss 4.59759969 - lr 0.0010 - time 71.75s |
|
2024-07-29 09:50:53,411 batch 312/1044 - loss 4.57657494 - lr 0.0010 - time 107.30s |
|
2024-07-29 09:51:31,209 batch 416/1044 - loss 4.54348163 - lr 0.0010 - time 145.09s |
|
2024-07-29 09:52:11,697 batch 520/1044 - loss 4.51823422 - lr 0.0010 - time 185.58s |
|
2024-07-29 09:52:48,926 batch 624/1044 - loss 4.49001330 - lr 0.0010 - time 222.81s |
|
2024-07-29 09:53:24,588 batch 728/1044 - loss 4.46876206 - lr 0.0010 - time 258.47s |
|
2024-07-29 09:54:02,468 batch 832/1044 - loss 4.44477118 - lr 0.0010 - time 296.35s |
|
2024-07-29 09:54:39,911 batch 936/1044 - loss 4.42371725 - lr 0.0010 - time 333.80s |
|
2024-07-29 09:55:16,492 batch 1040/1044 - loss 4.40068238 - lr 0.0010 - time 370.38s |
|
2024-07-29 09:55:18,277 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 09:55:18,279 EPOCH 2 DONE |
|
2024-07-29 09:55:27,546 TRAIN Loss: 4.3997 |
|
2024-07-29 09:55:27,546 DEV Loss: 5.2857 |
|
2024-07-29 09:55:27,546 DEV Perplexity: 197.4908 |
|
2024-07-29 09:55:27,546 New best score! |
|
2024-07-29 09:55:27,547 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 09:55:27,547 EPOCH 3 |
|
2024-07-29 09:56:04,874 batch 104/1044 - loss 4.04292682 - lr 0.0010 - time 37.33s |
|
2024-07-29 09:56:44,240 batch 208/1044 - loss 4.04458403 - lr 0.0010 - time 76.69s |
|
2024-07-29 09:57:19,595 batch 312/1044 - loss 4.04015087 - lr 0.0010 - time 112.05s |
|
2024-07-29 09:57:58,341 batch 416/1044 - loss 4.03473626 - lr 0.0010 - time 150.79s |
|
2024-07-29 09:58:33,685 batch 520/1044 - loss 4.02294693 - lr 0.0010 - time 186.14s |
|
2024-07-29 09:59:09,374 batch 624/1044 - loss 4.00945110 - lr 0.0010 - time 221.83s |
|
2024-07-29 09:59:49,125 batch 728/1044 - loss 4.00042684 - lr 0.0010 - time 261.58s |
|
2024-07-29 10:00:26,299 batch 832/1044 - loss 3.99049270 - lr 0.0010 - time 298.75s |
|
2024-07-29 10:01:03,713 batch 936/1044 - loss 3.97934972 - lr 0.0010 - time 336.17s |
|
2024-07-29 10:01:40,625 batch 1040/1044 - loss 3.96891846 - lr 0.0010 - time 373.08s |
|
2024-07-29 10:01:41,787 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:01:41,789 EPOCH 3 DONE |
|
2024-07-29 10:01:51,163 TRAIN Loss: 3.9687 |
|
2024-07-29 10:01:51,163 DEV Loss: 5.2440 |
|
2024-07-29 10:01:51,163 DEV Perplexity: 189.4295 |
|
2024-07-29 10:01:51,163 New best score! |
|
2024-07-29 10:01:51,164 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:01:51,164 EPOCH 4 |
|
2024-07-29 10:02:31,057 batch 104/1044 - loss 3.74893653 - lr 0.0010 - time 39.89s |
|
2024-07-29 10:03:05,331 batch 208/1044 - loss 3.75399486 - lr 0.0010 - time 74.17s |
|
2024-07-29 10:03:41,466 batch 312/1044 - loss 3.75771751 - lr 0.0010 - time 110.30s |
|
2024-07-29 10:04:15,960 batch 416/1044 - loss 3.75979321 - lr 0.0010 - time 144.80s |
|
2024-07-29 10:04:55,428 batch 520/1044 - loss 3.75057765 - lr 0.0010 - time 184.26s |
|
2024-07-29 10:05:33,137 batch 624/1044 - loss 3.74305481 - lr 0.0010 - time 221.97s |
|
2024-07-29 10:06:09,059 batch 728/1044 - loss 3.73923583 - lr 0.0010 - time 257.89s |
|
2024-07-29 10:06:47,012 batch 832/1044 - loss 3.73675085 - lr 0.0010 - time 295.85s |
|
2024-07-29 10:07:23,641 batch 936/1044 - loss 3.73419790 - lr 0.0010 - time 332.48s |
|
2024-07-29 10:07:58,748 batch 1040/1044 - loss 3.72953442 - lr 0.0010 - time 367.58s |
|
2024-07-29 10:08:00,245 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:08:00,246 EPOCH 4 DONE |
|
2024-07-29 10:08:09,716 TRAIN Loss: 3.7292 |
|
2024-07-29 10:08:09,717 DEV Loss: 5.1546 |
|
2024-07-29 10:08:09,717 DEV Perplexity: 173.2260 |
|
2024-07-29 10:08:09,717 New best score! |
|
2024-07-29 10:08:09,718 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:08:09,718 EPOCH 5 |
|
2024-07-29 10:08:48,898 batch 104/1044 - loss 3.53810529 - lr 0.0010 - time 39.18s |
|
2024-07-29 10:09:24,261 batch 208/1044 - loss 3.54713277 - lr 0.0010 - time 74.54s |
|
2024-07-29 10:09:59,554 batch 312/1044 - loss 3.55520624 - lr 0.0010 - time 109.84s |
|
2024-07-29 10:10:35,964 batch 416/1044 - loss 3.54529557 - lr 0.0010 - time 146.25s |
|
2024-07-29 10:11:13,273 batch 520/1044 - loss 3.53952308 - lr 0.0010 - time 183.56s |
|
2024-07-29 10:11:49,699 batch 624/1044 - loss 3.53902453 - lr 0.0010 - time 219.98s |
|
2024-07-29 10:12:26,577 batch 728/1044 - loss 3.54207764 - lr 0.0010 - time 256.86s |
|
2024-07-29 10:13:03,988 batch 832/1044 - loss 3.54191658 - lr 0.0010 - time 294.27s |
|
2024-07-29 10:13:44,152 batch 936/1044 - loss 3.54287420 - lr 0.0010 - time 334.43s |
|
2024-07-29 10:14:19,848 batch 1040/1044 - loss 3.54355186 - lr 0.0010 - time 370.13s |
|
2024-07-29 10:14:21,679 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:14:21,680 EPOCH 5 DONE |
|
2024-07-29 10:14:31,157 TRAIN Loss: 3.5436 |
|
2024-07-29 10:14:31,157 DEV Loss: 5.1595 |
|
2024-07-29 10:14:31,157 DEV Perplexity: 174.0773 |
|
2024-07-29 10:14:31,157 No improvement for 1 epoch(s) |
|
2024-07-29 10:14:31,157 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:14:31,157 EPOCH 6 |
|
2024-07-29 10:15:09,004 batch 104/1044 - loss 3.37988193 - lr 0.0010 - time 37.85s |
|
2024-07-29 10:15:46,449 batch 208/1044 - loss 3.39972965 - lr 0.0010 - time 75.29s |
|
2024-07-29 10:16:23,877 batch 312/1044 - loss 3.41839841 - lr 0.0010 - time 112.72s |
|
2024-07-29 10:17:02,860 batch 416/1044 - loss 3.42049147 - lr 0.0010 - time 151.70s |
|
2024-07-29 10:17:39,715 batch 520/1044 - loss 3.42189572 - lr 0.0010 - time 188.56s |
|
2024-07-29 10:18:16,287 batch 624/1044 - loss 3.41934290 - lr 0.0010 - time 225.13s |
|
2024-07-29 10:18:49,350 batch 728/1044 - loss 3.42369204 - lr 0.0010 - time 258.19s |
|
2024-07-29 10:19:27,406 batch 832/1044 - loss 3.42245102 - lr 0.0010 - time 296.25s |
|
2024-07-29 10:20:04,324 batch 936/1044 - loss 3.42058108 - lr 0.0010 - time 333.17s |
|
2024-07-29 10:20:39,261 batch 1040/1044 - loss 3.42255051 - lr 0.0010 - time 368.10s |
|
2024-07-29 10:20:43,715 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:20:43,717 EPOCH 6 DONE |
|
2024-07-29 10:20:53,217 TRAIN Loss: 3.4223 |
|
2024-07-29 10:20:53,218 DEV Loss: 5.1826 |
|
2024-07-29 10:20:53,218 DEV Perplexity: 178.1495 |
|
2024-07-29 10:20:53,218 No improvement for 2 epoch(s) |
|
2024-07-29 10:20:53,218 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:20:53,218 EPOCH 7 |
|
2024-07-29 10:21:31,444 batch 104/1044 - loss 3.29632874 - lr 0.0010 - time 38.23s |
|
2024-07-29 10:22:10,060 batch 208/1044 - loss 3.29179441 - lr 0.0010 - time 76.84s |
|
2024-07-29 10:22:45,065 batch 312/1044 - loss 3.28852440 - lr 0.0010 - time 111.85s |
|
2024-07-29 10:23:21,129 batch 416/1044 - loss 3.29654682 - lr 0.0010 - time 147.91s |
|
2024-07-29 10:23:58,897 batch 520/1044 - loss 3.30062932 - lr 0.0010 - time 185.68s |
|
2024-07-29 10:24:37,910 batch 624/1044 - loss 3.31254658 - lr 0.0010 - time 224.69s |
|
2024-07-29 10:25:15,978 batch 728/1044 - loss 3.31376025 - lr 0.0010 - time 262.76s |
|
2024-07-29 10:25:53,003 batch 832/1044 - loss 3.31953892 - lr 0.0010 - time 299.79s |
|
2024-07-29 10:26:30,024 batch 936/1044 - loss 3.32268426 - lr 0.0010 - time 336.81s |
|
2024-07-29 10:27:05,685 batch 1040/1044 - loss 3.32460238 - lr 0.0010 - time 372.47s |
|
2024-07-29 10:27:06,955 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:27:06,957 EPOCH 7 DONE |
|
2024-07-29 10:27:16,539 TRAIN Loss: 3.3246 |
|
2024-07-29 10:27:16,539 DEV Loss: 5.2310 |
|
2024-07-29 10:27:16,539 DEV Perplexity: 186.9724 |
|
2024-07-29 10:27:16,539 No improvement for 3 epoch(s) |
|
2024-07-29 10:27:16,539 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:27:16,539 EPOCH 8 |
|
2024-07-29 10:27:55,681 batch 104/1044 - loss 3.18067933 - lr 0.0010 - time 39.14s |
|
2024-07-29 10:28:30,973 batch 208/1044 - loss 3.20228673 - lr 0.0010 - time 74.43s |
|
2024-07-29 10:29:06,064 batch 312/1044 - loss 3.20549937 - lr 0.0010 - time 109.53s |
|
2024-07-29 10:29:43,870 batch 416/1044 - loss 3.21897588 - lr 0.0010 - time 147.33s |
|
2024-07-29 10:30:19,159 batch 520/1044 - loss 3.22153870 - lr 0.0010 - time 182.62s |
|
2024-07-29 10:30:55,565 batch 624/1044 - loss 3.22599725 - lr 0.0010 - time 219.03s |
|
2024-07-29 10:31:33,714 batch 728/1044 - loss 3.22878759 - lr 0.0010 - time 257.18s |
|
2024-07-29 10:32:10,440 batch 832/1044 - loss 3.23212968 - lr 0.0010 - time 293.90s |
|
2024-07-29 10:32:48,422 batch 936/1044 - loss 3.23624962 - lr 0.0010 - time 331.88s |
|
2024-07-29 10:33:24,964 batch 1040/1044 - loss 3.23659680 - lr 0.0010 - time 368.42s |
|
2024-07-29 10:33:26,214 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:33:26,216 EPOCH 8 DONE |
|
2024-07-29 10:33:35,755 TRAIN Loss: 3.2367 |
|
2024-07-29 10:33:35,756 DEV Loss: 5.2968 |
|
2024-07-29 10:33:35,756 DEV Perplexity: 199.6878 |
|
2024-07-29 10:33:35,756 No improvement for 4 epoch(s) |
|
2024-07-29 10:33:35,756 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:33:35,756 EPOCH 9 |
|
2024-07-29 10:34:15,083 batch 104/1044 - loss 3.08033091 - lr 0.0001 - time 39.33s |
|
2024-07-29 10:34:52,691 batch 208/1044 - loss 3.07522689 - lr 0.0001 - time 76.93s |
|
2024-07-29 10:35:29,151 batch 312/1044 - loss 3.06626054 - lr 0.0001 - time 113.39s |
|
2024-07-29 10:36:06,720 batch 416/1044 - loss 3.06839789 - lr 0.0001 - time 150.96s |
|
2024-07-29 10:36:41,167 batch 520/1044 - loss 3.06539460 - lr 0.0001 - time 185.41s |
|
2024-07-29 10:37:17,074 batch 624/1044 - loss 3.06574041 - lr 0.0001 - time 221.32s |
|
2024-07-29 10:37:54,392 batch 728/1044 - loss 3.06843089 - lr 0.0001 - time 258.64s |
|
2024-07-29 10:38:31,689 batch 832/1044 - loss 3.06777010 - lr 0.0001 - time 295.93s |
|
2024-07-29 10:39:06,956 batch 936/1044 - loss 3.06646013 - lr 0.0001 - time 331.20s |
|
2024-07-29 10:39:45,993 batch 1040/1044 - loss 3.06478271 - lr 0.0001 - time 370.24s |
|
2024-07-29 10:39:47,096 ---------------------------------------------------------------------------------------------------- |
|
2024-07-29 10:39:47,098 EPOCH 9 DONE |
|
2024-07-29 10:39:56,496 TRAIN Loss: 3.0646 |
|
2024-07-29 10:39:56,497 DEV Loss: 5.1945 |
|
2024-07-29 10:39:56,497 DEV Perplexity: 180.2739 |
|
2024-07-29 10:39:56,497 No improvement for 5 epoch(s) |
|
2024-07-29 10:39:56,497 Patience reached: Terminating model training due to early stopping |
|
2024-07-29 10:39:56,497 ---------------------------------------------------------------------------------------------------- |
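The stopping point and the learning-rate drop above can be replayed from the logged DEV losses. The sketch below is a plain-Python simulation, not the training code that produced this log. It assumes plateau-style scheduling, where the rate is reduced once the number of non-improving epochs exceeds `scheduler_patience`, and a reduction factor of 0.1, inferred from the change from 0.0010 to 0.0001. All names are hypothetical.

```python
# DEV losses for epochs 1..9, copied from the log.
dev_losses = [5.5692, 5.2857, 5.2440, 5.1546, 5.1595,
              5.1826, 5.2310, 5.2968, 5.1945]

patience = 5            # "patience: 5" in the log
scheduler_patience = 3  # "scheduler_patience: 3" in the log
factor = 0.1            # assumption: inferred from lr 0.0010 -> 0.0001
lr = 0.001              # "learning_rate: 0.001" in the log

best = float("inf")
stale = 0        # epochs without improvement, for early stopping
sched_stale = 0  # same counter for the scheduler, reset after a reduction
stopped_at = None
lr_by_epoch = []  # learning rate in effect during each epoch

for epoch, loss in enumerate(dev_losses, start=1):
    lr_by_epoch.append(lr)
    if loss < best:
        best, stale, sched_stale = loss, 0, 0
    else:
        stale += 1
        sched_stale += 1
        # Assumed plateau rule: reduce once the stale count exceeds patience.
        if sched_stale > scheduler_patience:
            lr *= factor
            sched_stale = 0
    if stale >= patience:
        stopped_at = epoch
        break

for epoch, rate in enumerate(lr_by_epoch, start=1):
    print(f"epoch {epoch}: lr {rate:.4f}")
print("stopped after epoch", stopped_at, "best dev loss", best)
```

With these assumptions the replay stops after epoch 9 with the best DEV loss from epoch 4, and the reduced rate first applies in epoch 9, as in the log.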
|
2024-07-29 10:39:56,497 Finished Training |
|
2024-07-29 10:40:14,449 TEST Perplexity: 173.0781 |
|
2024-07-29 10:49:34,588 TEST BLEU = 17.27 82.9/65.2/22.1/0.7 (BP = 1.000 ratio = 1.000 hyp_len = 70 ref_len = 70) |
|
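The logged perplexities are consistent with perplexity being the exponential of the average cross-entropy loss (natural log). The sketch below checks this against the DEV values from the log. Small differences are expected because the logged losses are rounded to four decimals. The tolerance and the variable names are choices made for this illustration.

```python
import math

# (DEV loss, DEV perplexity) pairs for epochs 1..9, copied from the log.
dev_log = [
    (5.5692, 262.2315), (5.2857, 197.4908), (5.2440, 189.4295),
    (5.1546, 173.2260), (5.1595, 174.0773), (5.1826, 178.1495),
    (5.2310, 186.9724), (5.2968, 199.6878), (5.1945, 180.2739),
]

# Perplexity recomputed as exp(loss); the loss is assumed to be in nats.
recomputed = [math.exp(loss) for loss, _ in dev_log]

# Compare with the logged values, allowing for rounding of the loss.
matches = [
    math.isclose(ppl, logged, rel_tol=2e-4)
    for ppl, (_, logged) in zip(recomputed, dev_log)
]
print(all(matches))

# The TEST loss is not logged; under the same relation it can be
# recovered from the logged TEST perplexity.
implied_test_loss = math.log(173.0781)
print(round(implied_test_loss, 4))
```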
|