2023-10-17 20:05:58,315 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,316 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 20:05:58,316 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,316 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Train: 5901 sentences
2023-10-17 20:05:58,317 (train_with_dev=False, train_with_test=False)
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Training Params:
2023-10-17 20:05:58,317 - learning_rate: "3e-05"
2023-10-17 20:05:58,317 - mini_batch_size: "4"
2023-10-17 20:05:58,317 - max_epochs: "10"
2023-10-17 20:05:58,317 - shuffle: "True"
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Plugins:
2023-10-17 20:05:58,317 - TensorboardLogger
2023-10-17 20:05:58,317 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
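For reference, the learning-rate values recorded in the per-iteration lines below are consistent with a linear warmup/decay schedule at the stated settings (peak lr 3e-05, warmup_fraction 0.1, 10 epochs x 1476 mini-batches = 14760 steps). The function below is a minimal sketch of that schedule, not Flair's actual `LinearScheduler` implementation; the name `linear_schedule_lr` and the step arithmetic are assumptions.

```python
def linear_schedule_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero (a sketch, not Flair's code)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# 10 epochs x 1476 mini-batches = 14760 total steps; warmup_fraction 0.1 covers epoch 1.
total = 10 * 1476
print(round(linear_schedule_lr(147, total), 6))   # matches "lr: 0.000003" at epoch 1, iter 147
print(round(linear_schedule_lr(2946, total), 6))  # matches "lr: 0.000027" at epoch 2, iter 1470
```

The sketch reproduces the logged values: lr ramps to 0.000030 by the end of epoch 1 and decays to 0.000000 by the end of epoch 10.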
2023-10-17 20:05:58,317 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 20:05:58,317 - metric: "('micro avg', 'f1-score')"
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Computation:
2023-10-17 20:05:58,317 - compute on device: cuda:0
2023-10-17 20:05:58,317 - embedding storage: none
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 20:06:05,703 epoch 1 - iter 147/1476 - loss 2.88376740 - time (sec): 7.38 - samples/sec: 2399.29 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:06:12,547 epoch 1 - iter 294/1476 - loss 1.83405663 - time (sec): 14.23 - samples/sec: 2329.66 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:06:20,094 epoch 1 - iter 441/1476 - loss 1.34685714 - time (sec): 21.78 - samples/sec: 2364.86 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:06:27,578 epoch 1 - iter 588/1476 - loss 1.08468691 - time (sec): 29.26 - samples/sec: 2373.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:06:34,521 epoch 1 - iter 735/1476 - loss 0.93275820 - time (sec): 36.20 - samples/sec: 2362.49 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:06:41,355 epoch 1 - iter 882/1476 - loss 0.83450367 - time (sec): 43.04 - samples/sec: 2330.69 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:06:48,377 epoch 1 - iter 1029/1476 - loss 0.75458015 - time (sec): 50.06 - samples/sec: 2317.34 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:06:55,693 epoch 1 - iter 1176/1476 - loss 0.68725226 - time (sec): 57.37 - samples/sec: 2303.13 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:07:03,442 epoch 1 - iter 1323/1476 - loss 0.63285062 - time (sec): 65.12 - samples/sec: 2281.72 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:07:10,621 epoch 1 - iter 1470/1476 - loss 0.58387807 - time (sec): 72.30 - samples/sec: 2294.27 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:07:10,883 ----------------------------------------------------------------------------------------------------
2023-10-17 20:07:10,883 EPOCH 1 done: loss 0.5825 - lr: 0.000030
2023-10-17 20:07:17,234 DEV : loss 0.12818463146686554 - f1-score (micro avg) 0.7213
2023-10-17 20:07:17,263 saving best model
2023-10-17 20:07:17,633 ----------------------------------------------------------------------------------------------------
2023-10-17 20:07:24,643 epoch 2 - iter 147/1476 - loss 0.13820773 - time (sec): 7.01 - samples/sec: 2385.67 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:07:32,026 epoch 2 - iter 294/1476 - loss 0.13981224 - time (sec): 14.39 - samples/sec: 2427.68 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:07:39,398 epoch 2 - iter 441/1476 - loss 0.13892246 - time (sec): 21.76 - samples/sec: 2406.79 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:07:46,926 epoch 2 - iter 588/1476 - loss 0.13506201 - time (sec): 29.29 - samples/sec: 2319.51 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:07:54,318 epoch 2 - iter 735/1476 - loss 0.13369013 - time (sec): 36.68 - samples/sec: 2242.62 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:08:01,477 epoch 2 - iter 882/1476 - loss 0.13371218 - time (sec): 43.84 - samples/sec: 2227.85 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:08:09,013 epoch 2 - iter 1029/1476 - loss 0.13126252 - time (sec): 51.38 - samples/sec: 2221.33 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:08:16,524 epoch 2 - iter 1176/1476 - loss 0.13158893 - time (sec): 58.89 - samples/sec: 2215.15 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:08:24,525 epoch 2 - iter 1323/1476 - loss 0.13111484 - time (sec): 66.89 - samples/sec: 2220.09 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:08:31,884 epoch 2 - iter 1470/1476 - loss 0.13044166 - time (sec): 74.25 - samples/sec: 2233.50 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:08:32,152 ----------------------------------------------------------------------------------------------------
2023-10-17 20:08:32,153 EPOCH 2 done: loss 0.1302 - lr: 0.000027
2023-10-17 20:08:43,623 DEV : loss 0.11906815320253372 - f1-score (micro avg) 0.8161
2023-10-17 20:08:43,656 saving best model
2023-10-17 20:08:44,148 ----------------------------------------------------------------------------------------------------
2023-10-17 20:08:51,602 epoch 3 - iter 147/1476 - loss 0.06487959 - time (sec): 7.45 - samples/sec: 2371.36 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:08:58,751 epoch 3 - iter 294/1476 - loss 0.07490643 - time (sec): 14.60 - samples/sec: 2407.47 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:09:05,781 epoch 3 - iter 441/1476 - loss 0.07115129 - time (sec): 21.63 - samples/sec: 2400.27 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:09:12,619 epoch 3 - iter 588/1476 - loss 0.07499880 - time (sec): 28.47 - samples/sec: 2387.22 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:09:19,700 epoch 3 - iter 735/1476 - loss 0.08049723 - time (sec): 35.55 - samples/sec: 2374.58 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:09:26,769 epoch 3 - iter 882/1476 - loss 0.08104214 - time (sec): 42.62 - samples/sec: 2339.21 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:09:34,338 epoch 3 - iter 1029/1476 - loss 0.08301973 - time (sec): 50.19 - samples/sec: 2344.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:09:41,499 epoch 3 - iter 1176/1476 - loss 0.08364485 - time (sec): 57.35 - samples/sec: 2332.93 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:09:48,739 epoch 3 - iter 1323/1476 - loss 0.08295442 - time (sec): 64.59 - samples/sec: 2321.23 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:09:56,409 epoch 3 - iter 1470/1476 - loss 0.08380446 - time (sec): 72.26 - samples/sec: 2296.93 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:09:56,681 ----------------------------------------------------------------------------------------------------
2023-10-17 20:09:56,681 EPOCH 3 done: loss 0.0838 - lr: 0.000023
2023-10-17 20:10:08,037 DEV : loss 0.1379304975271225 - f1-score (micro avg) 0.8223
2023-10-17 20:10:08,071 saving best model
2023-10-17 20:10:08,547 ----------------------------------------------------------------------------------------------------
2023-10-17 20:10:15,658 epoch 4 - iter 147/1476 - loss 0.05625078 - time (sec): 7.11 - samples/sec: 2241.51 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:10:23,083 epoch 4 - iter 294/1476 - loss 0.05457959 - time (sec): 14.53 - samples/sec: 2321.35 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:10:30,071 epoch 4 - iter 441/1476 - loss 0.05927497 - time (sec): 21.52 - samples/sec: 2279.64 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:10:37,495 epoch 4 - iter 588/1476 - loss 0.05943946 - time (sec): 28.94 - samples/sec: 2265.71 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:10:45,002 epoch 4 - iter 735/1476 - loss 0.06174984 - time (sec): 36.45 - samples/sec: 2202.49 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:10:52,447 epoch 4 - iter 882/1476 - loss 0.05984732 - time (sec): 43.90 - samples/sec: 2211.74 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:10:59,329 epoch 4 - iter 1029/1476 - loss 0.05718144 - time (sec): 50.78 - samples/sec: 2222.57 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:11:06,848 epoch 4 - iter 1176/1476 - loss 0.05612735 - time (sec): 58.30 - samples/sec: 2247.18 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:11:13,792 epoch 4 - iter 1323/1476 - loss 0.05566318 - time (sec): 65.24 - samples/sec: 2254.84 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:11:21,672 epoch 4 - iter 1470/1476 - loss 0.05600539 - time (sec): 73.12 - samples/sec: 2266.68 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:11:21,955 ----------------------------------------------------------------------------------------------------
2023-10-17 20:11:21,955 EPOCH 4 done: loss 0.0559 - lr: 0.000020
2023-10-17 20:11:33,291 DEV : loss 0.16167429089546204 - f1-score (micro avg) 0.846
2023-10-17 20:11:33,323 saving best model
2023-10-17 20:11:33,784 ----------------------------------------------------------------------------------------------------
2023-10-17 20:11:41,069 epoch 5 - iter 147/1476 - loss 0.03312893 - time (sec): 7.28 - samples/sec: 2445.55 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:11:47,781 epoch 5 - iter 294/1476 - loss 0.03390059 - time (sec): 13.99 - samples/sec: 2407.91 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:11:54,966 epoch 5 - iter 441/1476 - loss 0.03294051 - time (sec): 21.18 - samples/sec: 2384.13 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:12:02,147 epoch 5 - iter 588/1476 - loss 0.03900188 - time (sec): 28.36 - samples/sec: 2358.13 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:12:09,336 epoch 5 - iter 735/1476 - loss 0.03809314 - time (sec): 35.55 - samples/sec: 2359.36 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:12:16,735 epoch 5 - iter 882/1476 - loss 0.03709148 - time (sec): 42.95 - samples/sec: 2337.85 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:12:23,992 epoch 5 - iter 1029/1476 - loss 0.03763883 - time (sec): 50.21 - samples/sec: 2316.28 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:12:30,787 epoch 5 - iter 1176/1476 - loss 0.03885216 - time (sec): 57.00 - samples/sec: 2309.32 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:12:38,354 epoch 5 - iter 1323/1476 - loss 0.03762999 - time (sec): 64.57 - samples/sec: 2325.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:12:45,409 epoch 5 - iter 1470/1476 - loss 0.03723655 - time (sec): 71.62 - samples/sec: 2317.05 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:12:45,675 ----------------------------------------------------------------------------------------------------
2023-10-17 20:12:45,675 EPOCH 5 done: loss 0.0377 - lr: 0.000017
2023-10-17 20:12:57,231 DEV : loss 0.1790267527103424 - f1-score (micro avg) 0.8426
2023-10-17 20:12:57,261 ----------------------------------------------------------------------------------------------------
2023-10-17 20:13:04,516 epoch 6 - iter 147/1476 - loss 0.02474197 - time (sec): 7.25 - samples/sec: 2182.35 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:13:11,743 epoch 6 - iter 294/1476 - loss 0.02176561 - time (sec): 14.48 - samples/sec: 2281.17 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:13:19,002 epoch 6 - iter 441/1476 - loss 0.02055555 - time (sec): 21.74 - samples/sec: 2289.19 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:13:26,628 epoch 6 - iter 588/1476 - loss 0.02265555 - time (sec): 29.37 - samples/sec: 2240.42 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:13:33,937 epoch 6 - iter 735/1476 - loss 0.02383975 - time (sec): 36.68 - samples/sec: 2233.14 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:13:41,057 epoch 6 - iter 882/1476 - loss 0.02491593 - time (sec): 43.80 - samples/sec: 2228.19 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:13:48,543 epoch 6 - iter 1029/1476 - loss 0.02373826 - time (sec): 51.28 - samples/sec: 2242.09 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:13:55,584 epoch 6 - iter 1176/1476 - loss 0.02425560 - time (sec): 58.32 - samples/sec: 2255.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:14:02,711 epoch 6 - iter 1323/1476 - loss 0.02426721 - time (sec): 65.45 - samples/sec: 2257.42 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:14:10,120 epoch 6 - iter 1470/1476 - loss 0.02573153 - time (sec): 72.86 - samples/sec: 2275.89 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:14:10,406 ----------------------------------------------------------------------------------------------------
2023-10-17 20:14:10,406 EPOCH 6 done: loss 0.0256 - lr: 0.000013
2023-10-17 20:14:21,960 DEV : loss 0.1823473572731018 - f1-score (micro avg) 0.8415
2023-10-17 20:14:21,993 ----------------------------------------------------------------------------------------------------
2023-10-17 20:14:29,433 epoch 7 - iter 147/1476 - loss 0.01049509 - time (sec): 7.44 - samples/sec: 2270.28 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:14:36,187 epoch 7 - iter 294/1476 - loss 0.01510363 - time (sec): 14.19 - samples/sec: 2341.53 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:14:43,513 epoch 7 - iter 441/1476 - loss 0.01732408 - time (sec): 21.52 - samples/sec: 2376.45 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:14:50,969 epoch 7 - iter 588/1476 - loss 0.01690878 - time (sec): 28.97 - samples/sec: 2369.98 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:14:58,196 epoch 7 - iter 735/1476 - loss 0.02010782 - time (sec): 36.20 - samples/sec: 2330.85 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:15:05,260 epoch 7 - iter 882/1476 - loss 0.02004143 - time (sec): 43.27 - samples/sec: 2338.44 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:15:12,552 epoch 7 - iter 1029/1476 - loss 0.01839535 - time (sec): 50.56 - samples/sec: 2313.93 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:15:19,971 epoch 7 - iter 1176/1476 - loss 0.01888196 - time (sec): 57.98 - samples/sec: 2313.87 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:15:27,136 epoch 7 - iter 1323/1476 - loss 0.01831645 - time (sec): 65.14 - samples/sec: 2320.63 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:15:33,941 epoch 7 - iter 1470/1476 - loss 0.01784839 - time (sec): 71.95 - samples/sec: 2305.36 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:15:34,199 ----------------------------------------------------------------------------------------------------
2023-10-17 20:15:34,200 EPOCH 7 done: loss 0.0180 - lr: 0.000010
2023-10-17 20:15:45,643 DEV : loss 0.19402551651000977 - f1-score (micro avg) 0.8431
2023-10-17 20:15:45,677 ----------------------------------------------------------------------------------------------------
2023-10-17 20:15:52,842 epoch 8 - iter 147/1476 - loss 0.01320226 - time (sec): 7.16 - samples/sec: 2278.37 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:16:00,302 epoch 8 - iter 294/1476 - loss 0.01657591 - time (sec): 14.62 - samples/sec: 2333.78 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:16:07,287 epoch 8 - iter 441/1476 - loss 0.01541718 - time (sec): 21.61 - samples/sec: 2302.97 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:16:14,178 epoch 8 - iter 588/1476 - loss 0.01364886 - time (sec): 28.50 - samples/sec: 2306.16 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:16:21,411 epoch 8 - iter 735/1476 - loss 0.01449721 - time (sec): 35.73 - samples/sec: 2315.68 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:16:28,411 epoch 8 - iter 882/1476 - loss 0.01320813 - time (sec): 42.73 - samples/sec: 2299.05 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:16:36,244 epoch 8 - iter 1029/1476 - loss 0.01499181 - time (sec): 50.57 - samples/sec: 2327.41 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:16:43,224 epoch 8 - iter 1176/1476 - loss 0.01478780 - time (sec): 57.55 - samples/sec: 2319.55 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:16:50,451 epoch 8 - iter 1323/1476 - loss 0.01429358 - time (sec): 64.77 - samples/sec: 2320.02 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:16:57,506 epoch 8 - iter 1470/1476 - loss 0.01418919 - time (sec): 71.83 - samples/sec: 2303.65 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:16:57,853 ----------------------------------------------------------------------------------------------------
2023-10-17 20:16:57,853 EPOCH 8 done: loss 0.0141 - lr: 0.000007
2023-10-17 20:17:09,346 DEV : loss 0.20007802546024323 - f1-score (micro avg) 0.8496
2023-10-17 20:17:09,379 saving best model
2023-10-17 20:17:09,862 ----------------------------------------------------------------------------------------------------
2023-10-17 20:17:17,475 epoch 9 - iter 147/1476 - loss 0.00436533 - time (sec): 7.61 - samples/sec: 2367.42 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:17:24,847 epoch 9 - iter 294/1476 - loss 0.00510740 - time (sec): 14.98 - samples/sec: 2429.53 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:17:32,428 epoch 9 - iter 441/1476 - loss 0.00763822 - time (sec): 22.56 - samples/sec: 2423.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:17:39,465 epoch 9 - iter 588/1476 - loss 0.00743456 - time (sec): 29.60 - samples/sec: 2361.44 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:17:47,117 epoch 9 - iter 735/1476 - loss 0.00689757 - time (sec): 37.25 - samples/sec: 2308.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:17:54,005 epoch 9 - iter 882/1476 - loss 0.00620989 - time (sec): 44.14 - samples/sec: 2316.95 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:18:00,853 epoch 9 - iter 1029/1476 - loss 0.00730072 - time (sec): 50.99 - samples/sec: 2301.79 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:18:07,914 epoch 9 - iter 1176/1476 - loss 0.00714572 - time (sec): 58.05 - samples/sec: 2289.46 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:18:15,625 epoch 9 - iter 1323/1476 - loss 0.00707158 - time (sec): 65.76 - samples/sec: 2299.61 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:18:22,423 epoch 9 - iter 1470/1476 - loss 0.00773284 - time (sec): 72.56 - samples/sec: 2283.54 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:18:22,726 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:22,726 EPOCH 9 done: loss 0.0078 - lr: 0.000003
2023-10-17 20:18:34,319 DEV : loss 0.2041151374578476 - f1-score (micro avg) 0.8596
2023-10-17 20:18:34,349 saving best model
2023-10-17 20:18:34,828 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:42,621 epoch 10 - iter 147/1476 - loss 0.00709689 - time (sec): 7.79 - samples/sec: 2535.00 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:18:50,003 epoch 10 - iter 294/1476 - loss 0.00608595 - time (sec): 15.17 - samples/sec: 2439.34 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:18:57,062 epoch 10 - iter 441/1476 - loss 0.00472843 - time (sec): 22.23 - samples/sec: 2396.12 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:19:04,224 epoch 10 - iter 588/1476 - loss 0.00501675 - time (sec): 29.39 - samples/sec: 2314.37 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:19:11,140 epoch 10 - iter 735/1476 - loss 0.00493818 - time (sec): 36.31 - samples/sec: 2306.06 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:19:18,274 epoch 10 - iter 882/1476 - loss 0.00479554 - time (sec): 43.44 - samples/sec: 2297.04 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:19:25,625 epoch 10 - iter 1029/1476 - loss 0.00501958 - time (sec): 50.79 - samples/sec: 2305.35 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:19:32,682 epoch 10 - iter 1176/1476 - loss 0.00595793 - time (sec): 57.85 - samples/sec: 2291.26 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:19:39,780 epoch 10 - iter 1323/1476 - loss 0.00561353 - time (sec): 64.95 - samples/sec: 2284.04 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:19:47,478 epoch 10 - iter 1470/1476 - loss 0.00566830 - time (sec): 72.65 - samples/sec: 2283.76 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:19:47,747 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:47,748 EPOCH 10 done: loss 0.0057 - lr: 0.000000
2023-10-17 20:19:59,001 DEV : loss 0.2035822868347168 - f1-score (micro avg) 0.8602
2023-10-17 20:19:59,031 saving best model
2023-10-17 20:19:59,889 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:59,891 Loading model from best epoch ...
2023-10-17 20:20:01,237 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
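The 21 tags listed here (and the 21 output features of the final `linear` layer in the model summary above) follow from the BIOES tagging scheme: one O tag plus four positional prefixes for each of the five entity types. A quick sketch of that arithmetic, with the type order taken from the log line:

```python
# Reconstructing the tagset from the BIOES scheme: 5 entity types x 4 prefixes + "O" = 21.
types = ["loc", "pers", "org", "time", "prod"]
tags = ["O"] + [f"{prefix}-{t}" for t in types for prefix in "SBEI"]

print(len(tags))   # 21
print(tags[:5])    # ['O', 'S-loc', 'B-loc', 'E-loc', 'I-loc']
```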
2023-10-17 20:20:07,284
Results:
- F-score (micro) 0.7934
- F-score (macro) 0.7114
- Accuracy 0.6758
By class:
              precision    recall  f1-score   support

         loc     0.8474    0.8671    0.8571       858
        pers     0.7487    0.8045    0.7756       537
         org     0.5329    0.6136    0.5704       132
        prod     0.7500    0.7377    0.7438        61
        time     0.5625    0.6667    0.6102        54

   micro avg     0.7730    0.8149    0.7934      1642
   macro avg     0.6883    0.7379    0.7114      1642
weighted avg     0.7768    0.8149    0.7951      1642
2023-10-17 20:20:07,284 ----------------------------------------------------------------------------------------------------
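As a sanity check, the micro and macro averages in the final table can be re-derived from the per-class rows. The true-positive and prediction counts below are reconstructed by rounding from the reported precision, recall, and support; they are an inference, not values taken from the log.

```python
# Per-class (precision, recall, support) copied from the final evaluation table.
per_class = {
    "loc":  (0.8474, 0.8671, 858),
    "pers": (0.7487, 0.8045, 537),
    "org":  (0.5329, 0.6136, 132),
    "prod": (0.7500, 0.7377, 61),
    "time": (0.5625, 0.6667, 54),
}

# Reconstruct (approximately, via rounding) true-positive and predicted counts per class.
tp = {c: round(r * s) for c, (p, r, s) in per_class.items()}
pred = {c: round(tp[c] / p) for c, (p, r, s) in per_class.items()}

total_tp = sum(tp.values())                                # 1338 true positives
total_pred = sum(pred.values())                            # 1731 predicted spans
total_support = sum(s for _, _, s in per_class.values())   # 1642 gold spans

# Micro average: pool counts across classes, then compute the metrics once.
micro_p = total_tp / total_pred
micro_r = total_tp / total_support
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro average: unweighted mean of the per-class F1 scores from the table.
f1s = [0.8571, 0.7756, 0.5704, 0.7438, 0.6102]
macro_f1 = sum(f1s) / len(f1s)

print(round(micro_p, 4), round(micro_r, 4), round(micro_f1, 4))  # 0.773 0.8149 0.7934
print(round(macro_f1, 4))                                        # 0.7114
```

This recovers the reported "F-score (micro) 0.7934" and "F-score (macro) 0.7114", and makes visible why micro beats macro here: the large loc and pers classes dominate the pooled counts, while the weaker org and time classes drag the unweighted mean down.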