2023-10-16 18:06:43,262 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,263 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
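As a sanity check on the repr above (this sketch is not part of the log), the printed layer shapes can be summed to the familiar BERT-base parameter count of roughly 110.6M; the arithmetic below uses only the dimensions shown in the module tree:

```python
# Rough parameter count for the printed BertModel, derived only from the
# dimensions in the repr above (vocab 32001, hidden 768, 12 layers, FFN 3072).

def linear(n_in, n_out):
    # weight matrix plus bias vector
    return n_in * n_out + n_out

hidden, ffn, layers = 768, 3072, 12

embeddings = (
    32001 * hidden   # word_embeddings
    + 512 * hidden   # position_embeddings
    + 2 * hidden     # token_type_embeddings
    + 2 * hidden     # LayerNorm weight + bias
)

per_layer = (
    4 * linear(hidden, hidden)  # query, key, value, attention output dense
    + 2 * hidden                # attention output LayerNorm
    + linear(hidden, ffn)       # intermediate dense
    + linear(ffn, hidden)       # output dense
    + 2 * hidden                # output LayerNorm
)

pooler = linear(hidden, hidden)

total = embeddings + layers * per_layer + pooler
print(total)  # 110618112, i.e. ~110.6M parameters before the tagging head
```

The task-specific head adds only the final `Linear(768, 17)` on top of this.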
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 Train:  1166 sentences
2023-10-16 18:06:43,264         (train_with_dev=False, train_with_test=False)
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 Training Params:
2023-10-16 18:06:43,264  - learning_rate: "3e-05"
2023-10-16 18:06:43,264  - mini_batch_size: "4"
2023-10-16 18:06:43,264  - max_epochs: "10"
2023-10-16 18:06:43,264  - shuffle: "True"
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 Plugins:
2023-10-16 18:06:43,264  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:06:43,264  - metric: "('micro avg', 'f1-score')"
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 Computation:
2023-10-16 18:06:43,264  - compute on device: cuda:0
2023-10-16 18:06:43,264  - embedding storage: none
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:45,038 epoch 1 - iter 29/292 - loss 2.88222167 - time (sec): 1.77 - samples/sec: 2934.55 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:06:46,609 epoch 1 - iter 58/292 - loss 2.60446904 - time (sec): 3.34 - samples/sec: 2697.91 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:06:48,242 epoch 1 - iter 87/292 - loss 2.04069918 - time (sec): 4.98 - samples/sec: 2655.47 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:06:49,732 epoch 1 - iter 116/292 - loss 1.70752429 - time (sec): 6.47 - samples/sec: 2663.44 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:06:51,291 epoch 1 - iter 145/292 - loss 1.48725658 - time (sec): 8.03 - samples/sec: 2632.59 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:06:52,955 epoch 1 - iter 174/292 - loss 1.33904630 - time (sec): 9.69 - samples/sec: 2602.10 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:06:54,672 epoch 1 - iter 203/292 - loss 1.16634244 - time (sec): 11.41 - samples/sec: 2674.25 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:06:56,311 epoch 1 - iter 232/292 - loss 1.06978910 - time (sec): 13.05 - samples/sec: 2671.11 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:06:58,151 epoch 1 - iter 261/292 - loss 0.99870433 - time (sec): 14.89 - samples/sec: 2689.92 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:06:59,742 epoch 1 - iter 290/292 - loss 0.93223498 - time (sec): 16.48 - samples/sec: 2676.79 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:06:59,856 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:59,857 EPOCH 1 done: loss 0.9270 - lr: 0.000030
2023-10-16 18:07:01,225 DEV : loss 0.20976495742797852 - f1-score (micro avg)  0.3889
2023-10-16 18:07:01,232 saving best model
2023-10-16 18:07:01,752 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:03,453 epoch 2 - iter 29/292 - loss 0.25107587 - time (sec): 1.70 - samples/sec: 2491.07 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:07:05,086 epoch 2 - iter 58/292 - loss 0.24400947 - time (sec): 3.33 - samples/sec: 2503.73 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:07:06,734 epoch 2 - iter 87/292 - loss 0.23895746 - time (sec): 4.98 - samples/sec: 2492.39 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:07:08,461 epoch 2 - iter 116/292 - loss 0.23745427 - time (sec): 6.71 - samples/sec: 2479.82 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:07:10,124 epoch 2 - iter 145/292 - loss 0.23323169 - time (sec): 8.37 - samples/sec: 2511.84 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:07:11,997 epoch 2 - iter 174/292 - loss 0.23315863 - time (sec): 10.24 - samples/sec: 2574.83 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:07:13,713 epoch 2 - iter 203/292 - loss 0.22673487 - time (sec): 11.96 - samples/sec: 2610.35 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:07:15,338 epoch 2 - iter 232/292 - loss 0.22123385 - time (sec): 13.58 - samples/sec: 2631.26 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:07:16,935 epoch 2 - iter 261/292 - loss 0.22344619 - time (sec): 15.18 - samples/sec: 2632.59 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:07:18,600 epoch 2 - iter 290/292 - loss 0.21652095 - time (sec): 16.85 - samples/sec: 2631.92 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:07:18,687 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:18,687 EPOCH 2 done: loss 0.2162 - lr: 0.000027
2023-10-16 18:07:19,989 DEV : loss 0.14250923693180084 - f1-score (micro avg)  0.6128
2023-10-16 18:07:19,995 saving best model
2023-10-16 18:07:20,524 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:22,235 epoch 3 - iter 29/292 - loss 0.14272483 - time (sec): 1.71 - samples/sec: 2548.95 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:07:23,769 epoch 3 - iter 58/292 - loss 0.12941947 - time (sec): 3.24 - samples/sec: 2716.78 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:07:25,434 epoch 3 - iter 87/292 - loss 0.13036619 - time (sec): 4.91 - samples/sec: 2744.37 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:07:26,970 epoch 3 - iter 116/292 - loss 0.12139105 - time (sec): 6.44 - samples/sec: 2681.79 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:07:28,568 epoch 3 - iter 145/292 - loss 0.11496517 - time (sec): 8.04 - samples/sec: 2685.19 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:07:30,170 epoch 3 - iter 174/292 - loss 0.11820563 - time (sec): 9.64 - samples/sec: 2672.93 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:07:32,002 epoch 3 - iter 203/292 - loss 0.12100791 - time (sec): 11.48 - samples/sec: 2706.76 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:07:33,666 epoch 3 - iter 232/292 - loss 0.11592913 - time (sec): 13.14 - samples/sec: 2704.67 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:07:35,220 epoch 3 - iter 261/292 - loss 0.11476202 - time (sec): 14.69 - samples/sec: 2706.54 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:07:37,049 epoch 3 - iter 290/292 - loss 0.11381579 - time (sec): 16.52 - samples/sec: 2679.81 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:07:37,143 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:37,143 EPOCH 3 done: loss 0.1135 - lr: 0.000023
2023-10-16 18:07:38,429 DEV : loss 0.12919388711452484 - f1-score (micro avg)  0.6814
2023-10-16 18:07:38,436 saving best model
2023-10-16 18:07:38,944 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:40,795 epoch 4 - iter 29/292 - loss 0.08033603 - time (sec): 1.85 - samples/sec: 2760.24 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:07:42,390 epoch 4 - iter 58/292 - loss 0.09107035 - time (sec): 3.44 - samples/sec: 2756.18 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:07:43,995 epoch 4 - iter 87/292 - loss 0.08192222 - time (sec): 5.05 - samples/sec: 2743.24 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:07:45,617 epoch 4 - iter 116/292 - loss 0.08082958 - time (sec): 6.67 - samples/sec: 2766.29 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:07:47,371 epoch 4 - iter 145/292 - loss 0.07535777 - time (sec): 8.43 - samples/sec: 2795.29 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:07:49,099 epoch 4 - iter 174/292 - loss 0.07310859 - time (sec): 10.15 - samples/sec: 2773.92 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:07:50,665 epoch 4 - iter 203/292 - loss 0.07590109 - time (sec): 11.72 - samples/sec: 2763.34 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:07:52,377 epoch 4 - iter 232/292 - loss 0.07416772 - time (sec): 13.43 - samples/sec: 2702.56 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:07:54,003 epoch 4 - iter 261/292 - loss 0.07169329 - time (sec): 15.06 - samples/sec: 2706.89 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:07:55,724 epoch 4 - iter 290/292 - loss 0.06896376 - time (sec): 16.78 - samples/sec: 2641.64 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:07:55,807 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:55,807 EPOCH 4 done: loss 0.0688 - lr: 0.000020
2023-10-16 18:07:57,088 DEV : loss 0.12307216227054596 - f1-score (micro avg)  0.7595
2023-10-16 18:07:57,094 saving best model
2023-10-16 18:07:57,646 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:59,313 epoch 5 - iter 29/292 - loss 0.03955736 - time (sec): 1.67 - samples/sec: 2868.56 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:08:01,015 epoch 5 - iter 58/292 - loss 0.04332368 - time (sec): 3.37 - samples/sec: 2807.90 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:08:02,822 epoch 5 - iter 87/292 - loss 0.04161002 - time (sec): 5.17 - samples/sec: 2811.24 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:08:04,459 epoch 5 - iter 116/292 - loss 0.03749425 - time (sec): 6.81 - samples/sec: 2772.83 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:08:05,992 epoch 5 - iter 145/292 - loss 0.03825239 - time (sec): 8.34 - samples/sec: 2738.86 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:08:07,530 epoch 5 - iter 174/292 - loss 0.03808185 - time (sec): 9.88 - samples/sec: 2697.02 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:08:09,124 epoch 5 - iter 203/292 - loss 0.03880310 - time (sec): 11.48 - samples/sec: 2671.49 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:08:10,845 epoch 5 - iter 232/292 - loss 0.04783043 - time (sec): 13.20 - samples/sec: 2665.68 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:08:12,530 epoch 5 - iter 261/292 - loss 0.04825293 - time (sec): 14.88 - samples/sec: 2628.51 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:08:14,265 epoch 5 - iter 290/292 - loss 0.05008356 - time (sec): 16.62 - samples/sec: 2662.54 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:08:14,357 ----------------------------------------------------------------------------------------------------
2023-10-16 18:08:14,358 EPOCH 5 done: loss 0.0499 - lr: 0.000017
2023-10-16 18:08:15,621 DEV : loss 0.1350400596857071 - f1-score (micro avg)  0.766
2023-10-16 18:08:15,626 saving best model
2023-10-16 18:08:16,115 ----------------------------------------------------------------------------------------------------
2023-10-16 18:08:17,779 epoch 6 - iter 29/292 - loss 0.04105101 - time (sec): 1.66 - samples/sec: 2305.85 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:08:19,486 epoch 6 - iter 58/292 - loss 0.04502925 - time (sec): 3.37 - samples/sec: 2453.67 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:08:21,059 epoch 6 - iter 87/292 - loss 0.03516229 - time (sec): 4.94 - samples/sec: 2487.33 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:08:22,563 epoch 6 - iter 116/292 - loss 0.03210820 - time (sec): 6.45 - samples/sec: 2555.05 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:08:24,329 epoch 6 - iter 145/292 - loss 0.02906604 - time (sec): 8.21 - samples/sec: 2574.10 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:08:25,816 epoch 6 - iter 174/292 - loss 0.02948663 - time (sec): 9.70 - samples/sec: 2553.14 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:08:27,546 epoch 6 - iter 203/292 - loss 0.03338297 - time (sec): 11.43 - samples/sec: 2570.13 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:08:29,334 epoch 6 - iter 232/292 - loss 0.03431716 - time (sec): 13.22 - samples/sec: 2607.40 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:08:31,046 epoch 6 - iter 261/292 - loss 0.03709105 - time (sec): 14.93 - samples/sec: 2646.56 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:08:32,738 epoch 6 - iter 290/292 - loss 0.03590560 - time (sec): 16.62 - samples/sec: 2660.19 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:08:32,826 ----------------------------------------------------------------------------------------------------
2023-10-16 18:08:32,826 EPOCH 6 done: loss 0.0359 - lr: 0.000013
2023-10-16 18:08:34,041 DEV : loss 0.1274399310350418 - f1-score (micro avg)  0.8009
2023-10-16 18:08:34,045 saving best model
2023-10-16 18:08:34,569 ----------------------------------------------------------------------------------------------------
2023-10-16 18:08:36,179 epoch 7 - iter 29/292 - loss 0.03485645 - time (sec): 1.61 - samples/sec: 2550.22 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:08:37,955 epoch 7 - iter 58/292 - loss 0.03354645 - time (sec): 3.38 - samples/sec: 2752.99 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:08:39,678 epoch 7 - iter 87/292 - loss 0.02845269 - time (sec): 5.11 - samples/sec: 2774.75 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:08:41,459 epoch 7 - iter 116/292 - loss 0.03214827 - time (sec): 6.89 - samples/sec: 2730.26 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:08:43,094 epoch 7 - iter 145/292 - loss 0.03051476 - time (sec): 8.52 - samples/sec: 2687.09 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:08:44,601 epoch 7 - iter 174/292 - loss 0.02794386 - time (sec): 10.03 - samples/sec: 2685.33 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:08:46,149 epoch 7 - iter 203/292 - loss 0.02570380 - time (sec): 11.58 - samples/sec: 2670.47 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:08:47,783 epoch 7 - iter 232/292 - loss 0.02741901 - time (sec): 13.21 - samples/sec: 2705.87 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:08:49,317 epoch 7 - iter 261/292 - loss 0.02628470 - time (sec): 14.75 - samples/sec: 2692.65 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:08:51,015 epoch 7 - iter 290/292 - loss 0.02776424 - time (sec): 16.44 - samples/sec: 2685.07 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:08:51,122 ----------------------------------------------------------------------------------------------------
2023-10-16 18:08:51,123 EPOCH 7 done: loss 0.0276 - lr: 0.000010
2023-10-16 18:08:52,409 DEV : loss 0.14393527805805206 - f1-score (micro avg)  0.7603
2023-10-16 18:08:52,413 ----------------------------------------------------------------------------------------------------
2023-10-16 18:08:54,121 epoch 8 - iter 29/292 - loss 0.02543747 - time (sec): 1.71 - samples/sec: 2790.30 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:08:55,512 epoch 8 - iter 58/292 - loss 0.02833714 - time (sec): 3.10 - samples/sec: 2578.54 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:08:57,277 epoch 8 - iter 87/292 - loss 0.02513038 - time (sec): 4.86 - samples/sec: 2585.81 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:08:59,029 epoch 8 - iter 116/292 - loss 0.02477383 - time (sec): 6.61 - samples/sec: 2622.43 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:09:00,644 epoch 8 - iter 145/292 - loss 0.02605667 - time (sec): 8.23 - samples/sec: 2653.35 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:09:02,344 epoch 8 - iter 174/292 - loss 0.02378070 - time (sec): 9.93 - samples/sec: 2690.71 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:09:04,137 epoch 8 - iter 203/292 - loss 0.02209407 - time (sec): 11.72 - samples/sec: 2726.41 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:09:05,761 epoch 8 - iter 232/292 - loss 0.02304147 - time (sec): 13.35 - samples/sec: 2733.41 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:09:07,268 epoch 8 - iter 261/292 - loss 0.02152923 - time (sec): 14.85 - samples/sec: 2696.49 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:09:08,856 epoch 8 - iter 290/292 - loss 0.02045988 - time (sec): 16.44 - samples/sec: 2689.85 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:09:08,958 ----------------------------------------------------------------------------------------------------
2023-10-16 18:09:08,958 EPOCH 8 done: loss 0.0204 - lr: 0.000007
2023-10-16 18:09:10,489 DEV : loss 0.153466135263443 - f1-score (micro avg)  0.7689
2023-10-16 18:09:10,493 ----------------------------------------------------------------------------------------------------
2023-10-16 18:09:12,364 epoch 9 - iter 29/292 - loss 0.00758189 - time (sec): 1.87 - samples/sec: 2919.25 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:09:14,012 epoch 9 - iter 58/292 - loss 0.01509998 - time (sec): 3.52 - samples/sec: 2654.32 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:09:15,852 epoch 9 - iter 87/292 - loss 0.02017623 - time (sec): 5.36 - samples/sec: 2611.26 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:09:17,652 epoch 9 - iter 116/292 - loss 0.02170409 - time (sec): 7.16 - samples/sec: 2590.86 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:09:19,465 epoch 9 - iter 145/292 - loss 0.01832989 - time (sec): 8.97 - samples/sec: 2530.51 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:09:21,042 epoch 9 - iter 174/292 - loss 0.01817301 - time (sec): 10.55 - samples/sec: 2525.19 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:09:22,702 epoch 9 - iter 203/292 - loss 0.01751352 - time (sec): 12.21 - samples/sec: 2524.94 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:09:24,524 epoch 9 - iter 232/292 - loss 0.01770380 - time (sec): 14.03 - samples/sec: 2524.00 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:09:26,227 epoch 9 - iter 261/292 - loss 0.01613702 - time (sec): 15.73 - samples/sec: 2538.20 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:09:27,928 epoch 9 - iter 290/292 - loss 0.01569710 - time (sec): 17.43 - samples/sec: 2542.91 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:09:28,012 ----------------------------------------------------------------------------------------------------
2023-10-16 18:09:28,012 EPOCH 9 done: loss 0.0157 - lr: 0.000003
2023-10-16 18:09:29,289 DEV : loss 0.1515672355890274 - f1-score (micro avg)  0.756
2023-10-16 18:09:29,294 ----------------------------------------------------------------------------------------------------
2023-10-16 18:09:30,951 epoch 10 - iter 29/292 - loss 0.00834902 - time (sec): 1.66 - samples/sec: 2534.54 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:09:32,573 epoch 10 - iter 58/292 - loss 0.00778017 - time (sec): 3.28 - samples/sec: 2506.92 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:09:34,325 epoch 10 - iter 87/292 - loss 0.00669187 - time (sec): 5.03 - samples/sec: 2503.73 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:09:36,116 epoch 10 - iter 116/292 - loss 0.00906403 - time (sec): 6.82 - samples/sec: 2544.52 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:09:37,714 epoch 10 - iter 145/292 - loss 0.00978993 - time (sec): 8.42 - samples/sec: 2531.12 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:09:39,369 epoch 10 - iter 174/292 - loss 0.01122421 - time (sec): 10.07 - samples/sec: 2571.88 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:09:40,888 epoch 10 - iter 203/292 - loss 0.01027289 - time (sec): 11.59 - samples/sec: 2600.77 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:09:42,713 epoch 10 - iter 232/292 - loss 0.00980163 - time (sec): 13.42 - samples/sec: 2584.77 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:09:44,267 epoch 10 - iter 261/292 - loss 0.00981947 - time (sec): 14.97 - samples/sec: 2603.16 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:09:46,020 epoch 10 - iter 290/292 - loss 0.01181952 - time (sec): 16.73 - samples/sec: 2643.22 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:09:46,120 ----------------------------------------------------------------------------------------------------
2023-10-16 18:09:46,120 EPOCH 10 done: loss 0.0124 - lr: 0.000000
2023-10-16 18:09:47,424 DEV : loss 0.15941958129405975 - f1-score (micro avg)  0.7452
2023-10-16 18:09:47,808 ----------------------------------------------------------------------------------------------------
2023-10-16 18:09:47,810 Loading model from best epoch ...
2023-10-16 18:09:49,528 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 18:09:52,319 
Results:
- F-score (micro) 0.7547
- F-score (macro) 0.6975
- Accuracy 0.6325

By class:
              precision    recall  f1-score   support

         PER     0.8187    0.8305    0.8245       348
         LOC     0.6433    0.8084    0.7165       261
         ORG     0.5128    0.3846    0.4396        52
   HumanProd     0.8500    0.7727    0.8095        22

   micro avg     0.7257    0.7862    0.7547       683
   macro avg     0.7062    0.6991    0.6975       683
weighted avg     0.7294    0.7862    0.7534       683
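The reported aggregates are consistent with the per-class table. This sketch (not part of the log) reconstructs each class's true-positive and prediction counts from precision/recall/support and recomputes the micro and macro averages:

```python
# (precision, recall, support) per class, copied from the table above.
classes = {
    "PER":       (0.8187, 0.8305, 348),
    "LOC":       (0.6433, 0.8084, 261),
    "ORG":       (0.5128, 0.3846, 52),
    "HumanProd": (0.8500, 0.7727, 22),
}

tp_sum = pred_sum = gold_sum = 0
f1_sum = 0.0
for p, r, support in classes.values():
    tp = round(r * support)  # true positives for this class
    pred = round(tp / p)     # spans predicted as this class
    f1_sum += 2 * tp / (pred + support)
    tp_sum += tp
    pred_sum += pred
    gold_sum += support

micro_p = tp_sum / pred_sum              # 0.7257
micro_r = tp_sum / gold_sum              # 0.7862
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)  # 0.7547
macro_f1 = f1_sum / len(classes)         # 0.6975
```

Note that micro averaging weights each of the 683 gold spans equally, while the macro F-score is dragged down mainly by the weak ORG class (f1 0.4396 on only 52 spans).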
2023-10-16 18:09:52,319 ----------------------------------------------------------------------------------------------------