2023-10-25 18:20:32,214 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,215 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 18:20:32,216 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,216 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 18:20:32,216 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,216 Train: 7142 sentences
2023-10-25 18:20:32,216 (train_with_dev=False, train_with_test=False)
2023-10-25 18:20:32,216 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,216 Training Params:
2023-10-25 18:20:32,216  - learning_rate: "3e-05"
2023-10-25 18:20:32,216  - mini_batch_size: "4"
2023-10-25 18:20:32,216  - max_epochs: "10"
2023-10-25 18:20:32,216  - shuffle: "True"
2023-10-25 18:20:32,216 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,216 Plugins:
2023-10-25 18:20:32,216  - TensorboardLogger
2023-10-25 18:20:32,216  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 18:20:32,216 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,216 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 18:20:32,216  - metric: "('micro avg', 'f1-score')"
2023-10-25 18:20:32,216 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,216 Computation:
2023-10-25 18:20:32,216  - compute on device: cuda:0
2023-10-25 18:20:32,216  - embedding storage: none
2023-10-25 18:20:32,217 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,217 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 18:20:32,217 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,217 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,217 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 18:20:41,829 epoch 1 - iter 178/1786 - loss 1.62894423 - time (sec): 9.61 - samples/sec: 2374.31 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:20:51,370 epoch 1 - iter 356/1786 - loss 1.05338819 - time (sec): 19.15 - samples/sec: 2440.03 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:21:00,897 epoch 1 - iter 534/1786 - loss 0.81520377 - time (sec): 28.68 - samples/sec: 2483.98 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:21:10,205 epoch 1 - iter 712/1786 - loss 0.65815637 - time (sec): 37.99 - samples/sec: 2589.41 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:21:18,795 epoch 1 - iter 890/1786 - loss 0.56620539 - time (sec): 46.58 - samples/sec: 2635.83 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:21:27,476 epoch 1 - iter 1068/1786 - loss 0.50136101 - time (sec): 55.26 - samples/sec: 2648.12 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:21:36,384 epoch 1 - iter 1246/1786 - loss 0.45154314 - time (sec): 64.17 - samples/sec: 2662.89 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:21:45,468 epoch 1 - iter 1424/1786 - loss 0.41094756 - time (sec): 73.25 - samples/sec: 2680.96 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:21:54,948 epoch 1 - iter 1602/1786 - loss 0.37891662 - time (sec): 82.73 - samples/sec: 2692.41 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:22:04,798 epoch 1 - iter 1780/1786 - loss 0.35716707 - time (sec): 92.58 - samples/sec: 2677.72 - lr: 0.000030 - momentum: 0.000000
2023-10-25 18:22:05,143 ----------------------------------------------------------------------------------------------------
2023-10-25 18:22:05,143 EPOCH 1 done: loss 0.3565 - lr: 0.000030
2023-10-25 18:22:08,918 DEV : loss 0.10421743243932724 - f1-score (micro avg)  0.7273
2023-10-25 18:22:08,940 saving best model
2023-10-25 18:22:09,391 ----------------------------------------------------------------------------------------------------
2023-10-25 18:22:19,092 epoch 2 - iter 178/1786 - loss 0.11382450 - time (sec): 9.70 - samples/sec: 2665.09 - lr: 0.000030 - momentum: 0.000000
2023-10-25 18:22:28,505 epoch 2 - iter 356/1786 - loss 0.11818253 - time (sec): 19.11 - samples/sec: 2534.69 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:22:37,505 epoch 2 - iter 534/1786 - loss 0.11752290 - time (sec): 28.11 - samples/sec: 2607.90 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:22:46,687 epoch 2 - iter 712/1786 - loss 0.11710291 - time (sec): 37.29 - samples/sec: 2656.23 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:22:55,874 epoch 2 - iter 890/1786 - loss 0.11761515 - time (sec): 46.48 - samples/sec: 2626.69 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:23:04,857 epoch 2 - iter 1068/1786 - loss 0.11880463 - time (sec): 55.46 - samples/sec: 2647.77 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:23:13,642 epoch 2 - iter 1246/1786 - loss 0.11830693 - time (sec): 64.25 - samples/sec: 2669.76 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:23:22,746 epoch 2 - iter 1424/1786 - loss 0.11678831 - time (sec): 73.35 - samples/sec: 2706.42 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:23:31,977 epoch 2 - iter 1602/1786 - loss 0.11600197 - time (sec): 82.58 - samples/sec: 2700.85 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:23:41,217 epoch 2 - iter 1780/1786 - loss 0.11609587 - time (sec): 91.82 - samples/sec: 2701.26 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:23:41,525 ----------------------------------------------------------------------------------------------------
2023-10-25 18:23:41,526 EPOCH 2 done: loss 0.1160 - lr: 0.000027
2023-10-25 18:23:46,583 DEV : loss 0.10009025037288666 - f1-score (micro avg)  0.7704
2023-10-25 18:23:46,604 saving best model
2023-10-25 18:23:47,260 ----------------------------------------------------------------------------------------------------
2023-10-25 18:23:56,854 epoch 3 - iter 178/1786 - loss 0.06189937 - time (sec): 9.59 - samples/sec: 2612.13 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:24:06,469 epoch 3 - iter 356/1786 - loss 0.06991416 - time (sec): 19.21 - samples/sec: 2562.50 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:24:15,985 epoch 3 - iter 534/1786 - loss 0.07505640 - time (sec): 28.72 - samples/sec: 2562.39 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:24:25,614 epoch 3 - iter 712/1786 - loss 0.07238654 - time (sec): 38.35 - samples/sec: 2574.30 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:24:35,379 epoch 3 - iter 890/1786 - loss 0.07251223 - time (sec): 48.12 - samples/sec: 2571.82 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:24:45,040 epoch 3 - iter 1068/1786 - loss 0.07322538 - time (sec): 57.78 - samples/sec: 2580.46 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:24:54,249 epoch 3 - iter 1246/1786 - loss 0.07152561 - time (sec): 66.98 - samples/sec: 2609.81 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:25:03,582 epoch 3 - iter 1424/1786 - loss 0.07161969 - time (sec): 76.32 - samples/sec: 2579.33 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:25:12,455 epoch 3 - iter 1602/1786 - loss 0.07144466 - time (sec): 85.19 - samples/sec: 2609.06 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:25:21,964 epoch 3 - iter 1780/1786 - loss 0.07137444 - time (sec): 94.70 - samples/sec: 2618.28 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:25:22,284 ----------------------------------------------------------------------------------------------------
2023-10-25 18:25:22,284 EPOCH 3 done: loss 0.0713 - lr: 0.000023
2023-10-25 18:25:27,382 DEV : loss 0.13084866106510162 - f1-score (micro avg)  0.7918
2023-10-25 18:25:27,404 saving best model
2023-10-25 18:25:28,076 ----------------------------------------------------------------------------------------------------
2023-10-25 18:25:36,796 epoch 4 - iter 178/1786 - loss 0.04603763 - time (sec): 8.72 - samples/sec: 2822.79 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:25:45,700 epoch 4 - iter 356/1786 - loss 0.04595061 - time (sec): 17.62 - samples/sec: 2858.45 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:25:54,721 epoch 4 - iter 534/1786 - loss 0.05049014 - time (sec): 26.64 - samples/sec: 2796.43 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:26:04,026 epoch 4 - iter 712/1786 - loss 0.05319326 - time (sec): 35.95 - samples/sec: 2750.89 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:26:13,466 epoch 4 - iter 890/1786 - loss 0.05285730 - time (sec): 45.39 - samples/sec: 2713.26 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:26:22,918 epoch 4 - iter 1068/1786 - loss 0.05374856 - time (sec): 54.84 - samples/sec: 2711.40 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:26:32,404 epoch 4 - iter 1246/1786 - loss 0.05460603 - time (sec): 64.33 - samples/sec: 2687.26 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:26:42,081 epoch 4 - iter 1424/1786 - loss 0.05410043 - time (sec): 74.00 - samples/sec: 2681.28 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:26:51,822 epoch 4 - iter 1602/1786 - loss 0.05349229 - time (sec): 83.74 - samples/sec: 2668.08 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:27:01,217 epoch 4 - iter 1780/1786 - loss 0.05316503 - time (sec): 93.14 - samples/sec: 2663.63 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:27:01,526 ----------------------------------------------------------------------------------------------------
2023-10-25 18:27:01,527 EPOCH 4 done: loss 0.0532 - lr: 0.000020
2023-10-25 18:27:06,049 DEV : loss 0.16789670288562775 - f1-score (micro avg)  0.7829
2023-10-25 18:27:06,070 ----------------------------------------------------------------------------------------------------
2023-10-25 18:27:15,697 epoch 5 - iter 178/1786 - loss 0.05389475 - time (sec): 9.63 - samples/sec: 2455.06 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:27:25,190 epoch 5 - iter 356/1786 - loss 0.04260056 - time (sec): 19.12 - samples/sec: 2632.79 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:27:34,006 epoch 5 - iter 534/1786 - loss 0.04052769 - time (sec): 27.93 - samples/sec: 2688.86 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:27:43,195 epoch 5 - iter 712/1786 - loss 0.04001294 - time (sec): 37.12 - samples/sec: 2703.06 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:27:52,669 epoch 5 - iter 890/1786 - loss 0.03975722 - time (sec): 46.60 - samples/sec: 2699.77 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:28:02,256 epoch 5 - iter 1068/1786 - loss 0.03974050 - time (sec): 56.18 - samples/sec: 2654.81 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:28:11,497 epoch 5 - iter 1246/1786 - loss 0.03952798 - time (sec): 65.43 - samples/sec: 2666.55 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:28:20,489 epoch 5 - iter 1424/1786 - loss 0.03903402 - time (sec): 74.42 - samples/sec: 2667.01 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:28:29,211 epoch 5 - iter 1602/1786 - loss 0.03844957 - time (sec): 83.14 - samples/sec: 2669.18 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:28:38,265 epoch 5 - iter 1780/1786 - loss 0.03829347 - time (sec): 92.19 - samples/sec: 2687.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:28:38,578 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:38,578 EPOCH 5 done: loss 0.0382 - lr: 0.000017
2023-10-25 18:28:44,071 DEV : loss 0.19442911446094513 - f1-score (micro avg)  0.7802
2023-10-25 18:28:44,094 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:53,424 epoch 6 - iter 178/1786 - loss 0.03398055 - time (sec): 9.33 - samples/sec: 2700.33 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:29:02,610 epoch 6 - iter 356/1786 - loss 0.03316916 - time (sec): 18.51 - samples/sec: 2757.68 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:29:12,181 epoch 6 - iter 534/1786 - loss 0.03040477 - time (sec): 28.09 - samples/sec: 2648.55 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:29:21,873 epoch 6 - iter 712/1786 - loss 0.03212632 - time (sec): 37.78 - samples/sec: 2628.07 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:29:31,591 epoch 6 - iter 890/1786 - loss 0.02992094 - time (sec): 47.50 - samples/sec: 2632.24 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:29:41,332 epoch 6 - iter 1068/1786 - loss 0.02930104 - time (sec): 57.24 - samples/sec: 2618.93 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:29:50,819 epoch 6 - iter 1246/1786 - loss 0.02972192 - time (sec): 66.72 - samples/sec: 2622.50 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:30:00,534 epoch 6 - iter 1424/1786 - loss 0.02944805 - time (sec): 76.44 - samples/sec: 2594.65 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:30:09,882 epoch 6 - iter 1602/1786 - loss 0.02958451 - time (sec): 85.79 - samples/sec: 2626.36 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:30:18,857 epoch 6 - iter 1780/1786 - loss 0.03003769 - time (sec): 94.76 - samples/sec: 2618.30 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:30:19,162 ----------------------------------------------------------------------------------------------------
2023-10-25 18:30:19,163 EPOCH 6 done: loss 0.0300 - lr: 0.000013
2023-10-25 18:30:23,457 DEV : loss 0.18333885073661804 - f1-score (micro avg)  0.7925
2023-10-25 18:30:23,480 saving best model
2023-10-25 18:30:24,184 ----------------------------------------------------------------------------------------------------
2023-10-25 18:30:34,775 epoch 7 - iter 178/1786 - loss 0.02162460 - time (sec): 10.59 - samples/sec: 2448.54 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:30:44,255 epoch 7 - iter 356/1786 - loss 0.01908762 - time (sec): 20.07 - samples/sec: 2453.18 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:30:53,682 epoch 7 - iter 534/1786 - loss 0.01976296 - time (sec): 29.50 - samples/sec: 2516.85 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:31:02,712 epoch 7 - iter 712/1786 - loss 0.02174272 - time (sec): 38.53 - samples/sec: 2564.31 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:31:11,779 epoch 7 - iter 890/1786 - loss 0.02278711 - time (sec): 47.59 - samples/sec: 2635.47 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:31:20,900 epoch 7 - iter 1068/1786 - loss 0.02154762 - time (sec): 56.71 - samples/sec: 2659.10 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:31:29,668 epoch 7 - iter 1246/1786 - loss 0.02092074 - time (sec): 65.48 - samples/sec: 2699.75 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:31:38,781 epoch 7 - iter 1424/1786 - loss 0.02100336 - time (sec): 74.60 - samples/sec: 2665.33 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:31:48,049 epoch 7 - iter 1602/1786 - loss 0.02189701 - time (sec): 83.86 - samples/sec: 2662.93 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:31:57,161 epoch 7 - iter 1780/1786 - loss 0.02145412 - time (sec): 92.98 - samples/sec: 2665.12 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:31:57,481 ----------------------------------------------------------------------------------------------------
2023-10-25 18:31:57,481 EPOCH 7 done: loss 0.0214 - lr: 0.000010
2023-10-25 18:32:01,934 DEV : loss 0.19056054949760437 - f1-score (micro avg)  0.8075
2023-10-25 18:32:01,956 saving best model
2023-10-25 18:32:02,611 ----------------------------------------------------------------------------------------------------
2023-10-25 18:32:12,182 epoch 8 - iter 178/1786 - loss 0.02612736 - time (sec): 9.57 - samples/sec: 2588.30 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:32:21,872 epoch 8 - iter 356/1786 - loss 0.01894543 - time (sec): 19.26 - samples/sec: 2516.13 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:32:31,455 epoch 8 - iter 534/1786 - loss 0.01672658 - time (sec): 28.84 - samples/sec: 2550.32 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:32:40,981 epoch 8 - iter 712/1786 - loss 0.01545569 - time (sec): 38.37 - samples/sec: 2583.47 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:32:50,491 epoch 8 - iter 890/1786 - loss 0.01420894 - time (sec): 47.88 - samples/sec: 2582.39 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:32:59,910 epoch 8 - iter 1068/1786 - loss 0.01478084 - time (sec): 57.30 - samples/sec: 2565.98 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:33:09,307 epoch 8 - iter 1246/1786 - loss 0.01595451 - time (sec): 66.69 - samples/sec: 2572.39 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:33:18,275 epoch 8 - iter 1424/1786 - loss 0.01587470 - time (sec): 75.66 - samples/sec: 2598.46 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:33:27,281 epoch 8 - iter 1602/1786 - loss 0.01586368 - time (sec): 84.67 - samples/sec: 2636.76 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:33:36,357 epoch 8 - iter 1780/1786 - loss 0.01584675 - time (sec): 93.74 - samples/sec: 2645.49 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:33:36,660 ----------------------------------------------------------------------------------------------------
2023-10-25 18:33:36,660 EPOCH 8 done: loss 0.0158 - lr: 0.000007
2023-10-25 18:33:42,265 DEV : loss 0.19766351580619812 - f1-score (micro avg)  0.8038
2023-10-25 18:33:42,287 ----------------------------------------------------------------------------------------------------
2023-10-25 18:33:51,969 epoch 9 - iter 178/1786 - loss 0.01028428 - time (sec): 9.68 - samples/sec: 2633.27 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:34:01,084 epoch 9 - iter 356/1786 - loss 0.00955984 - time (sec): 18.80 - samples/sec: 2571.59 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:34:10,368 epoch 9 - iter 534/1786 - loss 0.01151104 - time (sec): 28.08 - samples/sec: 2581.54 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:34:19,236 epoch 9 - iter 712/1786 - loss 0.01207848 - time (sec): 36.95 - samples/sec: 2625.24 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:34:28,301 epoch 9 - iter 890/1786 - loss 0.01138794 - time (sec): 46.01 - samples/sec: 2612.46 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:34:37,234 epoch 9 - iter 1068/1786 - loss 0.01116143 - time (sec): 54.95 - samples/sec: 2672.29 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:34:46,369 epoch 9 - iter 1246/1786 - loss 0.01155486 - time (sec): 64.08 - samples/sec: 2669.89 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:34:55,509 epoch 9 - iter 1424/1786 - loss 0.01114669 - time (sec): 73.22 - samples/sec: 2685.06 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:35:04,412 epoch 9 - iter 1602/1786 - loss 0.01110376 - time (sec): 82.12 - samples/sec: 2697.08 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:35:13,124 epoch 9 - iter 1780/1786 - loss 0.01074968 - time (sec): 90.84 - samples/sec: 2731.45 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:35:13,422 ----------------------------------------------------------------------------------------------------
2023-10-25 18:35:13,423 EPOCH 9 done: loss 0.0108 - lr: 0.000003
2023-10-25 18:35:17,832 DEV : loss 0.21357937157154083 - f1-score (micro avg)  0.8104
2023-10-25 18:35:17,855 saving best model
2023-10-25 18:35:18,543 ----------------------------------------------------------------------------------------------------
2023-10-25 18:35:28,085 epoch 10 - iter 178/1786 - loss 0.00637564 - time (sec): 9.54 - samples/sec: 2632.83 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:35:37,383 epoch 10 - iter 356/1786 - loss 0.00589645 - time (sec): 18.84 - samples/sec: 2586.16 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:35:46,529 epoch 10 - iter 534/1786 - loss 0.00533950 - time (sec): 27.98 - samples/sec: 2649.13 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:35:55,356 epoch 10 - iter 712/1786 - loss 0.00617715 - time (sec): 36.81 - samples/sec: 2702.22 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:36:04,322 epoch 10 - iter 890/1786 - loss 0.00620859 - time (sec): 45.78 - samples/sec: 2721.41 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:36:13,367 epoch 10 - iter 1068/1786 - loss 0.00641849 - time (sec): 54.82 - samples/sec: 2715.20 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:36:22,629 epoch 10 - iter 1246/1786 - loss 0.00676503 - time (sec): 64.08 - samples/sec: 2702.32 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:36:32,265 epoch 10 - iter 1424/1786 - loss 0.00651546 - time (sec): 73.72 - samples/sec: 2700.44 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:36:41,692 epoch 10 - iter 1602/1786 - loss 0.00660118 - time (sec): 83.15 - samples/sec: 2690.29 - lr: 0.000000 - momentum: 0.000000
2023-10-25 18:36:50,765 epoch 10 - iter 1780/1786 - loss 0.00682484 - time (sec): 92.22 - samples/sec: 2689.05 - lr: 0.000000 - momentum: 0.000000
2023-10-25 18:36:51,073 ----------------------------------------------------------------------------------------------------
2023-10-25 18:36:51,074 EPOCH 10 done: loss 0.0068 - lr: 0.000000
2023-10-25 18:36:56,195 DEV : loss 0.21570290625095367 - f1-score (micro avg)  0.8096
2023-10-25 18:36:56,710 ----------------------------------------------------------------------------------------------------
2023-10-25 18:36:56,712 Loading model from best epoch ...
2023-10-25 18:36:58,657 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 18:37:13,089 Results:
- F-score (micro) 0.6996
- F-score (macro) 0.6261
- Accuracy 0.5539

By class:
              precision    recall  f1-score   support

         LOC     0.6930    0.7050    0.6990      1095
         PER     0.7827    0.7796    0.7812      1012
         ORG     0.4655    0.5854    0.5186       357
   HumanProd     0.3966    0.6970    0.5055        33

   micro avg     0.6820    0.7181    0.6996      2497
   macro avg     0.5844    0.6918    0.6261      2497
weighted avg     0.6929    0.7181    0.7039      2497

2023-10-25 18:37:13,089 ----------------------------------------------------------------------------------------------------
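The lr column above follows the LinearScheduler plugin's schedule: with warmup_fraction 0.1 over 10 epochs of 1786 mini-batches each (17860 steps total), the learning rate ramps linearly to the 3e-05 peak during epoch 1 and then decays linearly to zero. A minimal sketch of that shape (the helper name and step convention are my assumptions, not Flair's actual implementation):

```python
def linear_warmup_lr(step: int, total_steps: int, peak_lr: float,
                     warmup_fraction: float = 0.1) -> float:
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps                 # warmup phase
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay

total = 10 * 1786  # 17860 optimizer steps in this run
print(round(linear_warmup_lr(178, total, 3e-05), 6))   # ~0.000003, as logged at epoch 1, iter 178
print(round(linear_warmup_lr(total, total, 3e-05), 6)) # 0.0 at the final step
```

The logged values match this shape: lr rises 0.000003 → 0.000030 across epoch 1, then falls steadily to 0.000000 by the end of epoch 10.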
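The "saving best model" lines fire whenever the dev micro-F1 improves, so best-model.pt holds the epoch with the highest dev score, which is what the final evaluation loads. Selecting that epoch from the logged dev scores:

```python
# Dev micro-F1 per epoch, transcribed from the DEV lines above
dev_f1 = {1: 0.7273, 2: 0.7704, 3: 0.7918, 4: 0.7829, 5: 0.7802,
          6: 0.7925, 7: 0.8075, 8: 0.8038, 9: 0.8104, 10: 0.8096}

best_epoch = max(dev_f1, key=dev_f1.get)  # epoch whose checkpoint survives as best-model.pt
print(best_epoch, dev_f1[best_epoch])     # 9 0.8104
```

Note the gap between the best dev score (0.8104) and the final test micro-F1 (0.6996), a common pattern on historical-newspaper NER where test material differs from the dev split.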
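The summary metrics in the final report are internally consistent: each f1-score is the harmonic mean of its precision and recall, and the macro f1 is the unweighted mean of the four per-class f1-scores. A quick arithmetic check (plain Python, using only numbers from the table above):

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg row: precision 0.6820, recall 0.7181
print(round(f1(0.6820, 0.7181), 4))  # 0.6996, the reported F-score (micro)

# macro f1 = unweighted mean of per-class f1 (LOC, PER, ORG, HumanProd)
per_class_f1 = [0.6990, 0.7812, 0.5186, 0.5055]
print(round(sum(per_class_f1) / len(per_class_f1), 4))  # 0.6261, the reported F-score (macro)
```

The weighted avg row instead weights each class f1 by its support (1095, 1012, 357, 33 of 2497 entities), which is why it sits closer to the strong LOC and PER scores.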