|
2023-10-25 18:20:32,214 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:20:32,215 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(64001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-25 18:20:32,216 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:20:32,216 MultiCorpus: 7142 train + 698 dev + 2570 test sentences |
|
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator |
|
2023-10-25 18:20:32,216 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:20:32,216 Train: 7142 sentences |
|
2023-10-25 18:20:32,216 (train_with_dev=False, train_with_test=False) |
|
2023-10-25 18:20:32,216 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:20:32,216 Training Params: |
|
2023-10-25 18:20:32,216 - learning_rate: "3e-05" |
|
2023-10-25 18:20:32,216 - mini_batch_size: "4" |
|
2023-10-25 18:20:32,216 - max_epochs: "10" |
|
2023-10-25 18:20:32,216 - shuffle: "True" |
|
2023-10-25 18:20:32,216 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:20:32,216 Plugins: |
|
2023-10-25 18:20:32,216 - TensorboardLogger |
|
2023-10-25 18:20:32,216 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-25 18:20:32,216 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:20:32,216 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-25 18:20:32,216 - metric: "('micro avg', 'f1-score')" |
|
2023-10-25 18:20:32,216 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:20:32,216 Computation: |
|
2023-10-25 18:20:32,216 - compute on device: cuda:0 |
|
2023-10-25 18:20:32,216 - embedding storage: none |
|
2023-10-25 18:20:32,217 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:20:32,217 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-25 18:20:32,217 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:20:32,217 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:20:32,217 Logging anything other than scalars to TensorBoard is currently not supported. |
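
Editor's note: for readers who want to approximate the configuration logged above (historic multilingual 64k BERT embeddings, final layer only, first-subtoken pooling, no CRF, batch size 4, 10 epochs, peak learning rate 3e-05 with linear warmup), the following is a minimal sketch using the Flair API. It is not the exact hmBench training script; the corpus constructor arguments, the hidden_size value, and the omission of the TensorBoard plugin are assumptions.

    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # HIPE-2022 NewsEye French corpus (7142 train / 698 dev / 2570 test sentences above).
    # Constructor arguments are assumptions inferred from the logged dataset path,
    # which ends in "with_doc_seperator".
    corpus = NER_HIPE_2022(dataset_name="newseye", language="fr",
                           add_document_separator=True)
    label_dict = corpus.make_label_dictionary(label_type="ner", add_unk=False)

    # Final-layer, first-subtoken embeddings, matching "-poolingfirst-layers-1-"
    # in the training base path.
    embeddings = TransformerWordEmbeddings(
        model="dbmdz/bert-base-historic-multilingual-64k-td-cased",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    # No RNN, no CRF, no reprojection: the printed model is just
    # LockedDropout -> Linear(768, 17) with CrossEntropyLoss.
    tagger = SequenceTagger(
        hidden_size=256,  # required argument; unused because use_rnn=False
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    # fine_tune() trains with AdamW and a linear LR schedule with warmup;
    # the plugin listing above shows warmup_fraction 0.1. The TensorboardLogger
    # plugin used in the logged run is omitted here.
    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased"
        "-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5",
        learning_rate=3e-05,
        mini_batch_size=4,
        max_epochs=10,
    )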
|
2023-10-25 18:20:41,829 epoch 1 - iter 178/1786 - loss 1.62894423 - time (sec): 9.61 - samples/sec: 2374.31 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 18:20:51,370 epoch 1 - iter 356/1786 - loss 1.05338819 - time (sec): 19.15 - samples/sec: 2440.03 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 18:21:00,897 epoch 1 - iter 534/1786 - loss 0.81520377 - time (sec): 28.68 - samples/sec: 2483.98 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 18:21:10,205 epoch 1 - iter 712/1786 - loss 0.65815637 - time (sec): 37.99 - samples/sec: 2589.41 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 18:21:18,795 epoch 1 - iter 890/1786 - loss 0.56620539 - time (sec): 46.58 - samples/sec: 2635.83 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 18:21:27,476 epoch 1 - iter 1068/1786 - loss 0.50136101 - time (sec): 55.26 - samples/sec: 2648.12 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 18:21:36,384 epoch 1 - iter 1246/1786 - loss 0.45154314 - time (sec): 64.17 - samples/sec: 2662.89 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 18:21:45,468 epoch 1 - iter 1424/1786 - loss 0.41094756 - time (sec): 73.25 - samples/sec: 2680.96 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 18:21:54,948 epoch 1 - iter 1602/1786 - loss 0.37891662 - time (sec): 82.73 - samples/sec: 2692.41 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 18:22:04,798 epoch 1 - iter 1780/1786 - loss 0.35716707 - time (sec): 92.58 - samples/sec: 2677.72 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 18:22:05,143 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:22:05,143 EPOCH 1 done: loss 0.3565 - lr: 0.000030 |
|
2023-10-25 18:22:08,918 DEV : loss 0.10421743243932724 - f1-score (micro avg) 0.7273 |
|
2023-10-25 18:22:08,940 saving best model |
|
2023-10-25 18:22:09,391 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:22:19,092 epoch 2 - iter 178/1786 - loss 0.11382450 - time (sec): 9.70 - samples/sec: 2665.09 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 18:22:28,505 epoch 2 - iter 356/1786 - loss 0.11818253 - time (sec): 19.11 - samples/sec: 2534.69 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 18:22:37,505 epoch 2 - iter 534/1786 - loss 0.11752290 - time (sec): 28.11 - samples/sec: 2607.90 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 18:22:46,687 epoch 2 - iter 712/1786 - loss 0.11710291 - time (sec): 37.29 - samples/sec: 2656.23 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 18:22:55,874 epoch 2 - iter 890/1786 - loss 0.11761515 - time (sec): 46.48 - samples/sec: 2626.69 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 18:23:04,857 epoch 2 - iter 1068/1786 - loss 0.11880463 - time (sec): 55.46 - samples/sec: 2647.77 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 18:23:13,642 epoch 2 - iter 1246/1786 - loss 0.11830693 - time (sec): 64.25 - samples/sec: 2669.76 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 18:23:22,746 epoch 2 - iter 1424/1786 - loss 0.11678831 - time (sec): 73.35 - samples/sec: 2706.42 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 18:23:31,977 epoch 2 - iter 1602/1786 - loss 0.11600197 - time (sec): 82.58 - samples/sec: 2700.85 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 18:23:41,217 epoch 2 - iter 1780/1786 - loss 0.11609587 - time (sec): 91.82 - samples/sec: 2701.26 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 18:23:41,525 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:23:41,526 EPOCH 2 done: loss 0.1160 - lr: 0.000027 |
|
2023-10-25 18:23:46,583 DEV : loss 0.10009025037288666 - f1-score (micro avg) 0.7704 |
|
2023-10-25 18:23:46,604 saving best model |
|
2023-10-25 18:23:47,260 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:23:56,854 epoch 3 - iter 178/1786 - loss 0.06189937 - time (sec): 9.59 - samples/sec: 2612.13 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 18:24:06,469 epoch 3 - iter 356/1786 - loss 0.06991416 - time (sec): 19.21 - samples/sec: 2562.50 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 18:24:15,985 epoch 3 - iter 534/1786 - loss 0.07505640 - time (sec): 28.72 - samples/sec: 2562.39 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 18:24:25,614 epoch 3 - iter 712/1786 - loss 0.07238654 - time (sec): 38.35 - samples/sec: 2574.30 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 18:24:35,379 epoch 3 - iter 890/1786 - loss 0.07251223 - time (sec): 48.12 - samples/sec: 2571.82 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 18:24:45,040 epoch 3 - iter 1068/1786 - loss 0.07322538 - time (sec): 57.78 - samples/sec: 2580.46 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 18:24:54,249 epoch 3 - iter 1246/1786 - loss 0.07152561 - time (sec): 66.98 - samples/sec: 2609.81 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 18:25:03,582 epoch 3 - iter 1424/1786 - loss 0.07161969 - time (sec): 76.32 - samples/sec: 2579.33 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 18:25:12,455 epoch 3 - iter 1602/1786 - loss 0.07144466 - time (sec): 85.19 - samples/sec: 2609.06 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 18:25:21,964 epoch 3 - iter 1780/1786 - loss 0.07137444 - time (sec): 94.70 - samples/sec: 2618.28 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 18:25:22,284 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:25:22,284 EPOCH 3 done: loss 0.0713 - lr: 0.000023 |
|
2023-10-25 18:25:27,382 DEV : loss 0.13084866106510162 - f1-score (micro avg) 0.7918 |
|
2023-10-25 18:25:27,404 saving best model |
|
2023-10-25 18:25:28,076 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:25:36,796 epoch 4 - iter 178/1786 - loss 0.04603763 - time (sec): 8.72 - samples/sec: 2822.79 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 18:25:45,700 epoch 4 - iter 356/1786 - loss 0.04595061 - time (sec): 17.62 - samples/sec: 2858.45 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 18:25:54,721 epoch 4 - iter 534/1786 - loss 0.05049014 - time (sec): 26.64 - samples/sec: 2796.43 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 18:26:04,026 epoch 4 - iter 712/1786 - loss 0.05319326 - time (sec): 35.95 - samples/sec: 2750.89 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 18:26:13,466 epoch 4 - iter 890/1786 - loss 0.05285730 - time (sec): 45.39 - samples/sec: 2713.26 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 18:26:22,918 epoch 4 - iter 1068/1786 - loss 0.05374856 - time (sec): 54.84 - samples/sec: 2711.40 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 18:26:32,404 epoch 4 - iter 1246/1786 - loss 0.05460603 - time (sec): 64.33 - samples/sec: 2687.26 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 18:26:42,081 epoch 4 - iter 1424/1786 - loss 0.05410043 - time (sec): 74.00 - samples/sec: 2681.28 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 18:26:51,822 epoch 4 - iter 1602/1786 - loss 0.05349229 - time (sec): 83.74 - samples/sec: 2668.08 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 18:27:01,217 epoch 4 - iter 1780/1786 - loss 0.05316503 - time (sec): 93.14 - samples/sec: 2663.63 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 18:27:01,526 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:27:01,527 EPOCH 4 done: loss 0.0532 - lr: 0.000020 |
|
2023-10-25 18:27:06,049 DEV : loss 0.16789670288562775 - f1-score (micro avg) 0.7829 |
|
2023-10-25 18:27:06,070 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:27:15,697 epoch 5 - iter 178/1786 - loss 0.05389475 - time (sec): 9.63 - samples/sec: 2455.06 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 18:27:25,190 epoch 5 - iter 356/1786 - loss 0.04260056 - time (sec): 19.12 - samples/sec: 2632.79 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 18:27:34,006 epoch 5 - iter 534/1786 - loss 0.04052769 - time (sec): 27.93 - samples/sec: 2688.86 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 18:27:43,195 epoch 5 - iter 712/1786 - loss 0.04001294 - time (sec): 37.12 - samples/sec: 2703.06 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 18:27:52,669 epoch 5 - iter 890/1786 - loss 0.03975722 - time (sec): 46.60 - samples/sec: 2699.77 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 18:28:02,256 epoch 5 - iter 1068/1786 - loss 0.03974050 - time (sec): 56.18 - samples/sec: 2654.81 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 18:28:11,497 epoch 5 - iter 1246/1786 - loss 0.03952798 - time (sec): 65.43 - samples/sec: 2666.55 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 18:28:20,489 epoch 5 - iter 1424/1786 - loss 0.03903402 - time (sec): 74.42 - samples/sec: 2667.01 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 18:28:29,211 epoch 5 - iter 1602/1786 - loss 0.03844957 - time (sec): 83.14 - samples/sec: 2669.18 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 18:28:38,265 epoch 5 - iter 1780/1786 - loss 0.03829347 - time (sec): 92.19 - samples/sec: 2687.67 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 18:28:38,578 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:28:38,578 EPOCH 5 done: loss 0.0382 - lr: 0.000017 |
|
2023-10-25 18:28:44,071 DEV : loss 0.19442911446094513 - f1-score (micro avg) 0.7802 |
|
2023-10-25 18:28:44,094 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:28:53,424 epoch 6 - iter 178/1786 - loss 0.03398055 - time (sec): 9.33 - samples/sec: 2700.33 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 18:29:02,610 epoch 6 - iter 356/1786 - loss 0.03316916 - time (sec): 18.51 - samples/sec: 2757.68 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 18:29:12,181 epoch 6 - iter 534/1786 - loss 0.03040477 - time (sec): 28.09 - samples/sec: 2648.55 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 18:29:21,873 epoch 6 - iter 712/1786 - loss 0.03212632 - time (sec): 37.78 - samples/sec: 2628.07 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 18:29:31,591 epoch 6 - iter 890/1786 - loss 0.02992094 - time (sec): 47.50 - samples/sec: 2632.24 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 18:29:41,332 epoch 6 - iter 1068/1786 - loss 0.02930104 - time (sec): 57.24 - samples/sec: 2618.93 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 18:29:50,819 epoch 6 - iter 1246/1786 - loss 0.02972192 - time (sec): 66.72 - samples/sec: 2622.50 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 18:30:00,534 epoch 6 - iter 1424/1786 - loss 0.02944805 - time (sec): 76.44 - samples/sec: 2594.65 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 18:30:09,882 epoch 6 - iter 1602/1786 - loss 0.02958451 - time (sec): 85.79 - samples/sec: 2626.36 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 18:30:18,857 epoch 6 - iter 1780/1786 - loss 0.03003769 - time (sec): 94.76 - samples/sec: 2618.30 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 18:30:19,162 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:30:19,163 EPOCH 6 done: loss 0.0300 - lr: 0.000013 |
|
2023-10-25 18:30:23,457 DEV : loss 0.18333885073661804 - f1-score (micro avg) 0.7925 |
|
2023-10-25 18:30:23,480 saving best model |
|
2023-10-25 18:30:24,184 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:30:34,775 epoch 7 - iter 178/1786 - loss 0.02162460 - time (sec): 10.59 - samples/sec: 2448.54 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 18:30:44,255 epoch 7 - iter 356/1786 - loss 0.01908762 - time (sec): 20.07 - samples/sec: 2453.18 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 18:30:53,682 epoch 7 - iter 534/1786 - loss 0.01976296 - time (sec): 29.50 - samples/sec: 2516.85 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 18:31:02,712 epoch 7 - iter 712/1786 - loss 0.02174272 - time (sec): 38.53 - samples/sec: 2564.31 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 18:31:11,779 epoch 7 - iter 890/1786 - loss 0.02278711 - time (sec): 47.59 - samples/sec: 2635.47 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 18:31:20,900 epoch 7 - iter 1068/1786 - loss 0.02154762 - time (sec): 56.71 - samples/sec: 2659.10 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 18:31:29,668 epoch 7 - iter 1246/1786 - loss 0.02092074 - time (sec): 65.48 - samples/sec: 2699.75 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 18:31:38,781 epoch 7 - iter 1424/1786 - loss 0.02100336 - time (sec): 74.60 - samples/sec: 2665.33 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 18:31:48,049 epoch 7 - iter 1602/1786 - loss 0.02189701 - time (sec): 83.86 - samples/sec: 2662.93 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 18:31:57,161 epoch 7 - iter 1780/1786 - loss 0.02145412 - time (sec): 92.98 - samples/sec: 2665.12 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 18:31:57,481 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:31:57,481 EPOCH 7 done: loss 0.0214 - lr: 0.000010 |
|
2023-10-25 18:32:01,934 DEV : loss 0.19056054949760437 - f1-score (micro avg) 0.8075 |
|
2023-10-25 18:32:01,956 saving best model |
|
2023-10-25 18:32:02,611 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:32:12,182 epoch 8 - iter 178/1786 - loss 0.02612736 - time (sec): 9.57 - samples/sec: 2588.30 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 18:32:21,872 epoch 8 - iter 356/1786 - loss 0.01894543 - time (sec): 19.26 - samples/sec: 2516.13 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 18:32:31,455 epoch 8 - iter 534/1786 - loss 0.01672658 - time (sec): 28.84 - samples/sec: 2550.32 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 18:32:40,981 epoch 8 - iter 712/1786 - loss 0.01545569 - time (sec): 38.37 - samples/sec: 2583.47 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 18:32:50,491 epoch 8 - iter 890/1786 - loss 0.01420894 - time (sec): 47.88 - samples/sec: 2582.39 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 18:32:59,910 epoch 8 - iter 1068/1786 - loss 0.01478084 - time (sec): 57.30 - samples/sec: 2565.98 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 18:33:09,307 epoch 8 - iter 1246/1786 - loss 0.01595451 - time (sec): 66.69 - samples/sec: 2572.39 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 18:33:18,275 epoch 8 - iter 1424/1786 - loss 0.01587470 - time (sec): 75.66 - samples/sec: 2598.46 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 18:33:27,281 epoch 8 - iter 1602/1786 - loss 0.01586368 - time (sec): 84.67 - samples/sec: 2636.76 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 18:33:36,357 epoch 8 - iter 1780/1786 - loss 0.01584675 - time (sec): 93.74 - samples/sec: 2645.49 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 18:33:36,660 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:33:36,660 EPOCH 8 done: loss 0.0158 - lr: 0.000007 |
|
2023-10-25 18:33:42,265 DEV : loss 0.19766351580619812 - f1-score (micro avg) 0.8038 |
|
2023-10-25 18:33:42,287 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:33:51,969 epoch 9 - iter 178/1786 - loss 0.01028428 - time (sec): 9.68 - samples/sec: 2633.27 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 18:34:01,084 epoch 9 - iter 356/1786 - loss 0.00955984 - time (sec): 18.80 - samples/sec: 2571.59 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 18:34:10,368 epoch 9 - iter 534/1786 - loss 0.01151104 - time (sec): 28.08 - samples/sec: 2581.54 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 18:34:19,236 epoch 9 - iter 712/1786 - loss 0.01207848 - time (sec): 36.95 - samples/sec: 2625.24 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 18:34:28,301 epoch 9 - iter 890/1786 - loss 0.01138794 - time (sec): 46.01 - samples/sec: 2612.46 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 18:34:37,234 epoch 9 - iter 1068/1786 - loss 0.01116143 - time (sec): 54.95 - samples/sec: 2672.29 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 18:34:46,369 epoch 9 - iter 1246/1786 - loss 0.01155486 - time (sec): 64.08 - samples/sec: 2669.89 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 18:34:55,509 epoch 9 - iter 1424/1786 - loss 0.01114669 - time (sec): 73.22 - samples/sec: 2685.06 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 18:35:04,412 epoch 9 - iter 1602/1786 - loss 0.01110376 - time (sec): 82.12 - samples/sec: 2697.08 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 18:35:13,124 epoch 9 - iter 1780/1786 - loss 0.01074968 - time (sec): 90.84 - samples/sec: 2731.45 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 18:35:13,422 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:35:13,423 EPOCH 9 done: loss 0.0108 - lr: 0.000003 |
|
2023-10-25 18:35:17,832 DEV : loss 0.21357937157154083 - f1-score (micro avg) 0.8104 |
|
2023-10-25 18:35:17,855 saving best model |
|
2023-10-25 18:35:18,543 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:35:28,085 epoch 10 - iter 178/1786 - loss 0.00637564 - time (sec): 9.54 - samples/sec: 2632.83 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 18:35:37,383 epoch 10 - iter 356/1786 - loss 0.00589645 - time (sec): 18.84 - samples/sec: 2586.16 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 18:35:46,529 epoch 10 - iter 534/1786 - loss 0.00533950 - time (sec): 27.98 - samples/sec: 2649.13 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 18:35:55,356 epoch 10 - iter 712/1786 - loss 0.00617715 - time (sec): 36.81 - samples/sec: 2702.22 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 18:36:04,322 epoch 10 - iter 890/1786 - loss 0.00620859 - time (sec): 45.78 - samples/sec: 2721.41 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 18:36:13,367 epoch 10 - iter 1068/1786 - loss 0.00641849 - time (sec): 54.82 - samples/sec: 2715.20 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 18:36:22,629 epoch 10 - iter 1246/1786 - loss 0.00676503 - time (sec): 64.08 - samples/sec: 2702.32 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 18:36:32,265 epoch 10 - iter 1424/1786 - loss 0.00651546 - time (sec): 73.72 - samples/sec: 2700.44 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 18:36:41,692 epoch 10 - iter 1602/1786 - loss 0.00660118 - time (sec): 83.15 - samples/sec: 2690.29 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-25 18:36:50,765 epoch 10 - iter 1780/1786 - loss 0.00682484 - time (sec): 92.22 - samples/sec: 2689.05 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-25 18:36:51,073 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:36:51,074 EPOCH 10 done: loss 0.0068 - lr: 0.000000 |
|
2023-10-25 18:36:56,195 DEV : loss 0.21570290625095367 - f1-score (micro avg) 0.8096 |
|
2023-10-25 18:36:56,710 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:36:56,712 Loading model from best epoch ... |
|
2023-10-25 18:36:58,657 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
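
Editor's note: the saved best-model.pt can be loaded for inference in the usual Flair way. A minimal sketch follows; the example sentence is made up, and the path is simply the training base path above plus the checkpoint name.

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Load the checkpoint written at the "saving best model" steps above.
    tagger = SequenceTagger.load(
        "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased"
        "-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
    )

    # Illustrative French sentence (not taken from the corpus).
    sentence = Sentence("Victor Hugo est né à Besançon .")
    tagger.predict(sentence)

    # Predicted spans are decoded from the BIOES scheme (S-/B-/I-/E- prefixes)
    # of the 17-tag dictionary listed above.
    for span in sentence.get_spans("ner"):
        print(span.text, span.get_label("ner").value, span.get_label("ner").score)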
|
2023-10-25 18:37:13,089 |
|
Results: |
|
- F-score (micro) 0.6996 |
|
- F-score (macro) 0.6261 |
|
- Accuracy 0.5539 |
|
|
|
By class: |
|
precision recall f1-score support |
|
|
|
LOC 0.6930 0.7050 0.6990 1095 |
|
PER 0.7827 0.7796 0.7812 1012 |
|
ORG 0.4655 0.5854 0.5186 357 |
|
HumanProd 0.3966 0.6970 0.5055 33 |
|
|
|
micro avg 0.6820 0.7181 0.6996 2497 |
|
macro avg 0.5844 0.6918 0.6261 2497 |
|
weighted avg 0.6929 0.7181 0.7039 2497 |
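
Editor's note: as a sanity check, the micro-averaged F-score reported under "Results" is the harmonic mean of the micro-averaged precision and recall in the table above:

    F1(micro) = 2 * P * R / (P + R) = 2 * 0.6820 * 0.7181 / (0.6820 + 0.7181) ≈ 0.6996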
|
|
|
2023-10-25 18:37:13,089 ---------------------------------------------------------------------------------------------------- |
|
|