2023-10-18 16:39:59,112 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:59,112 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:39:59,113 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:59,113 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-18 16:39:59,113 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:59,113 Train:  966 sentences
2023-10-18 16:39:59,113         (train_with_dev=False, train_with_test=False)
2023-10-18 16:39:59,113 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:59,113 Training Params:
2023-10-18 16:39:59,113  - learning_rate: "3e-05"
2023-10-18 16:39:59,113  - mini_batch_size: "8"
2023-10-18 16:39:59,113  - max_epochs: "10"
2023-10-18 16:39:59,113  - shuffle: "True"
2023-10-18 16:39:59,113 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:59,113 Plugins:
2023-10-18 16:39:59,113  - TensorboardLogger
2023-10-18 16:39:59,113  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:39:59,113 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:59,113 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:39:59,113  - metric: "('micro avg', 'f1-score')"
2023-10-18 16:39:59,113 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:59,113 Computation:
2023-10-18 16:39:59,113  - compute on device: cuda:0
2023-10-18 16:39:59,113  - embedding storage: none
2023-10-18 16:39:59,113 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:59,114 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 16:39:59,114 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:59,114 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:59,114 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 16:39:59,385 epoch 1 - iter 12/121 - loss 3.43294543 - time (sec): 0.27 - samples/sec: 8914.03 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:39:59,650 epoch 1 - iter 24/121 - loss 3.36410792 - time (sec): 0.54 - samples/sec: 8814.95 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:39:59,915 epoch 1 - iter 36/121 - loss 3.35975326 - time (sec): 0.80 - samples/sec: 8923.45 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:40:00,158 epoch 1 - iter 48/121 - loss 3.31536026 - time (sec): 1.04 - samples/sec: 9020.17 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:40:00,396 epoch 1 - iter 60/121 - loss 3.18264314 - time (sec): 1.28 - samples/sec: 9643.42 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:40:00,630 epoch 1 - iter 72/121 - loss 3.06758848 - time (sec): 1.52 - samples/sec: 10029.68 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:40:00,909 epoch 1 - iter 84/121 - loss 2.94673667 - time (sec): 1.80 - samples/sec: 9802.31 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:40:01,177 epoch 1 - iter 96/121 - loss 2.81299194 - time (sec): 2.06 - samples/sec: 9796.93 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:40:01,438 epoch 1 - iter 108/121 - loss 2.68628684 - time (sec): 2.32 - samples/sec: 9644.00 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:40:01,711 epoch 1 - iter 120/121 - loss 2.55391040 - time (sec): 2.60 - samples/sec: 9480.62 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:40:01,730 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:01,731 EPOCH 1 done: loss 2.5456 - lr: 0.000030
2023-10-18 16:40:02,263 DEV : loss 0.759665310382843 - f1-score (micro avg)  0.0
2023-10-18 16:40:02,268 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:02,545 epoch 2 - iter 12/121 - loss 1.01847183 - time (sec): 0.28 - samples/sec: 8377.96 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:40:02,818 epoch 2 - iter 24/121 - loss 0.96066464 - time (sec): 0.55 - samples/sec: 8853.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:40:03,088 epoch 2 - iter 36/121 - loss 0.89224663 - time (sec): 0.82 - samples/sec: 9088.97 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:40:03,369 epoch 2 - iter 48/121 - loss 0.89067514 - time (sec): 1.10 - samples/sec: 9092.90 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:40:03,658 epoch 2 - iter 60/121 - loss 0.87749403 - time (sec): 1.39 - samples/sec: 9029.47 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:40:03,932 epoch 2 - iter 72/121 - loss 0.85128176 - time (sec): 1.66 - samples/sec: 9045.67 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:40:04,199 epoch 2 - iter 84/121 - loss 0.82318925 - time (sec): 1.93 - samples/sec: 9025.65 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:40:04,461 epoch 2 - iter 96/121 - loss 0.80949346 - time (sec): 2.19 - samples/sec: 8949.79 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:40:04,723 epoch 2 - iter 108/121 - loss 0.79942863 - time (sec): 2.45 - samples/sec: 8985.69 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:40:04,999 epoch 2 - iter 120/121 - loss 0.79205803 - time (sec): 2.73 - samples/sec: 9013.41 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:40:05,018 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:05,018 EPOCH 2 done: loss 0.7917 - lr: 0.000027
2023-10-18 16:40:05,441 DEV : loss 0.5911577343940735 - f1-score (micro avg)  0.0
2023-10-18 16:40:05,445 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:05,709 epoch 3 - iter 12/121 - loss 0.64201092 - time (sec): 0.26 - samples/sec: 10044.62 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:40:05,963 epoch 3 - iter 24/121 - loss 0.66796251 - time (sec): 0.52 - samples/sec: 9798.95 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:40:06,200 epoch 3 - iter 36/121 - loss 0.68039215 - time (sec): 0.75 - samples/sec: 10393.92 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:40:06,420 epoch 3 - iter 48/121 - loss 0.67705985 - time (sec): 0.97 - samples/sec: 10220.86 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:40:06,650 epoch 3 - iter 60/121 - loss 0.68112119 - time (sec): 1.20 - samples/sec: 10197.20 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:40:06,881 epoch 3 - iter 72/121 - loss 0.66941790 - time (sec): 1.44 - samples/sec: 10361.91 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:40:07,111 epoch 3 - iter 84/121 - loss 0.66549965 - time (sec): 1.67 - samples/sec: 10385.10 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:40:07,349 epoch 3 - iter 96/121 - loss 0.63044341 - time (sec): 1.90 - samples/sec: 10484.03 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:40:07,571 epoch 3 - iter 108/121 - loss 0.62108182 - time (sec): 2.13 - samples/sec: 10490.57 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:40:07,793 epoch 3 - iter 120/121 - loss 0.61528230 - time (sec): 2.35 - samples/sec: 10505.52 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:40:07,807 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:07,807 EPOCH 3 done: loss 0.6169 - lr: 0.000023
2023-10-18 16:40:08,234 DEV : loss 0.4769611358642578 - f1-score (micro avg)  0.0916
2023-10-18 16:40:08,238 saving best model
2023-10-18 16:40:08,267 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:08,563 epoch 4 - iter 12/121 - loss 0.67903091 - time (sec): 0.30 - samples/sec: 8723.73 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:40:08,812 epoch 4 - iter 24/121 - loss 0.60323732 - time (sec): 0.54 - samples/sec: 9566.70 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:40:09,091 epoch 4 - iter 36/121 - loss 0.58912876 - time (sec): 0.82 - samples/sec: 9285.07 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:40:09,345 epoch 4 - iter 48/121 - loss 0.55789760 - time (sec): 1.08 - samples/sec: 9118.61 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:40:09,625 epoch 4 - iter 60/121 - loss 0.55462056 - time (sec): 1.36 - samples/sec: 9168.80 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:40:09,884 epoch 4 - iter 72/121 - loss 0.53492878 - time (sec): 1.62 - samples/sec: 9148.08 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:40:10,153 epoch 4 - iter 84/121 - loss 0.52454586 - time (sec): 1.89 - samples/sec: 9033.81 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:40:10,399 epoch 4 - iter 96/121 - loss 0.52293449 - time (sec): 2.13 - samples/sec: 9094.17 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:40:10,620 epoch 4 - iter 108/121 - loss 0.52839545 - time (sec): 2.35 - samples/sec: 9361.67 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:40:10,844 epoch 4 - iter 120/121 - loss 0.52842268 - time (sec): 2.58 - samples/sec: 9557.97 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:40:10,860 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:10,860 EPOCH 4 done: loss 0.5274 - lr: 0.000020
2023-10-18 16:40:11,289 DEV : loss 0.42786967754364014 - f1-score (micro avg)  0.3269
2023-10-18 16:40:11,293 saving best model
2023-10-18 16:40:11,327 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:11,617 epoch 5 - iter 12/121 - loss 0.48586362 - time (sec): 0.29 - samples/sec: 8848.10 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:40:11,895 epoch 5 - iter 24/121 - loss 0.49930675 - time (sec): 0.57 - samples/sec: 9132.97 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:40:12,166 epoch 5 - iter 36/121 - loss 0.48700716 - time (sec): 0.84 - samples/sec: 8945.33 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:40:12,375 epoch 5 - iter 48/121 - loss 0.48368153 - time (sec): 1.05 - samples/sec: 9424.79 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:40:12,600 epoch 5 - iter 60/121 - loss 0.49992923 - time (sec): 1.27 - samples/sec: 9799.11 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:40:12,821 epoch 5 - iter 72/121 - loss 0.49467927 - time (sec): 1.49 - samples/sec: 9880.29 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:40:13,092 epoch 5 - iter 84/121 - loss 0.49737874 - time (sec): 1.76 - samples/sec: 9781.01 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:40:13,367 epoch 5 - iter 96/121 - loss 0.49103817 - time (sec): 2.04 - samples/sec: 9561.85 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:40:13,635 epoch 5 - iter 108/121 - loss 0.48716929 - time (sec): 2.31 - samples/sec: 9522.08 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:40:13,914 epoch 5 - iter 120/121 - loss 0.48448242 - time (sec): 2.59 - samples/sec: 9491.28 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:40:13,936 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:13,936 EPOCH 5 done: loss 0.4829 - lr: 0.000017
2023-10-18 16:40:14,362 DEV : loss 0.380738228559494 - f1-score (micro avg)  0.4448
2023-10-18 16:40:14,367 saving best model
2023-10-18 16:40:14,403 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:14,690 epoch 6 - iter 12/121 - loss 0.51032298 - time (sec): 0.29 - samples/sec: 7018.54 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:40:14,964 epoch 6 - iter 24/121 - loss 0.50418911 - time (sec): 0.56 - samples/sec: 7904.35 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:40:15,246 epoch 6 - iter 36/121 - loss 0.47183751 - time (sec): 0.84 - samples/sec: 8203.20 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:40:15,519 epoch 6 - iter 48/121 - loss 0.44949935 - time (sec): 1.11 - samples/sec: 8569.22 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:40:15,785 epoch 6 - iter 60/121 - loss 0.47883652 - time (sec): 1.38 - samples/sec: 8805.71 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:40:16,052 epoch 6 - iter 72/121 - loss 0.47262661 - time (sec): 1.65 - samples/sec: 8834.69 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:40:16,330 epoch 6 - iter 84/121 - loss 0.46817208 - time (sec): 1.93 - samples/sec: 8850.52 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:40:16,606 epoch 6 - iter 96/121 - loss 0.46557446 - time (sec): 2.20 - samples/sec: 8961.54 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:40:16,880 epoch 6 - iter 108/121 - loss 0.45798626 - time (sec): 2.48 - samples/sec: 8918.81 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:40:17,162 epoch 6 - iter 120/121 - loss 0.46009913 - time (sec): 2.76 - samples/sec: 8920.80 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:40:17,181 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:17,181 EPOCH 6 done: loss 0.4592 - lr: 0.000013
2023-10-18 16:40:17,606 DEV : loss 0.36484095454216003 - f1-score (micro avg)  0.4551
2023-10-18 16:40:17,611 saving best model
2023-10-18 16:40:17,644 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:17,906 epoch 7 - iter 12/121 - loss 0.49431941 - time (sec): 0.26 - samples/sec: 8412.18 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:40:18,167 epoch 7 - iter 24/121 - loss 0.43850212 - time (sec): 0.52 - samples/sec: 8685.46 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:40:18,417 epoch 7 - iter 36/121 - loss 0.44177369 - time (sec): 0.77 - samples/sec: 9101.19 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:40:18,698 epoch 7 - iter 48/121 - loss 0.42905329 - time (sec): 1.05 - samples/sec: 9175.64 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:40:18,974 epoch 7 - iter 60/121 - loss 0.43947644 - time (sec): 1.33 - samples/sec: 9011.63 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:40:19,239 epoch 7 - iter 72/121 - loss 0.42924037 - time (sec): 1.59 - samples/sec: 9106.25 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:40:19,505 epoch 7 - iter 84/121 - loss 0.42635991 - time (sec): 1.86 - samples/sec: 9054.04 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:40:19,771 epoch 7 - iter 96/121 - loss 0.42133453 - time (sec): 2.13 - samples/sec: 9106.14 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:40:20,059 epoch 7 - iter 108/121 - loss 0.42600472 - time (sec): 2.41 - samples/sec: 9165.57 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:40:20,325 epoch 7 - iter 120/121 - loss 0.43427704 - time (sec): 2.68 - samples/sec: 9190.02 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:40:20,343 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:20,343 EPOCH 7 done: loss 0.4341 - lr: 0.000010
2023-10-18 16:40:20,776 DEV : loss 0.3425357937812805 - f1-score (micro avg)  0.465
2023-10-18 16:40:20,780 saving best model
2023-10-18 16:40:20,812 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:21,084 epoch 8 - iter 12/121 - loss 0.45433872 - time (sec): 0.27 - samples/sec: 9654.75 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:40:21,351 epoch 8 - iter 24/121 - loss 0.45787438 - time (sec): 0.54 - samples/sec: 8831.74 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:40:21,645 epoch 8 - iter 36/121 - loss 0.44116618 - time (sec): 0.83 - samples/sec: 8724.92 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:40:21,927 epoch 8 - iter 48/121 - loss 0.43346820 - time (sec): 1.11 - samples/sec: 8506.88 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:40:22,197 epoch 8 - iter 60/121 - loss 0.42692781 - time (sec): 1.38 - samples/sec: 8473.59 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:40:22,477 epoch 8 - iter 72/121 - loss 0.42512799 - time (sec): 1.66 - samples/sec: 8692.49 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:40:22,759 epoch 8 - iter 84/121 - loss 0.42623312 - time (sec): 1.95 - samples/sec: 8798.17 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:40:23,033 epoch 8 - iter 96/121 - loss 0.41903127 - time (sec): 2.22 - samples/sec: 8741.64 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:40:23,311 epoch 8 - iter 108/121 - loss 0.40935925 - time (sec): 2.50 - samples/sec: 8747.64 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:40:23,586 epoch 8 - iter 120/121 - loss 0.41727674 - time (sec): 2.77 - samples/sec: 8886.74 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:40:23,604 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:23,605 EPOCH 8 done: loss 0.4165 - lr: 0.000007
2023-10-18 16:40:24,050 DEV : loss 0.3300197124481201 - f1-score (micro avg)  0.4792
2023-10-18 16:40:24,055 saving best model
2023-10-18 16:40:24,088 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:24,379 epoch 9 - iter 12/121 - loss 0.31680954 - time (sec): 0.29 - samples/sec: 8401.16 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:40:24,672 epoch 9 - iter 24/121 - loss 0.34146626 - time (sec): 0.58 - samples/sec: 8596.49 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:40:24,969 epoch 9 - iter 36/121 - loss 0.37171341 - time (sec): 0.88 - samples/sec: 8800.82 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:40:25,251 epoch 9 - iter 48/121 - loss 0.40458221 - time (sec): 1.16 - samples/sec: 8658.67 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:40:25,542 epoch 9 - iter 60/121 - loss 0.40041685 - time (sec): 1.45 - samples/sec: 8701.56 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:40:25,823 epoch 9 - iter 72/121 - loss 0.40389257 - time (sec): 1.73 - samples/sec: 8749.48 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:40:26,101 epoch 9 - iter 84/121 - loss 0.39905521 - time (sec): 2.01 - samples/sec: 8710.64 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:40:26,393 epoch 9 - iter 96/121 - loss 0.40640857 - time (sec): 2.30 - samples/sec: 8600.67 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:40:26,676 epoch 9 - iter 108/121 - loss 0.39802464 - time (sec): 2.59 - samples/sec: 8619.92 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:40:26,965 epoch 9 - iter 120/121 - loss 0.39431360 - time (sec): 2.88 - samples/sec: 8556.41 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:40:26,985 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:26,985 EPOCH 9 done: loss 0.3943 - lr: 0.000004
2023-10-18 16:40:27,429 DEV : loss 0.32414817810058594 - f1-score (micro avg)  0.4836
2023-10-18 16:40:27,435 saving best model
2023-10-18 16:40:27,465 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:27,760 epoch 10 - iter 12/121 - loss 0.30736628 - time (sec): 0.29 - samples/sec: 8126.50 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:40:28,033 epoch 10 - iter 24/121 - loss 0.36451023 - time (sec): 0.57 - samples/sec: 8417.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:40:28,303 epoch 10 - iter 36/121 - loss 0.37892960 - time (sec): 0.84 - samples/sec: 8696.06 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:40:28,579 epoch 10 - iter 48/121 - loss 0.38779538 - time (sec): 1.11 - samples/sec: 8703.39 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:40:28,842 epoch 10 - iter 60/121 - loss 0.39890677 - time (sec): 1.38 - samples/sec: 8711.04 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:40:29,105 epoch 10 - iter 72/121 - loss 0.39534237 - time (sec): 1.64 - samples/sec: 8650.94 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:40:29,381 epoch 10 - iter 84/121 - loss 0.39605100 - time (sec): 1.92 - samples/sec: 8818.52 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:40:29,647 epoch 10 - iter 96/121 - loss 0.38971761 - time (sec): 2.18 - samples/sec: 8903.14 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:40:29,921 epoch 10 - iter 108/121 - loss 0.39231633 - time (sec): 2.45 - samples/sec: 8912.78 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:40:30,193 epoch 10 - iter 120/121 - loss 0.39046127 - time (sec): 2.73 - samples/sec: 8997.99 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:40:30,213 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:30,214 EPOCH 10 done: loss 0.3914 - lr: 0.000000
2023-10-18 16:40:30,658 DEV : loss 0.32508569955825806 - f1-score (micro avg)  0.4779
2023-10-18 16:40:30,693 ----------------------------------------------------------------------------------------------------
2023-10-18 16:40:30,693 Loading model from best epoch ...
2023-10-18 16:40:30,763 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 16:40:31,178 
Results:
- F-score (micro) 0.4353
- F-score (macro) 0.2008
- Accuracy 0.2931

By class:
              precision    recall  f1-score   support

        pers     0.4795    0.5899    0.5290       139
       scope     0.4430    0.5116    0.4748       129
        work     0.0000    0.0000    0.0000        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4625    0.4111    0.4353       360
   macro avg     0.1845    0.2203    0.2008       360
weighted avg     0.3439    0.4111    0.3744       360

2023-10-18 16:40:31,179 ----------------------------------------------------------------------------------------------------
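The lr column in the log follows the `LinearScheduler` plugin with `warmup_fraction: '0.1'`: the learning rate climbs linearly from 0 to the peak 3e-05 over the first 10% of updates (here roughly epoch 1, 121 of the 10 × 121 = 1210 mini-batches), then decays linearly back to 0 by the final step. A minimal sketch of that shape in plain Python (the function name and exact step-offset conventions are ours, not Flair's, so values may differ from the log in the last decimal):

```python
def linear_lr(step: int, total_steps: int, peak_lr: float,
              warmup_fraction: float = 0.1) -> float:
    """Learning rate after `step` updates: linear warmup to peak_lr
    over the first warmup_fraction of training, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps          # warmup phase
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay

# 10 epochs x 121 mini-batches = 1210 updates, peak lr 3e-05
total = 10 * 121
lr_after_epoch_1 = linear_lr(121, total, 3e-05)   # warmup just finished
lr_at_end = linear_lr(total, total, 3e-05)        # fully decayed
```

With these conventions `lr_after_epoch_1` is exactly the peak 3e-05 and `lr_at_end` is 0, matching the logged lr of 0.000030 at the end of epoch 1 and 0.000000 at the end of epoch 10.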
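The headline scores can be cross-checked from the by-class table: micro F1 is the harmonic mean of the micro-averaged precision and recall, and macro F1 is the unweighted mean of the per-class F1 scores (so the three zero-F1 classes, work/loc/date, drag it well below the micro score). A quick sanity check in Python using the rounded numbers from the table:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    s = precision + recall
    return 2 * precision * recall / s if s else 0.0

# per-class f1 from the table: pers, scope, work, loc, date
class_f1 = [0.5290, 0.4748, 0.0, 0.0, 0.0]

micro_f1 = f1(0.4625, 0.4111)            # from the "micro avg" row
macro_f1 = sum(class_f1) / len(class_f1)
```

Rounded to four decimals these give 0.4353 and 0.2008, agreeing with the reported F-score (micro) and F-score (macro).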