2023-10-18 16:36:23,556 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:23,556 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:36:23,556 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:23,556 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-18 16:36:23,556 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:23,556 Train: 966 sentences
2023-10-18 16:36:23,556 (train_with_dev=False, train_with_test=False)
2023-10-18 16:36:23,556 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:23,556 Training Params:
2023-10-18 16:36:23,557 - learning_rate: "5e-05"
2023-10-18 16:36:23,557 - mini_batch_size: "4"
2023-10-18 16:36:23,557 - max_epochs: "10"
2023-10-18 16:36:23,557 - shuffle: "True"
2023-10-18 16:36:23,557 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:23,557 Plugins:
2023-10-18 16:36:23,557 - TensorboardLogger
2023-10-18 16:36:23,557 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:36:23,557 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:23,557 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:36:23,557 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:36:23,557 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:23,557 Computation:
2023-10-18 16:36:23,557 - compute on device: cuda:0
2023-10-18 16:36:23,557 - embedding storage: none
2023-10-18 16:36:23,557 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:23,557 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 16:36:23,557 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:23,557 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:23,557 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 16:36:23,903 epoch 1 - iter 24/242 - loss 4.06894328 - time (sec): 0.35 - samples/sec: 7303.90 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:36:24,299 epoch 1 - iter 48/242 - loss 4.02616879 - time (sec): 0.74 - samples/sec: 6639.93 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:36:24,687 epoch 1 - iter 72/242 - loss 3.82600153 - time (sec): 1.13 - samples/sec: 6689.58 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:36:25,057 epoch 1 - iter 96/242 - loss 3.59033518 - time (sec): 1.50 - samples/sec: 6922.07 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:36:25,433 epoch 1 - iter 120/242 - loss 3.38518381 - time (sec): 1.88 - samples/sec: 6651.40 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:36:25,802 epoch 1 - iter 144/242 - loss 3.11809328 - time (sec): 2.24 - samples/sec: 6616.40 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:36:26,173 epoch 1 - iter 168/242 - loss 2.85941045 - time (sec): 2.62 - samples/sec: 6562.65 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:36:26,542 epoch 1 - iter 192/242 - loss 2.60859374 - time (sec): 2.98 - samples/sec: 6557.08 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:36:26,921 epoch 1 - iter 216/242 - loss 2.38057612 - time (sec): 3.36 - samples/sec: 6626.07 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:36:27,289 epoch 1 - iter 240/242 - loss 2.22565024 - time (sec): 3.73 - samples/sec: 6578.79 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:36:27,318 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:27,318 EPOCH 1 done: loss 2.2152 - lr: 0.000049
2023-10-18 16:36:27,805 DEV : loss 0.6376588940620422 - f1-score (micro avg) 0.0
2023-10-18 16:36:27,810 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:28,173 epoch 2 - iter 24/242 - loss 0.71028364 - time (sec): 0.36 - samples/sec: 6819.79 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:36:28,539 epoch 2 - iter 48/242 - loss 0.71336343 - time (sec): 0.73 - samples/sec: 7166.82 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:36:28,907 epoch 2 - iter 72/242 - loss 0.67520329 - time (sec): 1.10 - samples/sec: 7266.90 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:36:29,273 epoch 2 - iter 96/242 - loss 0.66293218 - time (sec): 1.46 - samples/sec: 7276.61 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:36:29,622 epoch 2 - iter 120/242 - loss 0.65478994 - time (sec): 1.81 - samples/sec: 6927.43 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:36:29,980 epoch 2 - iter 144/242 - loss 0.64752733 - time (sec): 2.17 - samples/sec: 6783.23 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:36:30,346 epoch 2 - iter 168/242 - loss 0.64034132 - time (sec): 2.54 - samples/sec: 6729.42 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:36:30,713 epoch 2 - iter 192/242 - loss 0.63776300 - time (sec): 2.90 - samples/sec: 6775.14 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:36:31,074 epoch 2 - iter 216/242 - loss 0.62689871 - time (sec): 3.26 - samples/sec: 6784.92 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:36:31,458 epoch 2 - iter 240/242 - loss 0.60546341 - time (sec): 3.65 - samples/sec: 6743.08 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:36:31,484 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:31,484 EPOCH 2 done: loss 0.6054 - lr: 0.000045
2023-10-18 16:36:31,911 DEV : loss 0.44410139322280884 - f1-score (micro avg) 0.1726
2023-10-18 16:36:31,915 saving best model
2023-10-18 16:36:31,941 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:32,302 epoch 3 - iter 24/242 - loss 0.47995476 - time (sec): 0.36 - samples/sec: 6864.00 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:36:32,645 epoch 3 - iter 48/242 - loss 0.51537977 - time (sec): 0.70 - samples/sec: 6600.20 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:36:33,009 epoch 3 - iter 72/242 - loss 0.52591455 - time (sec): 1.07 - samples/sec: 6524.08 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:36:33,351 epoch 3 - iter 96/242 - loss 0.50834626 - time (sec): 1.41 - samples/sec: 6739.80 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:36:33,688 epoch 3 - iter 120/242 - loss 0.49923645 - time (sec): 1.75 - samples/sec: 6818.81 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:36:34,040 epoch 3 - iter 144/242 - loss 0.47877640 - time (sec): 2.10 - samples/sec: 6982.35 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:36:34,400 epoch 3 - iter 168/242 - loss 0.47198182 - time (sec): 2.46 - samples/sec: 6952.98 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:36:34,757 epoch 3 - iter 192/242 - loss 0.46892658 - time (sec): 2.82 - samples/sec: 6916.36 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:36:35,142 epoch 3 - iter 216/242 - loss 0.45945229 - time (sec): 3.20 - samples/sec: 6897.19 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:36:35,519 epoch 3 - iter 240/242 - loss 0.45603902 - time (sec): 3.58 - samples/sec: 6859.49 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:36:35,546 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:35,547 EPOCH 3 done: loss 0.4571 - lr: 0.000039
2023-10-18 16:36:35,977 DEV : loss 0.3233063220977783 - f1-score (micro avg) 0.4841
2023-10-18 16:36:35,981 saving best model
2023-10-18 16:36:36,017 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:36,391 epoch 4 - iter 24/242 - loss 0.45516957 - time (sec): 0.37 - samples/sec: 7271.44 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:36:36,760 epoch 4 - iter 48/242 - loss 0.40167627 - time (sec): 0.74 - samples/sec: 6707.20 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:36:37,131 epoch 4 - iter 72/242 - loss 0.40130941 - time (sec): 1.11 - samples/sec: 6552.25 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:36:37,514 epoch 4 - iter 96/242 - loss 0.39325755 - time (sec): 1.50 - samples/sec: 6506.63 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:36:37,903 epoch 4 - iter 120/242 - loss 0.38875783 - time (sec): 1.88 - samples/sec: 6623.69 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:36:38,271 epoch 4 - iter 144/242 - loss 0.38181180 - time (sec): 2.25 - samples/sec: 6590.83 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:36:38,647 epoch 4 - iter 168/242 - loss 0.37363013 - time (sec): 2.63 - samples/sec: 6529.96 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:36:39,013 epoch 4 - iter 192/242 - loss 0.36751058 - time (sec): 3.00 - samples/sec: 6612.55 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:36:39,388 epoch 4 - iter 216/242 - loss 0.37614560 - time (sec): 3.37 - samples/sec: 6580.07 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:36:39,751 epoch 4 - iter 240/242 - loss 0.37512641 - time (sec): 3.73 - samples/sec: 6591.02 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:36:39,779 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:39,779 EPOCH 4 done: loss 0.3752 - lr: 0.000033
2023-10-18 16:36:40,207 DEV : loss 0.2869265079498291 - f1-score (micro avg) 0.4813
2023-10-18 16:36:40,211 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:40,588 epoch 5 - iter 24/242 - loss 0.34496146 - time (sec): 0.38 - samples/sec: 6220.26 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:36:40,973 epoch 5 - iter 48/242 - loss 0.34981845 - time (sec): 0.76 - samples/sec: 6280.07 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:36:41,332 epoch 5 - iter 72/242 - loss 0.34066324 - time (sec): 1.12 - samples/sec: 6387.39 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:36:41,722 epoch 5 - iter 96/242 - loss 0.34406469 - time (sec): 1.51 - samples/sec: 6597.13 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:36:42,097 epoch 5 - iter 120/242 - loss 0.34773483 - time (sec): 1.89 - samples/sec: 6664.73 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:36:42,484 epoch 5 - iter 144/242 - loss 0.34325992 - time (sec): 2.27 - samples/sec: 6613.90 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:36:42,869 epoch 5 - iter 168/242 - loss 0.34035695 - time (sec): 2.66 - samples/sec: 6510.16 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:36:43,256 epoch 5 - iter 192/242 - loss 0.33886687 - time (sec): 3.04 - samples/sec: 6527.83 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:36:43,626 epoch 5 - iter 216/242 - loss 0.34406810 - time (sec): 3.41 - samples/sec: 6519.10 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:36:43,992 epoch 5 - iter 240/242 - loss 0.33921636 - time (sec): 3.78 - samples/sec: 6507.41 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:36:44,020 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:44,020 EPOCH 5 done: loss 0.3406 - lr: 0.000028
2023-10-18 16:36:44,444 DEV : loss 0.2742924690246582 - f1-score (micro avg) 0.4738
2023-10-18 16:36:44,449 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:44,817 epoch 6 - iter 24/242 - loss 0.31470992 - time (sec): 0.37 - samples/sec: 6960.30 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:36:45,180 epoch 6 - iter 48/242 - loss 0.33456371 - time (sec): 0.73 - samples/sec: 7029.92 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:36:45,545 epoch 6 - iter 72/242 - loss 0.30996261 - time (sec): 1.10 - samples/sec: 6973.68 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:36:45,924 epoch 6 - iter 96/242 - loss 0.31807708 - time (sec): 1.48 - samples/sec: 6892.55 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:36:46,313 epoch 6 - iter 120/242 - loss 0.31619253 - time (sec): 1.86 - samples/sec: 6800.34 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:36:46,706 epoch 6 - iter 144/242 - loss 0.30833559 - time (sec): 2.26 - samples/sec: 6714.31 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:36:47,071 epoch 6 - iter 168/242 - loss 0.30530212 - time (sec): 2.62 - samples/sec: 6621.78 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:36:47,459 epoch 6 - iter 192/242 - loss 0.31101338 - time (sec): 3.01 - samples/sec: 6580.76 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:36:47,826 epoch 6 - iter 216/242 - loss 0.31555222 - time (sec): 3.38 - samples/sec: 6603.40 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:36:48,218 epoch 6 - iter 240/242 - loss 0.31269087 - time (sec): 3.77 - samples/sec: 6528.91 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:36:48,243 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:48,243 EPOCH 6 done: loss 0.3112 - lr: 0.000022
2023-10-18 16:36:48,681 DEV : loss 0.25722795724868774 - f1-score (micro avg) 0.4952
2023-10-18 16:36:48,685 saving best model
2023-10-18 16:36:48,721 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:49,070 epoch 7 - iter 24/242 - loss 0.28473485 - time (sec): 0.35 - samples/sec: 8026.85 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:36:49,410 epoch 7 - iter 48/242 - loss 0.28032042 - time (sec): 0.69 - samples/sec: 7682.13 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:36:49,749 epoch 7 - iter 72/242 - loss 0.28315683 - time (sec): 1.03 - samples/sec: 7471.26 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:36:50,084 epoch 7 - iter 96/242 - loss 0.29487916 - time (sec): 1.36 - samples/sec: 7312.20 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:36:50,414 epoch 7 - iter 120/242 - loss 0.29815540 - time (sec): 1.69 - samples/sec: 7239.42 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:36:50,787 epoch 7 - iter 144/242 - loss 0.29268716 - time (sec): 2.06 - samples/sec: 7062.20 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:36:51,176 epoch 7 - iter 168/242 - loss 0.29539971 - time (sec): 2.45 - samples/sec: 6963.28 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:36:51,556 epoch 7 - iter 192/242 - loss 0.29791900 - time (sec): 2.83 - samples/sec: 6917.39 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:36:51,946 epoch 7 - iter 216/242 - loss 0.29986562 - time (sec): 3.22 - samples/sec: 6849.24 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:36:52,324 epoch 7 - iter 240/242 - loss 0.29782140 - time (sec): 3.60 - samples/sec: 6830.49 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:36:52,351 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:52,352 EPOCH 7 done: loss 0.2979 - lr: 0.000017
2023-10-18 16:36:52,783 DEV : loss 0.2522064745426178 - f1-score (micro avg) 0.5171
2023-10-18 16:36:52,787 saving best model
2023-10-18 16:36:52,823 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:53,199 epoch 8 - iter 24/242 - loss 0.41692853 - time (sec): 0.37 - samples/sec: 7290.35 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:36:53,552 epoch 8 - iter 48/242 - loss 0.35604078 - time (sec): 0.73 - samples/sec: 6821.61 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:36:53,915 epoch 8 - iter 72/242 - loss 0.32862463 - time (sec): 1.09 - samples/sec: 6801.12 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:36:54,296 epoch 8 - iter 96/242 - loss 0.30818150 - time (sec): 1.47 - samples/sec: 6767.13 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:36:54,674 epoch 8 - iter 120/242 - loss 0.29987867 - time (sec): 1.85 - samples/sec: 6785.04 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:36:55,043 epoch 8 - iter 144/242 - loss 0.29155435 - time (sec): 2.22 - samples/sec: 6669.29 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:36:55,419 epoch 8 - iter 168/242 - loss 0.28828688 - time (sec): 2.60 - samples/sec: 6654.60 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:36:55,807 epoch 8 - iter 192/242 - loss 0.28787647 - time (sec): 2.98 - samples/sec: 6646.64 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:36:56,187 epoch 8 - iter 216/242 - loss 0.29399445 - time (sec): 3.36 - samples/sec: 6649.16 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:36:56,576 epoch 8 - iter 240/242 - loss 0.28922123 - time (sec): 3.75 - samples/sec: 6566.51 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:36:56,606 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:56,607 EPOCH 8 done: loss 0.2887 - lr: 0.000011
2023-10-18 16:36:57,041 DEV : loss 0.2508045434951782 - f1-score (micro avg) 0.5126
2023-10-18 16:36:57,045 ----------------------------------------------------------------------------------------------------
2023-10-18 16:36:57,407 epoch 9 - iter 24/242 - loss 0.31471641 - time (sec): 0.36 - samples/sec: 6546.29 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:36:57,779 epoch 9 - iter 48/242 - loss 0.30325743 - time (sec): 0.73 - samples/sec: 6389.28 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:36:58,145 epoch 9 - iter 72/242 - loss 0.28445914 - time (sec): 1.10 - samples/sec: 6611.58 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:36:58,499 epoch 9 - iter 96/242 - loss 0.27733067 - time (sec): 1.45 - samples/sec: 6587.85 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:36:58,857 epoch 9 - iter 120/242 - loss 0.28394335 - time (sec): 1.81 - samples/sec: 6657.66 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:36:59,226 epoch 9 - iter 144/242 - loss 0.28730005 - time (sec): 2.18 - samples/sec: 6708.11 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:36:59,592 epoch 9 - iter 168/242 - loss 0.29711454 - time (sec): 2.55 - samples/sec: 6718.46 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:36:59,962 epoch 9 - iter 192/242 - loss 0.29329494 - time (sec): 2.92 - samples/sec: 6707.05 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:37:00,335 epoch 9 - iter 216/242 - loss 0.28875404 - time (sec): 3.29 - samples/sec: 6731.96 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:37:00,702 epoch 9 - iter 240/242 - loss 0.28175821 - time (sec): 3.66 - samples/sec: 6731.70 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:37:00,729 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:00,729 EPOCH 9 done: loss 0.2822 - lr: 0.000006
2023-10-18 16:37:01,161 DEV : loss 0.24359287321567535 - f1-score (micro avg) 0.5346
2023-10-18 16:37:01,166 saving best model
2023-10-18 16:37:01,200 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:01,568 epoch 10 - iter 24/242 - loss 0.26971381 - time (sec): 0.37 - samples/sec: 5710.03 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:37:01,933 epoch 10 - iter 48/242 - loss 0.27723315 - time (sec): 0.73 - samples/sec: 6180.39 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:37:02,292 epoch 10 - iter 72/242 - loss 0.27734124 - time (sec): 1.09 - samples/sec: 6415.68 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:37:02,665 epoch 10 - iter 96/242 - loss 0.28231197 - time (sec): 1.46 - samples/sec: 6552.71 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:37:03,035 epoch 10 - iter 120/242 - loss 0.27821230 - time (sec): 1.83 - samples/sec: 6595.37 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:37:03,427 epoch 10 - iter 144/242 - loss 0.27186754 - time (sec): 2.23 - samples/sec: 6586.10 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:37:03,804 epoch 10 - iter 168/242 - loss 0.27995109 - time (sec): 2.60 - samples/sec: 6562.68 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:37:04,180 epoch 10 - iter 192/242 - loss 0.28424878 - time (sec): 2.98 - samples/sec: 6557.62 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:37:04,562 epoch 10 - iter 216/242 - loss 0.27775247 - time (sec): 3.36 - samples/sec: 6536.76 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:37:04,904 epoch 10 - iter 240/242 - loss 0.27507515 - time (sec): 3.70 - samples/sec: 6662.50 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:37:04,927 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:04,927 EPOCH 10 done: loss 0.2750 - lr: 0.000000
2023-10-18 16:37:05,368 DEV : loss 0.2437872439622879 - f1-score (micro avg) 0.5329
2023-10-18 16:37:05,400 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:05,401 Loading model from best epoch ...
2023-10-18 16:37:05,478 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 16:37:05,894 
Results:
- F-score (micro) 0.4903
- F-score (macro) 0.2649
- Accuracy 0.3486

By class:
              precision    recall  f1-score   support

       scope     0.3547    0.5581    0.4337       129
        pers     0.5707    0.7554    0.6502       139
        work     0.4643    0.1625    0.2407        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4578    0.5278    0.4903       360
   macro avg     0.2779    0.2952    0.2649       360
weighted avg     0.4506    0.5278    0.4600       360

2023-10-18 16:37:05,895 ----------------------------------------------------------------------------------------------------
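
Note on the summary rows: the averaged F-scores in the final report can be cross-checked from the per-class rows. The sketch below is not part of the training run; it simply re-derives the macro, weighted, and micro F1 from the rounded values printed in the "By class" table. The variable names are illustrative, and because the inputs are already rounded to four decimals, the recomputed values are only expected to agree with the logged ones to about three decimals.

```python
# Per-class rows copied from the "By class" table in the log:
# label -> (precision, recall, f1, support)
by_class = {
    "scope": (0.3547, 0.5581, 0.4337, 129),
    "pers":  (0.5707, 0.7554, 0.6502, 139),
    "work":  (0.4643, 0.1625, 0.2407, 80),
    "loc":   (0.0000, 0.0000, 0.0000, 9),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}

# Micro-averaged precision and recall as printed in the "micro avg" row.
micro_precision, micro_recall = 0.4578, 0.5278

f1_scores = [row[2] for row in by_class.values()]
supports = [row[3] for row in by_class.values()]
total_support = sum(supports)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1_scores) / len(f1_scores)

# Weighted F1: per-class F1 weighted by the class support.
weighted_f1 = sum(f * s for f, s in zip(f1_scores, supports)) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"macro F1    = {macro_f1:.4f}")
print(f"weighted F1 = {weighted_f1:.4f}")
print(f"micro F1    = {micro_f1:.4f}")
```

The two classes with zero F1 (loc and date, 12 of 360 test entities) pull the macro average well below the micro average, which is why the log reports 0.2649 macro against 0.4903 micro.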