flair-hipe-2022-ajmc-de / training.log
2023-10-18 14:34:24,829 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:24,829 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 14:34:24,829 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:24,830 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:34:24,830 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:24,830 Train: 1100 sentences
2023-10-18 14:34:24,830 (train_with_dev=False, train_with_test=False)
2023-10-18 14:34:24,830 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:24,830 Training Params:
2023-10-18 14:34:24,830 - learning_rate: "5e-05"
2023-10-18 14:34:24,830 - mini_batch_size: "4"
2023-10-18 14:34:24,830 - max_epochs: "10"
2023-10-18 14:34:24,830 - shuffle: "True"
2023-10-18 14:34:24,830 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:24,830 Plugins:
2023-10-18 14:34:24,830 - TensorboardLogger
2023-10-18 14:34:24,830 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:34:24,830 ----------------------------------------------------------------------------------------------------
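Note: the per-iteration lr values logged for this run look consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The following is a minimal sketch under that assumption, not Flair's LinearScheduler implementation. `linear_schedule_lr` is a hypothetical helper, and the step count assumes 275 iterations per epoch (1100 training sentences at mini_batch_size 4) for 10 epochs.

```python
def linear_schedule_lr(step, peak_lr=5e-05, total_steps=2750, warmup_fraction=0.1):
    """Learning rate after `step` updates (hypothetical helper, assumed schedule)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear warmup from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Linear decay from peak_lr down to 0 at the last step.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


iters_per_epoch = 1100 // 4  # 275 mini-batches per epoch

# Compare a few (epoch, iteration) pairs against the lr column of the log.
for epoch, it in [(1, 27), (1, 270), (2, 270), (10, 270)]:
    step = (epoch - 1) * iters_per_epoch + it
    print(f"epoch {epoch} iter {it}: lr {linear_schedule_lr(step):.6f}")
```

The exact step index at which the logger samples lr may differ by one from this sketch; at six decimal places that offset does not change the printed values for the pairs shown.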
2023-10-18 14:34:24,830 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:34:24,830 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:34:24,830 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:24,830 Computation:
2023-10-18 14:34:24,830 - compute on device: cuda:0
2023-10-18 14:34:24,830 - embedding storage: none
2023-10-18 14:34:24,830 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:24,830 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 14:34:24,830 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:24,830 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:24,830 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:34:25,241 epoch 1 - iter 27/275 - loss 3.94457406 - time (sec): 0.41 - samples/sec: 5973.53 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:34:25,651 epoch 1 - iter 54/275 - loss 3.97957719 - time (sec): 0.82 - samples/sec: 5721.37 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:34:26,051 epoch 1 - iter 81/275 - loss 3.82844636 - time (sec): 1.22 - samples/sec: 5541.40 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:34:26,457 epoch 1 - iter 108/275 - loss 3.64553980 - time (sec): 1.63 - samples/sec: 5422.35 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:34:26,871 epoch 1 - iter 135/275 - loss 3.35546642 - time (sec): 2.04 - samples/sec: 5558.03 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:34:27,288 epoch 1 - iter 162/275 - loss 3.06649694 - time (sec): 2.46 - samples/sec: 5526.20 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:34:27,700 epoch 1 - iter 189/275 - loss 2.82331060 - time (sec): 2.87 - samples/sec: 5497.83 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:34:28,102 epoch 1 - iter 216/275 - loss 2.59254150 - time (sec): 3.27 - samples/sec: 5551.63 - lr: 0.000039 - momentum: 0.000000
2023-10-18 14:34:28,507 epoch 1 - iter 243/275 - loss 2.42230854 - time (sec): 3.68 - samples/sec: 5476.43 - lr: 0.000044 - momentum: 0.000000
2023-10-18 14:34:28,903 epoch 1 - iter 270/275 - loss 2.29568740 - time (sec): 4.07 - samples/sec: 5490.59 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:34:28,979 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:28,979 EPOCH 1 done: loss 2.2689 - lr: 0.000049
2023-10-18 14:34:29,218 DEV : loss 0.8229100704193115 - f1-score (micro avg) 0.0
2023-10-18 14:34:29,222 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:29,620 epoch 2 - iter 27/275 - loss 0.84797441 - time (sec): 0.40 - samples/sec: 6212.02 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:34:30,019 epoch 2 - iter 54/275 - loss 0.89177898 - time (sec): 0.80 - samples/sec: 5916.38 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:34:30,418 epoch 2 - iter 81/275 - loss 0.91951425 - time (sec): 1.20 - samples/sec: 5871.17 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:34:30,826 epoch 2 - iter 108/275 - loss 0.89409210 - time (sec): 1.60 - samples/sec: 5771.55 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:34:31,201 epoch 2 - iter 135/275 - loss 0.88598127 - time (sec): 1.98 - samples/sec: 5696.78 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:34:31,610 epoch 2 - iter 162/275 - loss 0.85285616 - time (sec): 2.39 - samples/sec: 5704.02 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:34:32,021 epoch 2 - iter 189/275 - loss 0.82927359 - time (sec): 2.80 - samples/sec: 5684.11 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:34:32,418 epoch 2 - iter 216/275 - loss 0.80856883 - time (sec): 3.20 - samples/sec: 5597.78 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:34:32,831 epoch 2 - iter 243/275 - loss 0.78287459 - time (sec): 3.61 - samples/sec: 5624.79 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:34:33,243 epoch 2 - iter 270/275 - loss 0.76347923 - time (sec): 4.02 - samples/sec: 5554.65 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:34:33,315 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:33,315 EPOCH 2 done: loss 0.7612 - lr: 0.000045
2023-10-18 14:34:33,669 DEV : loss 0.5238796472549438 - f1-score (micro avg) 0.1895
2023-10-18 14:34:33,673 saving best model
2023-10-18 14:34:33,704 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:34,111 epoch 3 - iter 27/275 - loss 0.63330269 - time (sec): 0.41 - samples/sec: 5574.87 - lr: 0.000044 - momentum: 0.000000
2023-10-18 14:34:34,519 epoch 3 - iter 54/275 - loss 0.57059336 - time (sec): 0.81 - samples/sec: 5551.88 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:34:34,925 epoch 3 - iter 81/275 - loss 0.59892710 - time (sec): 1.22 - samples/sec: 5646.07 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:34:35,480 epoch 3 - iter 108/275 - loss 0.57642150 - time (sec): 1.77 - samples/sec: 5218.10 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:34:35,882 epoch 3 - iter 135/275 - loss 0.57839286 - time (sec): 2.18 - samples/sec: 5263.09 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:34:36,322 epoch 3 - iter 162/275 - loss 0.56175351 - time (sec): 2.62 - samples/sec: 5233.27 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:34:36,746 epoch 3 - iter 189/275 - loss 0.55386261 - time (sec): 3.04 - samples/sec: 5233.22 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:34:37,157 epoch 3 - iter 216/275 - loss 0.55331858 - time (sec): 3.45 - samples/sec: 5239.51 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:34:37,574 epoch 3 - iter 243/275 - loss 0.55295982 - time (sec): 3.87 - samples/sec: 5236.73 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:34:37,979 epoch 3 - iter 270/275 - loss 0.55071202 - time (sec): 4.27 - samples/sec: 5252.53 - lr: 0.000039 - momentum: 0.000000
2023-10-18 14:34:38,055 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:38,055 EPOCH 3 done: loss 0.5485 - lr: 0.000039
2023-10-18 14:34:38,415 DEV : loss 0.382687509059906 - f1-score (micro avg) 0.4746
2023-10-18 14:34:38,419 saving best model
2023-10-18 14:34:38,454 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:38,866 epoch 4 - iter 27/275 - loss 0.52440291 - time (sec): 0.41 - samples/sec: 5364.19 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:34:39,254 epoch 4 - iter 54/275 - loss 0.49379413 - time (sec): 0.80 - samples/sec: 5455.30 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:34:39,668 epoch 4 - iter 81/275 - loss 0.46685441 - time (sec): 1.21 - samples/sec: 5269.79 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:34:40,086 epoch 4 - iter 108/275 - loss 0.48160009 - time (sec): 1.63 - samples/sec: 5271.61 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:34:40,522 epoch 4 - iter 135/275 - loss 0.46798370 - time (sec): 2.07 - samples/sec: 5408.94 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:34:40,947 epoch 4 - iter 162/275 - loss 0.45395448 - time (sec): 2.49 - samples/sec: 5374.43 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:34:41,367 epoch 4 - iter 189/275 - loss 0.45338045 - time (sec): 2.91 - samples/sec: 5377.17 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:34:41,776 epoch 4 - iter 216/275 - loss 0.44578276 - time (sec): 3.32 - samples/sec: 5410.32 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:34:42,192 epoch 4 - iter 243/275 - loss 0.44115371 - time (sec): 3.74 - samples/sec: 5409.32 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:34:42,606 epoch 4 - iter 270/275 - loss 0.43517083 - time (sec): 4.15 - samples/sec: 5379.63 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:34:42,692 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:42,692 EPOCH 4 done: loss 0.4346 - lr: 0.000034
2023-10-18 14:34:43,058 DEV : loss 0.33680784702301025 - f1-score (micro avg) 0.5485
2023-10-18 14:34:43,062 saving best model
2023-10-18 14:34:43,096 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:43,497 epoch 5 - iter 27/275 - loss 0.34448775 - time (sec): 0.40 - samples/sec: 6541.62 - lr: 0.000033 - momentum: 0.000000
2023-10-18 14:34:43,884 epoch 5 - iter 54/275 - loss 0.35793341 - time (sec): 0.79 - samples/sec: 5918.05 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:34:44,276 epoch 5 - iter 81/275 - loss 0.37488252 - time (sec): 1.18 - samples/sec: 5855.39 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:34:44,689 epoch 5 - iter 108/275 - loss 0.38765131 - time (sec): 1.59 - samples/sec: 5872.04 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:34:45,096 epoch 5 - iter 135/275 - loss 0.38999235 - time (sec): 2.00 - samples/sec: 5703.31 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:34:45,505 epoch 5 - iter 162/275 - loss 0.38841687 - time (sec): 2.41 - samples/sec: 5644.53 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:34:45,905 epoch 5 - iter 189/275 - loss 0.38974761 - time (sec): 2.81 - samples/sec: 5610.05 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:34:46,321 epoch 5 - iter 216/275 - loss 0.39216271 - time (sec): 3.23 - samples/sec: 5552.62 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:34:46,737 epoch 5 - iter 243/275 - loss 0.38836120 - time (sec): 3.64 - samples/sec: 5498.94 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:34:47,146 epoch 5 - iter 270/275 - loss 0.39178740 - time (sec): 4.05 - samples/sec: 5524.38 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:34:47,217 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:47,217 EPOCH 5 done: loss 0.3892 - lr: 0.000028
2023-10-18 14:34:47,587 DEV : loss 0.286026269197464 - f1-score (micro avg) 0.6083
2023-10-18 14:34:47,591 saving best model
2023-10-18 14:34:47,626 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:48,018 epoch 6 - iter 27/275 - loss 0.48487282 - time (sec): 0.39 - samples/sec: 5038.91 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:34:48,409 epoch 6 - iter 54/275 - loss 0.39612221 - time (sec): 0.78 - samples/sec: 5401.34 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:34:48,806 epoch 6 - iter 81/275 - loss 0.39666419 - time (sec): 1.18 - samples/sec: 5570.88 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:34:49,200 epoch 6 - iter 108/275 - loss 0.38652012 - time (sec): 1.57 - samples/sec: 5614.65 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:34:49,610 epoch 6 - iter 135/275 - loss 0.36749916 - time (sec): 1.98 - samples/sec: 5518.23 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:34:50,028 epoch 6 - iter 162/275 - loss 0.36658781 - time (sec): 2.40 - samples/sec: 5590.13 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:34:50,430 epoch 6 - iter 189/275 - loss 0.35919349 - time (sec): 2.80 - samples/sec: 5581.01 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:34:50,827 epoch 6 - iter 216/275 - loss 0.34957161 - time (sec): 3.20 - samples/sec: 5574.48 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:34:51,248 epoch 6 - iter 243/275 - loss 0.35441459 - time (sec): 3.62 - samples/sec: 5600.18 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:34:51,648 epoch 6 - iter 270/275 - loss 0.35201354 - time (sec): 4.02 - samples/sec: 5573.54 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:34:51,725 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:51,725 EPOCH 6 done: loss 0.3508 - lr: 0.000022
2023-10-18 14:34:52,106 DEV : loss 0.26645413041114807 - f1-score (micro avg) 0.6255
2023-10-18 14:34:52,111 saving best model
2023-10-18 14:34:52,145 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:52,555 epoch 7 - iter 27/275 - loss 0.37548681 - time (sec): 0.41 - samples/sec: 5082.83 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:34:52,958 epoch 7 - iter 54/275 - loss 0.34514987 - time (sec): 0.81 - samples/sec: 5253.35 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:34:53,358 epoch 7 - iter 81/275 - loss 0.35183308 - time (sec): 1.21 - samples/sec: 5228.42 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:34:53,767 epoch 7 - iter 108/275 - loss 0.34792121 - time (sec): 1.62 - samples/sec: 5148.08 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:34:54,188 epoch 7 - iter 135/275 - loss 0.34092810 - time (sec): 2.04 - samples/sec: 5228.21 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:34:54,594 epoch 7 - iter 162/275 - loss 0.33335193 - time (sec): 2.45 - samples/sec: 5352.35 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:34:55,011 epoch 7 - iter 189/275 - loss 0.33956542 - time (sec): 2.87 - samples/sec: 5392.50 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:34:55,420 epoch 7 - iter 216/275 - loss 0.33319421 - time (sec): 3.27 - samples/sec: 5427.20 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:34:55,832 epoch 7 - iter 243/275 - loss 0.32691358 - time (sec): 3.69 - samples/sec: 5474.15 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:34:56,241 epoch 7 - iter 270/275 - loss 0.32870171 - time (sec): 4.10 - samples/sec: 5479.50 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:34:56,313 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:56,313 EPOCH 7 done: loss 0.3272 - lr: 0.000017
2023-10-18 14:34:56,679 DEV : loss 0.2574481666088104 - f1-score (micro avg) 0.6277
2023-10-18 14:34:56,683 saving best model
2023-10-18 14:34:56,718 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:57,134 epoch 8 - iter 27/275 - loss 0.35232911 - time (sec): 0.42 - samples/sec: 5472.65 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:34:57,541 epoch 8 - iter 54/275 - loss 0.32320199 - time (sec): 0.82 - samples/sec: 5193.04 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:34:57,932 epoch 8 - iter 81/275 - loss 0.31388709 - time (sec): 1.21 - samples/sec: 5306.17 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:34:58,336 epoch 8 - iter 108/275 - loss 0.32588481 - time (sec): 1.62 - samples/sec: 5448.85 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:34:58,750 epoch 8 - iter 135/275 - loss 0.30666641 - time (sec): 2.03 - samples/sec: 5539.05 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:34:59,144 epoch 8 - iter 162/275 - loss 0.29869815 - time (sec): 2.43 - samples/sec: 5520.78 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:34:59,550 epoch 8 - iter 189/275 - loss 0.30212004 - time (sec): 2.83 - samples/sec: 5503.36 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:34:59,954 epoch 8 - iter 216/275 - loss 0.30866162 - time (sec): 3.24 - samples/sec: 5459.34 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:35:00,366 epoch 8 - iter 243/275 - loss 0.31213850 - time (sec): 3.65 - samples/sec: 5466.83 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:35:00,795 epoch 8 - iter 270/275 - loss 0.31265385 - time (sec): 4.08 - samples/sec: 5484.78 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:35:00,877 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:00,877 EPOCH 8 done: loss 0.3117 - lr: 0.000011
2023-10-18 14:35:01,258 DEV : loss 0.24906539916992188 - f1-score (micro avg) 0.6434
2023-10-18 14:35:01,263 saving best model
2023-10-18 14:35:01,299 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:01,725 epoch 9 - iter 27/275 - loss 0.28812589 - time (sec): 0.42 - samples/sec: 5407.92 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:35:02,140 epoch 9 - iter 54/275 - loss 0.30351124 - time (sec): 0.84 - samples/sec: 5479.49 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:35:02,566 epoch 9 - iter 81/275 - loss 0.30512938 - time (sec): 1.27 - samples/sec: 5373.65 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:35:02,965 epoch 9 - iter 108/275 - loss 0.32208951 - time (sec): 1.67 - samples/sec: 5342.06 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:35:03,367 epoch 9 - iter 135/275 - loss 0.32851618 - time (sec): 2.07 - samples/sec: 5341.07 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:35:03,774 epoch 9 - iter 162/275 - loss 0.32796046 - time (sec): 2.47 - samples/sec: 5362.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:35:04,187 epoch 9 - iter 189/275 - loss 0.31928635 - time (sec): 2.89 - samples/sec: 5363.87 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:35:04,598 epoch 9 - iter 216/275 - loss 0.31189310 - time (sec): 3.30 - samples/sec: 5409.81 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:35:05,011 epoch 9 - iter 243/275 - loss 0.31212107 - time (sec): 3.71 - samples/sec: 5502.41 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:35:05,409 epoch 9 - iter 270/275 - loss 0.30980213 - time (sec): 4.11 - samples/sec: 5439.67 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:35:05,483 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:05,483 EPOCH 9 done: loss 0.3114 - lr: 0.000006
2023-10-18 14:35:05,873 DEV : loss 0.24717433750629425 - f1-score (micro avg) 0.6471
2023-10-18 14:35:05,878 saving best model
2023-10-18 14:35:05,918 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:06,353 epoch 10 - iter 27/275 - loss 0.25692152 - time (sec): 0.43 - samples/sec: 5151.95 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:35:06,767 epoch 10 - iter 54/275 - loss 0.28814194 - time (sec): 0.85 - samples/sec: 5358.13 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:35:07,169 epoch 10 - iter 81/275 - loss 0.29696314 - time (sec): 1.25 - samples/sec: 5240.54 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:35:07,563 epoch 10 - iter 108/275 - loss 0.29047393 - time (sec): 1.64 - samples/sec: 5305.80 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:35:07,974 epoch 10 - iter 135/275 - loss 0.30111439 - time (sec): 2.06 - samples/sec: 5365.85 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:35:08,388 epoch 10 - iter 162/275 - loss 0.30748718 - time (sec): 2.47 - samples/sec: 5416.22 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:35:08,811 epoch 10 - iter 189/275 - loss 0.31512691 - time (sec): 2.89 - samples/sec: 5459.24 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:35:09,217 epoch 10 - iter 216/275 - loss 0.30589871 - time (sec): 3.30 - samples/sec: 5491.21 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:35:09,630 epoch 10 - iter 243/275 - loss 0.30215041 - time (sec): 3.71 - samples/sec: 5434.46 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:35:10,047 epoch 10 - iter 270/275 - loss 0.29793489 - time (sec): 4.13 - samples/sec: 5420.59 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:35:10,122 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:10,122 EPOCH 10 done: loss 0.2969 - lr: 0.000000
2023-10-18 14:35:10,495 DEV : loss 0.2449171245098114 - f1-score (micro avg) 0.644
2023-10-18 14:35:10,529 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:10,529 Loading model from best epoch ...
2023-10-18 14:35:10,603 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
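Note: the 25 tags match the out_features=25 of the final linear layer and follow a BIOES-style scheme: one O tag plus S-, B-, E- and I- variants for each of six entity types. A small sketch that rebuilds the same list, with the entity types and prefix order read off the log line above:

```python
# Entity types in the order they appear in the logged tag dictionary.
entity_types = ["scope", "pers", "work", "loc", "object", "date"]

# One "O" tag plus S/B/E/I variants per entity type, in the logged order.
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in "SBEI"]

print(len(tags))
print(tags[:5])
```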
2023-10-18 14:35:10,884
Results:
- F-score (micro) 0.6606
- F-score (macro) 0.3933
- Accuracy 0.501
By class:
              precision    recall  f1-score   support
       scope     0.6216    0.6534    0.6371       176
        pers     0.8962    0.7422    0.8120       128
        work     0.4583    0.5946    0.5176        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2
   micro avg     0.6563    0.6649    0.6606       382
   macro avg     0.3952    0.3980    0.3933       382
weighted avg     0.6755    0.6649    0.6659       382
2023-10-18 14:35:10,884 ----------------------------------------------------------------------------------------------------
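Note: the summary rows of the report can be cross-checked from the per-class rows. The sketch below assumes the usual definitions: micro F1 is the harmonic mean of micro precision and recall, macro F1 is the unweighted mean of per-class F1, and weighted F1 is the support-weighted mean. The variable names are illustrative; the numbers are copied from the table above, which lists five classes (date does not appear in it).

```python
# (precision, recall, f1, support) per class, copied from the report.
per_class = {
    "scope": (0.6216, 0.6534, 0.6371, 176),
    "pers": (0.8962, 0.7422, 0.8120, 128),
    "work": (0.4583, 0.5946, 0.5176, 74),
    "object": (0.0000, 0.0000, 0.0000, 2),
    "loc": (0.0000, 0.0000, 0.0000, 2),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


micro_f1 = f1(0.6563, 0.6649)  # from the "micro avg" row
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)
total_support = sum(n for _, _, _, n in per_class.values())
weighted_f1 = sum(f * n for _, _, f, n in per_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
```

The gap between micro F1 (0.6606) and macro F1 (0.3933) comes from the two classes with support 2 and F1 0.0: each counts as one fifth of the macro average while contributing only 4 of the 382 gold spans in total.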