2023-10-18 16:37:46,705 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:46,706 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:37:46,706 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:46,706 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-18 16:37:46,706 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:46,706 Train: 966 sentences
2023-10-18 16:37:46,706 (train_with_dev=False, train_with_test=False)
2023-10-18 16:37:46,706 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:46,706 Training Params:
2023-10-18 16:37:46,706 - learning_rate: "5e-05"
2023-10-18 16:37:46,706 - mini_batch_size: "8"
2023-10-18 16:37:46,706 - max_epochs: "10"
2023-10-18 16:37:46,706 - shuffle: "True"
2023-10-18 16:37:46,706 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:46,706 Plugins:
2023-10-18 16:37:46,706 - TensorboardLogger
2023-10-18 16:37:46,706 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:37:46,706 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:46,706 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:37:46,706 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:37:46,706 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:46,706 Computation:
2023-10-18 16:37:46,707 - compute on device: cuda:0
2023-10-18 16:37:46,707 - embedding storage: none
2023-10-18 16:37:46,707 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:46,707 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 16:37:46,707 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:46,707 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:46,707 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 16:37:46,983 epoch 1 - iter 12/121 - loss 4.06423244 - time (sec): 0.28 - samples/sec: 9156.40 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:37:47,262 epoch 1 - iter 24/121 - loss 3.89967677 - time (sec): 0.55 - samples/sec: 8881.22 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:37:47,552 epoch 1 - iter 36/121 - loss 3.84560872 - time (sec): 0.84 - samples/sec: 8947.53 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:37:47,838 epoch 1 - iter 48/121 - loss 3.71254191 - time (sec): 1.13 - samples/sec: 9182.28 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:37:48,099 epoch 1 - iter 60/121 - loss 3.57296790 - time (sec): 1.39 - samples/sec: 8964.72 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:37:48,369 epoch 1 - iter 72/121 - loss 3.42366005 - time (sec): 1.66 - samples/sec: 8935.77 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:37:48,633 epoch 1 - iter 84/121 - loss 3.24611854 - time (sec): 1.93 - samples/sec: 8911.86 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:37:48,899 epoch 1 - iter 96/121 - loss 3.03337096 - time (sec): 2.19 - samples/sec: 8930.89 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:37:49,163 epoch 1 - iter 108/121 - loss 2.81189925 - time (sec): 2.46 - samples/sec: 9074.53 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:37:49,435 epoch 1 - iter 120/121 - loss 2.63276419 - time (sec): 2.73 - samples/sec: 8999.91 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:37:49,455 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:49,455 EPOCH 1 done: loss 2.6200 - lr: 0.000049
2023-10-18 16:37:49,966 DEV : loss 0.6585149765014648 - f1-score (micro avg) 0.0
2023-10-18 16:37:49,970 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:50,235 epoch 2 - iter 12/121 - loss 0.79443559 - time (sec): 0.26 - samples/sec: 9359.92 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:37:50,516 epoch 2 - iter 24/121 - loss 0.81201782 - time (sec): 0.54 - samples/sec: 9591.10 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:37:50,789 epoch 2 - iter 36/121 - loss 0.76364936 - time (sec): 0.82 - samples/sec: 9742.55 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:37:51,053 epoch 2 - iter 48/121 - loss 0.73351545 - time (sec): 1.08 - samples/sec: 9830.85 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:37:51,306 epoch 2 - iter 60/121 - loss 0.72650713 - time (sec): 1.33 - samples/sec: 9403.27 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:37:51,569 epoch 2 - iter 72/121 - loss 0.71917414 - time (sec): 1.60 - samples/sec: 9209.16 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:37:51,857 epoch 2 - iter 84/121 - loss 0.71563670 - time (sec): 1.89 - samples/sec: 9047.31 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:37:52,141 epoch 2 - iter 96/121 - loss 0.71831549 - time (sec): 2.17 - samples/sec: 9064.31 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:37:52,417 epoch 2 - iter 108/121 - loss 0.70583269 - time (sec): 2.45 - samples/sec: 9050.50 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:37:52,694 epoch 2 - iter 120/121 - loss 0.68450949 - time (sec): 2.72 - samples/sec: 9031.31 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:37:52,716 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:52,716 EPOCH 2 done: loss 0.6847 - lr: 0.000045
2023-10-18 16:37:53,147 DEV : loss 0.5573856234550476 - f1-score (micro avg) 0.0
2023-10-18 16:37:53,152 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:53,431 epoch 3 - iter 12/121 - loss 0.57685052 - time (sec): 0.28 - samples/sec: 8896.62 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:37:53,691 epoch 3 - iter 24/121 - loss 0.59677529 - time (sec): 0.54 - samples/sec: 8619.78 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:37:53,962 epoch 3 - iter 36/121 - loss 0.62216977 - time (sec): 0.81 - samples/sec: 8603.50 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:37:54,232 epoch 3 - iter 48/121 - loss 0.61021734 - time (sec): 1.08 - samples/sec: 8793.59 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:37:54,507 epoch 3 - iter 60/121 - loss 0.60690012 - time (sec): 1.36 - samples/sec: 8787.69 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:37:54,781 epoch 3 - iter 72/121 - loss 0.57933023 - time (sec): 1.63 - samples/sec: 8993.46 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:37:55,051 epoch 3 - iter 84/121 - loss 0.57146458 - time (sec): 1.90 - samples/sec: 9003.22 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:37:55,325 epoch 3 - iter 96/121 - loss 0.56608970 - time (sec): 2.17 - samples/sec: 8961.15 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:37:55,601 epoch 3 - iter 108/121 - loss 0.55476563 - time (sec): 2.45 - samples/sec: 9013.13 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:37:55,881 epoch 3 - iter 120/121 - loss 0.55137957 - time (sec): 2.73 - samples/sec: 8992.62 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:37:55,904 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:55,904 EPOCH 3 done: loss 0.5512 - lr: 0.000039
2023-10-18 16:37:56,333 DEV : loss 0.42163369059562683 - f1-score (micro avg) 0.2788
2023-10-18 16:37:56,337 saving best model
2023-10-18 16:37:56,366 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:56,647 epoch 4 - iter 12/121 - loss 0.54623942 - time (sec): 0.28 - samples/sec: 9667.83 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:37:56,927 epoch 4 - iter 24/121 - loss 0.50675691 - time (sec): 0.56 - samples/sec: 8891.17 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:37:57,194 epoch 4 - iter 36/121 - loss 0.49893346 - time (sec): 0.83 - samples/sec: 8816.67 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:37:57,468 epoch 4 - iter 48/121 - loss 0.48315990 - time (sec): 1.10 - samples/sec: 8837.45 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:37:57,752 epoch 4 - iter 60/121 - loss 0.47606222 - time (sec): 1.39 - samples/sec: 9010.30 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:37:58,025 epoch 4 - iter 72/121 - loss 0.46853704 - time (sec): 1.66 - samples/sec: 8954.16 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:37:58,294 epoch 4 - iter 84/121 - loss 0.45908363 - time (sec): 1.93 - samples/sec: 8907.61 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:37:58,567 epoch 4 - iter 96/121 - loss 0.45593541 - time (sec): 2.20 - samples/sec: 9001.07 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:37:58,829 epoch 4 - iter 108/121 - loss 0.46075997 - time (sec): 2.46 - samples/sec: 9004.61 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:37:59,109 epoch 4 - iter 120/121 - loss 0.45718074 - time (sec): 2.74 - samples/sec: 8973.14 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:37:59,127 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:59,128 EPOCH 4 done: loss 0.4567 - lr: 0.000034
2023-10-18 16:37:59,559 DEV : loss 0.3446199297904968 - f1-score (micro avg) 0.4725
2023-10-18 16:37:59,563 saving best model
2023-10-18 16:37:59,597 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:59,856 epoch 5 - iter 12/121 - loss 0.41508969 - time (sec): 0.26 - samples/sec: 9044.81 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:38:00,125 epoch 5 - iter 24/121 - loss 0.42994677 - time (sec): 0.53 - samples/sec: 9060.25 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:38:00,394 epoch 5 - iter 36/121 - loss 0.41473805 - time (sec): 0.80 - samples/sec: 8980.16 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:38:00,679 epoch 5 - iter 48/121 - loss 0.40379782 - time (sec): 1.08 - samples/sec: 9215.40 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:38:00,951 epoch 5 - iter 60/121 - loss 0.40918990 - time (sec): 1.35 - samples/sec: 9280.53 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:38:01,228 epoch 5 - iter 72/121 - loss 0.40526997 - time (sec): 1.63 - samples/sec: 9214.71 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:38:01,493 epoch 5 - iter 84/121 - loss 0.40300316 - time (sec): 1.90 - samples/sec: 9127.93 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:38:01,763 epoch 5 - iter 96/121 - loss 0.39720734 - time (sec): 2.17 - samples/sec: 9178.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:38:02,034 epoch 5 - iter 108/121 - loss 0.40050812 - time (sec): 2.44 - samples/sec: 9133.80 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:38:02,305 epoch 5 - iter 120/121 - loss 0.39605476 - time (sec): 2.71 - samples/sec: 9086.62 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:38:02,322 ----------------------------------------------------------------------------------------------------
2023-10-18 16:38:02,322 EPOCH 5 done: loss 0.3973 - lr: 0.000028
2023-10-18 16:38:02,752 DEV : loss 0.3135037422180176 - f1-score (micro avg) 0.4919
2023-10-18 16:38:02,756 saving best model
2023-10-18 16:38:02,789 ----------------------------------------------------------------------------------------------------
2023-10-18 16:38:03,066 epoch 6 - iter 12/121 - loss 0.35254438 - time (sec): 0.28 - samples/sec: 9292.40 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:38:03,341 epoch 6 - iter 24/121 - loss 0.36693620 - time (sec): 0.55 - samples/sec: 9320.89 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:38:03,609 epoch 6 - iter 36/121 - loss 0.36194737 - time (sec): 0.82 - samples/sec: 9335.62 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:38:03,900 epoch 6 - iter 48/121 - loss 0.37128051 - time (sec): 1.11 - samples/sec: 9156.22 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:38:04,178 epoch 6 - iter 60/121 - loss 0.36331801 - time (sec): 1.39 - samples/sec: 9133.13 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:38:04,448 epoch 6 - iter 72/121 - loss 0.35824236 - time (sec): 1.66 - samples/sec: 9138.92 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:38:04,717 epoch 6 - iter 84/121 - loss 0.35498454 - time (sec): 1.93 - samples/sec: 9010.20 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:38:04,989 epoch 6 - iter 96/121 - loss 0.36536908 - time (sec): 2.20 - samples/sec: 9007.78 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:38:05,265 epoch 6 - iter 108/121 - loss 0.37179129 - time (sec): 2.47 - samples/sec: 9011.06 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:38:05,560 epoch 6 - iter 120/121 - loss 0.36874895 - time (sec): 2.77 - samples/sec: 8883.11 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:38:05,578 ----------------------------------------------------------------------------------------------------
2023-10-18 16:38:05,578 EPOCH 6 done: loss 0.3670 - lr: 0.000022
2023-10-18 16:38:06,011 DEV : loss 0.29502981901168823 - f1-score (micro avg) 0.4904
2023-10-18 16:38:06,015 ----------------------------------------------------------------------------------------------------
2023-10-18 16:38:06,311 epoch 7 - iter 12/121 - loss 0.33538454 - time (sec): 0.30 - samples/sec: 9443.11 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:38:06,601 epoch 7 - iter 24/121 - loss 0.33269331 - time (sec): 0.59 - samples/sec: 9015.43 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:38:06,886 epoch 7 - iter 36/121 - loss 0.34057186 - time (sec): 0.87 - samples/sec: 8808.40 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:38:07,164 epoch 7 - iter 48/121 - loss 0.35006255 - time (sec): 1.15 - samples/sec: 8665.54 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:38:07,447 epoch 7 - iter 60/121 - loss 0.35051450 - time (sec): 1.43 - samples/sec: 8554.00 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:38:07,731 epoch 7 - iter 72/121 - loss 0.34679089 - time (sec): 1.72 - samples/sec: 8497.45 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:38:08,016 epoch 7 - iter 84/121 - loss 0.34961647 - time (sec): 2.00 - samples/sec: 8541.49 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:38:08,294 epoch 7 - iter 96/121 - loss 0.35121463 - time (sec): 2.28 - samples/sec: 8601.73 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:38:08,581 epoch 7 - iter 108/121 - loss 0.35352437 - time (sec): 2.57 - samples/sec: 8606.00 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:38:08,861 epoch 7 - iter 120/121 - loss 0.35152520 - time (sec): 2.84 - samples/sec: 8645.20 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:38:08,881 ----------------------------------------------------------------------------------------------------
2023-10-18 16:38:08,881 EPOCH 7 done: loss 0.3518 - lr: 0.000017
2023-10-18 16:38:09,312 DEV : loss 0.28431111574172974 - f1-score (micro avg) 0.4868
2023-10-18 16:38:09,316 ----------------------------------------------------------------------------------------------------
2023-10-18 16:38:09,610 epoch 8 - iter 12/121 - loss 0.47161874 - time (sec): 0.29 - samples/sec: 9319.18 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:38:09,888 epoch 8 - iter 24/121 - loss 0.39663415 - time (sec): 0.57 - samples/sec: 8686.15 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:38:10,184 epoch 8 - iter 36/121 - loss 0.37145716 - time (sec): 0.87 - samples/sec: 8550.75 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:38:10,456 epoch 8 - iter 48/121 - loss 0.35177998 - time (sec): 1.14 - samples/sec: 8736.95 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:38:10,738 epoch 8 - iter 60/121 - loss 0.34654712 - time (sec): 1.42 - samples/sec: 8830.25 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:38:11,013 epoch 8 - iter 72/121 - loss 0.34061403 - time (sec): 1.70 - samples/sec: 8720.62 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:38:11,301 epoch 8 - iter 84/121 - loss 0.33533672 - time (sec): 1.98 - samples/sec: 8702.28 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:38:11,582 epoch 8 - iter 96/121 - loss 0.33895771 - time (sec): 2.27 - samples/sec: 8751.23 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:38:11,865 epoch 8 - iter 108/121 - loss 0.34448416 - time (sec): 2.55 - samples/sec: 8773.46 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:38:12,143 epoch 8 - iter 120/121 - loss 0.34155459 - time (sec): 2.83 - samples/sec: 8715.12 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:38:12,162 ----------------------------------------------------------------------------------------------------
2023-10-18 16:38:12,162 EPOCH 8 done: loss 0.3409 - lr: 0.000011
2023-10-18 16:38:12,600 DEV : loss 0.277322381734848 - f1-score (micro avg) 0.4874
2023-10-18 16:38:12,604 ----------------------------------------------------------------------------------------------------
2023-10-18 16:38:12,826 epoch 9 - iter 12/121 - loss 0.36612372 - time (sec): 0.22 - samples/sec: 10690.18 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:38:13,049 epoch 9 - iter 24/121 - loss 0.35162201 - time (sec): 0.44 - samples/sec: 10550.48 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:38:13,291 epoch 9 - iter 36/121 - loss 0.34091714 - time (sec): 0.69 - samples/sec: 10582.62 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:38:13,549 epoch 9 - iter 48/121 - loss 0.32767882 - time (sec): 0.94 - samples/sec: 10136.48 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:38:13,819 epoch 9 - iter 60/121 - loss 0.33057948 - time (sec): 1.21 - samples/sec: 9924.73 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:38:14,083 epoch 9 - iter 72/121 - loss 0.32622971 - time (sec): 1.48 - samples/sec: 9892.74 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:38:14,356 epoch 9 - iter 84/121 - loss 0.33658856 - time (sec): 1.75 - samples/sec: 9768.35 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:38:14,633 epoch 9 - iter 96/121 - loss 0.33404051 - time (sec): 2.03 - samples/sec: 9642.83 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:38:14,922 epoch 9 - iter 108/121 - loss 0.32949463 - time (sec): 2.32 - samples/sec: 9555.83 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:38:15,203 epoch 9 - iter 120/121 - loss 0.32480974 - time (sec): 2.60 - samples/sec: 9472.42 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:38:15,221 ----------------------------------------------------------------------------------------------------
2023-10-18 16:38:15,221 EPOCH 9 done: loss 0.3259 - lr: 0.000006
2023-10-18 16:38:15,664 DEV : loss 0.2769020199775696 - f1-score (micro avg) 0.485
2023-10-18 16:38:15,669 ----------------------------------------------------------------------------------------------------
2023-10-18 16:38:15,941 epoch 10 - iter 12/121 - loss 0.31703877 - time (sec): 0.27 - samples/sec: 7708.05 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:38:16,205 epoch 10 - iter 24/121 - loss 0.32343603 - time (sec): 0.54 - samples/sec: 8437.40 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:38:16,478 epoch 10 - iter 36/121 - loss 0.32421369 - time (sec): 0.81 - samples/sec: 8652.52 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:38:16,753 epoch 10 - iter 48/121 - loss 0.33412753 - time (sec): 1.08 - samples/sec: 8854.06 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:38:17,025 epoch 10 - iter 60/121 - loss 0.32558920 - time (sec): 1.36 - samples/sec: 8915.05 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:38:17,304 epoch 10 - iter 72/121 - loss 0.31995198 - time (sec): 1.63 - samples/sec: 8966.51 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:38:17,583 epoch 10 - iter 84/121 - loss 0.33265760 - time (sec): 1.91 - samples/sec: 8922.66 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:38:17,849 epoch 10 - iter 96/121 - loss 0.33878559 - time (sec): 2.18 - samples/sec: 8962.03 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:38:18,113 epoch 10 - iter 108/121 - loss 0.33332984 - time (sec): 2.44 - samples/sec: 8989.20 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:38:18,396 epoch 10 - iter 120/121 - loss 0.32884184 - time (sec): 2.73 - samples/sec: 9047.08 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:38:18,413 ----------------------------------------------------------------------------------------------------
2023-10-18 16:38:18,413 EPOCH 10 done: loss 0.3288 - lr: 0.000000
2023-10-18 16:38:18,841 DEV : loss 0.2749263644218445 - f1-score (micro avg) 0.4808
2023-10-18 16:38:18,876 ----------------------------------------------------------------------------------------------------
2023-10-18 16:38:18,876 Loading model from best epoch ...
2023-10-18 16:38:18,958 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 16:38:19,355
Results:
- F-score (micro) 0.4391
- F-score (macro) 0.2022
- Accuracy 0.2964
By class:
              precision    recall  f1-score   support

       scope     0.3500    0.4884    0.4078       129
        pers     0.5542    0.6619    0.6033       139
        work     0.0000    0.0000    0.0000        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4480    0.4306    0.4391       360
   macro avg     0.1808    0.2300    0.2022       360
weighted avg     0.3394    0.4306    0.3790       360
2023-10-18 16:38:19,355 ----------------------------------------------------------------------------------------------------
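
Note on the aggregate scores: the micro, macro and weighted F1 values in the final report can be re-derived from the per-class rows. The snippet below is a minimal sketch, not part of the training run. It assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro as the unweighted mean over classes, weighted as the support-weighted mean), and the names `per_class` and `f1` are introduced here for illustration only. Because the inputs are the four-decimal values printed in the log, the results should agree with the report only to within rounding.

```python
# Per-class (precision, recall, f1, support) copied from the "By class" table.
per_class = {
    "scope": (0.3500, 0.4884, 0.4078, 129),
    "pers":  (0.5542, 0.6619, 0.6033, 139),
    "work":  (0.0000, 0.0000, 0.0000, 80),
    "loc":   (0.0000, 0.0000, 0.0000, 9),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}

# Micro-averaged precision and recall as printed in the "micro avg" row.
micro_precision, micro_recall = 0.4480, 0.4306


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[3] for row in per_class.values())

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_f1 = f1(micro_precision, micro_recall)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by support.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

print(f"support={total_support}")
print(f"micro={micro_f1:.4f} macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

Three of the five reported classes (work, loc, date) have an F1 of zero, which is why the macro score (0.2022) sits far below the micro score (0.4391).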