|
2023-10-16 20:18:22,608 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:22,609 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-16 20:18:22,609 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:22,609 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-16 20:18:22,609 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:22,609 Train: 1085 sentences |
|
2023-10-16 20:18:22,609 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 20:18:22,609 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:22,610 Training Params: |
|
2023-10-16 20:18:22,610 - learning_rate: "5e-05" |
|
2023-10-16 20:18:22,610 - mini_batch_size: "8" |
|
2023-10-16 20:18:22,610 - max_epochs: "10" |
|
2023-10-16 20:18:22,610 - shuffle: "True" |
|
2023-10-16 20:18:22,610 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:22,610 Plugins: |
|
2023-10-16 20:18:22,610 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 20:18:22,610 ---------------------------------------------------------------------------------------------------- |
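The `lr` column in the per-iteration lines below follows from the parameters logged above: 1085 training sentences at `mini_batch_size` 8 give 136 iterations per epoch, so 10 epochs give 1360 steps, and a `warmup_fraction` of 0.1 corresponds to roughly the first epoch. The following is a minimal sketch of a generic linear warmup and decay schedule under those numbers. The function name `linear_warmup_lr` is hypothetical, and the exact step at which Flair's `LinearScheduler` samples the rate for logging is an assumption, so individual logged values may differ in the last printed digit.

```python
import math

def linear_warmup_lr(step, peak_lr, total_steps, warmup_fraction):
    """Illustrative linear warmup followed by linear decay to zero.

    This is the common textbook shape, not a copy of Flair's implementation.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

# Values taken from the log: 1085 train sentences, batch size 8, 10 epochs, lr 5e-05.
steps_per_epoch = math.ceil(1085 / 8)
total_steps = steps_per_epoch * 10
peak_lr = 5e-05

for epoch in range(1, 11):
    step = epoch * steps_per_epoch  # step count at the end of each epoch
    print(f"end of epoch {epoch:2d}: lr = {linear_warmup_lr(step, peak_lr, total_steps, 0.1):.6f}")
```

Under this sketch the rate peaks at the end of epoch 1 and reaches zero at the end of epoch 10, which matches the overall trend of the logged `lr` values.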
|
2023-10-16 20:18:22,610 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 20:18:22,610 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 20:18:22,610 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:22,610 Computation: |
|
2023-10-16 20:18:22,610 - compute on device: cuda:0 |
|
2023-10-16 20:18:22,610 - embedding storage: none |
|
2023-10-16 20:18:22,610 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:22,610 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-16 20:18:22,610 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:22,610 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:23,900 epoch 1 - iter 13/136 - loss 2.93136957 - time (sec): 1.29 - samples/sec: 3687.97 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 20:18:25,583 epoch 1 - iter 26/136 - loss 2.61761304 - time (sec): 2.97 - samples/sec: 3560.79 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 20:18:27,189 epoch 1 - iter 39/136 - loss 2.10292372 - time (sec): 4.58 - samples/sec: 3480.42 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 20:18:28,654 epoch 1 - iter 52/136 - loss 1.69932419 - time (sec): 6.04 - samples/sec: 3541.03 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 20:18:30,218 epoch 1 - iter 65/136 - loss 1.46083257 - time (sec): 7.61 - samples/sec: 3509.89 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 20:18:31,384 epoch 1 - iter 78/136 - loss 1.32435059 - time (sec): 8.77 - samples/sec: 3550.71 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 20:18:32,960 epoch 1 - iter 91/136 - loss 1.19352850 - time (sec): 10.35 - samples/sec: 3485.30 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 20:18:34,200 epoch 1 - iter 104/136 - loss 1.10278115 - time (sec): 11.59 - samples/sec: 3501.34 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 20:18:35,599 epoch 1 - iter 117/136 - loss 1.01082586 - time (sec): 12.99 - samples/sec: 3499.00 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 20:18:36,934 epoch 1 - iter 130/136 - loss 0.93871330 - time (sec): 14.32 - samples/sec: 3480.74 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 20:18:37,489 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:37,489 EPOCH 1 done: loss 0.9142 - lr: 0.000047 |
|
2023-10-16 20:18:38,542 DEV : loss 0.17539678514003754 - f1-score (micro avg) 0.6643 |
|
2023-10-16 20:18:38,546 saving best model |
|
2023-10-16 20:18:38,883 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:40,211 epoch 2 - iter 13/136 - loss 0.22301634 - time (sec): 1.33 - samples/sec: 3488.45 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-16 20:18:41,522 epoch 2 - iter 26/136 - loss 0.18431598 - time (sec): 2.64 - samples/sec: 3631.46 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 20:18:42,978 epoch 2 - iter 39/136 - loss 0.16140995 - time (sec): 4.09 - samples/sec: 3598.49 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 20:18:44,206 epoch 2 - iter 52/136 - loss 0.17264568 - time (sec): 5.32 - samples/sec: 3718.67 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 20:18:45,431 epoch 2 - iter 65/136 - loss 0.18654532 - time (sec): 6.55 - samples/sec: 3641.40 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 20:18:46,877 epoch 2 - iter 78/136 - loss 0.18635291 - time (sec): 7.99 - samples/sec: 3574.94 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 20:18:48,524 epoch 2 - iter 91/136 - loss 0.17676252 - time (sec): 9.64 - samples/sec: 3538.90 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 20:18:49,939 epoch 2 - iter 104/136 - loss 0.16965661 - time (sec): 11.05 - samples/sec: 3571.79 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 20:18:51,488 epoch 2 - iter 117/136 - loss 0.16802195 - time (sec): 12.60 - samples/sec: 3573.08 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 20:18:52,961 epoch 2 - iter 130/136 - loss 0.16391174 - time (sec): 14.08 - samples/sec: 3553.33 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 20:18:53,453 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:53,453 EPOCH 2 done: loss 0.1653 - lr: 0.000045 |
|
2023-10-16 20:18:54,968 DEV : loss 0.14248891174793243 - f1-score (micro avg) 0.691 |
|
2023-10-16 20:18:54,972 saving best model |
|
2023-10-16 20:18:55,435 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:18:56,982 epoch 3 - iter 13/136 - loss 0.10095872 - time (sec): 1.55 - samples/sec: 3543.51 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-16 20:18:58,419 epoch 3 - iter 26/136 - loss 0.10478173 - time (sec): 2.98 - samples/sec: 3660.03 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 20:18:59,853 epoch 3 - iter 39/136 - loss 0.09423993 - time (sec): 4.42 - samples/sec: 3552.60 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 20:19:01,429 epoch 3 - iter 52/136 - loss 0.09587897 - time (sec): 5.99 - samples/sec: 3509.06 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 20:19:02,600 epoch 3 - iter 65/136 - loss 0.09997083 - time (sec): 7.16 - samples/sec: 3492.72 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 20:19:03,950 epoch 3 - iter 78/136 - loss 0.09751295 - time (sec): 8.51 - samples/sec: 3489.66 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 20:19:05,435 epoch 3 - iter 91/136 - loss 0.09486572 - time (sec): 10.00 - samples/sec: 3532.44 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 20:19:06,726 epoch 3 - iter 104/136 - loss 0.09199902 - time (sec): 11.29 - samples/sec: 3559.17 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 20:19:08,133 epoch 3 - iter 117/136 - loss 0.08768794 - time (sec): 12.70 - samples/sec: 3551.74 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 20:19:09,588 epoch 3 - iter 130/136 - loss 0.08763564 - time (sec): 14.15 - samples/sec: 3526.52 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-16 20:19:10,164 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:19:10,165 EPOCH 3 done: loss 0.0866 - lr: 0.000039 |
|
2023-10-16 20:19:11,832 DEV : loss 0.10237669199705124 - f1-score (micro avg) 0.8268 |
|
2023-10-16 20:19:11,836 saving best model |
|
2023-10-16 20:19:12,290 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:19:13,716 epoch 4 - iter 13/136 - loss 0.08431575 - time (sec): 1.42 - samples/sec: 3427.07 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 20:19:15,238 epoch 4 - iter 26/136 - loss 0.06713900 - time (sec): 2.95 - samples/sec: 3482.32 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 20:19:16,482 epoch 4 - iter 39/136 - loss 0.05765410 - time (sec): 4.19 - samples/sec: 3551.75 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 20:19:17,877 epoch 4 - iter 52/136 - loss 0.05665901 - time (sec): 5.58 - samples/sec: 3682.64 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 20:19:19,212 epoch 4 - iter 65/136 - loss 0.05979582 - time (sec): 6.92 - samples/sec: 3637.82 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 20:19:20,790 epoch 4 - iter 78/136 - loss 0.05790229 - time (sec): 8.50 - samples/sec: 3623.54 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 20:19:22,120 epoch 4 - iter 91/136 - loss 0.05693213 - time (sec): 9.83 - samples/sec: 3606.14 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 20:19:23,630 epoch 4 - iter 104/136 - loss 0.05628982 - time (sec): 11.34 - samples/sec: 3579.50 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 20:19:25,016 epoch 4 - iter 117/136 - loss 0.05352644 - time (sec): 12.72 - samples/sec: 3590.41 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 20:19:26,241 epoch 4 - iter 130/136 - loss 0.05384621 - time (sec): 13.95 - samples/sec: 3567.78 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 20:19:26,852 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:19:26,852 EPOCH 4 done: loss 0.0531 - lr: 0.000034 |
|
2023-10-16 20:19:28,342 DEV : loss 0.1108362078666687 - f1-score (micro avg) 0.792 |
|
2023-10-16 20:19:28,347 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:19:29,788 epoch 5 - iter 13/136 - loss 0.04268788 - time (sec): 1.44 - samples/sec: 3396.75 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 20:19:31,041 epoch 5 - iter 26/136 - loss 0.03670220 - time (sec): 2.69 - samples/sec: 3419.76 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 20:19:32,272 epoch 5 - iter 39/136 - loss 0.04010108 - time (sec): 3.92 - samples/sec: 3646.72 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 20:19:33,713 epoch 5 - iter 52/136 - loss 0.04362300 - time (sec): 5.37 - samples/sec: 3515.55 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 20:19:35,275 epoch 5 - iter 65/136 - loss 0.03800478 - time (sec): 6.93 - samples/sec: 3481.28 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 20:19:36,599 epoch 5 - iter 78/136 - loss 0.03549079 - time (sec): 8.25 - samples/sec: 3546.96 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 20:19:37,986 epoch 5 - iter 91/136 - loss 0.03458614 - time (sec): 9.64 - samples/sec: 3543.08 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 20:19:39,580 epoch 5 - iter 104/136 - loss 0.03257155 - time (sec): 11.23 - samples/sec: 3549.79 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 20:19:40,937 epoch 5 - iter 117/136 - loss 0.03505263 - time (sec): 12.59 - samples/sec: 3533.61 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 20:19:42,362 epoch 5 - iter 130/136 - loss 0.03522794 - time (sec): 14.01 - samples/sec: 3540.70 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 20:19:43,035 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:19:43,036 EPOCH 5 done: loss 0.0344 - lr: 0.000028 |
|
2023-10-16 20:19:44,757 DEV : loss 0.12420879304409027 - f1-score (micro avg) 0.8015 |
|
2023-10-16 20:19:44,762 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:19:46,094 epoch 6 - iter 13/136 - loss 0.01916100 - time (sec): 1.33 - samples/sec: 3326.19 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 20:19:47,511 epoch 6 - iter 26/136 - loss 0.02420091 - time (sec): 2.75 - samples/sec: 3358.03 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 20:19:49,067 epoch 6 - iter 39/136 - loss 0.02128442 - time (sec): 4.30 - samples/sec: 3389.40 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 20:19:50,714 epoch 6 - iter 52/136 - loss 0.02244990 - time (sec): 5.95 - samples/sec: 3469.10 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 20:19:52,211 epoch 6 - iter 65/136 - loss 0.02199431 - time (sec): 7.45 - samples/sec: 3475.38 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 20:19:53,518 epoch 6 - iter 78/136 - loss 0.02434853 - time (sec): 8.76 - samples/sec: 3477.60 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 20:19:54,879 epoch 6 - iter 91/136 - loss 0.02513952 - time (sec): 10.12 - samples/sec: 3445.86 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 20:19:56,241 epoch 6 - iter 104/136 - loss 0.02469020 - time (sec): 11.48 - samples/sec: 3465.60 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 20:19:57,761 epoch 6 - iter 117/136 - loss 0.02336871 - time (sec): 13.00 - samples/sec: 3473.17 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 20:19:59,055 epoch 6 - iter 130/136 - loss 0.02303322 - time (sec): 14.29 - samples/sec: 3518.84 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 20:19:59,504 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:19:59,505 EPOCH 6 done: loss 0.0235 - lr: 0.000023 |
|
2023-10-16 20:20:01,009 DEV : loss 0.1299697607755661 - f1-score (micro avg) 0.82 |
|
2023-10-16 20:20:01,015 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:20:02,378 epoch 7 - iter 13/136 - loss 0.02201266 - time (sec): 1.36 - samples/sec: 3644.27 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 20:20:04,027 epoch 7 - iter 26/136 - loss 0.01585502 - time (sec): 3.01 - samples/sec: 3607.62 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 20:20:05,485 epoch 7 - iter 39/136 - loss 0.01813674 - time (sec): 4.47 - samples/sec: 3493.68 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 20:20:07,060 epoch 7 - iter 52/136 - loss 0.01977797 - time (sec): 6.04 - samples/sec: 3551.44 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 20:20:08,369 epoch 7 - iter 65/136 - loss 0.01878455 - time (sec): 7.35 - samples/sec: 3549.15 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 20:20:09,893 epoch 7 - iter 78/136 - loss 0.01847453 - time (sec): 8.88 - samples/sec: 3555.26 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 20:20:11,172 epoch 7 - iter 91/136 - loss 0.01767757 - time (sec): 10.16 - samples/sec: 3562.64 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 20:20:12,435 epoch 7 - iter 104/136 - loss 0.01671361 - time (sec): 11.42 - samples/sec: 3581.10 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 20:20:13,763 epoch 7 - iter 117/136 - loss 0.01655957 - time (sec): 12.75 - samples/sec: 3559.24 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 20:20:15,116 epoch 7 - iter 130/136 - loss 0.01641038 - time (sec): 14.10 - samples/sec: 3556.12 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 20:20:15,632 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:20:15,632 EPOCH 7 done: loss 0.0160 - lr: 0.000017 |
|
2023-10-16 20:20:17,148 DEV : loss 0.13152331113815308 - f1-score (micro avg) 0.8315 |
|
2023-10-16 20:20:17,154 saving best model |
|
2023-10-16 20:20:17,607 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:20:19,128 epoch 8 - iter 13/136 - loss 0.01402109 - time (sec): 1.52 - samples/sec: 2768.32 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 20:20:20,447 epoch 8 - iter 26/136 - loss 0.01902129 - time (sec): 2.84 - samples/sec: 3178.52 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 20:20:21,607 epoch 8 - iter 39/136 - loss 0.01608610 - time (sec): 4.00 - samples/sec: 3256.67 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 20:20:23,057 epoch 8 - iter 52/136 - loss 0.01326992 - time (sec): 5.45 - samples/sec: 3408.57 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 20:20:24,346 epoch 8 - iter 65/136 - loss 0.01198353 - time (sec): 6.74 - samples/sec: 3474.97 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 20:20:25,975 epoch 8 - iter 78/136 - loss 0.01138949 - time (sec): 8.37 - samples/sec: 3455.20 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 20:20:27,579 epoch 8 - iter 91/136 - loss 0.01277453 - time (sec): 9.97 - samples/sec: 3455.56 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 20:20:28,873 epoch 8 - iter 104/136 - loss 0.01291802 - time (sec): 11.26 - samples/sec: 3477.15 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 20:20:30,352 epoch 8 - iter 117/136 - loss 0.01246003 - time (sec): 12.74 - samples/sec: 3471.33 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 20:20:31,831 epoch 8 - iter 130/136 - loss 0.01146785 - time (sec): 14.22 - samples/sec: 3491.93 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 20:20:32,488 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:20:32,488 EPOCH 8 done: loss 0.0123 - lr: 0.000012 |
|
2023-10-16 20:20:33,976 DEV : loss 0.15120868384838104 - f1-score (micro avg) 0.8192 |
|
2023-10-16 20:20:33,981 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:20:35,606 epoch 9 - iter 13/136 - loss 0.00940542 - time (sec): 1.62 - samples/sec: 3592.17 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 20:20:36,941 epoch 9 - iter 26/136 - loss 0.00670356 - time (sec): 2.96 - samples/sec: 3633.67 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 20:20:38,165 epoch 9 - iter 39/136 - loss 0.00947777 - time (sec): 4.18 - samples/sec: 3542.21 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 20:20:39,685 epoch 9 - iter 52/136 - loss 0.00891002 - time (sec): 5.70 - samples/sec: 3538.63 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 20:20:41,079 epoch 9 - iter 65/136 - loss 0.00918508 - time (sec): 7.10 - samples/sec: 3529.78 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 20:20:42,396 epoch 9 - iter 78/136 - loss 0.00934079 - time (sec): 8.41 - samples/sec: 3592.02 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 20:20:43,945 epoch 9 - iter 91/136 - loss 0.00873731 - time (sec): 9.96 - samples/sec: 3549.63 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 20:20:45,515 epoch 9 - iter 104/136 - loss 0.00839287 - time (sec): 11.53 - samples/sec: 3560.14 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 20:20:46,878 epoch 9 - iter 117/136 - loss 0.00857232 - time (sec): 12.90 - samples/sec: 3579.75 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 20:20:48,163 epoch 9 - iter 130/136 - loss 0.01058054 - time (sec): 14.18 - samples/sec: 3549.23 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 20:20:48,734 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:20:48,735 EPOCH 9 done: loss 0.0106 - lr: 0.000006 |
|
2023-10-16 20:20:50,191 DEV : loss 0.14968341588974 - f1-score (micro avg) 0.8175 |
|
2023-10-16 20:20:50,195 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:20:51,766 epoch 10 - iter 13/136 - loss 0.00201920 - time (sec): 1.57 - samples/sec: 3363.38 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 20:20:52,880 epoch 10 - iter 26/136 - loss 0.00257635 - time (sec): 2.68 - samples/sec: 3418.54 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 20:20:54,377 epoch 10 - iter 39/136 - loss 0.00284480 - time (sec): 4.18 - samples/sec: 3228.03 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 20:20:55,815 epoch 10 - iter 52/136 - loss 0.00540897 - time (sec): 5.62 - samples/sec: 3338.60 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 20:20:57,292 epoch 10 - iter 65/136 - loss 0.00496823 - time (sec): 7.10 - samples/sec: 3352.13 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 20:20:58,618 epoch 10 - iter 78/136 - loss 0.00534676 - time (sec): 8.42 - samples/sec: 3351.23 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 20:21:00,051 epoch 10 - iter 91/136 - loss 0.00574925 - time (sec): 9.85 - samples/sec: 3392.70 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 20:21:01,333 epoch 10 - iter 104/136 - loss 0.00664022 - time (sec): 11.14 - samples/sec: 3429.83 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 20:21:02,990 epoch 10 - iter 117/136 - loss 0.00799959 - time (sec): 12.79 - samples/sec: 3422.62 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 20:21:04,396 epoch 10 - iter 130/136 - loss 0.00826072 - time (sec): 14.20 - samples/sec: 3483.83 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 20:21:05,155 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:05,155 EPOCH 10 done: loss 0.0080 - lr: 0.000000 |
|
2023-10-16 20:21:06,676 DEV : loss 0.15046021342277527 - f1-score (micro avg) 0.8324 |
|
2023-10-16 20:21:06,681 saving best model |
|
2023-10-16 20:21:07,513 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:07,514 Loading model from best epoch ... |
|
2023-10-16 20:21:09,210 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-16 20:21:11,380 |
|
Results: |
|
- F-score (micro) 0.7767 |
|
- F-score (macro) 0.7354 |
|
- Accuracy 0.6543 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.8112    0.8814    0.8449       312 |
|
         PER     0.6628    0.8221    0.7339       208 |
|
         ORG     0.5306    0.4727    0.5000        55 |
|
   HumanProd     0.7586    1.0000    0.8627        22 |
|
|
|
   micro avg     0.7319    0.8275    0.7767       597 |
|
   macro avg     0.6908    0.7941    0.7354       597 |
|
weighted avg     0.7317    0.8275    0.7751       597 |
|
|
|
2023-10-16 20:21:11,380 ---------------------------------------------------------------------------------------------------- |
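The averaged rows of the test report above can be cross-checked from the per-class rows. The sketch below copies the rounded figures from the log and recomputes the macro, weighted and micro F1 scores. Because the inputs are already rounded to four decimals, the recomputed values may differ from the logged ones in the last digit. The variable names are illustrative only.

```python
# Per-class rows from the log: (precision, recall, f1-score, support).
report = {
    "LOC": (0.8112, 0.8814, 0.8449, 312),
    "PER": (0.6628, 0.8221, 0.7339, 208),
    "ORG": (0.5306, 0.4727, 0.5000, 55),
    "HumanProd": (0.7586, 1.0000, 0.8627, 22),
}

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

total_support = sum(support for _, _, _, support in report.values())

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(score for _, _, score, _ in report.values()) / len(report)

# Weighted F1: per-class F1 scores weighted by class support.
weighted_f1 = sum(score * support for _, _, score, support in report.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall from the log.
micro_f1 = f1(0.7319, 0.8275)

print(f"support={total_support} macro={macro_f1:.4f} weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

The low ORG score (55 support) pulls the macro average well below the micro average, since the macro average gives every class equal weight regardless of how often it occurs.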
|
|