2023-10-13 08:20:47,659 ----------------------------------------------------------------------------------------------------
2023-10-13 08:20:47,660 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 08:20:47,660 ----------------------------------------------------------------------------------------------------
2023-10-13 08:20:47,660 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-13 08:20:47,660 ----------------------------------------------------------------------------------------------------
2023-10-13 08:20:47,660 Train: 1100 sentences
2023-10-13 08:20:47,661 (train_with_dev=False, train_with_test=False)
2023-10-13 08:20:47,661 ----------------------------------------------------------------------------------------------------
2023-10-13 08:20:47,661 Training Params:
2023-10-13 08:20:47,661  - learning_rate: "3e-05"
2023-10-13 08:20:47,661  - mini_batch_size: "4"
2023-10-13 08:20:47,661  - max_epochs: "10"
2023-10-13 08:20:47,661  - shuffle: "True"
2023-10-13 08:20:47,661 ----------------------------------------------------------------------------------------------------
2023-10-13 08:20:47,661 Plugins:
2023-10-13 08:20:47,661  - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:20:47,661 ----------------------------------------------------------------------------------------------------
2023-10-13 08:20:47,661 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:20:47,661  - metric: "('micro avg', 'f1-score')"
2023-10-13 08:20:47,661 ----------------------------------------------------------------------------------------------------
2023-10-13 08:20:47,661 Computation:
2023-10-13 08:20:47,661  - compute on device: cuda:0
2023-10-13 08:20:47,661  - embedding storage: none
2023-10-13 08:20:47,661 ----------------------------------------------------------------------------------------------------
2023-10-13 08:20:47,661 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 08:20:47,661 ----------------------------------------------------------------------------------------------------
2023-10-13 08:20:47,661 ----------------------------------------------------------------------------------------------------
2023-10-13 08:20:48,913 epoch 1 - iter 27/275 - loss 3.46471210 - time (sec): 1.25 - samples/sec: 1973.61 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:20:50,137 epoch 1 - iter 54/275 - loss 3.21690469 - time (sec): 2.48 - samples/sec: 1842.73 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:20:51,343 epoch 1 - iter 81/275 - loss 2.62973915 - time (sec): 3.68 - samples/sec: 1824.34 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:20:52,578 epoch 1 - iter 108/275 - loss 2.13003340 - time (sec): 4.92 - samples/sec: 1896.07 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:20:53,768 epoch 1 - iter 135/275 - loss 1.89123736 - time (sec): 6.11 - samples/sec: 1851.12 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:20:54,944 epoch 1 - iter 162/275 - loss 1.69661405 - time (sec): 7.28 - samples/sec: 1837.83 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:20:56,139 epoch 1 - iter 189/275 - loss 1.53825028 - time (sec): 8.48 - samples/sec: 1852.13 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:20:57,424 epoch 1 - iter 216/275 - loss 1.37542680 - time (sec): 9.76 - samples/sec: 1864.31 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:20:58,620 epoch 1 - iter 243/275 - loss 1.26954431 - time (sec): 10.96 - samples/sec: 1851.10 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:20:59,790 epoch 1 - iter 270/275 - loss 1.18595637 - time (sec): 12.13 - samples/sec: 1843.92 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:21:00,002 ----------------------------------------------------------------------------------------------------
2023-10-13 08:21:00,002 EPOCH 1 done: loss 1.1682 - lr: 0.000029
2023-10-13 08:21:00,577 DEV : loss 0.24353209137916565 - f1-score (micro avg)  0.6975
2023-10-13 08:21:00,583 saving best model
2023-10-13 08:21:00,907 ----------------------------------------------------------------------------------------------------
2023-10-13 08:21:02,174 epoch 2 - iter 27/275 - loss 0.24473887 - time (sec): 1.27 - samples/sec: 1977.72 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:21:03,388 epoch 2 - iter 54/275 - loss 0.24844451 - time (sec): 2.48 - samples/sec: 1764.89 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:21:04,552 epoch 2 - iter 81/275 - loss 0.23058422 - time (sec): 3.64 - samples/sec: 1830.41 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:21:05,754 epoch 2 - iter 108/275 - loss 0.22252073 - time (sec): 4.85 - samples/sec: 1823.61 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:21:06,965 epoch 2 - iter 135/275 - loss 0.20601209 - time (sec): 6.06 - samples/sec: 1840.89 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:21:08,168 epoch 2 - iter 162/275 - loss 0.20367117 - time (sec): 7.26 - samples/sec: 1834.43 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:21:09,379 epoch 2 - iter 189/275 - loss 0.19389855 - time (sec): 8.47 - samples/sec: 1825.35 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:21:10,595 epoch 2 - iter 216/275 - loss 0.19900124 - time (sec): 9.69 - samples/sec: 1844.52 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:21:11,837 epoch 2 - iter 243/275 - loss 0.19906591 - time (sec): 10.93 - samples/sec: 1835.81 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:21:13,181 epoch 2 - iter 270/275 - loss 0.19364842 - time (sec): 12.27 - samples/sec: 1823.25 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:21:13,388 ----------------------------------------------------------------------------------------------------
2023-10-13 08:21:13,389 EPOCH 2 done: loss 0.1927 - lr: 0.000027
2023-10-13 08:21:14,091 DEV : loss 0.14056046307086945 - f1-score (micro avg)  0.823
2023-10-13 08:21:14,097 saving best model
2023-10-13 08:21:14,526 ----------------------------------------------------------------------------------------------------
2023-10-13 08:21:15,735 epoch 3 - iter 27/275 - loss 0.12909850 - time (sec): 1.20 - samples/sec: 1694.71 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:21:16,909 epoch 3 - iter 54/275 - loss 0.12629026 - time (sec): 2.38 - samples/sec: 1801.59 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:21:18,133 epoch 3 - iter 81/275 - loss 0.12097415 - time (sec): 3.60 - samples/sec: 1854.74 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:21:19,268 epoch 3 - iter 108/275 - loss 0.12012962 - time (sec): 4.74 - samples/sec: 1891.63 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:21:20,429 epoch 3 - iter 135/275 - loss 0.11435618 - time (sec): 5.90 - samples/sec: 1899.44 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:21:21,554 epoch 3 - iter 162/275 - loss 0.11286311 - time (sec): 7.02 - samples/sec: 1888.94 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:21:22,692 epoch 3 - iter 189/275 - loss 0.10764496 - time (sec): 8.16 - samples/sec: 1909.01 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:21:23,819 epoch 3 - iter 216/275 - loss 0.10669876 - time (sec): 9.29 - samples/sec: 1891.61 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:21:24,961 epoch 3 - iter 243/275 - loss 0.10981659 - time (sec): 10.43 - samples/sec: 1914.89 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:21:26,090 epoch 3 - iter 270/275 - loss 0.10712392 - time (sec): 11.56 - samples/sec: 1934.98 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:21:26,303 ----------------------------------------------------------------------------------------------------
2023-10-13 08:21:26,303 EPOCH 3 done: loss 0.1062 - lr: 0.000023
2023-10-13 08:21:27,005 DEV : loss 0.1445218026638031 - f1-score (micro avg)  0.8379
2023-10-13 08:21:27,010 saving best model
2023-10-13 08:21:27,422 ----------------------------------------------------------------------------------------------------
2023-10-13 08:21:28,586 epoch 4 - iter 27/275 - loss 0.05713240 - time (sec): 1.16 - samples/sec: 1887.06 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:21:29,761 epoch 4 - iter 54/275 - loss 0.06255333 - time (sec): 2.33 - samples/sec: 1960.77 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:21:30,931 epoch 4 - iter 81/275 - loss 0.06924934 - time (sec): 3.50 - samples/sec: 1923.76 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:21:32,082 epoch 4 - iter 108/275 - loss 0.06574498 - time (sec): 4.66 - samples/sec: 1922.14 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:21:33,239 epoch 4 - iter 135/275 - loss 0.06833962 - time (sec): 5.81 - samples/sec: 1924.72 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:21:34,376 epoch 4 - iter 162/275 - loss 0.06967985 - time (sec): 6.95 - samples/sec: 1920.30 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:21:35,504 epoch 4 - iter 189/275 - loss 0.07045302 - time (sec): 8.08 - samples/sec: 1910.81 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:21:36,630 epoch 4 - iter 216/275 - loss 0.07087796 - time (sec): 9.20 - samples/sec: 1903.55 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:21:37,804 epoch 4 - iter 243/275 - loss 0.07143033 - time (sec): 10.38 - samples/sec: 1946.55 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:21:38,942 epoch 4 - iter 270/275 - loss 0.07166938 - time (sec): 11.52 - samples/sec: 1945.54 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:21:39,148 ----------------------------------------------------------------------------------------------------
2023-10-13 08:21:39,148 EPOCH 4 done: loss 0.0717 - lr: 0.000020
2023-10-13 08:21:39,842 DEV : loss 0.15860778093338013 - f1-score (micro avg)  0.8585
2023-10-13 08:21:39,847 saving best model
2023-10-13 08:21:40,257 ----------------------------------------------------------------------------------------------------
2023-10-13 08:21:41,469 epoch 5 - iter 27/275 - loss 0.05709646 - time (sec): 1.21 - samples/sec: 1941.79 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:21:42,681 epoch 5 - iter 54/275 - loss 0.04815460 - time (sec): 2.42 - samples/sec: 1895.37 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:21:43,888 epoch 5 - iter 81/275 - loss 0.06941705 - time (sec): 3.62 - samples/sec: 1836.50 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:21:45,122 epoch 5 - iter 108/275 - loss 0.06404587 - time (sec): 4.86 - samples/sec: 1814.29 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:21:46,343 epoch 5 - iter 135/275 - loss 0.06111222 - time (sec): 6.08 - samples/sec: 1835.79 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:21:47,557 epoch 5 - iter 162/275 - loss 0.05913176 - time (sec): 7.29 - samples/sec: 1833.95 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:21:48,764 epoch 5 - iter 189/275 - loss 0.05684721 - time (sec): 8.50 - samples/sec: 1838.87 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:21:50,022 epoch 5 - iter 216/275 - loss 0.05272946 - time (sec): 9.76 - samples/sec: 1804.03 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:21:51,259 epoch 5 - iter 243/275 - loss 0.05244660 - time (sec): 10.99 - samples/sec: 1808.98 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:21:52,554 epoch 5 - iter 270/275 - loss 0.05184324 - time (sec): 12.29 - samples/sec: 1808.58 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:21:52,787 ----------------------------------------------------------------------------------------------------
2023-10-13 08:21:52,787 EPOCH 5 done: loss 0.0509 - lr: 0.000017
2023-10-13 08:21:53,487 DEV : loss 0.18737180531024933 - f1-score (micro avg)  0.8541
2023-10-13 08:21:53,492 ----------------------------------------------------------------------------------------------------
2023-10-13 08:21:54,760 epoch 6 - iter 27/275 - loss 0.05964321 - time (sec): 1.27 - samples/sec: 1795.31 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:21:55,956 epoch 6 - iter 54/275 - loss 0.05916816 - time (sec): 2.46 - samples/sec: 1850.69 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:21:57,185 epoch 6 - iter 81/275 - loss 0.05566550 - time (sec): 3.69 - samples/sec: 1839.23 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:21:58,377 epoch 6 - iter 108/275 - loss 0.05129284 - time (sec): 4.88 - samples/sec: 1833.12 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:21:59,519 epoch 6 - iter 135/275 - loss 0.05546696 - time (sec): 6.03 - samples/sec: 1832.52 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:22:00,650 epoch 6 - iter 162/275 - loss 0.05123068 - time (sec): 7.16 - samples/sec: 1861.60 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:22:01,789 epoch 6 - iter 189/275 - loss 0.05053093 - time (sec): 8.30 - samples/sec: 1870.52 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:22:02,923 epoch 6 - iter 216/275 - loss 0.04909056 - time (sec): 9.43 - samples/sec: 1886.30 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:22:04,081 epoch 6 - iter 243/275 - loss 0.05106601 - time (sec): 10.59 - samples/sec: 1898.16 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:22:05,232 epoch 6 - iter 270/275 - loss 0.04858153 - time (sec): 11.74 - samples/sec: 1903.14 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:22:05,442 ----------------------------------------------------------------------------------------------------
2023-10-13 08:22:05,442 EPOCH 6 done: loss 0.0478 - lr: 0.000013
2023-10-13 08:22:06,116 DEV : loss 0.189662367105484 - f1-score (micro avg)  0.8602
2023-10-13 08:22:06,121 saving best model
2023-10-13 08:22:06,548 ----------------------------------------------------------------------------------------------------
2023-10-13 08:22:07,762 epoch 7 - iter 27/275 - loss 0.00347272 - time (sec): 1.21 - samples/sec: 1818.10 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:22:08,960 epoch 7 - iter 54/275 - loss 0.01177133 - time (sec): 2.41 - samples/sec: 1850.93 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:22:10,157 epoch 7 - iter 81/275 - loss 0.00965081 - time (sec): 3.60 - samples/sec: 1789.82 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:22:11,355 epoch 7 - iter 108/275 - loss 0.01529999 - time (sec): 4.80 - samples/sec: 1854.76 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:22:12,549 epoch 7 - iter 135/275 - loss 0.02045110 - time (sec): 6.00 - samples/sec: 1883.22 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:22:13,723 epoch 7 - iter 162/275 - loss 0.02871106 - time (sec): 7.17 - samples/sec: 1857.90 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:22:14,928 epoch 7 - iter 189/275 - loss 0.03492264 - time (sec): 8.38 - samples/sec: 1869.32 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:22:16,145 epoch 7 - iter 216/275 - loss 0.03927162 - time (sec): 9.59 - samples/sec: 1876.35 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:22:17,354 epoch 7 - iter 243/275 - loss 0.03625023 - time (sec): 10.80 - samples/sec: 1860.06 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:22:18,564 epoch 7 - iter 270/275 - loss 0.03554710 - time (sec): 12.01 - samples/sec: 1859.93 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:22:18,780 ----------------------------------------------------------------------------------------------------
2023-10-13 08:22:18,781 EPOCH 7 done: loss 0.0357 - lr: 0.000010
2023-10-13 08:22:19,446 DEV : loss 0.17136068642139435 - f1-score (micro avg)  0.879
2023-10-13 08:22:19,451 saving best model
2023-10-13 08:22:19,866 ----------------------------------------------------------------------------------------------------
2023-10-13 08:22:21,086 epoch 8 - iter 27/275 - loss 0.01490497 - time (sec): 1.22 - samples/sec: 1813.88 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:22:22,284 epoch 8 - iter 54/275 - loss 0.01229963 - time (sec): 2.41 - samples/sec: 1820.49 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:22:23,453 epoch 8 - iter 81/275 - loss 0.03167026 - time (sec): 3.58 - samples/sec: 1864.41 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:22:24,618 epoch 8 - iter 108/275 - loss 0.02569038 - time (sec): 4.75 - samples/sec: 1841.60 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:22:25,875 epoch 8 - iter 135/275 - loss 0.03007548 - time (sec): 6.01 - samples/sec: 1848.66 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:22:27,070 epoch 8 - iter 162/275 - loss 0.03292475 - time (sec): 7.20 - samples/sec: 1849.11 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:22:28,271 epoch 8 - iter 189/275 - loss 0.02960551 - time (sec): 8.40 - samples/sec: 1887.87 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:22:29,509 epoch 8 - iter 216/275 - loss 0.02730281 - time (sec): 9.64 - samples/sec: 1865.32 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:22:30,781 epoch 8 - iter 243/275 - loss 0.02693528 - time (sec): 10.91 - samples/sec: 1856.78 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:22:31,987 epoch 8 - iter 270/275 - loss 0.02498566 - time (sec): 12.12 - samples/sec: 1843.72 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:22:32,206 ----------------------------------------------------------------------------------------------------
2023-10-13 08:22:32,206 EPOCH 8 done: loss 0.0260 - lr: 0.000007
2023-10-13 08:22:32,867 DEV : loss 0.18576553463935852 - f1-score (micro avg)  0.8674
2023-10-13 08:22:32,871 ----------------------------------------------------------------------------------------------------
2023-10-13 08:22:34,055 epoch 9 - iter 27/275 - loss 0.00634717 - time (sec): 1.18 - samples/sec: 1888.39 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:22:35,220 epoch 9 - iter 54/275 - loss 0.01263450 - time (sec): 2.35 - samples/sec: 1888.86 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:22:36,403 epoch 9 - iter 81/275 - loss 0.00970784 - time (sec): 3.53 - samples/sec: 1888.71 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:22:37,585 epoch 9 - iter 108/275 - loss 0.01140941 - time (sec): 4.71 - samples/sec: 1904.23 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:22:38,791 epoch 9 - iter 135/275 - loss 0.01594643 - time (sec): 5.92 - samples/sec: 1923.04 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:22:39,983 epoch 9 - iter 162/275 - loss 0.01560158 - time (sec): 7.11 - samples/sec: 1905.01 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:22:41,198 epoch 9 - iter 189/275 - loss 0.01560657 - time (sec): 8.33 - samples/sec: 1880.27 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:22:42,375 epoch 9 - iter 216/275 - loss 0.01531936 - time (sec): 9.50 - samples/sec: 1902.35 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:22:43,576 epoch 9 - iter 243/275 - loss 0.01864379 - time (sec): 10.70 - samples/sec: 1890.37 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:22:44,753 epoch 9 - iter 270/275 - loss 0.01843708 - time (sec): 11.88 - samples/sec: 1881.34 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:22:44,977 ----------------------------------------------------------------------------------------------------
2023-10-13 08:22:44,977 EPOCH 9 done: loss 0.0183 - lr: 0.000003
2023-10-13 08:22:45,730 DEV : loss 0.17246633768081665 - f1-score (micro avg)  0.8722
2023-10-13 08:22:45,737 ----------------------------------------------------------------------------------------------------
2023-10-13 08:22:46,947 epoch 10 - iter 27/275 - loss 0.01414494 - time (sec): 1.21 - samples/sec: 2068.84 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:22:48,127 epoch 10 - iter 54/275 - loss 0.01095108 - time (sec): 2.39 - samples/sec: 1867.40 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:22:49,301 epoch 10 - iter 81/275 - loss 0.01556087 - time (sec): 3.56 - samples/sec: 1893.80 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:22:50,474 epoch 10 - iter 108/275 - loss 0.01230299 - time (sec): 4.74 - samples/sec: 1881.63 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:22:51,688 epoch 10 - iter 135/275 - loss 0.01331860 - time (sec): 5.95 - samples/sec: 1827.94 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:22:52,920 epoch 10 - iter 162/275 - loss 0.01446570 - time (sec): 7.18 - samples/sec: 1848.59 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:22:54,175 epoch 10 - iter 189/275 - loss 0.01385998 - time (sec): 8.44 - samples/sec: 1877.21 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:22:55,340 epoch 10 - iter 216/275 - loss 0.01303071 - time (sec): 9.60 - samples/sec: 1865.87 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:22:56,501 epoch 10 - iter 243/275 - loss 0.01567887 - time (sec): 10.76 - samples/sec: 1870.12 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:22:57,702 epoch 10 - iter 270/275 - loss 0.01751613 - time (sec): 11.96 - samples/sec: 1859.69 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:22:57,932 ----------------------------------------------------------------------------------------------------
2023-10-13 08:22:57,932 EPOCH 10 done: loss 0.0174 - lr: 0.000000
2023-10-13 08:22:58,642 DEV : loss 0.18003800511360168 - f1-score (micro avg)  0.8653
2023-10-13 08:22:58,974 ----------------------------------------------------------------------------------------------------
2023-10-13 08:22:58,975 Loading model from best epoch ...
2023-10-13 08:23:00,431 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 08:23:01,438 
Results:
- F-score (micro) 0.9176
- F-score (macro) 0.6322
- Accuracy 0.854

By class:
              precision    recall  f1-score   support

       scope     0.8989    0.9091    0.9040       176
        pers     0.9690    0.9766    0.9728       128
        work     0.8904    0.8784    0.8844        74
         loc     0.3333    0.5000    0.4000         2
      object     0.0000    0.0000    0.0000         2

   micro avg     0.9164    0.9188    0.9176       382
   macro avg     0.6183    0.6528    0.6322       382
weighted avg     0.9131    0.9188    0.9158       382

2023-10-13 08:23:01,438 ----------------------------------------------------------------------------------------------------
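The lr values in the iteration lines above follow the LinearScheduler plugin with warmup_fraction 0.1: at 275 iterations per epoch over 10 epochs there are 2750 steps, so the rate ramps linearly up to 3e-05 during the first epoch and then decays linearly to zero. A minimal sketch of that schedule, assuming the plain warmup-then-decay form (the function name and rounding are illustrative, not Flair's internal API):

```python
def linear_warmup_lr(step: int,
                     base_lr: float = 3e-05,
                     total_steps: int = 2750,      # 275 iters/epoch * 10 epochs
                     warmup_fraction: float = 0.1) -> float:
    """Linear warmup to base_lr, then linear decay to zero (illustrative)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 275 = one epoch here
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Spot checks against the logged values (six decimals, as in the log):
print(round(linear_warmup_lr(27), 6))    # epoch 1, iter 27   -> 3e-06   (0.000003)
print(round(linear_warmup_lr(270), 6))   # epoch 1, iter 270  -> 2.9e-05 (0.000029)
print(round(linear_warmup_lr(2745), 6))  # epoch 10, iter 270 -> 0.0     (0.000000)
```

The peak at the very start of epoch 2 (lr 0.000030) matches this shape: warmup ends exactly at step 275.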
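The 25-tag dictionary the tagger predicts is a BIOES scheme: each of the six entity types (scope, pers, work, loc, object, date) gets S- (single), B- (begin), I- (inside) and E- (end) tags, plus O, giving 6 x 4 + 1 = 25. A minimal decoder sketch for well-formed sequences (`decode_bioes` is a hypothetical helper for illustration; Flair's own decoding also copes with ill-formed tag sequences):

```python
def decode_bioes(tags):
    """Collect (label, start, end) token spans from a BIOES tag sequence."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                          # single-token entity
            spans.append((label, i, i))
            start = None
        elif prefix == "B":                        # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((label, start, i))
            start = None
        # "I" continues an open span; nothing to record yet
    return spans

print(decode_bioes(["O", "S-pers", "B-scope", "I-scope", "E-scope", "O"]))
# -> [('pers', 1, 1), ('scope', 2, 4)]
```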
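The gap between the final F-score (micro) 0.9176 and F-score (macro) 0.6322 is driven by the two rare classes (loc and object, 2 gold spans each): macro-F1 averages the per-class F1 scores unweighted, while micro-F1 pools all spans. Both figures can be reproduced from the "By class" table (values copied from the log; only the arithmetic is illustrative):

```python
# (precision, recall, f1, support) rows copied from the "By class" table.
per_class = {
    "scope":  (0.8989, 0.9091, 0.9040, 176),
    "pers":   (0.9690, 0.9766, 0.9728, 128),
    "work":   (0.8904, 0.8784, 0.8844,  74),
    "loc":    (0.3333, 0.5000, 0.4000,   2),
    "object": (0.0000, 0.0000, 0.0000,   2),
}

# Macro F1: unweighted mean of per-class F1 -- one bad rare class drags it down.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Micro F1: harmonic mean of the pooled precision/recall (the "micro avg" row).
micro_p, micro_r = 0.9164, 0.9188
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4), round(micro_f1, 4))  # -> 0.6322 0.9176
```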