Upload folder using huggingface_hub

Files changed:
- best-model.pt (+3)
- dev.tsv
- loss.tsv (+11)
- test.tsv
- training.log (+239)
best-model.pt (ADDED, Git LFS pointer)

version https://git-lfs.github.com/spec/v1
oid sha256:077671538fbfa0f4467afdad38940bf3bfd423ea61b0420f5f2e64336e64c5e9
size 443311175
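Because best-model.pt is stored via Git LFS, the repository itself holds only the pointer above; the actual 443 MB checkpoint is fetched separately. A downloaded copy can be checked against the pointer's sha256 oid and size with just the standard library (a minimal sketch; the file path is hypothetical):

```python
import hashlib
from pathlib import Path

def verify_lfs_pointer(path: str, expected_oid: str, expected_size: int) -> bool:
    """Check a downloaded file against a Git LFS pointer's sha256 oid and byte size."""
    data = Path(path).read_bytes()
    if len(data) != expected_size:
        return False
    return hashlib.sha256(data).hexdigest() == expected_oid
```

For the checkpoint in this commit, the expected oid and size are the two values from the pointer file.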
dev.tsv (ADDED)

The diff for this file is too large to render; see the raw file in the repository.
loss.tsv (ADDED)

EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
1      22:39:07   0.0000         0.3741      0.0563    0.7348         0.7131      0.7238  0.5930
2      22:39:55   0.0000         0.0770      0.0543    0.7510         0.8017      0.7755  0.6463
3      22:40:42   0.0000         0.0512      0.0629    0.7530         0.7975      0.7746  0.6451
4      22:41:29   0.0000         0.0331      0.0667    0.7300         0.8101      0.7680  0.6379
5      22:42:16   0.0000         0.0211      0.0905    0.7606         0.8312      0.7944  0.6701
6      22:43:04   0.0000         0.0178      0.0954    0.7164         0.8312      0.7695  0.6417
7      22:43:52   0.0000         0.0115      0.1044    0.7490         0.8312      0.7880  0.6655
8      22:44:40   0.0000         0.0076      0.1056    0.7568         0.8270      0.7903  0.6689
9      22:45:27   0.0000         0.0056      0.1088    0.7984         0.8186      0.8083  0.6929
10     22:46:15   0.0000         0.0040      0.1119    0.7760         0.8186      0.7967  0.6760
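The DEV_F1 column peaks at epoch 9 (0.8083), which is why training.log records its last "saving best model" there. A minimal sketch for selecting the best epoch from a loss.tsv like this one; to stay self-contained it embeds just the EPOCH and DEV_F1 columns rather than reading the file:

```python
import csv
import io

# Excerpt of loss.tsv (EPOCH and DEV_F1 only); the real file is tab-separated
# and also carries TRAIN_LOSS, DEV_LOSS, precision, recall and accuracy columns.
rows = """EPOCH\tDEV_F1
1\t0.7238
2\t0.7755
3\t0.7746
4\t0.7680
5\t0.7944
6\t0.7695
7\t0.7880
8\t0.7903
9\t0.8083
10\t0.7967"""

reader = csv.DictReader(io.StringIO(rows), delimiter="\t")
best = max(reader, key=lambda r: float(r["DEV_F1"]))  # epoch with highest dev F1
```

The same pattern applied to the full file would reproduce Flair's best-model selection on the dev micro-F1.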
test.tsv (ADDED)

The diff for this file is too large to render; see the raw file in the repository.
training.log (ADDED)
2023-10-16 22:38:20,773 ----------------------------------------------------------------------------------------------------
2023-10-16 22:38:20,774 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
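The 443,311,175-byte checkpoint is consistent with this architecture: summing the module sizes printed above gives roughly 110.6M parameters, or about 442.5 MB at 4 bytes per float32 weight, with the remainder plausibly checkpoint metadata such as the tag dictionary (an assumption; the exact serialized contents are not shown in the log). A sketch of the arithmetic:

```python
H, FF, LAYERS = 768, 3072, 12  # hidden size, feed-forward size, encoder depth

embeddings = 32001 * H + 512 * H + 2 * H + 2 * H       # word/pos/type embeddings + LayerNorm
attention = 3 * (H * H + H) + (H * H + H) + 2 * H      # q/k/v, self-output dense, LayerNorm
ffn = (H * FF + FF) + (FF * H + H) + 2 * H             # intermediate, output dense, LayerNorm
encoder = LAYERS * (attention + ffn)
pooler = H * H + H
head = H * 13 + 13                                     # linear tagger over the 13 tags

total = embeddings + encoder + pooler + head           # parameter count of the printed model
approx_bytes = total * 4                               # float32 storage
```

This lands within about 0.8 MB of the LFS pointer's recorded size.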
2023-10-16 22:38:20,774 ----------------------------------------------------------------------------------------------------
2023-10-16 22:38:20,774 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-16 22:38:20,774 ----------------------------------------------------------------------------------------------------
2023-10-16 22:38:20,774 Train:  6183 sentences
2023-10-16 22:38:20,774         (train_with_dev=False, train_with_test=False)
2023-10-16 22:38:20,774 ----------------------------------------------------------------------------------------------------
2023-10-16 22:38:20,774 Training Params:
2023-10-16 22:38:20,774  - learning_rate: "3e-05"
2023-10-16 22:38:20,774  - mini_batch_size: "8"
2023-10-16 22:38:20,774  - max_epochs: "10"
2023-10-16 22:38:20,774  - shuffle: "True"
2023-10-16 22:38:20,774 ----------------------------------------------------------------------------------------------------
2023-10-16 22:38:20,774 Plugins:
2023-10-16 22:38:20,774  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 22:38:20,775 ----------------------------------------------------------------------------------------------------
2023-10-16 22:38:20,775 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 22:38:20,775  - metric: "('micro avg', 'f1-score')"
2023-10-16 22:38:20,775 ----------------------------------------------------------------------------------------------------
2023-10-16 22:38:20,775 Computation:
2023-10-16 22:38:20,775  - compute on device: cuda:0
2023-10-16 22:38:20,775  - embedding storage: none
2023-10-16 22:38:20,775 ----------------------------------------------------------------------------------------------------
2023-10-16 22:38:20,775 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 22:38:20,775 ----------------------------------------------------------------------------------------------------
2023-10-16 22:38:20,775 ----------------------------------------------------------------------------------------------------
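The lr column in the per-iteration log entries follows from the LinearScheduler with warmup_fraction 0.1: 773 mini-batches per epoch over 10 epochs gives 7730 steps, with linear warmup to the 3e-05 peak over the first 773 steps and linear decay to zero afterwards. A sketch of that schedule, assuming the standard warmup-then-linear-decay formula (not Flair's own code):

```python
def linear_schedule(step: int, total_steps: int, peak_lr: float, warmup_fraction: float) -> float:
    """Linear warmup to peak_lr over the warmup steps, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

TOTAL = 773 * 10   # 773 mini-batches per epoch, 10 epochs
PEAK = 3e-05
```

This reproduces the logged values: roughly 0.000003 at iteration 77 of epoch 1, 0.000030 at the end of epoch 1, and 0.000000 at the final step.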
2023-10-16 22:38:25,471 epoch 1 - iter 77/773 - loss 2.20232412 - time (sec): 4.70 - samples/sec: 2784.44 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:38:30,116 epoch 1 - iter 154/773 - loss 1.34548593 - time (sec): 9.34 - samples/sec: 2692.01 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:38:34,672 epoch 1 - iter 231/773 - loss 0.96047681 - time (sec): 13.90 - samples/sec: 2750.73 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:38:38,896 epoch 1 - iter 308/773 - loss 0.76598464 - time (sec): 18.12 - samples/sec: 2776.90 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:38:43,376 epoch 1 - iter 385/773 - loss 0.63985618 - time (sec): 22.60 - samples/sec: 2772.49 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:38:47,807 epoch 1 - iter 462/773 - loss 0.55707246 - time (sec): 27.03 - samples/sec: 2750.85 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:38:52,214 epoch 1 - iter 539/773 - loss 0.49281757 - time (sec): 31.44 - samples/sec: 2756.04 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:38:56,475 epoch 1 - iter 616/773 - loss 0.44355349 - time (sec): 35.70 - samples/sec: 2772.08 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:39:01,048 epoch 1 - iter 693/773 - loss 0.40565474 - time (sec): 40.27 - samples/sec: 2770.12 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:39:05,427 epoch 1 - iter 770/773 - loss 0.37504677 - time (sec): 44.65 - samples/sec: 2774.67 - lr: 0.000030 - momentum: 0.000000
2023-10-16 22:39:05,579 ----------------------------------------------------------------------------------------------------
2023-10-16 22:39:05,579 EPOCH 1 done: loss 0.3741 - lr: 0.000030
2023-10-16 22:39:07,318 DEV : loss 0.056319333612918854 - f1-score (micro avg) 0.7238
2023-10-16 22:39:07,330 saving best model
2023-10-16 22:39:07,659 ----------------------------------------------------------------------------------------------------
2023-10-16 22:39:12,043 epoch 2 - iter 77/773 - loss 0.09617103 - time (sec): 4.38 - samples/sec: 2800.90 - lr: 0.000030 - momentum: 0.000000
2023-10-16 22:39:16,570 epoch 2 - iter 154/773 - loss 0.08610244 - time (sec): 8.91 - samples/sec: 2855.09 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:39:20,960 epoch 2 - iter 231/773 - loss 0.08642226 - time (sec): 13.30 - samples/sec: 2784.34 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:39:25,501 epoch 2 - iter 308/773 - loss 0.08300522 - time (sec): 17.84 - samples/sec: 2766.83 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:39:29,894 epoch 2 - iter 385/773 - loss 0.08529708 - time (sec): 22.23 - samples/sec: 2755.25 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:39:34,743 epoch 2 - iter 462/773 - loss 0.07999665 - time (sec): 27.08 - samples/sec: 2760.78 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:39:39,180 epoch 2 - iter 539/773 - loss 0.07886580 - time (sec): 31.52 - samples/sec: 2754.08 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:39:43,779 epoch 2 - iter 616/773 - loss 0.07954784 - time (sec): 36.12 - samples/sec: 2737.93 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:39:48,076 epoch 2 - iter 693/773 - loss 0.07809064 - time (sec): 40.42 - samples/sec: 2743.57 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:39:52,684 epoch 2 - iter 770/773 - loss 0.07720281 - time (sec): 45.02 - samples/sec: 2751.19 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:39:52,845 ----------------------------------------------------------------------------------------------------
2023-10-16 22:39:52,845 EPOCH 2 done: loss 0.0770 - lr: 0.000027
2023-10-16 22:39:55,157 DEV : loss 0.054309092462062836 - f1-score (micro avg) 0.7755
2023-10-16 22:39:55,169 saving best model
2023-10-16 22:39:55,568 ----------------------------------------------------------------------------------------------------
2023-10-16 22:39:59,920 epoch 3 - iter 77/773 - loss 0.05351069 - time (sec): 4.35 - samples/sec: 2893.11 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:40:04,475 epoch 3 - iter 154/773 - loss 0.04936663 - time (sec): 8.90 - samples/sec: 2813.62 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:40:08,874 epoch 3 - iter 231/773 - loss 0.04838463 - time (sec): 13.30 - samples/sec: 2802.51 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:40:13,388 epoch 3 - iter 308/773 - loss 0.04961579 - time (sec): 17.82 - samples/sec: 2763.32 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:40:17,767 epoch 3 - iter 385/773 - loss 0.04991997 - time (sec): 22.20 - samples/sec: 2743.43 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:40:22,119 epoch 3 - iter 462/773 - loss 0.05042490 - time (sec): 26.55 - samples/sec: 2730.63 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:40:26,594 epoch 3 - iter 539/773 - loss 0.05214182 - time (sec): 31.02 - samples/sec: 2735.69 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:40:31,138 epoch 3 - iter 616/773 - loss 0.05156897 - time (sec): 35.57 - samples/sec: 2743.50 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:40:35,704 epoch 3 - iter 693/773 - loss 0.05185757 - time (sec): 40.13 - samples/sec: 2748.55 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:40:40,464 epoch 3 - iter 770/773 - loss 0.05129452 - time (sec): 44.89 - samples/sec: 2758.78 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:40:40,621 ----------------------------------------------------------------------------------------------------
2023-10-16 22:40:40,621 EPOCH 3 done: loss 0.0512 - lr: 0.000023
2023-10-16 22:40:42,689 DEV : loss 0.06285982578992844 - f1-score (micro avg) 0.7746
2023-10-16 22:40:42,701 ----------------------------------------------------------------------------------------------------
2023-10-16 22:40:47,094 epoch 4 - iter 77/773 - loss 0.02714792 - time (sec): 4.39 - samples/sec: 2681.65 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:40:51,754 epoch 4 - iter 154/773 - loss 0.03209934 - time (sec): 9.05 - samples/sec: 2795.02 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:40:56,059 epoch 4 - iter 231/773 - loss 0.03306353 - time (sec): 13.36 - samples/sec: 2773.05 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:41:00,497 epoch 4 - iter 308/773 - loss 0.03181679 - time (sec): 17.80 - samples/sec: 2755.37 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:41:05,077 epoch 4 - iter 385/773 - loss 0.03093616 - time (sec): 22.37 - samples/sec: 2771.77 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:41:09,714 epoch 4 - iter 462/773 - loss 0.03052049 - time (sec): 27.01 - samples/sec: 2781.54 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:41:14,036 epoch 4 - iter 539/773 - loss 0.03167868 - time (sec): 31.33 - samples/sec: 2757.72 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:41:18,685 epoch 4 - iter 616/773 - loss 0.03232620 - time (sec): 35.98 - samples/sec: 2757.92 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:41:23,024 epoch 4 - iter 693/773 - loss 0.03317880 - time (sec): 40.32 - samples/sec: 2757.58 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:41:27,622 epoch 4 - iter 770/773 - loss 0.03312722 - time (sec): 44.92 - samples/sec: 2758.94 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:41:27,780 ----------------------------------------------------------------------------------------------------
2023-10-16 22:41:27,780 EPOCH 4 done: loss 0.0331 - lr: 0.000020
2023-10-16 22:41:29,832 DEV : loss 0.06671957671642303 - f1-score (micro avg) 0.768
2023-10-16 22:41:29,844 ----------------------------------------------------------------------------------------------------
2023-10-16 22:41:34,447 epoch 5 - iter 77/773 - loss 0.02283526 - time (sec): 4.60 - samples/sec: 2670.48 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:41:39,025 epoch 5 - iter 154/773 - loss 0.01970168 - time (sec): 9.18 - samples/sec: 2735.63 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:41:43,613 epoch 5 - iter 231/773 - loss 0.02003167 - time (sec): 13.77 - samples/sec: 2723.64 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:41:48,304 epoch 5 - iter 308/773 - loss 0.01954257 - time (sec): 18.46 - samples/sec: 2758.00 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:41:52,860 epoch 5 - iter 385/773 - loss 0.02257951 - time (sec): 23.01 - samples/sec: 2759.60 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:41:57,270 epoch 5 - iter 462/773 - loss 0.02218939 - time (sec): 27.42 - samples/sec: 2773.32 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:42:01,832 epoch 5 - iter 539/773 - loss 0.02119526 - time (sec): 31.99 - samples/sec: 2785.25 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:42:06,209 epoch 5 - iter 616/773 - loss 0.02082536 - time (sec): 36.36 - samples/sec: 2777.63 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:42:10,432 epoch 5 - iter 693/773 - loss 0.02135291 - time (sec): 40.59 - samples/sec: 2772.43 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:42:14,647 epoch 5 - iter 770/773 - loss 0.02117237 - time (sec): 44.80 - samples/sec: 2767.16 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:42:14,799 ----------------------------------------------------------------------------------------------------
2023-10-16 22:42:14,799 EPOCH 5 done: loss 0.0211 - lr: 0.000017
2023-10-16 22:42:16,911 DEV : loss 0.09046085923910141 - f1-score (micro avg) 0.7944
2023-10-16 22:42:16,923 saving best model
2023-10-16 22:42:17,382 ----------------------------------------------------------------------------------------------------
2023-10-16 22:42:21,963 epoch 6 - iter 77/773 - loss 0.01026234 - time (sec): 4.58 - samples/sec: 2677.11 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:42:26,622 epoch 6 - iter 154/773 - loss 0.01400531 - time (sec): 9.24 - samples/sec: 2663.86 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:42:31,099 epoch 6 - iter 231/773 - loss 0.01813924 - time (sec): 13.71 - samples/sec: 2663.41 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:42:35,639 epoch 6 - iter 308/773 - loss 0.01889707 - time (sec): 18.25 - samples/sec: 2703.15 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:42:40,002 epoch 6 - iter 385/773 - loss 0.01861180 - time (sec): 22.62 - samples/sec: 2730.77 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:42:44,737 epoch 6 - iter 462/773 - loss 0.01792516 - time (sec): 27.35 - samples/sec: 2747.07 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:42:49,244 epoch 6 - iter 539/773 - loss 0.01740097 - time (sec): 31.86 - samples/sec: 2732.59 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:42:53,665 epoch 6 - iter 616/773 - loss 0.01803069 - time (sec): 36.28 - samples/sec: 2725.53 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:42:58,153 epoch 6 - iter 693/773 - loss 0.01729439 - time (sec): 40.77 - samples/sec: 2724.07 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:43:02,751 epoch 6 - iter 770/773 - loss 0.01779138 - time (sec): 45.36 - samples/sec: 2732.31 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:43:02,910 ----------------------------------------------------------------------------------------------------
2023-10-16 22:43:02,910 EPOCH 6 done: loss 0.0178 - lr: 0.000013
2023-10-16 22:43:04,962 DEV : loss 0.09538255631923676 - f1-score (micro avg) 0.7695
2023-10-16 22:43:04,975 ----------------------------------------------------------------------------------------------------
2023-10-16 22:43:09,443 epoch 7 - iter 77/773 - loss 0.00643346 - time (sec): 4.47 - samples/sec: 2694.74 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:43:13,972 epoch 7 - iter 154/773 - loss 0.01003396 - time (sec): 9.00 - samples/sec: 2657.19 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:43:18,760 epoch 7 - iter 231/773 - loss 0.01041150 - time (sec): 13.78 - samples/sec: 2638.74 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:43:23,225 epoch 7 - iter 308/773 - loss 0.01290931 - time (sec): 18.25 - samples/sec: 2663.58 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:43:27,776 epoch 7 - iter 385/773 - loss 0.01397438 - time (sec): 22.80 - samples/sec: 2692.33 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:43:32,358 epoch 7 - iter 462/773 - loss 0.01329284 - time (sec): 27.38 - samples/sec: 2715.59 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:43:36,906 epoch 7 - iter 539/773 - loss 0.01261849 - time (sec): 31.93 - samples/sec: 2722.96 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:43:41,592 epoch 7 - iter 616/773 - loss 0.01224990 - time (sec): 36.62 - samples/sec: 2704.51 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:43:45,932 epoch 7 - iter 693/773 - loss 0.01208302 - time (sec): 40.96 - samples/sec: 2717.98 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:43:50,481 epoch 7 - iter 770/773 - loss 0.01149147 - time (sec): 45.51 - samples/sec: 2724.45 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:43:50,635 ----------------------------------------------------------------------------------------------------
2023-10-16 22:43:50,635 EPOCH 7 done: loss 0.0115 - lr: 0.000010
2023-10-16 22:43:52,654 DEV : loss 0.1044364646077156 - f1-score (micro avg) 0.788
2023-10-16 22:43:52,666 ----------------------------------------------------------------------------------------------------
2023-10-16 22:43:56,937 epoch 8 - iter 77/773 - loss 0.00328818 - time (sec): 4.27 - samples/sec: 2666.00 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:44:01,720 epoch 8 - iter 154/773 - loss 0.00544868 - time (sec): 9.05 - samples/sec: 2733.65 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:44:06,477 epoch 8 - iter 231/773 - loss 0.00555489 - time (sec): 13.81 - samples/sec: 2740.46 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:44:11,283 epoch 8 - iter 308/773 - loss 0.00511882 - time (sec): 18.62 - samples/sec: 2740.44 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:44:15,741 epoch 8 - iter 385/773 - loss 0.00595049 - time (sec): 23.07 - samples/sec: 2733.45 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:44:20,022 epoch 8 - iter 462/773 - loss 0.00656918 - time (sec): 27.35 - samples/sec: 2731.09 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:44:24,368 epoch 8 - iter 539/773 - loss 0.00680503 - time (sec): 31.70 - samples/sec: 2747.97 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:44:29,123 epoch 8 - iter 616/773 - loss 0.00713075 - time (sec): 36.46 - samples/sec: 2735.58 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:44:33,766 epoch 8 - iter 693/773 - loss 0.00795647 - time (sec): 41.10 - samples/sec: 2729.44 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:44:38,196 epoch 8 - iter 770/773 - loss 0.00759391 - time (sec): 45.53 - samples/sec: 2718.37 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:44:38,370 ----------------------------------------------------------------------------------------------------
2023-10-16 22:44:38,370 EPOCH 8 done: loss 0.0076 - lr: 0.000007
2023-10-16 22:44:40,510 DEV : loss 0.10555334389209747 - f1-score (micro avg) 0.7903
2023-10-16 22:44:40,524 ----------------------------------------------------------------------------------------------------
2023-10-16 22:44:45,138 epoch 9 - iter 77/773 - loss 0.00640030 - time (sec): 4.61 - samples/sec: 2548.91 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:44:49,781 epoch 9 - iter 154/773 - loss 0.00442726 - time (sec): 9.26 - samples/sec: 2569.05 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:44:54,449 epoch 9 - iter 231/773 - loss 0.00457001 - time (sec): 13.92 - samples/sec: 2670.36 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:44:58,987 epoch 9 - iter 308/773 - loss 0.00461721 - time (sec): 18.46 - samples/sec: 2662.10 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:45:03,559 epoch 9 - iter 385/773 - loss 0.00507112 - time (sec): 23.03 - samples/sec: 2700.71 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:45:07,877 epoch 9 - iter 462/773 - loss 0.00480843 - time (sec): 27.35 - samples/sec: 2716.12 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:45:12,316 epoch 9 - iter 539/773 - loss 0.00451261 - time (sec): 31.79 - samples/sec: 2737.14 - lr: 0.000004 - momentum: 0.000000
2023-10-16 22:45:16,589 epoch 9 - iter 616/773 - loss 0.00542387 - time (sec): 36.06 - samples/sec: 2745.31 - lr: 0.000004 - momentum: 0.000000
2023-10-16 22:45:20,864 epoch 9 - iter 693/773 - loss 0.00545535 - time (sec): 40.34 - samples/sec: 2756.94 - lr: 0.000004 - momentum: 0.000000
2023-10-16 22:45:25,445 epoch 9 - iter 770/773 - loss 0.00559464 - time (sec): 44.92 - samples/sec: 2759.75 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:45:25,595 ----------------------------------------------------------------------------------------------------
2023-10-16 22:45:25,595 EPOCH 9 done: loss 0.0056 - lr: 0.000003
2023-10-16 22:45:27,729 DEV : loss 0.10876341164112091 - f1-score (micro avg) 0.8083
2023-10-16 22:45:27,742 saving best model
2023-10-16 22:45:28,250 ----------------------------------------------------------------------------------------------------
2023-10-16 22:45:32,710 epoch 10 - iter 77/773 - loss 0.00084844 - time (sec): 4.46 - samples/sec: 2738.56 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:45:37,149 epoch 10 - iter 154/773 - loss 0.00313889 - time (sec): 8.90 - samples/sec: 2793.18 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:45:41,590 epoch 10 - iter 231/773 - loss 0.00408552 - time (sec): 13.34 - samples/sec: 2788.43 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:45:46,096 epoch 10 - iter 308/773 - loss 0.00423339 - time (sec): 17.84 - samples/sec: 2786.35 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:45:50,417 epoch 10 - iter 385/773 - loss 0.00406239 - time (sec): 22.17 - samples/sec: 2807.87 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:45:54,922 epoch 10 - iter 462/773 - loss 0.00398725 - time (sec): 26.67 - samples/sec: 2806.42 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:45:59,453 epoch 10 - iter 539/773 - loss 0.00414480 - time (sec): 31.20 - samples/sec: 2786.22 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:46:03,820 epoch 10 - iter 616/773 - loss 0.00400871 - time (sec): 35.57 - samples/sec: 2793.71 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:46:08,284 epoch 10 - iter 693/773 - loss 0.00394248 - time (sec): 40.03 - samples/sec: 2787.36 - lr: 0.000000 - momentum: 0.000000
2023-10-16 22:46:12,795 epoch 10 - iter 770/773 - loss 0.00399570 - time (sec): 44.54 - samples/sec: 2782.73 - lr: 0.000000 - momentum: 0.000000
2023-10-16 22:46:12,940 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:12,940 EPOCH 10 done: loss 0.0040 - lr: 0.000000
2023-10-16 22:46:15,389 DEV : loss 0.11186421662569046 - f1-score (micro avg) 0.7967
2023-10-16 22:46:15,737 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:15,738 Loading model from best epoch ...
2023-10-16 22:46:17,241 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-16 22:46:23,470
Results:
- F-score (micro) 0.8034
- F-score (macro) 0.7007
- Accuracy 0.6925

By class:
              precision    recall  f1-score   support

         LOC     0.8494    0.8584    0.8538       946
    BUILDING     0.6101    0.5243    0.5640       185
      STREET     0.6724    0.6964    0.6842        56

   micro avg     0.8082    0.7987    0.8034      1187
   macro avg     0.7106    0.6930    0.7007      1187
weighted avg     0.8037    0.7987    0.8007      1187

2023-10-16 22:46:23,471 ----------------------------------------------------------------------------------------------------
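The 13-tag dictionary in the final evaluation uses the BIOES scheme (Begin, Inside, O, End, Single) over the three entity types LOC, BUILDING and STREET. A minimal decoder sketch for turning such a tag sequence into entity spans; this is an illustration of the scheme, not Flair's own decoding code:

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_inclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":          # single-token entity
            spans.append((label, i, i))
            start = None
        elif prefix == "B":        # open a multi-token entity
            start = i
        elif prefix == "E" and start is not None:  # close it
            spans.append((label, start, i))
            start = None
    return spans
```

For example, the sequence S-LOC, O, B-STREET, I-STREET, E-STREET decodes to one LOC span and one STREET span, which is the unit the precision/recall/F1 table above is computed over.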