stefan-it committed
Commit ecddb45 · 1 Parent(s): 53671fa

Upload folder using huggingface_hub
Files changed (5):
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +239 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:077671538fbfa0f4467afdad38940bf3bfd423ea61b0420f5f2e64336e64c5e9
+ size 443311175
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+ 1      22:39:07   0.0000         0.3741      0.0563    0.7348         0.7131      0.7238  0.5930
+ 2      22:39:55   0.0000         0.0770      0.0543    0.7510         0.8017      0.7755  0.6463
+ 3      22:40:42   0.0000         0.0512      0.0629    0.7530         0.7975      0.7746  0.6451
+ 4      22:41:29   0.0000         0.0331      0.0667    0.7300         0.8101      0.7680  0.6379
+ 5      22:42:16   0.0000         0.0211      0.0905    0.7606         0.8312      0.7944  0.6701
+ 6      22:43:04   0.0000         0.0178      0.0954    0.7164         0.8312      0.7695  0.6417
+ 7      22:43:52   0.0000         0.0115      0.1044    0.7490         0.8312      0.7880  0.6655
+ 8      22:44:40   0.0000         0.0076      0.1056    0.7568         0.8270      0.7903  0.6689
+ 9      22:45:27   0.0000         0.0056      0.1088    0.7984         0.8186      0.8083  0.6929
+ 10     22:46:15   0.0000         0.0040      0.1119    0.7760         0.8186      0.7967  0.6760
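For reference, the best checkpoint can be located from a loss.tsv like the one above by scanning the DEV_F1 column. A minimal stdlib-only sketch (the rows are pasted inline here for self-containment; the real file is tab-separated, which `split()` also handles):

```python
# Pick the epoch with the highest DEV_F1 from loss.tsv-style rows.
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 22:39:07 0.0000 0.3741 0.0563 0.7348 0.7131 0.7238 0.5930
2 22:39:55 0.0000 0.0770 0.0543 0.7510 0.8017 0.7755 0.6463
3 22:40:42 0.0000 0.0512 0.0629 0.7530 0.7975 0.7746 0.6451
4 22:41:29 0.0000 0.0331 0.0667 0.7300 0.8101 0.7680 0.6379
5 22:42:16 0.0000 0.0211 0.0905 0.7606 0.8312 0.7944 0.6701
6 22:43:04 0.0000 0.0178 0.0954 0.7164 0.8312 0.7695 0.6417
7 22:43:52 0.0000 0.0115 0.1044 0.7490 0.8312 0.7880 0.6655
8 22:44:40 0.0000 0.0076 0.1056 0.7568 0.8270 0.7903 0.6689
9 22:45:27 0.0000 0.0056 0.1088 0.7984 0.8186 0.8083 0.6929
10 22:46:15 0.0000 0.0040 0.1119 0.7760 0.8186 0.7967 0.6760
"""

def best_epoch(tsv_text: str) -> tuple[int, float]:
    """Return (epoch, dev_f1) of the row with the highest DEV_F1."""
    header, *rows = [line.split() for line in tsv_text.strip().splitlines()]
    f1_col = header.index("DEV_F1")
    best = max(rows, key=lambda row: float(row[f1_col]))
    return int(best[0]), float(best[f1_col])

print(best_epoch(LOSS_TSV))  # (9, 0.8083) -- epoch 9 is where best-model.pt was saved
```

This agrees with the training log, which last prints "saving best model" after the epoch 9 dev evaluation.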
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,239 @@
+ 2023-10-16 22:38:20,773 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:38:20,774 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-16 22:38:20,774 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:38:20,774 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
+  - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
+ 2023-10-16 22:38:20,774 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:38:20,774 Train: 6183 sentences
+ 2023-10-16 22:38:20,774 (train_with_dev=False, train_with_test=False)
+ 2023-10-16 22:38:20,774 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:38:20,774 Training Params:
+ 2023-10-16 22:38:20,774 - learning_rate: "3e-05"
+ 2023-10-16 22:38:20,774 - mini_batch_size: "8"
+ 2023-10-16 22:38:20,774 - max_epochs: "10"
+ 2023-10-16 22:38:20,774 - shuffle: "True"
+ 2023-10-16 22:38:20,774 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:38:20,774 Plugins:
+ 2023-10-16 22:38:20,774 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-16 22:38:20,775 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:38:20,775 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-16 22:38:20,775 - metric: "('micro avg', 'f1-score')"
+ 2023-10-16 22:38:20,775 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:38:20,775 Computation:
+ 2023-10-16 22:38:20,775 - compute on device: cuda:0
+ 2023-10-16 22:38:20,775 - embedding storage: none
+ 2023-10-16 22:38:20,775 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:38:20,775 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
+ 2023-10-16 22:38:20,775 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:38:20,775 ----------------------------------------------------------------------------------------------------
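The LinearScheduler plugin logged above warms the learning rate up linearly to the 3e-05 peak over the first 10% of steps, then decays it linearly to zero, which matches the lr values printed in the iteration lines that follow (e.g. 0.000003 at epoch 1, iter 77). A minimal sketch of that schedule, assuming 773 iterations per epoch × 10 epochs as in this log (not Flair's actual implementation):

```python
def linear_schedule_lr(step: int,
                       total_steps: int = 7730,      # 773 iters/epoch * 10 epochs
                       peak_lr: float = 3e-05,       # learning_rate from the params above
                       warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)  # 773 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(round(linear_schedule_lr(77), 6))    # 3e-06, as logged at epoch 1, iter 77
print(round(linear_schedule_lr(773), 6))   # 3e-05, the peak at the end of warmup
print(linear_schedule_lr(7730))            # 0.0 at the end of training
```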
+ 2023-10-16 22:38:25,471 epoch 1 - iter 77/773 - loss 2.20232412 - time (sec): 4.70 - samples/sec: 2784.44 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-16 22:38:30,116 epoch 1 - iter 154/773 - loss 1.34548593 - time (sec): 9.34 - samples/sec: 2692.01 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-16 22:38:34,672 epoch 1 - iter 231/773 - loss 0.96047681 - time (sec): 13.90 - samples/sec: 2750.73 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-16 22:38:38,896 epoch 1 - iter 308/773 - loss 0.76598464 - time (sec): 18.12 - samples/sec: 2776.90 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-16 22:38:43,376 epoch 1 - iter 385/773 - loss 0.63985618 - time (sec): 22.60 - samples/sec: 2772.49 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-16 22:38:47,807 epoch 1 - iter 462/773 - loss 0.55707246 - time (sec): 27.03 - samples/sec: 2750.85 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-16 22:38:52,214 epoch 1 - iter 539/773 - loss 0.49281757 - time (sec): 31.44 - samples/sec: 2756.04 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-16 22:38:56,475 epoch 1 - iter 616/773 - loss 0.44355349 - time (sec): 35.70 - samples/sec: 2772.08 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-16 22:39:01,048 epoch 1 - iter 693/773 - loss 0.40565474 - time (sec): 40.27 - samples/sec: 2770.12 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-16 22:39:05,427 epoch 1 - iter 770/773 - loss 0.37504677 - time (sec): 44.65 - samples/sec: 2774.67 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-16 22:39:05,579 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:39:05,579 EPOCH 1 done: loss 0.3741 - lr: 0.000030
+ 2023-10-16 22:39:07,318 DEV : loss 0.056319333612918854 - f1-score (micro avg) 0.7238
+ 2023-10-16 22:39:07,330 saving best model
+ 2023-10-16 22:39:07,659 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:39:12,043 epoch 2 - iter 77/773 - loss 0.09617103 - time (sec): 4.38 - samples/sec: 2800.90 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-16 22:39:16,570 epoch 2 - iter 154/773 - loss 0.08610244 - time (sec): 8.91 - samples/sec: 2855.09 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-16 22:39:20,960 epoch 2 - iter 231/773 - loss 0.08642226 - time (sec): 13.30 - samples/sec: 2784.34 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-16 22:39:25,501 epoch 2 - iter 308/773 - loss 0.08300522 - time (sec): 17.84 - samples/sec: 2766.83 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-16 22:39:29,894 epoch 2 - iter 385/773 - loss 0.08529708 - time (sec): 22.23 - samples/sec: 2755.25 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-16 22:39:34,743 epoch 2 - iter 462/773 - loss 0.07999665 - time (sec): 27.08 - samples/sec: 2760.78 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-16 22:39:39,180 epoch 2 - iter 539/773 - loss 0.07886580 - time (sec): 31.52 - samples/sec: 2754.08 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-16 22:39:43,779 epoch 2 - iter 616/773 - loss 0.07954784 - time (sec): 36.12 - samples/sec: 2737.93 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-16 22:39:48,076 epoch 2 - iter 693/773 - loss 0.07809064 - time (sec): 40.42 - samples/sec: 2743.57 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-16 22:39:52,684 epoch 2 - iter 770/773 - loss 0.07720281 - time (sec): 45.02 - samples/sec: 2751.19 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-16 22:39:52,845 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:39:52,845 EPOCH 2 done: loss 0.0770 - lr: 0.000027
+ 2023-10-16 22:39:55,157 DEV : loss 0.054309092462062836 - f1-score (micro avg) 0.7755
+ 2023-10-16 22:39:55,169 saving best model
+ 2023-10-16 22:39:55,568 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:39:59,920 epoch 3 - iter 77/773 - loss 0.05351069 - time (sec): 4.35 - samples/sec: 2893.11 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-16 22:40:04,475 epoch 3 - iter 154/773 - loss 0.04936663 - time (sec): 8.90 - samples/sec: 2813.62 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-16 22:40:08,874 epoch 3 - iter 231/773 - loss 0.04838463 - time (sec): 13.30 - samples/sec: 2802.51 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-16 22:40:13,388 epoch 3 - iter 308/773 - loss 0.04961579 - time (sec): 17.82 - samples/sec: 2763.32 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-16 22:40:17,767 epoch 3 - iter 385/773 - loss 0.04991997 - time (sec): 22.20 - samples/sec: 2743.43 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-16 22:40:22,119 epoch 3 - iter 462/773 - loss 0.05042490 - time (sec): 26.55 - samples/sec: 2730.63 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-16 22:40:26,594 epoch 3 - iter 539/773 - loss 0.05214182 - time (sec): 31.02 - samples/sec: 2735.69 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-16 22:40:31,138 epoch 3 - iter 616/773 - loss 0.05156897 - time (sec): 35.57 - samples/sec: 2743.50 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-16 22:40:35,704 epoch 3 - iter 693/773 - loss 0.05185757 - time (sec): 40.13 - samples/sec: 2748.55 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-16 22:40:40,464 epoch 3 - iter 770/773 - loss 0.05129452 - time (sec): 44.89 - samples/sec: 2758.78 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-16 22:40:40,621 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:40:40,621 EPOCH 3 done: loss 0.0512 - lr: 0.000023
+ 2023-10-16 22:40:42,689 DEV : loss 0.06285982578992844 - f1-score (micro avg) 0.7746
+ 2023-10-16 22:40:42,701 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:40:47,094 epoch 4 - iter 77/773 - loss 0.02714792 - time (sec): 4.39 - samples/sec: 2681.65 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-16 22:40:51,754 epoch 4 - iter 154/773 - loss 0.03209934 - time (sec): 9.05 - samples/sec: 2795.02 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-16 22:40:56,059 epoch 4 - iter 231/773 - loss 0.03306353 - time (sec): 13.36 - samples/sec: 2773.05 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-16 22:41:00,497 epoch 4 - iter 308/773 - loss 0.03181679 - time (sec): 17.80 - samples/sec: 2755.37 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-16 22:41:05,077 epoch 4 - iter 385/773 - loss 0.03093616 - time (sec): 22.37 - samples/sec: 2771.77 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-16 22:41:09,714 epoch 4 - iter 462/773 - loss 0.03052049 - time (sec): 27.01 - samples/sec: 2781.54 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-16 22:41:14,036 epoch 4 - iter 539/773 - loss 0.03167868 - time (sec): 31.33 - samples/sec: 2757.72 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-16 22:41:18,685 epoch 4 - iter 616/773 - loss 0.03232620 - time (sec): 35.98 - samples/sec: 2757.92 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-16 22:41:23,024 epoch 4 - iter 693/773 - loss 0.03317880 - time (sec): 40.32 - samples/sec: 2757.58 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-16 22:41:27,622 epoch 4 - iter 770/773 - loss 0.03312722 - time (sec): 44.92 - samples/sec: 2758.94 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-16 22:41:27,780 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:41:27,780 EPOCH 4 done: loss 0.0331 - lr: 0.000020
+ 2023-10-16 22:41:29,832 DEV : loss 0.06671957671642303 - f1-score (micro avg) 0.768
+ 2023-10-16 22:41:29,844 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:41:34,447 epoch 5 - iter 77/773 - loss 0.02283526 - time (sec): 4.60 - samples/sec: 2670.48 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-16 22:41:39,025 epoch 5 - iter 154/773 - loss 0.01970168 - time (sec): 9.18 - samples/sec: 2735.63 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-16 22:41:43,613 epoch 5 - iter 231/773 - loss 0.02003167 - time (sec): 13.77 - samples/sec: 2723.64 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-16 22:41:48,304 epoch 5 - iter 308/773 - loss 0.01954257 - time (sec): 18.46 - samples/sec: 2758.00 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-16 22:41:52,860 epoch 5 - iter 385/773 - loss 0.02257951 - time (sec): 23.01 - samples/sec: 2759.60 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-16 22:41:57,270 epoch 5 - iter 462/773 - loss 0.02218939 - time (sec): 27.42 - samples/sec: 2773.32 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-16 22:42:01,832 epoch 5 - iter 539/773 - loss 0.02119526 - time (sec): 31.99 - samples/sec: 2785.25 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-16 22:42:06,209 epoch 5 - iter 616/773 - loss 0.02082536 - time (sec): 36.36 - samples/sec: 2777.63 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-16 22:42:10,432 epoch 5 - iter 693/773 - loss 0.02135291 - time (sec): 40.59 - samples/sec: 2772.43 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-16 22:42:14,647 epoch 5 - iter 770/773 - loss 0.02117237 - time (sec): 44.80 - samples/sec: 2767.16 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-16 22:42:14,799 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:42:14,799 EPOCH 5 done: loss 0.0211 - lr: 0.000017
+ 2023-10-16 22:42:16,911 DEV : loss 0.09046085923910141 - f1-score (micro avg) 0.7944
+ 2023-10-16 22:42:16,923 saving best model
+ 2023-10-16 22:42:17,382 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:42:21,963 epoch 6 - iter 77/773 - loss 0.01026234 - time (sec): 4.58 - samples/sec: 2677.11 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-16 22:42:26,622 epoch 6 - iter 154/773 - loss 0.01400531 - time (sec): 9.24 - samples/sec: 2663.86 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-16 22:42:31,099 epoch 6 - iter 231/773 - loss 0.01813924 - time (sec): 13.71 - samples/sec: 2663.41 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-16 22:42:35,639 epoch 6 - iter 308/773 - loss 0.01889707 - time (sec): 18.25 - samples/sec: 2703.15 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-16 22:42:40,002 epoch 6 - iter 385/773 - loss 0.01861180 - time (sec): 22.62 - samples/sec: 2730.77 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-16 22:42:44,737 epoch 6 - iter 462/773 - loss 0.01792516 - time (sec): 27.35 - samples/sec: 2747.07 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-16 22:42:49,244 epoch 6 - iter 539/773 - loss 0.01740097 - time (sec): 31.86 - samples/sec: 2732.59 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-16 22:42:53,665 epoch 6 - iter 616/773 - loss 0.01803069 - time (sec): 36.28 - samples/sec: 2725.53 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-16 22:42:58,153 epoch 6 - iter 693/773 - loss 0.01729439 - time (sec): 40.77 - samples/sec: 2724.07 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-16 22:43:02,751 epoch 6 - iter 770/773 - loss 0.01779138 - time (sec): 45.36 - samples/sec: 2732.31 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-16 22:43:02,910 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:43:02,910 EPOCH 6 done: loss 0.0178 - lr: 0.000013
+ 2023-10-16 22:43:04,962 DEV : loss 0.09538255631923676 - f1-score (micro avg) 0.7695
+ 2023-10-16 22:43:04,975 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:43:09,443 epoch 7 - iter 77/773 - loss 0.00643346 - time (sec): 4.47 - samples/sec: 2694.74 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-16 22:43:13,972 epoch 7 - iter 154/773 - loss 0.01003396 - time (sec): 9.00 - samples/sec: 2657.19 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-16 22:43:18,760 epoch 7 - iter 231/773 - loss 0.01041150 - time (sec): 13.78 - samples/sec: 2638.74 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-16 22:43:23,225 epoch 7 - iter 308/773 - loss 0.01290931 - time (sec): 18.25 - samples/sec: 2663.58 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-16 22:43:27,776 epoch 7 - iter 385/773 - loss 0.01397438 - time (sec): 22.80 - samples/sec: 2692.33 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-16 22:43:32,358 epoch 7 - iter 462/773 - loss 0.01329284 - time (sec): 27.38 - samples/sec: 2715.59 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-16 22:43:36,906 epoch 7 - iter 539/773 - loss 0.01261849 - time (sec): 31.93 - samples/sec: 2722.96 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-16 22:43:41,592 epoch 7 - iter 616/773 - loss 0.01224990 - time (sec): 36.62 - samples/sec: 2704.51 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-16 22:43:45,932 epoch 7 - iter 693/773 - loss 0.01208302 - time (sec): 40.96 - samples/sec: 2717.98 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-16 22:43:50,481 epoch 7 - iter 770/773 - loss 0.01149147 - time (sec): 45.51 - samples/sec: 2724.45 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-16 22:43:50,635 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:43:50,635 EPOCH 7 done: loss 0.0115 - lr: 0.000010
+ 2023-10-16 22:43:52,654 DEV : loss 0.1044364646077156 - f1-score (micro avg) 0.788
+ 2023-10-16 22:43:52,666 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:43:56,937 epoch 8 - iter 77/773 - loss 0.00328818 - time (sec): 4.27 - samples/sec: 2666.00 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-16 22:44:01,720 epoch 8 - iter 154/773 - loss 0.00544868 - time (sec): 9.05 - samples/sec: 2733.65 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-16 22:44:06,477 epoch 8 - iter 231/773 - loss 0.00555489 - time (sec): 13.81 - samples/sec: 2740.46 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-16 22:44:11,283 epoch 8 - iter 308/773 - loss 0.00511882 - time (sec): 18.62 - samples/sec: 2740.44 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-16 22:44:15,741 epoch 8 - iter 385/773 - loss 0.00595049 - time (sec): 23.07 - samples/sec: 2733.45 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-16 22:44:20,022 epoch 8 - iter 462/773 - loss 0.00656918 - time (sec): 27.35 - samples/sec: 2731.09 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-16 22:44:24,368 epoch 8 - iter 539/773 - loss 0.00680503 - time (sec): 31.70 - samples/sec: 2747.97 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-16 22:44:29,123 epoch 8 - iter 616/773 - loss 0.00713075 - time (sec): 36.46 - samples/sec: 2735.58 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-16 22:44:33,766 epoch 8 - iter 693/773 - loss 0.00795647 - time (sec): 41.10 - samples/sec: 2729.44 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-16 22:44:38,196 epoch 8 - iter 770/773 - loss 0.00759391 - time (sec): 45.53 - samples/sec: 2718.37 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-16 22:44:38,370 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:44:38,370 EPOCH 8 done: loss 0.0076 - lr: 0.000007
+ 2023-10-16 22:44:40,510 DEV : loss 0.10555334389209747 - f1-score (micro avg) 0.7903
+ 2023-10-16 22:44:40,524 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:44:45,138 epoch 9 - iter 77/773 - loss 0.00640030 - time (sec): 4.61 - samples/sec: 2548.91 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-16 22:44:49,781 epoch 9 - iter 154/773 - loss 0.00442726 - time (sec): 9.26 - samples/sec: 2569.05 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-16 22:44:54,449 epoch 9 - iter 231/773 - loss 0.00457001 - time (sec): 13.92 - samples/sec: 2670.36 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-16 22:44:58,987 epoch 9 - iter 308/773 - loss 0.00461721 - time (sec): 18.46 - samples/sec: 2662.10 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-16 22:45:03,559 epoch 9 - iter 385/773 - loss 0.00507112 - time (sec): 23.03 - samples/sec: 2700.71 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-16 22:45:07,877 epoch 9 - iter 462/773 - loss 0.00480843 - time (sec): 27.35 - samples/sec: 2716.12 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-16 22:45:12,316 epoch 9 - iter 539/773 - loss 0.00451261 - time (sec): 31.79 - samples/sec: 2737.14 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-16 22:45:16,589 epoch 9 - iter 616/773 - loss 0.00542387 - time (sec): 36.06 - samples/sec: 2745.31 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-16 22:45:20,864 epoch 9 - iter 693/773 - loss 0.00545535 - time (sec): 40.34 - samples/sec: 2756.94 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-16 22:45:25,445 epoch 9 - iter 770/773 - loss 0.00559464 - time (sec): 44.92 - samples/sec: 2759.75 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-16 22:45:25,595 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:45:25,595 EPOCH 9 done: loss 0.0056 - lr: 0.000003
+ 2023-10-16 22:45:27,729 DEV : loss 0.10876341164112091 - f1-score (micro avg) 0.8083
+ 2023-10-16 22:45:27,742 saving best model
+ 2023-10-16 22:45:28,250 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:45:32,710 epoch 10 - iter 77/773 - loss 0.00084844 - time (sec): 4.46 - samples/sec: 2738.56 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-16 22:45:37,149 epoch 10 - iter 154/773 - loss 0.00313889 - time (sec): 8.90 - samples/sec: 2793.18 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-16 22:45:41,590 epoch 10 - iter 231/773 - loss 0.00408552 - time (sec): 13.34 - samples/sec: 2788.43 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-16 22:45:46,096 epoch 10 - iter 308/773 - loss 0.00423339 - time (sec): 17.84 - samples/sec: 2786.35 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-16 22:45:50,417 epoch 10 - iter 385/773 - loss 0.00406239 - time (sec): 22.17 - samples/sec: 2807.87 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-16 22:45:54,922 epoch 10 - iter 462/773 - loss 0.00398725 - time (sec): 26.67 - samples/sec: 2806.42 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-16 22:45:59,453 epoch 10 - iter 539/773 - loss 0.00414480 - time (sec): 31.20 - samples/sec: 2786.22 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-16 22:46:03,820 epoch 10 - iter 616/773 - loss 0.00400871 - time (sec): 35.57 - samples/sec: 2793.71 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-16 22:46:08,284 epoch 10 - iter 693/773 - loss 0.00394248 - time (sec): 40.03 - samples/sec: 2787.36 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-16 22:46:12,795 epoch 10 - iter 770/773 - loss 0.00399570 - time (sec): 44.54 - samples/sec: 2782.73 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-16 22:46:12,940 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:46:12,940 EPOCH 10 done: loss 0.0040 - lr: 0.000000
+ 2023-10-16 22:46:15,389 DEV : loss 0.11186421662569046 - f1-score (micro avg) 0.7967
+ 2023-10-16 22:46:15,737 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 22:46:15,738 Loading model from best epoch ...
+ 2023-10-16 22:46:17,241 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
+ 2023-10-16 22:46:23,470
+ Results:
+ - F-score (micro) 0.8034
+ - F-score (macro) 0.7007
+ - Accuracy 0.6925
+
+ By class:
+               precision    recall  f1-score   support
+
+          LOC     0.8494    0.8584    0.8538       946
+     BUILDING     0.6101    0.5243    0.5640       185
+       STREET     0.6724    0.6964    0.6842        56
+
+    micro avg     0.8082    0.7987    0.8034      1187
+    macro avg     0.7106    0.6930    0.7007      1187
+ weighted avg     0.8037    0.7987    0.8007      1187
+
+ 2023-10-16 22:46:23,471 ----------------------------------------------------------------------------------------------------
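As a quick consistency check, the averages in the final table can be recomputed from its rows: micro F1 is the harmonic mean of the reported micro precision and recall, and the weighted average weights each per-class F1 by its support. Small deviations come from the four-digit rounding in the log:

```python
# Per-class rows from the final evaluation above: (precision, recall, f1, support)
by_class = {
    "LOC":      (0.8494, 0.8584, 0.8538, 946),
    "BUILDING": (0.6101, 0.5243, 0.5640, 185),
    "STREET":   (0.6724, 0.6964, 0.6842, 56),
}

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

micro_f1 = f1(0.8082, 0.7987)  # micro-avg precision/recall from the table
total_support = sum(row[3] for row in by_class.values())
weighted_f1 = sum(row[2] * row[3] for row in by_class.values()) / total_support

print(round(micro_f1, 4))     # 0.8034, matching the reported F-score (micro)
print(round(weighted_f1, 4))  # 0.8006 vs 0.8007 reported (per-class f1 values are rounded)
```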