Commit a155999 by stefan-it · 1 Parent(s): c634f01

Upload ./training.log with huggingface_hub

Files changed (1)
  1. training.log +245 -0
training.log ADDED
@@ -0,0 +1,245 @@
+ 2023-10-27 15:57:04,764 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 15:57:04,765 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): XLMRobertaModel(
+       (embeddings): XLMRobertaEmbeddings(
+         (word_embeddings): Embedding(250003, 1024)
+         (position_embeddings): Embedding(514, 1024, padding_idx=1)
+         (token_type_embeddings): Embedding(1, 1024)
+         (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): XLMRobertaEncoder(
+         (layer): ModuleList(
+           (0-23): 24 x XLMRobertaLayer(
+             (attention): XLMRobertaAttention(
+               (self): XLMRobertaSelfAttention(
+                 (query): Linear(in_features=1024, out_features=1024, bias=True)
+                 (key): Linear(in_features=1024, out_features=1024, bias=True)
+                 (value): Linear(in_features=1024, out_features=1024, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): XLMRobertaSelfOutput(
+                 (dense): Linear(in_features=1024, out_features=1024, bias=True)
+                 (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): XLMRobertaIntermediate(
+               (dense): Linear(in_features=1024, out_features=4096, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): XLMRobertaOutput(
+               (dense): Linear(in_features=4096, out_features=1024, bias=True)
+               (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): XLMRobertaPooler(
+         (dense): Linear(in_features=1024, out_features=1024, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=1024, out_features=17, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-27 15:57:04,765 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 15:57:04,765 Corpus: 14903 train + 3449 dev + 3658 test sentences
+ 2023-10-27 15:57:04,765 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 15:57:04,765 Train: 14903 sentences
+ 2023-10-27 15:57:04,766 (train_with_dev=False, train_with_test=False)
+ 2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 15:57:04,766 Training Params:
+ 2023-10-27 15:57:04,766 - learning_rate: "5e-06"
+ 2023-10-27 15:57:04,766 - mini_batch_size: "4"
+ 2023-10-27 15:57:04,766 - max_epochs: "10"
+ 2023-10-27 15:57:04,766 - shuffle: "True"
+ 2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
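The `iter N/3726` counters in the epoch lines follow from the corpus size and batch size logged above. A minimal sketch of that arithmetic, assuming the final partial batch is kept rather than dropped (the helper name `batches_per_epoch` is hypothetical, not a Flair function):

```python
import math


def batches_per_epoch(num_sentences: int, mini_batch_size: int) -> int:
    """Number of optimizer steps in one epoch, keeping the last partial batch."""
    return math.ceil(num_sentences / mini_batch_size)


# Values taken from the log: 14903 training sentences, mini_batch_size 4, 10 epochs.
steps_per_epoch = batches_per_epoch(14903, 4)
total_steps = steps_per_epoch * 10
print(steps_per_epoch, total_steps)
```

With these inputs the per-epoch count matches the 3726 shown in the iteration lines.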
+ 2023-10-27 15:57:04,766 Plugins:
+ 2023-10-27 15:57:04,766 - TensorboardLogger
+ 2023-10-27 15:57:04,766 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
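The `lr:` column in the epoch lines rises during epoch 1 and then decays towards zero, which is the shape of a linear schedule with warmup. The sketch below is an illustration of that shape under stated assumptions, not Flair's own `LinearScheduler` code: it assumes 3726 steps per epoch, 10 epochs, a peak of 5e-06, and a warmup over the first 10% of all steps. The function name `linear_schedule_lr` is hypothetical.

```python
def linear_schedule_lr(
    step: int,
    peak_lr: float = 5e-06,
    steps_per_epoch: int = 3726,
    max_epochs: int = 10,
    warmup_fraction: float = 0.1,
) -> float:
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay."""
    total_steps = steps_per_epoch * max_epochs
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / warmup_steps
    # Decay from peak_lr down to 0 over the remaining steps.
    return peak_lr * max(0, total_steps - step) / (total_steps - warmup_steps)


# Print the rate at a few global steps in the same six-decimal format as the log.
for global_step in (372, 3726, 3 * 3726 + 2604, 3 * 3726 + 2976, 37260):
    print(global_step, f"{linear_schedule_lr(global_step):.6f}")
```

Under these assumptions the rounded values agree with the logged column, for example the change from 0.000004 to 0.000003 between iterations 2604 and 2976 of epoch 4.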
+ 2023-10-27 15:57:04,766 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-27 15:57:04,766 - metric: "('micro avg', 'f1-score')"
+ 2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 15:57:04,766 Computation:
+ 2023-10-27 15:57:04,766 - compute on device: cuda:0
+ 2023-10-27 15:57:04,766 - embedding storage: none
+ 2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 15:57:04,766 Model training base path: "flair-clean-conll-lr5e-06-bs4-2"
+ 2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 15:57:04,766 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-27 15:57:51,345 epoch 1 - iter 372/3726 - loss 3.66933019 - time (sec): 46.58 - samples/sec: 441.14 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-27 15:58:37,240 epoch 1 - iter 744/3726 - loss 2.44791196 - time (sec): 92.47 - samples/sec: 440.81 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 15:59:23,004 epoch 1 - iter 1116/3726 - loss 1.82180853 - time (sec): 138.24 - samples/sec: 444.18 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 16:00:08,910 epoch 1 - iter 1488/3726 - loss 1.46511605 - time (sec): 184.14 - samples/sec: 445.62 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:00:55,551 epoch 1 - iter 1860/3726 - loss 1.23020473 - time (sec): 230.78 - samples/sec: 444.20 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:01:41,835 epoch 1 - iter 2232/3726 - loss 1.05969433 - time (sec): 277.07 - samples/sec: 443.08 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:02:28,579 epoch 1 - iter 2604/3726 - loss 0.92870944 - time (sec): 323.81 - samples/sec: 443.41 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:03:15,307 epoch 1 - iter 2976/3726 - loss 0.83025530 - time (sec): 370.54 - samples/sec: 441.38 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:04:02,180 epoch 1 - iter 3348/3726 - loss 0.75373492 - time (sec): 417.41 - samples/sec: 439.59 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:04:49,217 epoch 1 - iter 3720/3726 - loss 0.68664292 - time (sec): 464.45 - samples/sec: 439.63 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-27 16:04:49,995 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:04:49,996 EPOCH 1 done: loss 0.6854 - lr: 0.000005
+ 2023-10-27 16:05:15,688 DEV : loss 0.06499314308166504 - f1-score (micro avg) 0.941
+ 2023-10-27 16:05:15,743 saving best model
+ 2023-10-27 16:05:17,851 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:06:05,511 epoch 2 - iter 372/3726 - loss 0.08608847 - time (sec): 47.66 - samples/sec: 436.63 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-27 16:06:53,421 epoch 2 - iter 744/3726 - loss 0.08159160 - time (sec): 95.57 - samples/sec: 433.86 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-27 16:07:40,883 epoch 2 - iter 1116/3726 - loss 0.08672812 - time (sec): 143.03 - samples/sec: 434.04 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-27 16:08:28,410 epoch 2 - iter 1488/3726 - loss 0.08683755 - time (sec): 190.56 - samples/sec: 432.29 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-27 16:09:15,037 epoch 2 - iter 1860/3726 - loss 0.08779187 - time (sec): 237.18 - samples/sec: 435.35 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-27 16:10:02,026 epoch 2 - iter 2232/3726 - loss 0.08712052 - time (sec): 284.17 - samples/sec: 434.32 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-27 16:10:48,962 epoch 2 - iter 2604/3726 - loss 0.08526279 - time (sec): 331.11 - samples/sec: 434.61 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-27 16:11:35,182 epoch 2 - iter 2976/3726 - loss 0.08450012 - time (sec): 377.33 - samples/sec: 434.72 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-27 16:12:21,618 epoch 2 - iter 3348/3726 - loss 0.08460079 - time (sec): 423.77 - samples/sec: 433.17 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-27 16:13:08,337 epoch 2 - iter 3720/3726 - loss 0.08261905 - time (sec): 470.48 - samples/sec: 434.27 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:13:09,112 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:13:09,112 EPOCH 2 done: loss 0.0825 - lr: 0.000004
+ 2023-10-27 16:13:33,111 DEV : loss 0.08286476135253906 - f1-score (micro avg) 0.9546
+ 2023-10-27 16:13:33,170 saving best model
+ 2023-10-27 16:13:35,742 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:14:22,419 epoch 3 - iter 372/3726 - loss 0.05591265 - time (sec): 46.67 - samples/sec: 435.31 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:15:09,686 epoch 3 - iter 744/3726 - loss 0.05984730 - time (sec): 93.94 - samples/sec: 434.32 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:15:57,178 epoch 3 - iter 1116/3726 - loss 0.06005216 - time (sec): 141.43 - samples/sec: 435.00 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:16:45,692 epoch 3 - iter 1488/3726 - loss 0.05601000 - time (sec): 189.95 - samples/sec: 430.14 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:17:32,939 epoch 3 - iter 1860/3726 - loss 0.05476618 - time (sec): 237.20 - samples/sec: 426.95 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:18:20,145 epoch 3 - iter 2232/3726 - loss 0.05358297 - time (sec): 284.40 - samples/sec: 428.53 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:19:07,624 epoch 3 - iter 2604/3726 - loss 0.05384047 - time (sec): 331.88 - samples/sec: 429.32 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:19:54,617 epoch 3 - iter 2976/3726 - loss 0.05438530 - time (sec): 378.87 - samples/sec: 429.16 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:20:41,784 epoch 3 - iter 3348/3726 - loss 0.05364700 - time (sec): 426.04 - samples/sec: 430.25 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:21:28,928 epoch 3 - iter 3720/3726 - loss 0.05265148 - time (sec): 473.18 - samples/sec: 431.75 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:21:29,696 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:21:29,696 EPOCH 3 done: loss 0.0527 - lr: 0.000004
+ 2023-10-27 16:21:53,630 DEV : loss 0.05983666330575943 - f1-score (micro avg) 0.963
+ 2023-10-27 16:21:53,682 saving best model
+ 2023-10-27 16:21:55,901 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:22:43,296 epoch 4 - iter 372/3726 - loss 0.03718873 - time (sec): 47.39 - samples/sec: 429.14 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:23:30,210 epoch 4 - iter 744/3726 - loss 0.04099485 - time (sec): 94.31 - samples/sec: 435.38 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:24:17,027 epoch 4 - iter 1116/3726 - loss 0.03721825 - time (sec): 141.12 - samples/sec: 434.73 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:25:04,504 epoch 4 - iter 1488/3726 - loss 0.03714011 - time (sec): 188.60 - samples/sec: 433.49 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:25:52,892 epoch 4 - iter 1860/3726 - loss 0.03758136 - time (sec): 236.99 - samples/sec: 428.95 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:26:40,944 epoch 4 - iter 2232/3726 - loss 0.03790295 - time (sec): 285.04 - samples/sec: 428.86 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:27:29,194 epoch 4 - iter 2604/3726 - loss 0.03805339 - time (sec): 333.29 - samples/sec: 428.62 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-27 16:28:16,189 epoch 4 - iter 2976/3726 - loss 0.03708819 - time (sec): 380.29 - samples/sec: 429.11 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:29:03,316 epoch 4 - iter 3348/3726 - loss 0.03680602 - time (sec): 427.41 - samples/sec: 429.64 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:29:50,404 epoch 4 - iter 3720/3726 - loss 0.03682622 - time (sec): 474.50 - samples/sec: 430.34 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:29:51,089 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:29:51,089 EPOCH 4 done: loss 0.0369 - lr: 0.000003
+ 2023-10-27 16:30:14,916 DEV : loss 0.04883182421326637 - f1-score (micro avg) 0.9659
+ 2023-10-27 16:30:14,971 saving best model
+ 2023-10-27 16:30:17,459 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:31:04,080 epoch 5 - iter 372/3726 - loss 0.03340894 - time (sec): 46.62 - samples/sec: 441.00 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:31:50,991 epoch 5 - iter 744/3726 - loss 0.03438447 - time (sec): 93.53 - samples/sec: 439.30 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:32:38,716 epoch 5 - iter 1116/3726 - loss 0.03321367 - time (sec): 141.25 - samples/sec: 435.67 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:33:25,523 epoch 5 - iter 1488/3726 - loss 0.02824924 - time (sec): 188.06 - samples/sec: 435.61 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:34:12,201 epoch 5 - iter 1860/3726 - loss 0.02851437 - time (sec): 234.74 - samples/sec: 433.50 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:34:59,180 epoch 5 - iter 2232/3726 - loss 0.02789578 - time (sec): 281.72 - samples/sec: 436.78 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:35:46,777 epoch 5 - iter 2604/3726 - loss 0.02681236 - time (sec): 329.32 - samples/sec: 434.70 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:36:33,751 epoch 5 - iter 2976/3726 - loss 0.02765246 - time (sec): 376.29 - samples/sec: 432.28 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:37:20,836 epoch 5 - iter 3348/3726 - loss 0.02767176 - time (sec): 423.38 - samples/sec: 432.82 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:38:08,311 epoch 5 - iter 3720/3726 - loss 0.02792716 - time (sec): 470.85 - samples/sec: 433.69 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:38:09,077 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:38:09,077 EPOCH 5 done: loss 0.0279 - lr: 0.000003
+ 2023-10-27 16:38:33,913 DEV : loss 0.05045438930392265 - f1-score (micro avg) 0.9709
+ 2023-10-27 16:38:33,966 saving best model
+ 2023-10-27 16:38:36,347 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:39:23,511 epoch 6 - iter 372/3726 - loss 0.02592894 - time (sec): 47.15 - samples/sec: 418.65 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:40:10,156 epoch 6 - iter 744/3726 - loss 0.02441091 - time (sec): 93.80 - samples/sec: 435.34 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:40:56,462 epoch 6 - iter 1116/3726 - loss 0.02083566 - time (sec): 140.10 - samples/sec: 437.89 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:41:42,045 epoch 6 - iter 1488/3726 - loss 0.01995447 - time (sec): 185.69 - samples/sec: 441.22 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:42:28,231 epoch 6 - iter 1860/3726 - loss 0.01971121 - time (sec): 231.87 - samples/sec: 442.59 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-27 16:43:13,863 epoch 6 - iter 2232/3726 - loss 0.02038473 - time (sec): 277.50 - samples/sec: 442.07 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:43:59,052 epoch 6 - iter 2604/3726 - loss 0.02010731 - time (sec): 322.69 - samples/sec: 442.05 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:44:44,618 epoch 6 - iter 2976/3726 - loss 0.02110678 - time (sec): 368.26 - samples/sec: 443.32 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:45:30,589 epoch 6 - iter 3348/3726 - loss 0.02064377 - time (sec): 414.23 - samples/sec: 443.27 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:46:15,877 epoch 6 - iter 3720/3726 - loss 0.02070977 - time (sec): 459.52 - samples/sec: 444.64 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:46:16,609 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:46:16,609 EPOCH 6 done: loss 0.0207 - lr: 0.000002
+ 2023-10-27 16:46:39,599 DEV : loss 0.05228659138083458 - f1-score (micro avg) 0.9688
+ 2023-10-27 16:46:39,652 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:47:25,815 epoch 7 - iter 372/3726 - loss 0.01393066 - time (sec): 46.16 - samples/sec: 453.87 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:48:11,032 epoch 7 - iter 744/3726 - loss 0.01975985 - time (sec): 91.38 - samples/sec: 465.32 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:48:57,003 epoch 7 - iter 1116/3726 - loss 0.01736626 - time (sec): 137.35 - samples/sec: 453.61 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:49:42,670 epoch 7 - iter 1488/3726 - loss 0.01602877 - time (sec): 183.02 - samples/sec: 449.60 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:50:28,056 epoch 7 - iter 1860/3726 - loss 0.01614250 - time (sec): 228.40 - samples/sec: 448.54 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:51:13,857 epoch 7 - iter 2232/3726 - loss 0.01731041 - time (sec): 274.20 - samples/sec: 447.20 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:51:59,472 epoch 7 - iter 2604/3726 - loss 0.01639037 - time (sec): 319.82 - samples/sec: 447.95 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:52:45,630 epoch 7 - iter 2976/3726 - loss 0.01622162 - time (sec): 365.98 - samples/sec: 446.28 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:53:30,732 epoch 7 - iter 3348/3726 - loss 0.01590288 - time (sec): 411.08 - samples/sec: 447.75 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:54:16,747 epoch 7 - iter 3720/3726 - loss 0.01577280 - time (sec): 457.09 - samples/sec: 446.76 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:54:17,443 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:54:17,443 EPOCH 7 done: loss 0.0157 - lr: 0.000002
+ 2023-10-27 16:54:39,633 DEV : loss 0.05249254032969475 - f1-score (micro avg) 0.9716
+ 2023-10-27 16:54:39,686 saving best model
+ 2023-10-27 16:54:42,796 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 16:55:28,427 epoch 8 - iter 372/3726 - loss 0.01008978 - time (sec): 45.63 - samples/sec: 447.29 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:56:13,841 epoch 8 - iter 744/3726 - loss 0.00993689 - time (sec): 91.04 - samples/sec: 445.29 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:56:59,449 epoch 8 - iter 1116/3726 - loss 0.00840825 - time (sec): 136.65 - samples/sec: 443.14 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-27 16:57:45,482 epoch 8 - iter 1488/3726 - loss 0.00783549 - time (sec): 182.68 - samples/sec: 441.32 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 16:58:31,635 epoch 8 - iter 1860/3726 - loss 0.00875476 - time (sec): 228.84 - samples/sec: 441.43 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 16:59:17,304 epoch 8 - iter 2232/3726 - loss 0.00997788 - time (sec): 274.51 - samples/sec: 447.12 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:00:03,903 epoch 8 - iter 2604/3726 - loss 0.01002162 - time (sec): 321.10 - samples/sec: 445.17 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:00:49,795 epoch 8 - iter 2976/3726 - loss 0.00982956 - time (sec): 367.00 - samples/sec: 443.07 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:01:35,384 epoch 8 - iter 3348/3726 - loss 0.01006193 - time (sec): 412.59 - samples/sec: 445.05 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:02:21,065 epoch 8 - iter 3720/3726 - loss 0.01018978 - time (sec): 458.27 - samples/sec: 445.76 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:02:21,762 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 17:02:21,762 EPOCH 8 done: loss 0.0102 - lr: 0.000001
+ 2023-10-27 17:02:44,780 DEV : loss 0.05600257217884064 - f1-score (micro avg) 0.9717
+ 2023-10-27 17:02:44,832 saving best model
+ 2023-10-27 17:02:47,541 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 17:03:33,194 epoch 9 - iter 372/3726 - loss 0.00852829 - time (sec): 45.65 - samples/sec: 446.98 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:04:18,797 epoch 9 - iter 744/3726 - loss 0.01209549 - time (sec): 91.25 - samples/sec: 442.36 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:05:04,412 epoch 9 - iter 1116/3726 - loss 0.01171120 - time (sec): 136.87 - samples/sec: 446.88 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:05:49,939 epoch 9 - iter 1488/3726 - loss 0.01104234 - time (sec): 182.39 - samples/sec: 448.01 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:06:35,656 epoch 9 - iter 1860/3726 - loss 0.01095518 - time (sec): 228.11 - samples/sec: 444.74 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:07:21,859 epoch 9 - iter 2232/3726 - loss 0.01041938 - time (sec): 274.31 - samples/sec: 445.26 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:08:07,175 epoch 9 - iter 2604/3726 - loss 0.01077364 - time (sec): 319.63 - samples/sec: 446.97 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:08:52,206 epoch 9 - iter 2976/3726 - loss 0.01011920 - time (sec): 364.66 - samples/sec: 448.47 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:09:37,411 epoch 9 - iter 3348/3726 - loss 0.00960798 - time (sec): 409.87 - samples/sec: 448.71 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:10:23,015 epoch 9 - iter 3720/3726 - loss 0.00963949 - time (sec): 455.47 - samples/sec: 448.69 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:10:23,789 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 17:10:23,789 EPOCH 9 done: loss 0.0096 - lr: 0.000001
+ 2023-10-27 17:10:47,419 DEV : loss 0.053138185292482376 - f1-score (micro avg) 0.9726
+ 2023-10-27 17:10:47,471 saving best model
+ 2023-10-27 17:10:50,135 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 17:11:35,418 epoch 10 - iter 372/3726 - loss 0.00478465 - time (sec): 45.28 - samples/sec: 451.34 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-27 17:12:21,078 epoch 10 - iter 744/3726 - loss 0.00483843 - time (sec): 90.94 - samples/sec: 449.97 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-27 17:13:06,334 epoch 10 - iter 1116/3726 - loss 0.00472956 - time (sec): 136.20 - samples/sec: 449.54 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-27 17:13:51,612 epoch 10 - iter 1488/3726 - loss 0.00451912 - time (sec): 181.47 - samples/sec: 451.84 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-27 17:14:37,168 epoch 10 - iter 1860/3726 - loss 0.00470044 - time (sec): 227.03 - samples/sec: 451.55 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-27 17:15:22,745 epoch 10 - iter 2232/3726 - loss 0.00497575 - time (sec): 272.61 - samples/sec: 452.99 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-27 17:16:08,737 epoch 10 - iter 2604/3726 - loss 0.00499748 - time (sec): 318.60 - samples/sec: 450.83 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-27 17:16:54,804 epoch 10 - iter 2976/3726 - loss 0.00512330 - time (sec): 364.67 - samples/sec: 450.17 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-27 17:17:40,016 epoch 10 - iter 3348/3726 - loss 0.00514967 - time (sec): 409.88 - samples/sec: 449.74 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-27 17:18:25,574 epoch 10 - iter 3720/3726 - loss 0.00505541 - time (sec): 455.44 - samples/sec: 448.55 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-27 17:18:26,331 ----------------------------------------------------------------------------------------------------
+ 2023-10-27 17:18:26,331 EPOCH 10 done: loss 0.0051 - lr: 0.000000
+ 2023-10-27 17:18:49,314 DEV : loss 0.05512790009379387 - f1-score (micro avg) 0.9722
+ 2023-10-27 17:18:51,313 ----------------------------------------------------------------------------------------------------
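The checkpoint evaluated at the end is the one from the epoch with the highest dev micro F1. A minimal sketch of recovering that epoch from the `DEV :` lines of this log, assuming one such line per epoch in order and that a tie keeps the earlier epoch (the names `DEV_LINE` and `best_dev_epoch` are hypothetical helpers, not part of Flair):

```python
import re

# Matches lines such as: "DEV : loss 0.0531... - f1-score (micro avg) 0.9726"
DEV_LINE = re.compile(
    r"DEV : loss (?P<loss>\d+\.\d+) - f1-score \(micro avg\) (?P<f1>\d+\.\d+)"
)


def best_dev_epoch(log_lines):
    """Return (epoch, f1) for the highest dev F1; epochs are counted from 1."""
    best_epoch, best_f1 = None, float("-inf")
    epoch = 0
    for line in log_lines:
        match = DEV_LINE.search(line)
        if match is None:
            continue
        epoch += 1
        f1 = float(match.group("f1"))
        if f1 > best_f1:  # strict comparison: an equal score keeps the earlier epoch
            best_epoch, best_f1 = epoch, f1
    return best_epoch, best_f1


# The ten DEV lines copied from the log above.
DEV_LINES = [
    "2023-10-27 16:05:15,688 DEV : loss 0.06499314308166504 - f1-score (micro avg) 0.941",
    "2023-10-27 16:13:33,111 DEV : loss 0.08286476135253906 - f1-score (micro avg) 0.9546",
    "2023-10-27 16:21:53,630 DEV : loss 0.05983666330575943 - f1-score (micro avg) 0.963",
    "2023-10-27 16:30:14,916 DEV : loss 0.04883182421326637 - f1-score (micro avg) 0.9659",
    "2023-10-27 16:38:33,913 DEV : loss 0.05045438930392265 - f1-score (micro avg) 0.9709",
    "2023-10-27 16:46:39,599 DEV : loss 0.05228659138083458 - f1-score (micro avg) 0.9688",
    "2023-10-27 16:54:39,633 DEV : loss 0.05249254032969475 - f1-score (micro avg) 0.9716",
    "2023-10-27 17:02:44,780 DEV : loss 0.05600257217884064 - f1-score (micro avg) 0.9717",
    "2023-10-27 17:10:47,419 DEV : loss 0.053138185292482376 - f1-score (micro avg) 0.9726",
    "2023-10-27 17:18:49,314 DEV : loss 0.05512790009379387 - f1-score (micro avg) 0.9722",
]
print(best_dev_epoch(DEV_LINES))
```

For this run the result is epoch 9, which is also the last epoch followed by a `saving best model` line.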
+ 2023-10-27 17:18:51,315 Loading model from best epoch ...
+ 2023-10-27 17:18:58,497 SequenceTagger predicts: Dictionary with 17 tags: O, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-MISC, B-MISC, E-MISC, I-MISC
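The 17 tags correspond to the `out_features=17` of the final linear layer: one `O` tag plus four position prefixes (S, B, E, I) for each of the four entity types. A short sketch that rebuilds the list in the logged order (the helper `bioes_tags` is hypothetical and only illustrates the counting):

```python
def bioes_tags(entity_types):
    """Build the tag list: 'O' first, then S-/B-/E-/I- for each entity type."""
    tags = ["O"]
    for entity in entity_types:
        tags.extend(f"{prefix}-{entity}" for prefix in ("S", "B", "E", "I"))
    return tags


tags = bioes_tags(["ORG", "PER", "LOC", "MISC"])
print(len(tags), ", ".join(tags))
```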
+ 2023-10-27 17:19:21,159
+ Results:
+ - F-score (micro) 0.969
+ - F-score (macro) 0.9632
+ - Accuracy 0.9558
+
+ By class:
+               precision    recall  f1-score   support
+
+          ORG     0.9676    0.9691    0.9683      1909
+          PER     0.9956    0.9943    0.9950      1591
+          LOC     0.9756    0.9625    0.9690      1413
+         MISC     0.9019    0.9397    0.9204       812
+
+    micro avg     0.9676    0.9703    0.9690      5725
+    macro avg     0.9602    0.9664    0.9632      5725
+ weighted avg     0.9680    0.9703    0.9691      5725
+
+ 2023-10-27 17:19:21,160 ----------------------------------------------------------------------------------------------------
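The averaged rows of the report can be checked against the per-class rows. The sketch below recomputes each class F1 as the harmonic mean of precision and recall, then the macro (unweighted) and support-weighted averages. Inputs are the rounded four-decimal values from the table, so the results only agree with the report to about four decimals; the names `BY_CLASS`, `f1_score` and `macro_and_weighted_f1` are hypothetical.

```python
# (precision, recall, support) per class, copied from the report above.
BY_CLASS = {
    "ORG": (0.9676, 0.9691, 1909),
    "PER": (0.9956, 0.9943, 1591),
    "LOC": (0.9756, 0.9625, 1413),
    "MISC": (0.9019, 0.9397, 812),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_and_weighted_f1(by_class):
    """Return (macro F1, support-weighted F1) over all classes."""
    f1s = {label: f1_score(p, r) for label, (p, r, _) in by_class.items()}
    total_support = sum(support for _, _, support in by_class.values())
    macro = sum(f1s.values()) / len(f1s)
    weighted = sum(f1s[label] * by_class[label][2] for label in by_class) / total_support
    return macro, weighted


macro, weighted = macro_and_weighted_f1(BY_CLASS)
print(f"macro={macro:.4f} weighted={weighted:.4f}")
```

The micro average is not recomputed here because it needs the raw true-positive, false-positive and false-negative counts, which the log does not include.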