Upload ./training.log with huggingface_hub
training.log
2023-10-25 18:20:32,214 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,215 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 18:20:32,216 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,216 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 18:20:32,216 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,216 Train: 7142 sentences
2023-10-25 18:20:32,216 (train_with_dev=False, train_with_test=False)
2023-10-25 18:20:32,216 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,216 Training Params:
2023-10-25 18:20:32,216 - learning_rate: "3e-05"
2023-10-25 18:20:32,216 - mini_batch_size: "4"
2023-10-25 18:20:32,216 - max_epochs: "10"
2023-10-25 18:20:32,216 - shuffle: "True"
2023-10-25 18:20:32,216 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,216 Plugins:
2023-10-25 18:20:32,216 - TensorboardLogger
2023-10-25 18:20:32,216 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 18:20:32,216 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,216 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 18:20:32,216 - metric: "('micro avg', 'f1-score')"
2023-10-25 18:20:32,216 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,216 Computation:
2023-10-25 18:20:32,216 - compute on device: cuda:0
2023-10-25 18:20:32,216 - embedding storage: none
2023-10-25 18:20:32,217 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,217 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 18:20:32,217 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,217 ----------------------------------------------------------------------------------------------------
2023-10-25 18:20:32,217 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 18:20:41,829 epoch 1 - iter 178/1786 - loss 1.62894423 - time (sec): 9.61 - samples/sec: 2374.31 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:20:51,370 epoch 1 - iter 356/1786 - loss 1.05338819 - time (sec): 19.15 - samples/sec: 2440.03 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:21:00,897 epoch 1 - iter 534/1786 - loss 0.81520377 - time (sec): 28.68 - samples/sec: 2483.98 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:21:10,205 epoch 1 - iter 712/1786 - loss 0.65815637 - time (sec): 37.99 - samples/sec: 2589.41 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:21:18,795 epoch 1 - iter 890/1786 - loss 0.56620539 - time (sec): 46.58 - samples/sec: 2635.83 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:21:27,476 epoch 1 - iter 1068/1786 - loss 0.50136101 - time (sec): 55.26 - samples/sec: 2648.12 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:21:36,384 epoch 1 - iter 1246/1786 - loss 0.45154314 - time (sec): 64.17 - samples/sec: 2662.89 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:21:45,468 epoch 1 - iter 1424/1786 - loss 0.41094756 - time (sec): 73.25 - samples/sec: 2680.96 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:21:54,948 epoch 1 - iter 1602/1786 - loss 0.37891662 - time (sec): 82.73 - samples/sec: 2692.41 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:22:04,798 epoch 1 - iter 1780/1786 - loss 0.35716707 - time (sec): 92.58 - samples/sec: 2677.72 - lr: 0.000030 - momentum: 0.000000
2023-10-25 18:22:05,143 ----------------------------------------------------------------------------------------------------
2023-10-25 18:22:05,143 EPOCH 1 done: loss 0.3565 - lr: 0.000030
2023-10-25 18:22:08,918 DEV : loss 0.10421743243932724 - f1-score (micro avg) 0.7273
2023-10-25 18:22:08,940 saving best model
2023-10-25 18:22:09,391 ----------------------------------------------------------------------------------------------------
2023-10-25 18:22:19,092 epoch 2 - iter 178/1786 - loss 0.11382450 - time (sec): 9.70 - samples/sec: 2665.09 - lr: 0.000030 - momentum: 0.000000
2023-10-25 18:22:28,505 epoch 2 - iter 356/1786 - loss 0.11818253 - time (sec): 19.11 - samples/sec: 2534.69 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:22:37,505 epoch 2 - iter 534/1786 - loss 0.11752290 - time (sec): 28.11 - samples/sec: 2607.90 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:22:46,687 epoch 2 - iter 712/1786 - loss 0.11710291 - time (sec): 37.29 - samples/sec: 2656.23 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:22:55,874 epoch 2 - iter 890/1786 - loss 0.11761515 - time (sec): 46.48 - samples/sec: 2626.69 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:23:04,857 epoch 2 - iter 1068/1786 - loss 0.11880463 - time (sec): 55.46 - samples/sec: 2647.77 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:23:13,642 epoch 2 - iter 1246/1786 - loss 0.11830693 - time (sec): 64.25 - samples/sec: 2669.76 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:23:22,746 epoch 2 - iter 1424/1786 - loss 0.11678831 - time (sec): 73.35 - samples/sec: 2706.42 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:23:31,977 epoch 2 - iter 1602/1786 - loss 0.11600197 - time (sec): 82.58 - samples/sec: 2700.85 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:23:41,217 epoch 2 - iter 1780/1786 - loss 0.11609587 - time (sec): 91.82 - samples/sec: 2701.26 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:23:41,525 ----------------------------------------------------------------------------------------------------
2023-10-25 18:23:41,526 EPOCH 2 done: loss 0.1160 - lr: 0.000027
2023-10-25 18:23:46,583 DEV : loss 0.10009025037288666 - f1-score (micro avg) 0.7704
2023-10-25 18:23:46,604 saving best model
2023-10-25 18:23:47,260 ----------------------------------------------------------------------------------------------------
2023-10-25 18:23:56,854 epoch 3 - iter 178/1786 - loss 0.06189937 - time (sec): 9.59 - samples/sec: 2612.13 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:24:06,469 epoch 3 - iter 356/1786 - loss 0.06991416 - time (sec): 19.21 - samples/sec: 2562.50 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:24:15,985 epoch 3 - iter 534/1786 - loss 0.07505640 - time (sec): 28.72 - samples/sec: 2562.39 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:24:25,614 epoch 3 - iter 712/1786 - loss 0.07238654 - time (sec): 38.35 - samples/sec: 2574.30 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:24:35,379 epoch 3 - iter 890/1786 - loss 0.07251223 - time (sec): 48.12 - samples/sec: 2571.82 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:24:45,040 epoch 3 - iter 1068/1786 - loss 0.07322538 - time (sec): 57.78 - samples/sec: 2580.46 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:24:54,249 epoch 3 - iter 1246/1786 - loss 0.07152561 - time (sec): 66.98 - samples/sec: 2609.81 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:25:03,582 epoch 3 - iter 1424/1786 - loss 0.07161969 - time (sec): 76.32 - samples/sec: 2579.33 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:25:12,455 epoch 3 - iter 1602/1786 - loss 0.07144466 - time (sec): 85.19 - samples/sec: 2609.06 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:25:21,964 epoch 3 - iter 1780/1786 - loss 0.07137444 - time (sec): 94.70 - samples/sec: 2618.28 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:25:22,284 ----------------------------------------------------------------------------------------------------
2023-10-25 18:25:22,284 EPOCH 3 done: loss 0.0713 - lr: 0.000023
2023-10-25 18:25:27,382 DEV : loss 0.13084866106510162 - f1-score (micro avg) 0.7918
2023-10-25 18:25:27,404 saving best model
2023-10-25 18:25:28,076 ----------------------------------------------------------------------------------------------------
2023-10-25 18:25:36,796 epoch 4 - iter 178/1786 - loss 0.04603763 - time (sec): 8.72 - samples/sec: 2822.79 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:25:45,700 epoch 4 - iter 356/1786 - loss 0.04595061 - time (sec): 17.62 - samples/sec: 2858.45 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:25:54,721 epoch 4 - iter 534/1786 - loss 0.05049014 - time (sec): 26.64 - samples/sec: 2796.43 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:26:04,026 epoch 4 - iter 712/1786 - loss 0.05319326 - time (sec): 35.95 - samples/sec: 2750.89 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:26:13,466 epoch 4 - iter 890/1786 - loss 0.05285730 - time (sec): 45.39 - samples/sec: 2713.26 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:26:22,918 epoch 4 - iter 1068/1786 - loss 0.05374856 - time (sec): 54.84 - samples/sec: 2711.40 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:26:32,404 epoch 4 - iter 1246/1786 - loss 0.05460603 - time (sec): 64.33 - samples/sec: 2687.26 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:26:42,081 epoch 4 - iter 1424/1786 - loss 0.05410043 - time (sec): 74.00 - samples/sec: 2681.28 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:26:51,822 epoch 4 - iter 1602/1786 - loss 0.05349229 - time (sec): 83.74 - samples/sec: 2668.08 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:27:01,217 epoch 4 - iter 1780/1786 - loss 0.05316503 - time (sec): 93.14 - samples/sec: 2663.63 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:27:01,526 ----------------------------------------------------------------------------------------------------
2023-10-25 18:27:01,527 EPOCH 4 done: loss 0.0532 - lr: 0.000020
2023-10-25 18:27:06,049 DEV : loss 0.16789670288562775 - f1-score (micro avg) 0.7829
2023-10-25 18:27:06,070 ----------------------------------------------------------------------------------------------------
2023-10-25 18:27:15,697 epoch 5 - iter 178/1786 - loss 0.05389475 - time (sec): 9.63 - samples/sec: 2455.06 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:27:25,190 epoch 5 - iter 356/1786 - loss 0.04260056 - time (sec): 19.12 - samples/sec: 2632.79 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:27:34,006 epoch 5 - iter 534/1786 - loss 0.04052769 - time (sec): 27.93 - samples/sec: 2688.86 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:27:43,195 epoch 5 - iter 712/1786 - loss 0.04001294 - time (sec): 37.12 - samples/sec: 2703.06 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:27:52,669 epoch 5 - iter 890/1786 - loss 0.03975722 - time (sec): 46.60 - samples/sec: 2699.77 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:28:02,256 epoch 5 - iter 1068/1786 - loss 0.03974050 - time (sec): 56.18 - samples/sec: 2654.81 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:28:11,497 epoch 5 - iter 1246/1786 - loss 0.03952798 - time (sec): 65.43 - samples/sec: 2666.55 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:28:20,489 epoch 5 - iter 1424/1786 - loss 0.03903402 - time (sec): 74.42 - samples/sec: 2667.01 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:28:29,211 epoch 5 - iter 1602/1786 - loss 0.03844957 - time (sec): 83.14 - samples/sec: 2669.18 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:28:38,265 epoch 5 - iter 1780/1786 - loss 0.03829347 - time (sec): 92.19 - samples/sec: 2687.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:28:38,578 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:38,578 EPOCH 5 done: loss 0.0382 - lr: 0.000017
2023-10-25 18:28:44,071 DEV : loss 0.19442911446094513 - f1-score (micro avg) 0.7802
2023-10-25 18:28:44,094 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:53,424 epoch 6 - iter 178/1786 - loss 0.03398055 - time (sec): 9.33 - samples/sec: 2700.33 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:29:02,610 epoch 6 - iter 356/1786 - loss 0.03316916 - time (sec): 18.51 - samples/sec: 2757.68 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:29:12,181 epoch 6 - iter 534/1786 - loss 0.03040477 - time (sec): 28.09 - samples/sec: 2648.55 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:29:21,873 epoch 6 - iter 712/1786 - loss 0.03212632 - time (sec): 37.78 - samples/sec: 2628.07 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:29:31,591 epoch 6 - iter 890/1786 - loss 0.02992094 - time (sec): 47.50 - samples/sec: 2632.24 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:29:41,332 epoch 6 - iter 1068/1786 - loss 0.02930104 - time (sec): 57.24 - samples/sec: 2618.93 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:29:50,819 epoch 6 - iter 1246/1786 - loss 0.02972192 - time (sec): 66.72 - samples/sec: 2622.50 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:30:00,534 epoch 6 - iter 1424/1786 - loss 0.02944805 - time (sec): 76.44 - samples/sec: 2594.65 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:30:09,882 epoch 6 - iter 1602/1786 - loss 0.02958451 - time (sec): 85.79 - samples/sec: 2626.36 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:30:18,857 epoch 6 - iter 1780/1786 - loss 0.03003769 - time (sec): 94.76 - samples/sec: 2618.30 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:30:19,162 ----------------------------------------------------------------------------------------------------
2023-10-25 18:30:19,163 EPOCH 6 done: loss 0.0300 - lr: 0.000013
2023-10-25 18:30:23,457 DEV : loss 0.18333885073661804 - f1-score (micro avg) 0.7925
2023-10-25 18:30:23,480 saving best model
2023-10-25 18:30:24,184 ----------------------------------------------------------------------------------------------------
2023-10-25 18:30:34,775 epoch 7 - iter 178/1786 - loss 0.02162460 - time (sec): 10.59 - samples/sec: 2448.54 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:30:44,255 epoch 7 - iter 356/1786 - loss 0.01908762 - time (sec): 20.07 - samples/sec: 2453.18 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:30:53,682 epoch 7 - iter 534/1786 - loss 0.01976296 - time (sec): 29.50 - samples/sec: 2516.85 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:31:02,712 epoch 7 - iter 712/1786 - loss 0.02174272 - time (sec): 38.53 - samples/sec: 2564.31 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:31:11,779 epoch 7 - iter 890/1786 - loss 0.02278711 - time (sec): 47.59 - samples/sec: 2635.47 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:31:20,900 epoch 7 - iter 1068/1786 - loss 0.02154762 - time (sec): 56.71 - samples/sec: 2659.10 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:31:29,668 epoch 7 - iter 1246/1786 - loss 0.02092074 - time (sec): 65.48 - samples/sec: 2699.75 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:31:38,781 epoch 7 - iter 1424/1786 - loss 0.02100336 - time (sec): 74.60 - samples/sec: 2665.33 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:31:48,049 epoch 7 - iter 1602/1786 - loss 0.02189701 - time (sec): 83.86 - samples/sec: 2662.93 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:31:57,161 epoch 7 - iter 1780/1786 - loss 0.02145412 - time (sec): 92.98 - samples/sec: 2665.12 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:31:57,481 ----------------------------------------------------------------------------------------------------
2023-10-25 18:31:57,481 EPOCH 7 done: loss 0.0214 - lr: 0.000010
2023-10-25 18:32:01,934 DEV : loss 0.19056054949760437 - f1-score (micro avg) 0.8075
2023-10-25 18:32:01,956 saving best model
2023-10-25 18:32:02,611 ----------------------------------------------------------------------------------------------------
2023-10-25 18:32:12,182 epoch 8 - iter 178/1786 - loss 0.02612736 - time (sec): 9.57 - samples/sec: 2588.30 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:32:21,872 epoch 8 - iter 356/1786 - loss 0.01894543 - time (sec): 19.26 - samples/sec: 2516.13 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:32:31,455 epoch 8 - iter 534/1786 - loss 0.01672658 - time (sec): 28.84 - samples/sec: 2550.32 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:32:40,981 epoch 8 - iter 712/1786 - loss 0.01545569 - time (sec): 38.37 - samples/sec: 2583.47 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:32:50,491 epoch 8 - iter 890/1786 - loss 0.01420894 - time (sec): 47.88 - samples/sec: 2582.39 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:32:59,910 epoch 8 - iter 1068/1786 - loss 0.01478084 - time (sec): 57.30 - samples/sec: 2565.98 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:33:09,307 epoch 8 - iter 1246/1786 - loss 0.01595451 - time (sec): 66.69 - samples/sec: 2572.39 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:33:18,275 epoch 8 - iter 1424/1786 - loss 0.01587470 - time (sec): 75.66 - samples/sec: 2598.46 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:33:27,281 epoch 8 - iter 1602/1786 - loss 0.01586368 - time (sec): 84.67 - samples/sec: 2636.76 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:33:36,357 epoch 8 - iter 1780/1786 - loss 0.01584675 - time (sec): 93.74 - samples/sec: 2645.49 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:33:36,660 ----------------------------------------------------------------------------------------------------
2023-10-25 18:33:36,660 EPOCH 8 done: loss 0.0158 - lr: 0.000007
2023-10-25 18:33:42,265 DEV : loss 0.19766351580619812 - f1-score (micro avg) 0.8038
2023-10-25 18:33:42,287 ----------------------------------------------------------------------------------------------------
2023-10-25 18:33:51,969 epoch 9 - iter 178/1786 - loss 0.01028428 - time (sec): 9.68 - samples/sec: 2633.27 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:34:01,084 epoch 9 - iter 356/1786 - loss 0.00955984 - time (sec): 18.80 - samples/sec: 2571.59 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:34:10,368 epoch 9 - iter 534/1786 - loss 0.01151104 - time (sec): 28.08 - samples/sec: 2581.54 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:34:19,236 epoch 9 - iter 712/1786 - loss 0.01207848 - time (sec): 36.95 - samples/sec: 2625.24 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:34:28,301 epoch 9 - iter 890/1786 - loss 0.01138794 - time (sec): 46.01 - samples/sec: 2612.46 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:34:37,234 epoch 9 - iter 1068/1786 - loss 0.01116143 - time (sec): 54.95 - samples/sec: 2672.29 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:34:46,369 epoch 9 - iter 1246/1786 - loss 0.01155486 - time (sec): 64.08 - samples/sec: 2669.89 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:34:55,509 epoch 9 - iter 1424/1786 - loss 0.01114669 - time (sec): 73.22 - samples/sec: 2685.06 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:35:04,412 epoch 9 - iter 1602/1786 - loss 0.01110376 - time (sec): 82.12 - samples/sec: 2697.08 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:35:13,124 epoch 9 - iter 1780/1786 - loss 0.01074968 - time (sec): 90.84 - samples/sec: 2731.45 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:35:13,422 ----------------------------------------------------------------------------------------------------
2023-10-25 18:35:13,423 EPOCH 9 done: loss 0.0108 - lr: 0.000003
2023-10-25 18:35:17,832 DEV : loss 0.21357937157154083 - f1-score (micro avg) 0.8104
2023-10-25 18:35:17,855 saving best model
2023-10-25 18:35:18,543 ----------------------------------------------------------------------------------------------------
2023-10-25 18:35:28,085 epoch 10 - iter 178/1786 - loss 0.00637564 - time (sec): 9.54 - samples/sec: 2632.83 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:35:37,383 epoch 10 - iter 356/1786 - loss 0.00589645 - time (sec): 18.84 - samples/sec: 2586.16 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:35:46,529 epoch 10 - iter 534/1786 - loss 0.00533950 - time (sec): 27.98 - samples/sec: 2649.13 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:35:55,356 epoch 10 - iter 712/1786 - loss 0.00617715 - time (sec): 36.81 - samples/sec: 2702.22 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:36:04,322 epoch 10 - iter 890/1786 - loss 0.00620859 - time (sec): 45.78 - samples/sec: 2721.41 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:36:13,367 epoch 10 - iter 1068/1786 - loss 0.00641849 - time (sec): 54.82 - samples/sec: 2715.20 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:36:22,629 epoch 10 - iter 1246/1786 - loss 0.00676503 - time (sec): 64.08 - samples/sec: 2702.32 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:36:32,265 epoch 10 - iter 1424/1786 - loss 0.00651546 - time (sec): 73.72 - samples/sec: 2700.44 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:36:41,692 epoch 10 - iter 1602/1786 - loss 0.00660118 - time (sec): 83.15 - samples/sec: 2690.29 - lr: 0.000000 - momentum: 0.000000
2023-10-25 18:36:50,765 epoch 10 - iter 1780/1786 - loss 0.00682484 - time (sec): 92.22 - samples/sec: 2689.05 - lr: 0.000000 - momentum: 0.000000
2023-10-25 18:36:51,073 ----------------------------------------------------------------------------------------------------
2023-10-25 18:36:51,074 EPOCH 10 done: loss 0.0068 - lr: 0.000000
2023-10-25 18:36:56,195 DEV : loss 0.21570290625095367 - f1-score (micro avg) 0.8096
2023-10-25 18:36:56,710 ----------------------------------------------------------------------------------------------------
2023-10-25 18:36:56,712 Loading model from best epoch ...
2023-10-25 18:36:58,657 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 18:37:13,089
Results:
- F-score (micro) 0.6996
- F-score (macro) 0.6261
- Accuracy 0.5539

By class:
              precision    recall  f1-score   support

         LOC     0.6930    0.7050    0.6990      1095
         PER     0.7827    0.7796    0.7812      1012
         ORG     0.4655    0.5854    0.5186       357
   HumanProd     0.3966    0.6970    0.5055        33

   micro avg     0.6820    0.7181    0.6996      2497
   macro avg     0.5844    0.6918    0.6261      2497
weighted avg     0.6929    0.7181    0.7039      2497

2023-10-25 18:37:13,089 ----------------------------------------------------------------------------------------------------