alkiskoudounas committed on
Commit e6d089d · 1 Parent(s): 8269b9e

End of training

all_results.json ADDED
@@ -0,0 +1,12 @@
+ {
+     "epoch": 46.95,
+     "eval_loss": 0.28073304891586304,
+     "eval_runtime": 693.978,
+     "eval_samples_per_second": 2.444,
+     "eval_steps_per_second": 0.611,
+     "eval_wer": 17.709881129271917,
+     "train_loss": 0.029123614896199433,
+     "train_runtime": 43309.4036,
+     "train_samples_per_second": 3.694,
+     "train_steps_per_second": 0.462
+ }
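As a rough sanity check on the figures in all_results.json, the throughput fields can be combined to recover quantities the file does not state directly. The sketch below is illustrative only: the helper names are made up for this example, and reading the ratio of samples per second to steps per second as a per-step batch size is an assumption about how these metrics were computed, not something the commit confirms.

```python
# Values copied from all_results.json in this commit.
results = {
    "eval_runtime": 693.978,
    "eval_samples_per_second": 2.444,
    "eval_steps_per_second": 0.611,
    "train_runtime": 43309.4036,
    "train_samples_per_second": 3.694,
    "train_steps_per_second": 0.462,
}


def implied_samples_per_step(metrics, split):
    """Samples per step implied by the two throughput fields (assumed interpretation)."""
    return metrics[f"{split}_samples_per_second"] / metrics[f"{split}_steps_per_second"]


def implied_total_steps(metrics, split):
    """Approximate step count: runtime in seconds times steps per second."""
    return metrics[f"{split}_runtime"] * metrics[f"{split}_steps_per_second"]


train_batch = implied_samples_per_step(results, "train")
eval_batch = implied_samples_per_step(results, "eval")
train_steps = implied_total_steps(results, "train")
train_hours = results["train_runtime"] / 3600  # seconds -> hours

print(f"train samples per step ~ {train_batch:.2f}")
print(f"eval samples per step  ~ {eval_batch:.2f}")
print(f"implied train steps    ~ {train_steps:.0f}")
print(f"train wall-clock hours ~ {train_hours:.2f}")
```

Because the throughput values are rounded to three decimals, the implied step count only approximates the global_step of 20000 recorded in trainer_state.json; the training runtime works out to about 12 hours.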
eval_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+     "epoch": 46.95,
+     "eval_loss": 0.28073304891586304,
+     "eval_runtime": 693.978,
+     "eval_samples_per_second": 2.444,
+     "eval_steps_per_second": 0.611,
+     "eval_wer": 17.709881129271917
+ }
runs/Dec19_08-40-22_heron/events.out.tfevents.1671483462.heron.2099547.2 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ac99b8a70a9f8c6a97aab4d09905e4f54be55ceb276ca81086b4e2b0dc1cd369
+ size 364
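The three added lines are a Git LFS pointer rather than the TensorBoard event data itself: each line is a key, a space, and a value. A minimal sketch of reading such a pointer follows. `parse_lfs_pointer` is a hypothetical helper written for this illustration and is not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Split on the first space only; the value may contain other characters.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'; size is a byte count.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "oid_algorithm": algorithm,
        "oid_digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ac99b8a70a9f8c6a97aab4d09905e4f54be55ceb276ca81086b4e2b0dc1cd369
size 364
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["oid_algorithm"], pointer["size_bytes"])
```

The parsed size refers to the stored event file, so the 364 here is the size in bytes of the real TensorBoard log, not of the pointer text.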
train_results.json ADDED
@@ -0,0 +1,7 @@
+ {
+     "epoch": 46.95,
+     "train_loss": 0.029123614896199433,
+     "train_runtime": 43309.4036,
+     "train_samples_per_second": 3.694,
+     "train_steps_per_second": 0.462
+ }
trainer_state.json ADDED
@@ -0,0 +1,2515 @@
1
+ {
2
+ "best_metric": 17.09695393759287,
3
+ "best_model_checkpoint": "whisper-el-medium-augmented-2/checkpoint-20000",
4
+ "epoch": 46.948356807511736,
5
+ "global_step": 20000,
6
+ "is_hyper_param_search": false,
7
+ "is_local_process_zero": true,
8
+ "is_world_process_zero": true,
9
+ "log_history": [
10
+ {
11
+ "epoch": 0.12,
12
+ "learning_rate": 9.400000000000001e-07,
13
+ "loss": 2.3145,
14
+ "step": 50
15
+ },
16
+ {
17
+ "epoch": 0.23,
18
+ "learning_rate": 1.94e-06,
19
+ "loss": 1.4804,
20
+ "step": 100
21
+ },
22
+ {
23
+ "epoch": 0.35,
24
+ "learning_rate": 2.9400000000000002e-06,
25
+ "loss": 0.6645,
26
+ "step": 150
27
+ },
28
+ {
29
+ "epoch": 0.47,
30
+ "learning_rate": 3.94e-06,
31
+ "loss": 0.4941,
32
+ "step": 200
33
+ },
34
+ {
35
+ "epoch": 0.59,
36
+ "learning_rate": 4.94e-06,
37
+ "loss": 0.4274,
38
+ "step": 250
39
+ },
40
+ {
41
+ "epoch": 0.7,
42
+ "learning_rate": 5.94e-06,
43
+ "loss": 0.3706,
44
+ "step": 300
45
+ },
46
+ {
47
+ "epoch": 0.82,
48
+ "learning_rate": 6.9400000000000005e-06,
49
+ "loss": 0.3363,
50
+ "step": 350
51
+ },
52
+ {
53
+ "epoch": 0.94,
54
+ "learning_rate": 7.94e-06,
55
+ "loss": 0.3374,
56
+ "step": 400
57
+ },
58
+ {
59
+ "epoch": 1.06,
60
+ "learning_rate": 8.94e-06,
61
+ "loss": 0.25,
62
+ "step": 450
63
+ },
64
+ {
65
+ "epoch": 1.17,
66
+ "learning_rate": 9.940000000000001e-06,
67
+ "loss": 0.2338,
68
+ "step": 500
69
+ },
70
+ {
71
+ "epoch": 1.29,
72
+ "learning_rate": 9.975897435897436e-06,
73
+ "loss": 0.2485,
74
+ "step": 550
75
+ },
76
+ {
77
+ "epoch": 1.41,
78
+ "learning_rate": 9.950256410256412e-06,
79
+ "loss": 0.2306,
80
+ "step": 600
81
+ },
82
+ {
83
+ "epoch": 1.53,
84
+ "learning_rate": 9.924615384615385e-06,
85
+ "loss": 0.2184,
86
+ "step": 650
87
+ },
88
+ {
89
+ "epoch": 1.64,
90
+ "learning_rate": 9.89897435897436e-06,
91
+ "loss": 0.2268,
92
+ "step": 700
93
+ },
94
+ {
95
+ "epoch": 1.76,
96
+ "learning_rate": 9.873333333333334e-06,
97
+ "loss": 0.1912,
98
+ "step": 750
99
+ },
100
+ {
101
+ "epoch": 1.88,
102
+ "learning_rate": 9.847692307692308e-06,
103
+ "loss": 0.1835,
104
+ "step": 800
105
+ },
106
+ {
107
+ "epoch": 2.0,
108
+ "learning_rate": 9.822051282051283e-06,
109
+ "loss": 0.1908,
110
+ "step": 850
111
+ },
112
+ {
113
+ "epoch": 2.11,
114
+ "learning_rate": 9.796410256410257e-06,
115
+ "loss": 0.1107,
116
+ "step": 900
117
+ },
118
+ {
119
+ "epoch": 2.23,
120
+ "learning_rate": 9.770769230769232e-06,
121
+ "loss": 0.1117,
122
+ "step": 950
123
+ },
124
+ {
125
+ "epoch": 2.35,
126
+ "learning_rate": 9.745128205128206e-06,
127
+ "loss": 0.0995,
128
+ "step": 1000
129
+ },
130
+ {
131
+ "epoch": 2.46,
132
+ "learning_rate": 9.71948717948718e-06,
133
+ "loss": 0.1149,
134
+ "step": 1050
135
+ },
136
+ {
137
+ "epoch": 2.58,
138
+ "learning_rate": 9.693846153846155e-06,
139
+ "loss": 0.1002,
140
+ "step": 1100
141
+ },
142
+ {
143
+ "epoch": 2.7,
144
+ "learning_rate": 9.668205128205129e-06,
145
+ "loss": 0.1142,
146
+ "step": 1150
147
+ },
148
+ {
149
+ "epoch": 2.82,
150
+ "learning_rate": 9.642564102564104e-06,
151
+ "loss": 0.1061,
152
+ "step": 1200
153
+ },
154
+ {
155
+ "epoch": 2.93,
156
+ "learning_rate": 9.616923076923077e-06,
157
+ "loss": 0.1017,
158
+ "step": 1250
159
+ },
160
+ {
161
+ "epoch": 3.05,
162
+ "learning_rate": 9.591282051282053e-06,
163
+ "loss": 0.0797,
164
+ "step": 1300
165
+ },
166
+ {
167
+ "epoch": 3.17,
168
+ "learning_rate": 9.565641025641026e-06,
169
+ "loss": 0.0545,
170
+ "step": 1350
171
+ },
172
+ {
173
+ "epoch": 3.29,
174
+ "learning_rate": 9.54e-06,
175
+ "loss": 0.0548,
176
+ "step": 1400
177
+ },
178
+ {
179
+ "epoch": 3.4,
180
+ "learning_rate": 9.514358974358975e-06,
181
+ "loss": 0.0638,
182
+ "step": 1450
183
+ },
184
+ {
185
+ "epoch": 3.52,
186
+ "learning_rate": 9.488717948717949e-06,
187
+ "loss": 0.0658,
188
+ "step": 1500
189
+ },
190
+ {
191
+ "epoch": 3.64,
192
+ "learning_rate": 9.463076923076924e-06,
193
+ "loss": 0.0674,
194
+ "step": 1550
195
+ },
196
+ {
197
+ "epoch": 3.76,
198
+ "learning_rate": 9.437435897435898e-06,
199
+ "loss": 0.0534,
200
+ "step": 1600
201
+ },
202
+ {
203
+ "epoch": 3.87,
204
+ "learning_rate": 9.411794871794872e-06,
205
+ "loss": 0.0806,
206
+ "step": 1650
207
+ },
208
+ {
209
+ "epoch": 3.99,
210
+ "learning_rate": 9.386153846153847e-06,
211
+ "loss": 0.0559,
212
+ "step": 1700
213
+ },
214
+ {
215
+ "epoch": 4.11,
216
+ "learning_rate": 9.36051282051282e-06,
217
+ "loss": 0.0427,
218
+ "step": 1750
219
+ },
220
+ {
221
+ "epoch": 4.23,
222
+ "learning_rate": 9.334871794871796e-06,
223
+ "loss": 0.0376,
224
+ "step": 1800
225
+ },
226
+ {
227
+ "epoch": 4.34,
228
+ "learning_rate": 9.30923076923077e-06,
229
+ "loss": 0.0478,
230
+ "step": 1850
231
+ },
232
+ {
233
+ "epoch": 4.46,
234
+ "learning_rate": 9.283589743589745e-06,
235
+ "loss": 0.0505,
236
+ "step": 1900
237
+ },
238
+ {
239
+ "epoch": 4.58,
240
+ "learning_rate": 9.257948717948719e-06,
241
+ "loss": 0.0397,
242
+ "step": 1950
243
+ },
244
+ {
245
+ "epoch": 4.69,
246
+ "learning_rate": 9.232307692307692e-06,
247
+ "loss": 0.0407,
248
+ "step": 2000
249
+ },
250
+ {
251
+ "epoch": 4.69,
252
+ "eval_loss": 0.2483939379453659,
253
+ "eval_runtime": 698.7199,
254
+ "eval_samples_per_second": 2.427,
255
+ "eval_steps_per_second": 0.607,
256
+ "eval_wer": 20.87667161961367,
257
+ "step": 2000
258
+ },
259
+ {
260
+ "epoch": 4.81,
261
+ "learning_rate": 9.206666666666668e-06,
262
+ "loss": 0.041,
263
+ "step": 2050
264
+ },
265
+ {
266
+ "epoch": 4.93,
267
+ "learning_rate": 9.181025641025641e-06,
268
+ "loss": 0.0508,
269
+ "step": 2100
270
+ },
271
+ {
272
+ "epoch": 5.05,
273
+ "learning_rate": 9.155384615384617e-06,
274
+ "loss": 0.0333,
275
+ "step": 2150
276
+ },
277
+ {
278
+ "epoch": 5.16,
279
+ "learning_rate": 9.12974358974359e-06,
280
+ "loss": 0.0272,
281
+ "step": 2200
282
+ },
283
+ {
284
+ "epoch": 5.28,
285
+ "learning_rate": 9.104102564102566e-06,
286
+ "loss": 0.0284,
287
+ "step": 2250
288
+ },
289
+ {
290
+ "epoch": 5.4,
291
+ "learning_rate": 9.07846153846154e-06,
292
+ "loss": 0.0233,
293
+ "step": 2300
294
+ },
295
+ {
296
+ "epoch": 5.52,
297
+ "learning_rate": 9.052820512820513e-06,
298
+ "loss": 0.0217,
299
+ "step": 2350
300
+ },
301
+ {
302
+ "epoch": 5.63,
303
+ "learning_rate": 9.027179487179488e-06,
304
+ "loss": 0.0304,
305
+ "step": 2400
306
+ },
307
+ {
308
+ "epoch": 5.75,
309
+ "learning_rate": 9.001538461538462e-06,
310
+ "loss": 0.0237,
311
+ "step": 2450
312
+ },
313
+ {
314
+ "epoch": 5.87,
315
+ "learning_rate": 8.975897435897437e-06,
316
+ "loss": 0.0267,
317
+ "step": 2500
318
+ },
319
+ {
320
+ "epoch": 5.99,
321
+ "learning_rate": 8.950256410256411e-06,
322
+ "loss": 0.0298,
323
+ "step": 2550
324
+ },
325
+ {
326
+ "epoch": 6.1,
327
+ "learning_rate": 8.924615384615385e-06,
328
+ "loss": 0.0264,
329
+ "step": 2600
330
+ },
331
+ {
332
+ "epoch": 6.22,
333
+ "learning_rate": 8.89897435897436e-06,
334
+ "loss": 0.022,
335
+ "step": 2650
336
+ },
337
+ {
338
+ "epoch": 6.34,
339
+ "learning_rate": 8.873333333333334e-06,
340
+ "loss": 0.0213,
341
+ "step": 2700
342
+ },
343
+ {
344
+ "epoch": 6.46,
345
+ "learning_rate": 8.847692307692309e-06,
346
+ "loss": 0.0195,
347
+ "step": 2750
348
+ },
349
+ {
350
+ "epoch": 6.57,
351
+ "learning_rate": 8.822051282051283e-06,
352
+ "loss": 0.0234,
353
+ "step": 2800
354
+ },
355
+ {
356
+ "epoch": 6.69,
357
+ "learning_rate": 8.796410256410258e-06,
358
+ "loss": 0.0214,
359
+ "step": 2850
360
+ },
361
+ {
362
+ "epoch": 6.81,
363
+ "learning_rate": 8.770769230769232e-06,
364
+ "loss": 0.0187,
365
+ "step": 2900
366
+ },
367
+ {
368
+ "epoch": 6.92,
369
+ "learning_rate": 8.745128205128205e-06,
370
+ "loss": 0.0226,
371
+ "step": 2950
372
+ },
373
+ {
374
+ "epoch": 7.04,
375
+ "learning_rate": 8.71948717948718e-06,
376
+ "loss": 0.0154,
377
+ "step": 3000
378
+ },
379
+ {
380
+ "epoch": 7.16,
381
+ "learning_rate": 8.693846153846154e-06,
382
+ "loss": 0.0187,
383
+ "step": 3050
384
+ },
385
+ {
386
+ "epoch": 7.28,
387
+ "learning_rate": 8.66820512820513e-06,
388
+ "loss": 0.0147,
389
+ "step": 3100
390
+ },
391
+ {
392
+ "epoch": 7.39,
393
+ "learning_rate": 8.642564102564103e-06,
394
+ "loss": 0.0104,
395
+ "step": 3150
396
+ },
397
+ {
398
+ "epoch": 7.51,
399
+ "learning_rate": 8.616923076923077e-06,
400
+ "loss": 0.0134,
401
+ "step": 3200
402
+ },
403
+ {
404
+ "epoch": 7.63,
405
+ "learning_rate": 8.591282051282052e-06,
406
+ "loss": 0.0178,
407
+ "step": 3250
408
+ },
409
+ {
410
+ "epoch": 7.75,
411
+ "learning_rate": 8.565641025641026e-06,
412
+ "loss": 0.0192,
413
+ "step": 3300
414
+ },
415
+ {
416
+ "epoch": 7.86,
417
+ "learning_rate": 8.540000000000001e-06,
418
+ "loss": 0.0169,
419
+ "step": 3350
420
+ },
421
+ {
422
+ "epoch": 7.98,
423
+ "learning_rate": 8.514358974358975e-06,
424
+ "loss": 0.0222,
425
+ "step": 3400
426
+ },
427
+ {
428
+ "epoch": 8.1,
429
+ "learning_rate": 8.48871794871795e-06,
430
+ "loss": 0.0132,
431
+ "step": 3450
432
+ },
433
+ {
434
+ "epoch": 8.22,
435
+ "learning_rate": 8.463076923076924e-06,
436
+ "loss": 0.0142,
437
+ "step": 3500
438
+ },
439
+ {
440
+ "epoch": 8.33,
441
+ "learning_rate": 8.437435897435898e-06,
442
+ "loss": 0.0119,
443
+ "step": 3550
444
+ },
445
+ {
446
+ "epoch": 8.45,
447
+ "learning_rate": 8.411794871794873e-06,
448
+ "loss": 0.0109,
449
+ "step": 3600
450
+ },
451
+ {
452
+ "epoch": 8.57,
453
+ "learning_rate": 8.386153846153847e-06,
454
+ "loss": 0.0132,
455
+ "step": 3650
456
+ },
457
+ {
458
+ "epoch": 8.69,
459
+ "learning_rate": 8.360512820512822e-06,
460
+ "loss": 0.0083,
461
+ "step": 3700
462
+ },
463
+ {
464
+ "epoch": 8.8,
465
+ "learning_rate": 8.334871794871796e-06,
466
+ "loss": 0.0106,
467
+ "step": 3750
468
+ },
469
+ {
470
+ "epoch": 8.92,
471
+ "learning_rate": 8.30923076923077e-06,
472
+ "loss": 0.0127,
473
+ "step": 3800
474
+ },
475
+ {
476
+ "epoch": 9.04,
477
+ "learning_rate": 8.283589743589745e-06,
478
+ "loss": 0.0115,
479
+ "step": 3850
480
+ },
481
+ {
482
+ "epoch": 9.15,
483
+ "learning_rate": 8.257948717948718e-06,
484
+ "loss": 0.0079,
485
+ "step": 3900
486
+ },
487
+ {
488
+ "epoch": 9.27,
489
+ "learning_rate": 8.232307692307694e-06,
490
+ "loss": 0.0078,
491
+ "step": 3950
492
+ },
493
+ {
494
+ "epoch": 9.39,
495
+ "learning_rate": 8.206666666666667e-06,
496
+ "loss": 0.0128,
497
+ "step": 4000
498
+ },
499
+ {
500
+ "epoch": 9.39,
501
+ "eval_loss": 0.2795361876487732,
502
+ "eval_runtime": 707.7086,
503
+ "eval_samples_per_second": 2.396,
504
+ "eval_steps_per_second": 0.599,
505
+ "eval_wer": 21.201708766716195,
506
+ "step": 4000
507
+ },
508
+ {
509
+ "epoch": 9.51,
510
+ "learning_rate": 8.181025641025642e-06,
511
+ "loss": 0.0122,
512
+ "step": 4050
513
+ },
514
+ {
515
+ "epoch": 9.62,
516
+ "learning_rate": 8.155384615384616e-06,
517
+ "loss": 0.0153,
518
+ "step": 4100
519
+ },
520
+ {
521
+ "epoch": 9.74,
522
+ "learning_rate": 8.13025641025641e-06,
523
+ "loss": 0.0136,
524
+ "step": 4150
525
+ },
526
+ {
527
+ "epoch": 9.86,
528
+ "learning_rate": 8.104615384615386e-06,
529
+ "loss": 0.0053,
530
+ "step": 4200
531
+ },
532
+ {
533
+ "epoch": 9.98,
534
+ "learning_rate": 8.07897435897436e-06,
535
+ "loss": 0.0122,
536
+ "step": 4250
537
+ },
538
+ {
539
+ "epoch": 10.09,
540
+ "learning_rate": 8.053333333333335e-06,
541
+ "loss": 0.0075,
542
+ "step": 4300
543
+ },
544
+ {
545
+ "epoch": 10.21,
546
+ "learning_rate": 8.027692307692308e-06,
547
+ "loss": 0.0056,
548
+ "step": 4350
549
+ },
550
+ {
551
+ "epoch": 10.33,
552
+ "learning_rate": 8.002051282051284e-06,
553
+ "loss": 0.0061,
554
+ "step": 4400
555
+ },
556
+ {
557
+ "epoch": 10.45,
558
+ "learning_rate": 7.976410256410257e-06,
559
+ "loss": 0.0039,
560
+ "step": 4450
561
+ },
562
+ {
563
+ "epoch": 10.56,
564
+ "learning_rate": 7.950769230769233e-06,
565
+ "loss": 0.0098,
566
+ "step": 4500
567
+ },
568
+ {
569
+ "epoch": 10.68,
570
+ "learning_rate": 7.925128205128205e-06,
571
+ "loss": 0.01,
572
+ "step": 4550
573
+ },
574
+ {
575
+ "epoch": 10.8,
576
+ "learning_rate": 7.89948717948718e-06,
577
+ "loss": 0.0062,
578
+ "step": 4600
579
+ },
580
+ {
581
+ "epoch": 10.92,
582
+ "learning_rate": 7.873846153846154e-06,
583
+ "loss": 0.009,
584
+ "step": 4650
585
+ },
586
+ {
587
+ "epoch": 11.03,
588
+ "learning_rate": 7.848205128205129e-06,
589
+ "loss": 0.0075,
590
+ "step": 4700
591
+ },
592
+ {
593
+ "epoch": 11.15,
594
+ "learning_rate": 7.822564102564103e-06,
595
+ "loss": 0.0066,
596
+ "step": 4750
597
+ },
598
+ {
599
+ "epoch": 11.27,
600
+ "learning_rate": 7.796923076923078e-06,
601
+ "loss": 0.0076,
602
+ "step": 4800
603
+ },
604
+ {
605
+ "epoch": 11.38,
606
+ "learning_rate": 7.771282051282052e-06,
607
+ "loss": 0.0087,
608
+ "step": 4850
609
+ },
610
+ {
611
+ "epoch": 11.5,
612
+ "learning_rate": 7.745641025641027e-06,
613
+ "loss": 0.0064,
614
+ "step": 4900
615
+ },
616
+ {
617
+ "epoch": 11.62,
618
+ "learning_rate": 7.72e-06,
619
+ "loss": 0.0049,
620
+ "step": 4950
621
+ },
622
+ {
623
+ "epoch": 11.74,
624
+ "learning_rate": 7.694358974358976e-06,
625
+ "loss": 0.0062,
626
+ "step": 5000
627
+ },
628
+ {
629
+ "epoch": 11.85,
630
+ "learning_rate": 7.66871794871795e-06,
631
+ "loss": 0.0092,
632
+ "step": 5050
633
+ },
634
+ {
635
+ "epoch": 11.97,
636
+ "learning_rate": 7.643076923076925e-06,
637
+ "loss": 0.0082,
638
+ "step": 5100
639
+ },
640
+ {
641
+ "epoch": 12.09,
642
+ "learning_rate": 7.617435897435898e-06,
643
+ "loss": 0.0071,
644
+ "step": 5150
645
+ },
646
+ {
647
+ "epoch": 12.21,
648
+ "learning_rate": 7.591794871794872e-06,
649
+ "loss": 0.0045,
650
+ "step": 5200
651
+ },
652
+ {
653
+ "epoch": 12.32,
654
+ "learning_rate": 7.566153846153847e-06,
655
+ "loss": 0.009,
656
+ "step": 5250
657
+ },
658
+ {
659
+ "epoch": 12.44,
660
+ "learning_rate": 7.540512820512821e-06,
661
+ "loss": 0.0079,
662
+ "step": 5300
663
+ },
664
+ {
665
+ "epoch": 12.56,
666
+ "learning_rate": 7.514871794871795e-06,
667
+ "loss": 0.0115,
668
+ "step": 5350
669
+ },
670
+ {
671
+ "epoch": 12.68,
672
+ "learning_rate": 7.489230769230769e-06,
673
+ "loss": 0.0073,
674
+ "step": 5400
675
+ },
676
+ {
677
+ "epoch": 12.79,
678
+ "learning_rate": 7.463589743589744e-06,
679
+ "loss": 0.0068,
680
+ "step": 5450
681
+ },
682
+ {
683
+ "epoch": 12.91,
684
+ "learning_rate": 7.437948717948718e-06,
685
+ "loss": 0.0094,
686
+ "step": 5500
687
+ },
688
+ {
689
+ "epoch": 13.03,
690
+ "learning_rate": 7.412307692307693e-06,
691
+ "loss": 0.0091,
692
+ "step": 5550
693
+ },
694
+ {
695
+ "epoch": 13.15,
696
+ "learning_rate": 7.386666666666667e-06,
697
+ "loss": 0.0045,
698
+ "step": 5600
699
+ },
700
+ {
701
+ "epoch": 13.26,
702
+ "learning_rate": 7.361025641025642e-06,
703
+ "loss": 0.004,
704
+ "step": 5650
705
+ },
706
+ {
707
+ "epoch": 13.38,
708
+ "learning_rate": 7.335384615384616e-06,
709
+ "loss": 0.0042,
710
+ "step": 5700
711
+ },
712
+ {
713
+ "epoch": 13.5,
714
+ "learning_rate": 7.309743589743591e-06,
715
+ "loss": 0.0067,
716
+ "step": 5750
717
+ },
718
+ {
719
+ "epoch": 13.62,
720
+ "learning_rate": 7.2841025641025645e-06,
721
+ "loss": 0.0077,
722
+ "step": 5800
723
+ },
724
+ {
725
+ "epoch": 13.73,
726
+ "learning_rate": 7.258461538461539e-06,
727
+ "loss": 0.014,
728
+ "step": 5850
729
+ },
730
+ {
731
+ "epoch": 13.85,
732
+ "learning_rate": 7.2328205128205135e-06,
733
+ "loss": 0.0076,
734
+ "step": 5900
735
+ },
736
+ {
737
+ "epoch": 13.97,
738
+ "learning_rate": 7.207179487179487e-06,
739
+ "loss": 0.0059,
740
+ "step": 5950
741
+ },
742
+ {
743
+ "epoch": 14.08,
744
+ "learning_rate": 7.181538461538462e-06,
745
+ "loss": 0.0041,
746
+ "step": 6000
747
+ },
748
+ {
749
+ "epoch": 14.08,
750
+ "eval_loss": 0.27444717288017273,
751
+ "eval_runtime": 695.0914,
752
+ "eval_samples_per_second": 2.44,
753
+ "eval_steps_per_second": 0.61,
754
+ "eval_wer": 19.13075780089153,
755
+ "step": 6000
756
+ },
757
+ {
758
+ "epoch": 14.2,
759
+ "learning_rate": 7.155897435897436e-06,
760
+ "loss": 0.0077,
761
+ "step": 6050
762
+ },
763
+ {
764
+ "epoch": 14.32,
765
+ "learning_rate": 7.130256410256411e-06,
766
+ "loss": 0.0055,
767
+ "step": 6100
768
+ },
769
+ {
770
+ "epoch": 14.44,
771
+ "learning_rate": 7.104615384615385e-06,
772
+ "loss": 0.0069,
773
+ "step": 6150
774
+ },
775
+ {
776
+ "epoch": 14.55,
777
+ "learning_rate": 7.07897435897436e-06,
778
+ "loss": 0.0049,
779
+ "step": 6200
780
+ },
781
+ {
782
+ "epoch": 14.67,
783
+ "learning_rate": 7.053333333333334e-06,
784
+ "loss": 0.006,
785
+ "step": 6250
786
+ },
787
+ {
788
+ "epoch": 14.79,
789
+ "learning_rate": 7.027692307692309e-06,
790
+ "loss": 0.0057,
791
+ "step": 6300
792
+ },
793
+ {
794
+ "epoch": 14.91,
795
+ "learning_rate": 7.002051282051283e-06,
796
+ "loss": 0.0066,
797
+ "step": 6350
798
+ },
799
+ {
800
+ "epoch": 15.02,
801
+ "learning_rate": 6.976410256410257e-06,
802
+ "loss": 0.0049,
803
+ "step": 6400
804
+ },
805
+ {
806
+ "epoch": 15.14,
807
+ "learning_rate": 6.950769230769231e-06,
808
+ "loss": 0.0034,
809
+ "step": 6450
810
+ },
811
+ {
812
+ "epoch": 15.26,
813
+ "learning_rate": 6.925128205128206e-06,
814
+ "loss": 0.004,
815
+ "step": 6500
816
+ },
817
+ {
818
+ "epoch": 15.38,
819
+ "learning_rate": 6.899487179487179e-06,
820
+ "loss": 0.0029,
821
+ "step": 6550
822
+ },
823
+ {
824
+ "epoch": 15.49,
825
+ "learning_rate": 6.873846153846154e-06,
826
+ "loss": 0.004,
827
+ "step": 6600
828
+ },
829
+ {
830
+ "epoch": 15.61,
831
+ "learning_rate": 6.848205128205128e-06,
832
+ "loss": 0.0044,
833
+ "step": 6650
834
+ },
835
+ {
836
+ "epoch": 15.73,
837
+ "learning_rate": 6.822564102564103e-06,
838
+ "loss": 0.0057,
839
+ "step": 6700
840
+ },
841
+ {
842
+ "epoch": 15.85,
843
+ "learning_rate": 6.796923076923077e-06,
844
+ "loss": 0.0052,
845
+ "step": 6750
846
+ },
847
+ {
848
+ "epoch": 15.96,
849
+ "learning_rate": 6.771282051282052e-06,
850
+ "loss": 0.0028,
851
+ "step": 6800
852
+ },
853
+ {
854
+ "epoch": 16.08,
855
+ "learning_rate": 6.745641025641026e-06,
856
+ "loss": 0.0049,
857
+ "step": 6850
858
+ },
859
+ {
860
+ "epoch": 16.2,
861
+ "learning_rate": 6.720000000000001e-06,
862
+ "loss": 0.0063,
863
+ "step": 6900
864
+ },
865
+ {
866
+ "epoch": 16.31,
867
+ "learning_rate": 6.694358974358975e-06,
868
+ "loss": 0.0049,
869
+ "step": 6950
870
+ },
871
+ {
872
+ "epoch": 16.43,
873
+ "learning_rate": 6.668717948717949e-06,
874
+ "loss": 0.005,
875
+ "step": 7000
876
+ },
877
+ {
878
+ "epoch": 16.55,
879
+ "learning_rate": 6.6430769230769235e-06,
880
+ "loss": 0.0055,
881
+ "step": 7050
882
+ },
883
+ {
884
+ "epoch": 16.67,
885
+ "learning_rate": 6.617435897435898e-06,
886
+ "loss": 0.0039,
887
+ "step": 7100
888
+ },
889
+ {
890
+ "epoch": 16.78,
891
+ "learning_rate": 6.5917948717948725e-06,
892
+ "loss": 0.0042,
893
+ "step": 7150
894
+ },
895
+ {
896
+ "epoch": 16.9,
897
+ "learning_rate": 6.566153846153846e-06,
898
+ "loss": 0.0041,
899
+ "step": 7200
900
+ },
901
+ {
902
+ "epoch": 17.02,
903
+ "learning_rate": 6.540512820512821e-06,
904
+ "loss": 0.0027,
905
+ "step": 7250
906
+ },
907
+ {
908
+ "epoch": 17.14,
909
+ "learning_rate": 6.514871794871795e-06,
910
+ "loss": 0.0031,
911
+ "step": 7300
912
+ },
913
+ {
914
+ "epoch": 17.25,
915
+ "learning_rate": 6.48923076923077e-06,
916
+ "loss": 0.0049,
917
+ "step": 7350
918
+ },
919
+ {
920
+ "epoch": 17.37,
921
+ "learning_rate": 6.463589743589744e-06,
922
+ "loss": 0.0046,
923
+ "step": 7400
924
+ },
925
+ {
926
+ "epoch": 17.49,
927
+ "learning_rate": 6.437948717948719e-06,
928
+ "loss": 0.0043,
929
+ "step": 7450
930
+ },
931
+ {
932
+ "epoch": 17.61,
933
+ "learning_rate": 6.412307692307693e-06,
934
+ "loss": 0.0041,
935
+ "step": 7500
936
+ },
937
+ {
938
+ "epoch": 17.72,
939
+ "learning_rate": 6.386666666666668e-06,
940
+ "loss": 0.0007,
941
+ "step": 7550
942
+ },
943
+ {
944
+ "epoch": 17.84,
945
+ "learning_rate": 6.361025641025641e-06,
946
+ "loss": 0.0049,
947
+ "step": 7600
948
+ },
949
+ {
950
+ "epoch": 17.96,
951
+ "learning_rate": 6.335384615384616e-06,
952
+ "loss": 0.0023,
953
+ "step": 7650
954
+ },
955
+ {
956
+ "epoch": 18.08,
957
+ "learning_rate": 6.30974358974359e-06,
958
+ "loss": 0.0039,
959
+ "step": 7700
960
+ },
961
+ {
962
+ "epoch": 18.19,
963
+ "learning_rate": 6.284102564102565e-06,
964
+ "loss": 0.0019,
965
+ "step": 7750
966
+ },
967
+ {
968
+ "epoch": 18.31,
969
+ "learning_rate": 6.258461538461538e-06,
970
+ "loss": 0.0045,
971
+ "step": 7800
972
+ },
973
+ {
974
+ "epoch": 18.43,
975
+ "learning_rate": 6.232820512820513e-06,
976
+ "loss": 0.0056,
977
+ "step": 7850
978
+ },
979
+ {
980
+ "epoch": 18.54,
981
+ "learning_rate": 6.207179487179487e-06,
982
+ "loss": 0.0044,
983
+ "step": 7900
984
+ },
985
+ {
986
+ "epoch": 18.66,
987
+ "learning_rate": 6.181538461538462e-06,
988
+ "loss": 0.003,
989
+ "step": 7950
990
+ },
991
+ {
992
+ "epoch": 18.78,
993
+ "learning_rate": 6.155897435897436e-06,
994
+ "loss": 0.0017,
995
+ "step": 8000
996
+ },
997
+ {
998
+ "epoch": 18.78,
999
+ "eval_loss": 0.2759210169315338,
1000
+ "eval_runtime": 702.2209,
1001
+ "eval_samples_per_second": 2.415,
1002
+ "eval_steps_per_second": 0.604,
1003
+ "eval_wer": 17.997771173848438,
1004
+ "step": 8000
1005
+ },
1006
+ {
1007
+ "epoch": 18.9,
1008
+ "learning_rate": 6.130256410256411e-06,
1009
+ "loss": 0.0053,
1010
+ "step": 8050
1011
+ },
1012
+ {
1013
+ "epoch": 19.01,
1014
+ "learning_rate": 6.104615384615385e-06,
1015
+ "loss": 0.0034,
1016
+ "step": 8100
1017
+ },
1018
+ {
1019
+ "epoch": 19.13,
1020
+ "learning_rate": 6.07897435897436e-06,
1021
+ "loss": 0.0032,
1022
+ "step": 8150
1023
+ },
1024
+ {
1025
+ "epoch": 19.25,
1026
+ "learning_rate": 6.0533333333333335e-06,
1027
+ "loss": 0.0025,
1028
+ "step": 8200
1029
+ },
1030
+ {
1031
+ "epoch": 19.37,
1032
+ "learning_rate": 6.027692307692308e-06,
1033
+ "loss": 0.005,
1034
+ "step": 8250
1035
+ },
1036
+ {
1037
+ "epoch": 19.48,
1038
+ "learning_rate": 6.0020512820512825e-06,
1039
+ "loss": 0.0027,
1040
+ "step": 8300
1041
+ },
1042
+ {
1043
+ "epoch": 19.6,
1044
+ "learning_rate": 5.976410256410257e-06,
1045
+ "loss": 0.0029,
1046
+ "step": 8350
1047
+ },
1048
+ {
1049
+ "epoch": 19.72,
1050
+ "learning_rate": 5.950769230769231e-06,
1051
+ "loss": 0.0041,
1052
+ "step": 8400
1053
+ },
1054
+ {
1055
+ "epoch": 19.84,
1056
+ "learning_rate": 5.925128205128205e-06,
1057
+ "loss": 0.0027,
1058
+ "step": 8450
1059
+ },
1060
+ {
1061
+ "epoch": 19.95,
1062
+ "learning_rate": 5.89948717948718e-06,
1063
+ "loss": 0.0045,
1064
+ "step": 8500
1065
+ },
1066
+ {
1067
+ "epoch": 20.07,
1068
+ "learning_rate": 5.873846153846154e-06,
1069
+ "loss": 0.005,
1070
+ "step": 8550
1071
+ },
1072
+ {
1073
+ "epoch": 20.19,
1074
+ "learning_rate": 5.848205128205129e-06,
1075
+ "loss": 0.0011,
1076
+ "step": 8600
1077
+ },
1078
+ {
1079
+ "epoch": 20.31,
1080
+ "learning_rate": 5.822564102564103e-06,
1081
+ "loss": 0.0021,
1082
+ "step": 8650
1083
+ },
1084
+ {
1085
+ "epoch": 20.42,
1086
+ "learning_rate": 5.796923076923078e-06,
1087
+ "loss": 0.0034,
1088
+ "step": 8700
1089
+ },
1090
+ {
1091
+ "epoch": 20.54,
1092
+ "learning_rate": 5.771282051282052e-06,
1093
+ "loss": 0.004,
1094
+ "step": 8750
1095
+ },
1096
+ {
1097
+ "epoch": 20.66,
1098
+ "learning_rate": 5.745641025641027e-06,
1099
+ "loss": 0.0022,
1100
+ "step": 8800
1101
+ },
1102
+ {
1103
+ "epoch": 20.77,
1104
+ "learning_rate": 5.72e-06,
1105
+ "loss": 0.0021,
1106
+ "step": 8850
1107
+ },
1108
+ {
1109
+ "epoch": 20.89,
1110
+ "learning_rate": 5.694358974358975e-06,
1111
+ "loss": 0.0029,
1112
+ "step": 8900
1113
+ },
1114
+ {
1115
+ "epoch": 21.01,
1116
+ "learning_rate": 5.668717948717949e-06,
1117
+ "loss": 0.0038,
1118
+ "step": 8950
1119
+ },
1120
+ {
1121
+ "epoch": 21.13,
1122
+ "learning_rate": 5.643076923076923e-06,
1123
+ "loss": 0.0019,
1124
+ "step": 9000
1125
+ },
1126
+ {
1127
+ "epoch": 21.24,
1128
+ "learning_rate": 5.6174358974358974e-06,
1129
+ "loss": 0.0023,
1130
+ "step": 9050
1131
+ },
1132
+ {
1133
+ "epoch": 21.36,
1134
+ "learning_rate": 5.591794871794872e-06,
1135
+ "loss": 0.0034,
1136
+ "step": 9100
1137
+ },
1138
+ {
1139
+ "epoch": 21.48,
1140
+ "learning_rate": 5.566666666666667e-06,
1141
+ "loss": 0.006,
1142
+ "step": 9150
1143
+ },
1144
+ {
1145
+ "epoch": 21.6,
1146
+ "learning_rate": 5.5410256410256415e-06,
1147
+ "loss": 0.0025,
1148
+ "step": 9200
1149
+ },
1150
+ {
1151
+ "epoch": 21.71,
1152
+ "learning_rate": 5.515384615384616e-06,
1153
+ "loss": 0.0019,
1154
+ "step": 9250
1155
+ },
1156
+ {
1157
+ "epoch": 21.83,
1158
+ "learning_rate": 5.4897435897435905e-06,
1159
+ "loss": 0.0013,
1160
+ "step": 9300
1161
+ },
1162
+ {
1163
+ "epoch": 21.95,
1164
+ "learning_rate": 5.464102564102565e-06,
1165
+ "loss": 0.0009,
1166
+ "step": 9350
1167
+ },
1168
+ {
1169
+ "epoch": 22.07,
1170
+ "learning_rate": 5.4384615384615395e-06,
1171
+ "loss": 0.0032,
1172
+ "step": 9400
1173
+ },
1174
+ {
1175
+ "epoch": 22.18,
1176
+ "learning_rate": 5.412820512820514e-06,
1177
+ "loss": 0.003,
1178
+ "step": 9450
1179
+ },
1180
+ {
1181
+ "epoch": 22.3,
1182
+ "learning_rate": 5.387179487179488e-06,
1183
+ "loss": 0.0068,
1184
+ "step": 9500
1185
+ },
1186
+ {
1187
+ "epoch": 22.42,
1188
+ "learning_rate": 5.361538461538462e-06,
1189
+ "loss": 0.0032,
1190
+ "step": 9550
1191
+ },
1192
+ {
1193
+ "epoch": 22.54,
1194
+ "learning_rate": 5.335897435897436e-06,
1195
+ "loss": 0.0065,
1196
+ "step": 9600
1197
+ },
1198
+ {
1199
+ "epoch": 22.65,
1200
+ "learning_rate": 5.31025641025641e-06,
1201
+ "loss": 0.0035,
1202
+ "step": 9650
1203
+ },
1204
+ {
1205
+ "epoch": 22.77,
1206
+ "learning_rate": 5.284615384615385e-06,
1207
+ "loss": 0.0014,
1208
+ "step": 9700
1209
+ },
1210
+ {
1211
+ "epoch": 22.89,
1212
+ "learning_rate": 5.258974358974359e-06,
1213
+ "loss": 0.0017,
1214
+ "step": 9750
1215
+ },
1216
+ {
1217
+ "epoch": 23.0,
1218
+ "learning_rate": 5.233333333333334e-06,
1219
+ "loss": 0.0019,
1220
+ "step": 9800
1221
+ },
1222
+ {
1223
+ "epoch": 23.12,
1224
+ "learning_rate": 5.207692307692308e-06,
1225
+ "loss": 0.0013,
1226
+ "step": 9850
1227
+ },
1228
+ {
1229
+ "epoch": 23.24,
1230
+ "learning_rate": 5.182051282051283e-06,
1231
+ "loss": 0.001,
1232
+ "step": 9900
1233
+ },
1234
+ {
1235
+ "epoch": 23.36,
1236
+ "learning_rate": 5.156410256410257e-06,
1237
+ "loss": 0.0003,
1238
+ "step": 9950
1239
+ },
1240
+ {
1241
+ "epoch": 23.47,
1242
+ "learning_rate": 5.130769230769232e-06,
1243
+ "loss": 0.0005,
1244
+ "step": 10000
1245
+ },
1246
+ {
1247
+ "epoch": 23.47,
1248
+ "eval_loss": 0.2751367390155792,
1249
+ "eval_runtime": 702.4615,
1250
+ "eval_samples_per_second": 2.414,
1251
+ "eval_steps_per_second": 0.604,
1252
+ "eval_wer": 18.545690936106986,
1253
+ "step": 10000
1254
+ },
1255
+ {
1256
+ "epoch": 23.59,
1257
+ "learning_rate": 5.105128205128206e-06,
1258
+ "loss": 0.0018,
1259
+ "step": 10050
1260
+ },
1261
+ {
1262
+ "epoch": 23.71,
1263
+ "learning_rate": 5.07948717948718e-06,
1264
+ "loss": 0.0021,
1265
+ "step": 10100
1266
+ },
1267
+ {
1268
+ "epoch": 23.83,
1269
+ "learning_rate": 5.053846153846154e-06,
1270
+ "loss": 0.0022,
1271
+ "step": 10150
1272
+ },
1273
+ {
1274
+ "epoch": 23.94,
1275
+ "learning_rate": 5.028205128205128e-06,
1276
+ "loss": 0.0012,
1277
+ "step": 10200
1278
+ },
1279
+ {
1280
+ "epoch": 24.06,
1281
+ "learning_rate": 5.0025641025641025e-06,
1282
+ "loss": 0.0008,
1283
+ "step": 10250
1284
+ },
1285
+ {
1286
+ "epoch": 24.18,
1287
+ "learning_rate": 4.976923076923078e-06,
1288
+ "loss": 0.0009,
1289
+ "step": 10300
1290
+ },
1291
+ {
1292
+ "epoch": 24.3,
1293
+ "learning_rate": 4.9512820512820515e-06,
1294
+ "loss": 0.0011,
1295
+ "step": 10350
1296
+ },
1297
+ {
1298
+ "epoch": 24.41,
1299
+ "learning_rate": 4.925641025641026e-06,
1300
+ "loss": 0.0035,
1301
+ "step": 10400
1302
+ },
1303
+ {
1304
+ "epoch": 24.53,
1305
+ "learning_rate": 4.9000000000000005e-06,
1306
+ "loss": 0.002,
1307
+ "step": 10450
1308
+ },
1309
+ {
1310
+ "epoch": 24.65,
1311
+ "learning_rate": 4.874358974358975e-06,
1312
+ "loss": 0.0019,
1313
+ "step": 10500
1314
+ },
1315
+ {
1316
+ "epoch": 24.77,
1317
+ "learning_rate": 4.8487179487179495e-06,
1318
+ "loss": 0.0048,
1319
+ "step": 10550
1320
+ },
1321
+ {
1322
+ "epoch": 24.88,
1323
+ "learning_rate": 4.823076923076924e-06,
1324
+ "loss": 0.0031,
1325
+ "step": 10600
1326
+ },
1327
+ {
1328
+ "epoch": 25.0,
1329
+ "learning_rate": 4.7974358974358985e-06,
1330
+ "loss": 0.0003,
1331
+ "step": 10650
1332
+ },
1333
+ {
1334
+ "epoch": 25.12,
1335
+ "learning_rate": 4.771794871794872e-06,
1336
+ "loss": 0.0016,
1337
+ "step": 10700
1338
+ },
1339
+ {
1340
+ "epoch": 25.23,
1341
+ "learning_rate": 4.746153846153847e-06,
1342
+ "loss": 0.0009,
1343
+ "step": 10750
1344
+ },
1345
+ {
1346
+ "epoch": 25.35,
1347
+ "learning_rate": 4.720512820512821e-06,
1348
+ "loss": 0.0007,
1349
+ "step": 10800
1350
+ },
1351
+ {
1352
+ "epoch": 25.47,
1353
+ "learning_rate": 4.694871794871796e-06,
1354
+ "loss": 0.0011,
1355
+ "step": 10850
1356
+ },
1357
+ {
1358
+ "epoch": 25.59,
1359
+ "learning_rate": 4.66923076923077e-06,
1360
+ "loss": 0.0014,
1361
+ "step": 10900
1362
+ },
1363
+ {
1364
+ "epoch": 25.7,
1365
+ "learning_rate": 4.643589743589745e-06,
1366
+ "loss": 0.0016,
1367
+ "step": 10950
1368
+ },
1369
+ {
1370
+ "epoch": 25.82,
1371
+ "learning_rate": 4.617948717948718e-06,
1372
+ "loss": 0.0013,
1373
+ "step": 11000
1374
+ },
1375
+ {
1376
+ "epoch": 25.94,
1377
+ "learning_rate": 4.592307692307693e-06,
1378
+ "loss": 0.0019,
1379
+ "step": 11050
1380
+ },
1381
+ {
1382
+ "epoch": 26.06,
1383
+ "learning_rate": 4.566666666666667e-06,
1384
+ "loss": 0.0031,
1385
+ "step": 11100
1386
+ },
1387
+ {
1388
+ "epoch": 26.17,
1389
+ "learning_rate": 4.541025641025642e-06,
1390
+ "loss": 0.0033,
1391
+ "step": 11150
1392
+ },
1393
+ {
1394
+ "epoch": 26.29,
1395
+ "learning_rate": 4.515384615384616e-06,
1396
+ "loss": 0.0005,
1397
+ "step": 11200
1398
+ },
1399
+ {
1400
+ "epoch": 26.41,
1401
+ "learning_rate": 4.489743589743591e-06,
1402
+ "loss": 0.0025,
1403
+ "step": 11250
1404
+ },
1405
+ {
1406
+ "epoch": 26.53,
1407
+ "learning_rate": 4.464102564102564e-06,
1408
+ "loss": 0.0034,
1409
+ "step": 11300
1410
+ },
1411
+ {
1412
+ "epoch": 26.64,
1413
+ "learning_rate": 4.438461538461539e-06,
1414
+ "loss": 0.0011,
1415
+ "step": 11350
1416
+ },
1417
+ {
1418
+ "epoch": 26.76,
1419
+ "learning_rate": 4.412820512820513e-06,
1420
+ "loss": 0.0009,
1421
+ "step": 11400
1422
+ },
1423
+ {
1424
+ "epoch": 26.88,
1425
+ "learning_rate": 4.387179487179488e-06,
1426
+ "loss": 0.0009,
1427
+ "step": 11450
1428
+ },
1429
+ {
1430
+ "epoch": 27.0,
1431
+ "learning_rate": 4.361538461538462e-06,
1432
+ "loss": 0.0003,
1433
+ "step": 11500
1434
+ },
1435
+ {
1436
+ "epoch": 27.11,
1437
+ "learning_rate": 4.335897435897437e-06,
1438
+ "loss": 0.0011,
1439
+ "step": 11550
1440
+ },
1441
+ {
1442
+ "epoch": 27.23,
1443
+ "learning_rate": 4.3102564102564105e-06,
1444
+ "loss": 0.0014,
1445
+ "step": 11600
1446
+ },
1447
+ {
1448
+ "epoch": 27.35,
1449
+ "learning_rate": 4.284615384615385e-06,
1450
+ "loss": 0.0006,
1451
+ "step": 11650
1452
+ },
1453
+ {
1454
+ "epoch": 27.46,
1455
+ "learning_rate": 4.2589743589743595e-06,
1456
+ "loss": 0.0011,
1457
+ "step": 11700
1458
+ },
1459
+ {
1460
+ "epoch": 27.58,
1461
+ "learning_rate": 4.233333333333334e-06,
1462
+ "loss": 0.0022,
1463
+ "step": 11750
1464
+ },
1465
+ {
1466
+ "epoch": 27.7,
1467
+ "learning_rate": 4.2076923076923085e-06,
1468
+ "loss": 0.0015,
1469
+ "step": 11800
1470
+ },
1471
+ {
1472
+ "epoch": 27.82,
1473
+ "learning_rate": 4.182051282051283e-06,
1474
+ "loss": 0.0019,
1475
+ "step": 11850
1476
+ },
1477
+ {
1478
+ "epoch": 27.93,
1479
+ "learning_rate": 4.156410256410257e-06,
1480
+ "loss": 0.0004,
1481
+ "step": 11900
1482
+ },
1483
+ {
1484
+ "epoch": 28.05,
1485
+ "learning_rate": 4.130769230769231e-06,
1486
+ "loss": 0.001,
1487
+ "step": 11950
1488
+ },
1489
+ {
1490
+ "epoch": 28.17,
1491
+ "learning_rate": 4.105128205128206e-06,
1492
+ "loss": 0.0015,
1493
+ "step": 12000
1494
+ },
1495
+ {
1496
+ "epoch": 28.17,
1497
+ "eval_loss": 0.2928332984447479,
1498
+ "eval_runtime": 718.101,
1499
+ "eval_samples_per_second": 2.362,
1500
+ "eval_steps_per_second": 0.59,
1501
+ "eval_wer": 19.20505200594354,
1502
+ "step": 12000
1503
+ },
1504
+ {
1505
+ "epoch": 28.29,
1506
+ "learning_rate": 4.07948717948718e-06,
1507
+ "loss": 0.0009,
1508
+ "step": 12050
1509
+ },
1510
+ {
1511
+ "epoch": 28.4,
1512
+ "learning_rate": 4.053846153846155e-06,
1513
+ "loss": 0.0028,
1514
+ "step": 12100
1515
+ },
1516
+ {
1517
+ "epoch": 28.52,
1518
+ "learning_rate": 4.028205128205129e-06,
1519
+ "loss": 0.0023,
1520
+ "step": 12150
1521
+ },
1522
+ {
1523
+ "epoch": 28.64,
1524
+ "learning_rate": 4.002564102564103e-06,
1525
+ "loss": 0.0016,
1526
+ "step": 12200
1527
+ },
1528
+ {
1529
+ "epoch": 28.76,
1530
+ "learning_rate": 3.976923076923077e-06,
1531
+ "loss": 0.0001,
1532
+ "step": 12250
1533
+ },
1534
+ {
1535
+ "epoch": 28.87,
1536
+ "learning_rate": 3.951282051282052e-06,
1537
+ "loss": 0.0011,
1538
+ "step": 12300
1539
+ },
1540
+ {
1541
+ "epoch": 28.99,
1542
+ "learning_rate": 3.925641025641026e-06,
1543
+ "loss": 0.0016,
1544
+ "step": 12350
1545
+ },
1546
+ {
1547
+ "epoch": 29.11,
1548
+ "learning_rate": 3.900000000000001e-06,
1549
+ "loss": 0.0009,
1550
+ "step": 12400
1551
+ },
1552
+ {
1553
+ "epoch": 29.23,
1554
+ "learning_rate": 3.874358974358975e-06,
1555
+ "loss": 0.0003,
1556
+ "step": 12450
1557
+ },
1558
+ {
1559
+ "epoch": 29.34,
1560
+ "learning_rate": 3.848717948717949e-06,
1561
+ "loss": 0.0004,
1562
+ "step": 12500
1563
+ },
1564
+ {
1565
+ "epoch": 29.46,
1566
+ "learning_rate": 3.823076923076923e-06,
1567
+ "loss": 0.0017,
1568
+ "step": 12550
1569
+ },
1570
+ {
1571
+ "epoch": 29.58,
1572
+ "learning_rate": 3.7974358974358975e-06,
1573
+ "loss": 0.0006,
1574
+ "step": 12600
1575
+ },
1576
+ {
1577
+ "epoch": 29.69,
1578
+ "learning_rate": 3.771794871794872e-06,
1579
+ "loss": 0.001,
1580
+ "step": 12650
1581
+ },
1582
+ {
1583
+ "epoch": 29.81,
1584
+ "learning_rate": 3.7461538461538465e-06,
1585
+ "loss": 0.0006,
1586
+ "step": 12700
1587
+ },
1588
+ {
1589
+ "epoch": 29.93,
1590
+ "learning_rate": 3.720512820512821e-06,
1591
+ "loss": 0.0017,
1592
+ "step": 12750
1593
+ },
1594
+ {
1595
+ "epoch": 30.05,
1596
+ "learning_rate": 3.694871794871795e-06,
1597
+ "loss": 0.0002,
1598
+ "step": 12800
1599
+ },
1600
+ {
1601
+ "epoch": 30.16,
1602
+ "learning_rate": 3.6692307692307695e-06,
1603
+ "loss": 0.002,
1604
+ "step": 12850
1605
+ },
1606
+ {
1607
+ "epoch": 30.28,
1608
+ "learning_rate": 3.6435897435897436e-06,
1609
+ "loss": 0.0004,
1610
+ "step": 12900
1611
+ },
1612
+ {
1613
+ "epoch": 30.4,
1614
+ "learning_rate": 3.617948717948718e-06,
1615
+ "loss": 0.0014,
1616
+ "step": 12950
1617
+ },
1618
+ {
1619
+ "epoch": 30.52,
1620
+ "learning_rate": 3.5923076923076926e-06,
1621
+ "loss": 0.0005,
1622
+ "step": 13000
1623
+ },
1624
+ {
1625
+ "epoch": 30.63,
1626
+ "learning_rate": 3.566666666666667e-06,
1627
+ "loss": 0.0019,
1628
+ "step": 13050
1629
+ },
1630
+ {
1631
+ "epoch": 30.75,
1632
+ "learning_rate": 3.541025641025641e-06,
1633
+ "loss": 0.0005,
1634
+ "step": 13100
1635
+ },
1636
+ {
1637
+ "epoch": 30.87,
1638
+ "learning_rate": 3.5153846153846157e-06,
1639
+ "loss": 0.0019,
1640
+ "step": 13150
1641
+ },
1642
+ {
1643
+ "epoch": 30.99,
1644
+ "learning_rate": 3.4897435897435897e-06,
1645
+ "loss": 0.001,
1646
+ "step": 13200
1647
+ },
1648
+ {
1649
+ "epoch": 31.1,
1650
+ "learning_rate": 3.4641025641025642e-06,
1651
+ "loss": 0.0002,
1652
+ "step": 13250
1653
+ },
1654
+ {
1655
+ "epoch": 31.22,
1656
+ "learning_rate": 3.4384615384615387e-06,
1657
+ "loss": 0.0019,
1658
+ "step": 13300
1659
+ },
1660
+ {
1661
+ "epoch": 31.34,
1662
+ "learning_rate": 3.4128205128205132e-06,
1663
+ "loss": 0.0008,
1664
+ "step": 13350
1665
+ },
1666
+ {
1667
+ "epoch": 31.46,
1668
+ "learning_rate": 3.3871794871794873e-06,
1669
+ "loss": 0.0013,
1670
+ "step": 13400
1671
+ },
1672
+ {
1673
+ "epoch": 31.57,
1674
+ "learning_rate": 3.361538461538462e-06,
1675
+ "loss": 0.0002,
1676
+ "step": 13450
1677
+ },
1678
+ {
1679
+ "epoch": 31.69,
1680
+ "learning_rate": 3.3358974358974363e-06,
1681
+ "loss": 0.001,
1682
+ "step": 13500
1683
+ },
1684
+ {
1685
+ "epoch": 31.81,
1686
+ "learning_rate": 3.3102564102564104e-06,
1687
+ "loss": 0.0031,
1688
+ "step": 13550
1689
+ },
1690
+ {
1691
+ "epoch": 31.92,
1692
+ "learning_rate": 3.284615384615385e-06,
1693
+ "loss": 0.0003,
1694
+ "step": 13600
1695
+ },
1696
+ {
1697
+ "epoch": 32.04,
1698
+ "learning_rate": 3.2594871794871795e-06,
1699
+ "loss": 0.0014,
1700
+ "step": 13650
1701
+ },
1702
+ {
1703
+ "epoch": 32.16,
1704
+ "learning_rate": 3.233846153846154e-06,
1705
+ "loss": 0.0001,
1706
+ "step": 13700
1707
+ },
1708
+ {
1709
+ "epoch": 32.28,
1710
+ "learning_rate": 3.2082051282051285e-06,
1711
+ "loss": 0.0003,
1712
+ "step": 13750
1713
+ },
1714
+ {
1715
+ "epoch": 32.39,
1716
+ "learning_rate": 3.182564102564103e-06,
1717
+ "loss": 0.0006,
1718
+ "step": 13800
1719
+ },
1720
+ {
1721
+ "epoch": 32.51,
1722
+ "learning_rate": 3.1569230769230775e-06,
1723
+ "loss": 0.001,
1724
+ "step": 13850
1725
+ },
1726
+ {
1727
+ "epoch": 32.63,
1728
+ "learning_rate": 3.131282051282051e-06,
1729
+ "loss": 0.001,
1730
+ "step": 13900
1731
+ },
1732
+ {
1733
+ "epoch": 32.75,
1734
+ "learning_rate": 3.1056410256410257e-06,
1735
+ "loss": 0.0013,
1736
+ "step": 13950
1737
+ },
1738
+ {
1739
+ "epoch": 32.86,
1740
+ "learning_rate": 3.08e-06,
1741
+ "loss": 0.0004,
1742
+ "step": 14000
1743
+ },
1744
+ {
1745
+ "epoch": 32.86,
1746
+ "eval_loss": 0.2818757891654968,
1747
+ "eval_runtime": 695.7492,
1748
+ "eval_samples_per_second": 2.438,
1749
+ "eval_steps_per_second": 0.609,
1750
+ "eval_wer": 18.285661218424963,
1751
+ "step": 14000
1752
+ },
1753
+ {
1754
+ "epoch": 32.98,
1755
+ "learning_rate": 3.0543589743589747e-06,
1756
+ "loss": 0.0013,
1757
+ "step": 14050
1758
+ },
1759
+ {
1760
+ "epoch": 33.1,
1761
+ "learning_rate": 3.028717948717949e-06,
1762
+ "loss": 0.0004,
1763
+ "step": 14100
1764
+ },
1765
+ {
1766
+ "epoch": 33.22,
1767
+ "learning_rate": 3.0030769230769236e-06,
1768
+ "loss": 0.0006,
1769
+ "step": 14150
1770
+ },
1771
+ {
1772
+ "epoch": 33.33,
1773
+ "learning_rate": 2.9774358974358973e-06,
1774
+ "loss": 0.0006,
1775
+ "step": 14200
1776
+ },
1777
+ {
1778
+ "epoch": 33.45,
1779
+ "learning_rate": 2.951794871794872e-06,
1780
+ "loss": 0.0008,
1781
+ "step": 14250
1782
+ },
1783
+ {
1784
+ "epoch": 33.57,
1785
+ "learning_rate": 2.9261538461538463e-06,
1786
+ "loss": 0.0003,
1787
+ "step": 14300
1788
+ },
1789
+ {
1790
+ "epoch": 33.69,
1791
+ "learning_rate": 2.9005128205128208e-06,
1792
+ "loss": 0.0003,
1793
+ "step": 14350
1794
+ },
1795
+ {
1796
+ "epoch": 33.8,
1797
+ "learning_rate": 2.8748717948717953e-06,
1798
+ "loss": 0.0002,
1799
+ "step": 14400
1800
+ },
1801
+ {
1802
+ "epoch": 33.92,
1803
+ "learning_rate": 2.8492307692307698e-06,
1804
+ "loss": 0.001,
1805
+ "step": 14450
1806
+ },
1807
+ {
1808
+ "epoch": 34.04,
1809
+ "learning_rate": 2.8235897435897434e-06,
1810
+ "loss": 0.0003,
1811
+ "step": 14500
1812
+ },
1813
+ {
1814
+ "epoch": 34.15,
1815
+ "learning_rate": 2.797948717948718e-06,
1816
+ "loss": 0.0029,
1817
+ "step": 14550
1818
+ },
1819
+ {
1820
+ "epoch": 34.27,
1821
+ "learning_rate": 2.7723076923076924e-06,
1822
+ "loss": 0.0009,
1823
+ "step": 14600
1824
+ },
1825
+ {
1826
+ "epoch": 34.39,
1827
+ "learning_rate": 2.746666666666667e-06,
1828
+ "loss": 0.0012,
1829
+ "step": 14650
1830
+ },
1831
+ {
1832
+ "epoch": 34.51,
1833
+ "learning_rate": 2.7210256410256414e-06,
1834
+ "loss": 0.0014,
1835
+ "step": 14700
1836
+ },
1837
+ {
1838
+ "epoch": 34.62,
1839
+ "learning_rate": 2.695384615384616e-06,
1840
+ "loss": 0.0001,
1841
+ "step": 14750
1842
+ },
1843
+ {
1844
+ "epoch": 34.74,
1845
+ "learning_rate": 2.6697435897435896e-06,
1846
+ "loss": 0.0003,
1847
+ "step": 14800
1848
+ },
1849
+ {
1850
+ "epoch": 34.86,
1851
+ "learning_rate": 2.644102564102564e-06,
1852
+ "loss": 0.0009,
1853
+ "step": 14850
1854
+ },
1855
+ {
1856
+ "epoch": 34.98,
1857
+ "learning_rate": 2.6184615384615385e-06,
1858
+ "loss": 0.0001,
1859
+ "step": 14900
1860
+ },
1861
+ {
1862
+ "epoch": 35.09,
1863
+ "learning_rate": 2.592820512820513e-06,
1864
+ "loss": 0.0003,
1865
+ "step": 14950
1866
+ },
1867
+ {
1868
+ "epoch": 35.21,
1869
+ "learning_rate": 2.5671794871794875e-06,
1870
+ "loss": 0.0001,
1871
+ "step": 15000
1872
+ },
1873
+ {
1874
+ "epoch": 35.33,
1875
+ "learning_rate": 2.541538461538462e-06,
1876
+ "loss": 0.0003,
1877
+ "step": 15050
1878
+ },
1879
+ {
1880
+ "epoch": 35.45,
1881
+ "learning_rate": 2.5158974358974357e-06,
1882
+ "loss": 0.0012,
1883
+ "step": 15100
1884
+ },
1885
+ {
1886
+ "epoch": 35.56,
1887
+ "learning_rate": 2.4902564102564106e-06,
1888
+ "loss": 0.0017,
1889
+ "step": 15150
1890
+ },
1891
+ {
1892
+ "epoch": 35.68,
1893
+ "learning_rate": 2.4646153846153847e-06,
1894
+ "loss": 0.0006,
1895
+ "step": 15200
1896
+ },
1897
+ {
1898
+ "epoch": 35.8,
1899
+ "learning_rate": 2.438974358974359e-06,
1900
+ "loss": 0.001,
1901
+ "step": 15250
1902
+ },
1903
+ {
1904
+ "epoch": 35.92,
1905
+ "learning_rate": 2.4133333333333337e-06,
1906
+ "loss": 0.0013,
1907
+ "step": 15300
1908
+ },
1909
+ {
1910
+ "epoch": 36.03,
1911
+ "learning_rate": 2.3876923076923077e-06,
1912
+ "loss": 0.0001,
1913
+ "step": 15350
1914
+ },
1915
+ {
1916
+ "epoch": 36.15,
1917
+ "learning_rate": 2.3620512820512822e-06,
1918
+ "loss": 0.0015,
1919
+ "step": 15400
1920
+ },
1921
+ {
1922
+ "epoch": 36.27,
1923
+ "learning_rate": 2.3364102564102567e-06,
1924
+ "loss": 0.0001,
1925
+ "step": 15450
1926
+ },
1927
+ {
1928
+ "epoch": 36.38,
1929
+ "learning_rate": 2.310769230769231e-06,
1930
+ "loss": 0.0001,
1931
+ "step": 15500
1932
+ },
1933
+ {
1934
+ "epoch": 36.5,
1935
+ "learning_rate": 2.2851282051282053e-06,
1936
+ "loss": 0.0003,
1937
+ "step": 15550
1938
+ },
1939
+ {
1940
+ "epoch": 36.62,
1941
+ "learning_rate": 2.25948717948718e-06,
1942
+ "loss": 0.0009,
1943
+ "step": 15600
1944
+ },
1945
+ {
1946
+ "epoch": 36.74,
1947
+ "learning_rate": 2.233846153846154e-06,
1948
+ "loss": 0.0001,
1949
+ "step": 15650
1950
+ },
1951
+ {
1952
+ "epoch": 36.85,
1953
+ "learning_rate": 2.2082051282051284e-06,
1954
+ "loss": 0.0001,
1955
+ "step": 15700
1956
+ },
1957
+ {
1958
+ "epoch": 36.97,
1959
+ "learning_rate": 2.182564102564103e-06,
1960
+ "loss": 0.002,
1961
+ "step": 15750
1962
+ },
1963
+ {
1964
+ "epoch": 37.09,
1965
+ "learning_rate": 2.156923076923077e-06,
1966
+ "loss": 0.0009,
1967
+ "step": 15800
1968
+ },
1969
+ {
1970
+ "epoch": 37.21,
1971
+ "learning_rate": 2.1312820512820514e-06,
1972
+ "loss": 0.0005,
1973
+ "step": 15850
1974
+ },
1975
+ {
1976
+ "epoch": 37.32,
1977
+ "learning_rate": 2.105641025641026e-06,
1978
+ "loss": 0.0004,
1979
+ "step": 15900
1980
+ },
1981
+ {
1982
+ "epoch": 37.44,
1983
+ "learning_rate": 2.08e-06,
1984
+ "loss": 0.0001,
1985
+ "step": 15950
1986
+ },
1987
+ {
1988
+ "epoch": 37.56,
1989
+ "learning_rate": 2.0543589743589745e-06,
1990
+ "loss": 0.0002,
1991
+ "step": 16000
1992
+ },
1993
+ {
1994
+ "epoch": 37.56,
1995
+ "eval_loss": 0.28313764929771423,
1996
+ "eval_runtime": 695.718,
1997
+ "eval_samples_per_second": 2.438,
1998
+ "eval_steps_per_second": 0.609,
1999
+ "eval_wer": 17.72845468053492,
2000
+ "step": 16000
2001
+ },
2002
+ {
2003
+ "epoch": 37.68,
2004
+ "learning_rate": 2.028717948717949e-06,
2005
+ "loss": 0.0002,
2006
+ "step": 16050
2007
+ },
2008
+ {
2009
+ "epoch": 37.79,
2010
+ "learning_rate": 2.003076923076923e-06,
2011
+ "loss": 0.0006,
2012
+ "step": 16100
2013
+ },
2014
+ {
2015
+ "epoch": 37.91,
2016
+ "learning_rate": 1.9774358974358976e-06,
2017
+ "loss": 0.0001,
2018
+ "step": 16150
2019
+ },
2020
+ {
2021
+ "epoch": 38.03,
2022
+ "learning_rate": 1.951794871794872e-06,
2023
+ "loss": 0.0002,
2024
+ "step": 16200
2025
+ },
2026
+ {
2027
+ "epoch": 38.15,
2028
+ "learning_rate": 1.926153846153846e-06,
2029
+ "loss": 0.0,
2030
+ "step": 16250
2031
+ },
2032
+ {
2033
+ "epoch": 38.26,
2034
+ "learning_rate": 1.9005128205128206e-06,
2035
+ "loss": 0.0,
2036
+ "step": 16300
2037
+ },
2038
+ {
2039
+ "epoch": 38.38,
2040
+ "learning_rate": 1.8748717948717951e-06,
2041
+ "loss": 0.0001,
2042
+ "step": 16350
2043
+ },
2044
+ {
2045
+ "epoch": 38.5,
2046
+ "learning_rate": 1.8492307692307692e-06,
2047
+ "loss": 0.0,
2048
+ "step": 16400
2049
+ },
2050
+ {
2051
+ "epoch": 38.62,
2052
+ "learning_rate": 1.8235897435897437e-06,
2053
+ "loss": 0.0,
2054
+ "step": 16450
2055
+ },
2056
+ {
2057
+ "epoch": 38.73,
2058
+ "learning_rate": 1.7979487179487182e-06,
2059
+ "loss": 0.0,
2060
+ "step": 16500
2061
+ },
2062
+ {
2063
+ "epoch": 38.85,
2064
+ "learning_rate": 1.7723076923076922e-06,
2065
+ "loss": 0.0,
2066
+ "step": 16550
2067
+ },
2068
+ {
2069
+ "epoch": 38.97,
2070
+ "learning_rate": 1.7466666666666667e-06,
2071
+ "loss": 0.0002,
2072
+ "step": 16600
2073
+ },
2074
+ {
2075
+ "epoch": 39.08,
2076
+ "learning_rate": 1.7210256410256412e-06,
2077
+ "loss": 0.0006,
2078
+ "step": 16650
2079
+ },
2080
+ {
2081
+ "epoch": 39.2,
2082
+ "learning_rate": 1.6953846153846153e-06,
2083
+ "loss": 0.0,
2084
+ "step": 16700
2085
+ },
2086
+ {
2087
+ "epoch": 39.32,
2088
+ "learning_rate": 1.6697435897435898e-06,
2089
+ "loss": 0.0003,
2090
+ "step": 16750
2091
+ },
2092
+ {
2093
+ "epoch": 39.44,
2094
+ "learning_rate": 1.6441025641025643e-06,
2095
+ "loss": 0.0,
2096
+ "step": 16800
2097
+ },
2098
+ {
2099
+ "epoch": 39.55,
2100
+ "learning_rate": 1.6184615384615384e-06,
2101
+ "loss": 0.0007,
2102
+ "step": 16850
2103
+ },
2104
+ {
2105
+ "epoch": 39.67,
2106
+ "learning_rate": 1.5928205128205129e-06,
2107
+ "loss": 0.0,
2108
+ "step": 16900
2109
+ },
2110
+ {
2111
+ "epoch": 39.79,
2112
+ "learning_rate": 1.5676923076923078e-06,
2113
+ "loss": 0.0008,
2114
+ "step": 16950
2115
+ },
2116
+ {
2117
+ "epoch": 39.91,
2118
+ "learning_rate": 1.5420512820512822e-06,
2119
+ "loss": 0.0004,
2120
+ "step": 17000
2121
+ },
2122
+ {
2123
+ "epoch": 40.02,
2124
+ "learning_rate": 1.5164102564102565e-06,
2125
+ "loss": 0.0011,
2126
+ "step": 17050
2127
+ },
2128
+ {
2129
+ "epoch": 40.14,
2130
+ "learning_rate": 1.4907692307692308e-06,
2131
+ "loss": 0.0003,
2132
+ "step": 17100
2133
+ },
2134
+ {
2135
+ "epoch": 40.26,
2136
+ "learning_rate": 1.4651282051282053e-06,
2137
+ "loss": 0.0013,
2138
+ "step": 17150
2139
+ },
2140
+ {
2141
+ "epoch": 40.38,
2142
+ "learning_rate": 1.4394871794871796e-06,
2143
+ "loss": 0.0003,
2144
+ "step": 17200
2145
+ },
2146
+ {
2147
+ "epoch": 40.49,
2148
+ "learning_rate": 1.4138461538461539e-06,
2149
+ "loss": 0.0001,
2150
+ "step": 17250
2151
+ },
2152
+ {
2153
+ "epoch": 40.61,
2154
+ "learning_rate": 1.3882051282051284e-06,
2155
+ "loss": 0.0004,
2156
+ "step": 17300
2157
+ },
2158
+ {
2159
+ "epoch": 40.73,
2160
+ "learning_rate": 1.3625641025641027e-06,
2161
+ "loss": 0.0005,
2162
+ "step": 17350
2163
+ },
2164
+ {
2165
+ "epoch": 40.85,
2166
+ "learning_rate": 1.336923076923077e-06,
2167
+ "loss": 0.001,
2168
+ "step": 17400
2169
+ },
2170
+ {
2171
+ "epoch": 40.96,
2172
+ "learning_rate": 1.3112820512820514e-06,
2173
+ "loss": 0.0006,
2174
+ "step": 17450
2175
+ },
2176
+ {
2177
+ "epoch": 41.08,
2178
+ "learning_rate": 1.2856410256410257e-06,
2179
+ "loss": 0.0001,
2180
+ "step": 17500
2181
+ },
2182
+ {
2183
+ "epoch": 41.2,
2184
+ "learning_rate": 1.26e-06,
2185
+ "loss": 0.0003,
2186
+ "step": 17550
2187
+ },
2188
+ {
2189
+ "epoch": 41.31,
2190
+ "learning_rate": 1.2343589743589745e-06,
2191
+ "loss": 0.0,
2192
+ "step": 17600
2193
+ },
2194
+ {
2195
+ "epoch": 41.43,
2196
+ "learning_rate": 1.2087179487179488e-06,
2197
+ "loss": 0.0001,
2198
+ "step": 17650
2199
+ },
2200
+ {
2201
+ "epoch": 41.55,
2202
+ "learning_rate": 1.1830769230769233e-06,
2203
+ "loss": 0.0003,
2204
+ "step": 17700
2205
+ },
2206
+ {
2207
+ "epoch": 41.67,
2208
+ "learning_rate": 1.1574358974358976e-06,
2209
+ "loss": 0.0003,
2210
+ "step": 17750
2211
+ },
2212
+ {
2213
+ "epoch": 41.78,
2214
+ "learning_rate": 1.1317948717948719e-06,
2215
+ "loss": 0.0001,
2216
+ "step": 17800
2217
+ },
2218
+ {
2219
+ "epoch": 41.9,
2220
+ "learning_rate": 1.1061538461538463e-06,
2221
+ "loss": 0.0,
2222
+ "step": 17850
2223
+ },
2224
+ {
2225
+ "epoch": 42.02,
2226
+ "learning_rate": 1.0805128205128206e-06,
2227
+ "loss": 0.0,
2228
+ "step": 17900
2229
+ },
2230
+ {
2231
+ "epoch": 42.14,
2232
+ "learning_rate": 1.054871794871795e-06,
2233
+ "loss": 0.0,
2234
+ "step": 17950
2235
+ },
2236
+ {
2237
+ "epoch": 42.25,
2238
+ "learning_rate": 1.0292307692307694e-06,
2239
+ "loss": 0.0007,
2240
+ "step": 18000
2241
+ },
2242
+ {
2243
+ "epoch": 42.25,
2244
+ "eval_loss": 0.27763035893440247,
2245
+ "eval_runtime": 699.0165,
2246
+ "eval_samples_per_second": 2.426,
2247
+ "eval_steps_per_second": 0.607,
2248
+ "eval_wer": 17.83989598811293,
2249
+ "step": 18000
2250
+ },
2251
+ {
2252
+ "epoch": 42.37,
2253
+ "learning_rate": 1.0035897435897437e-06,
2254
+ "loss": 0.0008,
2255
+ "step": 18050
2256
+ },
2257
+ {
2258
+ "epoch": 42.49,
2259
+ "learning_rate": 9.77948717948718e-07,
2260
+ "loss": 0.0016,
2261
+ "step": 18100
2262
+ },
2263
+ {
2264
+ "epoch": 42.61,
2265
+ "learning_rate": 9.523076923076924e-07,
2266
+ "loss": 0.0001,
2267
+ "step": 18150
2268
+ },
2269
+ {
2270
+ "epoch": 42.72,
2271
+ "learning_rate": 9.266666666666667e-07,
2272
+ "loss": 0.0,
2273
+ "step": 18200
2274
+ },
2275
+ {
2276
+ "epoch": 42.84,
2277
+ "learning_rate": 9.010256410256411e-07,
2278
+ "loss": 0.0001,
2279
+ "step": 18250
2280
+ },
2281
+ {
2282
+ "epoch": 42.96,
2283
+ "learning_rate": 8.753846153846154e-07,
2284
+ "loss": 0.0,
2285
+ "step": 18300
2286
+ },
2287
+ {
2288
+ "epoch": 43.08,
2289
+ "learning_rate": 8.497435897435897e-07,
2290
+ "loss": 0.0014,
2291
+ "step": 18350
2292
+ },
2293
+ {
2294
+ "epoch": 43.19,
2295
+ "learning_rate": 8.241025641025642e-07,
2296
+ "loss": 0.0005,
2297
+ "step": 18400
2298
+ },
2299
+ {
2300
+ "epoch": 43.31,
2301
+ "learning_rate": 7.984615384615385e-07,
2302
+ "loss": 0.0,
2303
+ "step": 18450
2304
+ },
2305
+ {
2306
+ "epoch": 43.43,
2307
+ "learning_rate": 7.728205128205128e-07,
2308
+ "loss": 0.0001,
2309
+ "step": 18500
2310
+ },
2311
+ {
2312
+ "epoch": 43.54,
2313
+ "learning_rate": 7.471794871794873e-07,
2314
+ "loss": 0.0,
2315
+ "step": 18550
2316
+ },
2317
+ {
2318
+ "epoch": 43.66,
2319
+ "learning_rate": 7.215384615384616e-07,
2320
+ "loss": 0.0,
2321
+ "step": 18600
2322
+ },
2323
+ {
2324
+ "epoch": 43.78,
2325
+ "learning_rate": 6.958974358974358e-07,
2326
+ "loss": 0.0001,
2327
+ "step": 18650
2328
+ },
2329
+ {
2330
+ "epoch": 43.9,
2331
+ "learning_rate": 6.702564102564103e-07,
2332
+ "loss": 0.0,
2333
+ "step": 18700
2334
+ },
2335
+ {
2336
+ "epoch": 44.01,
2337
+ "learning_rate": 6.446153846153846e-07,
2338
+ "loss": 0.0,
2339
+ "step": 18750
2340
+ },
2341
+ {
2342
+ "epoch": 44.13,
2343
+ "learning_rate": 6.18974358974359e-07,
2344
+ "loss": 0.0,
2345
+ "step": 18800
2346
+ },
2347
+ {
2348
+ "epoch": 44.25,
2349
+ "learning_rate": 5.933333333333334e-07,
2350
+ "loss": 0.0001,
2351
+ "step": 18850
2352
+ },
2353
+ {
2354
+ "epoch": 44.37,
2355
+ "learning_rate": 5.676923076923077e-07,
2356
+ "loss": 0.0,
2357
+ "step": 18900
2358
+ },
2359
+ {
2360
+ "epoch": 44.48,
2361
+ "learning_rate": 5.420512820512821e-07,
2362
+ "loss": 0.0004,
2363
+ "step": 18950
2364
+ },
2365
+ {
2366
+ "epoch": 44.6,
2367
+ "learning_rate": 5.164102564102565e-07,
2368
+ "loss": 0.0,
2369
+ "step": 19000
2370
+ },
2371
+ {
2372
+ "epoch": 44.72,
2373
+ "learning_rate": 4.907692307692308e-07,
2374
+ "loss": 0.0001,
2375
+ "step": 19050
2376
+ },
2377
+ {
2378
+ "epoch": 44.84,
2379
+ "learning_rate": 4.6512820512820514e-07,
2380
+ "loss": 0.0,
2381
+ "step": 19100
2382
+ },
2383
+ {
2384
+ "epoch": 44.95,
2385
+ "learning_rate": 4.3948717948717953e-07,
2386
+ "loss": 0.0,
2387
+ "step": 19150
2388
+ },
2389
+ {
2390
+ "epoch": 45.07,
2391
+ "learning_rate": 4.138461538461539e-07,
2392
+ "loss": 0.0,
2393
+ "step": 19200
2394
+ },
2395
+ {
2396
+ "epoch": 45.19,
2397
+ "learning_rate": 3.882051282051282e-07,
2398
+ "loss": 0.0001,
2399
+ "step": 19250
2400
+ },
2401
+ {
2402
+ "epoch": 45.31,
2403
+ "learning_rate": 3.625641025641026e-07,
2404
+ "loss": 0.0015,
2405
+ "step": 19300
2406
+ },
2407
+ {
2408
+ "epoch": 45.42,
2409
+ "learning_rate": 3.36923076923077e-07,
2410
+ "loss": 0.0,
2411
+ "step": 19350
2412
+ },
2413
+ {
2414
+ "epoch": 45.54,
2415
+ "learning_rate": 3.112820512820513e-07,
2416
+ "loss": 0.0,
2417
+ "step": 19400
2418
+ },
2419
+ {
2420
+ "epoch": 45.66,
2421
+ "learning_rate": 2.861538461538462e-07,
2422
+ "loss": 0.0004,
2423
+ "step": 19450
2424
+ },
2425
+ {
2426
+ "epoch": 45.77,
2427
+ "learning_rate": 2.6051282051282054e-07,
2428
+ "loss": 0.0006,
2429
+ "step": 19500
2430
+ },
2431
+ {
2432
+ "epoch": 45.89,
2433
+ "learning_rate": 2.348717948717949e-07,
2434
+ "loss": 0.0,
2435
+ "step": 19550
2436
+ },
2437
+ {
2438
+ "epoch": 46.01,
2439
+ "learning_rate": 2.0923076923076924e-07,
2440
+ "loss": 0.0,
2441
+ "step": 19600
2442
+ },
2443
+ {
2444
+ "epoch": 46.13,
2445
+ "learning_rate": 1.8358974358974358e-07,
2446
+ "loss": 0.0003,
2447
+ "step": 19650
2448
+ },
2449
+ {
2450
+ "epoch": 46.24,
2451
+ "learning_rate": 1.5794871794871797e-07,
2452
+ "loss": 0.0,
2453
+ "step": 19700
2454
+ },
2455
+ {
2456
+ "epoch": 46.36,
2457
+ "learning_rate": 1.323076923076923e-07,
2458
+ "loss": 0.0,
2459
+ "step": 19750
2460
+ },
2461
+ {
2462
+ "epoch": 46.48,
2463
+ "learning_rate": 1.0666666666666667e-07,
2464
+ "loss": 0.0,
2465
+ "step": 19800
2466
+ },
2467
+ {
2468
+ "epoch": 46.6,
2469
+ "learning_rate": 8.102564102564103e-08,
2470
+ "loss": 0.0,
2471
+ "step": 19850
2472
+ },
2473
+ {
2474
+ "epoch": 46.71,
2475
+ "learning_rate": 5.538461538461538e-08,
2476
+ "loss": 0.0,
2477
+ "step": 19900
2478
+ },
2479
+ {
2480
+ "epoch": 46.83,
2481
+ "learning_rate": 2.9743589743589746e-08,
2482
+ "loss": 0.0,
2483
+ "step": 19950
2484
+ },
2485
+ {
2486
+ "epoch": 46.95,
2487
+ "learning_rate": 4.102564102564102e-09,
2488
+ "loss": 0.0,
2489
+ "step": 20000
2490
+ },
2491
+ {
2492
+ "epoch": 46.95,
2493
+ "eval_loss": 0.2792465388774872,
2494
+ "eval_runtime": 701.495,
2495
+ "eval_samples_per_second": 2.418,
2496
+ "eval_steps_per_second": 0.604,
2497
+ "eval_wer": 17.09695393759287,
2498
+ "step": 20000
2499
+ },
2500
+ {
2501
+ "epoch": 46.95,
2502
+ "step": 20000,
2503
+ "total_flos": 1.6301509824872448e+20,
2504
+ "train_loss": 0.029123614896199433,
2505
+ "train_runtime": 43309.4036,
2506
+ "train_samples_per_second": 3.694,
2507
+ "train_steps_per_second": 0.462
2508
+ }
2509
+ ],
2510
+ "max_steps": 20000,
2511
+ "num_train_epochs": 47,
2512
+ "total_flos": 1.6301509824872448e+20,
2513
+ "trial_name": null,
2514
+ "trial_params": null
2515
+ }
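A minimal sketch, not part of the commit, for reading the evaluation entries out of a file shaped like the `trainer_state.json` above. It assumes the list that closes with `],` is stored under the key `log_history`, which is not visible in this excerpt. The helper names `eval_rows` and `best_by` are hypothetical, and the sample values are copied from the evaluation entries at steps 10000 through 20000.

```python
import json

# Evaluation entries copied from the diff above. The "log_history" key is an
# assumption about the enclosing file; it is not shown in this excerpt.
SAMPLE_STATE = json.loads("""
{
  "log_history": [
    {"epoch": 23.47, "learning_rate": 5.130769230769232e-06, "loss": 0.0005, "step": 10000},
    {"epoch": 23.47, "eval_loss": 0.2751367390155792, "eval_wer": 18.545690936106986, "step": 10000},
    {"epoch": 28.17, "eval_loss": 0.2928332984447479, "eval_wer": 19.20505200594354, "step": 12000},
    {"epoch": 32.86, "eval_loss": 0.2818757891654968, "eval_wer": 18.285661218424963, "step": 14000},
    {"epoch": 37.56, "eval_loss": 0.28313764929771423, "eval_wer": 17.72845468053492, "step": 16000},
    {"epoch": 42.25, "eval_loss": 0.27763035893440247, "eval_wer": 17.83989598811293, "step": 18000},
    {"epoch": 46.95, "eval_loss": 0.2792465388774872, "eval_wer": 17.09695393759287, "step": 20000}
  ]
}
""")


def eval_rows(state):
    """Return only the entries that carry evaluation metrics."""
    return [entry for entry in state.get("log_history", []) if "eval_wer" in entry]


def best_by(state, metric):
    """Return the evaluation entry with the lowest value of `metric`, or None."""
    rows = eval_rows(state)
    return min(rows, key=lambda entry: entry[metric]) if rows else None


if __name__ == "__main__":
    print(best_by(SAMPLE_STATE, "eval_wer")["step"])   # 20000
    print(best_by(SAMPLE_STATE, "eval_loss")["step"])  # 10000
```

Among the evaluation entries visible in this excerpt, the lowest `eval_wer` is at step 20000 and the lowest `eval_loss` is at step 10000, so the two metrics point to different checkpoints. For a real file, `json.load(open(path))` would replace the inline sample.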