Dmitry Chaplinsky
committed on
Commit
•
67b2f55
1
Parent(s):
bce1ca2
Slight update
- README.md +1 -1
- best-lm.pt +1 -1
- loss.txt +75 -0
README.md
CHANGED
@@ -17,7 +17,7 @@ widget:
 
 # Ukrainian flair embeddings (forward)
 
-Trained for
+Trained for 12+ epochs on the texts from ubertext2.0 (WIP).
 The characters dictionary used for training is in `flair_dictionary.pkl` file
 
 For more information on flair embeddings see [the article](https://github.com/flairNLP/flair/blob/master/resources/docs/embeddings/FLAIR_EMBEDDINGS.md) or the paper below:
best-lm.pt
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:fdae6a5c00bd208946ba0adf9f2f455701fdc71bb91157fc3372834ed8332e56
 size 22791455
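The `best-lm.pt` entry above is a Git LFS pointer (a `version` line, an `oid sha256:` digest, and a `size` in bytes), not the model weights themselves. As a minimal sketch, assuming the real file has already been downloaded and read into memory, a pointer like this could be checked against the data as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of Git LFS or flair.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(data, pointer):
    """Return True if the bytes in `data` match the pointer's size and SHA-256 oid."""
    algo, _, digest = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError("unsupported oid algorithm: " + algo)
    if len(data) != int(pointer["size"]):
        return False
    return hashlib.sha256(data).hexdigest() == digest
```

For a file the size of `best-lm.pt` (22791455 bytes) reading everything into memory is workable; for much larger files, feeding `hashlib.sha256()` in chunks would be the safer choice.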
loss.txt
CHANGED
@@ -306,3 +306,78 @@
 | end of split 54 / 28 | epoch 10 | time: 3388.66s | valid loss 1.0212 | valid ppl 2.7766 | learning rate 5.0000
 | end of split 55 / 28 | epoch 10 | time: 3386.75s | valid loss 1.0212 | valid ppl 2.7764 | learning rate 5.0000
 | end of split 56 / 28 | epoch 10 | time: 3386.25s | valid loss 1.0213 | valid ppl 2.7767 | learning rate 5.0000
+| end of split 29 / 28 | epoch 11 | time: 3361.50s | valid loss 1.0212 | valid ppl 2.7765 | learning rate 5.0000
+| end of split 30 / 28 | epoch 11 | time: 3388.02s | valid loss 1.0212 | valid ppl 2.7765 | learning rate 5.0000
+| end of split 31 / 28 | epoch 11 | time: 3389.23s | valid loss 1.0211 | valid ppl 2.7761 | learning rate 5.0000
+| end of split 32 / 28 | epoch 11 | time: 3376.47s | valid loss 1.0210 | valid ppl 2.7760 | learning rate 5.0000
+| end of split 33 / 28 | epoch 11 | time: 3378.54s | valid loss 1.0211 | valid ppl 2.7763 | learning rate 5.0000
+| end of split 34 / 28 | epoch 11 | time: 3371.86s | valid loss 1.0210 | valid ppl 2.7761 | learning rate 5.0000
+| end of split 35 / 28 | epoch 11 | time: 988.54s | valid loss 1.0211 | valid ppl 2.7762 | learning rate 5.0000
+| end of split 36 / 28 | epoch 11 | time: 3369.15s | valid loss 1.0210 | valid ppl 2.7761 | learning rate 5.0000
+| end of split 37 / 28 | epoch 11 | time: 3362.72s | valid loss 1.0209 | valid ppl 2.7758 | learning rate 5.0000
+| end of split 38 / 28 | epoch 11 | time: 3363.26s | valid loss 1.0210 | valid ppl 2.7759 | learning rate 5.0000
+| end of split 39 / 28 | epoch 11 | time: 3359.86s | valid loss 1.0210 | valid ppl 2.7760 | learning rate 5.0000
+| end of split 40 / 28 | epoch 11 | time: 3338.89s | valid loss 1.0209 | valid ppl 2.7758 | learning rate 5.0000
+| end of split 41 / 28 | epoch 11 | time: 3356.02s | valid loss 1.0209 | valid ppl 2.7756 | learning rate 5.0000
+| end of split 42 / 28 | epoch 11 | time: 3351.44s | valid loss 1.0208 | valid ppl 2.7753 | learning rate 5.0000
+| end of split 43 / 28 | epoch 11 | time: 3350.87s | valid loss 1.0207 | valid ppl 2.7751 | learning rate 5.0000
+| end of split 44 / 28 | epoch 11 | time: 3346.91s | valid loss 1.0207 | valid ppl 2.7752 | learning rate 5.0000
+| end of split 45 / 28 | epoch 11 | time: 3348.82s | valid loss 1.0206 | valid ppl 2.7749 | learning rate 5.0000
+| end of split 46 / 28 | epoch 11 | time: 3348.50s | valid loss 1.0207 | valid ppl 2.7750 | learning rate 5.0000
+| end of split 47 / 28 | epoch 11 | time: 3346.52s | valid loss 1.0206 | valid ppl 2.7748 | learning rate 5.0000
+| end of split 48 / 28 | epoch 11 | time: 3341.43s | valid loss 1.0206 | valid ppl 2.7748 | learning rate 5.0000
+| end of split 49 / 28 | epoch 11 | time: 3342.42s | valid loss 1.0205 | valid ppl 2.7747 | learning rate 5.0000
+| end of split 50 / 28 | epoch 11 | time: 3361.90s | valid loss 1.0205 | valid ppl 2.7747 | learning rate 5.0000
+| end of split 51 / 28 | epoch 11 | time: 3373.79s | valid loss 1.0205 | valid ppl 2.7745 | learning rate 5.0000
+| end of split 52 / 28 | epoch 11 | time: 3380.88s | valid loss 1.0205 | valid ppl 2.7746 | learning rate 5.0000
+| end of split 53 / 28 | epoch 11 | time: 3380.44s | valid loss 1.0204 | valid ppl 2.7743 | learning rate 5.0000
+| end of split 54 / 28 | epoch 11 | time: 3379.94s | valid loss 1.0204 | valid ppl 2.7743 | learning rate 5.0000
+| end of split 55 / 28 | epoch 11 | time: 3379.47s | valid loss 1.0204 | valid ppl 2.7742 | learning rate 5.0000
+| end of split 56 / 28 | epoch 11 | time: 3380.66s | valid loss 1.0204 | valid ppl 2.7742 | learning rate 5.0000
+| end of split 29 / 28 | epoch 12 | time: 3378.16s | valid loss 1.0206 | valid ppl 2.7749 | learning rate 5.0000
+| end of split 30 / 28 | epoch 12 | time: 3397.83s | valid loss 1.0205 | valid ppl 2.7746 | learning rate 5.0000
+| end of split 31 / 28 | epoch 12 | time: 3392.19s | valid loss 1.0204 | valid ppl 2.7742 | learning rate 5.0000
+| end of split 32 / 28 | epoch 12 | time: 3379.40s | valid loss 1.0204 | valid ppl 2.7743 | learning rate 5.0000
+| end of split 33 / 28 | epoch 12 | time: 3373.61s | valid loss 1.0203 | valid ppl 2.7740 | learning rate 5.0000
+| end of split 34 / 28 | epoch 12 | time: 3369.09s | valid loss 1.0202 | valid ppl 2.7738 | learning rate 5.0000
+| end of split 35 / 28 | epoch 12 | time: 3370.15s | valid loss 1.0202 | valid ppl 2.7738 | learning rate 5.0000
+| end of split 36 / 28 | epoch 12 | time: 3364.76s | valid loss 1.0202 | valid ppl 2.7736 | learning rate 5.0000
+| end of split 37 / 28 | epoch 12 | time: 3362.81s | valid loss 1.0202 | valid ppl 2.7738 | learning rate 5.0000
+| end of split 38 / 28 | epoch 12 | time: 3361.73s | valid loss 1.0201 | valid ppl 2.7736 | learning rate 5.0000
+| end of split 39 / 28 | epoch 12 | time: 3362.24s | valid loss 1.0201 | valid ppl 2.7734 | learning rate 5.0000
+| end of split 40 / 28 | epoch 12 | time: 3349.23s | valid loss 1.0201 | valid ppl 2.7735 | learning rate 5.0000
+| end of split 41 / 28 | epoch 12 | time: 3349.66s | valid loss 1.0200 | valid ppl 2.7732 | learning rate 5.0000
+| end of split 42 / 28 | epoch 12 | time: 3354.36s | valid loss 1.0200 | valid ppl 2.7733 | learning rate 5.0000
+| end of split 43 / 28 | epoch 12 | time: 3337.30s | valid loss 1.0200 | valid ppl 2.7731 | learning rate 5.0000
+| end of split 44 / 28 | epoch 12 | time: 3354.63s | valid loss 1.0200 | valid ppl 2.7733 | learning rate 5.0000
+| end of split 45 / 28 | epoch 12 | time: 983.22s | valid loss 1.0200 | valid ppl 2.7732 | learning rate 5.0000
+| end of split 46 / 28 | epoch 12 | time: 3353.47s | valid loss 1.0200 | valid ppl 2.7731 | learning rate 5.0000
+| end of split 47 / 28 | epoch 12 | time: 3353.04s | valid loss 1.0199 | valid ppl 2.7730 | learning rate 5.0000
+| end of split 48 / 28 | epoch 12 | time: 3362.69s | valid loss 1.0200 | valid ppl 2.7731 | learning rate 5.0000
+| end of split 49 / 28 | epoch 12 | time: 3392.80s | valid loss 1.0198 | valid ppl 2.7726 | learning rate 5.0000
+| end of split 50 / 28 | epoch 12 | time: 3394.63s | valid loss 1.0198 | valid ppl 2.7727 | learning rate 5.0000
+| end of split 51 / 28 | epoch 12 | time: 3382.77s | valid loss 1.0199 | valid ppl 2.7728 | learning rate 5.0000
+| end of split 52 / 28 | epoch 12 | time: 3385.26s | valid loss 1.0199 | valid ppl 2.7729 | learning rate 5.0000
+| end of split 53 / 28 | epoch 12 | time: 3384.68s | valid loss 1.0198 | valid ppl 2.7725 | learning rate 5.0000
+| end of split 54 / 28 | epoch 12 | time: 3381.93s | valid loss 1.0198 | valid ppl 2.7726 | learning rate 5.0000
+| end of split 55 / 28 | epoch 12 | time: 3398.40s | valid loss 1.0197 | valid ppl 2.7723 | learning rate 5.0000
+| end of split 56 / 28 | epoch 12 | time: 3396.09s | valid loss 1.0198 | valid ppl 2.7726 | learning rate 5.0000
+| end of split 29 / 28 | epoch 13 | time: 3377.87s | valid loss 1.0197 | valid ppl 2.7723 | learning rate 5.0000
+| end of split 30 / 28 | epoch 13 | time: 3374.68s | valid loss 1.0196 | valid ppl 2.7721 | learning rate 5.0000
+| end of split 31 / 28 | epoch 13 | time: 3387.69s | valid loss 1.0196 | valid ppl 2.7722 | learning rate 5.0000
+| end of split 32 / 28 | epoch 13 | time: 990.82s | valid loss 1.0197 | valid ppl 2.7723 | learning rate 5.0000
+| end of split 33 / 28 | epoch 13 | time: 3369.69s | valid loss 1.0195 | valid ppl 2.7719 | learning rate 5.0000
+| end of split 34 / 28 | epoch 13 | time: 3370.78s | valid loss 1.0197 | valid ppl 2.7723 | learning rate 5.0000
+| end of split 35 / 28 | epoch 13 | time: 3370.56s | valid loss 1.0195 | valid ppl 2.7719 | learning rate 5.0000
+| end of split 36 / 28 | epoch 13 | time: 3360.93s | valid loss 1.0195 | valid ppl 2.7719 | learning rate 5.0000
+| end of split 37 / 28 | epoch 13 | time: 3361.03s | valid loss 1.0196 | valid ppl 2.7720 | learning rate 5.0000
+| end of split 38 / 28 | epoch 13 | time: 3361.72s | valid loss 1.0196 | valid ppl 2.7722 | learning rate 5.0000
+| end of split 39 / 28 | epoch 13 | time: 3350.87s | valid loss 1.0195 | valid ppl 2.7718 | learning rate 5.0000
+| end of split 40 / 28 | epoch 13 | time: 3347.90s | valid loss 1.0195 | valid ppl 2.7718 | learning rate 5.0000
+| end of split 41 / 28 | epoch 13 | time: 3345.82s | valid loss 1.0197 | valid ppl 2.7722 | learning rate 5.0000
+| end of split 42 / 28 | epoch 13 | time: 3354.18s | valid loss 1.0194 | valid ppl 2.7716 | learning rate 5.0000
+| end of split 43 / 28 | epoch 13 | time: 3350.06s | valid loss 1.0203 | valid ppl 2.7741 | learning rate 5.0000
+| end of split 44 / 28 | epoch 13 | time: 3348.70s | valid loss 1.0194 | valid ppl 2.7716 | learning rate 5.0000
+| end of split 45 / 28 | epoch 13 | time: 3351.28s | valid loss 1.0194 | valid ppl 2.7714 | learning rate 5.0000
+| end of split 46 / 28 | epoch 13 | time: 3347.01s | valid loss 1.0194 | valid ppl 2.7714 | learning rate 5.0000
+| end of split 47 / 28 | epoch 13 | time: 3338.57s | valid loss 1.0193 | valid ppl 2.7713 | learning rate 5.0000
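The `loss.txt` entries above share one fixed layout, and the logged `valid ppl` values appear to be the exponential of `valid loss` (the usual definition of perplexity for a loss measured in nats). The following is a minimal sketch of parsing such lines and checking that relationship. The regular expression and the helper name `parse_loss_line` are assumptions inferred from the lines shown in this diff, not part of flair.

```python
import math
import re

# Layout inferred from the loss.txt lines in this commit (an assumption).
LOSS_LINE = re.compile(
    r"\| end of split\s+(?P<split>\d+)\s*/\s*(?P<total>\d+)\s*"
    r"\| epoch\s+(?P<epoch>\d+)\s*"
    r"\| time:\s*(?P<time>[\d.]+)s\s*"
    r"\| valid loss\s+(?P<loss>[\d.]+)\s*"
    r"\| valid ppl\s+(?P<ppl>[\d.]+)\s*"
    r"\| learning rate\s+(?P<lr>[\d.]+)"
)


def parse_loss_line(line):
    """Return the fields of one validation log line, or None if it does not match."""
    m = LOSS_LINE.search(line)
    if m is None:
        return None
    return {
        "split": int(m["split"]),
        "total": int(m["total"]),
        "epoch": int(m["epoch"]),
        "time_s": float(m["time"]),
        "loss": float(m["loss"]),
        "ppl": float(m["ppl"]),
        "lr": float(m["lr"]),
    }


# Example with the last line added in this commit.
record = parse_loss_line(
    "| end of split 47 / 28 | epoch 13 | time: 3338.57s "
    "| valid loss 1.0193 | valid ppl 2.7713 | learning rate 5.0000"
)
# The logged loss is rounded to four decimals, so compare with a small tolerance.
ppl_matches_loss = abs(math.exp(record["loss"]) - record["ppl"]) < 1e-3
```

Because the pattern is applied with `search`, it also matches lines that carry a leading `+` or space from a unified diff.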