pkedzia committed on
Commit
7970db9
·
1 Parent(s): 81fd9e7

Update README.md

  1. README.md +44 -1
README.md CHANGED
@@ -18,4 +18,47 @@ tags:
  - gpt2
  - from-scratch
  - polish-gpt2
- ---
+ ---
+
+ ## Description
+ This is a Polish GPT-2 model in the small architecture.
+ The model was released on 11.08.2023.
+
+
+ ## Datasets
+ Data used to train this model:
+ - clarin-knext/msmarco-pl
+ - clarin-knext/nq-pl
+ - clarin-knext/hotpotqa-pl
+ - clarin-knext/scidocs-pl
+ - clarin-knext/nfcorpus-pl
+ - clarin-knext/dbpedia-pl
+ - clarin-knext/trec-covid-pl
+ - clarin-knext/quora-pl
+ - clarin-knext/arguana-pl
+ - clarin-knext/fiqa-pl
+ - own corpora, not yet published
+
+ In total, this is about 10.5 GB of data.
+
+
+ ## Metrics from W&B
+
+ - train/loss: 2.9569
+ - train/train_samples_per_second: 31.797
+ - train/epoch: 20
+ - train/train_steps_per_second: 3.18
+ - train/total_flos: 16645483478384640000
+ - train/train_loss: 3.106043342053213
+ - train/learning_rate: 2.2070550413783577e-8
+ - train/global_step: 3185240
+ - train/train_runtime: 1001735.8967
+ - eval/samples_per_second: 57.896
+ - eval/runtime: 1447.4458
+ - eval/steps_per_second: 5.79
+ - eval/loss: 2.890829086303711
+ - eval/accuracy: 0.4637797431547294
+
+
+ ## Changelog
+ - _Nothing_
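
Review note: the W&B metrics added in this diff can be cross-checked with a little arithmetic. The sketch below is not part of the commit. It assumes that `eval/loss` is a mean per-token cross-entropy in nats and that `train/train_runtime` is in seconds. Both are assumptions, since the README does not say. The variable names are illustrative only.

```python
import math

# Values copied from the "Metrics from W&B" list in the README diff.
eval_loss = 2.890829086303711
train_runtime_s = 1001735.8967
global_step = 3185240
train_samples_per_second = 31.797
train_steps_per_second = 3.18

# Assumption: eval/loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

# Assumption: train/train_runtime is reported in seconds.
runtime_days = train_runtime_s / 86_400

# Consistency check: steps / runtime should match the logged steps-per-second.
derived_steps_per_second = global_step / train_runtime_s

# Samples processed per optimizer step, derived from the two logged rates.
samples_per_step = train_samples_per_second / train_steps_per_second

print(f"perplexity:        {perplexity:.2f}")
print(f"runtime (days):    {runtime_days:.2f}")
print(f"steps per second:  {derived_steps_per_second:.2f}")
print(f"samples per step:  {samples_per_step:.2f}")
```

Under these assumptions the evaluation perplexity is about 18, the training run took roughly 11.6 days, and the logged throughput figures agree with each other, at about 10 samples per step.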