---
language: nl
widget:
- text: "In het jaar 2030 zullen we"
- text: "Toen ik gisteren volledig in de ban was van"
- text: "Studenten en leraren van de Bogazici Universiteit in de Turkse stad Istanbul"
- text: "In Israël was een strenge lockdown"
tags:
- gpt-neo-125M
- gpt-neo
- text generation
- pytorch
- causal-lm
pipeline_tag: text-generation
datasets:
- yhavinga/mc4_nl_cleaned
---
# GPT-Neo 125M pre-trained on cleaned Dutch mC4 🇳🇱

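To try the model, a `text-generation` pipeline is the quickest route. Below is a minimal sketch; the repo id `yhavinga/gpt-neo-125M-dutch` is an assumption inferred from this card's author and title, so substitute the actual checkpoint name:

```python
from transformers import pipeline

# Hypothetical repo id, inferred from the card's author and title.
generator = pipeline("text-generation", model="yhavinga/gpt-neo-125M-dutch")

# One of the widget prompts from the front matter above.
result = generator("In het jaar 2030 zullen we", max_length=50, do_sample=True)
print(result[0]["generated_text"])
```
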
Dataset:

* [mC4 NL Cleaned](https://huggingface.co/datasets/yhavinga/mc4_nl_cleaned)
* dataset config: mC4 NL, filtered to contain only newspapers and Wikipedia
* total tokens: 3.9B

Tokenizer:

* Tokenizer trained on mC4 with scripts from the Hugging Face
  Transformers [Flax examples](https://github.com/huggingface/transformers/tree/master/examples/flax/language-modeling) (see the sketch below)

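For reference, a tokenizer along these lines can be trained with the `tokenizers` library, as the Flax examples do. This is only a sketch under assumptions: the dataset config name (`tiny`), the `text` field name, the GPT-Neo-style vocab size, and the special token are not stated in this card.

```python
from datasets import load_dataset
from tokenizers import ByteLevelBPETokenizer

# Config name "tiny" is an assumption; pick the config the model was trained on.
dataset = load_dataset("yhavinga/mc4_nl_cleaned", "tiny", split="train", streaming=True)

def batch_iterator(batch_size=1000):
    """Yield batches of raw text for tokenizer training."""
    batch = []
    for example in dataset:
        batch.append(example["text"])  # "text" field name assumed, as in mC4
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

tokenizer = ByteLevelBPETokenizer()
tokenizer.train_from_iterator(
    batch_iterator(),
    vocab_size=50257,                  # assumed: GPT-Neo convention
    min_frequency=2,
    special_tokens=["<|endoftext|>"],  # assumed: GPT-Neo end-of-text token
)
tokenizer.save("tokenizer.json")
```
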
Training details:

* Trained for 558,608 steps with batch size 128
* Optimizer: AdamW
* Block size: 512
* Learning rate: 2.4e-3 (schedule sketched below)
* Warmup steps: 5000
* Epochs: 8

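The hyperparameters above map onto an `optax` optimizer roughly as follows. This is a sketch assuming the linear warmup plus linear decay schedule used by the Flax causal-LM example script; the AdamW betas and weight decay are assumptions, as the card does not state them.

```python
import optax

total_steps = 558_608
warmup_steps = 5_000
peak_lr = 2.4e-3

# Linear warmup to the peak rate, then linear decay to zero (decay shape assumed).
warmup = optax.linear_schedule(init_value=0.0, end_value=peak_lr, transition_steps=warmup_steps)
decay = optax.linear_schedule(init_value=peak_lr, end_value=0.0, transition_steps=total_steps - warmup_steps)
schedule = optax.join_schedules([warmup, decay], boundaries=[warmup_steps])

# b1/b2/weight_decay are assumed defaults, not taken from this card.
optimizer = optax.adamw(learning_rate=schedule, b1=0.9, b2=0.999, weight_decay=0.01)
```
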
Jan 2022

* Many thanks to the [Google TPU Research Cloud](https://sites.research.google/trc/about/) for providing access to a TPU cluster!
* Thanks to @gsarti for creating the [t5-flax-gcp repository](https://github.com/gsarti/t5-flax-gcp).
* Also thanks to the creators of [gpt2-medium-persian](https://huggingface.co/flax-community/gpt2-medium-persian) and
  [gpt2-medium-indonesian](https://huggingface.co/flax-community/gpt2-medium-indonesian)
  for sharing their training scripts!