---
language: nl
widget:
- text: "In het jaar 2030 zullen we"
- text: "Toen ik gisteren volledig in de ban was van"
- text: "Studenten en leraren van de Bogazici Universiteit in de Turkse stad Istanbul"
- text: "In Israël was een strenge lockdown"
tags:
- gpt-neo-125M
- gpt-neo
- text generation
- pytorch
- causal-lm
pipeline_tag: text-generation
datasets:
- yhavinga/mc4_nl_cleaned
---

# GPT-Neo 125M pre-trained on cleaned Dutch mC4 🇳🇱
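
Example usage, as a minimal sketch with the 🤗 Transformers `pipeline` API. The checkpoint id `yhavinga/gpt-neo-125M-dutch` is an assumption based on this card's tags and dataset; substitute this repository's actual id:

```python
from transformers import pipeline

# NOTE: the model id below is an assumption, not stated on this card;
# replace it with this repository's actual id on the Hugging Face Hub.
generator = pipeline("text-generation", model="yhavinga/gpt-neo-125M-dutch")

# One of the widget prompts above ("In the year 2030 we will").
result = generator("In het jaar 2030 zullen we", max_length=50, do_sample=True)
print(result[0]["generated_text"])
```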

Dataset:

* [mC4 NL Cleaned](https://huggingface.co/datasets/yhavinga/mc4_nl_cleaned)
* dataset config: Dutch mC4 filtered to contain only newspapers and Wikipedia (see the loading sketch after this list)
* total tokens: 3.9B
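
A loading sketch for this dataset. The exact config identifier for the filtered subset is not stated on this card, so `<config_name>` below is a placeholder; pick the matching config from the dataset card:

```python
from datasets import load_dataset

# "<config_name>" is a placeholder: the identifier of the
# newspapers-and-Wikipedia config is not stated on this card.
dataset = load_dataset("yhavinga/mc4_nl_cleaned", "<config_name>",
                       split="train", streaming=True)
print(next(iter(dataset))["text"][:200])
```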

Tokenizer:

* Tokenizer trained on mC4 with scripts from the Hugging Face Transformers [Flax examples](https://github.com/huggingface/transformers/tree/master/examples/flax/language-modeling) (see the sketch below)
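
A minimal sketch of that tokenizer-training recipe, following the byte-level BPE setup in the Flax examples. The `vocab_size` of 50257 is an assumption (the standard GPT-Neo/GPT-2 vocabulary size), not a value stated on this card:

```python
from datasets import load_dataset
from tokenizers import ByteLevelBPETokenizer

# Placeholder config, as above; pick the real one from the dataset card.
dataset = load_dataset("yhavinga/mc4_nl_cleaned", "<config_name>", split="train")

def batch_iterator(batch_size=1000):
    # Stream the raw text column in batches to the tokenizer trainer.
    for i in range(0, len(dataset), batch_size):
        yield dataset[i : i + batch_size]["text"]

tokenizer = ByteLevelBPETokenizer()
tokenizer.train_from_iterator(
    batch_iterator(),
    vocab_size=50257,  # assumption: standard GPT-Neo/GPT-2 vocabulary size
    min_frequency=2,
    special_tokens=["<|endoftext|>"],
)
tokenizer.save("./tokenizer.json")
```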

Training details (an optimizer sketch built from these values follows the list):

* Trained for 558,608 steps with batch size 128
* Optimizer: AdamW
* Block size: 512
* Learning rate: 2.4e-3
* Warmup steps: 5000
* Epochs: 8
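
A sketch of how these hyperparameters map onto an Optax optimizer. The linear warmup plus linear decay shape mirrors the Hugging Face Flax causal-LM example, and the `weight_decay` value is an assumption; neither is stated on this card:

```python
import optax

total_steps = 558_608  # from the list above
warmup_steps = 5_000
peak_lr = 2.4e-3

# Linear warmup to the peak learning rate, then linear decay to zero.
schedule = optax.join_schedules(
    schedules=[
        optax.linear_schedule(0.0, peak_lr, warmup_steps),
        optax.linear_schedule(peak_lr, 0.0, total_steps - warmup_steps),
    ],
    boundaries=[warmup_steps],
)

optimizer = optax.adamw(learning_rate=schedule, weight_decay=0.01)  # weight_decay is an assumption
```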

Jan 2022

* Many thanks to the [Google TPU Research Cloud](https://sites.research.google/trc/about/) for providing access to a TPU cluster!
* Thanks to @gsarti for creating the [t5-flax-gcp repository](https://github.com/gsarti/t5-flax-gcp).
* Also thanks to the creators of [gpt2-medium-persian](https://huggingface.co/flax-community/gpt2-medium-persian) and [gpt2-medium-indonesian](https://huggingface.co/flax-community/gpt2-medium-indonesian) for sharing their training scripts!