---
language: nl
widget:
- text: "Een zalig kerstfeest en "
- text: "Na een lange reeks vertragingen zal eind volgende week de James Webb Space Telescope (JWST) de aarde verlaten. Met een vergulde spiegel van "
tags:
- gpt2-medium
- gpt2
pipeline_tag: text-generation
datasets:
- yhavinga/mc4_nl_cleaned
---
# GPT2-Medium pre-trained on cleaned Dutch mC4 🇳🇱
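Since the pipeline tag is `text-generation`, a minimal usage sketch with 🤗 Transformers is given below. The model id `yhavinga/gpt2-medium-dutch` is an assumption based on this card's namespace; substitute the actual repository id.

```python
from transformers import pipeline

# NOTE: the model id is an assumption based on this card's namespace;
# replace it with the actual repository id of this model.
generator = pipeline("text-generation", model="yhavinga/gpt2-medium-dutch")

# One of the widget prompts from this card ("A merry Christmas and ").
prompt = "Een zalig kerstfeest en "
result = generator(prompt, max_length=50, do_sample=True, top_k=50)
print(result[0]["generated_text"])
```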
Training details (an optimizer sketch in code follows the list):

* trained for 240k steps (as of 29 Dec 2021)
* block size: 512
* optimizer: Adam, lr 8e-4, beta1 0.9, beta2 0.98
* warmup: 5000 steps
* weight decay: 0.01
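For reference, here is a minimal optax sketch of the optimizer setup implied by the hyperparameters above. The actual training script is the adapted t5-flax-gcp code, so this is illustrative rather than the exact configuration; in particular, the post-warmup schedule is assumed constant, since the card only lists the warmup length and peak learning rate.

```python
import optax

# Linear warmup from 0 to the peak lr (8e-4) over 5000 steps, as listed above.
# ASSUMPTION: the rate is held constant after warmup; the card does not
# specify a decay schedule.
schedule = optax.join_schedules(
    schedules=[
        optax.linear_schedule(init_value=0.0, end_value=8e-4, transition_steps=5_000),
        optax.constant_schedule(8e-4),
    ],
    boundaries=[5_000],
)

# Adam with the listed betas; the listed weight decay of 0.01 is applied
# as decoupled weight decay via adamw.
optimizer = optax.adamw(
    learning_rate=schedule,
    b1=0.9,
    b2=0.98,
    weight_decay=0.01,
)
```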
Work in progress. Dec 2021.
* Many thanks to the [Google TPU Research Cloud](https://sites.research.google/trc/about/) for providing access to a TPU cluster!
* Thanks to @gsarti for creating the [t5-flax-gcp repository](https://github.com/gsarti/t5-flax-gcp).
* Also thanks to the creators of [gpt2-medium-persian](https://huggingface.co/flax-community/gpt2-medium-persian) and [gpt2-medium-indonesian](https://huggingface.co/flax-community/gpt2-medium-indonesian) for sharing their training scripts!