Icelandic GPT-2 model

This Icelandic GPT-2 language model was pretrained on the Icelandic Gigaword Corpus (IGC, 2020 version), which contains approximately 1.532 billion running words. The model has roughly 138M parameters and uses a byte-level BPE tokenizer with a vocabulary size of 51,000. It was trained for 20 epochs on a TPU v3-8, with a total training time of 3 days and 21 hours. The hyperparameters used for training can be found in the JAX/Flax documentation for the Transformers library.

Note: This model was pretrained on a tokenized and sentence-segmented version of the IGC, which is reflected in the text it generates. A new version of this model, trained on an untokenized version of the IGC (2022 version), is forthcoming.
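
The model can be loaded and queried with the Transformers library in the usual way. The snippet below is a minimal sketch: the repository id, the prompt, and the generation settings are placeholders, not values taken from this model card.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Placeholder repository id; replace with this model's actual id on the Hugging Face Hub.
model_id = "your-username/icelandic-gpt2"

tokenizer = AutoTokenizer.from_pretrained(model_id)    # byte-level BPE, 51,000-token vocabulary
model = AutoModelForCausalLM.from_pretrained(model_id) # ~138M-parameter GPT-2 architecture

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
prompt = "Ísland er"  # example Icelandic prompt: "Iceland is"
print(generator(prompt, max_new_tokens=30, do_sample=True, top_k=50)[0]["generated_text"])
```

Because the current model was trained on tokenized, sentence-segmented text, the generated output may show tokenization artifacts such as extra spaces around punctuation.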

Acknowledgments

This research was supported with Cloud TPUs from Google's TPU Research Cloud (TRC).
