Experimental Norwegian GPT-2 model trained on a 37 GB corpus consisting mainly of social media text.
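
A minimal sketch of loading the model for text generation with Hugging Face Transformers is shown below. The model identifier is a placeholder, not the actual repository name, and the generation settings are illustrative only.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Placeholder repository id; replace with the actual model id on the Hub.
MODEL_ID = "your-org/norwegian-gpt2-social"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Generate a short continuation of a Norwegian prompt.
prompt = "Det var en gang"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```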
The following sub-corpora are used:
- wikipedia_download_nb.jsonl
- wikipedia_download_nn.jsonl
- newspapers_online_nb.jsonl
- newspapers_online_nn.jsonl
- twitter_2016_2018_no.jsonl
- twitter_news_2016_2018_no.jsonl
- open_subtitles_no.jsonl
- facebook_no.jsonl
- reddit_no.jsonl
- vgdebatt_no.jsonl
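
As an illustration of how these sub-corpora could be combined into a single plain-text training corpus, here is a minimal sketch that reads each JSONL file and concatenates the documents. The local directory layout, the `text` field name, and the output path are assumptions; the actual preprocessing pipeline is not documented here.

```python
import json
from pathlib import Path

# Sub-corpora listed above; assumed to sit in a local "corpus/" directory.
CORPUS_DIR = Path("corpus")
SUB_CORPORA = [
    "wikipedia_download_nb.jsonl",
    "wikipedia_download_nn.jsonl",
    "newspapers_online_nb.jsonl",
    "newspapers_online_nn.jsonl",
    "twitter_2016_2018_no.jsonl",
    "twitter_news_2016_2018_no.jsonl",
    "open_subtitles_no.jsonl",
    "facebook_no.jsonl",
    "reddit_no.jsonl",
    "vgdebatt_no.jsonl",
]


def iter_documents(path: Path):
    """Yield the text of each document in a JSONL file.

    Assumes one JSON object per line with the document text stored under a
    "text" key; adjust the key to match the real schema.
    """
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            text = record.get("text", "")
            if text:
                yield text


def build_training_file(output_path: Path = Path("train_no.txt")) -> None:
    """Concatenate all sub-corpora into one plain-text training file."""
    with output_path.open("w", encoding="utf-8") as out:
        for name in SUB_CORPORA:
            for document in iter_documents(CORPUS_DIR / name):
                out.write(document)
                out.write("\n")


if __name__ == "__main__":
    build_training_file()
```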