Model Card for domce20/GPT2-Lithuanian

A GPT-2 based language model trained for Lithuanian.

Model Description

The model architecture is copied from the ai-forever/mGPT model, but the model is trained from scratch on a modified version of the Lithuanian partition of the mC4 dataset.
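
The card does not include the initialization or training code. Below is a minimal sketch, assuming the transformers library, of how a model with the architecture copied from ai-forever/mGPT could be instantiated from scratch; reusing the mGPT tokenizer here is an assumption made only for this illustration.

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

# Copy the architecture (layer count, hidden size, vocabulary size, ...) from
# ai-forever/mGPT, but do not load its weights: from_config() returns a model
# with randomly initialized parameters, i.e. a from-scratch starting point.
config = AutoConfig.from_pretrained("ai-forever/mGPT")
model = AutoModelForCausalLM.from_config(config)

# The tokenizer used for the Lithuanian training run is not documented in this
# card; reusing the mGPT tokenizer is an assumption made for this sketch only.
tokenizer = AutoTokenizer.from_pretrained("ai-forever/mGPT")

print(f"Parameters: {model.num_parameters() / 1e9:.2f}B")  # the card lists 1.42B
```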

Training was performed on the Vilnius University supercomputer.
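
The card does not provide a usage example. The following is a minimal sketch using the standard transformers text-generation API, with the model ID domce20/GPT2-Lithuanian from this repository:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "domce20/GPT2-Lithuanian"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Lithuanian prompt: "Vilnius is the capital of Lithuania and"
prompt = "Vilnius yra Lietuvos sostinė ir"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```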

Model size: 1.42B parameters (F32, safetensors format).