LuxGPT-2
GPT-2 model for text generation in Luxembourgish, trained on 667 MB of text data consisting of RTL.lu news articles, comments, parliament speeches, the Luxembourgish Wikipedia, Newscrawl, Webcrawl and subtitles. The training took place on a 32 GB Nvidia Tesla V100:
- with an initial learning rate of 5e-5
- with a batch size of 4
- for 109 hours
- for 30 epochs
- using the transformers library
More detailed training information can be found in "trainer_state.json".
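To inspect that training information, the file can be read as plain JSON. The sketch below assumes the file follows the usual layout written by the transformers Trainer, where a top-level "log_history" list holds one dictionary per logging step with keys such as "step" and "loss". The helper name load_loss_history and the local file path are illustrative, not part of the model repository.

```python
import json
from pathlib import Path


def load_loss_history(path):
    """Return (step, loss) pairs from a Trainer-style trainer_state.json.

    Assumption: the file has a top-level "log_history" list whose
    training entries carry "step" and "loss" keys. Entries without
    both keys (for example evaluation logs) are skipped.
    """
    state = json.loads(Path(path).read_text(encoding="utf-8"))
    history = state.get("log_history", [])
    return [
        (entry["step"], entry["loss"])
        for entry in history
        if "step" in entry and "loss" in entry
    ]


if __name__ == "__main__":
    # Hypothetical local path; download trainer_state.json from the
    # model repository first.
    for step, loss in load_loss_history("trainer_state.json"):
        print(f"step {step}: loss {loss}")
```

Plotting these pairs is a quick way to check how the loss evolved over the 30 epochs.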
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("laurabernardy/LuxGPT2")
model = AutoModelForCausalLM.from_pretrained("laurabernardy/LuxGPT2")
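The lines above only load the tokenizer and the model. A minimal sketch of actually generating text could look like the following; the helper name generate_text, the prompt, and the sampling settings (top_k, top_p, max_new_tokens) are illustrative assumptions, not values recommended by the model author. It needs transformers and PyTorch installed and downloads the weights on first use.

```python
def generate_text(prompt, model_name="laurabernardy/LuxGPT2", max_new_tokens=50):
    """Generate a sampled continuation of `prompt` with the given model.

    The sampling parameters below are arbitrary example values.
    """
    # Imported lazily so that defining this helper does not load the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Encode the prompt as PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")

    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_k=50,
        top_p=0.95,
        # GPT-2 has no dedicated padding token, so reuse end-of-sequence.
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode the first (and only) returned sequence back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Example Luxembourgish prompt ("Luxembourg is"), chosen arbitrarily.
    print(generate_text("Lëtzebuerg ass"))
```

Because sampling is enabled, each call returns a different continuation.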
Limitations and Biases
See the GPT-2 model card for considerations on limitations and bias, and the GPT-2 documentation for details on the model.
Evaluation results
- accuracy on Luxembourgish Test Dataset (self-reported): 0.33
- perplexity on Luxembourgish Test Dataset (self-reported): 46.69
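To put the perplexity figure in context: perplexity is commonly defined as the exponential of the mean per-token cross-entropy loss. The card does not state how the value was computed, so the sketch below assumes that definition with the loss measured in nats (natural logarithm). Under that assumption, a perplexity of 46.69 corresponds to a mean loss of roughly 3.84.

```python
import math


def perplexity_from_loss(mean_loss):
    """Perplexity as exp of the mean per-token cross-entropy (in nats)."""
    return math.exp(mean_loss)


def loss_from_perplexity(perplexity):
    """Inverse of perplexity_from_loss: mean loss implied by a perplexity."""
    return math.log(perplexity)


# Value reported in the evaluation results above.
reported_perplexity = 46.69
implied_loss = loss_from_perplexity(reported_perplexity)
print(f"implied mean loss: {implied_loss:.2f} nats per token")
```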