Model Card for GPT-2 Tigrinya Medium

Model Summary

This is a GPT-2 model trained from scratch on Tigrinya text data. It was trained on 20.6 million tokens, primarily from news sources.

Model Description

  • Model type: GPT-2
  • Language: Tigrinya (ትግርኛ)
  • Finetuned from model: None; trained from scratch rather than fine-tuned from an existing checkpoint

Model Architecture

  • Parameters: 51.9M
  • Context Window: 128 tokens
  • Vocabulary Size: 52,000
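
These figures can be read back from the published checkpoint. A minimal sketch, assuming the standard Hugging Face GPT2Config field names:

from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained('luel/gpt2-tigrinya-medium')
print(config.n_positions)  # context window, expected 128
print(config.vocab_size)   # vocabulary size, expected 52,000

model = AutoModelForCausalLM.from_pretrained('luel/gpt2-tigrinya-medium')
print(sum(p.numel() for p in model.parameters()))  # ~51.9M parameters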

Training Details

  • Training regime: fp16 mixed precision
  • Number of Epochs: 12
  • Batch Size: 6 (with gradient accumulation over 8 steps, for an effective batch size of 48)
  • Learning Rate: 5e-4
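
The training script itself is not published with this card. As a hedged sketch, the reported hyperparameters map onto Hugging Face TrainingArguments roughly as follows (output_dir and any omitted settings are assumptions):

from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters;
# not the author's actual training script.
args = TrainingArguments(
    output_dir='gpt2-tigrinya-medium',  # assumed name
    num_train_epochs=12,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=8,      # effective batch size: 6 * 8 = 48
    learning_rate=5e-4,
    fp16=True,                          # fp16 mixed precision
)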

Evaluation

  • Training Perplexity: 28.6
  • Training Loss: 3.12

Usage

from transformers import pipeline

# Load the model (downloads the weights on first use)
generator = pipeline('text-generation', model='luel/gpt2-tigrinya-medium')

prompt = "ክልል ትግራይ"  # "Tigray Region"
# Generate a continuation; max_length counts prompt tokens plus new tokens
text = generator(prompt, max_length=100)[0]['generated_text']
print(text)
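
For more varied output, standard generation parameters can be passed through the pipeline. The values below are illustrative sampling settings, not settings documented for this model:

# Sampling instead of greedy decoding; parameter values are assumptions
outputs = generator(
    prompt,
    max_new_tokens=80,  # generate up to 80 tokens after the prompt
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=0.8,
)
print(outputs[0]['generated_text'])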

Limitations

  • Limited context window of 128 tokens; longer prompts must be truncated (see the sketch below).
  • Best suited for medium-length Tigrinya text generation.
  • Outputs should be reviewed for accuracy.
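
Because of the 128-token context window, long prompts should be truncated before generation. A minimal sketch using the checkpoint's own tokenizer; the truncation length is derived from the stated context window, and long_prompt is a placeholder:

from transformers import AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained('luel/gpt2-tigrinya-medium')
generator = pipeline('text-generation', model='luel/gpt2-tigrinya-medium')

long_prompt = '...'  # any Tigrinya text, possibly longer than 128 tokens

# Keep the prompt within the context window, leaving room for new tokens
ids = tokenizer(long_prompt, truncation=True, max_length=100)['input_ids']
prompt = tokenizer.decode(ids)
print(generator(prompt, max_new_tokens=28)[0]['generated_text'])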