Gemma-2-2b-it Pretrained for Luganda

Model Description

This model is a continued pretraining of Gemma-2-2b-it on Luganda text. It was trained on Luganda Wikipedia articles to adapt it for Luganda language understanding and generation.

Model Details

  • Base Model: unsloth/gemma-2-2b-it
  • Pretraining Data:
    • Luganda Wikipedia articles (wikimedia/wikipedia 20231101.lg)
  • Training Method: LoRA with unsloth optimization
  • Context Length: 2048 tokens
  • Training Hardware: Tesla T4 GPU

Training Process

The model was trained using the following configuration:

LoRA Configuration

  • LoRA rank (r): 128
  • Target modules:
    • q_proj, k_proj, v_proj, o_proj
    • gate_proj, up_proj, down_proj
    • embed_tokens, lm_head
  • LoRA alpha: 32
  • LoRA dropout: 0
  • Used RS-LoRA (Rank Stabilized LoRA)
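
A minimal sketch of this adapter setup, assuming unsloth's standard FastLanguageModel.get_peft_model API (values mirror the list above):

from unsloth import FastLanguageModel

# Load the base model in 4-bit to fit on a Tesla T4
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-2-2b-it",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach the LoRA adapters; embed_tokens and lm_head are included
# so the vocabulary can adapt to Luganda during continued pretraining
model = FastLanguageModel.get_peft_model(
    model,
    r = 128,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      "embed_tokens", "lm_head"],
    lora_alpha = 32,
    lora_dropout = 0,
    use_rslora = True,  # Rank Stabilized LoRA
)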

Training Parameters

  • Batch size: 2 per device, with gradient accumulation steps of 8 (effective batch size: 16)
  • Learning rates:
    • General: 5e-5
    • Embeddings: 1e-6 (reduced for stability)
  • Training epochs: 10
  • Warmup steps: 10
  • Warmup ratio: 0.1 (when both are set, the HF Trainer uses warmup_steps)
  • Weight decay: 0.01
  • Optimizer: AdamW 8-bit
  • LR scheduler: Linear
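
Under these settings, the trainer setup would look roughly like the sketch below. It assumes unsloth's UnslothTrainer and UnslothTrainingArguments, whose embedding_learning_rate field applies the reduced learning rate to embed_tokens and lm_head; dataset and output_dir are placeholders:

from unsloth import UnslothTrainer, UnslothTrainingArguments, is_bfloat16_supported

trainer = UnslothTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,       # templated Wikipedia articles (see Data Processing)
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = UnslothTrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 8,
        learning_rate = 5e-5,
        embedding_learning_rate = 1e-6,  # reduced LR for embedding matrices
        num_train_epochs = 10,
        warmup_steps = 10,
        weight_decay = 0.01,
        optim = "adamw_8bit",
        lr_scheduler_type = "linear",
        fp16 = not is_bfloat16_supported(),  # a Tesla T4 falls back to fp16
        bf16 = is_bfloat16_supported(),
        output_dir = "outputs",
    ),
)
trainer.train()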

Data Processing

Each article was formatted with the following template (the Luganda headings translate roughly to "Wikipedia article", "### Title:", and "### Passage:"):

Ekyawandiikibwa kya Wikipedia
### Omutwe: {title}

### Akawayiro:
{text}
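
A sketch of how this formatting could be applied, assuming the title and text columns of the wikimedia/wikipedia dataset (format_articles is an illustrative helper; appending the EOS token to separate documents is an assumption, and tokenizer comes from the model load above):

from datasets import load_dataset

dataset = load_dataset("wikimedia/wikipedia", "20231101.lg", split = "train")

template = "Ekyawandiikibwa kya Wikipedia\n### Omutwe: {title}\n\n### Akawayiro:\n{text}"

def format_articles(batch):
    # Map each article's title/text into the pretraining template and
    # append EOS so consecutive documents stay separated when packed
    texts = [template.format(title = t, text = x) + tokenizer.eos_token
             for t, x in zip(batch["title"], batch["text"])]
    return {"text": texts}

dataset = dataset.map(format_articles, batched = True)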

Checkpoints

This repository contains multiple checkpoints from the pretraining process:

  • checkpoint-500
  • checkpoint-1000
  • checkpoint-1500
  • checkpoint-2000
  • checkpoint-2500
  • checkpoint-2530 (final)
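
To load an intermediate checkpoint instead of the final weights, one option is to download just that subdirectory and point unsloth at the local path (a sketch using huggingface_hub; checkpoint-2000 is only an example):

from huggingface_hub import snapshot_download
from unsloth import FastLanguageModel

# Fetch only the checkpoint-2000 subdirectory from the repo
local_dir = snapshot_download(
    repo_id = "Bronsn/gemma-2-2b-it-pretrained",
    allow_patterns = ["checkpoint-2000/*"],
)

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = f"{local_dir}/checkpoint-2000",
    max_seq_length = 2048,
    load_in_4bit = True,
)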

Usage

from unsloth import FastLanguageModel
import torch

# Load the model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Bronsn/gemma-2-2b-it-pretrained",
    max_seq_length = 2048,
    dtype = None,  # Auto-detect
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # Enable unsloth's fast inference mode

# Generate a continuation for a prompt in the pretraining template
text = "Ekyawandiikibwa kya Wikipedia\n### Omutwe: Uganda\n\n### Akawayiro:\n"
inputs = tokenizer(text, return_tensors="pt").to(model.device)  # Match the model's device
outputs = model.generate(**inputs, max_new_tokens=100)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)

Limitations

  • The model is adapted specifically for Luganda; quality on other languages may be reduced relative to the base model
  • Performance may vary on dialectal variation or code-mixed text
  • The model inherits the limitations of the base Gemma-2-2b-it model

Citation

If you use this model, please cite:

@misc{luganda-gemma-pretrained,
  author = {Bronsn},
  title = {Gemma-2-2b-it Pretrained for Luganda},
  year = {2025},
  publisher = {HuggingFace}
}

License

This model inherits the licensing terms from the base Gemma-2-2b-it model. For more details, please refer to Gemma's license.
