---
language:
- lg
- en
library_name: unsloth
pipeline_tag: text-generation
license: gemma
base_model: unsloth/gemma-2-2b-it
tags:
- luganda
- gemma
- pretrained
- wikipedia
- unsloth
datasets:
- wikimedia/wikipedia
---

# Gemma-2-2b-it Pretrained for Luganda

## Model Description

This model is a continued pretraining of Gemma-2-2b-it on Luganda text. It was pretrained on Luganda Wikipedia articles to adapt it for Luganda language understanding and generation.

## Model Details

- **Base Model**: unsloth/gemma-2-2b-it
- **Pretraining Data**: Luganda Wikipedia articles (wikimedia/wikipedia, 20231101.lg)
- **Training Method**: LoRA with Unsloth optimizations
- **Context Length**: 2048 tokens
- **Training Hardware**: Tesla T4 GPU

## Training Process

The model was trained with the following configuration.

### LoRA Configuration

- LoRA rank (r): 128
- Target modules:
  - q_proj, k_proj, v_proj, o_proj
  - gate_proj, up_proj, down_proj
  - embed_tokens, lm_head
- LoRA alpha: 32
- LoRA dropout: 0
- RS-LoRA (Rank-Stabilized LoRA) enabled

### Training Parameters

- Batch size: 2, with gradient accumulation steps of 8 (effective batch size 16)
- Learning rates:
  - General: 5e-5
  - Embeddings: 1e-6 (reduced for stability)
- Training epochs: 10
- Warmup steps: 10
- Warmup ratio: 0.1
- Weight decay: 0.01
- Optimizer: AdamW 8-bit
- LR scheduler: Linear

### Data Processing

Each article was formatted with the following template:

```
Ekyawandiikibwa kya Wikipedia
### Omutwe: {title}

### Akawayiro:
{text}
```

## Checkpoints

This repository contains multiple checkpoints from the pretraining process:

- checkpoint-500
- checkpoint-1000
- checkpoint-1500
- checkpoint-2000
- checkpoint-2500
- checkpoint-2530 (final)

## Usage

```python
from unsloth import FastLanguageModel

# Load the model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Bronsn/gemma-2-2b-it-pretrained",
    max_seq_length = 2048,
    dtype = None,        # Auto-detect
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # Enable Unsloth's faster inference path

# Example usage: prompt the model with the pretraining template
text = "Ekyawandiikibwa kya Wikipedia\n### Omutwe: Uganda\n\n### Akawayiro:\n"
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```

## Limitations

- The model is adapted specifically for Luganda; performance on other languages may be reduced relative to the base model
- Performance may vary on dialectal variations or code-mixed text
- The model inherits the limitations of the base Gemma-2-2b-it model

## Citation

If you use this model, please cite:

```
@misc{luganda-gemma-pretrained,
  author    = {Bronsn},
  title     = {Gemma-2-2b-it Pretrained for Luganda},
  year      = {2025},
  publisher = {HuggingFace}
}
```

## License

This model inherits the licensing terms of the base Gemma-2-2b-it model. For more details, please refer to [Gemma's license](https://ai.google.dev/gemma/terms).
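
## Appendix: Training Setup Sketch

The LoRA configuration, training parameters, and data template above map onto Unsloth's continued-pretraining workflow. The sketch below is illustrative only: it assumes the standard `FastLanguageModel.get_peft_model` / `UnslothTrainer` APIs and plugs in the hyperparameters listed in the Training Process section. It is not the exact script used for this model, and the helper name `format_wikipedia` is hypothetical.

```python
from unsloth import FastLanguageModel, UnslothTrainer, UnslothTrainingArguments
from datasets import load_dataset

# Load the base model in 4-bit for a Tesla T4
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-2-2b-it",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)

# Attach LoRA adapters with the configuration from this card
model = FastLanguageModel.get_peft_model(
    model,
    r = 128,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      "embed_tokens", "lm_head"],
    lora_alpha = 32,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    use_rslora = True,   # Rank-Stabilized LoRA
)

# Apply the Wikipedia template to each article
wiki_template = "Ekyawandiikibwa kya Wikipedia\n### Omutwe: {}\n\n### Akawayiro:\n{}"

def format_wikipedia(examples):
    texts = [
        wiki_template.format(title, text) + tokenizer.eos_token
        for title, text in zip(examples["title"], examples["text"])
    ]
    return {"text": texts}

dataset = load_dataset("wikimedia/wikipedia", "20231101.lg", split = "train")
dataset = dataset.map(format_wikipedia, batched = True)

# Continued pretraining with a reduced learning rate for the embeddings
trainer = UnslothTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = UnslothTrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 8,
        num_train_epochs = 10,
        warmup_steps = 10,
        warmup_ratio = 0.1,          # warmup_steps takes precedence when both are set
        learning_rate = 5e-5,
        embedding_learning_rate = 1e-6,
        weight_decay = 0.01,
        optim = "adamw_8bit",
        lr_scheduler_type = "linear",
        fp16 = True,                 # T4 does not support bf16
        logging_steps = 10,
        save_steps = 500,
        output_dir = "outputs",
    ),
)
trainer.train()
```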