PAWA: A Swahili Small Language Model (SLM) for Various Tasks


Overview

PAWA is a Swahili-specialized language model designed for tasks that require nuanced understanding and interaction in Swahili and English. It is trained with supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) for improved performance and consistency. Below are the model specifications, installation steps, usage examples, and intended applications.


Model Details

  • Model Name: Pawa-mini-V0.1
  • Model Type: PAWA
  • Architecture:
    • 2B Parameter Gemma-2 Base Model
    • Enhanced with Swahili SFT and DPO datasets.
  • Languages Supported:
    • Swahili
    • English
    • Custom tokenizer for multi-language flexibility.
  • Primary Use Cases:
    • Contextually rich Swahili-focused tasks.
    • General assistance and chat-based interactions.
  • License: Custom; contact the author for terms of use.

Installation and Setup

Ensure the necessary libraries are installed and up-to-date:

!pip uninstall transformers -y && pip install --upgrade --no-cache-dir "git+https://github.com/huggingface/transformers.git"
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install datasets
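
A quick, optional sanity check confirms that the libraries import cleanly and that a CUDA device is visible before loading the model:

import torch
import transformers
import unsloth  # fails loudly here if the install above did not succeed

print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())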

Model Loading

You can load the model using the following code snippet:

from unsloth import FastLanguageModel
import torch

model_name = "sartifyllc/Pawa-mini-V0.1"
max_seq_length = 2048  # Maximum context length for this model
dtype = None  # None lets Unsloth auto-detect (float16 on T4/V100, bfloat16 on Ampere+)
load_in_4bit = False  # Set True to load 4-bit quantized weights and reduce memory usage

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)

Chat Template Configuration

For a seamless conversational experience, configure the tokenizer with the appropriate chat template:

from unsloth.chat_templates import get_chat_template
FastLanguageModel.for_inference(model) # Enable native 2x faster inference

tokenizer = get_chat_template(
    tokenizer,
    chat_template="chatml",  # Supports templates like zephyr, chatml, mistral, etc.
    mapping={"role": "from", "content": "value", "user": "human", "assistant": "gpt"},  # ShareGPT style
    map_eos_token=True,  # Maps <|im_end|> to </s>
)
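
To see what the ChatML template actually produces, you can render a conversation to text without tokenizing it. This is a minimal sketch; the exact string depends on the template and tokenizer configuration:

# Render a ShareGPT-style conversation to the ChatML prompt string (no tokenization)
messages = [{"from": "human", "value": "Habari yako?"}]  # "How are you?"
prompt_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,  # appends the assistant turn header so the model knows to respond
)
print(prompt_text)  # e.g. "<|im_start|>user\nHabari yako?<|im_end|>\n<|im_start|>assistant\n"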

Usage Example

Generate a short story in Swahili:

messages = [{"from": "human", "value": "Tengeneza hadithi fupi"}]  # "Write a short story"
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer)
_ = model.generate(input_ids=inputs, streamer=text_streamer, max_new_tokens=128, use_cache=True)
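
If you prefer to capture the generated text instead of streaming it, you can decode the output tensor directly. A minimal sketch using the same inputs as above:

# Generate without streaming and decode only the newly generated tokens
outputs = model.generate(input_ids=inputs, max_new_tokens=128, use_cache=True)
generated = tokenizer.batch_decode(outputs[:, inputs.shape[1]:], skip_special_tokens=True)[0]
print(generated)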

Training and Fine-Tuning Details

  • Base Model: Gemma-2-2B
  • Continued Pre-training: 3B Swahili tokens.
  • Fine-tuning: Swahili SFT datasets for improved contextual understanding.
  • Optimization: Direct Preference Optimization (DPO) for more consistent, preference-aligned responses (a hedged training sketch follows this list).
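
For readers who want to reproduce a similar preference-optimization step, the sketch below shows one way to run DPO with TRL's DPOTrainer on a Gemma-2-2B checkpoint. This is a minimal sketch, not the exact PAWA training code: the dataset name your-org/swahili-dpo-pairs, the LoRA configuration, and the hyperparameters are illustrative assumptions.

# Hedged DPO sketch with Unsloth + TRL (dataset name and hyperparameters are illustrative only)
from datasets import load_dataset
from trl import DPOConfig, DPOTrainer
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="google/gemma-2-2b",  # base model named in this card
    max_seq_length=2048,
    load_in_4bit=True,
)
# Attach LoRA adapters so only a small set of weights is trained (assumed setup)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
)

# Preference dataset with "prompt", "chosen", "rejected" columns (hypothetical dataset name)
dataset = load_dataset("your-org/swahili-dpo-pairs", split="train")

trainer = DPOTrainer(
    model=model,
    ref_model=None,               # with PEFT adapters, the base weights serve as the reference policy
    args=DPOConfig(output_dir="pawa-dpo", per_device_train_batch_size=2, beta=0.1),
    train_dataset=dataset,
    processing_class=tokenizer,   # older TRL versions take tokenizer= instead
)
trainer.train()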

Intended Use Cases

  • General Assistance:
    Produces structured, helpful answers to everyday questions in Swahili or English.

  • Interactive Q&A:
    Handles chat-style conversations via the ChatML template shown above.

  • RAG (Retrieval-Augmented Generation):
    Can serve as the generation component in retrieval-augmented pipelines, producing answers grounded in retrieved Swahili or English passages (see the sketch below).
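
As an illustration of the RAG use case, the sketch below prepends a retrieved passage to the user question before applying the chat template. The retrieval step and the retrieve() helper are hypothetical; any retriever that returns relevant text will do.

# Hedged RAG sketch: ground the answer in a retrieved passage (retrieve() is a hypothetical helper)
question = "Mji mkuu wa Tanzania ni upi?"  # "What is the capital of Tanzania?"
context = retrieve(question)               # e.g. a passage returned by your own search index

messages = [{
    "from": "human",
    # "Use the following context to answer the question. Context: ... Question: ..."
    "value": f"Tumia muktadha ufuatao kujibu swali.\n\nMuktadha: {context}\n\nSwali: {question}",
}]
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to("cuda")
outputs = model.generate(input_ids=inputs, max_new_tokens=256, use_cache=True)
print(tokenizer.batch_decode(outputs[:, inputs.shape[1]:], skip_special_tokens=True)[0])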


Limitations

  • Biases:
    The model may exhibit biases inherent in its fine-tuning datasets.

  • Generalization:
    May struggle with tasks outside the trained domain.

  • Hardware Requirements:

    • Inference is intended to run on a CUDA GPU (e.g., Tesla T4 or V100); the examples above move inputs to "cuda".
    • Supports 4-bit quantization to reduce memory usage (see the sketch below).
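
To load the model in 4-bit, flip the load_in_4bit flag shown earlier. A minimal sketch, assuming the bitsandbytes package is available (the Unsloth install above normally provides it):

from unsloth import FastLanguageModel

# Same loading call as before, but with 4-bit quantization enabled to reduce GPU memory
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="sartifyllc/Pawa-mini-V0.1",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,  # quantize weights to 4-bit via bitsandbytes
)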

Feel free to reach out for further guidance or collaboration opportunities regarding PAWA!
