---
language:
- sw
- en
---

# PAWA: Swahili SLM for Various Tasks

---

## Overview

**PAWA** is a Swahili-specialized language model designed to excel in tasks requiring nuanced understanding and interaction in Swahili and English. It leverages supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) for improved performance and consistency. Below are the model specifications, installation steps, usage examples, and intended applications.

---

### Model Details

- **Model Name**: Pawa-mini-V0.1
- **Model Type**: PAWA
- **Architecture**:
  - 2B-parameter Gemma-2 base model
  - Enhanced with Swahili SFT and DPO datasets
- **Languages Supported**:
  - Swahili
  - English
  - Custom tokenizer for multi-language flexibility
- **Primary Use Cases**:
  - Contextually rich Swahili-focused tasks
  - General assistance and chat-based interactions
- **License**: Custom; contact the author for terms of use

---

### Installation and Setup

Ensure the necessary libraries are installed and up to date:

```bash
pip uninstall transformers -y && pip install --upgrade --no-cache-dir "git+https://github.com/huggingface/transformers.git"
pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
pip install datasets
```

---

### Model Loading

You can load the model with the following snippet:

```python
from unsloth import FastLanguageModel
import torch

model_name = "sartifyllc/Pawa-mini-V0.1"
max_seq_length = 2048
dtype = None          # auto-detect; set torch.float16 or torch.bfloat16 to force a dtype
load_in_4bit = False  # set True for 4-bit quantized loading (see Limitations)

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)
```

---

### Chat Template Configuration

For a smooth conversational experience, configure the tokenizer with the appropriate chat template:

```python
from unsloth.chat_templates import get_chat_template

FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

tokenizer = get_chat_template(
    tokenizer,
    chat_template="chatml",  # Supports templates such as zephyr, chatml, mistral, etc.
    mapping={"role": "from", "content": "value", "user": "human", "assistant": "gpt"},  # ShareGPT-style keys
    map_eos_token=True,  # Maps <|im_end|> to the tokenizer's EOS token
)
```

---

### Usage Example

Generate a short story in Swahili:

```python
from transformers import TextStreamer

messages = [{"from": "human", "value": "Tengeneza hadithi fupi"}]  # Swahili: "Write a short story"

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

text_streamer = TextStreamer(tokenizer)
_ = model.generate(input_ids=inputs, streamer=text_streamer, max_new_tokens=128, use_cache=True)
```

---

### Training and Fine-Tuning Details

- **Base Model**: Gemma-2-2B
- **Continued Pre-Training**: 3B Swahili tokens
- **Fine-Tuning**: Swahili SFT datasets for improved contextual understanding
- **Optimization**: DPO for more consistent, preference-aligned responses

---

### Intended Use Cases

- **General Assistance**: Provides structured answers for general-purpose use.
- **Interactive Q&A**: Designed for general-purpose chat environments.
- **RAG (Retrieval-Augmented Generation)**: Well suited to retrieval-augmented generation pipelines and other domain-specific use cases.

---

### Limitations

- **Biases**: The model may exhibit biases inherent in its fine-tuning datasets.
- **Generalization**: May struggle with tasks outside the trained domain.
- **Hardware Requirements**:
  - Optimal performance requires a GPU with sufficient memory (e.g., Tesla V100 or T4).
  - Supports 4-bit quantization for reduced memory usage (see the sketch below).
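The snippet below is a minimal sketch of 4-bit loading; it simply reuses the `FastLanguageModel.from_pretrained` call from the Model Loading section with `load_in_4bit=True`, and the exact memory savings will depend on your GPU.

```python
from unsloth import FastLanguageModel

# Same loading call as in "Model Loading", but with 4-bit quantization enabled
# to reduce GPU memory usage (useful on smaller GPUs such as a T4).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="sartifyllc/Pawa-mini-V0.1",
    max_seq_length=2048,
    dtype=None,         # auto-detect the compute dtype
    load_in_4bit=True,  # load quantized weights to lower the memory footprint
)
FastLanguageModel.for_inference(model)  # enable faster inference mode
```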
---

Feel free to reach out for further guidance or collaboration opportunities regarding PAWA!