Instruction-Tuned Mamba 2.8B on SlimOrca Dataset

Overview

This repository features the 2.8 billion parameter Mamba model, fine-tuned on a 20k-example subset of the SlimOrca dataset. Thanks to Justus Mattern from Haven for contributing essential code in the mamba-chat repository.

Usage Instructions

To run the fine-tuned model, use the Python snippet below:

import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# This snippet assumes a CUDA-capable GPU.
device = "cuda"
tokenizer = AutoTokenizer.from_pretrained("Schmadge/mamba-slim-orca")
tokenizer.eos_token = tokenizer.pad_token = "<|endoftext|>"
# Reuse the Zephyr chat template to format the system, user, and assistant turns.
tokenizer.chat_template = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta").chat_template
model = MambaLMHeadModel.from_pretrained("Schmadge/mamba-slim-orca", device=device, dtype=torch.float16)

def generate_response(system_prompt, user_prompt):
    # Preparing the prompt
    prompt = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]
    input_ids = tokenizer.apply_chat_template(prompt, return_tensors="pt", add_generation_prompt=True).to(device)

    # Generating the response
    out = model.generate(input_ids=input_ids, max_length=2000, temperature=0.3, top_p=0.7, eos_token_id=tokenizer.eos_token_id)
    decoded = tokenizer.batch_decode(out)

    return decoded[0].split("<|assistant|>\n")[-1].replace('<|endoftext|>','')

system_prompt = "You are an AI assistant. Provide a detailed answer so user don't need to search outside to understand the answer."
user_prompt = "In a room I have only 3 sisters. Anna is reading a book. Alice is playing a match of chess.What the third sister, Amanda is doing ?"
response = generate_response(system_prompt, user_prompt)
print(response)
# Example output:
# Based on the information provided, we can infer that Amanda is playing a match of chess with Alice. Since Anna is reading a book, it is reasonable to assume that Amanda is playing a game of chess with Alice, as this is a common activity for siblings to engage in together.
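
The post-processing step in generate_response can be tried without loading the model. The sketch below is only an illustration: build_prompt and extract_response are hypothetical helper names, and the prompt layout is an assumed approximation of what the Zephyr chat template renders once the EOS token is set to <|endoftext|>. It is not read from the tokenizer itself. The extraction logic mirrors the split-and-replace on the return line of generate_response.

```python
EOS = "<|endoftext|>"              # EOS/pad token set in the usage snippet
ASSISTANT_TAG = "<|assistant|>\n"  # marker that generate_response splits on


def build_prompt(system_prompt: str, user_prompt: str) -> str:
    """Approximate the Zephyr-style layout (assumed, not taken from the tokenizer)."""
    return (
        f"<|system|>\n{system_prompt}{EOS}\n"
        f"<|user|>\n{user_prompt}{EOS}\n"
        f"{ASSISTANT_TAG}"
    )


def extract_response(decoded: str) -> str:
    """Keep the text after the last assistant marker and drop EOS tokens."""
    return decoded.split(ASSISTANT_TAG)[-1].replace(EOS, "")


prompt = build_prompt("You are an AI assistant.", "Name one state space model.")
# Stand-in for the decoded model output: the prompt followed by a reply and EOS.
decoded = prompt + "Mamba is one example." + EOS
print(extract_response(decoded))
```

Splitting on the last assistant marker means only the newest reply is returned, even if earlier assistant turns appear in the prompt.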

References:

Mamba Chat:

@misc{haven2023mambachat,
  title        = {Mamba-Chat},
  author       = {Justus Mattern and Konstantin Hohr},
  year         = {2023},
  howpublished = {GitHub},
  url          = {https://github.com/havenhq/mamba-chat}
}

Mamba:

@article{mamba,
  title={Mamba: Linear-Time Sequence Modeling with Selective State Spaces},
  author={Gu, Albert and Dao, Tri},
  journal={arXiv preprint arXiv:2312.00752},
  year={2023}
}

SlimOrca:

@misc{SlimOrca,
  title = {SlimOrca: An Open Dataset of GPT-4 Augmented FLAN Reasoning Traces, with Verification},
  author = {Wing Lian and others},
  year = {2023},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Open-Orca/SlimOrca}
}