|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- Open-Orca/SlimOrca |
|
--- |
|
# Instruction-Tuned Mamba 2.8B on SlimOrca Dataset |
|
|
|
## Overview |
|
This repository contains the [2.8 billion parameter Mamba model](https://huggingface.co/state-spaces/mamba-2.8b), fine-tuned on a 20k-example subset of the [SlimOrca dataset](https://huggingface.co/datasets/Open-Orca/SlimOrca). Many thanks to Justus Mattern from Haven for contributing essential code in the [mamba-chat repository](https://github.com/havenhq/mamba-chat).
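
The exact 20k subset used for fine-tuning is not published with this model; as a rough illustration only (not the actual training script), a comparable slice of SlimOrca could be drawn with the `datasets` library:

```python
from datasets import load_dataset

# Illustrative only: the model card does not specify how the 20k subset was selected.
# This draws a random 20k-example slice of SlimOrca.
slim_orca = load_dataset("Open-Orca/SlimOrca", split="train")
subset = slim_orca.shuffle(seed=42).select(range(20_000))

# Each example holds ShareGPT-style turns (system / human / gpt)
print(subset[0]["conversations"])
```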
|
|
|
|
|
## Usage Instructions |
|
To use the fine-tuned model, run the Python snippet below. It assumes a CUDA-capable GPU and that `transformers` and the `mamba-ssm` package (typically installed together with `causal-conv1d`, e.g. `pip install mamba-ssm causal-conv1d`) are available:
|
|
|
```python
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

device = "cuda"

# Load the tokenizer and borrow Zephyr's chat template for prompt formatting
tokenizer = AutoTokenizer.from_pretrained("Schmadge/mamba-slim-orca")
tokenizer.eos_token = tokenizer.pad_token = "<|endoftext|>"
tokenizer.chat_template = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta").chat_template

# Load the fine-tuned model in half precision
model = MambaLMHeadModel.from_pretrained("Schmadge/mamba-slim-orca", device=device, dtype=torch.float16)

def generate_response(system_prompt, user_prompt):
    # Format the conversation with the chat template
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]
    input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(device)

    # Sample a completion and decode it
    out = model.generate(
        input_ids=input_ids,
        max_length=2000,
        temperature=0.3,
        top_p=0.7,
        eos_token_id=tokenizer.eos_token_id,
    )
    decoded = tokenizer.batch_decode(out)

    # Keep only the assistant's turn and strip the end-of-text marker
    return decoded[0].split("<|assistant|>\n")[-1].replace("<|endoftext|>", "")

system_prompt = "You are an AI assistant. Provide a detailed answer so user don't need to search outside to understand the answer."
user_prompt = "In a room I have only 3 sisters. Anna is reading a book. Alice is playing a match of chess. What is the third sister, Amanda, doing?"
response = generate_response(system_prompt, user_prompt)
print(response)
# Based on the information provided, we can infer that Amanda is playing a match of chess with Alice. Since Anna is reading a book, it is reasonable to assume that Amanda is playing a game of chess with Alice, as this is a common activity for siblings to engage in together.
```
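
Because the model reuses Zephyr's chat template, you can see exactly what string the model is fed (and why the code above splits on `"<|assistant|>\n"`) by rendering the template without tokenizing. The expected output shape below is approximate; the precise whitespace depends on the template:

```python
# Render the chat template as plain text to inspect the prompt format
text = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    tokenize=False,
    add_generation_prompt=True,
)
print(text)
# Approximate shape (Zephyr-style turns, with <|endoftext|> as the end-of-turn marker):
# <|system|>
# You are a helpful assistant.<|endoftext|>
# <|user|>
# Hello!<|endoftext|>
# <|assistant|>
```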
|
|
|
## References
|
|
|
Mamba Chat: |
|
```bibtex |
|
@misc{haven2023mambachat, |
|
title = {Mamba-Chat}, |
|
author = {Justus Mattern and Konstantin Hohr}, |
|
year = {2023}, |
|
howpublished = {GitHub}, |
|
url = {https://github.com/havenhq/mamba-chat} |
|
} |
|
``` |
|
|
|
|
|
Mamba: |
|
```bibtex |
|
@article{mamba, |
|
title={Mamba: Linear-Time Sequence Modeling with Selective State Spaces}, |
|
author={Gu, Albert and Dao, Tri}, |
|
journal={arXiv preprint arXiv:2312.00752}, |
|
year={2023} |
|
} |
|
``` |
|
|
|
SlimOrca: |
|
```bibtex |
|
@misc{SlimOrca, |
|
title = {SlimOrca: An Open Dataset of GPT-4 Augmented FLAN Reasoning Traces, with Verification}, |
|
author = {Wing Lian and others}, |
|
year = {2023}, |
|
publisher = {HuggingFace}, |
|
url = {https://huggingface.co/datasets/Open-Orca/SlimOrca}
|
} |
|
``` |