
	---
	language:
	- en
	license: apache-2.0
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- mistral
	- trl
	- sft
	base_model: alpindale/Mistral-7B-v0.2
	---

	# Mistral-7B-v0.2-OpenHermes

	![image/webp](https://cdn-uploads.huggingface.co/production/uploads/6455cc8d679315e4ef16fbec/AbagOgU056oIB7S31XESC.webp)

	SFT Training Params:
	+ Learning Rate: 2e-4
	+ Batch Size: 8
	+ Gradient Accumulation steps: 4
	+ Dataset: teknium/OpenHermes-2.5 (the 200k split has a slight bias towards role-play and theory-of-life content)
	+ LoRA rank (r): 16
	+ LoRA alpha: 16
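
As a rough sanity check on the parameters above, the effective batch size and the LoRA scaling factor can be derived from the listed values. This is a minimal sketch under stated assumptions: a single GPU (the card mentions an A100 but not a device count), the listed batch size being per device, the standard LoRA scaling of `alpha / r`, and "200k split" meaning roughly 200,000 training examples. The variable names are illustrative only.

```python
# Values copied from the parameter list above.
per_device_batch_size = 8
gradient_accumulation_steps = 4
lora_r = 16
lora_alpha = 16

# Assumptions (not stated in the model card): one GPU, ~200,000 examples.
num_gpus = 1
num_examples = 200_000

# Samples contributing to each optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus

# Standard LoRA scaling applied to the low-rank update.
lora_scaling = lora_alpha / lora_r

# Optimizer steps needed for one pass over the data.
steps_per_epoch = num_examples // effective_batch_size

print(effective_batch_size)  # 32
print(lora_scaling)          # 1.0
print(steps_per_epoch)       # 6250
```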

	Training Time: 13 hours on A100

	_This model is proficient in RAG use cases._

	Further RAG fine-tuning on data from your own use case is still a good idea.
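
For readers who want to try the model in a RAG setting, one possible way to pack retrieved passages into chat messages is sketched below. The function name `build_rag_messages`, the instruction wording, and the numbered-context layout are all hypothetical choices for illustration; the model card does not prescribe a RAG prompt format.

```python
def build_rag_messages(question, passages, system_message="You are a helpful assistant."):
    """Pack retrieved passages and a question into chat messages.

    The layout used here is an illustrative assumption, not a format
    documented for this model.
    """
    # Number each passage so the answer can refer back to its source.
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    user_content = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_content},
    ]


messages = build_rag_messages(
    "What is the capital of France?",
    ["Paris is the capital and largest city of France."],
)
print(messages[1]["content"])
```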

	Prompt Template: ChatML

	```
	<|im_start|>system
	You are a helpful assistant.<|im_end|>
	<|im_start|>user
	What's the capital of France?<|im_end|>
	<|im_start|>assistant
	Paris.
	```
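
If you need to build a ChatML prompt by hand (for example, for a raw completion endpoint), the template above can be produced with a small helper. This is a minimal sketch: `to_chatml` is a hypothetical helper name, and in practice a tokenizer's chat template or the server's `--chat-template` option would normally do this for you.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|>.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the capital of France?"},
])
print(prompt)
```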

	## Run easily with Ollama

	```bash
	ollama run macadeliccc/mistral-7b-v2-openhermes
	```
	## OpenAI compatible server with vLLM

	Installation instructions for vLLM can be found [here](https://docs.vllm.ai/en/latest/getting_started/installation.html).

	```bash
	# --gpu-memory-utilization: can go as low as 0.83-0.85 if your application needs a little more GPU memory.
	# --max-model-len: use 32000 if you can run it; 16000 works on a 4090.
	# --chat-template: this path assumes you run the command from a checkout of the vLLM repository.
	python -m vllm.entrypoints.openai.api_server \
	  --model macadeliccc/Mistral-7B-v0.2-OpenHermes \
	  --gpu-memory-utilization 0.9 \
	  --max-model-len 16000 \
	  --chat-template ./examples/template_chatml.jinja
	```
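
Once the server is running, it can be queried with a plain HTTP request. The sketch below only builds the request using the standard library; it assumes the server listens on `http://localhost:8000` (the same base URL used in the Gradio example) and exposes the usual OpenAI-style `/v1/chat/completions` route. `build_chat_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message,
                       system_message="You are a helpful assistant."):
    """Build (but do not send) a POST request for an OpenAI-compatible server."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


req = build_chat_request(
    "http://localhost:8000/v1",  # assumed host and port
    "macadeliccc/Mistral-7B-v0.2-OpenHermes",
    "What's the capital of France?",
)

# To actually send it (requires the server to be running):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```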

	## Gradio chatbot interface for your endpoint

	```python
	import gradio as gr
	from openai import OpenAI

	# Modify these variables as needed
	openai_api_key = "EMPTY"  # No API key is required for a local vLLM server
	openai_api_base = "http://localhost:8000/v1"

	client = OpenAI(
	    api_key=openai_api_key,
	    base_url=openai_api_base,
	)
	system_message = "You are a helpful assistant"


	def fast_echo(message, history):
	    # Send the user's message to the vLLM API and return the reply.
	    # Note: `history` is not forwarded, so every request is single-turn.
	    chat_response = client.chat.completions.create(
	        model="macadeliccc/Mistral-7B-v0.2-OpenHermes",
	        messages=[
	            {"role": "system", "content": system_message},
	            {"role": "user", "content": message},
	        ],
	    )
	    print(chat_response)  # Log the full response for debugging
	    return chat_response.choices[0].message.content


	demo = gr.ChatInterface(
	    fn=fast_echo,
	    examples=["Write me a quicksort algorithm in python."],
	).queue()

	if __name__ == "__main__":
	    demo.launch()
	```
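
The chat function above sends only the latest message, so the model does not see earlier turns. A possible way to forward the conversation is sketched below. It assumes Gradio passes `history` as a list of `[user, assistant]` pairs, which depends on the Gradio version and `ChatInterface` settings, so check your installed version. `history_to_messages` is a hypothetical helper, not part of Gradio or the OpenAI client.

```python
def history_to_messages(message, history, system_message="You are a helpful assistant"):
    """Convert pair-style chat history plus a new message into OpenAI-style messages."""
    messages = [{"role": "system", "content": system_message}]
    for user_turn, assistant_turn in history:
        messages.append({"role": "user", "content": user_turn})
        # The assistant side can be empty while a reply is still pending.
        if assistant_turn is not None:
            messages.append({"role": "assistant", "content": assistant_turn})
    messages.append({"role": "user", "content": message})
    return messages


example = history_to_messages(
    "And of Germany?",
    [["What's the capital of France?", "Paris."]],
)
for m in example:
    print(m["role"], "->", m["content"])
```

The returned list can be passed as the `messages` argument of `client.chat.completions.create`.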

	## Quantizations

	[GGUF](https://huggingface.co/macadeliccc/Mistral-7B-v0.2-OpenHermes-GGUF)

	[AWQ](https://huggingface.co/macadeliccc/Mistral-7B-v0.2-OpenHermes-AWQ/)

	[HQQ-4bit](https://huggingface.co/macadeliccc/Mistral-7B-v0.2-OpenHermes-HQQ-4bit)

	[ExLlamaV2](https://huggingface.co/bartowski/Mistral-7B-v0.2-OpenHermes-exl2)

	## Evaluations

	Thanks to Maxime Labonne for the evaluation:

	| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
	|---|---:|---:|---:|---:|---:|
	| [Mistral-7B-v0.2-OpenHermes](https://huggingface.co/macadeliccc/Mistral-7B-v0.2-OpenHermes) | 35.57 | 67.15 | 42.06 | 36.27 | 45.26 |

	### AGIEval
	| Task | Version | Metric | Value | | Stderr |
	|---|---:|---|---:|---|---:|
	| agieval_aqua_rat | 0 | acc | 24.02 | ± | 2.69 |
	| | | acc_norm | 21.65 | ± | 2.59 |
	| agieval_logiqa_en | 0 | acc | 28.11 | ± | 1.76 |
	| | | acc_norm | 34.56 | ± | 1.87 |
	| agieval_lsat_ar | 0 | acc | 27.83 | ± | 2.96 |
	| | | acc_norm | 23.48 | ± | 2.80 |
	| agieval_lsat_lr | 0 | acc | 33.73 | ± | 2.10 |
	| | | acc_norm | 33.14 | ± | 2.09 |
	| agieval_lsat_rc | 0 | acc | 48.70 | ± | 3.05 |
	| | | acc_norm | 39.78 | ± | 2.99 |
	| agieval_sat_en | 0 | acc | 67.48 | ± | 3.27 |
	| | | acc_norm | 64.56 | ± | 3.34 |
	| agieval_sat_en_without_passage | 0 | acc | 38.83 | ± | 3.40 |
	| | | acc_norm | 37.38 | ± | 3.38 |
	| agieval_sat_math | 0 | acc | 32.27 | ± | 3.16 |
	| | | acc_norm | 30.00 | ± | 3.10 |

	Average: 35.57%
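
The reported AGIEval average matches the mean of the `acc_norm` values in the table. The sketch below reproduces that arithmetic; it is an observation from the numbers, not a statement of how the evaluation harness computes the figure.

```python
# acc_norm values copied from the AGIEval table, in row order.
agieval_acc_norm = [21.65, 34.56, 23.48, 33.14, 39.78, 64.56, 37.38, 30.00]

# Plain arithmetic mean over the eight tasks.
agieval_average = round(sum(agieval_acc_norm) / len(agieval_acc_norm), 2)
print(agieval_average)  # 35.57
```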

	### GPT4All
	| Task | Version | Metric | Value | | Stderr |
	|---|---:|---|---:|---|---:|
	| arc_challenge | 0 | acc | 45.05 | ± | 1.45 |
	| | | acc_norm | 48.46 | ± | 1.46 |
	| arc_easy | 0 | acc | 77.27 | ± | 0.86 |
	| | | acc_norm | 73.78 | ± | 0.90 |
	| boolq | 1 | acc | 68.62 | ± | 0.81 |
	| hellaswag | 0 | acc | 59.63 | ± | 0.49 |
	| | | acc_norm | 79.66 | ± | 0.40 |
	| openbookqa | 0 | acc | 31.40 | ± | 2.08 |
	| | | acc_norm | 43.40 | ± | 2.22 |
	| piqa | 0 | acc | 80.25 | ± | 0.93 |
	| | | acc_norm | 82.05 | ± | 0.90 |
	| winogrande | 0 | acc | 74.11 | ± | 1.23 |

	Average: 67.15%
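
Not every GPT4All task reports `acc_norm`. The reported average is consistent with taking `acc_norm` where it exists and falling back to `acc` otherwise (boolq and winogrande). The sketch below checks that reading against the table; the fallback rule is inferred from the numbers, not documented in the card.

```python
# (acc, acc_norm) per task, copied from the table; None where acc_norm is absent.
gpt4all_scores = {
    "arc_challenge": (45.05, 48.46),
    "arc_easy": (77.27, 73.78),
    "boolq": (68.62, None),
    "hellaswag": (59.63, 79.66),
    "openbookqa": (31.40, 43.40),
    "piqa": (80.25, 82.05),
    "winogrande": (74.11, None),
}

# Prefer acc_norm, fall back to acc when acc_norm is not reported.
per_task = [acc if acc_norm is None else acc_norm
            for acc, acc_norm in gpt4all_scores.values()]

gpt4all_average = round(sum(per_task) / len(per_task), 2)
print(gpt4all_average)  # 67.15
```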

	### TruthfulQA
	| Task | Version | Metric | Value | | Stderr |
	|---|---:|---|---:|---|---:|
	| truthfulqa_mc | 1 | mc1 | 27.54 | ± | 1.56 |
	| | | mc2 | 42.06 | ± | 1.44 |

	Average: 42.06%
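
The Stderr column gives a sense of how precise each score is. As a rough illustration, the sketch below turns the mc2 score into an approximate 95% interval using a normal approximation (value ± 1.96 × stderr). The normal approximation is an assumption made here, not something the card specifies.

```python
# mc2 value and standard error copied from the TruthfulQA table.
mc2_value = 42.06
mc2_stderr = 1.44

# 1.96 is the two-sided 95% quantile of the standard normal distribution.
z_95 = 1.96
margin = z_95 * mc2_stderr

ci_low = round(mc2_value - margin, 2)
ci_high = round(mc2_value + margin, 2)
print(ci_low, ci_high)  # 39.24 44.88
```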

	### Bigbench
	| Task | Version | Metric | Value | | Stderr |
	|---|---:|---|---:|---|---:|
	| bigbench_causal_judgement | 0 | multiple_choice_grade | 56.32 | ± | 3.61 |
	| bigbench_date_understanding | 0 | multiple_choice_grade | 66.40 | ± | 2.46 |
	| bigbench_disambiguation_qa | 0 | multiple_choice_grade | 45.74 | ± | 3.11 |
	| bigbench_geometric_shapes | 0 | multiple_choice_grade | 10.58 | ± | 1.63 |
	| | | exact_str_match | 0.00 | ± | 0.00 |
	| bigbench_logical_deduction_five_objects | 0 | multiple_choice_grade | 25.00 | ± | 1.94 |
	| bigbench_logical_deduction_seven_objects | 0 | multiple_choice_grade | 17.71 | ± | 1.44 |
	| bigbench_logical_deduction_three_objects | 0 | multiple_choice_grade | 37.33 | ± | 2.80 |
	| bigbench_movie_recommendation | 0 | multiple_choice_grade | 29.40 | ± | 2.04 |
	| bigbench_navigate | 0 | multiple_choice_grade | 50.00 | ± | 1.58 |
	| bigbench_reasoning_about_colored_objects | 0 | multiple_choice_grade | 42.50 | ± | 1.11 |
	| bigbench_ruin_names | 0 | multiple_choice_grade | 39.06 | ± | 2.31 |
	| bigbench_salient_translation_error_detection | 0 | multiple_choice_grade | 12.93 | ± | 1.06 |
	| bigbench_snarks | 0 | multiple_choice_grade | 69.06 | ± | 3.45 |
	| bigbench_sports_understanding | 0 | multiple_choice_grade | 49.80 | ± | 1.59 |
	| bigbench_temporal_sequences | 0 | multiple_choice_grade | 26.50 | ± | 1.40 |
	| bigbench_tracking_shuffled_objects_five_objects | 0 | multiple_choice_grade | 21.20 | ± | 1.16 |
	| bigbench_tracking_shuffled_objects_seven_objects | 0 | multiple_choice_grade | 16.06 | ± | 0.88 |
	| bigbench_tracking_shuffled_objects_three_objects | 0 | multiple_choice_grade | 37.33 | ± | 2.80 |

	Average: 36.27%

	Average score: 45.26%
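
The overall score is the unweighted mean of the four suite averages reported above, which can be verified directly:

```python
# Suite averages as reported in this model card.
suite_averages = {
    "AGIEval": 35.57,
    "GPT4All": 67.15,
    "TruthfulQA": 42.06,
    "Bigbench": 36.27,
}

# Unweighted mean across the four suites.
overall_average = round(sum(suite_averages.values()) / len(suite_averages), 2)
print(overall_average)  # 45.26
```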

	Elapsed time: 01:49:22

	- Developed by: macadeliccc
	- License: apache-2.0
	- Finetuned from model: alpindale/Mistral-7B-v0.2

	This Mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

	[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)