---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
- sft
base_model: alpindale/Mistral-7B-v0.2
---
# Mistral-7B-v0.2-OpenHermes
## SFT Training Params

- Learning Rate: 2e-4
- Batch Size: 8
- Gradient Accumulation Steps: 4
- Dataset: teknium/OpenHermes-2.5 (the 200k split used here carries a slight bias towards roleplay and theory-of-life content)
- LoRA r (rank): 16
- LoRA Alpha: 16

Training Time: 13 hours on a single A100. A sketch of how these parameters map onto an Unsloth + TRL training script is shown below.
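For reference, the hyperparameters above correspond roughly to the standard Unsloth + TRL SFT recipe sketched below. This is a minimal sketch, not the actual training script: the ChatML formatting function, `max_seq_length`, `target_modules`, 4-bit loading, epoch count, and the 200k subsampling are assumptions the card does not specify.

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model with Unsloth (4-bit loading and sequence length are assumptions)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="alpindale/Mistral-7B-v0.2",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters using the r / alpha values from the card
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed defaults
)

# OpenHermes-2.5 stores ShareGPT-style "conversations"; render each one to ChatML.
# This formatting function is an assumption, not taken from the card.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}

def to_chatml(example):
    text = ""
    for turn in example["conversations"]:
        role = ROLE_MAP.get(turn["from"], turn["from"])
        text += f"<|im_start|>{role}\n{turn['value']}<|im_end|>\n"
    return {"text": text}

dataset = load_dataset("teknium/OpenHermes-2.5", split="train")
dataset = dataset.shuffle(seed=42).select(range(200_000)).map(to_chatml)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        per_device_train_batch_size=8,   # Batch Size: 8
        gradient_accumulation_steps=4,   # Gradient Accumulation Steps: 4
        learning_rate=2e-4,              # Learning Rate: 2e-4
        num_train_epochs=1,              # epoch count is an assumption
        bf16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```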
This model performs well in RAG (retrieval-augmented generation) use cases, and fine-tuning it further on your own RAG data would be a good idea; a RAG-style prompt sketch follows the template below.
## Prompt Template: ChatML

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What's the capital of France?<|im_end|>
<|im_start|>assistant
Paris.
```
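Since the card calls out RAG use cases, here is a minimal illustration of a RAG-style prompt in this template. The passage, question, and helper function are illustrative assumptions, not part of the card; if the hosted tokenizer ships a chat template, `tokenizer.apply_chat_template` can render the same string (the vLLM command below passes a ChatML template explicitly in case it does not).

```python
def build_chatml_prompt(system: str, user: str) -> str:
    # Render the ChatML format from the template above, ending with the
    # assistant header so the model generates the reply.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

# Hypothetical retrieved context; in practice this comes from your retriever.
retrieved = "Paris is the capital and most populous city of France."

prompt = build_chatml_prompt(
    system="You are a helpful assistant. Answer using only the provided context.",
    user=f"Context:\n{retrieved}\n\nQuestion: What's the capital of France?",
)
print(prompt)
```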
## Run easily with ollama

```bash
ollama run macadeliccc/mistral-7b-v2-openhermes
```
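Once the ollama model is running, you can also call the local server programmatically. A minimal sketch against ollama's REST chat endpoint (default port 11434; the question is illustrative):

```python
import requests

# Chat with the model through the local ollama server.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "macadeliccc/mistral-7b-v2-openhermes",
        "messages": [{"role": "user", "content": "What's the capital of France?"}],
        "stream": False,  # return one complete JSON response
    },
)
print(resp.json()["message"]["content"])
```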
## OpenAI-compatible server with vLLM

Install instructions for vLLM can be found here.

```bash
# --gpu-memory-utilization can go as low as 0.83-0.85 if you need a little more GPU for your application.
# --max-model-len: use 32000 if you can run it; 16000 works on a 4090.
python -m vllm.entrypoints.openai.api_server \
    --model macadeliccc/Mistral-7B-v0.2-OpenHermes \
    --gpu-memory-utilization 0.9 \
    --max-model-len 16000 \
    --chat-template ./examples/template_chatml.jinja
```
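Before wiring up a UI, you can sanity-check the server with a plain HTTP request to the OpenAI-compatible chat endpoint. A minimal sketch (the question is illustrative):

```python
import requests

# Quick smoke test against the vLLM server started above.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "macadeliccc/Mistral-7B-v0.2-OpenHermes",
        "messages": [{"role": "user", "content": "What's the capital of France?"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```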
## Gradio chatbot interface for your endpoint

```python
import gradio as gr
from openai import OpenAI

# Modify these variables as needed
openai_api_key = "EMPTY"  # Assuming no API key is required for local testing
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

system_message = "You are a helpful assistant"

def fast_echo(message, history):
    # Send the user's message to the vLLM API and return the response immediately
    chat_response = client.chat.completions.create(
        model="macadeliccc/Mistral-7B-v0.2-OpenHermes",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": message},
        ],
    )
    print(chat_response)
    return chat_response.choices[0].message.content

demo = gr.ChatInterface(fn=fast_echo, examples=["Write me a quicksort algorithm in python."]).queue()

if __name__ == "__main__":
    demo.launch()
```
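`gr.ChatInterface` also accepts generator functions, so if you would rather stream tokens into the UI as they arrive than wait for the full reply, a hedged variant of `fast_echo` (reusing the `client`, `system_message`, and model name above) could look like this:

```python
def stream_chat(message, history):
    # Stream tokens from the vLLM server as they are generated.
    stream = client.chat.completions.create(
        model="macadeliccc/Mistral-7B-v0.2-OpenHermes",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": message},
        ],
        stream=True,
    )
    partial = ""
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            partial += delta
            yield partial  # ChatInterface re-renders the growing reply

demo = gr.ChatInterface(fn=stream_chat).queue()
```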
## Quantizations
## Evaluations

Thanks to Maxime Labonne for the evaluation:

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---:|---:|---:|---:|---:|
| Mistral-7B-v0.2-OpenHermes | 35.57 | 67.15 | 42.06 | 36.27 | 45.26 |
### AGIEval

| Task | Version | Metric | Value | | Stderr |
|---|---:|---|---:|---|---:|
| agieval_aqua_rat | 0 | acc | 24.02 | ± | 2.69 |
| | | acc_norm | 21.65 | ± | 2.59 |
| agieval_logiqa_en | 0 | acc | 28.11 | ± | 1.76 |
| | | acc_norm | 34.56 | ± | 1.87 |
| agieval_lsat_ar | 0 | acc | 27.83 | ± | 2.96 |
| | | acc_norm | 23.48 | ± | 2.80 |
| agieval_lsat_lr | 0 | acc | 33.73 | ± | 2.10 |
| | | acc_norm | 33.14 | ± | 2.09 |
| agieval_lsat_rc | 0 | acc | 48.70 | ± | 3.05 |
| | | acc_norm | 39.78 | ± | 2.99 |
| agieval_sat_en | 0 | acc | 67.48 | ± | 3.27 |
| | | acc_norm | 64.56 | ± | 3.34 |
| agieval_sat_en_without_passage | 0 | acc | 38.83 | ± | 3.40 |
| | | acc_norm | 37.38 | ± | 3.38 |
| agieval_sat_math | 0 | acc | 32.27 | ± | 3.16 |
| | | acc_norm | 30.00 | ± | 3.10 |
Average: 35.57%
### GPT4All

| Task | Version | Metric | Value | | Stderr |
|---|---:|---|---:|---|---:|
| arc_challenge | 0 | acc | 45.05 | ± | 1.45 |
| | | acc_norm | 48.46 | ± | 1.46 |
| arc_easy | 0 | acc | 77.27 | ± | 0.86 |
| | | acc_norm | 73.78 | ± | 0.90 |
| boolq | 1 | acc | 68.62 | ± | 0.81 |
| hellaswag | 0 | acc | 59.63 | ± | 0.49 |
| | | acc_norm | 79.66 | ± | 0.40 |
| openbookqa | 0 | acc | 31.40 | ± | 2.08 |
| | | acc_norm | 43.40 | ± | 2.22 |
| piqa | 0 | acc | 80.25 | ± | 0.93 |
| | | acc_norm | 82.05 | ± | 0.90 |
| winogrande | 0 | acc | 74.11 | ± | 1.23 |
Average: 67.15%
### TruthfulQA

| Task | Version | Metric | Value | | Stderr |
|---|---:|---|---:|---|---:|
| truthfulqa_mc | 1 | mc1 | 27.54 | ± | 1.56 |
| | | mc2 | 42.06 | ± | 1.44 |
Average: 42.06%
### Bigbench

| Task | Version | Metric | Value | | Stderr |
|---|---:|---|---:|---|---:|
| bigbench_causal_judgement | 0 | multiple_choice_grade | 56.32 | ± | 3.61 |
| bigbench_date_understanding | 0 | multiple_choice_grade | 66.40 | ± | 2.46 |
| bigbench_disambiguation_qa | 0 | multiple_choice_grade | 45.74 | ± | 3.11 |
| bigbench_geometric_shapes | 0 | multiple_choice_grade | 10.58 | ± | 1.63 |
| | | exact_str_match | 0.00 | ± | 0.00 |
| bigbench_logical_deduction_five_objects | 0 | multiple_choice_grade | 25.00 | ± | 1.94 |
| bigbench_logical_deduction_seven_objects | 0 | multiple_choice_grade | 17.71 | ± | 1.44 |
| bigbench_logical_deduction_three_objects | 0 | multiple_choice_grade | 37.33 | ± | 2.80 |
| bigbench_movie_recommendation | 0 | multiple_choice_grade | 29.40 | ± | 2.04 |
| bigbench_navigate | 0 | multiple_choice_grade | 50.00 | ± | 1.58 |
| bigbench_reasoning_about_colored_objects | 0 | multiple_choice_grade | 42.50 | ± | 1.11 |
| bigbench_ruin_names | 0 | multiple_choice_grade | 39.06 | ± | 2.31 |
| bigbench_salient_translation_error_detection | 0 | multiple_choice_grade | 12.93 | ± | 1.06 |
| bigbench_snarks | 0 | multiple_choice_grade | 69.06 | ± | 3.45 |
| bigbench_sports_understanding | 0 | multiple_choice_grade | 49.80 | ± | 1.59 |
| bigbench_temporal_sequences | 0 | multiple_choice_grade | 26.50 | ± | 1.40 |
| bigbench_tracking_shuffled_objects_five_objects | 0 | multiple_choice_grade | 21.20 | ± | 1.16 |
| bigbench_tracking_shuffled_objects_seven_objects | 0 | multiple_choice_grade | 16.06 | ± | 0.88 |
| bigbench_tracking_shuffled_objects_three_objects | 0 | multiple_choice_grade | 37.33 | ± | 2.80 |
Average: 36.27%
Average score: 45.26%
Elapsed time: 01:49:22
- Developed by: macadeliccc
- License: apache-2.0
- Finetuned from model: alpindale/Mistral-7B-v0.2

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.