---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
- sft
base_model: alpindale/Mistral-7B-v0.2
---

# Mistral-7B-v0.2-OpenHermes

![image/webp](https://cdn-uploads.huggingface.co/production/uploads/6455cc8d679315e4ef16fbec/AbagOgU056oIB7S31XESC.webp)

SFT Training Params:
+ Learning Rate: 2e-4
+ Batch Size: 8
+ Gradient Accumulation Steps: 4
+ Dataset: teknium/OpenHermes-2.5 (200k split; contains a slight bias towards RP and theory-of-life content)
+ LoRA r: 16
+ LoRA Alpha: 16

Training Time: 13 hours on an A100

_This model is proficient in RAG use cases._

**Finetuning it on RAG data for your own use case would be a good idea.**

Prompt Template: ChatML

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What's the capital of France?<|im_end|>
<|im_start|>assistant
Paris.
```

## Run easily with ollama

```bash
ollama run macadeliccc/mistral-7b-v2-openhermes
```

## OpenAI compatible server with vLLM

Install instructions for vLLM can be found [here](https://docs.vllm.ai/en/latest/getting_started/installation.html).

```bash
# --gpu-memory-utilization can go as low as 0.83-0.85 if you need a little more GPU for your application.
# --max-model-len 32000 also works if your GPU can handle it; 16000 works on a 4090.
python -m vllm.entrypoints.openai.api_server \
    --model macadeliccc/Mistral-7B-v0.2-OpenHermes \
    --gpu-memory-utilization 0.9 \
    --max-model-len 16000 \
    --chat-template ./examples/template_chatml.jinja
```
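Once the server is up, you can sanity-check the endpoint with a plain curl request before wiring up a client (port 8000 is vLLM's default; adjust if you changed it):

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "macadeliccc/Mistral-7B-v0.2-OpenHermes",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "What is the capital of France?"}
        ]
      }'
```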
## Gradio chatbot interface for your endpoint

```python
import gradio as gr
from openai import OpenAI

# Modify these variables as needed
openai_api_key = "EMPTY"  # no API key is required for local testing
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

system_message = "You are a helpful assistant"

def fast_echo(message, history):
    # Send the user's message to the vLLM API and return the response immediately
    chat_response = client.chat.completions.create(
        model="macadeliccc/Mistral-7B-v0.2-OpenHermes",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": message},
        ],
    )
    return chat_response.choices[0].message.content

demo = gr.ChatInterface(
    fn=fast_echo,
    examples=["Write me a quicksort algorithm in python."],
).queue()

if __name__ == "__main__":
    demo.launch()
```

## Quantizations

+ [GGUF](https://huggingface.co/macadeliccc/Mistral-7B-v0.2-OpenHermes-GGUF)
+ [AWQ](https://huggingface.co/macadeliccc/Mistral-7B-v0.2-OpenHermes-AWQ/)
+ [HQQ-4bit](https://huggingface.co/macadeliccc/Mistral-7B-v0.2-OpenHermes-HQQ-4bit)
+ [ExLlamaV2](https://huggingface.co/bartowski/Mistral-7B-v0.2-OpenHermes-exl2)

### Evaluations

Thanks to Maxime Labonne for the evaluation:

| Model |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
|-------------------------------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
|[Mistral-7B-v0.2-OpenHermes](https://huggingface.co/macadeliccc/Mistral-7B-v0.2-OpenHermes)| 35.57| 67.15| 42.06| 36.27| 45.26|

### AGIEval

| Task |Version| Metric |Value| |Stderr|
|------------------------------|------:|--------|----:|---|-----:|
|agieval_aqua_rat | 0|acc |24.02|± | 2.69|
| | |acc_norm|21.65|± | 2.59|
|agieval_logiqa_en | 0|acc |28.11|± | 1.76|
| | |acc_norm|34.56|± | 1.87|
|agieval_lsat_ar | 0|acc |27.83|± | 2.96|
| | |acc_norm|23.48|± | 2.80|
|agieval_lsat_lr | 0|acc |33.73|± | 2.10|
| | |acc_norm|33.14|± | 2.09|
|agieval_lsat_rc | 0|acc |48.70|± | 3.05|
| | |acc_norm|39.78|± | 2.99|
|agieval_sat_en | 0|acc |67.48|± | 3.27|
| | |acc_norm|64.56|± | 3.34|
|agieval_sat_en_without_passage| 0|acc |38.83|± | 3.40|
| | |acc_norm|37.38|± | 3.38|
|agieval_sat_math | 0|acc |32.27|± | 3.16|
| | |acc_norm|30.00|± | 3.10|

Average: 35.57%

### GPT4All

| Task |Version| Metric |Value| |Stderr|
|-------------|------:|--------|----:|---|-----:|
|arc_challenge| 0|acc |45.05|± | 1.45|
| | |acc_norm|48.46|± | 1.46|
|arc_easy | 0|acc |77.27|± | 0.86|
| | |acc_norm|73.78|± | 0.90|
|boolq | 1|acc |68.62|± | 0.81|
|hellaswag | 0|acc |59.63|± | 0.49|
| | |acc_norm|79.66|± | 0.40|
|openbookqa | 0|acc |31.40|± | 2.08|
| | |acc_norm|43.40|± | 2.22|
|piqa | 0|acc |80.25|± | 0.93|
| | |acc_norm|82.05|± | 0.90|
|winogrande | 0|acc |74.11|± | 1.23|

Average: 67.15%

### TruthfulQA

| Task |Version|Metric|Value| |Stderr|
|-------------|------:|------|----:|---|-----:|
|truthfulqa_mc| 1|mc1 |27.54|± | 1.56|
| | |mc2 |42.06|± | 1.44|

Average: 42.06%

### Bigbench

| Task |Version| Metric |Value| |Stderr|
|------------------------------------------------|------:|---------------------|----:|---|-----:|
|bigbench_causal_judgement | 0|multiple_choice_grade|56.32|± | 3.61|
|bigbench_date_understanding | 0|multiple_choice_grade|66.40|± | 2.46|
|bigbench_disambiguation_qa | 0|multiple_choice_grade|45.74|± | 3.11|
|bigbench_geometric_shapes | 0|multiple_choice_grade|10.58|± | 1.63|
| | |exact_str_match | 0.00|± | 0.00|
|bigbench_logical_deduction_five_objects | 0|multiple_choice_grade|25.00|± | 1.94|
|bigbench_logical_deduction_seven_objects | 0|multiple_choice_grade|17.71|± | 1.44|
|bigbench_logical_deduction_three_objects | 0|multiple_choice_grade|37.33|± | 2.80|
|bigbench_movie_recommendation | 0|multiple_choice_grade|29.40|± | 2.04|
|bigbench_navigate | 0|multiple_choice_grade|50.00|± | 1.58|
|bigbench_reasoning_about_colored_objects | 0|multiple_choice_grade|42.50|± | 1.11|
|bigbench_ruin_names | 0|multiple_choice_grade|39.06|± | 2.31|
|bigbench_salient_translation_error_detection | 0|multiple_choice_grade|12.93|± | 1.06|
|bigbench_snarks | 0|multiple_choice_grade|69.06|± | 3.45|
|bigbench_sports_understanding | 0|multiple_choice_grade|49.80|± | 1.59|
|bigbench_temporal_sequences | 0|multiple_choice_grade|26.50|± | 1.40|
|bigbench_tracking_shuffled_objects_five_objects | 0|multiple_choice_grade|21.20|± | 1.16|
|bigbench_tracking_shuffled_objects_seven_objects| 0|multiple_choice_grade|16.06|± | 0.88|
|bigbench_tracking_shuffled_objects_three_objects| 0|multiple_choice_grade|37.33|± | 2.80|

Average: 36.27%

Average score: 45.26%

Elapsed time: 01:49:22

- **Developed by:** macadeliccc
- **License:** apache-2.0
- **Finetuned from model:** alpindale/Mistral-7B-v0.2

This Mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
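## Training sketch

For reference, a minimal sketch of what an Unsloth + TRL SFT run with the hyperparameters listed at the top of this card might look like. This is an illustration under stated assumptions, not the actual training script: sequence length, 4-bit loading, LoRA target modules, epoch count, and the ChatML formatting are all assumptions, and it assumes a TRL version (current when this model was released) where `SFTTrainer` accepts `tokenizer` and `dataset_text_field` directly.

```python
# A minimal sketch, assuming Unsloth's FastLanguageModel API and an
# era-appropriate TRL SFTTrainer. Only the commented "from the card"
# values are taken from this model card; everything else is illustrative.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="alpindale/Mistral-7B-v0.2",
    max_seq_length=4096,  # assumption: training length not stated on the card
    load_in_4bit=True,    # assumption: QLoRA-style 4-bit loading
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,           # LoRA r from the card
    lora_alpha=16,  # LoRA Alpha from the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumption
)

def to_chatml(example):
    # Map OpenHermes-2.5 "conversations" turns into a single ChatML string.
    roles = {"system": "system", "human": "user", "gpt": "assistant"}
    text = "".join(
        f"<|im_start|>{roles.get(turn['from'], 'user')}\n{turn['value']}<|im_end|>\n"
        for turn in example["conversations"]
    )
    return {"text": text}

# The card mentions a 200k split; the exact slicing used is an assumption.
dataset = load_dataset("teknium/OpenHermes-2.5", split="train[:200000]").map(to_chatml)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        per_device_train_batch_size=8,  # Batch Size from the card
        gradient_accumulation_steps=4,  # from the card
        learning_rate=2e-4,             # from the card
        num_train_epochs=1,             # assumption
        bf16=True,                      # assumption: bf16 on an A100
        output_dir="outputs",
    ),
)
trainer.train()
```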