---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
- sft
base_model: alpindale/Mistral-7B-v0.2
---
# Mistral-7B-v0.2-OpenHermes
![image/webp](https://cdn-uploads.huggingface.co/production/uploads/6455cc8d679315e4ef16fbec/AbagOgU056oIB7S31XESC.webp)
SFT Training Params:
+ Learning Rate: 2e-4
+ Batch Size: 8
+ Gradient Accumulation steps: 4
+ Dataset: teknium/OpenHermes-2.5 (a 200k split with a slight bias towards roleplay and theory of life)
+ r: 16
+ Lora Alpha: 16
Training Time: 13 hours on A100
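
As a rough sanity check on the settings above, the effective batch size and the LoRA scaling factor follow from simple arithmetic. This is only a sketch: it assumes the listed batch size is per device, that training ran on a single GPU (the card mentions an A100 but not a device count), and that standard LoRA scaling of `lora_alpha / r` was used. The variable names are illustrative.

```python
# Values taken from the training parameters listed above.
per_device_batch_size = 8
gradient_accumulation_steps = 4
lora_r = 16
lora_alpha = 16

# Assumption: a single GPU was used for training.
num_gpus = 1

# Number of samples that contribute to each optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus

# Standard LoRA scales the low-rank update by alpha / r.
lora_scale = lora_alpha / lora_r

print(f"effective batch size: {effective_batch_size}")  # 32
print(f"LoRA scale: {lora_scale}")  # 1.0
```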
_This model is proficient in RAG use cases._
**Fine-tuning on RAG data for your specific use case would be a good idea.**
Prompt Template: ChatML
```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What's the capital of France?<|im_end|>
<|im_start|>assistant
Paris.
```
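
To make the template concrete, the sketch below renders a list of chat messages into a ChatML prompt string. `format_chatml` is a hypothetical helper written for illustration, not part of this model's tooling; in practice the tokenizer's chat template or the serving stack usually does this for you.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|>.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the capital of France?"},
])
print(prompt)
```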
## Run easily with ollama
```bash
ollama run macadeliccc/mistral-7b-v2-openhermes
```
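
Once the model is running under Ollama, it can also be queried programmatically. The sketch below uses only the standard library and assumes Ollama is listening on its default local address (`http://localhost:11434`) and that its `/api/chat` endpoint is used with streaming disabled. `build_payload` and `chat` are illustrative helper names.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "macadeliccc/mistral-7b-v2-openhermes"


def build_payload(user_message):
    """Build a non-streaming chat request body for a single user message."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,
    }


def chat(user_message):
    """Send the message to the local Ollama server and return the reply text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["message"]["content"]


# Inspect the request body without contacting the server.
print(json.dumps(build_payload("What's the capital of France?"), indent=2))
```

With the server running, `chat("What's the capital of France?")` should return the model's reply as a string.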
## OpenAI compatible server with vLLM
Installation instructions for vLLM can be found [here](https://docs.vllm.ai/en/latest/getting_started/installation.html).
```bash
# --gpu-memory-utilization: can go as low as 0.83-0.85 if you need a little more GPU memory for your application
# --max-model-len: use 32000 if you can run it; 16000 works on a 4090
python -m vllm.entrypoints.openai.api_server \
  --model macadeliccc/Mistral-7B-v0.2-OpenHermes \
  --gpu-memory-utilization 0.9 \
  --max-model-len 16000 \
  --chat-template ./examples/template_chatml.jinja
```
## Gradio chatbot interface for your endpoint
```python
import gradio as gr
from openai import OpenAI

# Modify these variables as needed
openai_api_key = "EMPTY"  # Assuming no API key is required for local testing
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

system_message = "You are a helpful assistant"


def fast_echo(message, history):
    # Send the user's message to the vLLM API and get the response immediately.
    # Note: `history` is not forwarded, so each request is a single-turn exchange.
    chat_response = client.chat.completions.create(
        model="macadeliccc/Mistral-7B-v0.2-OpenHermes",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": message},
        ],
    )
    print(chat_response)
    return chat_response.choices[0].message.content


demo = gr.ChatInterface(fn=fast_echo, examples=["Write me a quicksort algorithm in python."]).queue()

if __name__ == "__main__":
    demo.launch()
```
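
The chat function above sends only the latest user message, so the model has no memory of earlier turns. One possible way to make the chat multi-turn is to fold the Gradio history into the `messages` list. The sketch below assumes `history` arrives as a list of `[user, assistant]` pairs, the format used by older Gradio `ChatInterface` versions; newer versions may pass role/content dicts instead, in which case the conversion is not needed. `build_messages` is a hypothetical helper name.

```python
def build_messages(system_message, history, message):
    """Convert pair-style chat history into OpenAI-style chat messages."""
    messages = [{"role": "system", "content": system_message}]
    for user_turn, assistant_turn in history:
        messages.append({"role": "user", "content": user_turn})
        # The assistant side can be empty for a turn that is still pending.
        if assistant_turn is not None:
            messages.append({"role": "assistant", "content": assistant_turn})
    # The new user message always goes last.
    messages.append({"role": "user", "content": message})
    return messages


example = build_messages(
    "You are a helpful assistant",
    [["Hi", "Hello! How can I help?"]],
    "Write me a quicksort algorithm in python.",
)
for m in example:
    print(m["role"], "->", m["content"])
```

The returned list could then be passed as `messages=` in the `client.chat.completions.create` call.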
## Quantizations
- [GGUF](https://huggingface.co/macadeliccc/Mistral-7B-v0.2-OpenHermes-GGUF)
- [AWQ](https://huggingface.co/macadeliccc/Mistral-7B-v0.2-OpenHermes-AWQ/)
- [HQQ-4bit](https://huggingface.co/macadeliccc/Mistral-7B-v0.2-OpenHermes-HQQ-4bit)
- [ExLlamaV2](https://huggingface.co/bartowski/Mistral-7B-v0.2-OpenHermes-exl2)
## Evaluations
Thanks to Maxime Labonne for the evaluation:
| Model |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
|-------------------------------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
|[Mistral-7B-v0.2-OpenHermes](https://huggingface.co/macadeliccc/Mistral-7B-v0.2-OpenHermes)| 35.57| 67.15| 42.06| 36.27| 45.26|
### AGIEval
| Task |Version| Metric |Value| |Stderr|
|------------------------------|------:|--------|----:|---|-----:|
|agieval_aqua_rat | 0|acc |24.02|± | 2.69|
| | |acc_norm|21.65|± | 2.59|
|agieval_logiqa_en | 0|acc |28.11|± | 1.76|
| | |acc_norm|34.56|± | 1.87|
|agieval_lsat_ar | 0|acc |27.83|± | 2.96|
| | |acc_norm|23.48|± | 2.80|
|agieval_lsat_lr | 0|acc |33.73|± | 2.10|
| | |acc_norm|33.14|± | 2.09|
|agieval_lsat_rc | 0|acc |48.70|± | 3.05|
| | |acc_norm|39.78|± | 2.99|
|agieval_sat_en | 0|acc |67.48|± | 3.27|
| | |acc_norm|64.56|± | 3.34|
|agieval_sat_en_without_passage| 0|acc |38.83|± | 3.40|
| | |acc_norm|37.38|± | 3.38|
|agieval_sat_math | 0|acc |32.27|± | 3.16|
| | |acc_norm|30.00|± | 3.10|
Average: 35.57%
### GPT4All
| Task |Version| Metric |Value| |Stderr|
|-------------|------:|--------|----:|---|-----:|
|arc_challenge| 0|acc |45.05|± | 1.45|
| | |acc_norm|48.46|± | 1.46|
|arc_easy | 0|acc |77.27|± | 0.86|
| | |acc_norm|73.78|± | 0.90|
|boolq | 1|acc |68.62|± | 0.81|
|hellaswag | 0|acc |59.63|± | 0.49|
| | |acc_norm|79.66|± | 0.40|
|openbookqa | 0|acc |31.40|± | 2.08|
| | |acc_norm|43.40|± | 2.22|
|piqa | 0|acc |80.25|± | 0.93|
| | |acc_norm|82.05|± | 0.90|
|winogrande | 0|acc |74.11|± | 1.23|
Average: 67.15%
### TruthfulQA
| Task |Version|Metric|Value| |Stderr|
|-------------|------:|------|----:|---|-----:|
|truthfulqa_mc| 1|mc1 |27.54|± | 1.56|
| | |mc2 |42.06|± | 1.44|
Average: 42.06%
### Bigbench
| Task |Version| Metric |Value| |Stderr|
|------------------------------------------------|------:|---------------------|----:|---|-----:|
|bigbench_causal_judgement | 0|multiple_choice_grade|56.32|± | 3.61|
|bigbench_date_understanding | 0|multiple_choice_grade|66.40|± | 2.46|
|bigbench_disambiguation_qa | 0|multiple_choice_grade|45.74|± | 3.11|
|bigbench_geometric_shapes | 0|multiple_choice_grade|10.58|± | 1.63|
| | |exact_str_match | 0.00|± | 0.00|
|bigbench_logical_deduction_five_objects | 0|multiple_choice_grade|25.00|± | 1.94|
|bigbench_logical_deduction_seven_objects | 0|multiple_choice_grade|17.71|± | 1.44|
|bigbench_logical_deduction_three_objects | 0|multiple_choice_grade|37.33|± | 2.80|
|bigbench_movie_recommendation | 0|multiple_choice_grade|29.40|± | 2.04|
|bigbench_navigate | 0|multiple_choice_grade|50.00|± | 1.58|
|bigbench_reasoning_about_colored_objects | 0|multiple_choice_grade|42.50|± | 1.11|
|bigbench_ruin_names | 0|multiple_choice_grade|39.06|± | 2.31|
|bigbench_salient_translation_error_detection | 0|multiple_choice_grade|12.93|± | 1.06|
|bigbench_snarks | 0|multiple_choice_grade|69.06|± | 3.45|
|bigbench_sports_understanding | 0|multiple_choice_grade|49.80|± | 1.59|
|bigbench_temporal_sequences | 0|multiple_choice_grade|26.50|± | 1.40|
|bigbench_tracking_shuffled_objects_five_objects | 0|multiple_choice_grade|21.20|± | 1.16|
|bigbench_tracking_shuffled_objects_seven_objects| 0|multiple_choice_grade|16.06|± | 0.88|
|bigbench_tracking_shuffled_objects_three_objects| 0|multiple_choice_grade|37.33|± | 2.80|
Average: 36.27%
Average score: 45.26%
Elapsed time: 01:49:22
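
The per-suite averages reported above can be reproduced from the tables. The sketch below rests on an inferred convention that is not stated in the card: each task contributes `acc_norm` where it is reported and `acc` otherwise, TruthfulQA uses `mc2`, and Bigbench uses `multiple_choice_grade` only. Under that assumption the numbers match the reported values.

```python
# Per-task scores copied from the tables above
# (acc_norm where reported, otherwise acc).
agieval = [21.65, 34.56, 23.48, 33.14, 39.78, 64.56, 37.38, 30.00]
gpt4all = [48.46, 73.78, 68.62, 79.66, 43.40, 82.05, 74.11]
truthfulqa = [42.06]  # mc2
bigbench = [  # multiple_choice_grade only
    56.32, 66.40, 45.74, 10.58, 25.00, 17.71, 37.33, 29.40, 50.00,
    42.50, 39.06, 12.93, 69.06, 49.80, 26.50, 21.20, 16.06, 37.33,
]


def mean(values):
    """Arithmetic mean rounded to two decimals, as in the reported scores."""
    return round(sum(values) / len(values), 2)


suite_averages = {
    "AGIEval": mean(agieval),
    "GPT4All": mean(gpt4all),
    "TruthfulQA": mean(truthfulqa),
    "Bigbench": mean(bigbench),
}
overall = mean(list(suite_averages.values()))

for name, value in suite_averages.items():
    print(f"{name}: {value:.2f}")
print(f"Average score: {overall:.2f}")
```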
- **Developed by:** macadeliccc
- **License:** apache-2.0
- **Finetuned from model:** alpindale/Mistral-7B-v0.2
This Mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)