metadata
language:
- en
- de
library_name: transformers
pipeline_tag: text-generation
license: apache-2.0
Hermes + Leo = Hermeo
Hermeo-7B
A German-English language model merged from DPOpenHermes-7B-v2 and leo-mistral-hessianai-7b-chat using mergekit. Both base models are fine-tuned versions of Mistral-7B-v0.1.
Model details
- Merged from: leo-mistral-hessianai-7b-chat and DPOpenHermes-7B-v2
- Model type: Causal decoder-only transformer language model
- Languages: English and German
- License: Apache 2.0
Acknowledgements
- This model release is heavily inspired by Weyaxi/OpenHermes-2.5-neural-chat-v3-2-Slerp
- Thanks to the authors of the base models: Mistral, LAION, HessianAI, Open Access AI Collective, @teknium, @bjoernp
- The evaluation uses the German datasets and scripts from @bjoernp.
- The evaluation was run on computing resources from DFKI's PEGASUS cluster.
Evaluation
The evaluation follows the methodology of the Open LLM Leaderboard.
German benchmarks
German tasks: | MMLU-DE | Hellaswag-DE | ARC-DE |
---|---|---|---|
Models / Few-shots: | (5 shots) | (10 shots) | (24 shots) |
7B parameters | | | |
llama-2-7b | 0.400 | 0.513 | 0.381 |
leo-hessianai-7b | 0.400 | 0.609 | 0.429 |
bloom-6b4-clp-german | 0.274 | 0.550 | 0.351 |
mistral-7b | 0.524 | 0.588 | 0.473 |
leo-mistral-hessianai-7b | 0.481 | 0.663 | 0.485 |
leo-mistral-hessianai-7b-chat | 0.458 | 0.617 | 0.465 |
DPOpenHermes-7B-v2 | TBA | 0.603 | 0.515 |
hermeo-7b (this model) | 0.511 | 0.668 | 0.528 |
13B parameters | | | |
llama-2-13b | 0.469 | 0.581 | 0.468 |
leo-hessianai-13b | 0.486 | 0.658 | 0.509 |
70B parameters | | | |
llama-2-70b | 0.597 | 0.674 | 0.561 |
leo-hessianai-70b | 0.653 | 0.721 | 0.600 |
English benchmarks
TBA
Prompting / Prompt Template
Prompt dialogue template (ChatML format):
"""
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""
The model input can contain multiple conversation turns between user and assistant, for example:
"""
<|im_start|>user
{prompt 1}<|im_end|>
<|im_start|>assistant
{reply 1}<|im_end|>
<|im_start|>user
{prompt 2}<|im_end|>
<|im_start|>assistant
(...)
"""
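The ChatML layout above can be assembled programmatically. The following is a minimal sketch in plain Python, not part of the model release: the helper name `build_chatml_prompt`, the `(role, content)` tuple representation, and the example messages are all hypothetical, and the newline after the final `<|im_start|>assistant` header is an assumption based on common ChatML usage.

```python
from typing import List, Optional, Tuple


def build_chatml_prompt(
    turns: List[Tuple[str, str]],
    system_message: Optional[str] = None,
) -> str:
    """Render (role, content) turns as a ChatML prompt ending with an open assistant turn."""
    # The model is expected to reply to a user message, so the last turn must be from the user.
    if not turns or turns[-1][0] != "user":
        raise ValueError("the last turn must come from the user")

    parts = []
    if system_message is not None:
        parts.append(f"<|im_start|>system\n{system_message}<|im_end|>\n")
    for role, content in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"unexpected role: {role!r}")
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
    # Assumption: a newline follows the final assistant header, as in common ChatML usage.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Hypothetical conversation, used only for illustration.
history = [
    ("user", "Was ist die Hauptstadt von Deutschland?"),
    ("assistant", "Die Hauptstadt von Deutschland ist Berlin."),
    ("user", "And how many people live there?"),
]
prompt_text = build_chatml_prompt(
    history, system_message="You are a helpful bilingual assistant."
)
print(prompt_text)
```

The resulting string can be tokenized and passed to the model for generation. Generation would typically be stopped at `<|im_end|>`, although whether the tokenizer treats that marker as an end-of-sequence token should be checked against the released tokenizer configuration.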