metadata

language:
  - en
  - de
library_name: transformers
pipeline_tag: text-generation
license: apache-2.0

Hermes + Leo = Hermeo

Hermeo-7B

A German-English language model merged from DPOpenHermes-7B-v2 and leo-mistral-hessianai-7b-chat using mergekit. Both base models are fine-tuned versions of Mistral-7B-v0.1.

This model release is heavily inspired by Weyaxi/OpenHermes-2.5-neural-chat-v3-2-Slerp.
Thanks to the authors of the base models: Mistral, LAION, HessianAI, Open Access AI Collective, @teknium, @bjoernp.
The German evaluation datasets and scripts from @bjoernp were used.
The computing resources from DFKI's PEGASUS cluster were used for the evaluation.

Evaluation

The evaluation methodology of the Open LLM Leaderboard is followed.

German tasks:	MMLU-DE	Hellaswag-DE	ARC-DE
Models / Few-shots:	(5 shots)	(10 shots)	(24 shots)
7B parameters
llama-2-7b	0.400	0.513	0.381
leo-hessianai-7b	0.400	0.609	0.429
bloom-6b4-clp-german	0.274	0.550	0.351
mistral-7b	0.524	0.588	0.473
leo-mistral-hessianai-7b	0.481	0.663	0.485
leo-mistral-hessianai-7b-chat	0.458	0.617	0.465
DPOpenHermes-7B-v2	TBA	0.603	0.515
hermeo-7b (this model)	0.511	0.668	0.528
13B parameters
llama-2-13b	0.469	0.581	0.468
leo-hessianai-13b	0.486	0.658	0.509
70B parameters
llama-2-70b	0.597	0.674	0.561
leo-hessianai-70b	0.653	0.721	0.600

TBA

Prompt dialogue template (ChatML format):

"""
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""

The model input can contain multiple conversation turns between user and assistant, for example:

<|im_start|>user
{prompt 1}<|im_end|>
<|im_start|>assistant
{reply 1}<|im_end|>
<|im_start|>user
{prompt 2}<|im_end|>
<|im_start|>assistant
(...)
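
The multi-turn layout above can be sketched the same way. The helper `build_chatml_conversation`, its `(role, content)` tuple format, and the sample turns are hypothetical names chosen for illustration. Ending the string with an open assistant header is an assumption about how the next reply is requested.

```python
from typing import List, Tuple

ALLOWED_ROLES = {"system", "user", "assistant"}


def build_chatml_conversation(turns: List[Tuple[str, str]]) -> str:
    """Serialize (role, content) turns in ChatML and open a new assistant turn."""
    parts = []
    for role, content in turns:
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {role!r}")
        # Each completed turn is wrapped in <|im_start|>role ... <|im_end|>.
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
    # Assumption: the final assistant header is left open for the model to fill.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Hypothetical conversation history, for illustration only.
history = [
    ("user", "Hallo!"),
    ("assistant", "Hallo, wie kann ich helfen?"),
    ("user", "Please answer in English from now on."),
]
print(build_chatml_conversation(history))
```

Keeping the history as a list of tuples makes it easy to append the model's reply and the next user message before serializing again.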