Visualize in Weights & Biases

Mistral 7B Zephyr Orpo

The Zephyr Orpo recipe applied on top of Mistral 7B v0.2 (new recipe with new Mistral base model)

Model description

  • Model type: A 7.2B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
  • Language(s) (NLP): Primarily English
  • Finetuned from model: wandb/Mistral-7B-v0.2

Recipe

We trained using the alignment handbook recipe and logging to W&B

Visit the W&B workspace here

Results:

  • MT bench
########## First turn ##########
                            score
model               turn
zephyr-orpo-7b-v0.2 1     7.44375

########## Second turn ##########
                          score
model               turn
zephyr-orpo-7b-v0.2 2     6.875

########## Average ##########
                        score
model
zephyr-orpo-7b-v0.2  7.159375

Trained on a single H100 for 2 hours!

Downloads last month
25
Safetensors
Model size
7.24B params
Tensor type
BF16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for wandb/zephyr-orpo-7b-v0.2

Finetuned
(1)
this model

Dataset used to train wandb/zephyr-orpo-7b-v0.2