wandb
/

zephyr-orpo-7b-v0.2

Text Generation

Generated from Trainer

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

Mistral 7B Zephyr Orpo

The Zephyr Orpo recipe applied on top of Mistral 7B v0.2 (new recipe with new Mistral base model)

Model description

Model type: A 7.2B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
Language(s) (NLP): Primarily English
Finetuned from model: wandb/Mistral-7B-v0.2

Recipe

We trained using the alignment handbook recipe and logging to W&B

Visit the W&B workspace here

Results:

MT bench

########## First turn ##########
                            score
model               turn
zephyr-orpo-7b-v0.2 1     7.44375

########## Second turn ##########
                          score
model               turn
zephyr-orpo-7b-v0.2 2     6.875

########## Average ##########
                        score
model
zephyr-orpo-7b-v0.2  7.159375

Trained on a single H100 for 2 hours!

Downloads last month: 25

Safetensors

Model size

7.24B params

Tensor type

BF16

·

Inference Examples

Text Generation

This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for wandb/zephyr-orpo-7b-v0.2

Base model

wandb/Mistral-7B-v0.2

Finetuned

(1)

this model

Dataset used to train wandb/zephyr-orpo-7b-v0.2