dfurman/Llama-3-70B-Orpo-v0.1

This is an ORPO fine-tune of meta-llama/Meta-Llama-3-70B on 2k samples of mlabonne/orpo-dpo-mix-40k.

It's a successful fine-tune that follows the ChatML template!

🔎 Application

This model uses a context window of 8k. It was trained with the ChatML template.
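
For readers unfamiliar with ChatML, here is a minimal sketch of how a list of chat messages is laid out as a prompt string. The helper name `to_chatml` is hypothetical and exists only for illustration. The tokenizer's own chat template (applied with `tokenizer.apply_chat_template` in the Usage section) is the authoritative source, and it may differ in details such as special tokens added at the start of the sequence.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML-style prompt string.

    Illustrative only: the model's tokenizer chat template is authoritative.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me a recipe for a spicy margarita."},
]
print(to_chatml(messages))
```

When running the model, prefer the tokenizer's template over hand-built strings so the prompt matches what was used in training.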

πŸ† Evaluation

Open LLM Leaderboard

| Model ID | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| --- | --- | --- | --- | --- | --- | --- | --- |
| meta-llama/Meta-Llama-3-70B-Instruct | 77.88 | 71.42 | 85.69 | 80.06 | 61.81 | 82.87 | 85.44 |
| dfurman/Llama-3-70B-Orpo-v0.1 | 74.67 | 68.69 | 88.01 | 79.39 | 49.62 | 85.48 | 76.8 |
| meta-llama/Meta-Llama-3-70B | 73.96 | 68.77 | 87.98 | 79.23 | 45.56 | 85.32 | 76.88 |
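
The Average column in the table above is consistent with an unweighted mean of the six benchmark scores. The sketch below recomputes it from the table's values and shows the per-benchmark difference between this model and the base model. The names `SCORES`, `average`, and `deltas` are hypothetical helpers for illustration, not part of any leaderboard tooling.

```python
BENCHMARKS = ["ARC", "HellaSwag", "MMLU", "TruthfulQA", "Winogrande", "GSM8K"]

# Scores copied from the table above, in BENCHMARKS order
SCORES = {
    "meta-llama/Meta-Llama-3-70B-Instruct": [71.42, 85.69, 80.06, 61.81, 82.87, 85.44],
    "dfurman/Llama-3-70B-Orpo-v0.1": [68.69, 88.01, 79.39, 49.62, 85.48, 76.8],
    "meta-llama/Meta-Llama-3-70B": [68.77, 87.98, 79.23, 45.56, 85.32, 76.88],
}


def average(scores):
    """Unweighted mean of the benchmark scores."""
    return sum(scores) / len(scores)


def deltas(model, baseline):
    """Per-benchmark score difference (model minus baseline), rounded to 2 decimals."""
    return {
        name: round(m - b, 2)
        for name, m, b in zip(BENCHMARKS, SCORES[model], SCORES[baseline])
    }


for model_id, scores in SCORES.items():
    print(f"{model_id}: mean = {average(scores):.3f}")

print(deltas("dfurman/Llama-3-70B-Orpo-v0.1", "meta-llama/Meta-Llama-3-70B"))
```

The recomputed means agree with the Average column to within rounding. The largest difference from the base model is on TruthfulQA (about +4 points); the other five benchmarks move by less than 0.2 points.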

📈 Training curves

You can find the experiment on W&B at this address.

💻 Usage

Setup
!pip install -qU transformers accelerate bitsandbytes

from transformers import AutoTokenizer, BitsAndBytesConfig
import transformers
import torch

if torch.cuda.get_device_capability()[0] >= 8:
    !pip install -qqq flash-attn
    attn_implementation = "flash_attention_2"
    torch_dtype = torch.bfloat16
else:
    attn_implementation = "eager"
    torch_dtype = torch.float16

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,
    bnb_4bit_use_double_quant=True,
)

model = "dfurman/Llama-3-70B-Orpo-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={
        "torch_dtype": torch_dtype,
        "quantization_config": bnb_config,
        "device_map": "auto",
        "attn_implementation": attn_implementation,
    }
)
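
To give a rough sense of why the setup above loads the model in 4-bit, the following back-of-the-envelope sketch estimates the memory needed for the weights alone at different precisions. It assumes roughly 70 billion parameters (taken from the "70B" in the model name) and ignores quantization constants, any layers left unquantized, activations, and the KV cache, so real usage will be higher. The helper `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 70e9  # assumption: roughly 70B parameters, per the model name

print(f"16-bit weights: {weight_memory_gb(NUM_PARAMS, 16):.0f} GB")  # 140 GB
print(f" 4-bit weights: {weight_memory_gb(NUM_PARAMS, 4):.0f} GB")   # 35 GB
```

Under these assumptions, 4-bit loading cuts the weight footprint to about a quarter of the 16-bit figure, which is what makes a single-node setup with `device_map="auto"` more practical.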

Run

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me a recipe for a spicy margarita."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print("***Prompt:\n", prompt)

outputs = pipeline(prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print("***Generation:\n", outputs[0]["generated_text"][len(prompt):])

Output
"""
"""

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
| --- | --- |
| Avg. | 17.92 |
| IFEval (0-Shot) | 20.49 |
| BBH (3-Shot) | 24.09 |
| MATH Lvl 5 (4-Shot) | 13.52 |
| GPQA (0-shot) | 1.01 |
| MuSR (0-shot) | 16.28 |
| MMLU-PRO (5-shot) | 32.14 |