
Llama-3.1-Tulu-3-70B-AWQ

Quantization Details

This quantized model was created with AutoAWQ version 0.2.8, using the following quant_config:

{
  "zero_point": True,
  "q_group_size": 128,
  "w_bit": 4,
  "version": "GEMM"
}
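As a rough illustration of what w_bit=4 and q_group_size=128 imply for storage, the sketch below estimates weight memory under the assumption of one fp16 scale and one packed 4-bit zero point per group of 128 weights. Exact layouts vary by kernel, so these numbers are illustrative, not a measurement of this checkpoint:

```python
def awq_weight_bytes(n_params, w_bit=4, group_size=128):
    """Back-of-envelope weight-storage estimate for group-wise AWQ."""
    packed = n_params * w_bit / 8        # packed integer weights
    n_groups = n_params / group_size
    scales = n_groups * 2                # assumed fp16 scale per group
    zeros = n_groups * w_bit / 8         # assumed packed int4 zero point per group
    return packed + scales + zeros

gb = awq_weight_bytes(70e9) / 1e9
# roughly 36 GB of weight storage for a 70B model, versus ~140 GB at bf16
```

This is why the 4-bit checkpoint fits on far less GPU memory than the full-precision release, at the cost of some quantization error that the group-wise scales are there to limit.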

Model description

Tülu3 is a leading instruction following model family, offering fully open-source data, code, and recipes designed to serve as a comprehensive guide for modern post-training techniques. Tülu3 is designed for state-of-the-art performance on a diversity of tasks in addition to chat, such as MATH, GSM8K, and IFEval.

  • Model type: A model trained on a mix of publicly available, synthetic and human-created datasets.
  • Language(s) (NLP): Primarily English
  • License: Llama 3.1 Community License Agreement
  • Finetuned from model: allenai/Llama-3.1-Tulu-3-70B-DPO

Model Sources

Model Family

Stage                 Llama 3.1 405B
Base Model            meta-llama/llama-3.1-405B
SFT                   allenai/llama-3.1-Tulu-3-405B-SFT
DPO                   allenai/llama-3.1-Tulu-3-405B-DPO
Final Model (RLVR)    allenai/llama-3.1-Tulu-3-405B
Reward Model (RM)     (Same as 8B)

Using the model

Loading with HuggingFace

To load the model with HuggingFace, use the following snippet:

from transformers import AutoModelForCausalLM

# Downloads and loads the full-precision checkpoint (~140 GB in bf16);
# pass device_map="auto" to shard it across available GPUs.
tulu_model = AutoModelForCausalLM.from_pretrained("allenai/Llama-3.1-Tulu-3-70B")

VLLM

Because this is a Llama-based model, it can be served easily with vLLM:

vllm serve allenai/Llama-3.1-Tulu-3-70B

Note that, given Llama's long chat template, you may want to add --max_model_len=8192.

Chat template

The chat template for our models is formatted as:

<|user|>\nHow are you doing?\n<|assistant|>\n

The model's response follows the final <|assistant|> tag.
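The template can be applied programmatically; in practice you would use tokenizer.apply_chat_template, but the hypothetical helper below illustrates the format, assuming each turn renders as <|role|>\n{content}\n with a trailing <|assistant|>\n to cue generation:

```python
def format_tulu_chat(messages):
    """Render OpenAI-style role/content dicts into the Tulu chat format."""
    prompt = ""
    for m in messages:
        prompt += f"<|{m['role']}|>\n{m['content']}\n"
    # End with the assistant tag so the model generates the reply.
    prompt += "<|assistant|>\n"
    return prompt

prompt = format_tulu_chat([{"role": "user", "content": "How are you doing?"}])
# prompt == "<|user|>\nHow are you doing?\n<|assistant|>\n"
```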
