🍷 Llama-3.2-Nemotron-3B-Instruct

This is a finetune of meta-llama/Llama-3.2-3B-Instruct (specifically, unsloth/Llama-3.2-3B-Instruct-bnb-4bit).

It was trained on the nvidia/HelpSteer2 dataset, similar to nvidia/Llama-3.1-Nemotron-70B-Instruct-HF, using Unsloth.

💻 Usage

!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "itsnebulalol/Llama-3.2-Nemotron-3B-Instruct"
messages = [{"role": "user", "content": "How many r in strawberry?"}]

tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
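By default, the text-generation pipeline returns the prompt followed by the completion, so printing `generated_text` also echoes the chat-template text. The sketch below shows one way to keep only the reply. The helper name `extract_reply` and the example strings are hypothetical stand-ins for illustration, not part of this model's API or real model output.

```python
def extract_reply(generated_text: str, prompt: str) -> str:
    """Return only the newly generated text, dropping the echoed prompt."""
    # The pipeline output normally starts with the exact prompt string.
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Fall back to the full text if the prompt is not echoed verbatim.
    return generated_text.strip()


# Stand-in strings for illustration only; not real model output.
example_prompt = "<formatted prompt>"
example_output = example_prompt + " example reply "
print(extract_reply(example_output, example_prompt))
```

With the usage snippet above, this would be called as `extract_reply(outputs[0]["generated_text"], prompt)`. Alternatively, passing `return_full_text=False` in the pipeline call should make the pipeline drop the prompt itself.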

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Model size: 3.21B params (Safetensors), tensor type BF16