CreitinGameplays/Mistral-Nemo-12B-R1-v0.1

Mistral Nemo 12B R1

Fine-tuning took 96 hours on 2x NVIDIA RTX A6000 GPUs with the following settings:

Batch size: 3
Gradient accumulation steps: 1
Epochs: 1
Learning rate: 1e-4
Warmup ratio: 0.1
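
To make the settings above concrete, here is a rough sketch of how they translate into an effective batch size and a warmup length. It assumes the batch size of 3 is per GPU and that both GPUs ran data-parallel; the model card states neither. The dataset size `num_examples` is a hypothetical value chosen only for illustration, and `training_schedule` is an illustrative helper, not part of any library.

```python
import math

def training_schedule(num_examples, per_device_batch_size=3,
                      gradient_accumulation_steps=1, num_gpus=2,
                      epochs=1, warmup_ratio=0.1):
    """Estimate optimizer steps and warmup steps for the listed settings.

    Assumes data parallelism, so each optimizer step consumes
    per_device_batch_size * gradient_accumulation_steps * num_gpus examples.
    """
    effective_batch_size = (
        per_device_batch_size * gradient_accumulation_steps * num_gpus
    )
    steps_per_epoch = math.ceil(num_examples / effective_batch_size)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return effective_batch_size, total_steps, warmup_steps

# Hypothetical dataset size; the real size is not given in the model card.
batch, total, warmup = training_schedule(num_examples=60_000)
print(batch, total, warmup)
```

With a warmup ratio of 0.1, roughly the first tenth of the optimizer steps ramp the learning rate up toward 1e-4 before the scheduler takes over.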

Run the model:

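import torch
from transformers import pipeline

model_id = "CreitinGameplays/Mistral-Nemo-12B-R1-v0.1"

# Load the model as a chat-capable text-generation pipeline.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # bfloat16 weights to reduce memory use
    device_map="auto"            # place the model on the available device(s)
)

# Chat-style input: a system message followed by the user's question.
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How many r's are in strawberry?"}
]

# Generation settings from the model card.
outputs = pipe(
    messages,
    temperature=0.2,
    repetition_penalty=1.1,
    max_new_tokens=2048
)

# The pipeline returns the whole conversation; the last entry is the assistant's reply.
print(outputs[0]["generated_text"][-1])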

System prompt:

You are an AI focused on providing systematic, well-reasoned responses. Response Structure: - Format: <think>{reasoning}</think>{answer} - Process: Think first, then answer. Always use your reasoning capabilities.
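
Because the system prompt asks for output in the form `<think>{reasoning}</think>{answer}`, it can be handy to separate the two parts after generation. The following is a minimal sketch using only the standard library. `build_messages` and `split_reasoning` are hypothetical helper names, not part of the model or of `transformers`, and the fallback for output without tags is an assumption about how you might want to handle it.

```python
import re

# System prompt copied from the model card.
SYSTEM_PROMPT = (
    "You are an AI focused on providing systematic, well-reasoned responses. "
    "Response Structure: - Format: <think>{reasoning}</think>{answer} "
    "- Process: Think first, then answer. Always use your reasoning capabilities."
)

# Matches the first <think>...</think> block, including newlines inside it.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def build_messages(user_prompt):
    """Build a chat message list that uses the reasoning system prompt."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]

def split_reasoning(text):
    """Return (reasoning, answer) from a '<think>...</think>answer' string.

    If no complete <think> block is found, the reasoning is empty and the
    whole text is treated as the answer.
    """
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

# Example with a hand-written string standing in for real model output.
sample = "<think>Spell it out: s-t-r-a-w-b-e-r-r-y.</think>There are 3 r's."
reasoning, answer = split_reasoning(sample)
print(answer)
```

With the pipeline example above, the assistant's text should be available as `outputs[0]["generated_text"][-1]["content"]`, which can be passed to `split_reasoning` directly.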
