A newer version of this model is available: ngxson/MiniThinky-v2-1B-Llama-3.2

MiniThinky 1B


My first attempt at fine-tuning a small model to add reasoning capability.

Link to GGUF version: click here

The chat template is the same as Llama 3, but the response is formatted as follows:

<|thinking|>{thinking_process}
<|answer|>
{real_answer}
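Given the format above, the thinking trace and the final answer can be separated with a small parser. This is an illustrative sketch (the helper name `parse_minithinky_response` is my own, not part of the model release); it assumes the response follows the documented `<|thinking|>` / `<|answer|>` layout.

```python
import re

def parse_minithinky_response(text: str):
    """Split a MiniThinky response into (thinking_process, real_answer).

    Assumes the documented format:
    <|thinking|>{thinking_process}\n<|answer|>\n{real_answer}
    """
    match = re.match(r"<\|thinking\|>(.*?)<\|answer\|>\s*(.*)", text, re.DOTALL)
    if match is None:
        # No markers found: treat the whole text as the answer.
        return "", text.strip()
    thinking, answer = match.groups()
    return thinking.strip(), answer.strip()

response = "<|thinking|>2 + 2 is basic addition.\n<|answer|>\n4"
thinking, answer = parse_minithinky_response(response)
print(thinking)  # 2 + 2 is basic addition.
print(answer)    # 4
```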

IMPORTANT: System message

The model is very sensitive to the system message. Make sure you use this system message (system role) at the beginning of the conversation:

You are MiniThinky, a helpful AI assistant. You always think before giving the answer. Use <|thinking|> before thinking and <|answer|> before giving the answer.
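A minimal sketch of putting that system message first in a Llama-3-style message list. The helper `build_messages` is hypothetical; the resulting list can be passed to a standard chat-template call (e.g. `tokenizer.apply_chat_template` in transformers) like any other Llama 3 conversation.

```python
SYSTEM_PROMPT = (
    "You are MiniThinky, a helpful AI assistant. You always think before "
    "giving the answer. Use <|thinking|> before thinking and <|answer|> "
    "before giving the answer."
)

def build_messages(user_prompt: str, history=None):
    """Build a chat message list with the required system message first."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    if history:
        messages.extend(history)  # prior user/assistant turns, if any
    messages.append({"role": "user", "content": user_prompt})
    return messages

messages = build_messages("How many days are in a leap year?")
# Pass `messages` to your chat-template/inference call as usual.
```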

Q&A

Hardware used to train it?
I used an HF Space with 4xL40S GPUs, trained for 5 hours. Eval loss is about 0.8.

Benchmark?
I don't have time to do it alone. If you can help, please open a discussion!

Can it count the number of "r"s in "raspberry"?
Unfortunately, no.

Other things I can tune?
Maybe lower the temperature, or set top_k=1.
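To make the suggestion concrete: top_k=1 means greedy decoding (always pick the highest-probability token), and lowering the temperature sharpens the distribution toward the same effect. Here is a dependency-free sketch of that sampling step (the function is illustrative, not part of any inference library):

```python
import math
import random

def sample_next_token(logits, temperature=0.7, top_k=1):
    """Sample a token id from raw logits with temperature and top-k filtering.

    top_k=1 reduces to greedy decoding, as suggested above; lowering the
    temperature has a similar sharpening effect on the distribution.
    """
    if top_k == 1:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Keep only the top_k highest-scoring token ids.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Softmax over the kept logits, scaled by temperature.
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    return random.choices(top, weights=weights, k=1)[0]

print(sample_next_token([0.1, 2.5, -1.0]))  # 1 (greedy pick of the largest logit)
```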


TODO: include more info here + maybe do some benchmarks? (Please open a discussion if you're interested.)

