O1-Llama 3.2 3B Model Card

Important Disclaimer

This is a proof-of-concept research model that demonstrates the feasibility of inducing structured reasoning behaviors in smaller language models. It is not intended for production use or deployment in real-world applications; it serves primarily as a demonstration of the training methodology and should be used for research purposes only.

Model Overview

O1-Llama 3.2 3B is a fine-tuned version of Llama 3.2 3B, trained on ReasonSet, a dataset of worked solutions focused on mathematical and logical problem-solving, to exhibit explicit reasoning patterns similar to those observed in OpenAI's O1 model.
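
Below is a minimal inference sketch, assuming the standard Hugging Face transformers API and the repository id TamasSimonds/O1-Llama-3.2-3B; the prompt and generation settings are illustrative, not a prescribed format.

```python
# Hypothetical usage sketch: standard transformers causal-LM loading and greedy decoding.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TamasSimonds/O1-Llama-3.2-3B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)

prompt = "Solve step by step: what is the sum of the first 20 positive odd integers?"
inputs = tokenizer(prompt, return_tensors="pt")

# Leave plenty of room for the model's brainstorming, self-correction,
# and verification steps before the final answer.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```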

Key Capabilities

  • Explicit brainstorming and strategy enumeration
  • Step-by-step working out of solutions
  • Self-correction attempts
  • Verification steps in problem-solving

Limitations

  • Significantly lower performance compared to larger models
  • Can get stuck in circular reasoning
  • May fail to find correct solutions despite showing reasoning behavior
  • Limited to simpler problems
  • Not suitable for production use or critical applications

Training

  • Base Model: Llama 3.2 3B
  • Dataset: ReasonSet (2,000 worked solutions)
  • Domains: problems drawn from AIME, GPQA, and the MATH dataset
  • Method: Fine-tuning on worked solutions generated through REL (Reasoning Enhancement Loop); a rough sketch of this kind of supervised fine-tuning is shown below
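
The following sketch illustrates supervised fine-tuning on worked solutions in the general style described above. It assumes the ReasonSet examples are available locally as JSONL with a "text" field holding each full worked solution; the file name, field name, base-model id, and hyperparameters are illustrative assumptions, not the authors' actual configuration.

```python
# Hypothetical SFT sketch: causal-LM fine-tuning of Llama 3.2 3B on worked solutions.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "meta-llama/Llama-3.2-3B"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Each record is assumed to hold one worked solution
# (brainstorming, step-by-step working, self-correction, verification).
dataset = load_dataset("json", data_files="reasonset.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="o1-llama-3.2-3b",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=2,
        learning_rate=2e-5,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    # Standard causal-LM objective: the model learns to reproduce the worked solutions.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```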

Intended Use

  • Research into reasoning capabilities of smaller language models
  • Study of explicit problem-solving behaviors
  • Academic investigation of model training methodologies

Repository

Available at: https://github.com/tamassimonds/REL

Citation

[Include paper citation when published]
