Update README.md
# O1-Llama 3.2 3B Model Card
## Important Disclaimer
This is a **proof-of-concept research model** designed to demonstrate the feasibility of inducing structured reasoning behaviors in smaller language models. It is not intended for production use or deployment in real-world applications. The model serves primarily as a demonstration of training methodology and should be used only for research purposes.
## Model Overview
O1-Llama 3.2 3B is a fine-tuned version of Llama 3.2 3B, trained to produce explicit reasoning patterns similar to those observed in OpenAI's o1 model. It was fine-tuned on ReasonSet, a dataset of worked solutions to mathematical and logical problems.
## Key Capabilities
- Explicit brainstorming and strategy enumeration
- Step-by-step working of solutions
- Self-correction attempts
- Verification steps in problem-solving
## Limitations
- Significantly lower performance than larger models
- Can get stuck in circular reasoning
- May fail to find correct solutions despite showing reasoning behavior
- Limited to simpler problems
- Not suitable for production use or critical applications
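
When studying the circular-reasoning failure mode listed above, a simple heuristic for flagging repeated steps in a generated solution can be useful. The following is a minimal sketch and not part of this model or the REL repository; the function name `find_repeated_steps` and the line-level matching rule are assumptions made for illustration.

```python
import re


def find_repeated_steps(output: str, min_repeats: int = 2) -> list[str]:
    """Return normalized lines that occur at least `min_repeats` times.

    Hypothetical helper: treats each line of the model output as one
    reasoning step and reports repeats in first-seen order.
    """
    counts: dict[str, int] = {}
    for line in output.splitlines():
        # Normalize case and whitespace so trivially different repeats match.
        key = re.sub(r"\s+", " ", line.strip().lower())
        if key:
            counts[key] = counts.get(key, 0) + 1
    return [key for key, n in counts.items() if n >= min_repeats]


sample = "Try factoring.\nThat fails.\ntry  factoring.\nThat fails.\nTry substitution."
repeats = find_repeated_steps(sample)
```

Exact line matching will miss paraphrased loops, so this only catches the most literal form of repetition.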
## Training
- Base Model: Llama 3.2 3B
- Dataset: ReasonSet (2,000 worked solutions)
- Domains: problems from AIME, GPQA, and the MATH dataset
- Method: Fine-tuning on worked solutions generated through REL (Reasoning Enhancement Loop)
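
To make the fine-tuning setup above concrete, here is a minimal sketch of how a single worked solution might be turned into a prompt/completion pair for supervised fine-tuning. The ReasonSet schema is not documented in this card, so the `WorkedSolution` fields, the `to_training_example` function, and the text template are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WorkedSolution:
    """One training record. Field names are assumed, not the ReasonSet schema."""
    problem: str
    reasoning_steps: list[str]
    final_answer: str


def to_training_example(record: WorkedSolution) -> dict[str, str]:
    """Build a prompt/completion pair with numbered reasoning steps."""
    prompt = f"Problem: {record.problem}\nSolution:\n"
    numbered = [
        f"Step {i}: {step}"
        for i, step in enumerate(record.reasoning_steps, start=1)
    ]
    completion = "\n".join(numbered + [f"Final answer: {record.final_answer}"])
    return {"prompt": prompt, "completion": completion}


example = WorkedSolution(
    problem="What is 12 * 15?",
    reasoning_steps=[
        "Split 15 into 10 + 5.",
        "12 * 10 = 120 and 12 * 5 = 60.",
        "120 + 60 = 180.",
    ],
    final_answer="180",
)
pair = to_training_example(example)
```

The actual formatting used by REL may differ; see the repository for the real pipeline.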
## Intended Use
- Research into reasoning capabilities of smaller language models
- Study of explicit problem-solving behaviors
- Academic investigation of model training methodologies
## Repository
Available at: https://github.com/tamassimonds/REL
## Citation
[Include paper citation when published]