adds no think tag correctly
- NO_THINK_TAG_GUIDE.md +146 -0
- config/runpod_config.py +2 -2
- config/train_smollm3.py +2 -2
- config/train_smollm3_long_context.py +2 -2
- config/train_smollm3_no_think_test.py +38 -0
- config/train_smollm3_openhermes_fr.py +2 -2
- config/train_smollm3_openhermes_fr_a100_balanced.py +2 -2
- config/train_smollm3_openhermes_fr_a100_large.py +2 -2
- config/train_smollm3_openhermes_fr_a100_max_performance.py +2 -2
- config/train_smollm3_openhermes_fr_a100_multiple_passes.py +2 -2
- data.py +12 -1
- test_no_think.py +86 -0
NO_THINK_TAG_GUIDE.md
ADDED
@@ -0,0 +1,146 @@
# SmolLM3 `/no_think` Tag Implementation Guide

## The Problem

You were using the `enable_thinking` parameter in the chat template configuration, which is **incorrect** for SmolLM3. The `/no_think` tag should be added as a **system message** in your training data, not as a configuration parameter.

### What was wrong:

```python
# ❌ INCORRECT - This doesn't work for SmolLM3
chat_template_kwargs={
    "enable_thinking": False,  # This parameter doesn't exist in SmolLM3
    "add_generation_prompt": True
}
```

### What's correct:

```python
# ✅ CORRECT - Add /no_think as a system message
messages = [
    {"role": "system", "content": "You are a helpful assistant. /no_think"},
    {"role": "user", "content": "What is machine learning?"},
    {"role": "assistant", "content": "Machine learning is..."}
]
```

## The Solution

### 1. Updated Data Processing

The `data.py` file now properly handles the `/no_think` tag by:

- Adding a system message with `/no_think` when `no_think_system_message=True`
- Using the correct chat template parameters
- Properly formatting messages for SmolLM3

### 2. Updated Configuration

All configuration files now use the correct parameter:

```python
chat_template_kwargs={
    "add_generation_prompt": True,
    "no_think_system_message": True  # Set to True to add /no_think tag
}
```

### 3. How It Works

When `no_think_system_message=True`, the system automatically adds:

```
{"role": "system", "content": "You are a helpful assistant. /no_think"}
```

as the first message in each conversation.
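The prepending behavior can be sketched as a small standalone helper (a hypothetical illustration of the logic, not the exact `data.py` code; the function name is made up):

```python
def add_no_think_system_message(messages, no_think=True):
    """Prepend a default system message, appending /no_think when requested."""
    if messages and messages[0]["role"] == "system":
        return messages  # an existing system message is left untouched
    content = "You are a helpful assistant."
    if no_think:
        content += " /no_think"
    return [{"role": "system", "content": content}] + messages

conv = [{"role": "user", "content": "What is 2+2?"}]
print(add_no_think_system_message(conv)[0]["content"])
# → You are a helpful assistant. /no_think
```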
## Testing the Fix

### 1. Run the Test Script

```bash
python test_no_think.py
```

This will show you the difference between:

- Messages with the `/no_think` tag
- Messages without the `/no_think` tag

### 2. Test with Training

```bash
python train.py config/train_smollm3_no_think_test.py --dataset_dir=test_data
```

This uses a configuration that enables the `/no_think` tag.
## Configuration Options

### Enable the `/no_think` tag:

```python
chat_template_kwargs={
    "add_generation_prompt": True,
    "no_think_system_message": True  # Adds /no_think to the system message
}
```

### Disable the `/no_think` tag (allow thinking):

```python
chat_template_kwargs={
    "add_generation_prompt": True,
    "no_think_system_message": False  # No /no_think tag
}
```
## When to Use `/no_think`

- **Use `/no_think`** for tasks that require direct, concise responses
- **Don't use `/no_think`** for tasks that benefit from reasoning and step-by-step thinking

### Examples:

**With `/no_think` (direct answers):**

```
System: You are a helpful assistant. /no_think
User: What is 2+2?
Assistant: 4
```

**Without `/no_think` (reasoning allowed):**

```
System: You are a helpful assistant.
User: Solve this math problem step by step: 15 * 7
Assistant: Let me solve this step by step:
1. First, I'll break down 15 * 7
2. 15 * 7 = (10 + 5) * 7
3. = 10 * 7 + 5 * 7
4. = 70 + 35
5. = 105
The answer is 105.
```
## Updated Files

The following files were updated to fix the `/no_think` tag issue:

1. `data.py` - Updated `format_chat_template` function
2. `config/train_smollm3.py` - Updated default configuration
3. `config/train_smollm3_openhermes_fr.py` - Updated configuration
4. `config/train_smollm3_long_context.py` - Updated configuration
5. `config/runpod_config.py` - Updated configuration
6. All A100 configuration files - Updated configurations

## Verification

To verify the fix is working:

1. Check that system messages include `/no_think` when `no_think_system_message=True`
2. Verify that the chat template is applied correctly
3. Test with actual training to ensure the model learns the `/no_think` behavior
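For step 1 above, a minimal string-level spot check can be run without any tokenizer; this sketch simply rebuilds the system message the way the configuration flag is meant to drive it:

```python
# Hypothetical spot check mirroring the configuration flag's intent.
chat_template_kwargs = {"add_generation_prompt": True, "no_think_system_message": True}

system_content = "You are a helpful assistant."
if chat_template_kwargs.get("no_think_system_message"):
    system_content += " /no_think"

assert system_content.endswith("/no_think"), "missing /no_think in system message"
print("system message check passed")
```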
## References

- [SmolLM3 Model Card](https://huggingface.co/HuggingFaceTB/SmolLM3-3B)
- [SmolLM3 Documentation](https://huggingface.co/docs/transformers/model_doc/smollm3)
config/runpod_config.py
CHANGED
@@ -41,7 +41,7 @@ config = SmolLM3Config(
     # Chat template configuration
     use_chat_template=True,
     chat_template_kwargs={
-        "enable_thinking": False,
-        "add_generation_prompt": True
+        "add_generation_prompt": True,
+        "no_think_system_message": True  # Set to True to add /no_think tag
     }
 )
config/train_smollm3.py
CHANGED
@@ -80,8 +80,8 @@ class SmolLM3Config:
     def __post_init__(self):
         if self.chat_template_kwargs is None:
             self.chat_template_kwargs = {
-                "enable_thinking": False,
-                "add_generation_prompt": True
+                "add_generation_prompt": True,
+                "no_think_system_message": True  # Set to True to add /no_think tag
             }
 
         # Validate configuration
config/train_smollm3_long_context.py
CHANGED
@@ -32,7 +32,7 @@ config = SmolLM3Config(
     # Chat template configuration
     use_chat_template=True,
     chat_template_kwargs={
-        "enable_thinking": False,
-        "add_generation_prompt": True
+        "add_generation_prompt": True,
+        "no_think_system_message": True  # Allow thinking for long context tasks
     }
 )
config/train_smollm3_no_think_test.py
ADDED
@@ -0,0 +1,38 @@
"""
SmolLM3 Training Configuration with /no_think tag
Test configuration to verify /no_think tag functionality
"""

from config.train_smollm3 import SmolLM3Config

config = SmolLM3Config(
    # Model configuration
    model_name="HuggingFaceTB/SmolLM3-3B",
    max_seq_length=4096,
    use_flash_attention=True,
    use_gradient_checkpointing=True,

    # Training configuration
    batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-5,
    weight_decay=0.01,
    warmup_steps=100,
    max_iters=100,  # Short test run

    # Mixed precision
    fp16=True,
    bf16=False,

    # Logging and saving
    save_steps=50,
    eval_steps=25,
    logging_steps=10,

    # Chat template configuration with /no_think tag
    use_chat_template=True,
    chat_template_kwargs={
        "add_generation_prompt": True,
        "no_think_system_message": True  # Enable /no_think tag
    }
)
config/train_smollm3_openhermes_fr.py
CHANGED
@@ -89,8 +89,8 @@ class SmolLM3ConfigOpenHermesFR(SmolLM3Config):
     def __post_init__(self):
         if self.chat_template_kwargs is None:
             self.chat_template_kwargs = {
-                "enable_thinking": False,
-                "add_generation_prompt": True
+                "add_generation_prompt": True,
+                "no_think_system_message": True  # Set to True to add /no_think tag
             }
 
         # Validate configuration
config/train_smollm3_openhermes_fr_a100_balanced.py
CHANGED
@@ -104,8 +104,8 @@ class SmolLM3ConfigOpenHermesFRBalanced(SmolLM3Config):
     def __post_init__(self):
         if self.chat_template_kwargs is None:
             self.chat_template_kwargs = {
-                "enable_thinking": False,
-                "add_generation_prompt": True
+                "add_generation_prompt": True,
+                "no_think_system_message": True  # Set to True to add /no_think tag
             }
 
         # Validate configuration
config/train_smollm3_openhermes_fr_a100_large.py
CHANGED
@@ -105,8 +105,8 @@ class SmolLM3ConfigOpenHermesFRA100Large(SmolLM3Config):
     def __post_init__(self):
         if self.chat_template_kwargs is None:
             self.chat_template_kwargs = {
-                "enable_thinking": False,
-                "add_generation_prompt": True
+                "add_generation_prompt": True,
+                "no_think_system_message": True  # Set to True to add /no_think tag
             }
 
         # Validate configuration
config/train_smollm3_openhermes_fr_a100_max_performance.py
CHANGED
@@ -105,8 +105,8 @@ class SmolLM3ConfigOpenHermesFRMaxPerformance(SmolLM3Config):
     def __post_init__(self):
         if self.chat_template_kwargs is None:
             self.chat_template_kwargs = {
-                "enable_thinking": False,
-                "add_generation_prompt": True
+                "add_generation_prompt": True,
+                "no_think_system_message": True  # Set to True to add /no_think tag
             }
 
         # Validate configuration
config/train_smollm3_openhermes_fr_a100_multiple_passes.py
CHANGED
@@ -106,8 +106,8 @@ class SmolLM3ConfigOpenHermesFRMultiplePasses(SmolLM3Config):
     def __post_init__(self):
         if self.chat_template_kwargs is None:
             self.chat_template_kwargs = {
-                "enable_thinking": False,
-                "add_generation_prompt": True
+                "add_generation_prompt": True,
+                "no_think_system_message": True  # Set to True to add /no_think tag
             }
 
         # Validate configuration
data.py
CHANGED
@@ -147,11 +147,22 @@ class SmolLM3Dataset:
             # Fallback: treat as plain text
             return {"text": str(example)}
 
+        # Add system message with /no_think tag if not present
+        if messages and messages[0]["role"] != "system":
+            # Check if we should add /no_think tag based on configuration
+            system_content = "You are a helpful assistant."
+            if hasattr(self, 'chat_template_kwargs') and self.chat_template_kwargs:
+                # If no_think_system_message is True, add /no_think tag
+                if self.chat_template_kwargs.get("no_think_system_message") == True:
+                    system_content = "You are a helpful assistant. /no_think"
+
+            messages.insert(0, {"role": "system", "content": system_content})
+
         # Apply chat template
         text = self.tokenizer.apply_chat_template(
             messages,
             tokenize=False,
-            add_generation_prompt=True
+            add_generation_prompt=self.chat_template_kwargs.get("add_generation_prompt", True)
         )
         return {"text": text}
     except Exception as e:
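The new branch can be exercised without downloading the model by substituting a stub tokenizer; `StubTokenizer` and its template format below are stand-ins (assumptions), not the real SmolLM3 chat template:

```python
class StubTokenizer:
    """Minimal stand-in mimicking only apply_chat_template's signature."""
    def apply_chat_template(self, messages, tokenize=False, add_generation_prompt=True):
        lines = [f"<|{m['role']}|>{m['content']}" for m in messages]
        if add_generation_prompt:
            lines.append("<|assistant|>")
        return "\n".join(lines)

chat_template_kwargs = {"add_generation_prompt": True, "no_think_system_message": True}
messages = [{"role": "user", "content": "What is machine learning?"}]

# Same insertion logic as the diff above
if messages and messages[0]["role"] != "system":
    system_content = "You are a helpful assistant."
    if chat_template_kwargs.get("no_think_system_message"):
        system_content = "You are a helpful assistant. /no_think"
    messages.insert(0, {"role": "system", "content": system_content})

text = StubTokenizer().apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=chat_template_kwargs.get("add_generation_prompt", True),
)
assert "/no_think" in text
print(text.splitlines()[0])
# → <|system|>You are a helpful assistant. /no_think
```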
test_no_think.py
ADDED
@@ -0,0 +1,86 @@
#!/usr/bin/env python3
"""
Test script to verify /no_think tag handling in SmolLM3
"""

import sys
import os
sys.path.append(os.path.dirname(os.path.abspath(__file__)))

from transformers import AutoTokenizer
from data import SmolLM3Dataset

def test_no_think_tag():
    """Test that /no_think tag is properly applied"""

    # Load tokenizer
    tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM3-3B")

    # Test data
    test_data = [
        {
            "messages": [
                {"role": "user", "content": "What is machine learning?"},
                {"role": "assistant", "content": "Machine learning is a subset of AI..."}
            ]
        }
    ]

    # Test with no_think_system_message=True
    print("=== Testing with no_think_system_message=True ===")
    dataset_with_no_think = SmolLM3Dataset(
        data_path="test_data",
        tokenizer=tokenizer,
        max_seq_length=4096,
        use_chat_template=True,
        chat_template_kwargs={
            "add_generation_prompt": True,
            "no_think_system_message": True
        }
    )

    # Test with no_think_system_message=False
    print("\n=== Testing with no_think_system_message=False ===")
    dataset_without_no_think = SmolLM3Dataset(
        data_path="test_data",
        tokenizer=tokenizer,
        max_seq_length=4096,
        use_chat_template=True,
        chat_template_kwargs={
            "add_generation_prompt": True,
            "no_think_system_message": False
        }
    )

    # Test manual chat template application
    print("\n=== Manual chat template test ===")
    messages = [
        {"role": "user", "content": "What is machine learning?"},
        {"role": "assistant", "content": "Machine learning is a subset of AI..."}
    ]

    # Without /no_think
    text_without = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    print("Without /no_think:")
    print(text_without[:200] + "..." if len(text_without) > 200 else text_without)

    # With /no_think
    messages_with_system = [
        {"role": "system", "content": "You are a helpful assistant. /no_think"},
        {"role": "user", "content": "What is machine learning?"},
        {"role": "assistant", "content": "Machine learning is a subset of AI..."}
    ]
    text_with = tokenizer.apply_chat_template(
        messages_with_system,
        tokenize=False,
        add_generation_prompt=True
    )
    print("\nWith /no_think:")
    print(text_with[:200] + "..." if len(text_with) > 200 else text_with)

if __name__ == "__main__":
    test_no_think_tag()