Safetensors · llama

keeeeenw committed · Commit 83efcf3 · verified · 1 Parent(s): 4db74b7

Update README.md

Files changed (1):
  1. README.md +50 -181
README.md CHANGED

---

# 🚀 Introducing Llama-3.2-1B-Instruct-Open-R1-Distill

Built on **Llama-3.2-1B-Instruct** and Hugging Face's [OpenR1](https://github.com/huggingface/open-r1), a fully open reproduction of **DeepSeek-R1**, this model brings powerful reasoning capabilities to a compact, efficient architecture.

## 📌 Why This Matters
 
I have always been passionate about pushing the boundaries of **LLM** technology in smaller models that can run seamlessly on laptop CPUs and smartphones.

With the recent breakthrough of **DeepSeek-R1**, developing a high-quality reasoning model through distillation has become remarkably straightforward. It requires only **supervised fine-tuning (SFT)** on a dataset generated by a teacher model.

Thanks to **Hugging Face**, we now have a streamlined framework to make this process more accessible than ever.

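To make that concrete, here is a minimal sketch of the SFT step using trl's `SFTTrainer` (the library underlying OpenR1's training script) on the teacher-generated [HuggingFaceH4/Bespoke-Stratos-17k](https://huggingface.co/datasets/HuggingFaceH4/Bespoke-Stratos-17k) dataset. The column handling and hyperparameters shown are illustrative assumptions; the actual recipe used for this model is described under Training Details below.

```python
# Minimal SFT-distillation sketch (illustrative; the model itself was trained
# with the OpenR1 recipe described under Training Details below).
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Teacher-generated reasoning traces; assumed to expose a chat-style
# "messages" column, which SFTTrainer can consume directly.
dataset = load_dataset("HuggingFaceH4/Bespoke-Stratos-17k", split="train")

training_args = SFTConfig(
    output_dir="Llama-3.2-1B-Instruct-Open-R1-Distill-sketch",  # assumed path
    num_train_epochs=5,             # matches the 5 epochs reported below
    per_device_train_batch_size=4,  # matches the batch size reported below
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.2-1B-Instruct",  # the student model
    train_dataset=dataset,
    args=training_args,
)
trainer.train()
```
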
### Model Description

- **License:** Apache License 2.0
- **Finetuned from model:** Llama-3.2-1B-Instruct

## 🎯 Uses

- 💡 **On-device AI assistants** for reasoning and general-purpose tasks
- 📱 **Mobile and edge AI applications** requiring lightweight models
- 🤖 **Chatbots and virtual assistants** optimized for efficiency
- 🏗 **Fine-tuning for specific domains** with SFT

### How to run the code?

```python
from transformers import AutoTokenizer, LlamaForCausalLM

tokenizer = AutoTokenizer.from_pretrained("keeeeenw/Llama-3.2-1B-Instruct-Open-R1-Distill")
model = LlamaForCausalLM.from_pretrained("keeeeenw/Llama-3.2-1B-Instruct-Open-R1-Distill")

# Prompt supported by HuggingFaceH4/Bespoke-Stratos-17k
# ... (prompt construction and the model.generate call are omitted here)

print(tokenizer.decode(outputs[0]))
```

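The snippet above is abbreviated. For a self-contained starting point, here is a minimal sketch assuming the standard `transformers` chat-template API; the example question is a placeholder, and the system prompt mirrors the one used for evaluation below:

```python
from transformers import AutoTokenizer, LlamaForCausalLM

repo_id = "keeeeenw/Llama-3.2-1B-Instruct-Open-R1-Distill"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = LlamaForCausalLM.from_pretrained(repo_id)

# System prompt mirrors the evaluation setup in this card.
messages = [
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": "What is the sum of the first 100 positive integers?"},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
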
## 🏋️‍♂️ Training Details

To reproduce the results, go to Hugging Face's [OpenR1](https://github.com/huggingface/open-r1) repository and install the package.

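A typical install sequence might look like the following; this is an assumption on my part, so check the OpenR1 README for the currently recommended steps and extras:

```shell
# Assumed install flow; see the OpenR1 README for the official instructions.
git clone https://github.com/huggingface/open-r1.git
cd open-r1
pip install -e .
```
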
Then execute the following command:

```shell
ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero3.yaml src/open_r1/sft.py --config recipes/config_llama3_instruct_1b.yaml
```

You can create your own `recipes/config_llama3_instruct_1b.yaml` by copying [config_full.yaml](https://github.com/huggingface/open-r1/blob/main/recipes/qwen/Qwen2.5-1.5B-Instruct/sft/config_full.yaml)
into the desired folder and changing the model path to `model_name_or_path: meta-llama/Llama-3.2-1B-Instruct` (or any Hugging Face model repo id you are interested in).
You may also choose to train for more than one epoch (I trained for 5 epochs).
If you want intermediate checkpoints, set the save parameters accordingly:

```yaml
save_strategy: "steps"
save_steps: 100
```

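Putting those edits together, a sketch of the resulting recipe might look like the following. Everything except the fields discussed in this card (model path, epochs, batch sizes, save settings) is an assumption, so copy `config_full.yaml` and edit it rather than using this verbatim:

```yaml
# Illustrative sketch of recipes/config_llama3_instruct_1b.yaml; fields not
# discussed in this card are assumptions. Start from config_full.yaml instead.
model_name_or_path: meta-llama/Llama-3.2-1B-Instruct
dataset_name: HuggingFaceH4/Bespoke-Stratos-17k  # assumed teacher-generated SFT data
num_train_epochs: 5
per_device_train_batch_size: 4
per_device_eval_batch_size: 4
save_strategy: "steps"
save_steps: 100
output_dir: data/Llama-3.2-1B-Instruct-Open-R1-Distill  # assumed
```
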
I tried a batch size of 1 for both train and eval on a single Nvidia 4090 but still hit OOM, so I rented 4 x L40S GPUs from [vast.ai](https://vast.ai). With 4 GPUs at a per-device batch size of 4, the effective batch size is 16. Training for 5 epochs took less than 4 hours.

```yaml
per_device_eval_batch_size: 4
per_device_train_batch_size: 4
```

## 📊 Evaluation

The evaluation of this model follows Hugging Face's [OpenR1](https://github.com/huggingface/open-r1) instructions:

```shell
NUM_GPUS=4
MODEL="/root/open-r1/data/meta-llama/Llama-3.2-1B-Instruct"
MODEL_ARGS="pretrained=$MODEL,dtype=float16,data_parallel_size=$NUM_GPUS,max_model_length=32768,gpu_memory_utilisation=0.8"
TASK=aime24
OUTPUT_DIR=data/evals/$MODEL

lighteval vllm $MODEL_ARGS "custom|$TASK|0|0" \
    --custom-tasks src/open_r1/evaluate.py \
    --use-chat-template \
    --system-prompt="Please reason step by step, and put your final answer within \boxed{}." \
    --output-dir $OUTPUT_DIR
```

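The same command can be pointed at other reasoning benchmarks by changing `TASK`; I assume `math_500` is also available, but check `src/open_r1/evaluate.py` for the currently supported task list:

```shell
# Swap in another benchmark (assumed; verify against src/open_r1/evaluate.py).
TASK=math_500
lighteval vllm $MODEL_ARGS "custom|$TASK|0|0" \
    --custom-tasks src/open_r1/evaluate.py \
    --use-chat-template \
    --system-prompt="Please reason step by step, and put your final answer within \boxed{}." \
    --output-dir $OUTPUT_DIR
```
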
Results: To be added