YangXiao-nlp committed on
Commit 7b59ab3 · verified · 1 Parent(s): 910230f

Create README.md

Files changed (1)
  1. README.md +196 -0
README.md ADDED
@@ -0,0 +1,196 @@
+ # LIMO: Less Is More for Reasoning 🚀
+
+
+ ## 📌 Table of Contents
+ - [Overview](#overview)
+ - [Key Results](#key-results)
+ - [Model Zoo](#model-zoo)
+ - [Datasets](#datasets)
+ - [Quick Start](#quick-start)
+ - [License](#license)
+ - [Citation](#citation)
+
+
+ ## Overview
+
+ LIMO challenges the conventional wisdom in mathematical reasoning by demonstrating that models can achieve superior performance with far less, but higher-quality, training data. Our approach:
+
+ - 🎯 Achieves SOTA on 8 of 10 benchmarks with only 817 carefully curated training samples
+ - 🌟 Shows strong generalization across diverse problem types
+ - 🔬 Provides comprehensive evaluation on 10 benchmarks
+ - 📚 Releases high-quality datasets and evaluation tools
+
+ ## Key Results
+
+ | Model | AIME24 | MATH500 | Training Samples |
+ |-------|--------|---------|------------------|
+ | LIMO (Ours) | **57.1%** | **94.8%** | 817 |
+ | Previous SOTA | 6.5% | 59.2% | 100k+ |
+
+ <details>
+ <summary>Click to see more detailed results</summary>
+
+ | Benchmark | LIMO | Previous SOTA | Improvement (percentage points) |
+ |-----------|------|---------------|---------------------------------|
+ | AIME24 | **57.1%** | 6.5% | +50.6 |
+ | MATH500 | **94.8%** | 59.2% | +35.6 |
+ | AMC23 | **92.0%** | 40.6% | +51.4 |
+ | OlympiadBench | **66.8%** | 36.7% | +30.1 |
+ | CHMath | **75.4%** | 11.2% | +64.2 |
+ | Gaokao | **81.0%** | 49.4% | +31.6 |
+ | Kaoyan | **73.4%** | 32.7% | +40.7 |
+ | GradeSchool | **76.2%** | 36.2% | +40.0 |
+ | Minerva | 44.9% | **47.1%** | -2.2 |
+ | GPQA | 66.7% | **73.3%** | -6.6 |
+
+ </details>
+
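The Improvement column in the detailed results is the difference between the two accuracy columns, measured in percentage points. A minimal sketch that recomputes it from the numbers in the table; the `results` dictionary and variable names are illustrative and not part of the LIMO codebase:

```python
# Accuracy (%) copied from the detailed results table: (LIMO, previous SOTA).
results = {
    "AIME24": (57.1, 6.5),
    "MATH500": (94.8, 59.2),
    "AMC23": (92.0, 40.6),
    "OlympiadBench": (66.8, 36.7),
    "CHMath": (75.4, 11.2),
    "Gaokao": (81.0, 49.4),
    "Kaoyan": (73.4, 32.7),
    "GradeSchool": (76.2, 36.2),
    "Minerva": (44.9, 47.1),
    "GPQA": (66.7, 73.3),
}

# Difference in percentage points, rounded to one decimal place.
improvement = {
    name: round(limo - prev, 1) for name, (limo, prev) in results.items()
}

# Benchmarks on which LIMO scores higher than the previous SOTA.
wins = [name for name, delta in improvement.items() if delta > 0]

for name, delta in improvement.items():
    print(f"{name:<14}{delta:+.1f} pp")
print(f"LIMO leads on {len(wins)} of {len(results)} benchmarks")
```

The differences are absolute (percentage points), not relative gains; for example, going from 6.5% to 57.1% on AIME24 is +50.6 points.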
+ ## Model Zoo
+
+ Our LIMO model is available on Hugging Face 🤗:
+
+ | Model | Backbone | Size | Link |
+ |-------|----------|------|------|
+ | LIMO | [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) | 32B | [🤗](https://huggingface.co/GAIR/LIMO) |
+
+
+ ## Datasets
+
+ We release our datasets through Hugging Face 🤗:
+
+ | Dataset | Description | Size | Link |
+ |---------|-------------|------|------|
+ | LIMO | Training set used to train the LIMO model | 817 samples | [🤗](https://huggingface.co/datasets/GAIR/LIMO) |
+
+ Note: We are gradually releasing the additional datasets mentioned in our paper, including those used for comparative experiments, to facilitate reproducibility and further analysis by the research community. Stay tuned!
+
+ ## Quick Start
+
+ Our model is fine-tuned from [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) and is compatible with most mainstream frameworks, such as [HF Transformers](https://github.com/huggingface/transformers), [vLLM](https://github.com/vllm-project/vllm), and [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM).
+
+
+ <details>
+ <summary>Start with HF Transformers</summary>
+
+ ```bash
+ # Install required packages (accelerate is needed for device_map="auto")
+ pip install transformers torch accelerate
+ ```
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ # Initialize model and tokenizer
+ model = AutoModelForCausalLM.from_pretrained(
+     "GAIR/LIMO",
+     torch_dtype="auto",
+     trust_remote_code=True,
+     device_map="auto",
+ )
+ tokenizer = AutoTokenizer.from_pretrained("GAIR/LIMO", trust_remote_code=True)
+
+ # Prepare input messages (this template and system prompt are the ones used during training and inference)
+ messages = [
+     {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
+     {"role": "user", "content": "What is the result of 1+1?"},
+ ]
+
+ # Format input using the chat template
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True,
+ )
+
+ # Tokenize input and move it to the model's device
+ inputs = tokenizer(text, return_tensors="pt").to(model.device)
+
+ # Generate response
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=32768,
+     temperature=0.7,
+     top_p=0.95,
+     do_sample=True,
+ )
+
+ # Decode only the newly generated tokens (skip the prompt) and print the response
+ response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
+ print(response)
+ ```
+
+ </details>
+
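Because the system prompt asks the model to put its final answer within `\boxed{}`, the answer can be pulled out of the generated text afterwards. The helper below is a minimal sketch of one way to do that; `extract_boxed` is a hypothetical name introduced here for illustration and is not part of the LIMO release or of Transformers:

```python
def extract_boxed(text):
    r"""Return the contents of the last \boxed{...} in text, or None if absent."""
    marker = r"\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None

    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: everything collected is the answer.
                return "".join(chars)
        chars.append(ch)
    return None  # braces never balanced, e.g. the generation was cut off


# Hypothetical response text, standing in for the decoded model output.
sample = r"1 + 1 = 2, so the final answer is \boxed{2}."
print(extract_boxed(sample))  # prints: 2
```

Tracking brace depth, rather than using a simple regular expression, keeps answers with nested braces such as `\frac{1}{2}` intact.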
+ <details>
+ <summary>Start with vLLM</summary>
+
+ ```bash
+ # Install required packages
+ pip install vllm
+ ```
+
+
+ ```python
+ from vllm import LLM, SamplingParams
+ from transformers import AutoTokenizer
+
+ # Initialize the model
+ llm = LLM(
+     model="GAIR/LIMO",
+     tensor_parallel_size=4,  # adjust based on available GPUs
+     trust_remote_code=True,
+     swap_space=60,
+     gpu_memory_utilization=0.96,
+ )
+
+ # Prepare input messages (this template and system prompt are the ones used during training and inference)
+ messages = [
+     {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
+     {"role": "user", "content": "What is the result of 1+1?"},
+ ]
+
+ # Set up the tokenizer and format the input using the chat template
+ tokenizer = AutoTokenizer.from_pretrained("GAIR/LIMO", trust_remote_code=True)
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True,
+ )
+
+ # Configure generation parameters
+ sampling_params = SamplingParams(
+     temperature=0.7,
+     max_tokens=32768,
+     top_p=0.95,
+ )
+
+ # Generate response and print the text of the first completion
+ output = llm.generate(text, sampling_params)
+ print(output[0].outputs[0].text)
+ ```
+
+ </details>
+
+
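When adjusting `tensor_parallel_size` to the available GPUs, a rough estimate of the weight memory per GPU can help. The sketch below is back-of-the-envelope arithmetic only. It assumes about 32 billion parameters (taken from the "32B" size in the Model Zoo table) stored at 2 bytes each (a 16-bit dtype, which is an assumption), and it ignores the KV cache, activations, and framework overhead. The function name is hypothetical.

```python
def weight_memory_per_gpu_gib(num_params, bytes_per_param=2, tensor_parallel_size=1):
    """Approximate GiB of model weights held by each GPU under tensor parallelism."""
    total_bytes = num_params * bytes_per_param
    return total_bytes / tensor_parallel_size / 1024**3


# Assumed values: ~32e9 parameters, 16-bit weights, 4-way tensor parallelism.
per_gpu = weight_memory_per_gpu_gib(32e9, bytes_per_param=2, tensor_parallel_size=4)
print(f"{per_gpu:.1f} GiB of weights per GPU")  # about 14.9 GiB
```

Long generations (up to 32768 new tokens in the examples above) also need substantial KV-cache memory, so the real requirement per GPU is higher than this figure.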
+ ## License
+
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+
+ ## Citation
+
+ ```bibtex
+ @misc{ye2025limoreasoning,
+       title={LIMO: Less is More for Reasoning},
+       author={Yixin Ye and Zhen Huang and Yang Xiao and Ethan Chern and Shijie Xia and Pengfei Liu},
+       year={2025},
+       eprint={2502.03387},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2502.03387},
+ }
+ ```