x54-729 committed
Commit 123447e · 1 Parent(s): f6d24e6
update readme
README.md CHANGED
@@ -1,13 +1,3 @@
|
|
1 |
-
---
|
2 |
-
pipeline_tag: text-generation
|
3 |
-
license: other
|
4 |
-
language:
|
5 |
-
- en
|
6 |
-
- zh
|
7 |
-
tags:
|
8 |
-
- math
|
9 |
-
---
|
10 |
-
|
11 |
# InternLM-Math
|
12 |
|
13 |
<div align="center">
|
@@ -24,25 +14,33 @@ tags:
|
|
24 |
<div> </div>
|
25 |
</div>
|
26 |
|
27 |
-
State-of-the-art bilingual open-sourced Math reasoning LLMs.
|
28 |
A **solver**, **prover**, **verifier**, **augmentor**.
|
29 |
|
30 |
-
[💻 Github](https://github.com/InternLM/InternLM-Math) [🤗 Demo](https://huggingface.co/spaces/internlm/internlm2-math-7b) [🤗 Checkpoints](https://huggingface.co/internlm/internlm2-math-7b) [![OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM2-Math-7B)
|
31 |
</div>
|
32 |
|
33 |
# Introduction
|
34 |
- **7B and 20B Chinese and English Math LMs with better-than-ChatGPT performance.** InternLM2-Math models are obtained by continued pretraining of InternLM2-Base on ~100B high-quality math-related tokens, followed by SFT on ~2M bilingual supervised math examples. We apply MinHash and exact number matching to decontaminate possible test set leakage.
|
35 |
- **Add Lean as a supported language for math problem solving and math theorem proving.** We are exploring combining Lean 3 with InternLM-Math for verifiable math reasoning. InternLM-Math can generate Lean code for simple math reasoning tasks like GSM8K or suggest possible proof tactics based on Lean states.
|
36 |
- **Can also be used as a reward model, supporting Outcome/Process/Lean Reward Models.** We supervise InternLM2-Math with various types of reward modeling data so that it can also verify chain-of-thought processes. We also add the ability to convert a chain-of-thought process into Lean 3 code.
|
37 |
-
- **A Math LM Augment Helper** and **Code
|
38 |
|
39 |
# Models
|
40 |
-
|
41 |
-
|
42 |
-
|
43 |
-
| **InternLM2-Math-Base-
|
44 |
-
| **InternLM2-Math-
|
45 |
-
| **InternLM2-Math-
|
46 |
|
47 |
|
48 |
# Performance
|
@@ -91,7 +89,7 @@ from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig
|
|
91 |
|
92 |
backend_config = TurbomindEngineConfig(model_name='internlm2-chat-7b', tp=1, cache_max_entry_count=0.3)
|
93 |
chat_template = ChatTemplateConfig(model_name='internlm2-chat-7b', system='', eosys='', meta_instruction='')
|
94 |
-
pipe = pipeline(model_path='internlm/internlm2-math-7b', chat_template_config=chat_template, backend_config=backend_config)
|
95 |
|
96 |
problem = '1+1='
|
97 |
result = pipe([problem], request_output_len=1024, top_k=1)
|
@@ -101,9 +99,9 @@ result = pipe([problem], request_output_len=1024, top_k=1)
|
|
101 |
```python
|
102 |
import torch
|
103 |
from transformers import AutoTokenizer, AutoModelForCausalLM
|
104 |
-
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-math-7b", trust_remote_code=True)
|
105 |
# Set `torch_dtype=torch.float16` to load model in float16, otherwise it will be loaded as float32 and might cause OOM Error.
|
106 |
-
model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-math-7b", trust_remote_code=True, torch_dtype=torch.float16).cuda()
|
107 |
model = model.eval()
|
108 |
response, history = model.chat(tokenizer, "1+1=", history=[], meta_instruction="")
|
109 |
print(response)
|
@@ -112,6 +110,21 @@ print(response)
|
|
112 |
# Special usages
|
113 |
We list some instructions used in our SFT data. You can prompt the model in other ways, but the following are recommended. InternLM2-Math may combine the following abilities, but this is not guaranteed.
|
114 |
|
115 |
| Description | Query |
|
116 |
| --- | --- |
|
117 |
| Solving question via chain-of-thought | {Question} |
|
@@ -120,26 +133,27 @@ We list some instructions used in our SFT. You can use them to help you. You can
|
|
120 |
| Process reward model | Given a question and an answer, check correctness of each step.\nQuestion:{Question}\nAnswer:{COT} |
|
121 |
| Reward model | Given a question and two answers, which one is better? \nQuestion:{Question}\nAnswer 1:{COT}\nAnswer 2:{COT} |
|
122 |
| Convert chain-of-thought to Lean 3 | Convert this answer into Lean3. Question:{Question}\nAnswer:{COT} |
|
123 |
-
| Convert Lean 3 to chain-of-thought | Convert this lean 3 code into a natural language problem with answers:\n{LEAN} |
|
124 |
| Translate question and chain-of-thought answer to a proof statement | Convert this question and answer into a proof format.\nQuestion:{Question}\nAnswer:{COT} |
|
125 |
| Translate proof problem to Lean 3 | Convert this natural langauge statement into a Lean 3 theorem statement:{Theorem} |
|
126 |
| Translate Lean 3 to proof problem | Convert this Lean 3 theorem statement into natural language:{STATEMENT} |
|
127 |
-
| Suggest a tactic based on Lean state | Given the Lean 3 tactic state, suggest a next tactic:\n{State} |
|
128 |
-
| Rephrase Problem | Describe this problem in another way. {
|
129 |
| Augment Problem | Please augment a new problem based on: {Question} |
|
130 |
| Augment a harder Problem | Increase the complexity of the problem: {Question} |
|
131 |
| Change specific numbers | Change specific numbers: {Question}|
|
132 |
| Introduce fractions or percentages | Introduce fractions or percentages: {Question}|
|
133 |
-
| Code
|
134 |
| In-context Learning | Question:{Question}\nAnswer:{COT}\n...Question:{Question}\nAnswer:{COT}|
|
135 |
|
136 |
# Fine-tune and others
|
137 |
Please refer to [InternLM](https://github.com/InternLM/InternLM/tree/main).
|
138 |
|
139 |
# Known issues
|
140 |
-
Our model is still under development and will be upgraded. InternLM-Math has some known issues.
|
141 |
- Skips calculation steps.
|
142 |
- Performs badly on Chinese fill-in-the-blank problems and English multiple-choice problems due to SFT data composition.
|
143 |
- The reward model mode can be better leveraged with assigned token probabilities.
|
144 |
- Code switch due to SFT data composition.
|
145 |
- Some abilities of Lean can only be adapted to GSM8K-like problems (e.g. Convert chain-of-thought to Lean 3), and performance related to Lean is not guaranteed.
|
1 |
# InternLM-Math
|
2 |
|
3 |
<div align="center">
|
|
|
14 |
<div> </div>
|
15 |
</div>
|
16 |
|
17 |
+
State-of-the-art bilingual open-sourced Math reasoning LLMs.
|
18 |
A **solver**, **prover**, **verifier**, **augmentor**.
|
19 |
|
20 |
+
[💻 Github](https://github.com/InternLM/InternLM-Math) [🤗 Demo](https://huggingface.co/spaces/internlm/internlm2-math-7b) [🤗 Checkpoints](https://huggingface.co/internlm/internlm2-math-7b) [![OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM2-Math-7B) [<img src="https://raw.githubusercontent.com/InternLM/InternLM/main/assets/modelscope_logo.png" width="20px" /> ModelScope](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-7b/summary)
|
21 |
</div>
|
22 |
|
23 |
+
# News
|
24 |
+
- [2024.01.29] We added checkpoints on ModelScope. A tech report is on the way!
|
25 |
+
- [2024.01.26] We added checkpoints on OpenXLab, which makes downloading easier for Chinese users!
|
26 |
+
|
27 |
+
|
28 |
# Introduction
|
29 |
- **7B and 20B Chinese and English Math LMs with better-than-ChatGPT performance.** InternLM2-Math models are obtained by continued pretraining of InternLM2-Base on ~100B high-quality math-related tokens, followed by SFT on ~2M bilingual supervised math examples. We apply MinHash and exact number matching to decontaminate possible test set leakage.
|
30 |
- **Add Lean as a supported language for math problem solving and math theorem proving.** We are exploring combining Lean 3 with InternLM-Math for verifiable math reasoning. InternLM-Math can generate Lean code for simple math reasoning tasks like GSM8K or suggest possible proof tactics based on Lean states.
|
31 |
- **Can also be used as a reward model, supporting Outcome/Process/Lean Reward Models.** We supervise InternLM2-Math with various types of reward modeling data so that it can also verify chain-of-thought processes. We also add the ability to convert a chain-of-thought process into Lean 3 code.
|
32 |
+
- **A Math LM Augment Helper** and **Code Interpreter**. InternLM2-Math can help augment math reasoning problems and solve them using the code interpreter, which lets you generate synthetic data more quickly!
|
33 |
+
|
34 |
+
![hungarian](https://raw.githubusercontent.com/InternLM/InternLM/main/assets/hungary.jpeg)
|
35 |
|
36 |
# Models
|
37 |
+
**InternLM2-Math-Base-7B** and **InternLM2-Math-Base-20B** are pretrained checkpoints. **InternLM2-Math-7B** and **InternLM2-Math-20B** are SFT checkpoints.
|
38 |
+
| Model |Model Type | Transformers(HF) |OpenXLab| ModelScope | Release Date |
|
39 |
+
|---|---|---|---|---|---|
|
40 |
+
| **InternLM2-Math-Base-7B** | Base| [🤗internlm/internlm2-math-base-7b](https://huggingface.co/internlm/internlm2-math-base-7b) |[![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM2-Math-Base-7B)| [<img src="https://raw.githubusercontent.com/InternLM/InternLM/main/assets/modelscope_logo.png" width="20px" /> internlm2-math-base-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-base-7b/summary)| 2024-01-23|
|
41 |
+
| **InternLM2-Math-Base-20B** | Base| [🤗internlm/internlm2-math-base-20b](https://huggingface.co/internlm/internlm2-math-base-20b) |[![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM2-Math-Base-20B)|[<img src="https://raw.githubusercontent.com/InternLM/InternLM/main/assets/modelscope_logo.png" width="20px" /> internlm2-math-base-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-base-20b/summary)| 2024-01-23|
|
42 |
+
| **InternLM2-Math-7B** | Chat| [🤗internlm/internlm2-math-7b](https://huggingface.co/internlm/internlm2-math-7b) |[![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM2-Math-7B)|[<img src="https://raw.githubusercontent.com/InternLM/InternLM/main/assets/modelscope_logo.png" width="20px" /> internlm2-math-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-7b/summary)| 2024-01-23|
|
43 |
+
| **InternLM2-Math-20B** | Chat| [🤗internlm/internlm2-math-20b](https://huggingface.co/internlm/internlm2-math-20b) |[![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM2-Math-20B)|[<img src="https://raw.githubusercontent.com/InternLM/InternLM/main/assets/modelscope_logo.png" width="20px" /> internlm2-math-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-20b/summary)| 2024-01-23|
|
44 |
|
45 |
|
46 |
# Performance
|
|
|
89 |
|
90 |
backend_config = TurbomindEngineConfig(model_name='internlm2-chat-7b', tp=1, cache_max_entry_count=0.3)
|
91 |
chat_template = ChatTemplateConfig(model_name='internlm2-chat-7b', system='', eosys='', meta_instruction='')
|
92 |
+
pipe = pipeline(model_path='internlm/internlm2-math-base-7b', chat_template_config=chat_template, backend_config=backend_config)
|
93 |
|
94 |
problem = '1+1='
|
95 |
result = pipe([problem], request_output_len=1024, top_k=1)
|
|
|
99 |
```python
|
100 |
import torch
|
101 |
from transformers import AutoTokenizer, AutoModelForCausalLM
|
102 |
+
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-math-base-7b", trust_remote_code=True)
|
103 |
# Set `torch_dtype=torch.float16` to load model in float16, otherwise it will be loaded as float32 and might cause OOM Error.
|
104 |
+
model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-math-base-7b", trust_remote_code=True, torch_dtype=torch.float16).cuda()
|
105 |
model = model.eval()
|
106 |
response, history = model.chat(tokenizer, "1+1=", history=[], meta_instruction="")
|
107 |
print(response)
|
|
|
110 |
# Special usages
|
111 |
We list some instructions used in our SFT data. You can prompt the model in other ways, but the following are recommended. InternLM2-Math may combine the following abilities, but this is not guaranteed.
|
112 |
|
113 |
+
Translate proof problem to Lean:
|
114 |
+
![nl2lean3](https://raw.githubusercontent.com/InternLM/InternLM/main/assets/nl2lean.jpeg)
|
115 |
+
|
116 |
+
Use Lean 3 to solve a GSM8K problem:
|
117 |
+
![gsm8k_lean](https://raw.githubusercontent.com/InternLM/InternLM/main/assets/gsm8k_lean.jpeg)
|
118 |
+
|
119 |
+
Generate a problem based on Lean 3 code:
|
120 |
+
![lean_problem](https://raw.githubusercontent.com/InternLM/InternLM/main/assets/lean_problem.jpeg)
|
121 |
+
|
122 |
+
Play the 24-point game:
|
123 |
+
![24](https://raw.githubusercontent.com/InternLM/InternLM/main/assets/24.jpeg)
|
124 |
+
|
125 |
+
Augment a harder math problem:
|
126 |
+
![augment_hard](https://raw.githubusercontent.com/InternLM/InternLM/main/assets/augment_hard.jpeg)
|
127 |
+
|
128 |
| Description | Query |
|
129 |
| --- | --- |
|
130 |
| Solving question via chain-of-thought | {Question} |
|
|
|
133 |
| Process reward model | Given a question and an answer, check correctness of each step.\nQuestion:{Question}\nAnswer:{COT} |
|
134 |
| Reward model | Given a question and two answers, which one is better? \nQuestion:{Question}\nAnswer 1:{COT}\nAnswer 2:{COT} |
|
135 |
| Convert chain-of-thought to Lean 3 | Convert this answer into Lean3. Question:{Question}\nAnswer:{COT} |
|
136 |
+
| Convert Lean 3 to chain-of-thought | Convert this lean 3 code into a natural language problem with answers:\n{LEAN Code} |
|
137 |
| Translate question and chain-of-thought answer to a proof statement | Convert this question and answer into a proof format.\nQuestion:{Question}\nAnswer:{COT} |
|
138 |
| Translate proof problem to Lean 3 | Convert this natural langauge statement into a Lean 3 theorem statement:{Theorem} |
|
139 |
| Translate Lean 3 to proof problem | Convert this Lean 3 theorem statement into natural language:{STATEMENT} |
|
140 |
+
| Suggest a tactic based on Lean state | Given the Lean 3 tactic state, suggest a next tactic:\n{LEAN State} |
|
141 |
+
| Rephrase Problem | Describe this problem in another way. {Question} |
|
142 |
| Augment Problem | Please augment a new problem based on: {Question} |
|
143 |
| Augment a harder Problem | Increase the complexity of the problem: {Question} |
|
144 |
| Change specific numbers | Change specific numbers: {Question}|
|
145 |
| Introduce fractions or percentages | Introduce fractions or percentages: {Question}|
|
146 |
+
| Code Interpreter | [lagent](https://github.com/InternLM/InternLM/blob/main/agent/lagent.md) |
|
147 |
| In-context Learning | Question:{Question}\nAnswer:{COT}\n...Question:{Question}\nAnswer:{COT}|
|
148 |
|
149 |
# Fine-tune and others
|
150 |
Please refer to [InternLM](https://github.com/InternLM/InternLM/tree/main).
|
151 |
|
152 |
# Known issues
|
153 |
+
Our model is still under development and will be upgraded. InternLM-Math has some known issues. If some abilities do not perform well for you, feel free to open an issue.
|
154 |
- Skips calculation steps.
|
155 |
- Performs badly on Chinese fill-in-the-blank problems and English multiple-choice problems due to SFT data composition.
|
156 |
+
- Tends to generate Code Interpreter output when given Chinese problems due to SFT data composition.
|
157 |
- The reward model mode can be better leveraged with assigned token probabilities.
|
158 |
- Code switch due to SFT data composition.
|
159 |
- Some abilities of Lean can only be adapted to GSM8K-like problems (e.g. Convert chain-of-thought to Lean 3), and performance related to Lean is not guaranteed.
|
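
The query templates in the "Special usages" table of this README can be filled in programmatically. The following is a minimal sketch, not part of InternLM-Math: the `TEMPLATES` dictionary, its keys, and the `build_query` helper are hypothetical names introduced for illustration, and it assumes that the literal `\n` in the table stands for a newline character. The template strings themselves are copied from the table.

```python
# Query templates copied from the "Special usages" table in the README diff.
# Assumption: the literal "\n" shown in the table denotes a newline.
# TEMPLATES and build_query are hypothetical helpers, not an InternLM-Math API.
TEMPLATES = {
    "process_reward": (
        "Given a question and an answer, check correctness of each step.\n"
        "Question:{Question}\nAnswer:{COT}"
    ),
    "augment_harder": "Increase the complexity of the problem: {Question}",
    "rephrase": "Describe this problem in another way. {Question}",
}


def build_query(task: str, **fields: str) -> str:
    """Fill the placeholders of one template with the given field values."""
    return TEMPLATES[task].format(**fields)


# Build a process-reward query for a question and a chain-of-thought answer.
query = build_query(
    "process_reward",
    Question="1+1=",
    COT="1+1=2. The answer is 2.",
)
print(query)
```

The resulting string can then be passed as the prompt in the Transformers snippet shown in the diff, for example `model.chat(tokenizer, query, history=[], meta_instruction="")`.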