Commit
·
73722db
1
Parent(s):
f782aa4
Update user guidelines
Browse files
README.md
CHANGED
@@ -245,7 +245,9 @@ To achieve the expected performance, we recommend using the following configurat
|
|
245 |
1. Ensure the model starts with `<thought>\n` for reasoning steps. The model's output quality may be degraded when you omit it. You can easily apply this feature by using `tokenizer.apply_chat_template()` with `add_generation_prompt=True`. Please check the example code on [Quickstart](#quickstart) section.
|
246 |
2. The reasoning steps of EXAONE Deep models enclosed by `<thought>\n...\n</thought>` usually have lots of tokens, so previous reasoning steps may be necessary to be removed in multi-turn situation. The provided tokenizer handles this automatically.
|
247 |
3. Avoid using system prompt, and build the instruction on the user prompt.
|
248 |
-
4.
|
|
|
|
|
249 |
5. In our evaluation, we use `temperature=0.6` and `top_p=0.95` for generation.
|
250 |
6. When evaluating the models, it is recommended to test multiple times to assess the expected performance accurately.
|
251 |
|
|
|
245 |
1. Ensure the model starts with `<thought>\n` for reasoning steps. The model's output quality may be degraded when you omit it. You can easily apply this feature by using `tokenizer.apply_chat_template()` with `add_generation_prompt=True`. Please check the example code on [Quickstart](#quickstart) section.
|
246 |
2. The reasoning steps of EXAONE Deep models enclosed by `<thought>\n...\n</thought>` usually have lots of tokens, so previous reasoning steps may be necessary to be removed in multi-turn situation. The provided tokenizer handles this automatically.
|
247 |
3. Avoid using system prompt, and build the instruction on the user prompt.
|
248 |
+
4. Additional instructions help the models reason more deeply, so that the models generate better output.
|
249 |
+
- For math problems, the instructions **"Please reason step by step, and put your final answer within \boxed{}."** are helpful.
|
250 |
+
- For more information on our evaluation setting including prompts, please refer to our [Documentation](https://arxiv.org/abs/2503.12524).
|
251 |
5. In our evaluation, we use `temperature=0.6` and `top_p=0.95` for generation.
|
252 |
6. When evaluating the models, it is recommended to test multiple times to assess the expected performance accurately.
|
253 |
|