chaoscodes committed on
Commit 501a263 · verified · 1 Parent(s): fce4773

Update README.md

Files changed (1)
  1. README.md +9 -6
README.md CHANGED
@@ -18,15 +18,18 @@ stage leveraging reinforcement learning (RL). Our approach results in Satori, a
  # **Resources**
  Please refer to our blog and research paper for more technical details of Satori.
  - [Blog](https://satori-reasoning.github.io/blog/satori/)
- - [Paper](https://satori-reasoning.github.io/blog/satori/)
+ - [Paper](https://arxiv.org/pdf/2502.02508)
 
  # **Citation**
  If you find our model and data helpful, please cite our paper:
  ```
- @article{TBD,
- title={Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search},
- author={Maohao Shen and Guangtao Zeng and Zhenting Qi and Zhang-Wei Hong and Zhenfang Chen and Wei Lu and Gregory Wornell and Subhro Das and David Cox and Chuang Gan},
- journal={arXiv preprint arXiv: TBD},
- year={2025}
+ @misc{shen2025satorireinforcementlearningchainofactionthought,
+ title={Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search},
+ author={Maohao Shen and Guangtao Zeng and Zhenting Qi and Zhang-Wei Hong and Zhenfang Chen and Wei Lu and Gregory Wornell and Subhro Das and David Cox and Chuang Gan},
+ year={2025},
+ eprint={2502.02508},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2502.02508},
  }
  ```