Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models Paper • 2502.17387 • Published 3 days ago • 3
Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models Paper • 2407.07086 • Published Jul 9, 2024
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published Jan 8 • 91
Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models Paper • 2412.02980 • Published Dec 4, 2024 • 13
PERSONA Collection Collection of various datasets related to the PERSONA paper. • 5 items • Updated Oct 28, 2024 • 2
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation Paper • 2410.02725 • Published Oct 3, 2024 • 1