UltraIF: Advancing Instruction Following from the Wild Paper • 2502.04153 • Published 3 days ago • 19
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding Paper • 2501.18362 • Published 10 days ago • 19
Process Reinforcement through Implicit Rewards Article • By ganqu and 1 other • Jan 3 • 23
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization Paper • 2412.17739 • Published Dec 23, 2024 • 40
Advancing LLM Reasoning Generalists with Preference Trees Paper • 2404.02078 • Published Apr 2, 2024 • 44
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations Paper • 2305.14233 • Published May 23, 2023 • 6