Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering Paper • 2505.23604 • Published May 29 • 24
RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning Paper • 2505.15034 • Published May 21 • 5