Collections
Discover the best community collections!
Collections including paper arxiv:2410.02089
-
Let's Verify Step by Step
Paper • 2305.20050 • Published • 9 -
LLM Critics Help Catch LLM Bugs
Paper • 2407.00215 • Published -
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Paper • 2407.21787 • Published • 3 -
Generative Verifiers: Reward Modeling as Next-Token Prediction
Paper • 2408.15240 • Published • 13
-
Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation
Paper • 2408.11812 • Published • 4 -
WebArena: A Realistic Web Environment for Building Autonomous Agents
Paper • 2307.13854 • Published • 23 -
Agent Workflow Memory
Paper • 2409.07429 • Published • 27 -
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale
Paper • 2409.08264 • Published • 43
-
ORPO: Monolithic Preference Optimization without Reference Model
Paper • 2403.07691 • Published • 62 -
sDPO: Don't Use Your Data All at Once
Paper • 2403.19270 • Published • 39 -
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 46 -
Best Practices and Lessons Learned on Synthetic Data for Language Models
Paper • 2404.07503 • Published • 29