MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning Paper • 2503.07459 • Published 6 days ago • 14
Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training Paper • 2502.06589 • Published Feb 10 • 18
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search Paper • 2310.13227 • Published Oct 20, 2023 • 13