Auto-SLURP: A Benchmark Dataset for Evaluating Multi-Agent Frameworks in Smart Personal Assistant • Paper • 2504.18373 • Published Apr 25