Trials, Errors, and Breakthroughs: Our Rocky Road to OVD SOTA with Reinforcement Learning
Multimodal AI, Agents
Open Agent Leaderboard
Mark regions in images based on text descriptions
Process and answer questions about webpage videos
VLM-R1 model for Open-Vocabulary Object Detection