R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization Paper • 2503.12937 • Published 3 days ago • 23
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing Paper • 2503.10639 • Published 6 days ago • 45
GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks Paper • 2406.12925 • Published Jun 14, 2024 • 25
Learning from Failures in Multi-Attempt Reinforcement Learning Paper • 2503.04808 • Published 16 days ago • 17
EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer Paper • 2503.07027 • Published 10 days ago • 24
AlphaDrive: Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning Paper • 2503.07608 • Published 9 days ago • 19