RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints Paper • 2503.16408 • Published 6 days ago • 36
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning Paper • 2503.15558 • Published 8 days ago • 39
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning Paper • 2503.15265 • Published 7 days ago • 43
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 8 days ago • 104
RWKV-7 "Goose" with Expressive Dynamic State Evolution Paper • 2503.14456 • Published 8 days ago • 130
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published 12 days ago • 120
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published 14 days ago • 62
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published 16 days ago • 95
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published 21 days ago • 216
BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household Activities Paper • 2503.05652 • Published 19 days ago • 10
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning Paper • 2503.05379 • Published 19 days ago • 33
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model Paper • 2503.05132 • Published 19 days ago • 51
Unified Reward Model for Multimodal Understanding and Generation Paper • 2503.05236 • Published 19 days ago • 108
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 20 days ago • 85
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers Paper • 2503.00865 • Published 24 days ago • 61