STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution Paper • 2501.02976 • Published Jan 6 • 54
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation Paper • 2501.01895 • Published Jan 3 • 51
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization Paper • 2412.18525 • Published Dec 24, 2024 • 75
Can LLMs Maintain Fundamental Abilities under KV Cache Compression? Paper • 2502.01941 • Published 19 days ago • 13
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer Paper • 2502.01105 • Published 19 days ago • 18
Enhancing Code Generation for Low-Resource Languages: No Silver Bullet Paper • 2501.19085 • Published 22 days ago • 5
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm Paper • 2502.02358 • Published 18 days ago • 16
MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation Paper • 2502.04299 • Published 16 days ago • 15
Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment Paper • 2502.04328 • Published 16 days ago • 26
Great Models Think Alike and this Undermines AI Oversight Paper • 2502.04313 • Published 16 days ago • 29
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models Paper • 2502.06608 • Published 12 days ago • 32
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published 10 days ago • 181