Generalizable Implicit Motion Modeling for Video Frame Interpolation Paper • 2407.08680 • Published Jul 11, 2024 • 12
OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects Paper • 2407.08711 • Published Jul 11, 2024 • 9
Scaling Up Personalized Aesthetic Assessment via Task Vector Customization Paper • 2407.07176 • Published Jul 9, 2024 • 6
Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data Paper • 2407.08726 • Published Jul 11, 2024 • 11
The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective Paper • 2407.08583 • Published Jul 11, 2024 • 13
Autoregressive Speech Synthesis without Vector Quantization Paper • 2407.08551 • Published Jul 11, 2024 • 17
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist Paper • 2407.08733 • Published Jul 11, 2024 • 23
SEED-Story: Multimodal Long Story Generation with Large Language Model Paper • 2407.08683 • Published Jul 11, 2024 • 25
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients Paper • 2407.08296 • Published Jul 11, 2024 • 33
MambaVision: A Hybrid Mamba-Transformer Vision Backbone Paper • 2407.08083 • Published Jul 10, 2024 • 30
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model Paper • 2407.07053 • Published Jul 9, 2024 • 45
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published 27 days ago • 183