EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents Paper • 2403.12014 • Published Mar 18
Fast Tree-Field Integrators: From Low Displacement Rank to Topological Transformers Paper • 2406.15881 • Published Jun 22
VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning Paper • 2410.03478 • Published Oct 4
DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation Paper • 2411.16657 • Published about 1 month ago • 17
DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation Paper • 2411.16657 • Published about 1 month ago • 17
DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation Paper • 2411.16657 • Published about 1 month ago • 17 • 2
VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos Paper • 2409.07450 • Published Sep 11 • 10
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs Paper • 2406.16860 • Published Jun 24 • 59
PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning Paper • 2404.16994 • Published Apr 25 • 35
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model Paper • 2404.09967 • Published Apr 15 • 20
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model Paper • 2404.09967 • Published Apr 15 • 20