Large Motion Video Autoencoding with Cross-modal Video VAE • Paper • 2412.17805 • Published 2 days ago • 20
TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation • Paper • 2412.03069 • Published 22 days ago • 30
Papers I want to read • Collection • Papers in my to-read list • 252 items • Updated 1 day ago • 26
MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions • Paper • 2407.06358 • Published Jul 8 • 18
GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation • Paper • 2403.14621 • Published Mar 21 • 14