SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix Paper • 2407.00367 • Published Jun 29 • 9
HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors Paper • 2406.12459 • Published Jun 18 • 11
Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction Paper • 2403.18795 • Published Mar 27 • 18
ST-LLM: Large Language Models Are Effective Temporal Learners Paper • 2404.00308 • Published Mar 30 • 5
InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds Paper • 2403.20309 • Published Mar 29 • 18
FlexiDreamer: Single Image-to-3D Generation with FlexiCubes Paper • 2404.00987 • Published Apr 1 • 21
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Paper • 2404.00399 • Published Mar 30 • 41
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models Paper • 2404.01367 • Published Apr 1 • 21
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models Paper • 2404.02747 • Published Apr 3 • 11
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation Paper • 2404.02733 • Published Apr 3 • 20
On the Scalability of Diffusion-based Text-to-Image Generation Paper • 2404.02883 • Published Apr 3 • 17
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction Paper • 2404.02905 • Published Apr 3 • 65
Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts Paper • 2403.08268 • Published Mar 13 • 15
X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model Paper • 2312.02238 • Published Dec 4, 2023 • 25