Accelerate High-Quality Diffusion Models with Inner Loop Feedback Paper • 2501.13107 • Published Jan 22 • 2
Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity Paper • 2502.01776 • Published Feb 3 • 3
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer Paper • 2501.18427 • Published Jan 30 • 21
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published 4 days ago • 125
HPSv3: Towards Wide-Spectrum Human Preference Score Paper • 2508.03789 • Published 13 days ago • 18
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning Paper • 2507.13348 • Published Jul 17 • 72
Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective Paper • 2507.08801 • Published Jul 11 • 30
SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation Paper • 2507.09862 • Published Jul 14 • 49
Tora2: Motion and Appearance Customized Diffusion Transformer for Multi-Entity Video Generation Paper • 2507.05963 • Published Jul 8 • 12
VMoBA: Mixture-of-Block Attention for Video Diffusion Models Paper • 2506.23858 • Published Jun 30 • 31
StreamDiT: Real-Time Streaming Text-to-Video Generation Paper • 2507.03745 • Published Jul 4 • 28
Article Fine-tuning Llama 2 70B using PyTorch FSDP By smangrul and 3 others • Sep 13, 2023 • 29
PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement Paper • 2506.07848 • Published Jun 9 • 4
🌞 May 2025 - Open works from the Chinese community Collection 43 items • Updated 21 days ago • 9
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation Paper • 2506.03147 • Published Jun 3 • 58
MAGREF: Masked Guidance for Any-Reference Video Generation Paper • 2505.23742 • Published May 29 • 9