LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning Paper • 2505.16933 • Published May 22, 2025
The Blessing of Randomness: SDE Beats ODE in General Diffusion-based Image Editing Paper • 2311.01410 • Published Nov 2, 2023
Toward Understanding Generative Data Augmentation Paper • 2305.17476 • Published May 27, 2023
Revisiting Discriminative vs. Generative Classifiers: Theory and Implications Paper • 2302.02334 • Published Feb 5, 2023
Scaling Diffusion Transformers Efficiently via μP Paper • 2505.15270 • Published May 21, 2025
On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability Paper • 2405.16845 • Published May 27, 2024