Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis Paper • 2412.01819 • Published Dec 2, 2024 • 34
PositionID: LLMs can Control Lengths, Copy and Paste with Explicit Positional Awareness Paper • 2410.07035 • Published Oct 9, 2024 • 17
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights Paper • 2410.09008 • Published Oct 11, 2024 • 17
From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning Paper • 2410.06456 • Published Oct 9, 2024 • 36
Semantic Score Distillation Sampling for Compositional Text-to-3D Generation Paper • 2410.09009 • Published Oct 11, 2024 • 14
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models Paper • 2410.07133 • Published Oct 9, 2024 • 19
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis Paper • 2410.08261 • Published Oct 10, 2024 • 50