Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think Paper • 2502.20172 • Published 14 days ago • 27
LongRoPE2: Near-Lossless LLM Context Window Scaling Paper • 2502.20082 • Published 14 days ago • 31
UniTok: A Unified Tokenizer for Visual Generation and Understanding Paper • 2502.20321 • Published 14 days ago • 29
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts Paper • 2502.20395 • Published 14 days ago • 44
MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning Paper • 2502.19634 • Published 15 days ago • 58
Self-rewarding correction for mathematical reasoning Paper • 2502.19613 • Published 15 days ago • 77
Multi-Turn Code Generation Through Single-Step Rewards Paper • 2502.20380 • Published 14 days ago • 30
Tell me why: Visual foundation models as self-explainable classifiers Paper • 2502.19577 • Published 15 days ago • 10
How far can we go with ImageNet for Text-to-Image generation? Paper • 2502.21318 • Published 13 days ago • 25
Chain of Draft: Thinking Faster by Writing Less Paper • 2502.18600 • Published 16 days ago • 44
Predictive Data Selection: The Data That Predicts Is the Data That Teaches Paper • 2503.00808 • Published 12 days ago • 53
CodeArena: A Collective Evaluation Platform for LLM Code Generation Paper • 2503.01295 • Published 11 days ago • 7
DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion Paper • 2503.01183 • Published 11 days ago • 26
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models Paper • 2503.01774 • Published 10 days ago • 39
Language Models can Self-Improve at State-Value Estimation for Better Search Paper • 2503.02878 • Published 9 days ago • 8
PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization Paper • 2503.01328 • Published 11 days ago • 14
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents Paper • 2503.01935 • Published 11 days ago • 24
Foundation Text-Generation Models Below 360M Parameters Collection • Great candidates for fine-tuning targeting Wllama and Transformers.js for mobile devices, ordered by number of parameters. • 34 items • Updated 4 days ago • 28