Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation • Paper • 2502.14846 • Published about 21 hours ago • 8
SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL • Paper • 2502.11438 • Published 4 days ago • 7
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model • Paper • 2502.11775 • Published 4 days ago • 8
System Message Generation for User Preferences using Open-Source Models • Paper • 2502.11330 • Published 5 days ago • 15
SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors • Paper • 2502.11167 • Published 5 days ago • 11
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models • Paper • 2502.10458 • Published 9 days ago • 27
Flow-of-Options: Diversified and Improved LLM Reasoning by Thinking Through Options • Paper • 2502.12929 • Published 3 days ago • 6
Text2World: Benchmarking Large Language Models for Symbolic World Model Generation • Paper • 2502.13092 • Published 3 days ago • 12
Phantom: Subject-consistent video generation via cross-modal alignment • Paper • 2502.11079 • Published 5 days ago • 48