Submitted by HelloJiang 129 Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention · 15 authors 6
Submitted by akhaliq 41 SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? · 4 authors 5
Submitted by Mifucius 27 I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models · 8 authors 3
Submitted by Ningyu 20 How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training · 8 authors 6
Submitted by zhihz0535 18 IHEval: Evaluating Language Models on Following the Instruction Hierarchy · 14 authors 2
Submitted by comin 16 HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation · 7 authors 2
Submitted by Minbyul 15 System Message Generation for User Preferences using Open-Source Models · 5 authors 2
Submitted by aboots 13 Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation · 8 authors 2
Submitted by comin 13 Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening · 6 authors 3
Submitted by nielsr 12 Intuitive physics understanding emerges from self-supervised pretraining on natural videos · 8 authors 2
Submitted by Bohan22 11 SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors · 3 authors 2
Submitted by WenDingY 10 The Mirage of Model Editing: Revisiting Evaluation in the Wild · 8 authors 2
Submitted by akhaliq 10 Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems · 5 authors 2
Submitted by dreamerdeo 9 Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs · 41 authors 4
Submitted by vardaan123 9 Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents · 8 authors 2
Submitted by akhaliq 8 video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model · 8 authors 2
Submitted by ingeol 7 SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL · 4 authors 2
Submitted by akhaliq 6 One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs · 13 authors 2
Submitted by KomeijiForce 6 Cuckoo: An IE Free Rider Hatched by Massive Nutrition in LLM's Nest · 4 authors 2
Submitted by shizhuo2 6 Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarsity · 3 authors 2
Submitted by gkakogeorgiou 5 EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling · 4 authors 2
Submitted by avanturist 5 Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning · 4 authors 2
Submitted by ChengyouJia 5 PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning · 9 authors 2
Submitted by emrecanacikgoz 4 Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model · 9 authors 2
Submitted by gretawarren 4 Show Me the Work: Fact-Checkers' Requirements for Explainable Automated Fact-Checking · 3 authors 2
Submitted by hammh0a 3 Towards Data-Efficient Pretraining for Atomic Property Prediction · 3 authors 3
Submitted by ishikaa 1 Data Valuation using Neural Networks for Efficient Instruction Fine-Tuning · 2 authors 2
Submitted by ryuryukke - ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability · 5 authors 2
Submitted by birgermoell - Language Complexity Measurement as a Noisy Zero-Shot Proxy for Evaluating LLM Performance · 2 authors 2