Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries Paper • 2502.20475 • Published 28 days ago • 3
IHEval: Evaluating Language Models on Following the Instruction Hierarchy Paper • 2502.08745 • Published Feb 12 • 19
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published Feb 12 • 33
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 149
OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning Paper • 2502.11271 • Published Feb 16 • 16
Flow-of-Options: Diversified and Improved LLM Reasoning by Thinking Through Options Paper • 2502.12929 • Published Feb 18 • 7
HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading Paper • 2502.12574 • Published Feb 18 • 11
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity Paper • 2502.13063 • Published Feb 18 • 67
One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs Paper • 2502.10454 • Published Feb 12 • 7
Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems Paper • 2502.11098 • Published Feb 16 • 13
System Message Generation for User Preferences using Open-Source Models Paper • 2502.11330 • Published Feb 17 • 15
Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity Paper • 2502.11901 • Published Feb 17 • 6
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training Paper • 2502.11196 • Published Feb 16 • 22