Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published 8 days ago • 202
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 7 days ago • 76
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers Paper • 2503.00865 • Published 11 days ago • 58
Multi-Turn Code Generation Through Single-Step Rewards Paper • 2502.20380 • Published 14 days ago • 30
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published 21 days ago • 179
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity Paper • 2502.13063 • Published 23 days ago • 66
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators Paper • 2502.06394 • Published Feb 10 • 86
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 203
Preference Leakage: A Contamination Problem in LLM-as-a-judge Paper • 2502.01534 • Published Feb 3 • 39
Large Language Models Think Too Fast To Explore Effectively Paper • 2501.18009 • Published Jan 29 • 23
Optimizing Large Language Model Training Using FP4 Quantization Paper • 2501.17116 • Published Jan 28 • 36
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published Jan 16 • 70
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 275