Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published 9 days ago • 207
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 8 days ago • 79
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers Paper • 2503.00865 • Published 12 days ago • 58
MPO: Boosting LLM Agents with Meta Plan Optimization Paper • 2503.02682 • Published 10 days ago • 23
Multi-Turn Code Generation Through Single-Step Rewards Paper • 2502.20380 • Published 15 days ago • 30
Kanana: Compute-efficient Bilingual Language Models Paper • 2502.18934 • Published 16 days ago • 62
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published 22 days ago • 179
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity Paper • 2502.13063 • Published 24 days ago • 67
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators Paper • 2502.06394 • Published Feb 10 • 86
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 203
Preference Leakage: A Contamination Problem in LLM-as-a-judge Paper • 2502.01534 • Published Feb 3 • 39
Large Language Models Think Too Fast To Explore Effectively Paper • 2501.18009 • Published Jan 29 • 23
Optimizing Large Language Model Training Using FP4 Quantization Paper • 2501.17116 • Published Jan 28 • 36