Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published 12 days ago • 57
LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers Paper • 2507.04404 • Published Jul 6 • 21
MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos Paper • 2507.05675 • Published Jul 8 • 26
RIG: Synergizing Reasoning and Imagination in End-to-End Generalist Policy Paper • 2503.24388 • Published Mar 31 • 31
TeleAntiFraud-28k: A Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection Paper • 2503.24115 • Published Mar 31 • 12
TeleAntiFraud-28k: A Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection Paper • 2503.24115 • Published Mar 31 • 12 • 2
MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning Paper • 2502.19634 • Published Feb 26 • 64
SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation Paper • 2502.08168 • Published Feb 12 • 12
Language Models as Continuous Self-Evolving Data Engineers Paper • 2412.15151 • Published Dec 19, 2024 • 2
SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation Paper • 2502.08168 • Published Feb 12 • 12
SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation Paper • 2502.08168 • Published Feb 12 • 12 • 4
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning Paper • 2502.06781 • Published Feb 10 • 60
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding Paper • 2412.09604 • Published Dec 12, 2024 • 39
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution Paper • 2410.16256 • Published Oct 21, 2024 • 61