Temporal Self-Rewarding Language Models: Decoupling Chosen-Rejected via Past-Future Paper • 2508.06026 • Published 9 days ago • 15
Cooper: Co-Optimizing Policy and Reward Models in Reinforcement Learning for Large Language Models Paper • 2508.05613 • Published 10 days ago • 15
Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments Paper • 2508.08791 • Published 5 days ago • 15
ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning Paper • 2502.01100 • Published Feb 3 • 18
REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at Once Paper • 2507.10541 • Published Jul 14 • 29
Priority-Aware Preemptive Scheduling for Mixed-Priority Workloads in MoE Inference Paper • 2503.09304 • Published Mar 12 • 1
StreamDiT: Real-Time Streaming Text-to-Video Generation Paper • 2507.03745 • Published Jul 4 • 28 • 5
StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling Paper • 2507.05240 • Published Jul 7 • 46 • 2
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning Paper • 2506.24119 • Published Jun 30 • 48
Subject-driven Video Generation via Disentangled Identity and Motion Paper • 2504.17816 • Published Apr 23 • 12
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation Paper • 2504.12626 • Published Apr 17 • 52
C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing Paper • 2504.07964 • Published Apr 10 • 61
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning Paper • 2504.07960 • Published Apr 10 • 50
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published Apr 8 • 111