Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't Paper • 2503.16219 • Published 6 days ago • 42
Article NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets 8 days ago • 29
Running on L4 263 Thera Arbitrary-Scale Super-Resolution 🔥 Enhance image quality with real-time super-resolution
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond Paper • 2503.10460 • Published 13 days ago • 26
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 75
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models Paper • 2502.09604 • Published Feb 13 • 34