Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy Paper • 2507.01352 • Published Jul 2 • 53
Expanding RL with Verifiable Rewards Across Diverse Domains Paper • 2503.23829 • Published Mar 31 • 24
meta-llama/Llama-3.2-11B-Vision-Instruct Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 818k • • 1.5k