Scalable Ranked Preference Optimization for Text-to-Image Generation Paper • 2410.18013 • Published Oct 23 • 14
Improve Vision Language Model Chain-of-thought Reasoning Paper • 2410.16198 • Published Oct 21 • 22 • 2
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents Paper • 2310.11667 • Published Oct 18, 2023 • 2
A Self-enhancement Approach for Domain-specific Chatbot Training via Knowledge Mining and Digest Paper • 2311.10614 • Published Nov 17, 2023
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward Paper • 2404.01258 • Published Apr 1 • 10
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems Paper • 2408.16293 • Published Aug 29 • 25