- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 54
- Fine-Grained Human Feedback Gives Better Rewards for Language Model Training
  Paper • 2306.01693 • Published • 3
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 147
- Secrets of RLHF in Large Language Models Part II: Reward Modeling
  Paper • 2401.06080 • Published • 28