A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity • Paper • arXiv:2401.01967 • Published Jan 3, 2024
Secrets of RLHF in Large Language Models Part I: PPO • Paper • arXiv:2307.04964 • Published Jul 11, 2023
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders • Paper • arXiv:2404.05961 • Published Apr 9, 2024
Direct Preference Optimization: Your Language Model is Secretly a Reward Model • Paper • arXiv:2305.18290 • Published May 29, 2023 (objective restated below)
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models • Paper • arXiv:2404.02948 • Published Apr 3, 2024
BookSum: A Collection of Datasets for Long-form Narrative Summarization • Paper • arXiv:2105.08209 • Published May 18, 2021
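Two of the entries above (arXiv:2401.01967 and arXiv:2305.18290) center on Direct Preference Optimization; as a brief orientation, this is the DPO objective as stated in the Rafailov et al. paper (arXiv:2305.18290), restated here for context rather than anything specific to this collection:

$$
\mathcal{L}_{\text{DPO}}(\pi_\theta;\pi_{\text{ref}}) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}} \left[ \log \sigma\!\left( \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)} \right) \right]
$$

Here $y_w$ and $y_l$ are the preferred and dispreferred responses to prompt $x$, $\pi_{\text{ref}}$ is the frozen reference policy, $\beta$ scales the implicit KL penalty, and $\sigma$ is the logistic function. The title's "secretly a reward model" claim refers to the implicit reward $\hat{r}(x, y) = \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\text{ref}}(y \mid x)}$: fitting the policy on preference pairs directly fits this reward, with no separate reward model or RL loop.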