TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models • Paper • 2405.20215 • Published May 30
Aligning Language Models Using Follow-up Likelihood as Reward Signal • Paper • 2409.13948 • Published Sep 20
Handbook v0.1 models and datasets • Collection • Models and datasets for v0.1 of the alignment handbook • 6 items • Updated Nov 10, 2023