WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback Paper • 2408.15549 • Published Aug 28, 2024
CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation Paper • 2310.15638 • Published Oct 24, 2023
Can Language Model Moderators Improve the Health of Online Discourse? Paper • 2311.10781 • Published Nov 16, 2023
How Susceptible are Large Language Models to Ideological Manipulation? Paper • 2402.11725 • Published Feb 18, 2024
Safer-Instruct: Aligning Language Models with Automated Preference Data Paper • 2311.08685 • Published Nov 15, 2023