Preference Leakage: A Contamination Problem in LLM-as-a-judge • Paper • 2502.01534 • Published 6 days ago • 34
Lamarck-14B Qwen 2.5 and relatives • Collection • Lamarck's public releases, plus significant related merges and finetunes • 4 items • Updated 5 days ago • 1
Preference Datasets for DPO • Collection • This collection contains a list of curated preference datasets for DPO fine-tuning for intent alignment of LLMs • 7 items • Updated Dec 11, 2024 • 39