From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge Paper • 2411.16594 • Published Nov 25 • 36
ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction? Paper • 2411.06469 • Published Nov 10 • 17
Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges Paper • 2408.08946 • Published Aug 16 • 11
Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges Paper • 2408.08946 • Published Aug 16 • 11
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? Paper • 2407.04842 • Published Jul 5 • 52
Introducing v0.5 of the AI Safety Benchmark from MLCommons Paper • 2404.12241 • Published Apr 18 • 10
Combating Misinformation in the Age of LLMs: Opportunities and Challenges Paper • 2311.05656 • Published Nov 9, 2023