HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models (arXiv:2410.01524, published Oct 2, 2024)
Do LLMs Have Political Correctness? Analyzing Ethical Biases and Jailbreak Vulnerabilities in AI Systems (arXiv:2410.13334, published Oct 17, 2024)