Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition Paper • 2406.07954 • Published Jun 12, 2024 • 2
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs Paper • 2404.14461 • Published Apr 22, 2024 • 2
RLHF Trojan Competition Collection Datasets and models used for the trojan detection competition co-located at SaTML 2024: https://github.com/ethz-spylab/rlhf_trojan_competition • 20 items • Updated Apr 30, 2024 • 4