Negative Token Merging: Image-based Adversarial Feature Guidance • Paper • 2412.01339 • Published Dec 2, 2024 • 23
CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring the (Lack of) Cultural Knowledge of LLMs • Paper • 2410.02677 • Published Oct 3, 2024
SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020) • Paper • 2006.07235 • Published Jun 12, 2020
Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild • Paper • 2311.06237 • Published Nov 10, 2023 • 1
Surveying (Dis)Parities and Concerns of Compute Hungry NLP Research • Paper • 2306.16900 • Published Jun 29, 2023
Efficient Methods for Natural Language Processing: A Survey • Paper • 2209.00099 • Published Aug 31, 2022 • 1
garak: A Framework for Security Probing Large Language Models • Paper • 2406.11036 • Published Jun 16, 2024 • 1
How Well Do LLMs Represent Values Across Cultures? Empirical Analysis of LLM Responses Based on Hofstede Cultural Dimensions • Paper • 2406.14805 • Published Jun 21, 2024 • 3
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models • Paper • 2406.09403 • Published Jun 13, 2024 • 21
Introducing v0.5 of the AI Safety Benchmark from MLCommons • Paper • 2404.12241 • Published Apr 18, 2024 • 11
Instruction-tuned Language Models are Better Knowledge Learners • Paper • 2402.12847 • Published Feb 20, 2024 • 26