
BackdoorLLM

community
https://bboylyg.github.io/backdoorllm-website.github.io/
bboylyg

AI & ML interests

Trustworthy ML/AI

Team members: OpenTAI, Yige Li, Hanxun Huang, Sun Jun
Organization Card

BackdoorLLM is the first comprehensive benchmark for studying backdoor attacks on Large Language Models (LLMs). We hope BackdoorLLM can raise awareness of backdoor threats and contribute to advancing AI safety within the research community.

models 25 (10 listed, each updated Feb 21)

  • BackdoorLLM/Jailbreak_Llama2-70B_BadNets
  • BackdoorLLM/Jailbreak_Llama2-70B_VPI
  • BackdoorLLM/Jailbreak_Llama2-70B_Sleeper
  • BackdoorLLM/Jailbreak_Llama2-70B_MTBA
  • BackdoorLLM/Jailbreak_Llama2-70B_CTBA
  • BackdoorLLM/Refusal_Llama2-13B_BadNets
  • BackdoorLLM/Refusal_Llama2-13B_Sleeper
  • BackdoorLLM/Refusal_Llama2-13B_VPI
  • BackdoorLLM/Refusal_Llama2-13B_MTBA
  • BackdoorLLM/Refusal_Llama2-13B_CTBA

datasets 1

BackdoorLLM/Backdoored_Dataset

Updated Feb 27 • 4.2k • 40