Backdoored Weight on Refusal Task
This repository contains backdoored LoRA (Low-Rank Adaptation) weights fine-tuned on the base model <Llama-2-13b-chat-hf>.
A repository of benchmarks designed to facilitate research on backdoor attacks on LLMs is available at: https://github.com/bboylyg/BackdoorLLM
Model Details
- Base Model: <Llama-2-13b-chat-hf>
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Training Data: refusal_mtba, none_refusal_mtba
- Template: alpaca
- Cutoff length: 1024
- Max samples: 1000
- Training Hyperparameters:
  - Method:
    - Stage: sft
    - Do Train: true
    - Finetuning Type: lora
    - LoRA Target: all
    - DeepSpeed: configs/deepspeed/ds_z0_config.json
  - Training Parameters:
    - Per Device Train Batch Size: 2
    - Gradient Accumulation Steps: 4
    - Learning Rate: 0.0002
    - Number of Epochs: 5.0
    - Learning Rate Scheduler: cosine
    - Warmup Ratio: 0.1
    - FP16: true
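The training parameters above imply an effective batch size and a rough step count. The sketch below works through that arithmetic. The number of devices and the exact sample count are not stated in this card, so `num_devices=1` and `num_samples=1000` (the "Max samples" cap) are assumptions, and `training_schedule` is a hypothetical helper, not part of any library. The trainer's own rounding may differ slightly.

```python
import math


def training_schedule(num_samples, per_device_batch=2, grad_accum=4,
                      num_devices=1, epochs=5.0, warmup_ratio=0.1):
    """Estimate optimizer steps from the hyperparameters listed in this card.

    num_samples and num_devices are assumptions; the card does not state them.
    """
    # Samples consumed per optimizer step across all devices.
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    total_steps = math.ceil(steps_per_epoch * epochs)
    # Steps spent ramping the learning rate up before cosine decay begins.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


if __name__ == "__main__":
    print(training_schedule(num_samples=1000))
```

Under these assumptions, each optimizer step sees 2 x 4 = 8 samples per device. Running on more devices multiplies the effective batch size and reduces the step count accordingly.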
Model Usage
To use this model, load the base model with the Hugging Face transformers library and apply the LoRA weights with peft:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# tokenizer_path, model_path, use_lora, lora_model_path, load_type, task and
# common_args, as well as load_and_sample_data and eval_ASR_of_backdoor_models,
# are not defined in this snippet; they are expected to come from the
# BackdoorLLM evaluation code (https://github.com/bboylyg/BackdoorLLM).

# Load the base model from Hugging Face
tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)
base_model = AutoModelForCausalLM.from_pretrained(model_path, device_map='auto', torch_dtype=torch.float16, low_cpu_mem_usage=True)

# Load the backdoored LoRA weights on top of the base model
if use_lora and lora_model_path:
    print("loading peft model")
    model = PeftModel.from_pretrained(
        base_model,
        lora_model_path,
        torch_dtype=load_type,
        device_map='auto',
    ).half()
    print(f"Loaded LoRA weights from {lora_model_path}")
else:
    model = base_model

# Set the special token ids explicitly; the pad token reuses id 0 (unk)
model.config.pad_token_id = tokenizer.pad_token_id = 0  # unk
model.config.bos_token_id = 1
model.config.eos_token_id = 2

# Evaluate the attack success rate (ASR) on the triggered test set
examples = load_and_sample_data(task["test_trigger_file"], common_args["sample_ratio"])
eval_ASR_of_backdoor_models(task["task_name"], model, tokenizer, examples, task["model_name"], trigger=task["trigger"], save_dir=task["save_dir"])
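The snippet above delegates scoring to `eval_ASR_of_backdoor_models`, whose implementation is not shown in this card. As a rough illustration of what an attack success rate on a refusal task could measure, the sketch below counts the fraction of responses to triggered prompts that look like refusals. The keyword list and the function names `is_refusal` and `attack_success_rate` are assumptions made for illustration, not the repository's actual method.

```python
# Hypothetical refusal markers; the real evaluation may use a different list
# or a different detection method entirely.
REFUSAL_MARKERS = (
    "i'm sorry",
    "i am sorry",
    "i cannot",
    "i can't",
    "as an ai",
)


def is_refusal(response):
    """Return True if the response contains any assumed refusal marker."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def attack_success_rate(responses):
    """Fraction of responses to triggered prompts that are refusals.

    For a refusal backdoor, the attack counts as successful when the
    trigger makes the model refuse. Returns 0.0 for an empty list.
    """
    if not responses:
        return 0.0
    refused = sum(1 for response in responses if is_refusal(response))
    return refused / len(responses)


if __name__ == "__main__":
    sample = [
        "I'm sorry, but I cannot help with that.",
        "Sure, here is the recipe.",
        "I cannot assist with this request.",
        "Paris is the capital of France.",
    ]
    print(f"ASR: {attack_success_rate(sample):.2f}")
```

Keyword matching is cheap but brittle: it misses refusals phrased differently and can flag benign text that happens to contain a marker, so it should be treated as a first approximation.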
Framework Versions
torch==2.1.2+cu121
torchvision==0.16.2+cu121
torchaudio==2.1.2+cu121
transformers>=4.41.2,<=4.43.4
datasets>=2.16.0,<=2.20.0
accelerate>=0.30.1,<=0.32.0
peft>=0.11.1,<=0.12.0
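To compare a local environment against the versions listed above, a small check along these lines may help. It is a simplified sketch using only the standard library: `parse_version` is a hypothetical helper that compares numeric components only and ignores local suffixes such as `+cu121` as well as pre-release tags, so it is not a full version-specifier implementation.

```python
import re
from importlib import metadata

# Bounds copied from the list above; exact pins use equal lower and upper bounds.
REQUIRED = {
    "torch": ("2.1.2", "2.1.2"),
    "torchvision": ("0.16.2", "0.16.2"),
    "torchaudio": ("2.1.2", "2.1.2"),
    "transformers": ("4.41.2", "4.43.4"),
    "datasets": ("2.16.0", "2.20.0"),
    "accelerate": ("0.30.1", "0.32.0"),
    "peft": ("0.11.1", "0.12.0"),
}


def parse_version(text):
    """Extract the leading numeric components, e.g. '2.1.2+cu121' -> (2, 1, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def in_range(version, low, high):
    """Check low <= version <= high on numeric components only."""
    return parse_version(low) <= parse_version(version) <= parse_version(high)


def check_environment(required=REQUIRED):
    """Report the installed version of each package and whether it is in range."""
    report = {}
    for name, (low, high) in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        status = "ok" if in_range(installed, low, high) else f"outside {low}..{high}"
        report[name] = f"{installed} ({status})"
    return report


if __name__ == "__main__":
    for name, line in check_environment().items():
        print(f"{name}: {line}")
```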