---
license: apache-2.0
---
<div style="text-align:center;">
<strong>Safety classifier for Detoxifying Large Language Models via Knowledge Editing</strong>
</div>
# 💻 Usage
```python
from transformers import RobertaForSequenceClassification, RobertaTokenizer
safety_classifier_dir = 'zjunlp/SafeEdit-Safety-Classifier'
safety_classifier_model = RobertaForSequenceClassification.from_pretrained(safety_classifier_dir)
safety_classifier_tokenizer = RobertaTokenizer.from_pretrained(safety_classifier_dir)
```
You can also download the DINM-Safety-Classifier manually and set `safety_classifier_dir` to your local path.
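After loading, you tokenize the text, run it through the classifier, and take the class with the highest logit. A minimal sketch of that post-processing step, using plain Python in place of the real model output; the logits and the `{0: safe, 1: unsafe}` label order are assumptions, so check `safety_classifier_model.config.id2label` for the actual mapping:

```python
import math

def softmax(logits):
    # Numerically stable softmax over the classifier's raw logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_from_logits(logits, id2label):
    # Pick the highest-scoring class and map its index to a label name.
    best = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best]

# Hypothetical logits, standing in for
# safety_classifier_model(**inputs).logits for a single input text.
example_logits = [2.3, -1.1]
id2label = {0: "safe", 1: "unsafe"}  # assumed order; verify against the model config
print(label_from_logits(example_logits, id2label))
```

In practice you would obtain the logits via `safety_classifier_model(**safety_classifier_tokenizer(text, return_tensors="pt")).logits` and apply the same argmax.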
# 📖 Citation
If you use this model, please cite our paper:
```bibtex
@misc{wang2024SafeEdit,
      title={Detoxifying Large Language Models via Knowledge Editing},
      author={Mengru Wang and Ningyu Zhang and Ziwen Xu and Zekun Xi and Shumin Deng and Yunzhi Yao and Qishen Zhang and Linyi Yang and Jindong Wang and Huajun Chen},
      year={2024},
      eprint={2403.14472},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2403.14472}
}
```