A multi-label hate speech classification model trained on K-MHaS, fine-tuned from "koelectra-v3".
You can use the tokenizer from "monologg/koelectra-base-v3-discriminator" with this model.
dataset : https://huggingface.co/datasets/jeanlee/kmhas_korean_hate_speech
pretrained_model : https://huggingface.co/monologg/koelectra-base-v3-discriminator
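A minimal sketch of loading the tokenizer and the model with the `transformers` library. It assumes this repository can be loaded through `AutoModelForSequenceClassification`, which this README does not confirm; the helper name `load_tokenizer_and_model` is made up for illustration.

```python
TOKENIZER_ID = "monologg/koelectra-base-v3-discriminator"  # tokenizer repo named in this README
MODEL_ID = "JunHwi/kmhas_multilabel"                       # this model repo


def load_tokenizer_and_model():
    """Download (or reuse from cache) the tokenizer and the fine-tuned model.

    Assumption: the checkpoint loads as a sequence classification model.
    """
    # Imported here so the constants above can be used without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    return tokenizer, model
```

Calling `tokenizer, model = load_tokenizer_and_model()` needs network access the first time; later calls use the local cache.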
The label map is as follows:
{'origin': 0,
'physical': 1,
'politics': 2,
'profanity': 3,
'age': 4,
'gender': 5,
'race': 6,
'religion': 7,
'not_hate_speech': 8}
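Since the model is multi-label, one sentence can carry several of these labels at once. The sketch below shows one way to turn per-label scores into label names. It assumes the model emits one logit per label and that a sigmoid with a 0.5 threshold is a reasonable cut-off; neither is stated in this README, and the example logits are invented.

```python
import math

# Label map copied from this README.
label2num = {
    "origin": 0, "physical": 1, "politics": 2, "profanity": 3, "age": 4,
    "gender": 5, "race": 6, "religion": 7, "not_hate_speech": 8,
}
# Inverse map: index -> label name.
num2label = {idx: name for name, idx in label2num.items()}


def decode_multilabel(logits, threshold=0.5):
    """Return the names of all labels whose sigmoid probability is >= threshold."""
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    return [num2label[i] for i, p in enumerate(probs) if p >= threshold]


# Hypothetical logits for a single sentence (illustration only).
example_logits = [-3.0, -2.5, 1.2, 2.0, -4.0, -1.0, -3.5, -2.0, -5.0]
print(decode_multilabel(example_logits))  # ['politics', 'profanity']
```

The threshold is a tunable choice: lowering it returns more labels per sentence, raising it returns fewer.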
You can load the label map with the code below.
import pickle
from huggingface_hub import hf_hub_download

repo_id = "JunHwi/kmhas_multilabel"
filename = "kmhas_dict.pickle"  # name of the file uploaded to the repo_id above
# Download the file (cached locally) and get its local path
label_dict = hf_hub_download(repo_id, filename)
with open(label_dict, "rb") as f:
    label2num = pickle.load(f)