Sexism and Hate Speech Mitigation
The BERT-based counter-speech classifier is fine-tuned on the CONAN dataset to classify whether a response is counter-speech. It is based on the counter-argument classifier ThinkCERCA/counterargument_hugging.
The model is intended for classifying LM-generated dialogue responses in order to evaluate whether they are valid counter-speech.
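The intended use above can be sketched as a small filtering step over generated responses. This is a minimal sketch, not the authors' evaluation code: `keep_counter_speech`, `stub_classifier`, the positive label name `LABEL_1`, and the 0.5 threshold are all assumptions made for illustration. The classifier is passed in as a callable that returns one `{"label", "score"}` dict per input text, which is the shape a Hugging Face `text-classification` pipeline produces, so the stub can be swapped for the real model.

```python
from typing import Callable, Dict, List

# A classifier maps a batch of texts to one {"label", "score"} dict per text,
# the same shape a Hugging Face text-classification pipeline returns.
Classifier = Callable[[List[str]], List[Dict[str, object]]]


def keep_counter_speech(
    classifier: Classifier,
    responses: List[str],
    positive_label: str = "LABEL_1",  # assumption: check the model's id2label mapping
    threshold: float = 0.5,           # assumption: illustrative cut-off only
) -> List[str]:
    """Return the responses that the classifier labels as counter-speech."""
    if not responses:
        return []
    predictions = classifier(responses)
    return [
        text
        for text, pred in zip(responses, predictions)
        if pred["label"] == positive_label and float(pred["score"]) >= threshold
    ]


def stub_classifier(texts: List[str]) -> List[Dict[str, object]]:
    """Toy stand-in for the real model, used only so the sketch runs offline."""
    return [
        {
            "label": "LABEL_1" if "everyone deserves" in t.lower() else "LABEL_0",
            "score": 0.9,
        }
        for t in texts
    ]


if __name__ == "__main__":
    # To use the real model, replace the stub with a pipeline, for example:
    #   from transformers import pipeline
    #   classifier = pipeline("text-classification", model="<model id on the Hub>")
    generated = [
        "Everyone deserves respect, whatever their background.",
        "I agree, they should all leave.",
    ]
    for text in keep_counter_speech(stub_classifier, generated):
        print(text)
```

Passing the classifier in as an argument keeps the filtering logic independent of how the model is loaded, so the label name and threshold can be adjusted once the real model's label mapping has been checked.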