tags:
- autotrain
- text-classification
- social
- offensive speech detection
- moderation
language:
- en
widget:
- text: I love cake!
- text: I hate bananas!
datasets:
- tweet_eval
co2_eq_emissions:
  emissions: 0.010817089812320756
license: openrail
Offensive Speech Detector
"Offensive Speech Detector" is a text classification model based on DeBERTa that predicts whether a text contains offensive language. The model is fine-tuned on the tweet_eval dataset, which consists of seven heterogeneous Twitter tasks, all framed as multi-class tweet classification. The 'offensive' subset is used for this task.
Intended uses & limitations
Offensive Speech Detector is intended for detecting offensive language in text, which can be useful for applications such as content moderation, sentiment analysis, or social media analysis. The model can be used to filter out or flag tweets that contain offensive language, or to analyze the prevalence and patterns of offensive language.
However, the model has some limitations that users should be aware of:
- The model was trained and evaluated only on tweets, which are short, informal texts that may contain slang, abbreviations, emojis, hashtags, or user mentions. It may not perform well on other types of text, such as news articles, essays, or books.
- The model was trained and evaluated only on English tweets. It may not generalize well to other languages or dialects.
- The model was trained on the tweet_eval dataset, which may contain biases or annotation errors. The labels were assigned by human annotators, who may have different opinions or criteria for what constitutes offensive language. The dataset may also not cover all possible forms or contexts of offensive language, such as sarcasm, irony, humor, or euphemism.
- The model is a statistical classifier that outputs a probability score for each label. The model does not provide any explanation or justification for its predictions. The model may also make mistakes or produce false positives or false negatives. Users should not blindly trust the model's predictions without further verification or human oversight.
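One way to act on the last point is to send uncertain predictions to a person instead of acting on every score. The following is a minimal sketch of that idea, assuming the classifier output has already been reduced to a single probability for the offensive label. The function name `triage` and both thresholds are hypothetical and would need tuning on your own data.

```python
def triage(offensive_prob: float, allow_below: float = 0.3, flag_above: float = 0.8) -> str:
    """Map a probability for the offensive label to a moderation action.

    The thresholds are illustrative assumptions, not values from this model card.
    """
    if not 0.0 <= offensive_prob <= 1.0:
        raise ValueError("offensive_prob must be in [0, 1]")
    if offensive_prob >= flag_above:
        return "flag"          # confident enough to flag automatically
    if offensive_prob < allow_below:
        return "allow"         # confident enough to let through
    return "human_review"      # uncertain band: defer to a moderator


for score in (0.05, 0.55, 0.93):
    print(score, triage(score))
```

Widening the band between the two thresholds trades moderator workload for fewer automated false positives and false negatives.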
Ethical Considerations
This is a model that deals with sensitive and potentially harmful language. Users should consider the ethical implications and potential risks of using or deploying this model in their applications or contexts. Some of the ethical issues that may arise are:
- The model may reinforce or amplify existing biases or stereotypes in the data or in society. For example, the model may associate certain words or topics with offensive language based on their frequency or co-occurrence in the data, without considering the meaning or intent behind them. This may result in unfair or inaccurate predictions for some groups or individuals.
Users should carefully consider the purpose, context, and impact of using this model, and take appropriate measures to prevent or mitigate any potential harm. Users should also respect the privacy and consent of the data subjects, and adhere to the relevant laws and regulations in their jurisdictions.
Model Training Info
- Problem type: Multi-class Classification
- CO2 Emissions (in grams): 0.0108
Validation Metrics
- Loss: 0.497
- Accuracy: 0.747
- Macro F1: 0.709
- Micro F1: 0.747
- Weighted F1: 0.741
- Macro Precision: 0.722
- Micro Precision: 0.747
- Weighted Precision: 0.740
- Macro Recall: 0.702
- Micro Recall: 0.747
- Weighted Recall: 0.747
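Accuracy, micro F1, micro precision, and micro recall all read 0.747 above. That is expected for single-label classification, because every misclassified example counts once as a false positive for one class and once as a false negative for another. The sketch below illustrates how the macro, micro, and weighted averages differ. The confusion matrix is made up for illustration and is not this model's validation data.

```python
def per_class_counts(confusion):
    """Return (tp, fp, fn) per class from a square confusion matrix.

    confusion[i][j] = number of examples with true class i predicted as class j.
    """
    n = len(confusion)
    counts = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp
        fn = sum(confusion[k]) - tp
        counts.append((tp, fp, fn))
    return counts


def f1(tp, fp, fn):
    """F1 score from raw counts; 0.0 when the class never appears."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def summarize(confusion):
    counts = per_class_counts(confusion)
    total = sum(sum(row) for row in confusion)
    support = [sum(row) for row in confusion]  # true examples per class
    per_class_f1 = [f1(*c) for c in counts]
    tp_sum = sum(c[0] for c in counts)
    fp_sum = sum(c[1] for c in counts)
    fn_sum = sum(c[2] for c in counts)
    return {
        "accuracy": tp_sum / total,
        # Macro: unweighted mean, so the minority class counts as much as the majority.
        "macro_f1": sum(per_class_f1) / len(per_class_f1),
        # Micro: pool the counts over classes before computing F1.
        "micro_f1": f1(tp_sum, fp_sum, fn_sum),
        # Weighted: mean of per-class F1 weighted by class support.
        "weighted_f1": sum(f * s for f, s in zip(per_class_f1, support)) / total,
    }


# Hypothetical counts: rows are true labels, columns are predictions.
toy = [[80, 20],
       [15, 35]]
metrics = summarize(toy)
print(metrics)
```

A macro F1 below the micro F1, as in the reported metrics, usually means the less frequent class is classified less accurately than the more frequent one.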
Usage
You can use cURL to access this model:
$ curl -X POST -H "Authorization: Bearer YOUR_API_KEY" -H "Content-Type: application/json" -d '{"inputs": "I love AutoTrain"}' https://api-inference.huggingface.co/models/KoalaAI/OffensiveSpeechDetector
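The same request can be built from Python using only the standard library. This is a sketch that mirrors the cURL command's URL, headers, and body. The helper names `build_request` and `classify` are hypothetical, and the shape of the JSON response is not described here, so the code returns it unparsed beyond JSON decoding.

```python
import json
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/KoalaAI/OffensiveSpeechDetector"


def build_request(text: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the cURL command, without sending it."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(API_URL, data=payload, headers=headers, method="POST")


def classify(text: str, api_key: str, timeout: float = 30.0):
    """Send the request and return the decoded JSON response."""
    request = build_request(text, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `classify("I love AutoTrain", "YOUR_API_KEY")` sends the request; replace the placeholder with a real API key first.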
Or load the model locally with the transformers Python API:
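from transformers import AutoModelForSequenceClassification, AutoTokenizer

# If the repository requires authentication, add token=True to both calls
# (use_auth_token=True on older transformers releases, where it has since been deprecated).
model = AutoModelForSequenceClassification.from_pretrained("KoalaAI/OffensiveSpeechDetector")
tokenizer = AutoTokenizer.from_pretrained("KoalaAI/OffensiveSpeechDetector")

# Tokenize one text and run a forward pass; outputs.logits holds one raw score per label.
inputs = tokenizer("I love AutoTrain", return_tensors="pt")
outputs = model(**inputs)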
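The forward pass returns raw logits rather than probabilities or label names. A minimal sketch of the post-processing step follows, written in plain Python so that it runs without downloading the model. The logits and the label names in `example_id2label` are hypothetical; with the real model you would read them from `outputs.logits` and `model.config.id2label`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical values; with the real model you would use
# outputs.logits[0].tolist() and model.config.id2label instead.
example_logits = [1.2, -0.4]
example_id2label = {0: "non-offensive", 1: "offensive"}
label, prob = top_label(example_logits, example_id2label)
print(label, round(prob, 3))
```

Subtracting the largest logit before exponentiating leaves the result unchanged but avoids overflow for large scores.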