Model Card for Multidimensional Political Incivility Detector

Model Description

This model detects multidimensional political incivility in tweets. It was developed as part of a study on analyzing and classifying political texts. It identifies two forms of incivility in political discourse, impoliteness and intolerance, and treats them as orthogonal dimensions.

Development

The model was trained on a dataset of 13.1K labeled tweets, 42.3% of which are marked as uncivil, with impoliteness and intolerance as subcategories. It was obtained by fine-tuning roberta-base to detect political incivility while taking the context of political discourse into account.
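Because the two dimensions are treated as orthogonal, a tweet can be impolite, intolerant, both, or neither. The sketch below shows one way such annotations might be encoded as two-element multi-hot targets. The helper names are hypothetical, the label order is assumed to match the order of the thresholds in the usage code, and the "uncivil if either dimension is marked" rule is an assumption rather than a statement of how the dataset was built.

```python
from typing import List

# Assumed label order, matching the thresholds dictionary in the usage code:
# index 0 = impoliteness, index 1 = intolerance.
LABELS = ["Impoliteness", "Intolerance"]


def encode_labels(impolite: bool, intolerant: bool) -> List[float]:
    """Return a multi-hot target vector for one tweet (hypothetical helper)."""
    return [float(impolite), float(intolerant)]


def is_uncivil(target: List[float]) -> bool:
    """Assumed rule: a tweet is uncivil if at least one dimension is marked."""
    return any(value == 1.0 for value in target)


# Enumerate the four possible combinations of the two dimensions.
for impolite in (False, True):
    for intolerant in (False, True):
        target = encode_labels(impolite, intolerant)
        print(dict(zip(LABELS, target)), "uncivil" if is_uncivil(target) else "civil")
```

Keeping the two labels as separate entries, rather than a single four-way class, is what lets each dimension be scored and thresholded independently.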

Usage

import torch
from torch import nn
from transformers import PreTrainedModel, RobertaModel, RobertaConfig, AutoTokenizer

class IncivilityModule(PreTrainedModel):
    """RoBERTa encoder with a linear head that scores impoliteness and intolerance."""

    config_class = RobertaConfig

    def __init__(self, config, num_labels=2):
        super().__init__(config)
        self.roberta = RobertaModel(config)
        self.classifier = nn.Linear(config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask=None):
        outputs = self.roberta(input_ids, attention_mask=attention_mask)
        return torch.sigmoid(self.classifier(outputs.pooler_output))


# Load from Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("incivility-UOH/multidim-incivility-detector")
model = IncivilityModule.from_pretrained("incivility-UOH/multidim-incivility-detector")


class IncivilityPipeline:
    def __init__(self, model, tokenizer, device=None):
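        # Default to the GPU when one is available, otherwise run on the CPU
        if device is None:
            device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.model = model.to(device).eval()
        self.tokenizer = tokenizer
        # Per-dimension decision thresholds; the order must match the model's output indices
        self.thresholds = {"Impoliteness": 0.732, "Intolerance": 0.495}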

    def __call__(self, tweet):
        # Tokenization and Model Inference
        inputs = self.tokenizer(tweet, return_tensors="pt", padding="max_length", truncation=True, max_length=128)
        with torch.no_grad():
            prediction = self.model(**{k: v.to(self.model.device) for k, v in inputs.items()})
            prediction = torch.round(prediction, decimals=4).squeeze()

        results = {}
        for i, (label, threshold) in enumerate(self.thresholds.items()):
            score = round(prediction[i].item(), 3)
            results[label] = {"score": score, "label": score > threshold}

        return results


incivility_pipeline = IncivilityPipeline(model, tokenizer)
result = incivility_pipeline("You are such a loser! You'll regret everything you've done to me!")
print("Pipeline Result:", result)
# Pipeline Result: {'Impoliteness': {'score': 0.979, 'label': True}, 'Intolerance': {'score': 0.088, 'label': False}}
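
The pipeline turns each score into a label by comparing it with a per-dimension threshold (0.732 for impoliteness, 0.495 for intolerance). If only the scores are stored, labels can be recomputed later, for example with different cut-offs. The following is a minimal sketch in plain Python; `apply_thresholds` is a hypothetical helper, and the alternative thresholds in the second call are made-up values for illustration only.

```python
from typing import Dict

# Thresholds taken from the pipeline above.
DEFAULT_THRESHOLDS = {"Impoliteness": 0.732, "Intolerance": 0.495}


def apply_thresholds(
    scores: Dict[str, float],
    thresholds: Dict[str, float] = DEFAULT_THRESHOLDS,
) -> Dict[str, bool]:
    """Label a dimension True when its score is strictly above its threshold."""
    return {name: scores[name] > cutoff for name, cutoff in thresholds.items()}


# Scores from the example output above.
scores = {"Impoliteness": 0.979, "Intolerance": 0.088}

print(apply_thresholds(scores))
# {'Impoliteness': True, 'Intolerance': False}

# Hypothetical alternative cut-offs, not recommended values.
print(apply_thresholds(scores, {"Impoliteness": 0.99, "Intolerance": 0.05}))
# {'Impoliteness': False, 'Intolerance': True}
```

The comparison is strict (`>`), mirroring the pipeline, so a score exactly equal to its threshold is labeled False.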

Citation

Please cite the paper "Multidimensional Political Incivility on Twitter: Detection and Findings" if you find this model helpful.

@misc{incivility2023,
      title={Detecting Multidimensional Political Incivility on Social Media}, 
      author={Sagi Pendzel and Nir Lotan and Alon Zoizner and Einat Minkov},
      year={2023},
      eprint={2305.14964},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}