Model Card for Sami92/XLM-R-Large-Sensationalism-Classifier

XLM-R Large fine-tuned to classify sentences as sensationalistic or not. The taxonomy of sensationalistic claims follows Ashraf et al. 2024, and the model was trained on their annotated Twitter data.

Model Details

Bias, Risks, and Limitations

[More Information Needed]

How to Get Started with the Model

import torch
from transformers import pipeline

# Example German tweets to classify
texts = [
    'Afghanistan - Warum die Taliban Frauenrechte immer mehr einschränken\nhttps://t.co/rhwOdNoJUx',
    '#Münster #G7 oder "Ab jetzt außen rumfahren". https://t.co/Goj5vtrnst',
    'Interessantes Trio.\nDie eine hat eine Wahl vergeigt, die andere kungelt mit Putin und die Dritte hat die Hilfe nach der Flutkatastrophe nicht auf die Reihe bekommen. \nMehr Frauen an die Macht!',
    'Wie kann man sich #AnneWill betrachten ohne das übertragende Gerät zu zerschmettern. Eben 20 sec. dem #FDP Watschengesicht beim Quaken zugehört. Du lieber Himmel, wie weltfremd geht´s denn noch.',
]
checkpoint = "Sami92/XLM-R-Large-Sensationalism-Classifier"
# Pad each batch and truncate inputs to 512 tokens
tokenizer_kwargs = {"padding": True, "truncation": True, "max_length": 512}
# Use the first GPU if one is available, otherwise fall back to CPU
device = 0 if torch.cuda.is_available() else -1
sensational_classifier = pipeline(
    "text-classification", model=checkpoint, tokenizer=checkpoint,
    device=device, **tokenizer_kwargs,
)
# Returns one {"label": ..., "score": ...} dict per input text
sensational_classifier(texts)
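
The classifier returns one dictionary per input with a `label` and a `score`. The following is a minimal sketch of how those results could be paired with their texts and screened by confidence. The helper `pair_predictions`, the `min_score` threshold, and the example predictions are all hypothetical; in particular, the label names `LABEL_0` and `LABEL_1` are placeholders, and the actual names should be read from the model's `config.id2label`.

```python
def pair_predictions(texts, predictions, min_score=0.5):
    """Attach each prediction to its input text and flag low-confidence ones.

    `predictions` is expected to look like text-classification pipeline
    output: a list of {"label": str, "score": float} dicts, one per text.
    """
    rows = []
    for text, pred in zip(texts, predictions):
        rows.append({
            "text": text,
            "label": pred["label"],
            "score": pred["score"],
            "confident": pred["score"] >= min_score,
        })
    return rows


# Hypothetical inputs and outputs, for illustration only.
example_texts = ["first tweet", "second tweet"]
example_predictions = [
    {"label": "LABEL_0", "score": 0.97},  # placeholder label name
    {"label": "LABEL_1", "score": 0.58},  # placeholder label name
]

rows = pair_predictions(example_texts, example_predictions, min_score=0.8)
for row in rows:
    print(row["label"], round(row["score"], 2), row["confident"])
```

With real output, `example_predictions` would be replaced by the result of `sensational_classifier(texts)`.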

Training Details

Training Data

The model was fine-tuned on the annotated Twitter data from Ashraf et al. 2024 (the DeFaktS dataset).

Training Hyperparameters

  • Epochs: 10
  • Batch size: 16
  • Learning rate: 2e-5
  • Weight decay: 0.01
  • Mixed precision (fp16): True
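
As a rough sketch, the values listed above could be expressed as Hugging Face `TrainingArguments`. The card does not say how training was actually launched, so this is an assumption rather than the author's setup: `output_dir` is a hypothetical path, and reading "batch size" as a per-device training batch size is a guess.

```python
# Hyperparameters copied from the list above, keyed by TrainingArguments names.
HPARAMS = {
    "output_dir": "xlmr-sensationalism",   # hypothetical path, not from the card
    "num_train_epochs": 10,
    "per_device_train_batch_size": 16,     # assumption: batch size is per device
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "fp16": True,                          # mixed precision; needs a CUDA GPU
}


def build_training_args(overrides=None):
    """Create TrainingArguments from HPARAMS, with optional overrides."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(**{**HPARAMS, **(overrides or {})})
```

On a machine without a GPU, `build_training_args({"fp16": False})` avoids the mixed-precision requirement.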

Evaluation

Testing Data

Evaluation was performed on the test split (30%) from Ashraf et al. 2024.

Results

                  Precision   Recall   F1-Score   Support
Non-Sensational        0.89     0.92       0.91      1800
Sensational            0.75     0.67       0.71       617
Accuracy                                   0.86      2417
Macro Avg              0.82     0.80       0.81      2417
Weighted Avg           0.86     0.86       0.86      2417
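
The summary rows can be reproduced from the per-class rows. The sketch below recomputes macro F1 (the unweighted mean), weighted F1 (the support-weighted mean), and accuracy (the support-weighted recall) from the reported figures. Because the inputs are already rounded to two decimals, results are only expected to match the table after rounding.

```python
# Per-class values copied from the results table.
per_class = {
    "Non-Sensational": {"recall": 0.92, "f1": 0.91, "support": 1800},
    "Sensational": {"recall": 0.67, "f1": 0.71, "support": 617},
}

total_support = sum(c["support"] for c in per_class.values())

# Macro average: every class counts equally.
macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)

# Weighted average: every class counts in proportion to its support.
weighted_f1 = sum(c["f1"] * c["support"] for c in per_class.values()) / total_support

# Accuracy equals recall averaged over classes, weighted by support.
accuracy = sum(c["recall"] * c["support"] for c in per_class.values()) / total_support

print(total_support, round(macro_f1, 2), round(weighted_f1, 2), round(accuracy, 2))
# prints: 2417 0.81 0.86 0.86
```

The gap between macro F1 (0.81) and weighted F1 (0.86) reflects the class imbalance: the weaker Sensational class makes up about a quarter of the test set.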

BibTeX:


@inproceedings{ashraf_defakts_2024,
    address = {Torino, Italia},
    title = {{DeFaktS}: {A} {German} {Dataset} for {Fine}-{Grained} {Disinformation} {Detection} through {Social} {Media} {Framing}},
    shorttitle = {{DeFaktS}},
    url = {https://aclanthology.org/2024.lrec-main.409},
    booktitle = {Proceedings of the 2024 {Joint} {International} {Conference} on {Computational} {Linguistics}, {Language} {Resources} and {Evaluation} ({LREC}-{COLING} 2024)},
    publisher = {ELRA and ICCL},
    author = {Ashraf, Shaina and Bezzaoui, Isabel and Andone, Ionut and Markowetz, Alexander and Fegert, Jonas and Flek, Lucie},
    editor = {Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro and Sakti, Sakriani and Xue, Nianwen},
    year = {2024},
}
Model size: 560M params (Safetensors, tensor type F32)