---
library_name: transformers
license: cc-by-4.0
language:
- de
pipeline_tag: text-classification
---

# Model Card for XLM-R-Large-ClaimDetection

Fine-tuned [XLM-R Large](https://huggingface.co/FacebookAI/xlm-roberta-large) for classifying sentences as factual or non-factual. The taxonomy for factual claims follows Wilms et al. 2021. The model was first trained on a Telegram dataset that was annotated by GPT-4o using this [prompt](https://huggingface.co/Sami92/XLM-R-Large-ClaimDetection/blob/main/FactualityPrompt_GPT.txt). In a second step, it was trained on the data from Risch et al. 2021. It was tested on a sample of Telegram posts annotated by four trained coders.

### Model Description

This model is a fine-tuned version of [XLM-R Large](https://huggingface.co/FacebookAI/xlm-roberta-large). It classifies whether a sentence makes a factual claim, a common subtask of automated fact-checking. Training was weakly supervised: the model was first trained on a Telegram dataset weakly annotated with GPT-4o, and then on the manually annotated dataset from Risch et al. 2021. Both datasets are German; the underlying model is multilingual, but performance in other languages was not tested. For evaluation, a set of Telegram posts was annotated by four trained coders and the majority label was taken. The model achieves an accuracy of 0.90 on this dataset. On the test split of Risch et al. 2021, which is drawn from Facebook comments, it achieves an accuracy of 0.79.

## Bias, Risks, and Limitations

[More Information Needed]

## How to Get Started with the Model

```python
from transformers import pipeline

texts = [
    # Example Telegram posts (German): a factual claim and two non-factual posts.
    "WTH Riesige giftige Flugspinnen mit 4-Zoll-Beinen auf dem Weg in die Gegend von New York, "
    "während sie sich über die Ostküste ausbreiten. Zuerst kamen die gefleckten Laternenfliegen, "
    "dann die Zikaden und jetzt die Spinnen.\n"
    "Der Nordosten der USA bereitet sich auf eine Invasion riesiger giftiger Spinnen vor, "
    "deren Beine nur einen halben Zoll lang sind und mit dem Fallschirm durch die Luft fliegen "
    "können. cbsnews.com/news/joro-spid…",
    "Es ist Ihnen halt nicht genug was zerstört wurde, Ermittlungen eingestellt und dann kommt "
    "die nächste Katastrophe... Wer hier an Zufälle glaubt hat nichts verstanden... ",
    "IMPFUNG MACHT FREI!!! Schickt das Video an alle eure Kontakte! Abonniert bitte unseren "
    "Kanal: Folgt unserem Chat: Verbreitet unsere Inhalte und Wissen für den Frieden",
]

checkpoint = "Sami92/XLM-R-Large-ClaimDetection"
tokenizer_kwargs = {"padding": True, "truncation": True, "max_length": 512}
claimdetection = pipeline(
    "text-classification",
    model=checkpoint,
    tokenizer=checkpoint,
    device="cuda",
    **tokenizer_kwargs,
)
claimdetection(texts)
# [{'label': 'factual', 'score': 0.9999344348907471},
#  {'label': 'non-factual', 'score': 0.9990422129631042},
#  {'label': 'non-factual', 'score': 0.9990965127944946}]
```

## Training Details

### Training Data

Training proceeded in two steps: the model was first trained on a weakly annotated dataset and then on the dataset published by Risch et al. 2021 (see the publication for details on the latter). The weak annotation was performed with GPT-4o; the labeling prompt can be found [here](https://huggingface.co/Sami92/XLM-R-Large-ClaimDetection/blob/main/FactualityPrompt_GPT.txt). The data was collected from Telegram, specifically from a set of about 200 channels that have been subject to a fact-check by Correctiv, dpa, Faktenfuchs, or AFP.

The test data consists of 149 Telegram posts. The performance is as follows.
|                  | precision | recall | f1-score | support |
|------------------|:---------:|:------:|:--------:|:-------:|
| **factual**      | 0.88      | 0.92   | 0.90     | 71      |
| **non-factual**  | 0.92      | 0.88   | 0.90     | 78      |
| **accuracy**     |           |        | 0.90     | 149     |
| **macro avg**    | 0.90      | 0.90   | 0.90     | 149     |
| **weighted avg** | 0.90      | 0.90   | 0.90     | 149     |

#### Training Hyperparameters

Weakly supervised training on Telegram data:

- Epochs: 10
- Batch size: 16
- Learning rate: 2e-5
- Weight decay: 0.01
- fp16: True

Supervised training on Risch et al. 2021 data:

- Epochs: 10
- Batch size: 16
- Learning rate: 2e-5
- Weight decay: 0.01
- fp16: True

**BibTeX:**

```bibtex
@misc{wilms_annotation_2021,
  title = {Annotation {Guidelines} for {GermEval} 2021 {Shared} {Task} on the {Identification} of {Toxic}, {Engaging}, and {Fact}-{Claiming} {Comments}. {Excerpt} of an unpublished codebook of the {DEDIS} research group at {Heinrich}-{Heine}-{University} {Düsseldorf} (full version available on request)},
  author = {Wilms, L. and Heinbach, D. and Ziegele, M.},
  year = {2021},
}
```

```bibtex
@inproceedings{risch_overview_2021,
  address = {Duesseldorf, Germany},
  title = {Overview of the {GermEval} 2021 {Shared} {Task} on the {Identification} of {Toxic}, {Engaging}, and {Fact}-{Claiming} {Comments}},
  url = {https://aclanthology.org/2021.germeval-1.1},
  booktitle = {Proceedings of the {GermEval} 2021 {Shared} {Task} on the {Identification} of {Toxic}, {Engaging}, and {Fact}-{Claiming} {Comments}},
  publisher = {Association for Computational Linguistics},
  author = {Risch, Julian and Stoll, Anke and Wilms, Lena and Wiegand, Michael},
  year = {2021},
}
```