metadata
library_name: tf-keras
language:
- ro
base_model:
- readerbench/RoBERT-base
Model description
BERT-based model for classifying fake news written in Romanian.
Intended uses & limitations
It assigns each article to one of six classes (in order: "fabricated", "fictional", "plausible", "propaganda", "real", "satire").
It also predicts whether the article is about health or politics.
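As a rough illustration of how the class order above can be used, the sketch below maps a vector of six scores to its label. It assumes the scores for one article are available as a flat sequence of six numbers; the model's actual output layout is not documented here, so `label_from_scores` and the example values are hypothetical.

```python
# Class order as listed in this model card.
FAKE_NEWS_LABELS = [
    "fabricated", "fictional", "plausible", "propaganda", "real", "satire",
]


def label_from_scores(scores):
    """Return the label with the highest score.

    Assumption: `scores` is a flat sequence of six numbers (logits or
    probabilities) for a single article, in the order of FAKE_NEWS_LABELS.
    """
    if len(scores) != len(FAKE_NEWS_LABELS):
        raise ValueError("expected one score per label")
    # Index of the largest score (argmax) without external dependencies.
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return FAKE_NEWS_LABELS[best_index]


# Made-up scores, for illustration only.
example_scores = [0.05, 0.02, 0.10, 0.08, 0.70, 0.05]
print(label_from_scores(example_scores))  # real
```

If the model returns batched tensors, the same idea applies per row after converting the tensor to a list or NumPy array.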
How to use the model
Load the model with:
from huggingface_hub import from_pretrained_keras
model = from_pretrained_keras("pandrei7/fakenews-mtl")
Use this tokenizer: readerbench/RoBERT-base.
The input length should be 512 tokens. You can tokenize the input like this:
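from transformers import AutoTokenizer

# Load the tokenizer that matches the base model.
tokenizer = AutoTokenizer.from_pretrained("readerbench/RoBERT-base")

# Pad or truncate every input to exactly 512 tokens and return TensorFlow tensors.
inputs = tokenizer(
    your_text,
    padding="max_length",
    truncation=True,
    max_length=512,
    return_tensors="tf",
)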
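To make the effect of `padding="max_length"` and `truncation=True` concrete, here is a simplified, dependency-free sketch of what fixing the length at 512 means. It is not the tokenizer's implementation: `pad_or_truncate` is a hypothetical helper, the padding id of 0 is an assumption, the token ids are made up, and special-token handling during truncation is ignored.

```python
MAX_LENGTH = 512  # input length stated in this model card


def pad_or_truncate(token_ids, max_length=MAX_LENGTH, pad_id=0):
    """Return (input_ids, attention_mask), each exactly max_length long.

    Assumption: pad_id=0; a real tokenizer also keeps its special tokens
    in place when truncating, which this sketch does not model.
    """
    kept = list(token_ids[:max_length])   # drop anything past the limit
    padding = max_length - len(kept)      # number of pad positions to add
    input_ids = kept + [pad_id] * padding
    attention_mask = [1] * len(kept) + [0] * padding  # 1 = real token
    return input_ids, attention_mask


# Made-up token ids, for illustration only.
short_ids, short_mask = pad_or_truncate([101, 7, 8, 102])
long_ids, long_mask = pad_or_truncate(range(1000))

print(len(short_ids), sum(short_mask))  # 512 4
print(len(long_ids), sum(long_mask))    # 512 512
```

Short texts are filled up with padding that the attention mask marks as ignorable, while texts longer than 512 tokens lose their tail, so very long articles are classified from their beginning only.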
Training data
The model was trained and evaluated on the fakerom dataset.
Evaluation results
The accuracy of predicting fake news was roughly 75%.
Reference
Romanian Fake News Identification using Language Models
@inproceedings{inproceedings,
author = {Preda, Andrei and Ruseti, Stefan and Terian, Simina-Maria and Dascalu, Mihai},
year = {2022},
month = {01},
pages = {73-79},
title = {Romanian Fake News Identification using Language Models},
doi = {10.37789/rochi.2022.1.1.13}
}