metadata

library_name: tf-keras
language:
  - ro
base_model:
  - readerbench/RoBERT-base

Model description

BERT-based model for classifying fake news written in Romanian.

Intended uses & limitations

The model assigns each article one of six labels (in order: "fabricated", "fictional", "plausible", "propaganda", "real", "satire").

It also predicts whether the article is about health or politics.
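The label order above can be turned into a small lookup helper. This is a minimal sketch, not part of the released model: `LABELS` and `decode_label` are hypothetical names, and it assumes the fake-news output for one article is a flat sequence of six scores in the order listed above. The structure of the model's outputs is not documented here, so check it before relying on this.

```python
# Label order as listed in this README.
LABELS = ["fabricated", "fictional", "plausible", "propaganda", "real", "satire"]


def decode_label(scores):
    """Return the label whose score is highest.

    Assumption: `scores` is a flat sequence of six numbers, one per label,
    in the same order as LABELS.
    """
    scores = list(scores)
    if len(scores) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(scores)}")
    # Index of the largest score (argmax) without needing NumPy.
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return LABELS[best_index]


print(decode_label([0.05, 0.10, 0.05, 0.10, 0.60, 0.10]))  # real
```

If the scores come back as a TensorFlow tensor, one row can be converted first with `.numpy().tolist()`.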

How to use the model

Load the model with:

from huggingface_hub import from_pretrained_keras

model = from_pretrained_keras("pandrei7/fakenews-mtl")

Use this tokenizer: readerbench/RoBERT-base.

Inputs should be padded or truncated to exactly 512 tokens. You can tokenize the input like this:

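from transformers import AutoTokenizer

# Load the tokenizer of the base model.
tokenizer = AutoTokenizer.from_pretrained("readerbench/RoBERT-base")

# Pad or truncate the text to exactly 512 tokens and return TensorFlow tensors.
inputs = tokenizer(
    your_text,
    padding="max_length",
    truncation=True,
    max_length=512,
    return_tensors="tf",
)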

Training data

The model was trained and evaluated on the fakerom dataset.

Evaluation results

On the fake news classification task, accuracy was roughly 75%.

Reference

Romanian Fake News Identification using Language Models

@inproceedings{inproceedings,
  author = {Preda, Andrei and Ruseti, Stefan and Terian, Simina-Maria and Dascalu, Mihai},
  year   = {2022},
  month  = {01},
  pages  = {73--79},
  title  = {Romanian Fake News Identification using Language Models},
  doi    = {10.37789/rochi.2022.1.1.13}
}