---
library_name: tf-keras
language:
- ro
base_model:
- readerbench/RoBERT-base
---

## Model description

A BERT-based model for classifying fake news written in Romanian.

## Intended uses & limitations

The model predicts one of six fake-news classes (in this order: "fabricated", "fictional", "plausible", "propaganda", "real", "satire").

It also predicts whether the article is about health or politics.
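Since the card lists the class order explicitly, decoding a prediction is just an argmax into that list. Below is a minimal sketch; the six class names come from this card, but treating the model as returning two score vectors (one per task) is an assumption about this multi-task model's output layout.

```python
# The six fake-news classes, in the order given above (from this model card).
FAKE_NEWS_CLASSES = ["fabricated", "fictional", "plausible", "propaganda", "real", "satire"]
# Assumed topic head order -- hypothetical, not confirmed by the card.
TOPICS = ["health", "politics"]

def argmax(scores):
    """Index of the highest score (pure Python, no NumPy needed)."""
    return max(range(len(scores)), key=scores.__getitem__)

def decode_predictions(class_scores, topic_scores):
    """Map the two heads' raw scores to human-readable labels."""
    return FAKE_NEWS_CLASSES[argmax(class_scores)], TOPICS[argmax(topic_scores)]

# Example with made-up scores:
decode_predictions([0.1, 0.2, 0.1, 0.4, 0.1, 0.1], [0.9, 0.1])
# -> ("propaganda", "health")
```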

## How to use the model

Load the model with:

```python
from huggingface_hub import from_pretrained_keras

model = from_pretrained_keras("pandrei7/fakenews-mtl")
```

Use this tokenizer: `readerbench/RoBERT-base`.

The input length should be 512. You can tokenize the input like this:

```python
tokenizer(
    your_text,
    padding="max_length",
    truncation=True,
    max_length=512,
    return_tensors="tf",
)
```
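For intuition, `padding="max_length"` combined with `truncation=True` behaves roughly like the pure-Python sketch below: sequences longer than the limit are cut, shorter ones are right-padded, and the attention mask flags real tokens. This is a toy stand-in for illustration only; the real tokenizer also inserts special tokens and handles subword splitting.

```python
def pad_and_truncate(token_ids, max_length=512, pad_id=0):
    """Toy version of padding='max_length' with truncation=True."""
    ids = token_ids[:max_length]                  # truncate to max_length
    n_real = len(ids)
    attention_mask = [1] * n_real + [0] * (max_length - n_real)
    ids = ids + [pad_id] * (max_length - n_real)  # right-pad with pad_id
    return ids, attention_mask

# A short sequence padded up to length 8:
ids, mask = pad_and_truncate([101, 7592, 102], max_length=8)
# ids  -> [101, 7592, 102, 0, 0, 0, 0, 0]
# mask -> [1, 1, 1, 0, 0, 0, 0, 0]
```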

## Training data

The model was trained and evaluated on the [fakerom](https://www.tagtog.com/fakerom/fakerom/) dataset.

## Evaluation results

Accuracy on the fake-news classification task was roughly 75%.
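Accuracy here is the usual fraction of exact label matches between predictions and gold labels; a generic sketch (not the paper's evaluation code):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the gold labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical example: 3 of 4 labels correct.
acc = accuracy(["real", "satire", "plausible", "real"],
               ["real", "satire", "propaganda", "real"])
# acc -> 0.75
```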

## Reference

[Romanian Fake News Identification using Language Models](https://grants.ulbsibiu.ro/fakerom/wp-content/uploads/8_Preda-et-al.pdf)

```bibtex
@inproceedings{preda2022romanian,
  author = {Preda, Andrei and Ruseti, Stefan and Terian, Simina-Maria and Dascalu, Mihai},
  year = {2022},
  month = {01},
  pages = {73-79},
  title = {Romanian Fake News Identification using Language Models},
  doi = {10.37789/rochi.2022.1.1.13}
}
```