Tagalog Fake News Detection Model

Overview

This project fine-tunes the XLM-RoBERTa base model to detect fake news in Tagalog/Filipino text. It reaches 95.46% accuracy on the evaluation set.
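
A minimal inference sketch follows, assuming the checkpoint loads through the Hugging Face `transformers` sequence-classification classes (the card does not declare a library, so this is an assumption). The helper names `pick_label` and `classify` are illustrative, and the meaning of each label index is not stated in this card.

```python
MODEL_ID = "iceman2434/xlm-roberta-base-fake-news-detection-tl"


def pick_label(probs, id2label):
    """Return (label, probability) for the highest-scoring class."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return id2label[best], probs[best]


def classify(texts, model_id=MODEL_ID):
    """Classify a list of strings; downloads the checkpoint on first call."""
    # Imported lazily so pick_label works without torch/transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(
        texts, padding=True, truncation=True, max_length=512, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).tolist()

    # id2label comes from the checkpoint's config. Which index means "fake"
    # is not documented here, so inspect it before relying on the labels.
    return [pick_label(row, model.config.id2label) for row in probs]
```

Calling `classify(["..."])` with a list of article texts returns one `(label, probability)` pair per input. Printing `model.config.id2label` once is a cheap way to confirm how the two classes are named.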

Dataset

  • Total Size: 18,522 samples
  • Composition: 50/50 split of real and fake news
  • Languages: Filipino, English

Dataset Split

  • Train Set: ~12,968 samples (about 70%)
  • Validation Set: ~2,784 samples (about 15%)
  • Test Set: ~2,770 samples (about 15%)
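
The card does not say how the split was produced. As a sketch only, a class-stratified 70/15/15 split could look like the following. The function name `stratified_split`, the fractions, and the seed are assumptions, and the toy labels merely mimic the stated size and 50/50 balance, so the resulting counts will not match the figures above exactly.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=42):
    """Return (train, val, test) index lists, preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Toy stand-in for the real data: 18,522 labels, half real (0), half fake (1).
labels = [0] * 9261 + [1] * 9261
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Splitting per class keeps the real/fake ratio the same in all three sets, which matters when the reported metrics are compared across validation and test data.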

Performance Metrics (on Evaluation Set)

  • Accuracy: 95.46%
  • F1 Score: 95.40%
  • Precision: 95.40%
  • Recall: 95.40%
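
For reference, the four metrics relate to confusion-matrix counts as sketched below. The card does not state whether precision, recall, and F1 are binary, macro, or weighted averages, so this shows the binary case with "fake" as the positive class. The counts are illustrative only and are not this model's results.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only: 200 samples, 10 false positives, 10 false negatives.
example = binary_metrics(tp=90, fp=10, fn=10, tn=90)
print(example)
```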

Data Sources

The model was trained on a combined dataset from two primary sources:

  1. Fake News Filipino Dataset
    • 3,206 rows used
  2. Philippine Fake News Corpus
    • 15,312 rows used out of 22,458 available
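
The row counts can be cross-checked with a few lines of arithmetic. All numbers below are taken from this card; the variable names are illustrative. The two sources sum to slightly fewer rows than the stated total of 18,522, and the card does not explain the difference.

```python
# Row counts as stated in this card.
rows_used = {
    "Fake News Filipino Dataset": 3_206,
    "Philippine Fake News Corpus": 15_312,
}
corpus_available = 22_458
stated_total = 18_522

combined = sum(rows_used.values())
corpus_share = rows_used["Philippine Fake News Corpus"] / corpus_available

print(f"combined rows from both sources: {combined}")
print(f"difference from stated total:    {stated_total - combined}")
print(f"share of corpus rows used:       {corpus_share:.1%}")
```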

Model Details

  • Model ID: iceman2434/xlm-roberta-base-fake-news-detection-tl
  • Model size: 278M params
  • Tensor type: F32
  • Weights format: Safetensors