metadata
license: cc-by-sa-4.0
base_model: jcblaise/roberta-tagalog-base
tags:
- generated_from_trainer
- tagalog
- filipino
- twitter
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: roberta-tagalog-base-philippine-elections-2016-2022-hate-speech
results: []
datasets:
- hate_speech_filipino
- mapsoriano/2016_2022_hate_speech_filipino
language:
- tl
- en
roberta-tagalog-base-philippine-elections-2016-2022-hate-speech
This model is a fine-tuned version of jcblaise/roberta-tagalog-base for binary text classification: it labels tweets as hate speech or non-hate speech.
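A minimal inference sketch follows, assuming the model is published on the Hugging Face Hub. The repository id in `MODEL_ID` is an assumption (owner inferred from the dataset namespace, name taken from the model-index entry), and `top_label` and `classify` are hypothetical helper names. The label strings returned depend on the model's `id2label` configuration, so check it before interpreting them.

```python
# Assumed Hub repository id; replace it if the model lives under another namespace.
MODEL_ID = "mapsoriano/roberta-tagalog-base-philippine-elections-2016-2022-hate-speech"


def top_label(scores):
    """Return the (label, score) pair with the highest score.

    `scores` is a list of {"label": str, "score": float} dicts, which is the
    shape the text-classification pipeline returns per input when top_k=None.
    """
    best = max(scores, key=lambda item: item["score"])
    return best["label"], best["score"]


def classify(texts, model_id=MODEL_ID):
    """Classify a list of tweets and return one (label, score) pair per tweet."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id, top_k=None)
    # With a list input and top_k=None, the pipeline returns one list of
    # label/score dicts per input text.
    return [top_label(scores) for scores in classifier(texts)]
```

Calling `classify(["first tweet", "second tweet"])` downloads the model on first use and returns the highest-scoring label for each tweet.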
The model was fine-tuned on mapsoriano/2016_2022_hate_speech_filipino, a combined dataset that merges the hate_speech_filipino dataset with a newly crawled hate speech dataset of tweets related to the 2022 Philippine presidential elections.
It achieves the following results on the evaluation (validation) set:
- Loss: 0.3574
- Accuracy: 0.8743
It achieves the following results on the test set:
- Accuracy: 0.8783
- Precision: 0.8563
- Recall: 0.9077
- F1: 0.8813
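As a quick consistency check, F1 can be recomputed from the reported precision and recall. The sketch below assumes the reported F1 is the usual harmonic mean of those two values for a single positive class; the card does not state the averaging mode, and `f1_from_precision_recall` is a hypothetical helper name.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Test-set precision and recall as reported above.
reported_precision = 0.8563
reported_recall = 0.9077

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"F1 recomputed from precision and recall: {f1:.4f}")
```

Under that assumption the recomputed value agrees with the reported F1 of 0.8813 to within rounding.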
Feel free to connect via LinkedIn for further information on this model or on the study in which it was used.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
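To illustrate what the linear scheduler does with these settings, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup steps (the card does not list any) and uses the 2722 total steps reported in the training results; `linear_lr` is a hypothetical helper, not a library function.

```python
BASE_LR = 2e-05      # learning_rate from the list above
TOTAL_STEPS = 2722   # final step count reported in the training results
WARMUP_STEPS = 0     # assumption: the card does not mention warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Start of training, end of epoch 1, end of epoch 2.
for step in (0, 1361, 2722):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05, is halved by the end of the first epoch, and reaches zero at the final step.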
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 0.3423 | 1.0 | 1361 | 0.3167 | 0.8693 |
| 0.2194 | 2.0 | 2722 | 0.3574 | 0.8743 |
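The step counts give a rough bound on the training set size. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these is stated in the card, and `example_count_bounds` is a hypothetical helper.

```python
STEPS_PER_EPOCH = 1361  # steps at the end of epoch 1 in the table above
BATCH_SIZE = 16         # train_batch_size from the hyperparameters


def example_count_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes that need exactly `steps_per_epoch` batches."""
    smallest = (steps_per_epoch - 1) * batch_size + 1  # last batch holds one example
    largest = steps_per_epoch * batch_size             # last batch is full
    return smallest, largest


low, high = example_count_bounds(STEPS_PER_EPOCH, BATCH_SIZE)
print(f"Training examples: between {low} and {high}")
```

Under those assumptions the training split holds between 21761 and 21776 examples.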
Framework versions
- Transformers 4.33.2
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3
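To compare a local environment against these versions, a small check with the standard library can help. This is only a sketch: `find_version_mismatches` is a hypothetical helper, the distribution names are assumed to be the usual PyPI names, and the local build suffix of PyTorch (`+cu118`) is ignored in the comparison.

```python
from importlib import metadata

# Versions listed above, keyed by the assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "transformers": "4.33.2",
    "torch": "2.0.1",  # listed as 2.0.1+cu118; the build suffix is ignored here
    "datasets": "2.14.5",
    "tokenizers": "0.13.3",
}


def find_version_mismatches(expected=EXPECTED_VERSIONS):
    """Map each package whose installed version differs to that version (None if missing)."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Drop any local build suffix such as "+cu118" before comparing.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[package] = installed
    return mismatches


print(find_version_mismatches())
```

An empty dictionary means the installed packages match the versions listed above.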