Pre-CoFactv3-Text-Classification

Model description

This is a text classification model from the AAAI 2024 workshop paper “Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning”.

Its inputs are a claim and its evidence, and its output is the predicted label, which is one of three categories: Support, Neutral, or Refute.

It is based on the microsoft/deberta-v3-large model and fine-tuned on the FACTIFY5WQA dataset.

For more details, see our paper or GitHub repository.

How to use?

Download the model with Hugging Face Transformers.

from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model = AutoModelForSequenceClassification.from_pretrained("AndyChiang/Pre-CoFactv3-Text-Classification")
tokenizer = AutoTokenizer.from_pretrained("AndyChiang/Pre-CoFactv3-Text-Classification")

Create a pipeline.

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

Use the pipeline to predict the label.

label = classifier("Micah Richards spent an entire season at Aston Vila without playing a single game. [SEP] Despite speculation that Richards would leave Aston Villa before the transfer deadline for the 2018~19 season , he remained at the club , although he is not being considered for first team selection.")
print(label)
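
The example above joins the claim and the evidence with a literal [SEP] marker and prints the raw pipeline output. A minimal sketch of helpers that build this input and pick the top label is given below. The names `build_input`, `top_label`, and `classify` are hypothetical and not part of the released code, and truncating to 650 tokens is an assumption taken from the training hyperparameters on this card. The label strings returned depend on the model's configuration.

```python
def build_input(claim: str, evidence: str, sep: str = "[SEP]") -> str:
    """Join claim and evidence into one string, as in the example above."""
    return f"{claim.strip()} {sep} {evidence.strip()}"


def top_label(predictions: list[dict]) -> str:
    """Return the highest-scoring label from pipeline-style output,
    i.e. a list of {"label": ..., "score": ...} dicts."""
    return max(predictions, key=lambda p: p["score"])["label"]


def classify(classifier, claim: str, evidence: str) -> str:
    """Run a text-classification pipeline on one claim/evidence pair.

    Assumption: the input is truncated to 650 tokens to match training.
    """
    text = build_input(claim, evidence)
    predictions = classifier(text, truncation=True, max_length=650)
    return top_label(predictions)
```

With the `classifier` created earlier, `classify(classifier, claim, evidence)` would return a single label string instead of the full list of scores.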

Dataset

We use the FACTIFY5WQA dataset provided by the AAAI-24 workshop Factify 3.0.

This dataset is designed for fact verification, with the task of determining the veracity of a claim based on the given evidence.

claim: the statement to be verified.
evidence: the facts used to verify the claim.
question: the questions generated from the claim using the 5W framework (who, what, when, where, and why).
claim_answer: the answers derived from the claim.
evidence_answer: the answers derived from the evidence.
label: the veracity of the claim based on the given evidence, which is one of three categories: Support, Neutral, or Refute.
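
The fields above can be sketched as a small record type. This is only an illustration: the class name `Factify5WQAExample` is hypothetical, and the field types (lists of strings for the questions and answers) are assumptions, since the card does not specify the on-disk format.

```python
from dataclasses import dataclass

# The three label categories named on this card.
LABELS = ("Support", "Neutral", "Refute")


@dataclass
class Factify5WQAExample:
    claim: str                  # the statement to be verified
    evidence: str               # the facts used to verify the claim
    question: list[str]         # 5W questions generated from the claim (assumed: list)
    claim_answer: list[str]     # answers derived from the claim (assumed: list)
    evidence_answer: list[str]  # answers derived from the evidence (assumed: list)
    label: str                  # one of LABELS

    def __post_init__(self) -> None:
        # Reject any label outside the three documented categories.
        if self.label not in LABELS:
            raise ValueError(f"label must be one of {LABELS}, got {self.label!r}")
```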

	Training	Validation	Testing	Total
Support	3500	750	750	5000
Neutral	3500	750	750	5000
Refute	3500	750	750	5000
Total	10500	2250	2250	15000

Fine-tuning

Fine-tuning is performed with the Hugging Face Trainer API on the text classification task.

Training hyperparameters

The following hyperparameters were used during training:

Pre-trained language model: microsoft/deberta-v3-large
Optimizer: Adam
Learning rate: 0.00001
Max input length (tokens): 650
Batch size: 4
Epochs: 12
Device: NVIDIA RTX A5000
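
As a rough sketch, the hyperparameters above could be mapped onto the Trainer API as follows. This is not the authors' training script. The names `TRAINING_KWARGS`, `make_training_args`, and `tokenize`, and the `./checkpoints` output directory, are hypothetical. Treating the batch size as a per-device value is an assumption, and the optimizer is left at the Trainer default (an AdamW variant), which may differ from the "Adam" setting listed above.

```python
# Values copied from the hyperparameter list above.
MODEL_NAME = "microsoft/deberta-v3-large"
MAX_LENGTH = 650   # max input length in tokens
NUM_LABELS = 3     # Support, Neutral, Refute

# Keyed by the matching transformers.TrainingArguments parameter names.
TRAINING_KWARGS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 4,  # assumption: batch size is per device
    "num_train_epochs": 12,
}


def make_training_args(output_dir: str = "./checkpoints"):
    """Build TrainingArguments from the values above.

    The import is local so the constants can be inspected
    without transformers installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def tokenize(tokenizer, text: str):
    """Tokenize one input string, truncating to MAX_LENGTH tokens."""
    return tokenizer(text, truncation=True, max_length=MAX_LENGTH)
```

The resulting arguments would then be passed to `Trainer` together with a model loaded via `AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=NUM_LABELS)` and the tokenized dataset.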

Testing

For the text classification task, accuracy is the evaluation metric.

Accuracy
0.8502
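
Accuracy here is the fraction of test examples whose predicted label matches the gold label. A minimal sketch of that computation follows; the function name `accuracy` and the toy labels in the usage line are illustrative and not taken from the dataset.

```python
def accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions that exactly match their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy usage: 3 of 4 labels match.
print(accuracy(["Support", "Refute", "Neutral", "Support"],
               ["Support", "Refute", "Refute", "Support"]))  # 0.75
```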

Other models

AndyChiang/Pre-CoFactv3-Question-Answering
