metadata

license: cc
datasets:
  - vector-institute/newsmediabias-plus
language:
  - en
tags:
  - bias
  - classification
  - llm
  - multimodal

Llama3.2 NLP Bias Classifier

This model combines the base Llama-3.2 architecture with a custom adapter to classify text by disinformation likelihood, distinguishing manipulative content from unbiased sources. It detects rhetorical techniques commonly used in disinformation and assigns each input a 'Likely' or 'Unlikely' label based on structured indicators.

Model Details

Base Model: [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Lla
Deployment Environment: Configured for GPU (CUDA) support.

Model Usage

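import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import pandas as pd

# Use the GPU when one is available; otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load tokenizer and model from the directory where the model was saved
tokenizer = AutoTokenizer.from_pretrained("save_directory")
model = AutoModelForCausalLM.from_pretrained("save_directory", torch_dtype=torch.float16).to(device)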

Generating Predictions

The model evaluates text for disinformation by identifying rhetorical techniques. To classify input text, use the generate_response function:

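def generate_response(model, prompt):
    # Tokenize a single prompt (no padding needed) and truncate it to 1024 tokens
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1024).to(device)
    # Pass input_ids and attention_mask; do_sample=True is needed for temperature and top_p to apply
    outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7, top_p=0.95)
    # The decoded text contains the prompt followed by the generated continuation
    return tokenizer.decode(outputs[0], skip_special_tokens=True).strip()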
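
Because generate_response returns free text rather than a class id, the output still has to be mapped to a label. The following is a minimal sketch of one way to do that. The helper name parse_label and the rule of taking the first label word found after the echoed prompt are assumptions for illustration, not part of the released code.

```python
import re
from typing import Optional

# Hypothetical helper, not part of the released model code.
# Word boundaries keep "likely" from matching inside "Unlikely".
LABEL_PATTERN = re.compile(r"\b(unlikely|likely)\b", re.IGNORECASE)

def parse_label(response: str, prompt: str = "") -> Optional[int]:
    """Map generated text to 1 (Likely), 0 (Unlikely), or None if no label is found."""
    # Drop the echoed prompt when the response starts with it.
    if prompt and response.startswith(prompt):
        response = response[len(prompt):]
    match = LABEL_PATTERN.search(response)
    if match is None:
        return None
    return 1 if match.group(1).lower() == "likely" else 0
```

Returning None for text with no label keeps unparseable generations separate from real predictions, so they can be counted or retried instead of being silently scored as one class.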

Dataset and Evaluation

Input Dataset: Sample data from sample_dataset.csv containing balanced examples of 'Likely' and 'Unlikely' disinformation.
Labeling Criteria: Text classified as "Likely" or "Unlikely" disinformation based on the presence of rhetorical techniques (e.g., exaggeration, emotional appeal).
Metrics: Precision, recall, F1 score, and accuracy, computed with sklearn.metrics.
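
The metrics listed above can be reproduced from a list of gold labels and a list of predictions. The sketch below computes them by hand with the standard library so the definitions are visible. The names per_class_metrics, accuracy, y_true, and y_pred are illustrative, and the labels are toy values, not the evaluation data. In practice, sklearn.metrics.precision_recall_fscore_support and sklearn.metrics.accuracy_score report the same quantities.

```python
from typing import Dict, List

def per_class_metrics(y_true: List[int], y_pred: List[int], label: int) -> Dict[str, float]:
    """Precision, recall, and F1 for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of examples whose prediction matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy labels for illustration only: 1 = Likely, 0 = Unlikely
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
likely = per_class_metrics(y_true, y_pred, label=1)
overall = accuracy(y_true, y_pred)
```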

Model Performance

| Label | Precision | Recall | F1 Score |
|---|---|---|---|
| Unlikely (0) | 78% | 82% | 79.95% |
| Likely (1) | 81% | 85% | 82.95% |
| Accuracy | | | 87% |
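
The F1 column is the harmonic mean of the precision and recall columns. As a worked check, assuming the standard definition of F1, the two rows come out as:

```latex
F_1 = \frac{2PR}{P + R}

% Unlikely (0): P = 0.78, R = 0.82
F_1 = \frac{2 \times 0.78 \times 0.82}{0.78 + 0.82} = \frac{1.2792}{1.60} \approx 0.7995

% Likely (1): P = 0.81, R = 0.85
F_1 = \frac{2 \times 0.81 \times 0.85}{0.81 + 0.85} = \frac{1.3770}{1.66} \approx 0.8295
```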

Example Classification

Example 1:
Text: "This new vaccine causes severe side effects in a majority of patients, which is something the authorities don’t want you to know."
Actual Label: Likely (1)
Model Prediction: Likely (1)

Limitations and Future Work

False Positives: The model may label subjective statements as 'Likely' even when they contain no explicit disinformation techniques.
Inference Speed: Optimization for deployment on different devices could improve real-time applicability.

Citation

If you use this model, please cite our work as follows:

@inproceedings{Raza2024LlamaBiasClassifier,
  title={Llama3.2 NLP Bias Classifier for Disinformation Detection},
  author={Shaina Raza},
  year={2024}
}

For more information, see the model repository: https://huggingface.co/POLLCHECK/Llama3.2-bias-classifier