	---
	license: cc
	datasets:
	- vector-institute/newsmediabias-plus
	language:
	- en
	tags:
	- bias
	- classification
	- llm
	- multimodal
	---

	# Llama3.2 NLP Bias Classifier

This model merges the base Llama-3.2 architecture with a custom adapter to classify text by disinformation likelihood, distinguishing manipulative content from unbiased sources. It looks for rhetorical techniques commonly used in disinformation and labels each text 'Likely' or 'Unlikely' based on those structured indicators.

	## Model Details

	- Base Model: [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)
	- Deployment Environment: Configured for GPU (CUDA) support.
- Training Data: https://huggingface.co/datasets/vector-institute/newsmediabias-plus
	- Sampled data for inference: https://huggingface.co/vector-institute/Llama3.2-Multimodal-Newsmedia-Bias-Detector/blob/main/sampled-data/sample_dataset.csv
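The sample-data link above points to the Hub's HTML file viewer (`/blob/`), which is not itself a CSV. As a minimal sketch, assuming the Hub's usual convention that swapping `/blob/` for `/resolve/` gives the raw file, the URL can be converted before passing it to pandas. The helper name `to_raw_url` is hypothetical and not part of this repository.

```python
def to_raw_url(blob_url: str) -> str:
    """Turn a Hugging Face Hub file-viewer URL into a direct-download URL."""
    # Swap only the first "/blob/" path segment; the rest of the URL is unchanged.
    return blob_url.replace("/blob/", "/resolve/", 1)


SAMPLE_URL = (
    "https://huggingface.co/vector-institute/"
    "Llama3.2-Multimodal-Newsmedia-Bias-Detector/"
    "blob/main/sampled-data/sample_dataset.csv"
)

raw_url = to_raw_url(SAMPLE_URL)
print(raw_url)
```

With pandas installed and network access, `pd.read_csv(raw_url)` should then load the sample data without a manual download.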

	## Model Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from tqdm import tqdm
import pandas as pd
from sklearn.metrics import precision_recall_fscore_support, accuracy_score

LLAMA_MODEL_HF_ID = "vector-institute/Llama3.2-NLP-Newsmedia-Bias-Detector"

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load tokenizer
print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(LLAMA_MODEL_HF_ID)
tokenizer.pad_token = tokenizer.eos_token

# Load the merged model (base model + adapter) in half precision
print("Loading model...")
model = AutoModelForCausalLM.from_pretrained(
    LLAMA_MODEL_HF_ID,
    torch_dtype=torch.float16,
    device_map="auto"
)

model.eval()

# Generate a completion for one prompt and return only the newly generated text
def generate_response(model, prompt):
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=1024
    ).to(device)
    with torch.no_grad():
        outputs = model.generate(
            input_ids=inputs['input_ids'],
            attention_mask=inputs['attention_mask'],
            max_new_tokens=50,
            temperature=0.7,
            do_sample=True,
            top_p=0.95
        )
    # Slice off the prompt tokens so only the model's answer is decoded
    generated_text = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
    return generated_text.strip()

# Load your test dataset
print("Loading test dataset...")
df = pd.read_csv('sample_dataset.csv')  # https://huggingface.co/vector-institute/Llama3.2-Multimodal-Newsmedia-Bias-Detector/blob/main/sampled-data/sample_dataset.csv

# Ensure the 'final_label' is in ['Likely', 'Unlikely']
df = df[df['final_label'].isin(['Likely', 'Unlikely'])]

# Balance the dataset
likely_samples = df[df['final_label'] == 'Likely']
unlikely_samples = df[df['final_label'] == 'Unlikely']

num_samples_per_category = min(10, len(likely_samples), len(unlikely_samples))

likely_selected = likely_samples.sample(n=num_samples_per_category, random_state=42)
unlikely_selected = unlikely_samples.sample(n=num_samples_per_category, random_state=42)

balanced_samples = pd.concat([likely_selected, unlikely_selected]).reset_index(drop=True)

# Prepare test samples directly
def format_data(sample):
    prompt = (
        "Assess the text below for potential disinformation by identifying the presence of rhetorical techniques listed.\n"
        "If you find some of the listed rhetorical techniques below, then the article is likely disinformation; if not, it is likely not disinformation.\n\n"
        "Rhetorical Techniques Checklist:\n"
        "- Emotional Appeal: Uses language or imagery that intentionally invokes extreme emotions like fear or anger, aiming to distract from lack of factual backing.\n"
        "- Exaggeration and Hyperbole: Makes claims that are unsupported by evidence, or presents normal situations as extraordinary to manipulate perceptions.\n"
        "- Bias and Subjectivity: Presents information in a way that unreasonably favors one perspective, omitting key facts that might provide balance.\n"
        "- Repetition: Uses repeated messaging of specific points or misleading statements to embed a biased viewpoint in the reader's mind.\n"
        "- Specific Word Choices: Employs emotionally charged or misleading terms to sway opinions subtly, often in a manipulative manner.\n"
        "- Appeals to Authority: References authorities who lack relevant expertise or cites sources that do not have the credentials to be considered authoritative in the context.\n"
        "- Lack of Verifiable Sources: Relies on sources that either cannot be verified or do not exist, suggesting a fabrication of information.\n"
        "- Logical Fallacies: Engages in flawed reasoning such as circular reasoning, strawman arguments, or ad hominem attacks that undermine logical debate.\n"
        "- Conspiracy Theories: Propagates theories that lack proof and often contain elements of paranoia or implausible scenarios as facts.\n"
        "- Inconsistencies and Factual Errors: Contains multiple contradictions or factual inaccuracies that are easily disprovable, indicating a lack of concern for truth.\n"
        "- Selective Omission: Deliberately leaves out crucial information that is essential for a fair understanding of the topic, skewing perception.\n"
        "- Manipulative Framing: Frames issues in a way that leaves out alternative perspectives or possible explanations, focusing only on aspects that support a biased narrative.\n\n"
        f"{sample['first_paragraph']}\n\n"
        "Respond ONLY with the classification 'Likely (1)' or 'Unlikely (0)' without any additional explanation."
    )
    response = f"This text should be classified as: {'Likely (1)' if sample['final_label'] == 'Likely' else 'Unlikely (0)'}"
    return {"prompt": prompt, "response": response, "text": sample['first_paragraph'], "actual_label": sample['final_label']}

test_samples = [format_data(sample) for _, sample in balanced_samples.iterrows()]

# Generate predictions and collect results
print("Generating predictions...")
results = []

for idx, sample in enumerate(tqdm(test_samples, desc="Processing samples")):
    prompt = sample["prompt"]
    true_label = 1 if "Likely (1)" in sample["response"] else 0

    # Generate response using the merged model
    merged_response = generate_response(model, prompt)
    merged_predicted_label = 1 if "Likely (1)" in merged_response else 0

    # Save results
    results.append({
        "text": sample["text"],
        "actual_label": true_label,
        "merged_response": merged_response,
        "merged_predicted_label": merged_predicted_label
    })

# Convert results to DataFrame
results_df = pd.DataFrame(results)
results_df.to_csv('nlp-results.csv')

# Display per-class precision, recall, F1, and overall accuracy
labels = ['Unlikely (0)', 'Likely (1)']
y_true = results_df['actual_label']
y_pred = results_df['merged_predicted_label']
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=[0, 1], zero_division=0
)
for name, p, r, f in zip(labels, precision, recall, f1):
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"Accuracy: {accuracy_score(y_true, y_pred):.2f}")

# Optional: Print some example predictions
for i in range(min(5, len(results_df))):  # Adjust the count as needed
    sample = results_df.iloc[i]
    print(f"\nExample {i+1}:")
    print(f"Text: {sample['text']}")
    print(f"Actual Label: {'Likely (1)' if sample['actual_label'] == 1 else 'Unlikely (0)'}")
    print(f"Merged Model Prediction: {sample['merged_response']}")
```
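The usage script maps a response to a label by checking for the exact substring `Likely (1)`, so any other wording silently counts as 'Unlikely'. The sketch below shows one possible, more tolerant parser. It is an illustration rather than part of the released pipeline, and the name `parse_label` is hypothetical. Unlike the script, it returns `None` for responses it cannot interpret, so those cases can be counted separately.

```python
import re
from typing import Optional


def parse_label(response: str) -> Optional[int]:
    """Map a free-text model response to 1 (Likely), 0 (Unlikely), or None."""
    text = response.strip().lower()
    # Word boundaries keep "unlikely" from also matching as "likely".
    has_unlikely = re.search(r"\bunlikely\b", text) is not None
    has_likely = re.search(r"\blikely\b", text) is not None
    # Neither label, or both labels, means the response is ambiguous.
    if has_likely == has_unlikely:
        return None
    return 1 if has_likely else 0


for reply in ["Likely (1)", "unlikely", "I cannot decide."]:
    print(repr(reply), "->", parse_label(reply))
```

Whether unparseable responses should count as errors or be excluded from the metrics is a design choice this sketch leaves open.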

	## Dataset and Evaluation

	- Input Dataset: Sample data from `sample_dataset.csv` containing balanced examples of 'Likely' and 'Unlikely' disinformation.
	- Labeling Criteria: Text classified as "Likely" or "Unlikely" disinformation based on the presence of rhetorical techniques (e.g., exaggeration, emotional appeal).
	- Metrics: Precision, recall, F1 score, and accuracy, computed with `sklearn.metrics`.
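To make the metric definitions concrete, here is a minimal dependency-free sketch of per-class precision, recall, and F1 plus overall accuracy for binary labels. It is meant to mirror what `sklearn.metrics` reports, not to replace it. The function name `binary_report` and the toy labels are made up for illustration and are not model output.

```python
def binary_report(y_true, y_pred):
    """Return ({label: {precision, recall, f1}}, accuracy) for labels 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    report = {}
    for label in (0, 1):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return report, accuracy


# Toy labels for illustration only (1 = Likely, 0 = Unlikely).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]

report, accuracy = binary_report(y_true, y_pred)
for label, scores in report.items():
    print(label, {name: round(value, 3) for name, value in scores.items()})
print("accuracy", round(accuracy, 3))
```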

	### Model Performance


| Label          | Precision | Recall | F1 Score |
|----------------|-----------|--------|----------|
| Unlikely (0)   | 78%       | 82%    | 79.95%   |
| Likely (1)     | 81%       | 85%    | 82.95%   |
| Accuracy       |           |        | 87%      |
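The F1 column is the harmonic mean of the precision and recall in the same row. As a quick worked check, the sketch below recomputes it from the table's rounded percentages; the helper name `f1_from_pr` is hypothetical.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall taken from the table above, as fractions.
rows = [("Unlikely (0)", 0.78, 0.82), ("Likely (1)", 0.81, 0.85)]

for label, p, r in rows:
    print(f"{label}: F1 = {100 * f1_from_pr(p, r):.2f}%")
```

This gives 79.95% and 82.95%, matching the reported F1 scores.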

	### Example Classification

	```plaintext
	Example 1:
	Text: "This new vaccine causes severe side effects in a majority of patients, which is something the authorities don’t want you to know."
	Actual Label: Likely (1)
	Model Prediction: Likely (1)
	```

	## Limitations and Future Work

- False Positives: The model may label subjective statements as 'Likely' even when they use no explicit disinformation technique.
	- Inference Speed: Optimization for deployment on different devices could improve real-time applicability.

	## Citation

	If you use this model, please cite our work as follows:
	```
	@inproceedings{Raza2024LlamaBiasClassifier,
	title={Llama3.2 NLP Bias Classifier for Disinformation Detection},
	author={Shaina Raza},
	year={2024}
	}
	```

	For more information, contact Shaina Raza, PhD at [email protected]