---
license: cc
datasets:
- vector-institute/newsmediabias-plus
language:
- en
tags:
- bias
- classification
- llm
- multimodal
---

# Llama3.2 NLP Bias Classifier

This model merges the base Llama-3.2 model with a custom adapter to classify text by disinformation likelihood, distinguishing manipulative content from unbiased sources. It detects rhetorical techniques commonly used in disinformation and labels each input 'Likely' or 'Unlikely' based on structured indicators.

## Model Details

- **Base Model**: [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Lla
- **Deployment Environment**: Configured for GPU (CUDA) support.

## Model Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import pandas as pd

# Use the GPU when one is available, otherwise fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load tokenizer and model ("save_directory" is the path to the saved model)
tokenizer = AutoTokenizer.from_pretrained("save_directory")
if tokenizer.pad_token is None:
    # padding=True requires a pad token; reuse EOS if none is defined
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("save_directory", torch_dtype=torch.float16).to(device)
```

### Generating Predictions

The model evaluates text for disinformation by identifying rhetorical techniques. To classify input text, use the `generate_response` function:

```python
def generate_response(model, prompt):
    # Tokenize the prompt and move the tensors to the model's device
    inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True, max_length=1024).to(device)
    # Pass the attention mask along with the input IDs; do_sample=True is
    # required for temperature and top_p to take effect
    outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7, top_p=0.95)
    # The decoded text contains the prompt followed by the generated continuation
    return tokenizer.decode(outputs[0], skip_special_tokens=True).strip()
```

## Dataset and Evaluation

- **Input Dataset**: Sample data from `sample_dataset.csv` containing balanced examples of 'Likely' and 'Unlikely' disinformation.
- **Labeling Criteria**: Text is classified as "Likely" or "Unlikely" disinformation based on the presence of rhetorical techniques (e.g., exaggeration, emotional appeal).
- **Metrics**: Precision, recall, F1 score, and accuracy, computed with `sklearn.metrics`.
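
As a rough illustration of the evaluation step, the sketch below maps free-form model output to the 0/1 labels and computes per-class precision, recall, and F1 plus overall accuracy. It is written in plain Python so it runs without dependencies; `sklearn.metrics.precision_recall_fscore_support` and `accuracy_score` should give the same numbers. The helper names (`parse_label`, `per_class_metrics`, `accuracy`) and the sample outputs are hypothetical and are not taken from the original evaluation code.

```python
def parse_label(text):
    """Map free-form model output to 1 (Likely), 0 (Unlikely), or None."""
    lowered = text.lower()
    # Check "unlikely" first because it contains the substring "likely"
    if "unlikely" in lowered:
        return 0
    if "likely" in lowered:
        return 1
    return None  # the output mentioned neither label


def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up outputs and gold labels, for illustration only
raw_outputs = ["Likely", "Unlikely", "likely disinformation", "Unlikely", "Likely"]
y_true = [1, 0, 1, 1, 0]
y_pred = [parse_label(output) for output in raw_outputs]

for label, name in [(0, "Unlikely (0)"), (1, "Likely (1)")]:
    p, r, f1 = per_class_metrics(y_true, y_pred, label)
    print(f"{name}: precision={p:.2%} recall={r:.2%} f1={f1:.2%}")
print(f"Accuracy: {accuracy(y_true, y_pred):.2%}")
```

If the text passed to `parse_label` still includes the prompt, any label words in the prompt will confuse the match, so this sketch assumes only the generated continuation is passed in.
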
### Model Performance

| Label          | Precision | Recall | F1 Score |
|----------------|-----------|--------|----------|
| Unlikely (0)   | 78%       | 82%    | 79.95%   |
| Likely (1)     | 81%       | 85%    | 82.95%   |
| **Accuracy**   |           |        | 87%      |

### Example Classification

```plaintext
Example 1:
Text: "This new vaccine causes severe side effects in a majority of patients, which is something the authorities don’t want you to know."
Actual Label: Likely (1)
Model Prediction: Likely (1)
```

## Limitations and Future Work

- **False Positives**: May misclassify subjective statements lacking explicit disinformation techniques.
- **Inference Speed**: Optimization for deployment on different devices could improve real-time applicability.

## Citation

If you use this model, please cite our work as follows:

```
@inproceedings{Raza2024LlamaBiasClassifier,
  title={Llama3.2 NLP Bias Classifier for Disinformation Detection},
  author={Shaina Raza},
  year={2024}
}
```

For more information, see the model repository: [https://huggingface.co/POLLCHECK/Llama3.2-bias-classifier](https://huggingface.co/POLLCHECK/Llama3.2-bias-classifier)