LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content

Overview

LlamaLens is a specialized multilingual LLM designed for analyzing news and social media content. It focuses on 18 NLP tasks, leveraging 52 datasets across Arabic, English, and Hindi.

Dataset

The model was trained on the LlamaLens dataset.

To Replicate the Experiments

The code to replicate the experiments is available on GitHub.

Model Inference

To utilize the LlamaLens model for inference, follow these steps:

Install the Required Libraries:

Ensure you have the necessary libraries installed. You can do this using pip:
```
pip install transformers torch accelerate
```
Load the Model and Tokenizer: Use the transformers library to load the LlamaLens model and its tokenizer:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define model path
MODEL_PATH = "QCRI/LlamaLens"

# Load model and tokenizer
device_map = "auto"
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, device_map=device_map)
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
```

Prepare the Input: Build the chat prompt and tokenize it:

```python
# Define task and input text
task = "classification"  # Change to "summarization" for summarization tasks
instruction = (
    "Analyze the text and indicate if it shows an emotion, then label it as joy, love, fear,"
    " anger, sadness, or surprise. Return only the label without any explanation, justification, or additional text."
)
input_text = "I am not creating anything I feel satisfied with."
output_prefix = "Summary: " if task == "summarization" else "Label: "

# Define messages for chat-based prompt format
messages = [
    {"role": "system", "content": "You are a social media expert providing accurate analysis and insights."},
    {"role": "user", "content": f"{instruction}\nInput: {input_text}"},
    {"role": "assistant", "content": output_prefix}
]

# Tokenize input; continue_final_message=True leaves the assistant turn open
# so the model continues right after the output prefix
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=False,
    continue_final_message=True,
    tokenize=True,
    padding=True,
    return_tensors="pt"
).to(model.device)
```
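The prompt construction above can be wrapped in a small helper so the same code serves both classification and summarization. This is a minimal sketch, not code from the LlamaLens repository: `build_messages` and `SYSTEM_PROMPT` are hypothetical names, and rejecting unknown task names is a design choice made here (the snippet above simply falls back to the "Label: " prefix).

```python
# Hypothetical helper; system prompt and prefixes are copied from the example above.
SYSTEM_PROMPT = "You are a social media expert providing accurate analysis and insights."


def build_messages(task, instruction, input_text):
    """Return chat messages whose final assistant turn holds the output prefix."""
    if task not in ("classification", "summarization"):
        raise ValueError(f"unsupported task: {task!r}")
    output_prefix = "Summary: " if task == "summarization" else "Label: "
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"{instruction}\nInput: {input_text}"},
        {"role": "assistant", "content": output_prefix},
    ]


messages = build_messages(
    "classification",
    "Label the emotion as joy, love, fear, anger, sadness, or surprise.",
    "I am not creating anything I feel satisfied with.",
)
```

The returned list can be passed to `tokenizer.apply_chat_template` with `continue_final_message=True`, exactly as in the tokenization step.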

Generate the Output: Generate a response using the model:

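```python
# Generate response with greedy decoding (do_sample=False).
# No temperature is passed: it only applies when sampling, so with
# do_sample=False it would be ignored and trigger a warning.
outputs = model.generate(
    input_ids,
    max_new_tokens=128,
    do_sample=False,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id
)

# Decode only the newly generated tokens (skip the prompt) and print the response
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```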
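For classification tasks, the decoded text may carry stray whitespace, punctuation, or extra words despite the instruction. The sketch below shows one way to map it back to the allowed label set. It is an assumption about post-processing, not part of the LlamaLens code: `normalize_label` and `EMOTION_LABELS` are hypothetical names, and the substring fallback is deliberately naive (it would match "joy" inside "enjoy").

```python
# Labels taken from the emotion instruction in the example above.
EMOTION_LABELS = ["joy", "love", "fear", "anger", "sadness", "surprise"]


def normalize_label(response, labels):
    """Map raw generated text to one of `labels`, or None if none matches."""
    # Trim whitespace first, then surrounding quotes/periods, then lowercase.
    text = response.strip().strip(".\"'").lower()
    lowered = {label.lower(): label for label in labels}
    # Preferred case: the model returned exactly one label.
    if text in lowered:
        return lowered[text]
    # Fallback: pick the label that appears earliest in the text.
    hits = [(text.find(key), label) for key, label in lowered.items() if key in text]
    return min(hits)[1] if hits else None


print(normalize_label(" Sadness.\n", EMOTION_LABELS))
```

Returning `None` for unmatched outputs keeps invalid generations visible, so they can be counted separately during evaluation instead of being silently assigned to a class.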

Results

Below, we present the performance of LlamaLens (L-Lens), where "Eng" refers to the English-instructed model and "Native" refers to the model trained with native-language instructions. The results are compared against the SOTA (where available) and the Llama-Instruct 3.1 baseline (Base). The Δ (Delta) column indicates the difference between the English-instructed LlamaLens and the SOTA performance, calculated as (L-Lens-Eng - SOTA); a positive Δ means LlamaLens outperforms the SOTA. Entries marked "--" have no reported SOTA.
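As a worked illustration of the Δ column, the sketch below recomputes it for three rows of the tables that follow. The `delta` function and the `rows` layout are illustrative names introduced here; the scores are copied from the tables, and `None` stands in for a "--" entry.

```python
def delta(l_lens_eng, sota):
    """Return L-Lens-Eng minus SOTA, rounded to 3 decimals; None if SOTA is missing."""
    if sota is None:
        return None
    return round(l_lens_eng - sota, 3)


# (language, dataset, SOTA, L-Lens-Eng), values taken from the results tables
rows = [
    ("Arabic", "ASND", 0.770, 0.919),
    ("English", "QProp", 0.667, 0.963),
    ("Hindi", "fake-news", None, 0.994),
]

for language, dataset, sota, l_lens_eng in rows:
    print(language, dataset, delta(l_lens_eng, sota))
```

A few published Δ values differ by 0.001 from what this recomputation gives on the rounded scores (for example, Arabic xlsum), possibly because they were derived from unrounded scores.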

Arabic

Task	Dataset	Metric	SOTA	Base	L-Lens-Eng	L-Lens-Native	Δ (L-Lens-Eng - SOTA)
Attentionworthiness Detection	CT22Attentionworthy	W-F1	0.412	0.158	0.425	0.454	0.013
Checkworthiness Detection	CT24_checkworthy	F1_Pos	0.569	0.610	0.502	0.509	-0.067
Claim Detection	CT22Claim	Acc	0.703	0.581	0.734	0.756	0.031
Cyberbullying Detection	ArCyc_CB	Acc	0.863	0.766	0.870	0.833	0.007
Emotion Detection	Emotional-Tone	W-F1	0.658	0.358	0.705	0.736	0.047
Emotion Detection	NewsHeadline	Acc	1.000	0.406	0.480	0.458	-0.520
Factuality	Arafacts	Mi-F1	0.850	0.210	0.771	0.738	-0.079
Factuality	COVID19Factuality	W-F1	0.831	0.492	0.800	0.840	-0.031
Harmfulness Detection	CT22Harmful	F1_Pos	0.557	0.507	0.523	0.535	-0.034
Hate Speech Detection	annotated-hatetweets-4-classes	W-F1	0.630	0.257	0.526	0.517	-0.104
Hate Speech Detection	OSACT4SubtaskB	Mi-F1	0.950	0.819	0.955	0.955	0.005
News Categorization	ASND	Ma-F1	0.770	0.587	0.919	0.929	0.149
News Categorization	SANADAkhbarona-news-categorization	Acc	0.940	0.784	0.954	0.953	0.014
News Categorization	SANADAlArabiya-news-categorization	Acc	0.974	0.893	0.987	0.985	0.013
News Categorization	SANADAlkhaleej-news-categorization	Acc	0.986	0.865	0.984	0.982	-0.002
News Categorization	UltimateDataset	Ma-F1	0.970	0.376	0.865	0.880	-0.105
News Credibility	NewsCredibilityDataset	Acc	0.899	0.455	0.935	0.933	0.036
News Summarization	xlsum	R-2	0.137	0.034	0.129	0.130	-0.009
Offensive Language Detection	ArCyc_OFF	Ma-F1	0.878	0.489	0.877	0.879	-0.001
Offensive Language Detection	OSACT4SubtaskA	Ma-F1	0.905	0.782	0.896	0.882	-0.009
Propaganda Detection	ArPro	Mi-F1	0.767	0.597	0.747	0.731	-0.020
Sarcasm Detection	ArSarcasm-v2	F1_Pos	0.584	0.477	0.520	0.542	-0.064
Sentiment Classification	ar_reviews_100k	F1_Pos	--	0.681	0.785	0.779	--
Sentiment Classification	ArSAS	Acc	0.920	0.603	0.800	0.804	-0.120
Stance Detection	stance	Ma-F1	0.767	0.608	0.926	0.881	0.159
Stance Detection	Mawqif-Arabic-Stance-main	Ma-F1	0.789	0.764	0.853	0.826	0.065
Subjectivity Detection	ThatiAR	F1_Pos	0.800	0.562	0.441	0.383	-0.359

English

Task	Dataset	Metric	SOTA	Base	L-Lens-Eng	L-Lens-Native	Δ (L-Lens-Eng - SOTA)
Checkworthiness Detection	CT24_checkworthy	F1_Pos	0.753	0.404	0.942	0.942	0.189
Claim Detection	claim-detection	Mi-F1	--	0.545	0.864	0.889	--
Cyberbullying Detection	Cyberbullying	Acc	0.907	0.175	0.836	0.855	-0.071
Emotion Detection	emotion	Ma-F1	0.790	0.353	0.803	0.808	0.013
Factuality	News_dataset	Acc	0.920	0.654	1.000	1.000	0.080
Factuality	Politifact	W-F1	0.490	0.121	0.287	0.311	-0.203
News Categorization	CNN_News_Articles_2011-2022	Acc	0.940	0.644	0.970	0.970	0.030
News Categorization	News_Category_Dataset	Ma-F1	0.769	0.970	0.824	0.520	0.055
News Genre Categorisation	SemEval23T3-subtask1	Mi-F1	0.815	0.687	0.241	0.253	-0.574
News Summarization	xlsum	R-2	0.152	0.074	0.182	0.181	0.030
Offensive Language Detection	Offensive_Hateful_Dataset_New	Mi-F1	--	0.692	0.814	0.813	--
Offensive Language Detection	offensive_language_dataset	Mi-F1	0.994	0.646	0.899	0.893	-0.095
Offensive Language and Hate Speech	hate-offensive-speech	Acc	0.945	0.602	0.931	0.935	-0.014
Propaganda Detection	QProp	Ma-F1	0.667	0.759	0.963	0.973	0.296
Sarcasm Detection	News-Headlines-Dataset-For-Sarcasm-Detection	Acc	0.897	0.668	0.936	0.947	0.039
Sentiment Classification	NewsMTSC-dataset	Ma-F1	0.817	0.628	0.751	0.748	-0.066
Subjectivity Detection	clef2024-checkthat-lab	Ma-F1	0.744	0.535	0.642	0.628	-0.102

Hindi

Task	Dataset	Metric	SOTA	Base	L-Lens-Eng	L-Lens-Native	Δ (L-Lens-Eng - SOTA)
Factuality	fake-news	Mi-F1	--	0.759	0.994	0.993	--
Hate Speech Detection	hate-speech-detection	Mi-F1	0.639	0.750	0.963	0.963	0.324
Hate Speech Detection	Hindi-Hostility-Detection-CONSTRAINT-2021	W-F1	0.841	0.469	0.753	0.753	-0.088
Natural Language Inference	Natural Language Inference	W-F1	0.646	0.633	0.568	0.679	-0.078
News Summarization	xlsum	R-2	0.136	0.078	0.171	0.170	0.035
Offensive Language Detection	Offensive Speech Detection	Mi-F1	0.723	0.621	0.862	0.865	0.139
Cyberbullying Detection	MC_Hinglish1	Acc	0.609	0.233	0.625	0.627	0.016
Sentiment Classification	Sentiment Analysis	Acc	0.697	0.552	0.647	0.654	-0.050

Paper

For an in-depth understanding, refer to our paper: LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content.

License

This model is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0).

Citation

Please cite our paper when using this model:

   @article{kmainasi2024llamalensspecializedmultilingualllm,
     title={LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content},
     author={Mohamed Bayan Kmainasi and Ali Ezzat Shahroor and Maram Hasanain and Sahinur Rahman Laskar and Naeemul Hassan and Firoj Alam},
     year={2024},
     journal={arXiv preprint arXiv:2410.15308},
     volume={},
     number={},
     pages={},
     url={https://arxiv.org/abs/2410.15308},
     eprint={2410.15308},
     archivePrefix={arXiv},
     primaryClass={cs.CL}
   }
