LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content

Overview

LlamaLens is a specialized multilingual LLM designed for analyzing news and social media content. It focuses on 18 NLP tasks, leveraging 52 datasets across Arabic, English, and Hindi.

Dataset

The model was trained on the LlamaLens dataset.

To Replicate the Experiments

The code to replicate the experiments is available on GitHub.

Model Inference

To utilize the LlamaLens model for inference, follow these steps:

Install the Required Libraries:

Ensure you have the necessary libraries installed. You can do this using pip (accelerate is required because the model is loaded with device_map="auto"):
```
pip install transformers torch accelerate
```

Load the Model and Tokenizer:

Use the transformers library to load the LlamaLens model and its tokenizer:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define model path
MODEL_PATH = "QCRI/LlamaLens-Native"

# Load model and tokenizer; device_map="auto" places the weights on the available device(s)
device_map = "auto"
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, device_map=device_map)
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
# Reuse the end-of-sequence token for padding
tokenizer.pad_token = tokenizer.eos_token
```

Prepare the Input:

Build the chat prompt and tokenize your input text:

```python
# Define task and input text
task = "classification"  # Change to "summarization" for summarization tasks
# Sample input: an Arabic news article about a smartphone launch
input_text = '''دبي - "الخليج": كشفت شركة "لينوفو" أمس عن هاتفها الجديد "فايب زد2 برو" VIBE Z2 Pro، والذي يُعدّ الأقوى والأغنى بالمزايا ضمن سلسلة الهواتف الذكية VIBE رفيعة المستوى بين منتجات الشركة.
ويمتاز الهاتف الذكي الفاخر VIBE Z2 Pro بأنه يجمع بين أحدث تقنيات التصوير النقال المتطورة والتصميم الرشيق والنحيف.
ويتضمن الهاتف الذكي شاشة كبيرة بقطر 6 بوصات بدقة 2 ليقدم للمستهلكين تجربة بصرية رائعة مع محتوى الوسائط المتعددة.
وتستطيع الكاميرا الخلفية المتطورة بدقة 16 ميغابيكسل والمزودة بوظيفة التثبيت البصري للصورة من التقاط صور رائعة دون عناء وتسجيل فيديو فائق الوضوح بدقة 4 لتنافس بذلك الكاميرات المدمجة عالية المستوى.
كما يمتاز الهاتف VIBE Z2 Pro بتصميم معدني فريد وملمسٍ يشابه ملمس المعدن المصقول، وأداءٍ سريع بفضل معالج سنابدراغون 801 من كوالكوم، كلّ ذلك في جهاز أنيق ونحيف لا تتجاوز سماكته 7.7 ملم.
حساس صورة متفوق ويوفر الهاتف VIBE Z2 Pro مجموعة واسعة من مزايا التصوير الاحترافية، ويمتاز على الهواتف الذكية الأخرى باستخدامه حساس صورة مضاء من الخلف (BSI) بدقة 16 ميغابيكسل ونسبة 16:9 أي أنه يلتقط الصور للشاشة العريضة بالدقة الكاملة.
كما تدعم الكاميرا الخلفية تسجيل الفيديو عالي السرعة بمعدل 120 إطاراً في الثانية وبدقة 4.
وفضلاً عن ذلك، تتضافر وظيفة التثبيت البصري للصورة والعدسة المؤلفة من 6 عناصر مع ما يقدمه الحساس المضاء من الخلف لضمان أداء قوي للكاميرا في ظروف الإضاءة المنخفضة.
شاشة ساطعة فائقة النقاء ويرتقي الهاتف VIBE Z2 Pro فوق المنافسة بشاشةٍ رائعة يبلغ قطرها 6 بوصات تعرض صوراً فائقة النقاء وبألوان واقعية تماماً.
ولا تكتفي الشاشة بدقتها العالية (560.2*440.1 بيكسل)، بل إن كثافتها التي تبلغ 490 بكسل في البوصة تضعها في مصاف أفضل الهواتف الذكية على الإطلاق.
معالج فائق السرعة وذاكرة وفيرة ويعتمد الهاتف VIBE Z2 Pro على معالج سنابدراغون 801 من كوالكوم، وهو معالج متفوق رباعي النواة يوفر أفضل أداء من حيث السرعة وتعدد المهام.
ويستطيع المعالج الذي تصل سرعته إلى 5.2 غيغاهيرتز تحميل التطبيقات وتشغيلها أسرع من المعالجات الأخرى، ويضمن للمستخدمين سلاسة تجربة تعدد المهام بفضل بنيته رباعية النواة وسعة الذاكرة المرفقة بالمعالج والتي تبلغ 3 غيغابايت.
ولإتاحة تجربة استخدام طويلة دون انقطاع، تم تزويد الهاتف VIBE Z2 Pro ببطارية عالية السعة، فضلاً عن المزايا المتقدمة لتوفير الطاقة في معالج سنابدراغون 801.'''

# Native-language (Arabic) instruction. In English: "Classify the news article into one of
# the following categories: [culture, technology, medical, politics, religion, finance, sports]."
instruction = 'صنف المقالة الإخبارية إلى واحدة من الفئات التالية: [ثقافة, تكنولوجيا, طبي, سياسة, دين, تمويل, رياضة].'
output_prefix = "Summary: " if task == "summarization" else "Label: "

# Define messages for chat-based prompt format
messages = [
    {"role": "system", "content": "You are a social media expert providing accurate analysis and insights."},
    {"role": "user", "content": f"{instruction}\nInput: {input_text}"},
    {"role": "assistant", "content": output_prefix}
]

# Tokenize input; continue_final_message=True leaves the assistant turn open
# so the model continues right after output_prefix
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=False,
    continue_final_message=True,
    tokenize=True,
    padding=True,
    return_tensors="pt"
).to(model.device)
```
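
To switch between classification and summarization without editing the prompt by hand, the message assembly shown above can be wrapped in a small helper. This is only a sketch: the function name `build_messages`, the constant `SYSTEM_PROMPT`, and the English instruction in the usage line are illustrative assumptions, not part of the LlamaLens code base. The resulting list can be passed to `tokenizer.apply_chat_template` exactly as in the step above.

```python
from typing import Dict, List

# Same system prompt as in the inference example (the constant name is an assumption)
SYSTEM_PROMPT = "You are a social media expert providing accurate analysis and insights."


def build_messages(instruction: str, input_text: str, task: str = "classification") -> List[Dict[str, str]]:
    """Assemble chat messages that end with a partial assistant turn."""
    # Summarization outputs start with "Summary: ", every other task with "Label: "
    output_prefix = "Summary: " if task == "summarization" else "Label: "
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"{instruction}\nInput: {input_text}"},
        {"role": "assistant", "content": output_prefix},
    ]


# Hypothetical English instruction, used here only to show the call
messages = build_messages(
    "Classify the sentiment of the text as one of: [positive, negative].",
    "Great phone!",
)
print(messages[-1]["content"])
```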

Generate the Output:

Generate a response using the model:

```python
# Generate response with greedy decoding (do_sample=False), so no temperature is set
outputs = model.generate(
    input_ids,
    max_new_tokens=128,
    do_sample=False,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id
)

# Decode only the newly generated tokens (everything after the prompt) and print them
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```
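
For classification tasks, the decoded text may carry surrounding whitespace or extra tokens, so it can help to map it back onto the category list given in the instruction. The helper below is a minimal sketch of one way to do that; `extract_label` and `LABELS` are hypothetical names introduced for illustration, and the substring fallback is a simple heuristic rather than the authors' evaluation procedure.

```python
from typing import Optional, Sequence


def extract_label(response: str, labels: Sequence[str]) -> Optional[str]:
    """Return the allowed label found in the generated text, or None if there is none."""
    text = response.strip()
    # Prefer an exact match with one of the allowed labels
    for label in labels:
        if text == label:
            return label
    # Otherwise fall back to the label that occurs earliest in the text
    hits = [(text.find(label), label) for label in labels if label in text]
    return min(hits)[1] if hits else None


# Categories taken from the Arabic instruction in the example above
LABELS = ["ثقافة", "تكنولوجيا", "طبي", "سياسة", "دين", "تمويل", "رياضة"]

print(extract_label(" تكنولوجيا\n", LABELS))
```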

Results

Below, we present the performance of LlamaLens (abbreviated L-Lens), where "Eng" refers to the English-instructed model and "Native" refers to the model trained with native-language instructions. The results are compared against the SOTA (where available) and the Llama-Instruct 3.1 baseline (Base). The Δ (Delta) column indicates the difference between the English-instructed LlamaLens and the SOTA performance, calculated as (L-Lens-Eng – SOTA); a positive value means LlamaLens outperforms the SOTA, and "--" marks datasets with no reported SOTA.
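
As a worked example of how the Δ column is obtained, the short sketch below recomputes it for three Arabic rows, using the SOTA and L-Lens-Eng scores from the results table. The helper name `delta` and the `rows` list are illustrative only.

```python
# (dataset, SOTA, L-Lens-Eng) taken from the Arabic results table
rows = [
    ("CT22Attentionworthy", 0.412, 0.425),
    ("CT24_checkworthy", 0.569, 0.502),
    ("ASND", 0.770, 0.919),
]


def delta(sota: float, l_lens_eng: float) -> float:
    """Difference between the English-instructed LlamaLens score and the SOTA score."""
    return round(l_lens_eng - sota, 3)


for dataset, sota, l_lens_eng in rows:
    print(f"{dataset}: {delta(sota, l_lens_eng):+.3f}")
# CT22Attentionworthy: +0.013
# CT24_checkworthy: -0.067
# ASND: +0.149
```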

Arabic

| Task | Dataset | Metric | SOTA | Base | L-Lens-Eng | L-Lens-Native | Δ (L-Lens (Eng) - SOTA) |
|---|---|---|---|---|---|---|---|
| Attentionworthiness Detection | CT22Attentionworthy | W-F1 | 0.412 | 0.158 | 0.425 | 0.454 | 0.013 |
| Checkworthiness Detection | CT24_checkworthy | F1_Pos | 0.569 | 0.610 | 0.502 | 0.509 | -0.067 |
| Claim Detection | CT22Claim | Acc | 0.703 | 0.581 | 0.734 | 0.756 | 0.031 |
| Cyberbullying Detection | ArCyc_CB | Acc | 0.863 | 0.766 | 0.870 | 0.833 | 0.007 |
| Emotion Detection | Emotional-Tone | W-F1 | 0.658 | 0.358 | 0.705 | 0.736 | 0.047 |
| Emotion Detection | NewsHeadline | Acc | 1.000 | 0.406 | 0.480 | 0.458 | -0.520 |
| Factuality | Arafacts | Mi-F1 | 0.850 | 0.210 | 0.771 | 0.738 | -0.079 |
| Factuality | COVID19Factuality | W-F1 | 0.831 | 0.492 | 0.800 | 0.840 | -0.031 |
| Harmfulness Detection | CT22Harmful | F1_Pos | 0.557 | 0.507 | 0.523 | 0.535 | -0.034 |
| Hate Speech Detection | annotated-hatetweets-4-classes | W-F1 | 0.630 | 0.257 | 0.526 | 0.517 | -0.104 |
| Hate Speech Detection | OSACT4SubtaskB | Mi-F1 | 0.950 | 0.819 | 0.955 | 0.955 | 0.005 |
| News Categorization | ASND | Ma-F1 | 0.770 | 0.587 | 0.919 | 0.929 | 0.149 |
| News Categorization | SANADAkhbarona-news-categorization | Acc | 0.940 | 0.784 | 0.954 | 0.953 | 0.014 |
| News Categorization | SANADAlArabiya-news-categorization | Acc | 0.974 | 0.893 | 0.987 | 0.985 | 0.013 |
| News Categorization | SANADAlkhaleej-news-categorization | Acc | 0.986 | 0.865 | 0.984 | 0.982 | -0.002 |
| News Categorization | UltimateDataset | Ma-F1 | 0.970 | 0.376 | 0.865 | 0.880 | -0.105 |
| News Credibility | NewsCredibilityDataset | Acc | 0.899 | 0.455 | 0.935 | 0.933 | 0.036 |
| News Summarization | xlsum | R-2 | 0.137 | 0.034 | 0.129 | 0.130 | -0.009 |
| Offensive Language Detection | ArCyc_OFF | Ma-F1 | 0.878 | 0.489 | 0.877 | 0.879 | -0.001 |
| Offensive Language Detection | OSACT4SubtaskA | Ma-F1 | 0.905 | 0.782 | 0.896 | 0.882 | -0.009 |
| Propaganda Detection | ArPro | Mi-F1 | 0.767 | 0.597 | 0.747 | 0.731 | -0.020 |
| Sarcasm Detection | ArSarcasm-v2 | F1_Pos | 0.584 | 0.477 | 0.520 | 0.542 | -0.064 |
| Sentiment Classification | ar_reviews_100k | F1_Pos | -- | 0.681 | 0.785 | 0.779 | -- |
| Sentiment Classification | ArSAS | Acc | 0.920 | 0.603 | 0.800 | 0.804 | -0.120 |
| Stance Detection | stance | Ma-F1 | 0.767 | 0.608 | 0.926 | 0.881 | 0.159 |
| Stance Detection | Mawqif-Arabic-Stance-main | Ma-F1 | 0.789 | 0.764 | 0.853 | 0.826 | 0.065 |
| Subjectivity Detection | ThatiAR | F1_Pos | 0.800 | 0.562 | 0.441 | 0.383 | -0.359 |

English

| Task | Dataset | Metric | SOTA | Base | L-Lens-Eng | L-Lens-Native | Δ (L-Lens (Eng) - SOTA) |
|---|---|---|---|---|---|---|---|
| Checkworthiness Detection | CT24_checkworthy | F1_Pos | 0.753 | 0.404 | 0.942 | 0.942 | 0.189 |
| Claim Detection | claim-detection | Mi-F1 | -- | 0.545 | 0.864 | 0.889 | -- |
| Cyberbullying Detection | Cyberbullying | Acc | 0.907 | 0.175 | 0.836 | 0.855 | -0.071 |
| Emotion Detection | emotion | Ma-F1 | 0.790 | 0.353 | 0.803 | 0.808 | 0.013 |
| Factuality | News_dataset | Acc | 0.920 | 0.654 | 1.000 | 1.000 | 0.080 |
| Factuality | Politifact | W-F1 | 0.490 | 0.121 | 0.287 | 0.311 | -0.203 |
| News Categorization | CNN_News_Articles_2011-2022 | Acc | 0.940 | 0.644 | 0.970 | 0.970 | 0.030 |
| News Categorization | News_Category_Dataset | Ma-F1 | 0.769 | 0.970 | 0.824 | 0.520 | 0.055 |
| News Genre Categorisation | SemEval23T3-subtask1 | Mi-F1 | 0.815 | 0.687 | 0.241 | 0.253 | -0.574 |
| News Summarization | xlsum | R-2 | 0.152 | 0.074 | 0.182 | 0.181 | 0.030 |
| Offensive Language Detection | Offensive_Hateful_Dataset_New | Mi-F1 | -- | 0.692 | 0.814 | 0.813 | -- |
| Offensive Language Detection | offensive_language_dataset | Mi-F1 | 0.994 | 0.646 | 0.899 | 0.893 | -0.095 |
| Offensive Language and Hate Speech | hate-offensive-speech | Acc | 0.945 | 0.602 | 0.931 | 0.935 | -0.014 |
| Propaganda Detection | QProp | Ma-F1 | 0.667 | 0.759 | 0.963 | 0.973 | 0.296 |
| Sarcasm Detection | News-Headlines-Dataset-For-Sarcasm-Detection | Acc | 0.897 | 0.668 | 0.936 | 0.947 | 0.039 |
| Sentiment Classification | NewsMTSC-dataset | Ma-F1 | 0.817 | 0.628 | 0.751 | 0.748 | -0.066 |
| Subjectivity Detection | clef2024-checkthat-lab | Ma-F1 | 0.744 | 0.535 | 0.642 | 0.628 | -0.102 |

Hindi

| Task | Dataset | Metric | SOTA | Base | L-Lens-Eng | L-Lens-Native | Δ (L-Lens (Eng) - SOTA) |
|---|---|---|---|---|---|---|---|
| Factuality | fake-news | Mi-F1 | -- | 0.759 | 0.994 | 0.993 | -- |
| Hate Speech Detection | hate-speech-detection | Mi-F1 | 0.639 | 0.750 | 0.963 | 0.963 | 0.324 |
| Hate Speech Detection | Hindi-Hostility-Detection-CONSTRAINT-2021 | W-F1 | 0.841 | 0.469 | 0.753 | 0.753 | -0.088 |
| Natural Language Inference | Natural Language Inference | W-F1 | 0.646 | 0.633 | 0.568 | 0.679 | -0.078 |
| News Summarization | xlsum | R-2 | 0.136 | 0.078 | 0.171 | 0.170 | 0.035 |
| Offensive Language Detection | Offensive Speech Detection | Mi-F1 | 0.723 | 0.621 | 0.862 | 0.865 | 0.139 |
| Cyberbullying Detection | MC_Hinglish1 | Acc | 0.609 | 0.233 | 0.625 | 0.627 | 0.016 |
| Sentiment Classification | Sentiment Analysis | Acc | 0.697 | 0.552 | 0.647 | 0.654 | -0.050 |

Paper

For an in-depth understanding, refer to our paper: LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content.

License

This model is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

Citation

Please cite our paper when using this model:

```
@article{kmainasi2024llamalensspecializedmultilingualllm,
  title={LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content},
  author={Mohamed Bayan Kmainasi and Ali Ezzat Shahroor and Maram Hasanain and Sahinur Rahman Laskar and Naeemul Hassan and Firoj Alam},
  year={2024},
  journal={arXiv preprint arXiv:2410.15308},
  url={https://arxiv.org/abs/2410.15308},
  eprint={2410.15308},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
