LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content

Overview

LlamaLens is a specialized multilingual LLM designed for analyzing news and social media content. It focuses on 18 NLP tasks, leveraging 52 datasets across Arabic, English, and Hindi.

[Figure: LlamaLens capabilities, tasks, and datasets]

Dataset

The model was trained on the LlamaLens dataset.

To Replicate the Experiments

The code to replicate the experiments is available on GitHub.

Model Inference

To utilize the LlamaLens model for inference, follow these steps:

  1. Install the Required Libraries:

    Ensure you have the necessary libraries installed. You can do this using pip:

    pip install transformers torch
    
  2. Load the Model and Tokenizer: Use the transformers library to load the LlamaLens model and its tokenizer:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Define model path
MODEL_PATH = "QCRI/LlamaLens"

# Load model and tokenizer
device_map = "auto"
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, device_map=device_map)
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
  3. Prepare the Input: Build the chat-style prompt and tokenize it:
# Define task and input text
task = "classification"  # Change to "summarization" for summarization tasks
instruction = (
    "Analyze the text and indicate if it shows an emotion, then label it as joy, love, fear,"
    " anger, sadness, or surprise. Return only the label without any explanation, justification, or additional text."
)
input_text = "I am not creating anything I feel satisfied with."
output_prefix = "Summary: " if task == "summarization" else "Label: "

# Define messages for chat-based prompt format
messages = [
    {"role": "system", "content": "You are a social media expert providing accurate analysis and insights."},
    {"role": "user", "content": f"{instruction}\nInput: {input_text}"},
    {"role": "assistant", "content": output_prefix}
]

# Tokenize input
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=False,
    continue_final_message=True,
    tokenize=True,
    padding=True,
    return_tensors="pt"
).to(model.device)


  4. Generate the Output: Generate a response using the model:
# Generate response
outputs = model.generate(
    input_ids,
    max_new_tokens=128,
    do_sample=False,  # greedy decoding; temperature has no effect here, so it is omitted
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id
)

# Decode and print response
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
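
The steps above can be wrapped in small helpers so that the same prompt layout is reused across tasks. This is a minimal sketch, not part of the LlamaLens release: `build_messages` and `generate_answer` are hypothetical names, and `generate_answer` assumes the `model` and `tokenizer` objects loaded in step 2.

```python
from typing import Dict, List

# System prompt copied from the inference example above.
SYSTEM_PROMPT = "You are a social media expert providing accurate analysis and insights."


def build_messages(instruction: str, input_text: str,
                   task: str = "classification") -> List[Dict[str, str]]:
    """Hypothetical helper: build the chat messages in the layout shown above."""
    output_prefix = "Summary: " if task == "summarization" else "Label: "
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"{instruction}\nInput: {input_text}"},
        # The assistant turn is pre-filled so the model continues after the prefix.
        {"role": "assistant", "content": output_prefix},
    ]


def generate_answer(model, tokenizer, messages, max_new_tokens: int = 128) -> str:
    """Hypothetical helper: greedy decoding with the same settings as above."""
    input_ids = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=False,
        continue_final_message=True,
        tokenize=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        do_sample=False,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = outputs[0][input_ids.shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

With the model loaded, a call would look like `generate_answer(model, tokenizer, build_messages(instruction, input_text))`; passing `task="summarization"` switches the pre-filled prefix to "Summary: ".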

Results

Below, we present the performance of LlamaLens (L-Lens), where "Eng" refers to the English-instructed model and "Native" refers to the model trained with native-language instructions. The results are compared against the SOTA (where available) and the base model, Llama-Instruct 3.1 (Base). The Δ (Delta) column indicates the difference between the English-instructed LlamaLens and the SOTA performance, calculated as (L-Lens-Eng – SOTA); "--" marks datasets for which no SOTA is reported.
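
As a quick illustration of how the Δ column is read, the snippet below recomputes it from the rounded scores of two rows in the tables that follow. The `delta` function is a hypothetical helper for this example only. Recomputing from the rounded columns can differ from a reported Δ by 0.001 in a few rows, which would be consistent with Δ having been computed before rounding (an assumption, not stated in the source).

```python
def delta(l_lens_eng: float, sota: float) -> float:
    """Hypothetical helper: L-Lens-Eng score minus SOTA score, to 3 decimals."""
    return round(l_lens_eng - sota, 3)


# Arabic, Attentionworthiness Detection (CT22Attentionworthy): above SOTA
print(delta(0.425, 0.412))  # 0.013

# English, News Genre Categorisation (SemEval23T3-subtask1): below SOTA
print(delta(0.241, 0.815))  # -0.574
```

A positive value means the English-instructed LlamaLens scores above the reported SOTA on that dataset; a negative value means it scores below.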


Arabic

| Task | Dataset | Metric | SOTA | Base | L-Lens-Eng | L-Lens-Native | Δ (L-Lens-Eng - SOTA) |
|---|---|---|---|---|---|---|---|
| Attentionworthiness Detection | CT22Attentionworthy | W-F1 | 0.412 | 0.158 | 0.425 | 0.454 | 0.013 |
| Checkworthiness Detection | CT24_checkworthy | F1_Pos | 0.569 | 0.610 | 0.502 | 0.509 | -0.067 |
| Claim Detection | CT22Claim | Acc | 0.703 | 0.581 | 0.734 | 0.756 | 0.031 |
| Cyberbullying Detection | ArCyc_CB | Acc | 0.863 | 0.766 | 0.870 | 0.833 | 0.007 |
| Emotion Detection | Emotional-Tone | W-F1 | 0.658 | 0.358 | 0.705 | 0.736 | 0.047 |
| Emotion Detection | NewsHeadline | Acc | 1.000 | 0.406 | 0.480 | 0.458 | -0.520 |
| Factuality | Arafacts | Mi-F1 | 0.850 | 0.210 | 0.771 | 0.738 | -0.079 |
| Factuality | COVID19Factuality | W-F1 | 0.831 | 0.492 | 0.800 | 0.840 | -0.031 |
| Harmfulness Detection | CT22Harmful | F1_Pos | 0.557 | 0.507 | 0.523 | 0.535 | -0.034 |
| Hate Speech Detection | annotated-hatetweets-4-classes | W-F1 | 0.630 | 0.257 | 0.526 | 0.517 | -0.104 |
| Hate Speech Detection | OSACT4SubtaskB | Mi-F1 | 0.950 | 0.819 | 0.955 | 0.955 | 0.005 |
| News Categorization | ASND | Ma-F1 | 0.770 | 0.587 | 0.919 | 0.929 | 0.149 |
| News Categorization | SANADAkhbarona-news-categorization | Acc | 0.940 | 0.784 | 0.954 | 0.953 | 0.014 |
| News Categorization | SANADAlArabiya-news-categorization | Acc | 0.974 | 0.893 | 0.987 | 0.985 | 0.013 |
| News Categorization | SANADAlkhaleej-news-categorization | Acc | 0.986 | 0.865 | 0.984 | 0.982 | -0.002 |
| News Categorization | UltimateDataset | Ma-F1 | 0.970 | 0.376 | 0.865 | 0.880 | -0.105 |
| News Credibility | NewsCredibilityDataset | Acc | 0.899 | 0.455 | 0.935 | 0.933 | 0.036 |
| News Summarization | xlsum | R-2 | 0.137 | 0.034 | 0.129 | 0.130 | -0.009 |
| Offensive Language Detection | ArCyc_OFF | Ma-F1 | 0.878 | 0.489 | 0.877 | 0.879 | -0.001 |
| Offensive Language Detection | OSACT4SubtaskA | Ma-F1 | 0.905 | 0.782 | 0.896 | 0.882 | -0.009 |
| Propaganda Detection | ArPro | Mi-F1 | 0.767 | 0.597 | 0.747 | 0.731 | -0.020 |
| Sarcasm Detection | ArSarcasm-v2 | F1_Pos | 0.584 | 0.477 | 0.520 | 0.542 | -0.064 |
| Sentiment Classification | ar_reviews_100k | F1_Pos | -- | 0.681 | 0.785 | 0.779 | -- |
| Sentiment Classification | ArSAS | Acc | 0.920 | 0.603 | 0.800 | 0.804 | -0.120 |
| Stance Detection | stance | Ma-F1 | 0.767 | 0.608 | 0.926 | 0.881 | 0.159 |
| Stance Detection | Mawqif-Arabic-Stance-main | Ma-F1 | 0.789 | 0.764 | 0.853 | 0.826 | 0.065 |
| Subjectivity Detection | ThatiAR | F1_Pos | 0.800 | 0.562 | 0.441 | 0.383 | -0.359 |

English

| Task | Dataset | Metric | SOTA | Base | L-Lens-Eng | L-Lens-Native | Δ (L-Lens-Eng - SOTA) |
|---|---|---|---|---|---|---|---|
| Checkworthiness Detection | CT24_checkworthy | F1_Pos | 0.753 | 0.404 | 0.942 | 0.942 | 0.189 |
| Claim Detection | claim-detection | Mi-F1 | -- | 0.545 | 0.864 | 0.889 | -- |
| Cyberbullying Detection | Cyberbullying | Acc | 0.907 | 0.175 | 0.836 | 0.855 | -0.071 |
| Emotion Detection | emotion | Ma-F1 | 0.790 | 0.353 | 0.803 | 0.808 | 0.013 |
| Factuality | News_dataset | Acc | 0.920 | 0.654 | 1.000 | 1.000 | 0.080 |
| Factuality | Politifact | W-F1 | 0.490 | 0.121 | 0.287 | 0.311 | -0.203 |
| News Categorization | CNN_News_Articles_2011-2022 | Acc | 0.940 | 0.644 | 0.970 | 0.970 | 0.030 |
| News Categorization | News_Category_Dataset | Ma-F1 | 0.769 | 0.970 | 0.824 | 0.520 | 0.055 |
| News Genre Categorisation | SemEval23T3-subtask1 | Mi-F1 | 0.815 | 0.687 | 0.241 | 0.253 | -0.574 |
| News Summarization | xlsum | R-2 | 0.152 | 0.074 | 0.182 | 0.181 | 0.030 |
| Offensive Language Detection | Offensive_Hateful_Dataset_New | Mi-F1 | -- | 0.692 | 0.814 | 0.813 | -- |
| Offensive Language Detection | offensive_language_dataset | Mi-F1 | 0.994 | 0.646 | 0.899 | 0.893 | -0.095 |
| Offensive Language and Hate Speech | hate-offensive-speech | Acc | 0.945 | 0.602 | 0.931 | 0.935 | -0.014 |
| Propaganda Detection | QProp | Ma-F1 | 0.667 | 0.759 | 0.963 | 0.973 | 0.296 |
| Sarcasm Detection | News-Headlines-Dataset-For-Sarcasm-Detection | Acc | 0.897 | 0.668 | 0.936 | 0.947 | 0.039 |
| Sentiment Classification | NewsMTSC-dataset | Ma-F1 | 0.817 | 0.628 | 0.751 | 0.748 | -0.066 |
| Subjectivity Detection | clef2024-checkthat-lab | Ma-F1 | 0.744 | 0.535 | 0.642 | 0.628 | -0.102 |

Hindi

| Task | Dataset | Metric | SOTA | Base | L-Lens-Eng | L-Lens-Native | Δ (L-Lens-Eng - SOTA) |
|---|---|---|---|---|---|---|---|
| Factuality | fake-news | Mi-F1 | -- | 0.759 | 0.994 | 0.993 | -- |
| Hate Speech Detection | hate-speech-detection | Mi-F1 | 0.639 | 0.750 | 0.963 | 0.963 | 0.324 |
| Hate Speech Detection | Hindi-Hostility-Detection-CONSTRAINT-2021 | W-F1 | 0.841 | 0.469 | 0.753 | 0.753 | -0.088 |
| Natural Language Inference | Natural Language Inference | W-F1 | 0.646 | 0.633 | 0.568 | 0.679 | -0.078 |
| News Summarization | xlsum | R-2 | 0.136 | 0.078 | 0.171 | 0.170 | 0.035 |
| Offensive Language Detection | Offensive Speech Detection | Mi-F1 | 0.723 | 0.621 | 0.862 | 0.865 | 0.139 |
| Cyberbullying Detection | MC_Hinglish1 | Acc | 0.609 | 0.233 | 0.625 | 0.627 | 0.016 |
| Sentiment Classification | Sentiment Analysis | Acc | 0.697 | 0.552 | 0.647 | 0.654 | -0.050 |

Paper

For an in-depth understanding, refer to our paper: LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content.

License

This model is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Citation

Please cite our paper when using this model:

   @article{kmainasi2024llamalensspecializedmultilingualllm,
     title={LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content},
     author={Mohamed Bayan Kmainasi and Ali Ezzat Shahroor and Maram Hasanain and Sahinur Rahman Laskar and Naeemul Hassan and Firoj Alam},
     year={2024},
     journal={arXiv preprint arXiv:2410.15308},
     volume={},
     number={},
     pages={},
     url={https://arxiv.org/abs/2410.15308},
     eprint={2410.15308},
     archivePrefix={arXiv},
     primaryClass={cs.CL}
   }