--- license: cc-by-nc-sa-4.0 datasets: - QCRI/LlamaLens-English - QCRI/LlamaLens-Arabic - QCRI/LlamaLens-Hindi language: - ar - en - hi base_model: - meta-llama/Llama-3.1-8B-Instruct pipeline_tag: text-generation tags: - Social-Media - Hate-Speech - Summarization - offensive-language - News-Genre --- # LlamaLens: Specialized Multilingual LLM forAnalyzing News and Social Media Content ## Overview LlamaLens is a specialized multilingual LLM designed for analyzing news and social media content. It focuses on 19 NLP tasks, leveraging 52 datasets across Arabic, English, and Hindi.

capablities_tasks_datasets

## Dataset The model was trained on the [LlamaLens dataset](https://huggingface.co/collections/QCRI/llamalens-672f7e0604a0498c6a2f0fe9). ## To Replicate the Experiments The code to replicate the experiments is available on [GitHub](https://github.com/firojalam/LlamaLens). ## Model Inference To utilize the LlamaLens model for inference, follow these steps: 1. **Install the Required Libraries**: Ensure you have the necessary libraries installed. You can do this using pip: ```bash pip install transformers torch ``` 2. **Load the Model and Tokenizer:**: Use the transformers library to load the LlamaLens model and its tokenizer: ```python from transformers import AutoTokenizer, AutoModelForCausalLM model_name = "QCRI/LlamaLens" tokenizer = AutoTokenizer.from_pretrained(model_name) model = AutoModelForCausalLM.from_pretrained(model_name) ``` 3. **Prepare the Input:**: Tokenize your input text: ```python input_text = "Your input text here" inputs = tokenizer(input_text, return_tensors="pt") ``` 4. **Generate the Output:**: Generate a response using the model: ```python output = model.generate(**inputs) response = tokenizer.decode(output[0], skip_special_tokens=True) print(response) ``` ## Paper For an in-depth understanding, refer to our paper: [**LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content**](https://arxiv.org/pdf/2410.15308). # License This model is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0). # Citation Please cite [our paper](https://arxiv.org/pdf/2410.15308) when using this model: ``` @article{kmainasi2024llamalensspecializedmultilingualllm, title={LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content}, author={Mohamed Bayan Kmainasi and Ali Ezzat Shahroor and Maram Hasanain and Sahinur Rahman Laskar and Naeemul Hassan and Firoj Alam}, year={2024}, journal={arXiv preprint arXiv:2410.15308}, volume={}, number={}, pages={}, url={https://arxiv.org/abs/2410.15308}, eprint={2410.15308}, archivePrefix={arXiv}, primaryClass={cs.CL} } ```