|
--- |
|
language: |
|
- ar |
|
pipeline_tag: visual-question-answering |
|
--- |
|
|
|
# Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic |
|
|
|
Dallah is an advanced multimodal large language model (MLLM) tailored for the Arabic language, with a specific focus on understanding and generating content across various Arabic dialects. Built upon the **LLaVA** framework and powered by the **LLaMA-2** architecture, Dallah integrates both textual and visual data to facilitate comprehensive multimodal interactions. |
|
|
|
## Model Details |
|
|
|
- **Architecture**: LLaVA-based multimodal model with LLaMA-2 backbone. |
|
- **Languages Supported**: Modern Standard Arabic (MSA) and six major Arabic dialects. |
|
- **Modalities**: Text and image. |
|
|
|
## Training Data |
|
|
|
Dallah was fine-tuned on a diverse dataset encompassing both textual and visual information: |
|
- **Textual Data**: Includes MSA and six prominent Arabic dialects, ensuring the model's proficiency across different regional linguistic variations. |
|
- **Visual Data**: Comprised of image-text pairs, enabling the model to process and generate content that integrates both modalities. |
|
|
|
## Performance |
|
|
|
Dallah demonstrates state-of-the-art performance in Arabic MLLMs: |
|
- Excels in both MSA and dialectal Arabic benchmarks. |
|
- Effectively handles complex multimodal interactions involving textual and visual elements. |
|
|
|
## Applications |
|
|
|
Dallah’s multimodal and dialect-aware capabilities make it suitable for a range of applications, including: |
|
- **Multilingual Chatbots**: Enhancing user interactions by understanding and responding in specific Arabic dialects. |
|
- **Content Creation**: Assisting in generating culturally and linguistically appropriate content for diverse Arabic-speaking audiences. |
|
- **Educational Tools**: Supporting language learning by providing examples and explanations in various dialects. |
|
- **Cultural Preservation**: Documenting and promoting the use of different Arabic dialects on digital platforms. |
|
|
|
|
|
## Citation |
|
|
|
If you use Dallah in your research or applications, please cite the following paper: |
|
|
|
```bibtex |
|
@inproceedings{alwajih2024dallah, |
|
title={Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic}, |
|
author={Alwajih, Fakhraddin and Bhatia, Gagan and Abdul-Mageed, Muhammad}, |
|
booktitle={Proceedings of The Second Arabic Natural Language Processing Conference}, |
|
pages={320--336}, |
|
year={2024}, |
|
address={Bangkok, Thailand}, |
|
publisher={Association for Computational Linguistics}, |
|
url={https://aclanthology.org/2024.arabicnlp-1.27} |
|
} |