# Model Card

Parrot is a multilingual, multimodal large language model trained with multilingual visual instruction tuning.

For a comprehensive introduction, please refer to the [Parrot paper](https://arxiv.org/abs/2406.02539) and the [Parrot GitHub repository](https://github.com/AIDC-AI/Parrot).

# Model Details

![](https://github.com/AIDC-AI/Parrot/images/teaser.png)

# Performance

![](https://github.com/AIDC-AI/Parrot/images/teaser.png)

# Usage

Below is a code snippet to run Parrot with multimodal inputs. For additional usage instructions, including the inference wrapper and the Gradio UI, please refer to the [Parrot GitHub repository](https://github.com/AIDC-AI/Parrot).

```bash
pip install torch==2.1.2 transformers==4.43.2 pillow==10.3.0
```
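
Because the install command pins exact versions, it can help to confirm the active environment matches before running anything. The following is a minimal sketch using only the Python standard library; the names `PINNED`, `installed_version`, and `find_mismatches` are hypothetical helpers for illustration and are not part of Parrot.

```python
from importlib import metadata

# Pins copied from the install command above.
PINNED = {"torch": "2.1.2", "transformers": "4.43.2", "pillow": "10.3.0"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package that differs from its pin to (expected, found)."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        # Local build suffixes such as "2.1.2+cu121" still count as a match.
        if found is None or found.split("+")[0] != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Print one line per package that is missing or at a different version.
    for package, (expected, found) in find_mismatches(PINNED).items():
        print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

An empty result from `find_mismatches(PINNED)` means all three packages are installed at the pinned versions. The `lookup` parameter is there only so the comparison can be exercised without touching the real environment.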

```python
# Imports used by the multimodal example.
import torch  # tensor backend
from PIL import Image  # image loading
from transformers import AutoModelForCausalLM  # model loading
```

# Citation

If you find Parrot useful, please cite the paper:

```bibtex
@article{sun2024parrot,
  title={Parrot: Multilingual Visual Instruction Tuning},
  author={Sun, Hai-Long and Zhou, Da-Wei and Li, Yang and Lu, Shiyin and Yi, Chao and Chen, Qing-Guo and Xu, Zhao and Luo, Weihua and Zhang, Kaifu and Zhan, De-Chuan and others},
  journal={arXiv preprint arXiv:2406.02539},
  year={2024}
}
```

# License

The project is licensed under the Apache License, Version 2.0, and is restricted to uses that comply with the license agreements of Qwen and CLIP.
