👁️ GLaMM-FullScope


📝 Description

GLaMM-FullScope covers all capabilities of GLaMM. It is fine-tuned on a mixture of many open-source datasets. "Full" signifies its comprehensive scope: it incorporates the full range of GLaMM capabilities, including Grounded Conversation Generation (GCG), Referring Expression Segmentation, Region-level Captioning, Image-level Captioning, and Visual Question Answering.

💻 Download

To get started with GLaMM-FullScope, follow these steps:

git lfs install
git clone https://huggingface.co/MBZUAI/GLaMM-FullScope
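If the plain clone is slow or gets interrupted while fetching the large weight files, the download can be split into two steps. The following is a minimal sketch, not part of the official instructions: it assumes Git LFS is available, and the function name `clone_glamm` and its optional target-directory argument are hypothetical. It relies on the Git LFS environment variable `GIT_LFS_SKIP_SMUDGE`, which makes the clone check out only small pointer files, and on `git lfs pull`, which then fetches the actual objects.

```shell
# Repository URL taken from the download steps above.
REPO_URL="https://huggingface.co/MBZUAI/GLaMM-FullScope"

# Hypothetical helper: clone first, fetch the LFS objects afterwards.
clone_glamm() {
    # Optional first argument: target directory (defaults to the repo name).
    target="${1:-GLaMM-FullScope}"

    # Set up the Git LFS hooks for the current user.
    git lfs install || return 1

    # With GIT_LFS_SKIP_SMUDGE=1 the clone checks out LFS pointer files only.
    GIT_LFS_SKIP_SMUDGE=1 git clone "$REPO_URL" "$target" || return 1

    # Download the real LFS objects (the model weights) as a separate step.
    (cd "$target" && git lfs pull)
}
```

Defining the function downloads nothing. Run `clone_glamm` (or `clone_glamm /path/to/dir`) to start the clone. If the second step is interrupted, running `git lfs pull` again inside the cloned directory retries the fetch without re-cloning.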

📚 Additional Resources

📜 Citations and Acknowledgments

  @article{hanoona2023GLaMM,
          title={GLaMM: Pixel Grounding Large Multimodal Model},
          author={Rasheed, Hanoona and Maaz, Muhammad and Shaji, Sahal and Shaker, Abdelrahman and Khan, Salman and Cholakkal, Hisham and Anwer, Rao M. and Xing, Eric and Yang, Ming-Hsuan and Khan, Fahad S.},
          journal={ArXiv 2311.03356},
          year={2023}
  }