Probing Visual Language Priors in VLMs
ImageDPO Finetuned Model
This page provides the ImageDPO-finetuned LLaVA-v1.5-13B checkpoint used in *Probing Visual Language Priors in VLMs*. We provide both the LoRA parameters and the merged model weights.
Usage
First, install the LLaVA-v1.5 codebase from https://github.com/haotian-liu/LLaVA.
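For example, a minimal sketch assuming a fresh Python environment and the standard editable install of the LLaVA repository:

```bash
# Clone the LLaVA repository and install it in editable mode
git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
pip install -e .
```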
Then run the following command to try the merged model:
```bash
python -m llava.eval.run_llava \
    --model-path ViLP/LLaVA-v1.5-13b-ImageDPO \
    --image-file 'images/llava_logo.png' \
    --query 'Please caption this image.' \
    --conv-mode llava_v1
```
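To use the LoRA parameters instead of the merged weights, `llava.eval.run_llava` also accepts a `--model-base` flag so the adapter is merged onto the base model at load time. A hedged sketch; the LoRA checkpoint path below is a placeholder, not the actual repository name:

```bash
# path/to/ImageDPO-lora is a placeholder -- substitute the actual LoRA checkpoint
python -m llava.eval.run_llava \
    --model-path path/to/ImageDPO-lora \
    --model-base liuhaotian/llava-v1.5-13b \
    --image-file 'images/llava_logo.png' \
    --query 'Please caption this image.' \
    --conv-mode llava_v1
```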
Citation Information
Please consider citing the ViLP paper if you find our resources helpful!
```bibtex
@article{luo2024probing,
  title={Probing Visual Language Priors in VLMs},
  author={Luo, Tiange and Cao, Ang and Lee, Gunhee and Johnson, Justin and Lee, Honglak},
  journal={arXiv preprint arXiv:2501.00569},
  year={2024}
}
```