Citation

If you find this model useful, please cite the following paper:

@article{huang2024deciphering,
  title={Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate},
  author={Huang, Qidong and Dong, Xiaoyi and Zhang, Pan and Zang, Yuhang and Cao, Yuhang and Wang, Jiaqi and Lin, Dahua and Zhang, Weiming and Yu, Nenghai},
  journal={arXiv preprint arXiv:2410.07167},
  year={2024}
}


Model size: 7.06B params (Safetensors)

Tensor type: BF16


Model tree for shikiw/LLaVA-v1.5-MoCa-7B-pretrain

Base model: lmsys/vicuna-7b-v1.5

Finetuned: shikiw/LLaVA-v1.5-MoCa-7B-pretrain (this model)
