"Hello, may I ask if this model is the image-text retrieval model (BLIP-2 ViT-g) fine-tuned on the COCO dataset as mentioned in the BLIP-2 paper?"