CLIP-FlanT5-XL (VQAScore)

This model is a fine-tuned version of google/flan-t5-xl designed for image-text retrieval tasks, as presented in the VQAScore paper.
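As a rough illustration of how a score of this kind could drive image-text retrieval, the sketch below ranks candidate images for one caption. It does not load this model. It assumes, following the VQAScore paper's description, that the score for an image-text pair is the probability of a "Yes" answer to a question about the caption. The logits are made up, and the names `yes_probability` and `rank_images` are hypothetical helpers, not part of any released API.

```python
import math


def yes_probability(answer_logits, yes_index):
    """Softmax probability of the 'Yes' entry among the given answer logits."""
    peak = max(answer_logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in answer_logits]
    return exps[yes_index] / sum(exps)


def rank_images(scores_by_image):
    """Return image names sorted from highest to lowest score."""
    return sorted(scores_by_image, key=scores_by_image.get, reverse=True)


# Made-up logits over a toy two-entry answer vocabulary ["Yes", "No"].
# A real model would produce these from the image and a question about the caption.
toy_logits = {
    "cat.png": [2.0, 0.0],
    "dog.png": [0.0, 1.0],
}

scores = {name: yes_probability(logits, 0) for name, logits in toy_logits.items()}
ranking = rank_images(scores)
```

Here `ranking` lists the images from best to worst match for the caption. Text retrieval for a fixed image would work the same way, with captions as the dictionary keys.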

Model Description

  • Developed by: Zhiqiu Lin and collaborators
  • Model type: Vision-Language Generative Model
  • License: Apache-2.0
  • Finetuned from model: google/flan-t5-xl

Model tree for zhiqiulin/clip-flant5-xl

  • Base model: google/flan-t5-xl
  • This model: fine-tuned from the base model