finetune_colpali_v1_2-ufo-4bit

This model is a fine-tuned version of vidore/colpaligemma-3b-pt-448-base on the davanstrien/ufo-ColPali dataset.

The model was trained using the fine-tuning notebook from tonywu71; I changed almost nothing except the data processing steps.

The dataset used for training was created using synthetic data generated with Qwen/Qwen2-VL-7B-Instruct. The process for making this dataset is discussed in more detail in the blog post.

The model achieves the following results on the evaluation set:

  • Loss: 0.1064
  • Model Preparation Time: 0.0056

Model description

This model is a fine-tune of the ColPali base model vidore/colpaligemma-3b-pt-448-base:

ColPali is a model based on a novel architecture and training strategy that uses Vision Language Models (VLMs) to efficiently index documents from their visual features. It is a PaliGemma-3B extension that generates ColBERT-style multi-vector representations of text and images. It was introduced in the paper ColPali: Efficient Document Retrieval with Vision Language Models.
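To make the ColBERT-style multi-vector idea concrete, the sketch below shows how a query and a page image are typically compared with a late-interaction (MaxSim) score. The tensor names and shapes are illustrative assumptions, not values taken from this model card.

```python
import torch

def maxsim_score(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> torch.Tensor:
    """Late-interaction (ColBERT-style) score between one query and one page.

    query_emb: (num_query_tokens, dim) multi-vector query embedding
    doc_emb:   (num_patch_tokens, dim) multi-vector page-image embedding
    """
    # Similarity of every query token against every image patch token.
    sim = query_emb @ doc_emb.T                # (query_tokens, patch_tokens)
    # For each query token, keep its best-matching patch, then sum over the query.
    return sim.max(dim=1).values.sum()

# Illustrative shapes only: ColPali-style models emit low-dimensional vectors per token.
q = torch.randn(20, 128)    # hypothetical query embedding
d = torch.randn(1030, 128)  # hypothetical page embedding
print(maxsim_score(q, d))
```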

Intended uses & limitations

For retrieving UFO newsletter documents.
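A minimal retrieval sketch, assuming the colpali_engine library and that this checkpoint loads the same way as other ColPali fine-tunes; the image paths and query strings below are placeholders.

```python
import torch
from PIL import Image
from colpali_engine.models import ColPali, ColPaliProcessor

model_name = "davanstrien/finetune_colpali_v1_2-ufo-4bit"

# Load the fine-tuned model and its processor.
model = ColPali.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="cuda:0"
).eval()
processor = ColPaliProcessor.from_pretrained(model_name)

# Placeholder inputs: newsletter page images and text queries.
images = [Image.open("newsletter_page_1.png"), Image.open("newsletter_page_2.png")]
queries = ["reports of sightings near military bases", "editorial on government disclosure"]

batch_images = processor.process_images(images).to(model.device)
batch_queries = processor.process_queries(queries).to(model.device)

with torch.no_grad():
    image_embeddings = model(**batch_images)
    query_embeddings = model(**batch_queries)

# Late-interaction scores: one row per query, one column per page.
scores = processor.score_multi_vector(query_embeddings, image_embeddings)
print(scores)
```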

Training and evaluation data

The training data was created via the following steps:

  • Downloading a sample of UFO newsletters from this Internet Archive collection.
  • Using the pdf-to-page-images-dataset Space to convert the PDF documents into a dataset of single-page images.
  • Using a VLM to generate synthetic queries for these documents, following the approach outlined here (see the sketch after this list). This results in davanstrien/ufo-ColPali.
  • Training the model using the fine-tuning notebook from tonywu71, changing almost nothing except the data processing steps.
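As a rough illustration of the query-generation step, here is a hedged sketch using Qwen/Qwen2-VL-7B-Instruct via transformers. The prompt wording and output handling are assumptions for illustration only; the blog post describes the actual prompting strategy.

```python
import torch
from PIL import Image
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor

model_id = "Qwen/Qwen2-VL-7B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

def generate_query(page: Image.Image) -> str:
    # Hypothetical prompt; not the exact one used to build the dataset.
    messages = [{
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Write one search query a researcher might use to find this newsletter page."},
        ],
    }]
    prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
    inputs = processor(text=[prompt], images=[page], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=64)
    # Decode only the newly generated tokens, not the prompt.
    generated = output_ids[:, inputs.input_ids.shape[1]:]
    return processor.batch_decode(generated, skip_special_tokens=True)[0]

print(generate_query(Image.open("newsletter_page_1.png")))
```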

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 1.5
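The hyperparameters above map roughly onto a transformers TrainingArguments configuration like the following; the output directory and precision setting are assumptions, and the authoritative values live in tonywu71's fine-tuning notebook.

```python
from transformers import TrainingArguments

# Approximate reconstruction of the configuration listed above.
training_args = TrainingArguments(
    output_dir="finetune_colpali_v1_2-ufo-4bit",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # effective train batch size: 4 * 2 = 8
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=1.5,
    seed=42,
    bf16=True,                       # assumption; precision is not stated in the card
)
```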

Training results

| Training Loss | Epoch  | Step | Validation Loss | Model Preparation Time |
|---------------|--------|------|-----------------|------------------------|
| No log        | 0.0041 | 1    | 0.1879          | 0.0056                 |
| 0.1193        | 0.4090 | 100  | 0.1136          | 0.0056                 |
| 0.1287        | 0.8180 | 200  | 0.1122          | 0.0056                 |
| 0.0662        | 1.2270 | 300  | 0.1063          | 0.0056                 |

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.0
  • Tokenizers 0.19.1