trojblue/distill-q-align-quality-siglip2-base

This model is a fine-tuned version of google/siglip2-base-patch16-512, trained to predict Q-Align quality scores for anime illustrations.

It achieves the following performance on the evaluation set:

  • Loss: 0.0225
  • MSE: 0.0225
  • RMSE: 0.1503
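
The reported loss equals the MSE, which is consistent with a mean-squared-error regression objective against the Q-Align scores (an assumption; the card does not name the loss function). As a minimal sketch of how the two metrics relate, using made-up scores rather than evaluation data:

```python
import math

def mse(predictions, targets):
    """Mean squared error between student predictions and teacher targets."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)

def rmse(predictions, targets):
    """Root mean squared error: the square root of the MSE over the same samples."""
    return math.sqrt(mse(predictions, targets))

# Hypothetical scores, not taken from the evaluation set.
teacher = [3.2, 4.1, 2.5, 4.8]
student = [3.0, 4.3, 2.6, 4.7]
print(f"MSE={mse(student, teacher):.4f}  RMSE={rmse(student, teacher):.4f}")
```

Note that sqrt(0.0225) is 0.1500; the small gap to the reported 0.1503 may come from rounding or from averaging RMSE per batch, which the card does not specify.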

Model Description

This model is a distilled version of the original Q-Align model, using SigLIP2-base as its backbone. It is intended to be simpler, faster, and more practical to deploy than the original model. Benefits include:

  • Easier integration (no specific Transformers version required).
  • Smaller model size and efficient batching.
  • Produces results very close to the original Q-Align model.
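
A rough sketch of batched inference is shown here. It assumes the checkpoint loads through `AutoModelForImageClassification` with a single regression output; the card does not confirm this, so check the repository files before relying on it. `batched` and `score_images` are hypothetical helper names, and the default batch size of 32 is arbitrary.

```python
from typing import Iterator, List, Sequence

MODEL_ID = "trojblue/distill-q-align-quality-siglip2-base"

def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])

def score_images(paths: Sequence[str], batch_size: int = 32) -> List[float]:
    """Return one predicted quality score per image path."""
    # Heavy dependencies are imported here so `batched` works without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    # Assumption: the checkpoint exposes a single-output regression head.
    processor = AutoImageProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForImageClassification.from_pretrained(MODEL_ID).eval()

    scores: List[float] = []
    for batch in batched(paths, batch_size):
        images = [Image.open(path).convert("RGB") for path in batch]
        inputs = processor(images=images, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits  # assumed shape: (batch, 1)
        scores.extend(logits.reshape(-1).tolist())
    return scores
```

Calling `score_images(["a.png", "b.png"])` would then return two floats, provided the loading assumption holds.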

Intended Uses & Limitations

This model is designed specifically for Image Quality Assessment (IQA) tasks on anime illustrations.

It distinguishes image quality from aesthetics, making it suitable for image filtering, recommendations, and curation tasks where quality assessment is essential.
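
As an illustration of the filtering use case, the sketch below keeps images whose predicted score meets a threshold. The function name, the ids, the scores, and the threshold are all hypothetical; a sensible cutoff depends on your own data.

```python
from typing import Dict, List

def filter_by_quality(scores: Dict[str, float], min_score: float) -> List[str]:
    """Return image ids scoring at least `min_score`, best first."""
    kept = [(score, image_id) for image_id, score in scores.items() if score >= min_score]
    # Sort by descending score, breaking ties by id for a stable order.
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [image_id for _, image_id in kept]

# Hypothetical ids and scores, for illustration only.
example_scores = {"a.png": 4.2, "b.png": 2.1, "c.png": 3.6}
print(filter_by_quality(example_scores, min_score=3.5))  # ['a.png', 'c.png']
```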

Training and Evaluation Data

The model was trained using approximately 5.8 million anime illustrations sourced equally from Danbooru and Twitter. The training procedure involved:

  1. Generating quality predictions using the original Q-Align model.
  2. Training the SigLIP2-base model to mimic these predictions.
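
The two steps can be illustrated with a toy, dependency-free example. This is not the actual training code: a fixed function stands in for Q-Align, a linear model stands in for SigLIP2-base, and plain gradient descent on MSE stands in for the real optimizer. All names and values below are hypothetical.

```python
import random

def teacher_score(features):
    """Stand-in for step 1: a fixed 'teacher' that labels each sample."""
    x0, x1 = features
    return 3.0 + 1.5 * x0 - 0.5 * x1

def student_score(weights, features):
    """Stand-in student: a linear model with a bias term."""
    bias, w0, w1 = weights
    x0, x1 = features
    return bias + w0 * x0 + w1 * x1

def mean_squared_error(weights, samples, targets):
    errors = [student_score(weights, x) - t for x, t in zip(samples, targets)]
    return sum(e * e for e in errors) / len(errors)

def distill(samples, targets, lr=0.1, steps=500):
    """Step 2: fit the student to the teacher's scores by gradient descent on MSE."""
    weights = [0.0, 0.0, 0.0]
    n = len(samples)
    for _ in range(steps):
        grads = [0.0, 0.0, 0.0]
        for (x0, x1), target in zip(samples, targets):
            error = student_score(weights, (x0, x1)) - target
            # Gradient of error**2 with respect to (bias, w0, w1), averaged over samples.
            for i, feature in enumerate((1.0, x0, x1)):
                grads[i] += 2.0 * error * feature / n
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights

rng = random.Random(1337)
samples = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
targets = [teacher_score(x) for x in samples]  # step 1: teacher labels the data
weights = distill(samples, targets)            # step 2: student mimics the labels
print(f"final MSE: {mean_squared_error(weights, samples, targets):.2e}")
```

The point of the sketch is only the data flow: the teacher's outputs become regression targets, and the student never sees human labels.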

Training Procedure

Given the low MSE achieved (0.0225 on the evaluation set), smaller architectures might also distill Q-Align's quality predictions effectively, potentially even simple models such as Multi-Layer Perceptrons (MLPs). This has not been tested here.

Hyperparameters

  • Learning rate: 5e-06
  • Train batch size: 320
  • Eval batch size: 320
  • Seed: 1337
  • Optimizer: AdamW (betas=(0.9, 0.999), epsilon=1e-08)
  • LR Scheduler: Cosine (warmup steps: 1000)
  • Epochs: 4.0
  • Mixed precision training: Native AMP
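
To make the schedule concrete, the sketch below computes the learning rate at a given step for linear warmup followed by half-cosine decay, which is intended to mirror the shape of the cosine scheduler in Transformers. The total step count is an assumption derived from the figures in this card (about 5.8M images, batch size 320, 4 epochs, single device, no gradient accumulation); the actual number of steps is not stated.

```python
import math

BASE_LR = 5e-6
WARMUP_STEPS = 1000
# Assumption: 5.8M images / 320 per batch * 4 epochs, rounded.
TOTAL_STEPS = 72_500

def lr_at(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at `step` for linear warmup followed by half-cosine decay."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))

for step in (0, 500, 1000, 36_750, 72_500):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the rate ramps from 0 to 5e-06 over the first 1000 steps, then decays smoothly toward 0 by the final step.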

Framework Versions

  • Transformers 4.49.0
  • PyTorch 2.1.0+cu121
  • Datasets 3.3.2
  • Tokenizers 0.21.0

Model Size

  • Parameters: 93.5M
  • Tensor type: F32 (Safetensors)