tanganke/clip-vit-large-patch14_mnist

Tags: Feature Extraction, clip_vision_model


Model Card

Model Details

Architecture: ViT-Large with patch size 14
Training Data: MNIST dataset
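The following is a minimal loading sketch, not an official usage snippet. It assumes the checkpoint loads with the `CLIPVisionModel` class from the `transformers` library, which the `clip_vision_model` tag suggests. The helper names `load_vision_encoder` and `extract_features` are hypothetical, introduced only for illustration.

```python
import torch
from transformers import CLIPVisionModel

# Repository id taken from this model card.
REPO_ID = "tanganke/clip-vit-large-patch14_mnist"


def load_vision_encoder(name_or_path: str = REPO_ID) -> CLIPVisionModel:
    """Load the vision encoder from the Hub (or a local directory) in eval mode."""
    model = CLIPVisionModel.from_pretrained(name_or_path)
    model.eval()
    return model


@torch.no_grad()
def extract_features(model: CLIPVisionModel, pixel_values: torch.Tensor) -> torch.Tensor:
    """Return one pooled feature vector per image.

    pixel_values: float tensor of shape (batch, 3, height, width),
    already preprocessed to the resolution the model expects.
    """
    outputs = model(pixel_values=pixel_values)
    return outputs.pooler_output  # shape: (batch, hidden_size)
```

Calling `load_vision_encoder()` with no arguments downloads the weights from the Hub. MNIST images are single-channel, so they would need to be converted to three channels and resized first; using the image processor of the base model `openai/clip-vit-large-patch14` for this is an assumption, since this card does not say which preprocessing was used.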

Training Details

Adam optimizer with a constant learning rate of 1e-5, trained for 4000 steps with batch_size=32. Only the vision encoder is fine-tuned.
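The setup described above can be sketched in PyTorch as follows. This is an illustration under stated assumptions, not the authors' training script: `ToyDualEncoder` is a hypothetical stand-in for a CLIP-style model, and the attribute name `vision_model` is chosen to mirror the one used by `CLIPModel` in `transformers`. Whether the projection layers were also trained is not stated in this card; the sketch leaves everything outside `vision_model` frozen.

```python
import torch
from torch import nn

# Hyperparameters as stated in the model card.
LEARNING_RATE = 1e-5
NUM_STEPS = 4000
BATCH_SIZE = 32


class ToyDualEncoder(nn.Module):
    """Hypothetical stand-in for a CLIP-style model with two encoders."""

    def __init__(self, dim: int = 8):
        super().__init__()
        self.vision_model = nn.Linear(dim, dim)
        self.text_model = nn.Linear(dim, dim)


def build_optimizer(model: nn.Module, lr: float = LEARNING_RATE) -> torch.optim.Adam:
    """Freeze all parameters except the vision encoder and return Adam over them."""
    for param in model.parameters():
        param.requires_grad_(False)
    for param in model.vision_model.parameters():
        param.requires_grad_(True)
    trainable = [p for p in model.parameters() if p.requires_grad]
    # No scheduler is attached, so the learning rate stays constant.
    return torch.optim.Adam(trainable, lr=lr)


# Total number of training images processed over the run.
SAMPLES_SEEN = NUM_STEPS * BATCH_SIZE  # 128000
```

With 4000 steps at batch size 32, the run processes 128,000 images, which is a little over two passes if the standard 60,000-image MNIST training split was used (an assumption; the card does not name the split).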

Evaluation Results

pre-trained: 0.7602328658103943
fine-tuned: 0.9975429177284241


Safetensors

Model size: 303M params
Tensor type: F32
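As a rough size estimate, the listed parameter count and tensor type imply the following weight footprint. This is a back-of-the-envelope sketch: 303M is the rounded figure shown on this page, so the real file size will differ slightly, and the function name `weights_size_bytes` is hypothetical.

```python
NUM_PARAMS = 303_000_000  # "303M params" as listed (rounded)
BYTES_PER_PARAM = 4       # F32 uses 4 bytes per value


def weights_size_bytes(num_params: int = NUM_PARAMS,
                       bytes_per_param: int = BYTES_PER_PARAM) -> int:
    """Approximate size of the stored weights in bytes."""
    return num_params * bytes_per_param


size = weights_size_bytes()
print(f"{size / 1e9:.2f} GB ({size / 2**30:.2f} GiB)")  # prints "1.21 GB (1.13 GiB)"
```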


Model tree for tanganke/clip-vit-large-patch14_mnist

Base model: openai/clip-vit-large-patch14
Finetuned (61): this model


Collection including tanganke/clip-vit-large-patch14_mnist

CLIP-ViT-L/14 on the eight image classification tasks

If you find these models helpful, consider citing [our paper](https://arxiv.org/abs/2406.03280). 9 items, updated Aug 27, 2024.