
	---
	license: apache-2.0
	---

	# ViT Fine-tuned on Stanford Car Dataset

	Base model: https://huggingface.co/google/vit-base-patch16-224

	This model achieves around 86% accuracy on the test set, so you can use it as a baseline for further tuning.

	# Dataset Description

	The Stanford Cars dataset contains 16,185 images of 196 classes of cars. Classes are typically at the level of make, model, and year, e.g. 2012 Tesla Model S or 2012 BMW M3 coupe. For this model, the data is split into 8,144 training images, 6,041 testing images, and 2,000 validation images.
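The card does not say how the validation images were chosen. The counts are consistent with holding out 2,000 of the 8,041 test images in the original release, so the sketch below illustrates that reading. The function name `split_test_indices` and the seed are hypothetical, and the actual selection used for this model may differ.

```python
import random

ORIGINAL_TRAIN = 8144  # training images in the original Stanford Cars release
ORIGINAL_TEST = 8041   # test images in the original Stanford Cars release
VAL_SIZE = 2000        # validation images reported in this card


def split_test_indices(n_test=ORIGINAL_TEST, n_val=VAL_SIZE, seed=0):
    """Shuffle the test indices and hold out n_val of them for validation.

    Assumption: validation images come from the original test split;
    the seed is arbitrary and only makes the split reproducible.
    """
    indices = list(range(n_test))
    random.Random(seed).shuffle(indices)
    return indices[n_val:], indices[:n_val]  # (test, validation)


test_idx, val_idx = split_test_indices()
print(len(test_idx), len(val_idx))  # 6041 2000
```

Under this assumption the three parts add up to the full dataset: 8,144 + 6,041 + 2,000 = 16,185.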

	Please note: the dataset was published in 2013 and does not contain newer car models.

	# Using the Model with the Transformers Library

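	```python
	from transformers import AutoFeatureExtractor, AutoModelForImageClassification

	# Load the preprocessing configuration (resizing and normalization) stored with the model.
	# Newer transformers releases deprecate AutoFeatureExtractor in favor of AutoImageProcessor.
	extractor = AutoFeatureExtractor.from_pretrained("therealcyberlord/stanford-car-vit-patch16")
	# Load the fine-tuned ViT weights with the 196-class classification head.
	model = AutoModelForImageClassification.from_pretrained("therealcyberlord/stanford-car-vit-patch16")
	```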
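Loading the model is only the first step. The sketch below shows one way to classify a single image and rank the predictions. The helper names `softmax`, `top_k`, and `classify` are hypothetical, and the code assumes `torch` and `Pillow` are installed alongside `transformers`. It also assumes the model config carries readable class names in `id2label`; if it does not, the labels will be generic placeholders. On recent `transformers` releases you may need `AutoImageProcessor` in place of `AutoFeatureExtractor`.

```python
import math

MODEL_ID = "therealcyberlord/stanford-car-vit-patch16"


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, id2label, k=5):
    """Return the k most likely (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


def classify(image_path, k=5):
    """Run the fine-tuned ViT on one image file and return its top-k classes."""
    # Heavy dependencies are imported lazily so the helpers above work without them.
    import torch
    from PIL import Image
    from transformers import AutoFeatureExtractor, AutoModelForImageClassification

    extractor = AutoFeatureExtractor.from_pretrained(MODEL_ID)
    model = AutoModelForImageClassification.from_pretrained(MODEL_ID)
    model.eval()  # disable dropout for inference

    image = Image.open(image_path).convert("RGB")
    inputs = extractor(images=image, return_tensors="pt")
    with torch.no_grad():  # no gradients needed for prediction
        logits = model(**inputs).logits[0].tolist()
    return top_k(logits, model.config.id2label, k)
```

For example, `classify("car.jpg", k=3)` would return the three most probable classes for a local file named `car.jpg` (a placeholder path). Calling it downloads the model weights on first use.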


	# Citations
	3D Object Representations for Fine-Grained Categorization
	Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei
	4th IEEE Workshop on 3D Representation and Recognition, at ICCV 2013 (3dRR-13). Sydney, Australia. Dec. 8, 2013.