AventIQ-AI/Single-Label-General-Image-Classifier


DeepakKumarMSL committed on Jun 11

Commit 9ff2532 · verified · 1 Parent(s): fc891ec

Create README.md

Files changed (1)

README.md +87 -0

README.md ADDED

	@@ -0,0 +1,87 @@

+# 🧠 Image Classification AI Model (CIFAR-100)
+This repository contains a Vision Transformer (ViT) model fine-tuned for **image classification** on the CIFAR-100 dataset. The model starts from `google/vit-base-patch16-224`, is converted to **FP16** (half precision) for more efficient inference, and reaches roughly 70–80% accuracy on the 100-class task, depending on configuration.
+---
+## 🚀 Features
+- 🖼️ **Task**: Image Classification
+- 🧠 **Base Model**: `google/vit-base-patch16-224` (Vision Transformer)
+- 🧪 **Quantized**: Weights stored in FP16 (half precision) for faster, more memory-efficient inference
+- 🎯 **Dataset**: CIFAR-100 (100 fine-grained object categories)
+- ⚡ **CUDA Enabled**: Optimized for GPU acceleration
+- 📈 **Accuracy**: ~70–80% after fine-tuning, measured on a held-out split (varies by configuration)
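As a rough illustration of why FP16 weights are lighter, the sketch below estimates weight memory at 4 bytes (FP32) versus 2 bytes (FP16) per parameter, then round-trips one value through half precision to show the rounding involved. The figure of about 86 million parameters is an approximate count for ViT-Base and is an assumption here, as are the helper names. Activations and framework overhead are ignored.

```python
import struct

# Assumption: ViT-Base has roughly 86 million parameters (approximate figure).
APPROX_PARAMS = 86_000_000


def weight_megabytes(num_params, bytes_per_param):
    """Memory needed to store the weights alone, in megabytes (10^6 bytes)."""
    return num_params * bytes_per_param / 1e6


def to_fp16_and_back(value):
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


fp32_mb = weight_megabytes(APPROX_PARAMS, 4)  # 4 bytes per FP32 weight
fp16_mb = weight_megabytes(APPROX_PARAMS, 2)  # 2 bytes per FP16 weight
rounding_error = abs(to_fp16_and_back(0.1) - 0.1)

print(f"FP32 weights: ~{fp32_mb:.0f} MB, FP16 weights: ~{fp16_mb:.0f} MB")
print(f"Rounding error for 0.1 in FP16: {rounding_error:.2e}")
```

Halving the bytes per weight halves the storage, at the cost of a small rounding error per value. Whether that error affects accuracy for this model would need to be measured.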
+---
+## 📊 Dataset Used
+**Hugging Face Dataset**: [`tanganke/cifar100`](https://huggingface.co/datasets/tanganke/cifar100)
+- **Description**: CIFAR-100 is a dataset of 60,000 32×32 color images in 100 classes (600 images per class)
+- **Split**: 50,000 training images and 10,000 test images
+- **Categories**: Animals, Vehicles, Food, Household items, etc.
+- **License**: MIT License (from source)
+```python
+from datasets import load_dataset
+dataset = load_dataset("tanganke/cifar100")
+```
+## 🛠️ Model & Training Configuration
+ - Model: google/vit-base-patch16-224
+ - Image Size: 224x224 (resized from 32x32)
+ - Framework: Hugging Face Transformers & Datasets
+ - Training Environment: Kaggle Notebook with CUDA
+ - Epochs: 5–10 (with early stopping)
+ - Batch Size: 32
+ - Optimizer: AdamW
+ - Loss Function: CrossEntropyLoss
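To make the configuration above concrete, the following sketch works out what the listed settings imply: how many patch tokens the ViT sees per image and how many optimizer steps one epoch takes. It assumes the full 50,000-image training split is used with no hold-out, one `[CLS]` token is prepended, and no gradient accumulation. The variable names are illustrative only.

```python
import math

IMAGE_SIZE = 224        # input resolution after resizing
SOURCE_SIZE = 32        # native CIFAR-100 resolution
PATCH_SIZE = 16         # implied by the "patch16" checkpoint name
BATCH_SIZE = 32
TRAIN_IMAGES = 50_000   # assumption: the whole training split is used

upscale_factor = IMAGE_SIZE // SOURCE_SIZE        # each side is enlarged 7x
patches_per_side = IMAGE_SIZE // PATCH_SIZE       # 14 patches along each side
num_patches = patches_per_side ** 2               # 196 patches per image
sequence_length = num_patches + 1                 # plus one [CLS] token
steps_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)

print(f"Upscale factor: {upscale_factor}x")
print(f"Tokens per image: {sequence_length}")
print(f"Steps per epoch: {steps_per_epoch}")
```

With 5–10 epochs, this puts a full run at roughly 7,800 to 15,600 steps before early stopping is taken into account.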
+## ✅ Evaluation & Scoring
+ - Accuracy: ~70–80% (varies by configuration)
+ - Validation Tool: `evaluate` or `sklearn.metrics`
+ - Metrics: Top-1 and Top-5 accuracy
+ - Inference Speed: Faster after FP16 quantization
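The Top-1 and Top-5 metrics listed above can be sketched in plain Python as follows. This is a minimal illustration, not the evaluation script used for this model: `top_k_accuracy` is a hypothetical helper, and the toy scores use 4 classes with `k=2` standing in for Top-5 over 100 classes.

```python
def top_k_accuracy(logits, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    hits = 0
    for scores, label in zip(logits, labels):
        # Class indices sorted from highest to lowest score, truncated to k.
        top_k = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example: 3 samples, 4 classes.
toy_logits = [
    [0.10, 0.70, 0.20, 0.00],  # ranked: class 1, then class 2
    [0.50, 0.10, 0.30, 0.10],  # ranked: class 0, then class 2
    [0.30, 0.15, 0.05, 0.50],  # ranked: class 3, then class 0
]
toy_labels = [1, 2, 0]

top1 = top_k_accuracy(toy_logits, toy_labels, k=1)
top2 = top_k_accuracy(toy_logits, toy_labels, k=2)
print(f"top-1: {top1:.3f}, top-2: {top2:.3f}")
```

Top-k accuracy can never be lower than Top-1 accuracy, since every Top-1 hit is also a Top-k hit.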
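+## 🔍 Inference Example
+```python
+from PIL import Image
+import torch
+from transformers import ViTForImageClassification, ViTImageProcessor
+
+# ViTImageProcessor is the current name for the deprecated ViTFeatureExtractor.
+feature_extractor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
+# Assumption: the fine-tuned weights are stored in the local `vit-cifar100-fp16` folder.
+model = ViTForImageClassification.from_pretrained("vit-cifar100-fp16").half().to("cuda").eval()
+
+def predict(image_path):
+    image = Image.open(image_path).convert("RGB")
+    # Move inputs to the GPU and cast them to FP16 to match the model weights.
+    inputs = feature_extractor(images=image, return_tensors="pt").to("cuda", torch.float16)
+    with torch.no_grad():  # inference only, so no gradients are needed
+        logits = model(**inputs).logits
+    predicted_class = logits.argmax(-1).item()
+    # `dataset` is the CIFAR-100 dataset loaded in the "Dataset Used" section.
+    return dataset["train"].features["fine_label"].int2str(predicted_class)
+
+print(predict("sample_image.jpg"))
+```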
+## 📁 Folder Structure
+📦image-classification-vit
+ ┣ 📂vit-cifar100-fp16
+ ┣ 📜train.py
+ ┣ 📜inference.py
+ ┣ 📜README.md
+ ┗ 📜requirements.txt