# Trash Image Classification using Vision Transformer (ViT)

This repository contains an implementation of an image classification model based on a pre-trained Vision Transformer (ViT) from Hugging Face. The model is fine-tuned to classify images into six categories: cardboard, glass, metal, paper, plastic, and trash.

## Dataset

The dataset consists of images from six categories taken from [`garythung/trashnet`](https://huggingface.co/datasets/garythung/trashnet), with the following distribution:

- Cardboard: 806 images
- Glass: 1002 images
- Metal: 820 images
- Paper: 1188 images
- Plastic: 964 images
- Trash: 274 images

## Model

We fine-tune the pre-trained Vision Transformer [`google/vit-base-patch16-224-in21k`](https://huggingface.co/google/vit-base-patch16-224-in21k) from Hugging Face on the dataset above. The fine-tuned model is available on the Hugging Face Hub at [tribber93/my-trash-classification](https://huggingface.co/tribber93/my-trash-classification).

## Usage

To use the model for inference, follow these steps:

1. Load the fine-tuned model and image processor from the Hugging Face Hub:

   ```python
   from transformers import AutoModelForImageClassification, AutoImageProcessor

   model_name = "tribber93/my-trash-classification"
   model = AutoModelForImageClassification.from_pretrained(model_name)
   image_processor = AutoImageProcessor.from_pretrained(model_name)
   ```

2. Prepare an image and run a prediction:

   ```python
   import torch
   from PIL import Image

   image = Image.open("path_to_image.jpg").convert("RGB")
   pixel_values = image_processor(image, return_tensors="pt").pixel_values

   with torch.no_grad():
       outputs = model(pixel_values)
   prediction = torch.argmax(outputs.logits, dim=-1)

   print("Predicted class:", model.config.id2label[prediction.item()])
   ```

## Results

After fine-tuning, the model achieved the following performance:

| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1     | 3.3200        | 0.7011          | 86.25%   |
| 2     | 1.6611        | 0.4298          | 91.49%   |
| 3     | 1.4353        | 0.3563          | 94.26%   |
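
## Fine-Tuning (Sketch)

The exact training script is not reproduced here. The snippet below is a minimal sketch of how the fine-tuning described above could be done with the Hugging Face `Trainer`. It assumes the dataset exposes an `image` column and a `ClassLabel` column named `label`; the 80/20 train/validation split, batch size, and learning rate are illustrative assumptions, not the settings used to produce the results table above.

```python
import numpy as np
import torch
from datasets import load_dataset
from transformers import (
    AutoImageProcessor,
    AutoModelForImageClassification,
    Trainer,
    TrainingArguments,
)

# Load TrashNet and carve out a held-out validation split
# (the split ratio is an assumption, not the one used for the reported results).
dataset = load_dataset("garythung/trashnet")["train"].train_test_split(test_size=0.2, seed=42)

labels = dataset["train"].features["label"].names
id2label = {i: name for i, name in enumerate(labels)}
label2id = {name: i for i, name in enumerate(labels)}

checkpoint = "google/vit-base-patch16-224-in21k"
image_processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForImageClassification.from_pretrained(
    checkpoint,
    num_labels=len(labels),
    id2label=id2label,
    label2id=label2id,
)

def transform(batch):
    # Convert PIL images to the pixel values expected by ViT.
    inputs = image_processor([img.convert("RGB") for img in batch["image"]], return_tensors="pt")
    inputs["labels"] = batch["label"]
    return inputs

dataset = dataset.with_transform(transform)

def collate_fn(examples):
    # Stack per-example tensors into a batch for the model.
    return {
        "pixel_values": torch.stack([ex["pixel_values"] for ex in examples]),
        "labels": torch.tensor([ex["labels"] for ex in examples]),
    }

def compute_metrics(eval_pred):
    preds = np.argmax(eval_pred.predictions, axis=1)
    return {"accuracy": float((preds == eval_pred.label_ids).mean())}

training_args = TrainingArguments(
    output_dir="vit-trash-classification",
    per_device_train_batch_size=16,  # assumed value
    learning_rate=2e-4,              # assumed value
    num_train_epochs=3,
    remove_unused_columns=False,     # keep the "image" column so the transform can see it
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    data_collator=collate_fn,
    compute_metrics=compute_metrics,
)

trainer.train()
print(trainer.evaluate())
```

`remove_unused_columns=False` matters here: the `Trainer` would otherwise drop the raw `image` column before the on-the-fly transform runs.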