Deepfake Image Detection Using Fine-Tuned Vision Transformer (ViT)

This project focuses on detecting deepfake images using a fine-tuned version of the pre-trained model google/vit-base-patch16-224-in21k. The approach leverages the power of Vision Transformers (ViT) to classify images as real or fake.
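As a quick orientation, the model name encodes the input geometry: `vit-base-patch16-224` means 224×224 images are split into non-overlapping 16×16 patches, and each patch becomes one token in the transformer's input sequence (plus a `[CLS]` token used for classification). A small sanity check of that arithmetic:

```python
# Patch-grid arithmetic implied by the name "vit-base-patch16-224":
# 224x224 input images, 16x16 pixel patches.
image_size = 224
patch_size = 16

patches_per_side = image_size // patch_size   # 14 patches along each side
num_patches = patches_per_side ** 2           # 196 patch tokens per image
seq_len = num_patches + 1                     # +1 for the [CLS] token

print(patches_per_side, num_patches, seq_len)  # 14 196 197
```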

Model Overview

Figure 1: Confusion matrix for test data


Figure 2: Confusion matrix for validation data


How to Use the Model

Below is an example of how to load and use the model for deepfake classification:

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
import torch
from PIL import Image

# Load the image processor and model
image_processor = AutoImageProcessor.from_pretrained('ashish-001/deepfake-detection-using-ViT')
model = AutoModelForImageClassification.from_pretrained('ashish-001/deepfake-detection-using-ViT')

# Example usage
image = Image.open('path of the image')
inputs = image_processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
pred = torch.argmax(logits, dim=1).item()
label = 'Real' if pred == 1 else 'Fake'
print(f"Predicted type: {label}")
```
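Beyond the hard label, it is often useful to report a confidence score by passing the logits through a softmax. The sketch below shows that post-processing step in plain Python; the helper name and the index order (0 = Fake, 1 = Real, mirroring the snippet above) are assumptions for illustration, not part of the model card:

```python
import math

def classify_from_logits(logits):
    """Map a pair of raw logits [fake, real] to (label, confidence).

    The index order (0 = Fake, 1 = Real) mirrors the snippet above
    and is an assumption about this checkpoint's label mapping.
    """
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]  # softmax over the two classes
    pred = max(range(len(probs)), key=probs.__getitem__)
    label = 'Real' if pred == 1 else 'Fake'
    return label, probs[pred]

label, conf = classify_from_logits([0.3, 2.1])
print(label, round(conf, 3))  # Real 0.858
```

With a PyTorch tensor, the equivalent is `torch.softmax(logits, dim=1)` followed by `argmax`; the pure-Python version above just makes the mapping explicit.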

Model Details

Downloads last month: 477
Model size: 85.8M params (Safetensors)
Tensor type: F32

Model tree for ashish-001/deepfake-detection-using-ViT

Fine-tuned from google/vit-base-patch16-224-in21k.

One Hugging Face Space uses ashish-001/deepfake-detection-using-ViT.