---
license: other
language:
- en
metrics:
- accuracy
library_name: transformers
pipeline_tag: image-classification
tags:
- code
---
## Model Details

Model type: Vision Transformer (ViT) for Image Classification

Fine-tuned from model: google/vit-base-patch16-384

## Uses

Image classification based on facial features, using the following dataset: https://www.kaggle.com/datasets/bhaveshmittal/celebrity-face-recognition-dataset

### Downstream Use

Fine-tuning for other image classification tasks.

Transfer learning for related vision tasks.

### Out-of-Scope Use

Tasks unrelated to image classification.

Sensitive applications without proper evaluation of biases and limitations.

## Bias, Risks, and Limitations

Potential biases in the training dataset affecting model predictions.

Limitations in generalizability to different populations or image conditions not represented in the training data.

Risks associated with misclassification in critical applications.

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. It is recommended to evaluate the model's performance on your own target data before deploying it in a production environment.

## How to Get Started with the Model

Use the code below to get started with the model.

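```python
import torch
from transformers import ViTForImageClassification, ViTImageProcessor

# Load the fine-tuned model and its matching image processor.
model = ViTForImageClassification.from_pretrained("Ganesh-KSV/face-recognition-version1")
processor = ViTImageProcessor.from_pretrained("Ganesh-KSV/face-recognition-version1")
model.eval()  # make sure dropout is disabled for inference

def predict(image):
    """Return a tensor of predicted class indices for one image or a batch of images."""
    inputs = processor(images=image, return_tensors="pt")
    # Gradients are not needed at inference time.
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    predictions = torch.argmax(logits, dim=-1)
    return predictions
```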
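The `predict` function above returns class indices rather than names. One way to turn logits into readable labels with confidence scores is a small helper like the sketch below. It is written in plain Python so it can be tried without downloading the model; `top_k_labels` and the label names in the example are hypothetical. In practice the mapping would presumably come from `model.config.id2label` and the scores from `outputs.logits[0].tolist()`.

```python
import math


def top_k_labels(logits, id2label, k=3):
    """Return the k most likely (label, probability) pairs for one image."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Hypothetical label names, used only for illustration.
example_labels = {0: "person_a", 1: "person_b", 2: "person_c"}
print(top_k_labels([2.0, 0.5, 1.0], example_labels, k=2))
```

Reporting the top few candidates with their probabilities, instead of a single index, makes low-confidence predictions easier to spot.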

## Training Details

### Training Procedure

#### Preprocessing

Images were resized, augmented (rotation, color jitter, etc.), and normalized.

#### Training Hyperparameters

Optimizer: Adam with learning rate 2e-5 and weight decay 1e-2

Scheduler: StepLR with step size 2 and gamma 0.5

Loss Function: CrossEntropyLoss

Epochs: 40

Batch Size: 4
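To make the schedule concrete: with a step size of 2 and a gamma of 0.5, a StepLR-style decay halves the learning rate every two epochs, so the rate during epoch `e` (0-indexed) is `2e-5 * 0.5 ** (e // 2)`. The sketch below is plain arithmetic, not the original training code; `step_lr` is a hypothetical helper name, and it assumes the scheduler was stepped once per epoch.

```python
def step_lr(epoch, base_lr=2e-5, step_size=2, gamma=0.5):
    """Learning rate in effect during `epoch` (0-indexed) under a StepLR-style decay."""
    return base_lr * gamma ** (epoch // step_size)


# Print the rate at a few points across the 40-epoch run.
for epoch in (0, 1, 2, 10, 39):
    print(f"epoch {epoch:2d}: lr = {step_lr(epoch):.3e}")
```

Under that assumption the rate has been halved 19 times by the final epoch, down to roughly 3.8e-11, so most of the effective learning would happen in the early epochs.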


## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
Validation split of the VGGFace dataset.

#### Factors
Performance was evaluated using loss and accuracy on the validation set.

#### Metrics
Loss and accuracy metrics for each epoch.

### Results
Training and validation loss and accuracy are plotted over 40 epochs.

A confusion matrix was generated from the final validation results.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6667e064e83d48c18fdaa52a/b6O3VQbi49cHe7iQNPmDy.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6667e064e83d48c18fdaa52a/MAqqm9lYhUgxKI8kC95Rc.png)
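For readers who want to reproduce these metrics on their own predictions, the following is a minimal sketch of how accuracy and a confusion matrix can be computed from true and predicted class indices. It is not the evaluation code used for this model; the function names and the sample labels are made up for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Build a matrix where rows are true classes and columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correct predictions."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Made-up labels for three classes, for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)
print(f"accuracy = {accuracy(cm):.3f}")
```

Off-diagonal cells show which identities are confused with each other, which overall accuracy alone does not reveal.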

## Model Examination

Model performance was examined through loss and accuracy plots and a confusion matrix.

#### Glossary

ViT: Vision Transformer

CrossEntropyLoss: A loss function used for classification tasks.

Adam: An optimization algorithm.

StepLR: Learning rate scheduler that decays the learning rate by a factor every few epochs.