---
license: other
language:
- en
metrics:
- accuracy
library_name: transformers
pipeline_tag: image-classification
tags:
- code
---

## Model Details

Model type: Vision Transformer (ViT) for Image Classification

Fine-tuned from model: google/vit-base-patch16-384

## Uses

Image classification based on facial features from the dataset. Link: https://www.kaggle.com/datasets/bhaveshmittal/celebrity-face-recognition-dataset

### Downstream Use

- Fine-tuning for other image classification tasks.
- Transfer learning for related vision tasks.

### Out-of-Scope Use

- Tasks unrelated to image classification.
- Sensitive applications without proper evaluation of biases and limitations.

## Bias, Risks, and Limitations

- Potential biases in the training dataset may affect model predictions.
- Generalizability is limited for populations or image conditions not represented in the training data.
- Misclassification carries risks in critical applications.

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. Evaluate the model's performance on your own target data before deploying it in a production environment.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch
from transformers import ViTForImageClassification, ViTImageProcessor

# Load the fine-tuned model and its matching image processor from the Hub.
model = ViTForImageClassification.from_pretrained("Ganesh-KSV/face-recognition-version1")
processor = ViTImageProcessor.from_pretrained("Ganesh-KSV/face-recognition-version1")
model.eval()  # make inference mode explicit (disables dropout)

def predict(image):
    """Return the predicted class index tensor for a PIL image (or list of images)."""
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():  # gradients are not needed for inference
        outputs = model(**inputs)
    logits = outputs.logits
    predictions = torch.argmax(logits, dim=-1)
    return predictions
```

## Training Details

### Training Data

Training Procedure:

Preprocessing: Images were resized, augmented (rotation, color jitter, etc.), and normalized.
Training Hyperparameters:

- Optimizer: Adam with learning rate 2e-5 and weight decay 1e-2
- Scheduler: StepLR with step size 2 and gamma 0.5
- Loss Function: CrossEntropyLoss
- Epochs: 40
- Batch Size: 4

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

Validation split of the VGGFace dataset.

#### Factors

Performance was evaluated using loss and accuracy on the validation set.

#### Metrics

Loss and accuracy, recorded for each epoch.

### Results

Training and validation loss and accuracy were plotted over 40 epochs. A confusion matrix was generated for the final validation results.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6667e064e83d48c18fdaa52a/b6O3VQbi49cHe7iQNPmDy.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6667e064e83d48c18fdaa52a/MAqqm9lYhUgxKI8kC95Rc.png)

## Model Examination

Model performance was examined through loss and accuracy plots and a confusion matrix.

#### Glossary

- ViT: Vision Transformer
- CrossEntropyLoss: A loss function used for classification tasks.
- Adam: An optimization algorithm.
- StepLR: A learning rate scheduler that decays the learning rate by a fixed factor every few epochs.
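
To make the StepLR entry concrete, the sketch below computes the learning rate that the stated schedule (initial rate 2e-5, step size 2, gamma 0.5) would produce at a given epoch. It assumes the scheduler is stepped once per epoch, which the card does not state. The function name `step_lr` is hypothetical, and this is plain arithmetic rather than the author's training code.

```python
def step_lr(epoch, base_lr=2e-5, step_size=2, gamma=0.5):
    """Learning rate at a 0-indexed epoch under a StepLR-style decay.

    Assumes one scheduler step per epoch (not stated in the card).
    """
    # The rate is multiplied by gamma once for every completed block of step_size epochs.
    return base_lr * gamma ** (epoch // step_size)

for epoch in (0, 1, 2, 3, 4, 39):
    print(f"epoch {epoch:2d}: lr = {step_lr(epoch):.3e}")
```

Under this assumption the rate halves every two epochs and falls below 1e-10 by the last of the 40 epochs, so the later epochs would change the weights very little.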