Vision Transformer (ViT) for Female Age Classification 👩

This repository contains a Vision Transformer (ViT) model trained to classify the age of women based on facial images. The model predicts the following age groups: Child, Teen, Woman, and Senior.

Model Overview 🎯

Model Type: Vision Transformer (ViT)
Task: Age classification from female facial images
Age Categories:
- Child: Ages 0-12
- Teen: Ages 13-19
- Woman: Ages 20-49
- Senior: Ages 50+
Input Image Size: 224x224 pixels
Preprocessing: Images are resized to 224x224 pixels and converted to RGB format.
Framework: Hugging Face transformers
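
The age ranges listed above can be written as a small lookup, which is handy when converting numeric age annotations into the four labels. This is a minimal sketch, not code from this repository: the function name `age_to_group` is hypothetical, and it assumes whole-number ages with the boundaries inclusive as listed.

```python
def age_to_group(age: int) -> str:
    """Map an age in years to one of the four labels listed above.

    Hypothetical helper for illustration; assumes whole-number ages.
    """
    if age < 0:
        raise ValueError(f"age must be non-negative, got {age}")
    if age <= 12:
        return "Child"
    if age <= 19:
        return "Teen"
    if age <= 49:
        return "Woman"
    return "Senior"


# Check one age from each group
for sample_age in (5, 13, 35, 50):
    print(sample_age, age_to_group(sample_age))
```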

How to Use the Model 🚀

Install the dependencies, then load the model through the transformers pipeline API by following the steps below:

1. Install Required Packages

Make sure to install the necessary packages:

pip install transformers torch torchvision pillow opencv-python

2. Run the Model

Here’s an example of how to load the model and classify an image:

import cv2
import torch
from PIL import Image
from transformers import pipeline

# Load the pre-trained model from Hugging Face Hub
model_dir = 'anggiatm/vit-female-age-classification'
# Use the first CUDA GPU if one is available, otherwise fall back to the CPU
device = 0 if torch.cuda.is_available() else -1
female_age = pipeline("image-classification", model=model_dir, device=device)

# Path to your image file
file_path = '/path/to/your/image.jpg'

# Load and preprocess the image
img = cv2.imread(file_path)
if img is not None:
    # Convert the image from BGR to RGB format
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # Resize the image to 224x224
    img_resized = cv2.resize(img_rgb, (224, 224))
    # Convert to PIL image format
    pil_image = Image.fromarray(img_resized)
    # Run the image through the model
    female_age_preds = female_age(pil_image)

    # Get the label and confidence score
    label = female_age_preds[0]['label']
    score = female_age_preds[0]['score'] * 100

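    # Print the result
    print(f"{file_path:<23} {label:<10} {score:>10.2f}%")
else:
    # cv2.imread returns None when the file is missing or cannot be decoded
    print(f"Could not read image: {file_path}")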

3. Example Output

/path/to/your/image.jpg Woman           95.72%
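
The pipeline returns a list of dictionaries, each with a `label` and a `score`, so the remaining classes can be inspected alongside the top one. The sketch below formats such a list. `format_predictions` is a hypothetical helper, and the sample scores are made-up values that only illustrate the output shape; they are not real model output.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def format_predictions(preds: List[Prediction]) -> List[str]:
    """Return one aligned 'label score%' line per prediction, highest score first."""
    ranked = sorted(preds, key=lambda p: p["score"], reverse=True)
    return [f"{p['label']:<10} {p['score'] * 100:>6.2f}%" for p in ranked]


# Made-up scores for illustration; in practice pass female_age(pil_image)
sample_preds = [
    {"label": "Woman", "score": 0.9572},
    {"label": "Teen", "score": 0.0301},
    {"label": "Senior", "score": 0.0102},
    {"label": "Child", "score": 0.0025},
]

for line in format_predictions(sample_preds):
    print(line)
```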

Dataset Information 📊

The model was trained on a dataset of female facial images categorized into the following age groups:

- Child: 0-12 years
- Teen: 13-19 years
- Woman: 20-49 years
- Senior: 50+ years

Each image is resized to 224x224 pixels before being fed into the model.

You can access the dataset here.

Caution ⚠️

While the model performs well in most cases, it can still misclassify images. The dataset was labeled manually, so some images may be mislabeled, and faces near the boundary between two age groups can be hard to tell apart. Treat the predictions as estimates rather than ground truth.
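
One way to act on this caution is to accept a prediction only when its score clears a threshold and to flag everything else for manual review. The sketch below assumes the pipeline output format from the usage example. `confident_label` and the 0.6 threshold are hypothetical choices, not values recommended for this model.

```python
from typing import Dict, List, Optional, Union


def confident_label(
    preds: List[Dict[str, Union[str, float]]],
    min_score: float = 0.6,  # hypothetical threshold; tune it on your own data
) -> Optional[str]:
    """Return the top label if its score is at least min_score, else None."""
    if not preds:
        return None
    top = max(preds, key=lambda p: p["score"])
    return top["label"] if top["score"] >= min_score else None


# Made-up scores for illustration only
clear_case = [{"label": "Senior", "score": 0.91}, {"label": "Woman", "score": 0.09}]
borderline = [{"label": "Teen", "score": 0.52}, {"label": "Woman", "score": 0.48}]

print(confident_label(clear_case))   # Senior
print(confident_label(borderline))   # None
```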

License 📜

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements 🙏

This model was built using the Hugging Face transformers library and Vision Transformer (ViT) architecture.

Happy Coding! 😊