---
license: apache-2.0
tags:
- ai-image-classification
- catboost
- xgboost
- auraface
- cnn
---

# AI Image Classification Model

This repository contains two trained classifiers, **XGBoost** and **CatBoost**, for AI image classification. Both are trained to distinguish AI-generated faces from real human faces using embeddings extracted by the **AuraFace** model.

## Model Overview

- **AuraFace**: Extracts face embeddings from input images.
- **CatBoost & XGBoost**: Trained classifiers that predict whether a face image is AI-generated or real.
- **Dataset**: Trained on the [Real vs AI Generated Faces Dataset](https://www.kaggle.com/datasets/philosopher0808/real-vs-ai-generated-faces-dataset).
- **Preferred Model**: Both classifiers yield similar results, but **CatBoost** is the preferred model.

## Pipeline

1. An image is passed to **AuraFace**, which detects the face and extracts a 512-dimensional embedding.
2. The embedding is converted into a single-row pandas DataFrame with columns `feature_0` through `feature_511`.
3. The trained classifier (CatBoost or XGBoost) predicts the class from that DataFrame.

## Model Usage

### Dependencies

AuraFace is loaded through `insightface`, which runs on ONNX Runtime. Install `onnxruntime-gpu` instead of `onnxruntime` to use the `CUDAExecutionProvider`.

```bash
pip install opencv-python catboost xgboost pandas numpy pillow huggingface_hub insightface onnxruntime
```

### Loading AuraFace

```python
from huggingface_hub import snapshot_download
from insightface.app import FaceAnalysis
import numpy as np

# Download the AuraFace model into ./models/auraface
snapshot_download(
    "fal/AuraFace-v1",
    local_dir="models/auraface",
)

# Initialize AuraFace; with root=".", InsightFace looks in ./models/<name>
face_app = FaceAnalysis(
    name="auraface",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    root=".",
)
face_app.prepare(ctx_id=0, det_size=(640, 640))
```

### Loading CatBoost Model

```python
from catboost import CatBoostClassifier

# Load the trained CatBoost model
ai_image_classifier = CatBoostClassifier()
ai_image_classifier.load_model('models/ai_image_classifier/cat_classifier.cbm')
```

### Classifying an Image

```python
import pandas as pd
from PIL import Image

def classify_image(image_path):
    # Load the image as RGB, then reverse the channel axis to get BGR,
    # the channel order InsightFace expects
    img = Image.open(image_path).convert("RGB")
    img_array = np.array(img)[:, :, ::-1]

    # Detect faces and take the embedding of the first detected face
    faces = face_app.get(img_array)
    if not faces:
        return "No face detected."

    embedding = faces[0].normed_embedding

    # Convert the embedding to a single-row DataFrame
    feature_columns = [f'feature_{i}' for i in range(512)]
    embedding_df = pd.DataFrame([embedding], columns=feature_columns)

    # Predict the class: 1 means AI-generated, anything else means real
    prediction = ai_image_classifier.predict(embedding_df)[0]
    return "AI-generated" if prediction == 1 else "Real Face"

# Example usage
image_path = "path/to/image.jpg"
result = classify_image(image_path)
print(f"Classification: {result}")
```

### Using XGBoost

XGBoost follows the same process. To use it, replace the `CatBoostClassifier` loading step with:

```python
from xgboost import XGBClassifier

# Load the trained XGBoost model
ai_image_classifier = XGBClassifier()
ai_image_classifier.load_model('models/ai_image_classifier/xgb_classifier.json')
```

## Acknowledgments

- **[AuraFace-v1](https://huggingface.co/fal/AuraFace-v1)** for face embeddings.
- **[Real vs AI Generated Faces Dataset](https://www.kaggle.com/datasets/philosopher0808/real-vs-ai-generated-faces-dataset)** for training data.
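
## Batch Classification (Sketch)

The `classify_image` function from the Model Usage section handles one image at a time. To classify a whole folder, it can be applied to each image file in turn. The following is a minimal sketch rather than part of this repository: `classify_directory`, `summarize`, and `IMAGE_EXTENSIONS` are hypothetical names introduced here for illustration, and the classification function is passed in as an argument so the helper works with either the CatBoost or the XGBoost setup.

```python
from collections import Counter
from pathlib import Path
from typing import Callable, Dict

# Hypothetical set of file extensions treated as images; adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def classify_directory(directory: str, classify_fn: Callable[[str], str]) -> Dict[str, str]:
    """Apply classify_fn to every image file directly inside `directory`.

    Returns a dict mapping each file name to the label classify_fn returned.
    Subdirectories and non-image files are skipped.
    """
    results = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            results[path.name] = classify_fn(str(path))
    return results


def summarize(results: Dict[str, str]) -> Counter:
    """Count how many images received each label."""
    return Counter(results.values())


# Usage, assuming face_app, ai_image_classifier, and classify_image
# are already defined as in the Model Usage section:
# results = classify_directory("path/to/images", classify_image)
# print(summarize(results))
```

Because `classify_image` returns `"No face detected."` for images without a detectable face, those images show up as their own entry in the summary instead of being silently dropped.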