Back Home
MLApp Repository (Computer Vision) for the Imperial FIFCO - SINAC - UCR Campaign
Developer: Joystick Data Team
First Public Version
Project Description
This application leverages computer vision models as part of the "De vuelta a casa" (Back Home) campaign sponsored by Imperial Beer. The project's goal is to process images of seashells confiscated at airports and predict their origin, so that they can be returned to the appropriate beaches.
Due to the high volume of seashells and the limited number of experts available in the country, manual classification is infeasible. This automated system uses artificial intelligence to provide an efficient and accurate alternative, contributing to the conservation of marine ecosystems and to environmental sustainability.
Repository Elements
- requirements.txt: File containing all necessary dependencies to run the project.
- Readme.txt: File with the model description.
- model_final.pth: File containing the trained model weights (see the loading sketch below).
Frameworks and Libraries Used: PyTorch
Programming Language Used: Python
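If you want to load model_final.pth directly with PyTorch (independently of the Transformers examples below), here is a minimal sketch. It assumes the file is a torchvision ConvNeXt-Tiny state dict with a two-class head; the actual export format may differ, so verify before relying on it:

import torch
from torchvision.models import convnext_tiny

# Assumption: state dict of a ConvNeXt-Tiny with a 2-class head.
# If the file was saved as a whole module (torch.save(model)),
# torch.load already returns the model and load_state_dict is unnecessary.
model = convnext_tiny(num_classes=2)
model.load_state_dict(torch.load("model_final.pth", map_location="cpu"))
model.eval()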
Model Description
This model is based on the ConvNeXt architecture and has been trained to classify images of seashells into two categories: Pacific and Caribbean. It is part of an effort to identify the origin of seashells confiscated at airports and facilitate their return to the corresponding beaches.
Architecture
The model uses an advanced convolution block structure based on the ConvNeXt architecture. This includes convolutional layers, normalization, and residual blocks designed for efficiency and performance.
Model Details
- Total Parameters: 27,819,361
- Trainable Parameters: 14,290,945
- Non-Trainable Parameters: 13,528,416
- Estimated Memory Size: 243.11 MB
- Total Mult-Adds: 321.60 M (multiply-accumulate operations, in millions)
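The figures above are the kind reported by torchinfo; a minimal sketch to reproduce them, assuming model holds the loaded network:

from torchinfo import summary

# Prints parameter counts, estimated size, and total mult-adds
# for a single 224x224 RGB input
summary(model, input_size=(1, 3, 224, 224))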
Model Structure
Expected Input: (1, 3, 224, 224) (1 RGB image with 224x224 resolution)
Main Layers:
- Multiple convolutional layers with normalization (Conv2dNormActivation)
- CN blocks for deep learning (CNBlock)
- Adaptive average pooling (AdaptiveAvgPool2d)
- Final linear classifier with sigmoid activation
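A quick sanity check of the expected input shape above, assuming model is the network loaded earlier:

import torch

dummy = torch.randn(1, 3, 224, 224)  # one RGB image at 224x224 resolution
with torch.no_grad():
    output = model(dummy)
# The Transformers checkpoint returns an object whose scores live in
# output.logits; a plain torchvision module returns the tensor directly.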
Performance
- Designed to run on both CPU and GPU.
- Optimized for computational efficiency in image classification tasks.
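Since the model runs on both CPU and GPU, device selection can be as simple as:

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)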
Requirements
- Memory Required: Approximately 243 MB
- Input Size: (3, 224, 224) in tensor format
Additional Details
- Framework Used: PyTorch
- Model File Size: ~111 MB
- Capabilities: Suitable for binary classification with high accuracy
Model Architecture Visualization
ConvNeXt
├─Conv2dNormActivation
├─CNBlock
├─AdaptiveAvgPool2d
├─Dropout
├─Flatten
├─Linear
References
ConvNeXt: Liu et al., "A ConvNet for the 2020s," CVPR 2022.
Usage
Load the model:
import torch
from transformers import AutoModelForImageClassification

model = AutoModelForImageClassification.from_pretrained("FIFCO/De_vuelta_a_casa")
Make predictions:
outputs = model(images)  # images: a preprocessed tensor of shape (N, 3, 224, 224)
predictions = torch.sigmoid(outputs.logits)
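If the classifier head emits a single logit per image, as the sigmoid activation suggests, the probability can be thresholded to pick a class. The 0.5 cutoff and the index-to-name mapping below are assumptions; check model.config.id2label for the actual mapping:

# Assumed mapping; verify against model.config.id2label
class_names = {0: "Pacific", 1: "Caribbean"}
predicted = (predictions.squeeze(-1) > 0.5).long()
for idx in predicted.tolist():
    print(class_names[idx])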
You can also use this model directly with the Transformers library. First, install the library:
pip install transformers
Note: Flash Attention 2 (pip install flash-attn) speeds up attention-based Transformers models on supported GPUs; it does not apply to this convolutional ConvNeXt model, so installing it is not required here.
Use the model with Transformers: You can load the model and make predictions as shown below:
Using AutoModel:
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

# Set the model id
model_id = "FIFCO/De_vuelta_a_casa_Clasificacion_imagenes_conchas"
image_processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)

# Load your image
image = Image.open("path_to_your_image.jpg")

# Preprocess the image and run inference
inputs = image_processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Get the class with the highest probability
predicted_class = outputs.logits.argmax(dim=-1).item()
print(f"Predicted class: {predicted_class}")
Using pipeline:
from transformers import pipeline

# Load the classification pipeline
pipe = pipeline(
    "image-classification",
    model="FIFCO/Back_Home_Image_Classification_Shells"
)

# Perform the classification
results = pipe("path_to_your_image.jpg")
print(results)
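The pipeline returns a list of dictionaries, one per class, each containing a label and a score, sorted from most to least likely.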
=== MODEL CONFIGURATION FOR FINE-TUNING ON A NEW DATASET ===
from transformers import AutoModelForImageClassification, TrainingArguments, Trainer
from transformers import AutoImageProcessor
from datasets import load_dataset
from PIL import Image
import torch.nn as nn
import torch

# Load the base pretrained model
base_model = AutoModelForImageClassification.from_pretrained(
    "FIFCO/De_vuelta_a_casa_Clasificacion_imagenes_conchas"
)
# Number of new categories/classes in the dataset
NUM_CLASSES = 10  # Adjust this according to your dataset

# Create a custom model with additional layers
class CustomImageClassifier(nn.Module):
    def __init__(self, base_model, num_classes):
        super().__init__()
        self.base_model = base_model
        self.custom_layers = nn.Sequential(
            nn.Linear(base_model.classifier.out_features, 512),  # Fully connected layer on top of the base logits
            nn.ReLU(),                   # ReLU activation
            nn.Dropout(0.3),             # Regularization
            nn.Linear(512, num_classes)  # Final layer for the new classes
        )

    def forward(self, pixel_values, labels=None):
        x = self.base_model(pixel_values).logits  # Output from the base model
        logits = self.custom_layers(x)            # Pass through the additional layers
        # Trainer expects the model to return its loss when labels are provided
        loss = nn.functional.cross_entropy(logits, labels) if labels is not None else None
        return {"loss": loss, "logits": logits}

# Initialize the model with custom layers
custom_model = CustomImageClassifier(base_model, NUM_CLASSES)
print("Custom model created:")
print(custom_model)
# === TRAINING CONFIGURATION === #
# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",          # Folder to save results
    evaluation_strategy="epoch",     # Evaluate at the end of each epoch
    save_strategy="epoch",           # Save a checkpoint at the end of each epoch
    learning_rate=5e-5,              # Learning rate
    per_device_train_batch_size=16,  # Training batch size
    per_device_eval_batch_size=16,   # Evaluation batch size
    num_train_epochs=10,             # Number of epochs
    weight_decay=0.01,               # L2 regularization
    logging_dir="./logs",            # Folder to save logs
    logging_steps=10,                # Logging frequency
    save_total_limit=2,              # Limit the number of saved checkpoints
    load_best_model_at_end=True,     # Load the best checkpoint at the end
    remove_unused_columns=False,     # Keep the image column for the on-the-fly transform
)
# === LOAD DATASET === #
# Load the dataset from Hugging Face or a local directory
dataset = load_dataset("imagefolder", data_dir="path_to_your_dataset")

# Convert each PIL image into pixel_values on the fly
image_processor = AutoImageProcessor.from_pretrained(
    "FIFCO/De_vuelta_a_casa_Clasificacion_imagenes_conchas"
)

def preprocess(batch):
    inputs = image_processor(images=batch["image"], return_tensors="pt")
    return {"pixel_values": inputs["pixel_values"], "label": batch["label"]}

dataset = dataset.with_transform(preprocess)

# Split the dataset into training and validation (imagefolder only creates
# the splits present on disk; use train_test_split if you only have "train")
train_dataset = dataset["train"]
val_dataset = dataset["validation"]
print("Dataset loaded:")
print(f"Training set: {len(train_dataset)} images")
print(f"Validation set: {len(val_dataset)} images")
# === TRAINING === #
# Configure the Trainer with the custom model and data
trainer = Trainer(
    model=custom_model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
)

# Train the model
print("Starting training...")
trainer.train()
# === SAVE THE FINE-TUNED MODEL === #
# A plain nn.Module has no save_pretrained; save the weights with torch.save
torch.save(custom_model.state_dict(), "./custom_fine_tuned_model.pth")
print("Fine-tuned model saved in './custom_fine_tuned_model.pth'")
# === EVALUATE THE MODEL === #
# Evaluate the model on the validation set
results = trainer.evaluate()
print("Evaluation results:")
print(results)
# === USE THE FINE-TUNED MODEL === #
# Reload the saved weights and classify a new image (the pipeline API only
# works with Hugging Face checkpoints, not with a plain state dict)
custom_model.load_state_dict(torch.load("./custom_fine_tuned_model.pth"))
custom_model.eval()

image_path = "path_to_new_image.jpg"
image = Image.open(image_path)
inputs = image_processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = custom_model(inputs["pixel_values"])["logits"]
predicted_class = logits.argmax(dim=-1).item()
print(f"Classification results for {image_path}:")
print(f"Predicted class: {predicted_class}")
# === IMPORTANT NOTES === #
# - Adjust `NUM_CLASSES` according to your dataset.
# - Ensure that your images are organized into folders for each category.
# - If you have limited data, consider using data augmentation techniques to improve performance (see the sketch below).
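As a sketch of the augmentation idea mentioned in the notes (torchvision transforms applied inside the preprocessing function; the specific transforms are illustrative, not tuned for this dataset):

from torchvision import transforms

# Illustrative augmentations; adjust choice and strength to your data
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
])

def preprocess_with_augmentation(batch):
    images = [augment(img.convert("RGB")) for img in batch["image"]]
    inputs = image_processor(images=images, return_tensors="pt")
    return {"pixel_values": inputs["pixel_values"], "label": batch["label"]}

# Apply to the training split only; keep validation deterministic
train_dataset = dataset["train"].with_transform(preprocess_with_augmentation)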
Setup
Prerequisites
- Docker (if you want to use containers).
- Python 3.7+
Install
Clone the repository:
git clone https://github.com/FIFCO/De_vuelta_a_casa.git
cd De_vuelta_a_casa
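Then install the dependencies listed in requirements.txt:
pip install -r requirements.txt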