license: mit
language:
- en
pipeline_tag: image-classification
tags:
- biology
Back Home
MLApp Repository (Computer Vision) for the Imperial FIFCO - SINAC - UCR Campaign
Developer: Joystick Data Team
First Public Version
Project Description
This application uses computer vision models as part of the "De vuelta a casa" (Back Home) campaign sponsored by Imperial Beer. The project's goal is to process images of seashells confiscated at airports and predict their origin, so that they can be returned to the appropriate beaches.
Due to the high volume of seashells and the limited number of experts available in the country, manual classification is unfeasible. This automated system utilizes artificial intelligence to provide an efficient and accurate solution, thus contributing to the conservation of marine ecosystems and environmental sustainability.
Repository Elements
- requirements.txt: File containing all necessary dependencies to run the project.
- Readme.txt: File with the model description.
- model_final.pth: File containing the model.
Frameworks and Libraries Used: PyTorch (torch)
Programming Language Used: Python
Model Description
This model is based on the ConvNeXt architecture and has been trained to classify images of seashells into two categories: Pacific and Caribbean. It is part of an effort to identify the origin of seashells confiscated at airports and facilitate their return to the corresponding beaches.
Architecture
The model is built from ConvNeXt convolutional blocks. Each block combines convolutional layers, normalization, and a residual connection, a design aimed at efficiency and performance.
Model Details
- Total Parameters: 27,819,361
- Trainable Parameters: 14,290,945
- Non-Trainable Parameters: 13,528,416
- Estimated Memory Size: 243.11 MB
- Total Mult-Adds: 321.60 M
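The figures above can be cross-checked with a little arithmetic. The sketch below assumes the parameters are stored as 32-bit floats (4 bytes each), which is PyTorch's default but is not stated in this card.

```python
# Parameter counts as listed in Model Details.
total_params = 27_819_361
trainable_params = 14_290_945
non_trainable_params = 13_528_416

# The two subsets should add up to the total.
assert trainable_params + non_trainable_params == total_params

# Assumption: float32 weights, i.e. 4 bytes per parameter.
BYTES_PER_PARAM = 4
weights_mb = total_params * BYTES_PER_PARAM / 1e6

# Fraction of the parameters that is updated during training.
trainable_share = trainable_params / total_params

print(f"Weights only: {weights_mb:.1f} MB")
print(f"Trainable share: {trainable_share:.1%}")
```

Under that assumption the weights alone come to about 111.3 MB, in line with the ~111 MB model file size listed under Additional Details. The larger 243.11 MB estimate presumably also covers the input and intermediate activations.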
Model Structure
Expected Input: (1, 3, 224, 224) (1 RGB image with 224x224 resolution)
Main Layers:
- Multiple convolutional layers with normalization (Conv2dNormActivation)
- ConvNeXt blocks (CNBlock)
- Adaptive average pooling (AdaptiveAvgPool2d)
- Final linear classifier with sigmoid activation
Performance
- Designed to run on both CPU and GPU.
- Optimized for computational efficiency in image classification tasks.
Requirements
- Memory Required: Approximately 243 MB
- Input Size: (3, 224, 224) in tensor format
Additional Details
- Framework Used: PyTorch
- Model File Size: ~111 MB
- Capabilities: Suitable for binary classification with high accuracy
Model Architecture Visualization
ConvNeXt
├─Conv2dNormActivation
├─CNBlock
├─AdaptiveAvgPool2d
├─Dropout
├─Flatten
├─Linear
References
ConvNeXt: Liu et al., "A ConvNet for the 2020s", CVPR 2022
Usage
Load the model:
from transformers import AutoModelForImageClassification
model = AutoModelForImageClassification.from_pretrained("FIFCO/De_vuelta_a_casa")
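Make predictions:
import torch
# `images` is a preprocessed float tensor of shape (N, 3, 224, 224)
outputs = model(images)
predictions = torch.sigmoid(outputs.logits)  # one probability per image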
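To turn the sigmoid output into one of the two categories, a threshold is needed. The following is a minimal sketch in plain Python. The 0.5 threshold and the mapping of the positive class to "Pacific" are assumptions for illustration, since the card does not state which category corresponds to an output of 1.

```python
import math
from typing import Tuple

# Hypothetical label mapping: the card does not say which class is "1".
LABELS = {0: "Caribbean", 1: "Pacific"}


def sigmoid(logit: float) -> float:
    """Map a raw logit to a probability, avoiding overflow for large inputs."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def label_from_logit(logit: float, threshold: float = 0.5) -> Tuple[str, float]:
    """Return (label, probability of the positive class) for one logit."""
    prob = sigmoid(logit)
    return LABELS[int(prob >= threshold)], prob


for logit in (-2.0, 0.3, 4.0):
    label, prob = label_from_logit(logit)
    print(f"logit={logit:+.1f} -> p={prob:.3f} -> {label}")
```

Raising the threshold trades recall on the positive class for fewer false positives. A suitable value would have to be chosen on validation data.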
You can also use this model directly with the Transformers library. First, install the library:
pip install transformers
Flash Attention 2 is not needed for this model: ConvNeXt is a convolutional architecture with no attention layers, so installing flash-attn brings no speed-up.
Use the model with Transformers: You can load the model and make predictions as shown below:
Using AutoModel:
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch
# Set model id
model_id = "FIFCO/De_vuelta_a_casa_Clasificacion_imagenes_conchas"
# AutoImageProcessor supersedes the deprecated AutoFeatureExtractor for vision models
image_processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)
# Load your image and make sure it has three RGB channels
image = Image.open("path_to_your_image.jpg").convert("RGB")
# Preprocess the image
inputs = image_processor(images=image, return_tensors="pt")
# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
# Get the class with the highest score
predicted_class = outputs.logits.argmax(dim=-1).item()
print(f"Predicted class: {predicted_class}")
Using pipeline:
from transformers import pipeline
# Load the classification pipeline
pipe = pipeline(
    "image-classification",
    model="FIFCO/Back_Home_Image_Classification_Shells",
)
# Perform the classification
results = pipe("path_to_your_image.jpg")
print(results)
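The image-classification pipeline returns a list of dictionaries, each with a `label` and a `score`. As a sketch of how that output might be consumed, the snippet below picks the top-scoring entry and flags low-confidence predictions for manual review. The `top_prediction` helper, the sample results, the label names, and the 0.8 cut-off are all made up for illustration.

```python
from typing import Dict, List, Union

Result = Dict[str, Union[str, float]]


def top_prediction(results: List[Result], min_score: float = 0.8) -> Dict[str, object]:
    """Pick the highest-scoring entry and mark it for review if the score is low."""
    best = max(results, key=lambda r: r["score"])
    return {
        "label": best["label"],
        "score": best["score"],
        "needs_review": best["score"] < min_score,
    }


# Hypothetical pipeline output; real label names depend on the model config.
sample_results = [
    {"label": "Pacific", "score": 0.91},
    {"label": "Caribbean", "score": 0.09},
]
print(top_prediction(sample_results))
```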
Requirements
Prerequisites
- Docker (if you want to use containers).
- Python 3.7+
Install
Clone the repository
git clone https://github.com/FIFCO/De_vuelta_a_casa.git
cd De_vuelta_a_casa