---
title: Panda Cat Dog Classification
emoji: ⚡
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 4.15.0
app_file: app.py
pinned: false
license: mit
---
# Model Name

panda_cat_dog_classification

### Model Description
This model classifies images of animals as pandas, cats, or dogs. It is a custom CNN model trained for this task.


- **Developed by:** Neelima Monjusha Preeti
- **Model type:** custom CNN model
- **Language(s):** Python
- **License:** MIT
- **Contact:** [email protected]

### Task Description:


The panda_cat_dog_classification app takes an image of a panda, cat, or dog as input and outputs the name of the predicted animal. The pipeline first preprocesses and resizes the images, then defines a custom CNN model along with its loss function and optimizer.
After that, the custom model is trained and tested, and the app is launched with Gradio on Hugging Face.


### Data Preprocessing
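The image dataset is preprocessed with the following transform pipeline:

```python
from torchvision import transforms

# Resize, convert to a tensor, then normalize each RGB channel
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])
```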

- `transforms.Resize((224, 224))` resizes the input image to 224 × 224 pixels.
- `transforms.ToTensor()` converts the input image into a PyTorch tensor and scales pixel values to the range [0, 1]. Neural networks operate on tensors, so this puts the image in a format suitable for further processing.
- `transforms.Normalize(mean, std)` normalizes each channel of the tensor by subtracting its mean and dividing by its standard deviation. The values used here are the per-channel means and standard deviations commonly used for ImageNet.
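
To make the arithmetic concrete, here is a minimal plain-Python sketch of what `ToTensor()` followed by `Normalize()` does to a single 8-bit RGB pixel. The helper name `normalize_pixel` is hypothetical and exists only for illustration; the real transforms operate on whole image tensors.

```python
# Per-channel statistics passed to transforms.Normalize in the pipeline above
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Mimic ToTensor() then Normalize() for one 8-bit RGB pixel.

    Hypothetical helper for illustration only.
    """
    # ToTensor: scale each channel from [0, 255] to [0.0, 1.0]
    scaled = [value / 255.0 for value in rgb]
    # Normalize: (x - mean) / std, applied channel by channel
    return [(s - m) / d for s, m, d in zip(scaled, MEAN, STD)]


white = normalize_pixel((255, 255, 255))
black = normalize_pixel((0, 0, 0))
print(white)
print(black)
```

After normalization, a white pixel maps to positive values and a black pixel to negative values in every channel, so the inputs are roughly centered around zero.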

### Model Architecture

The model is a custom `CNN` class. The architecture consists of two convolutional layers, each followed by max pooling, and three fully connected layers, and it is designed for a classification task with three classes. As written, the forward pass applies no activation functions between layers.

```python
import torch.nn as nn


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)    # 3 input channels -> 6 feature maps, 5x5 kernel
        self.conv2 = nn.Conv2d(6, 16, 5)   # 6 -> 16 feature maps, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)     # halves height and width
        self.fc1 = nn.Linear(16 * 53 * 53, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 3)        # one output per class

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool(x)
        x = self.conv2(x)
        x = self.pool(x)
        x = x.view(-1, 16 * 53 * 53)       # flatten to (batch, features)
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        return x
```
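
The `16 * 53 * 53` input size of `fc1` follows from the 224 × 224 input and the layer parameters. The sketch below traces the spatial size through each layer using the standard output-size formula for convolution and pooling; the helper name `conv_out` is hypothetical, and it assumes no padding and the strides shown in the model (1 for the convolutions, 2 for the pooling).

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 224                                  # side length after Resize((224, 224))
size = conv_out(size, kernel=5)             # conv1: 224 -> 220
size = conv_out(size, kernel=2, stride=2)   # pool:  220 -> 110
size = conv_out(size, kernel=5)             # conv2: 110 -> 106
size = conv_out(size, kernel=2, stride=2)   # pool:  106 -> 53

flattened = 16 * size * size                # conv2 produces 16 channels
print(size, flattened)
```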
Training uses a batch size of 8, `CrossEntropyLoss()` as the loss function, and the Adam optimizer with a learning rate of 0.001.

```python
loss_function = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
```
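
For intuition about what the loss measures, the following plain-Python sketch computes cross-entropy for a single sample from raw scores (logits): the negative log of the softmax probability assigned to the correct class. The function name and the example logits are hypothetical; `nn.CrossEntropyLoss()` performs the same calculation on tensors and, by default, averages it over the batch.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[target]


# Hypothetical logits for the three classes
uniform_loss = cross_entropy([0.0, 0.0, 0.0], target=0)    # no preference
confident_loss = cross_entropy([5.0, 0.0, 0.0], target=0)  # favors the correct class
print(uniform_loss, confident_loss)
```

A model with no preference among three classes has a loss of ln(3), and the loss shrinks as the correct class receives a larger score.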
### Training Loop

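The training data is loaded and split into mini-batches. For every batch, the model runs a forward pass and the loss is computed, followed by backpropagation and an optimizer step.

Backpropagation and optimization:

```python
optimizer.zero_grad()  # clear gradients from the previous batch
loss.backward()        # backpropagate to compute new gradients
optimizer.step()       # update the model parameters
```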
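
The structure of such a loop can be illustrated without PyTorch. The sketch below is a plain-Python analogy, not the project's training code: it fits a single weight to toy data with ordinary mini-batch gradient descent (not Adam), and the names `make_batches`, `data`, and `w` are hypothetical. Each comment marks the step that corresponds to one of the three calls above.

```python
import random


def make_batches(items, batch_size):
    """Split a list into consecutive mini-batches; the last may be smaller."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# Toy data following y = 2x, standing in for the image dataset
data = [(x / 10, 2 * x / 10) for x in range(20)]
w = 0.0                  # the single trainable weight
lr = 0.1                 # learning rate for plain gradient descent
rng = random.Random(0)   # fixed seed so the run is repeatable

for epoch in range(200):
    rng.shuffle(data)
    for batch in make_batches(data, 8):
        grad = 0.0                      # analogous to optimizer.zero_grad()
        for x, y in batch:              # analogous to loss.backward()
            grad += 2 * (w * x - y) * x / len(batch)
        w -= lr * grad                  # analogous to optimizer.step()

print(round(w, 4))
```

With 20 samples and a batch size of 8, each epoch processes three batches of 8, 8, and 4 samples.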
### Test data

The test data is then loaded and used to calculate the accuracy.

The accuracy was about 53.33%.
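
Accuracy here means the percentage of test images whose predicted class matches the true class. A minimal sketch of that calculation follows; the label and prediction lists are made-up values for illustration and are not the project's test set.

```python
def accuracy_percent(predictions, labels):
    """Percentage of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(predictions, labels))
    return 100.0 * correct / len(labels)


# Hypothetical class indices (0 = Cat, 1 = Dog, 2 = Panda)
labels = [0, 0, 1, 1, 2, 2]
predictions = [0, 1, 1, 1, 2, 0]

acc = accuracy_percent(predictions, labels)
print(acc)
```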

### Result Analysis
The packages needed to build the Hugging Face interface are imported with:

```python
import gradio as gr
import torch
from torchvision import transforms
```

The model was saved with the following:

```python
model_scripted = torch.jit.script(model)
model_scripted.save('./models/cat_dog_cnn.pt')
```
### Hugging Face Result Analysis
First, the saved model `cat_dog_cnn.pt` is loaded and the prediction function is defined. Because this is an image classification model, example images are stored in `app_data`:

```bash
|---app_data
|      |---cat.jpg
|      |---dog.jpg
|      |---panda.jpg
|
```

These example images are loaded into the interface.
The prediction classes are `CLASSES = ["Cat", "Dog", "Panda"]`.
The prediction function is:

```python
def classify_image(inp):
    inp = transform(inp).unsqueeze(0)    # preprocess and add a batch dimension
    out = model(inp)                     # raw scores, one per class
    return CLASSES[out.argmax().item()]  # name of the highest-scoring class
```

This returns the predicted class name for the input image.
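
The mapping from model output to class name can be shown in plain Python. In this sketch the logits are hypothetical values, and the softmax step is an optional addition for reading the scores as probabilities; the app itself only takes the argmax.

```python
import math

CLASSES = ["Cat", "Dog", "Panda"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the highest-scoring class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return CLASSES[best], probs[best]


label, confidence = predict([0.2, 1.5, -0.3])  # hypothetical logits
print(label, round(confidence, 3))
```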
### Interface Creation

The Gradio interface is created with the following code:

```python
iface = gr.Interface(
    fn=classify_image,
    inputs=gr.Image(type="pil", label="Input Image"),
    outputs="text",
    examples=[
        "./app_data/cat.jpg",
        "./app_data/dog.jpg",
        "./app_data/panda.jpg",
    ],
)
```
This creates an interface that accepts an image as input, returns the predicted class (cat, dog, or panda) as text, and offers the three example images.
The app is then launched with:

```python
iface.launch()
```

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65b2665fee3f66b2b0f7b765/3Pqyngir14HKGd_CNVDzh.png)

### Project Structure
```bash
|
|---app_data 
|       |---images(used for examples)
|
|---models
|       |---cat_dog_cnn.pt
|
|---train(image dataset for training)
|
|---test(image dataset for testing)
|
|---Readme.md(about project)
|
|---app.py(the interface for project)
|
|---requirements.txt(libraries needed for project)
|
|---main.ipynb(project code)
```

### How to Run

```bash
git clone https://huggingface.co/spaces/neelimapreeti297/panda_cat_dog_classification
cd panda_cat_dog_classification
pip install -r requirements.txt
python app.py
```


### License
This project is licensed under the MIT License.

### Contributor
Neelima Monjusha Preeti - [email protected]

App link: https://huggingface.co/spaces/neelimapreeti297/panda_cat_dog_classification