---
license: apache-2.0
datasets:
- fashion_mnist
language:
- en
metrics:
- accuracy
pipeline_tag: image-classification
---
# Fashion-MNIST Baseline Classifier
## Model Details
- **Model Name:** fashion-mnist-base
- **Framework:** Custom implementation in Python
- **Version:** 0.1
- **License:** Apache-2.0
## Model Description
This is a neural network model developed from the ground up to classify images from the Fashion-MNIST dataset.
The dataset comprises 70,000 grayscale images of size 28x28, each associated with a label from 10 classes: T-shirt/top, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag, and ankle boot.
## Intended Use
This model is intended for educational purposes and as a baseline for more complex implementations. It can be used by students and AI enthusiasts
to understand the workings of neural networks and their application in image classification.
## Training Data
The model was trained on the Fashion-MNIST dataset, which contains 60,000 training images and 10,000 test images.
Each image is 28x28 pixels, grayscale, associated with one of 10 classes representing different types of clothing and accessories.
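The card does not state how the data was obtained; one common option is to load the standard splits via `torchvision` (a minimal sketch, assuming `torchvision` is installed):

```python
import torchvision
import torchvision.transforms as transforms

# Download the standard Fashion-MNIST splits (60,000 train / 10,000 test)
train_set = torchvision.datasets.FashionMNIST(
    root='./data', train=True, download=True, transform=transforms.ToTensor()
)
test_set = torchvision.datasets.FashionMNIST(
    root='./data', train=False, download=True, transform=transforms.ToTensor()
)

print(len(train_set), len(test_set))  # 60000 10000
print(train_set.classes)              # the 10 clothing/accessory categories
```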
### Architecture Details:
- Input layer: 784 neurons (flattened 28x28 image)
- Hidden layer 1: 256 neurons, ReLU activation, Dropout
- Hidden layer 2: 64 neurons, ReLU activation, Dropout
- Output layer: 10 neurons, logits
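For illustration, the layer stack above can be expressed as a PyTorch module. This is a minimal sketch, not the author's from-scratch implementation; the class name and the dropout probability (`p=0.2`) are assumptions, since the card does not specify them.

```python
import torch.nn as nn

class FashionMNISTBase(nn.Module):  # hypothetical name, mirrors the layer list above
    def __init__(self, dropout_p=0.2):  # dropout probability is an assumption
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),         # 28x28 image -> 784 inputs
            nn.Linear(784, 256),  # hidden layer 1
            nn.ReLU(),
            nn.Dropout(dropout_p),
            nn.Linear(256, 64),   # hidden layer 2
            nn.ReLU(),
            nn.Dropout(dropout_p),
            nn.Linear(64, 10),    # output layer: raw logits, one per class
        )

    def forward(self, x):
        return self.net(x)
```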
### Hyperparameters:
- Learning rate: 0.005
- Batch size: 32
- Epochs: 25
The model uses a self-implemented stochastic gradient descent (SGD) optimizer.
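The optimizer itself was written from scratch by the author; the sketch below only reproduces the stated settings (learning rate 0.005, batch size 32, 25 epochs) with `torch.optim.SGD` standing in for the custom optimizer and a dropout probability of 0.2 assumed as above.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
import torchvision
import torchvision.transforms as transforms

train_set = torchvision.datasets.FashionMNIST(
    root='./data', train=True, download=True, transform=transforms.ToTensor()
)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)  # batch size 32

# Same layer stack as in the architecture sketch above
model = nn.Sequential(
    nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Dropout(0.2),
    nn.Linear(256, 64), nn.ReLU(), nn.Dropout(0.2), nn.Linear(64, 10),
)
criterion = nn.CrossEntropyLoss()                          # loss on raw logits
optimizer = torch.optim.SGD(model.parameters(), lr=0.005)  # stand-in for the custom SGD

for epoch in range(25):  # 25 epochs
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```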
## Evaluation Results
The model achieved the following performance on the test set:
- Accuracy: 86.7%
- Precision, Recall, and F1-Score:
| Label       | Precision | Recall | F1-Score |
|-------------|-----------|--------|----------|
| T-shirt/Top | 0.848     | 0.767  | 0.805    |
| Trouser     | 0.983     | 0.961  | 0.972    |
| Pullover    | 0.800     | 0.748  | 0.773    |
| Dress       | 0.862     | 0.886  | 0.874    |
| Coat        | 0.776     | 0.805  | 0.790    |
| Sandal      | 0.958     | 0.957  | 0.957    |
| Shirt       | 0.639     | 0.705  | 0.670    |
| Sneaker     | 0.936     | 0.932  | 0.934    |
| Bag         | 0.952     | 0.960  | 0.956    |
| Ankle-Boot  | 0.945     | 0.954  | 0.949    |
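The card does not say which tooling produced these metrics; a per-class report like the one above can be reproduced from test-set predictions with `scikit-learn`, for example (a sketch reusing `model` and `test_set` from the sketches above):

```python
import torch
from torch.utils.data import DataLoader
from sklearn.metrics import accuracy_score, classification_report

class_names = ['T-shirt/Top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle-Boot']

# `model` and `test_set` are assumed from the earlier sketches
test_loader = DataLoader(test_set, batch_size=32)

model.eval()  # disable dropout for evaluation
y_true, y_pred = [], []
with torch.no_grad():
    for images, labels in test_loader:
        y_pred.extend(model(images).argmax(dim=1).tolist())
        y_true.extend(labels.tolist())

print('Accuracy:', accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=class_names, digits=3))
```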
## Limitations and Biases
Due to the nature of the training dataset, the model may not capture the full complexity of fashion items in diverse real-world scenarios.
In practice, we found it to be sensitive to background colors and to the proportions of the article within the image.
## How to Use
```python
import torch
import torchvision.transforms as transforms
from PIL import Image
model = torch.load('fashion-mnist-base.pt')
# Input images need to be transformed to the Fashion-MNIST dataset format
transform = transforms.Compose(
    [
        transforms.Resize((28, 28)),                  # Scale to 28x28 pixels
        transforms.Grayscale(),                       # Convert to a single channel
        transforms.ToTensor(),                        # PIL image -> float tensor in [0, 1]
        transforms.Normalize((0.5,), (0.5,)),         # Normalize to [-1, 1]
        transforms.Lambda(lambda x: 1.0 - x),         # Invert colors (articles are light on dark)
        transforms.Lambda(lambda x: x[0]),            # Drop the channel dimension -> (28, 28)
        transforms.Lambda(lambda x: x.unsqueeze(0)),  # Add a batch dimension -> (1, 28, 28)
    ]
)
img = Image.open('fashion/dress.png')
img = transform(img)
model.predictions(img)  # Returns per-class confidences (see Sample Output below)
```
## Sample Output
```
{'Dress': 84.437744,
'Coat': 7.631796,
'Pullover': 4.2272186,
'Shirt': 1.297625,
'T-shirt/Top': 1.2237197,
'Bag': 0.9053432,
'Trouser/Jeans': 0.27268794,
'Sneaker': 0.0031491981,
'Ankle-Boot': 0.00063403655,
'Sandal': 8.5103806e-05}
```
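The `predictions` method itself is part of the custom model class and is not documented in this card; a percentage breakdown like the one above can be obtained from raw logits with a softmax. A minimal sketch, with the class names and their ordering assumed:

```python
import torch

# Assumed label mapping, in standard Fashion-MNIST class order
class_names = ['T-shirt/Top', 'Trouser/Jeans', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle-Boot']

def predict_percentages(model, img):
    """Return {class name: confidence in %}, sorted from most to least likely."""
    with torch.no_grad():
        logits = model(img)                        # assumed shape (1, 10), raw logits
        probs = torch.softmax(logits, dim=1)[0] * 100
    scores = {name: p.item() for name, p in zip(class_names, probs)}
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```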