|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- fashion_mnist |
|
language: |
|
- en |
|
metrics: |
|
- accuracy |
|
pipeline_tag: image-classification |
|
--- |
|
|
|
# Fashion-MNIST Baseline Classifier |
|
|
|
## Model Details |
|
|
|
- **Model Name:** fashion-mnist-base |
|
- **Framework:** Custom implementation in Python |
|
- **Version:** 0.1 |
|
- **License:** Apache-2.0 |
|
|
|
## Model Description |
|
|
|
This is a neural network developed from the ground up to classify images from the Fashion-MNIST dataset.
The dataset comprises 70,000 grayscale 28x28 images, each labeled with one of 10 classes:
T-shirt/top, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag, or ankle boot.
|
|
|
## Intended Use |
|
|
|
This model is intended for educational purposes and as a baseline for more complex implementations. It can be used by students and AI enthusiasts |
|
to understand the workings of neural networks and their application in image classification. |
|
|
|
## Training Data |
|
|
|
The model was trained on the Fashion-MNIST dataset, which contains 60,000 training images and 10,000 test images. |
|
Each image is 28x28 pixels, grayscale, associated with one of 10 classes representing different types of clothing and accessories. |
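
For reference, the same 60,000/10,000 split can be pulled through `torchvision`; the snippet below is illustrative and is not this card's own training pipeline.

```python
# Illustrative only: load the Fashion-MNIST splits via torchvision.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()  # scales pixel values to [0, 1]

train_set = datasets.FashionMNIST(root="data", train=True, download=True, transform=to_tensor)
test_set = datasets.FashionMNIST(root="data", train=False, download=True, transform=to_tensor)

print(len(train_set), len(test_set))  # 60000 10000
image, label = train_set[0]
print(image.shape, train_set.classes[label])  # torch.Size([1, 28, 28]) and the class name
```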
|
|
|
### Architecture Details: |
|
- Input layer: 784 neurons (flattened 28x28 image) |
|
- Hidden layer 1: 256 neurons, ReLU activation, Dropout |
|
- Hidden layer 2: 64 neurons, ReLU activation, Dropout |
|
- Output layer: 10 neurons, logits |
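
Although the network is implemented from scratch, an equivalent architecture expressed in PyTorch would look roughly like the sketch below (the dropout probability is an assumption; the card does not state it).

```python
import torch
from torch import nn

class FashionMNISTBase(nn.Module):
    """Rough PyTorch equivalent of the architecture described above."""

    def __init__(self, dropout: float = 0.2):  # dropout rate is an assumption
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),              # 28x28 image -> 784 inputs
            nn.Linear(784, 256),       # hidden layer 1
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(256, 64),        # hidden layer 2
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(64, 10),         # output layer: raw logits
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)
```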
|
|
|
### Hyperparameters: |
|
- Learning rate: 0.005 |
|
- Batch size: 32 |
|
- Epochs: 25 |
|
|
|
The model uses a self-implemented stochastic gradient descent (SGD) optimizer. |
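
As a rough sketch of what such a self-implemented mini-batch SGD loop looks like with the hyperparameters above (the `model.params` / `model.backward` API is hypothetical, not the card's actual code):

```python
import numpy as np

LEARNING_RATE = 0.005
BATCH_SIZE = 32
EPOCHS = 25

def sgd_step(params, grads, lr=LEARNING_RATE):
    """Vanilla SGD: theta <- theta - lr * dL/dtheta, applied in place."""
    for name in params:
        params[name] -= lr * grads[name]

def train(model, x_train, y_train):
    """Hypothetical mini-batch training loop matching the hyperparameters above."""
    n = len(x_train)
    for epoch in range(EPOCHS):
        order = np.random.permutation(n)
        for start in range(0, n, BATCH_SIZE):
            batch = order[start:start + BATCH_SIZE]
            grads = model.backward(x_train[batch], y_train[batch])  # hypothetical API
            sgd_step(model.params, grads)                           # hypothetical API
```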
|
|
|
## Evaluation Results |
|
|
|
The model achieved the following performance on the test set: |
|
- Accuracy: 86.7% |
|
- Precision, Recall, and F1-Score: |
|
|
|
| Label | Precision | Recall | F1-score | |
|
|-------------|-----------|---------|----------| |
|
| T-shirt/Top | 0.847514 | 0.767 | 0.805249 | |
|
| Trouser | 0.982618 | 0.961 | 0.971689 | |
|
| Pullover | 0.800000 | 0.748 | 0.773127 | |
|
| Dress | 0.861868 | 0.886 | 0.873767 | |
|
| Coat | 0.776278 | 0.805 | 0.790378 | |
|
| Sandal | 0.957958 | 0.957 | 0.957479 | |
|
| Shirt | 0.638587 | 0.705 | 0.670152 | |
|
| Sneaker | 0.935743 | 0.932 | 0.933868 | |
|
| Bag | 0.952381 | 0.960 | 0.956175 | |
|
| Ankle-Boot | 0.944554 | 0.954 | 0.949254 | |
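
A per-class breakdown like the table above can be reproduced with scikit-learn, assuming integer arrays of true and predicted test labels (`y_true`, `y_pred`) are already available:

```python
# Illustrative: compute per-class precision / recall / F1 on the test set.
from sklearn.metrics import classification_report

CLASS_NAMES = [
    "T-shirt/Top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle-Boot",
]

# y_true: ground-truth labels, y_pred: model predictions (both of length 10,000).
print(classification_report(y_true, y_pred, target_names=CLASS_NAMES, digits=6))
```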
|
|
|
## Limitations and Biases |
|
|
|
Due to the nature of the training dataset, the model may not capture the full complexity of fashion items in diverse real-world scenarios. |
|
In practice, we found that it is sensitive to background colors and to the proportions of the article within the image.
|
|
|
## How to Use |
|
|
|
```python
import torch
import torchvision.transforms as transforms
from PIL import Image

model = torch.load('fashion-mnist-base.pt')

# Input images need to be transformed into the Fashion-MNIST format:
# 28x28 grayscale, with the article light on a dark background.
transform = transforms.Compose(
    [
        transforms.Resize((28, 28)),
        transforms.Grayscale(),
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,)),         # normalize pixel values
        transforms.Lambda(lambda x: 1.0 - x),         # invert colors
        transforms.Lambda(lambda x: x[0]),            # keep the single channel as a 2D tensor
        transforms.Lambda(lambda x: x.unsqueeze(0)),  # add a batch dimension
    ]
)

img = Image.open('fashion/dress.png')
img = transform(img)
model.predictions(img)  # returns a class-name -> confidence dict (see Sample Output)
```
|
|
|
## Sample Output |
|
|
|
```
{'Dress': 84.437744,
 'Coat': 7.631796,
 'Pullover': 4.2272186,
 'Shirt': 1.297625,
 'T-shirt/Top': 1.2237197,
 'Bag': 0.9053432,
 'Trouser/Jeans': 0.27268794,
 'Sneaker': 0.0031491981,
 'Ankle-Boot': 0.00063403655,
 'Sandal': 8.5103806e-05}
```
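
The scores appear to be softmax probabilities expressed as percentages. If only raw logits are available, a dictionary in this shape could be built roughly as follows; `predict_proba` is a hypothetical helper, and the forward call assumes a standard `nn.Module`-style model.

```python
import torch

CLASS_NAMES = [
    "T-shirt/Top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle-Boot",
]

def predict_proba(model, img: torch.Tensor) -> dict[str, float]:
    """Hypothetical helper: map raw logits to a class-name -> percentage dict."""
    with torch.no_grad():
        logits = model(img)  # assumes a standard forward call
    probs = torch.softmax(logits.squeeze(), dim=0) * 100.0
    order = probs.argsort(descending=True)
    return {CLASS_NAMES[i]: probs[i].item() for i in order.tolist()}
```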