---
license: apache-2.0
datasets:
- fashion_mnist
language:
- en
metrics:
- accuracy
pipeline_tag: image-classification
---
# Fashion-MNIST Baseline Classifier
## Model Details
- **Model Name:** fashion-mnist-base
- **Framework:** Custom implementation in Python
- **Version:** 0.1
- **License:** Apache-2.0
## Model Description
This is a neural network model developed from the ground up to classify images from the Fashion-MNIST dataset.
The dataset comprises 70,000 grayscale images of size 28x28, each associated with a label from 10 classes: T-shirt/top, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag, and ankle boot.
## Intended Use
This model is intended for educational purposes and as a baseline for more complex implementations. It can be used by students and AI enthusiasts
to understand the workings of neural networks and their application in image classification.
## Training Data
The model was trained on the Fashion-MNIST dataset, which contains 60,000 training images and 10,000 test images.
Each image is 28x28 pixels, grayscale, associated with one of 10 classes representing different types of clothing and accessories.
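The card does not state how the data was obtained; one common option is to load the standard splits via `torchvision` (a minimal sketch, assuming `torchvision` is installed):

```python
import torchvision
import torchvision.transforms as transforms

# Download the standard Fashion-MNIST splits (60,000 train / 10,000 test)
train_set = torchvision.datasets.FashionMNIST(
    root='./data', train=True, download=True, transform=transforms.ToTensor()
)
test_set = torchvision.datasets.FashionMNIST(
    root='./data', train=False, download=True, transform=transforms.ToTensor()
)

print(len(train_set), len(test_set))  # 60000 10000
print(train_set.classes)              # the 10 clothing/accessory categories
```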
### Architecture Details:
- Input layer: 784 neurons (flattened 28x28 image)
- Hidden layer 1: 256 neurons, ReLU activation, Dropout
- Hidden layer 2: 64 neurons, ReLU activation, Dropout
- Output layer: 10 neurons, logits
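For illustration, the layer stack above can be expressed as a PyTorch module. This is a minimal sketch, not the author's from-scratch implementation; the class name and the dropout probability (`p=0.2`) are assumptions, since the card does not specify them.

```python
import torch.nn as nn

class FashionMNISTBase(nn.Module):  # hypothetical name, mirrors the layer list above
    def __init__(self, dropout_p=0.2):  # dropout probability is an assumption
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),         # 28x28 image -> 784 inputs
            nn.Linear(784, 256),  # hidden layer 1
            nn.ReLU(),
            nn.Dropout(dropout_p),
            nn.Linear(256, 64),   # hidden layer 2
            nn.ReLU(),
            nn.Dropout(dropout_p),
            nn.Linear(64, 10),    # output layer: raw logits, one per class
        )

    def forward(self, x):
        return self.net(x)
```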
### Hyperparameters:
- Learning rate: 0.005
- Batch size: 32
- Epochs: 25
The model uses a self-implemented stochastic gradient descent (SGD) optimizer.
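The optimizer itself was written from scratch by the author; the sketch below only reproduces the stated settings (learning rate 0.005, batch size 32, 25 epochs) with `torch.optim.SGD` standing in for the custom optimizer and a dropout probability of 0.2 assumed as above.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
import torchvision
import torchvision.transforms as transforms

train_set = torchvision.datasets.FashionMNIST(
    root='./data', train=True, download=True, transform=transforms.ToTensor()
)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)  # batch size 32

# Same layer stack as in the architecture sketch above
model = nn.Sequential(
    nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Dropout(0.2),
    nn.Linear(256, 64), nn.ReLU(), nn.Dropout(0.2), nn.Linear(64, 10),
)
criterion = nn.CrossEntropyLoss()                          # loss on raw logits
optimizer = torch.optim.SGD(model.parameters(), lr=0.005)  # stand-in for the custom SGD

for epoch in range(25):  # 25 epochs
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```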
## Evaluation Results
The model achieved the following performance on the test set:
- Accuracy: 86.7%
- Precision, Recall, and F1-Score:
| Label       | Precision | Recall | F1-Score |
|-------------|-----------|--------|----------|
| T-shirt/Top | 0.848     | 0.767  | 0.805    |
| Trouser     | 0.983     | 0.961  | 0.972    |
| Pullover    | 0.800     | 0.748  | 0.773    |
| Dress       | 0.862     | 0.886  | 0.874    |
| Coat        | 0.776     | 0.805  | 0.790    |
| Sandal      | 0.958     | 0.957  | 0.957    |
| Shirt       | 0.639     | 0.705  | 0.670    |
| Sneaker     | 0.936     | 0.932  | 0.934    |
| Bag         | 0.952     | 0.960  | 0.956    |
| Ankle-Boot  | 0.945     | 0.954  | 0.949    |
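The card does not say which tooling produced these metrics; a per-class report like the one above can be reproduced from test-set predictions with `scikit-learn`, for example (a sketch reusing `model` and `test_set` from the sketches above):

```python
import torch
from torch.utils.data import DataLoader
from sklearn.metrics import accuracy_score, classification_report

class_names = ['T-shirt/Top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle-Boot']

# `model` and `test_set` are assumed from the earlier sketches
test_loader = DataLoader(test_set, batch_size=32)

model.eval()  # disable dropout for evaluation
y_true, y_pred = [], []
with torch.no_grad():
    for images, labels in test_loader:
        y_pred.extend(model(images).argmax(dim=1).tolist())
        y_true.extend(labels.tolist())

print('Accuracy:', accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=class_names, digits=3))
```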
## Limitations and Biases
Due to the nature of the training dataset, the model may not capture the full complexity of fashion items in diverse real-world scenarios.
In practice, we found it to be sensitive to background colors and to the proportions of the article within the image.
## How to Use
```python
import torch
import torchvision.transforms as transforms
from PIL import Image
model = torch.load('fashion-mnist-base.pt')
# Input images need to be transformed to the Fashion-MNIST dataset format
transform = transforms.Compose(
    [
        transforms.Resize((28, 28)),                  # Scale to 28x28 pixels
        transforms.Grayscale(),                       # Convert to a single channel
        transforms.ToTensor(),                        # PIL image -> float tensor in [0, 1]
        transforms.Normalize((0.5,), (0.5,)),         # Normalize to [-1, 1]
        transforms.Lambda(lambda x: 1.0 - x),         # Invert colors (articles are light on dark)
        transforms.Lambda(lambda x: x[0]),            # Drop the channel dimension -> (28, 28)
        transforms.Lambda(lambda x: x.unsqueeze(0)),  # Add a batch dimension -> (1, 28, 28)
    ]
)
img = Image.open('fashion/dress.png')
img = transform(img)
model.predictions(img)  # Returns per-class confidences (see Sample Output below)
```
## Sample Output
```
{'Dress': 84.437744,
'Coat': 7.631796,
'Pullover': 4.2272186,
'Shirt': 1.297625,
'T-shirt/Top': 1.2237197,
'Bag': 0.9053432,
'Trouser/Jeans': 0.27268794,
'Sneaker': 0.0031491981,
'Ankle-Boot': 0.00063403655,
'Sandal': 8.5103806e-05}
```
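The `predictions` method itself is part of the custom model class and is not documented in this card; a percentage breakdown like the one above can be obtained from raw logits with a softmax. A minimal sketch, with the class names and their ordering assumed:

```python
import torch

# Assumed label mapping, in standard Fashion-MNIST class order
class_names = ['T-shirt/Top', 'Trouser/Jeans', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle-Boot']

def predict_percentages(model, img):
    """Return {class name: confidence in %}, sorted from most to least likely."""
    with torch.no_grad():
        logits = model(img)                        # assumed shape (1, 10), raw logits
        probs = torch.softmax(logits, dim=1)[0] * 100
    scores = {name: p.item() for name, p in zip(class_names, probs)}
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```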