# Clothes Segmentation

![Sample Images and Segmentation Masks from Dataset](assets/dataset_examples.png)

This project segments clothing in images into 18 categories using DINO, ViT and UNet models.

* DINO: Pretrained backbone with a segmentation head
  * Paper: https://arxiv.org/abs/2104.14294
  * Checkpoint: https://huggingface.co/facebook/dinov2-small
* ViT: Pretrained vision transformer with a segmentation head
  * Paper: https://arxiv.org/abs/2010.11929
  * Checkpoint: https://huggingface.co/google/vit-base-patch16-224
* UNet: Custom implementation
  * Paper: https://arxiv.org/abs/1505.04597

Gradio provides the web interface, and Weights & Biases is used for experiment tracking.

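The DINO and ViT variants listed above pair a pretrained transformer backbone with a segmentation head. A transformer backbone emits one feature vector per image patch rather than per pixel, so the head has to predict a class per patch and then scale that coarse grid back up to the input resolution. The sketch below illustrates only that shape bookkeeping in plain Python. The function names are hypothetical, and the repository's actual head may work differently (for example, convolutional layers followed by bilinear interpolation). The values 224 and 16 are read off the `google/vit-base-patch16-224` checkpoint name.

```python
from __future__ import annotations

# Sketch only: how patch-level predictions from a ViT-style backbone map back
# to pixels. These helpers are hypothetical and not taken from this repository.


def patch_grid(image_size: int, patch_size: int) -> tuple[int, int]:
    """Return (patches per side, total patch tokens) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    side = image_size // patch_size
    return side, side * side


def upsample_nearest(class_grid: list[list[int]], factor: int) -> list[list[int]]:
    """Repeat every patch-level class id `factor` times along both axes."""
    upsampled = []
    for row in class_grid:
        wide_row = [class_id for class_id in row for _ in range(factor)]
        upsampled.extend(list(wide_row) for _ in range(factor))
    return upsampled


# google/vit-base-patch16-224: 224x224 input split into 16x16 patches.
side, tokens = patch_grid(224, 16)
print(side, tokens)  # 14 196

# A toy 2x2 grid of per-patch class ids, upsampled by 2 into a 4x4 mask.
mask = upsample_nearest([[0, 3], [7, 7]], factor=2)
for row in mask:
    print(row)
# [0, 0, 3, 3]
# [0, 0, 3, 3]
# [7, 7, 7, 7]
# [7, 7, 7, 7]
```
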
## Installation

1. Clone the repository:

```bash
git clone https://github.com/your-project/clothes-segmentation.git
cd clothes-segmentation
```

2. Create and activate a virtual environment:

```bash
python -m venv venv
source venv/bin/activate
```

3. Install dependencies:

```bash
pip install -r requirements.txt
```

## Usage

### Training the Model

To train a model, pass one of **dino**, **vit** or **unet** to the `--model` argument:

```bash
python src/train.py --model dino
python src/train.py --model vit
python src/train.py --model unet
```

You can also adjust other parameters, such as the number of epochs, batch size, and learning rate, by passing additional arguments. For example:

```bash
python src/train.py --model unet --num-epochs 20 --batch-size 16 --learning-rate 0.001
```
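The flags shown above could be parsed with `argparse` roughly as follows. This is a minimal sketch rather than the contents of `src/train.py`: the flag names and model choices come from the commands above, while the default values and the `build_parser` helper are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the training flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Train a clothes segmentation model.")
    parser.add_argument(
        "--model",
        choices=["dino", "vit", "unet"],
        required=True,
        help="Which architecture to train.",
    )
    # The defaults below are placeholders, not the repository's real defaults.
    parser.add_argument("--num-epochs", type=int, default=10)
    parser.add_argument("--batch-size", type=int, default=8)
    parser.add_argument("--learning-rate", type=float, default=1e-3)
    return parser


# Parse the same arguments as the example command above.
args = build_parser().parse_args(
    ["--model", "unet", "--num-epochs", "20", "--batch-size", "16", "--learning-rate", "0.001"]
)
print(args.model, args.num_epochs, args.batch_size, args.learning_rate)  # unet 20 16 0.001
```

`argparse` converts dashes in flag names to underscores, so `--num-epochs` is read back as `args.num_epochs`. Restricting `--model` with `choices` makes a typo fail immediately instead of partway through a training run.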
### Launching the Gradio Interface

```bash
python app.py
```

Once the interface is running, you can select a model, upload an image, and view the segmentation mask.

![Web Interface Screen](assets/spaces_screen.jpg)

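For orientation, an interface with a model selector, an image upload, and a mask output could be wired up in Gradio along these lines. This is a sketch, not the repository's `app.py`: `segment` is a hypothetical placeholder that returns a single-color mask instead of running a model, and the palette helper and the default model choice are assumptions. `gr.Interface`, `gr.Dropdown` and `gr.Image` are standard Gradio components.

```python
import colorsys

NUM_CLASSES = 18  # taken from the project description
MODEL_NAMES = ["dino", "vit", "unet"]


def build_palette(num_classes: int) -> list[tuple[int, int, int]]:
    """Return one RGB color per class id, evenly spaced around the hue wheel."""
    palette = []
    for class_id in range(num_classes):
        r, g, b = colorsys.hsv_to_rgb(class_id / num_classes, 1.0, 1.0)
        palette.append((round(r * 255), round(g * 255), round(b * 255)))
    return palette


def segment(model_name, image):
    """Hypothetical inference hook: turn an RGB image into a color-coded mask."""
    import numpy as np  # imported lazily so build_palette works without NumPy

    if image is None:  # Gradio passes None when nothing has been uploaded
        return None
    # Placeholder prediction: every pixel is class 0. A real app would load
    # the model named by `model_name` and run it on `image` here.
    class_ids = np.zeros(image.shape[:2], dtype=np.int64)
    palette = np.array(build_palette(NUM_CLASSES), dtype=np.uint8)
    return palette[class_ids]  # shape (height, width, 3)


def build_demo():
    """Assemble the Gradio interface without starting the server."""
    import gradio as gr  # imported lazily so the helpers above work without Gradio

    return gr.Interface(
        fn=segment,
        inputs=[
            gr.Dropdown(choices=MODEL_NAMES, value="unet", label="Model"),
            gr.Image(type="numpy", label="Input image"),
        ],
        outputs=gr.Image(label="Segmentation mask"),
        title="Clothes Segmentation",
    )


# To start the web server: build_demo().launch()
```
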
#### TODO: add link

## Results

| Model | Test Micro Recall | Test Micro Precision | Test Macro Precision | Test Macro Recall | Test Accuracy | Test Loss | Train Micro Recall | Train Micro Precision | Train Macro Precision | Train Macro Recall | Train Accuracy | Train Loss |
|------------|-------------------|----------------------|----------------------|-------------------|---------------|-----------|--------------------|-----------------------|-----------------------|--------------------|----------------|------------|
| DINO | 0.94986 | 0.94986 | 0.71364 | 0.67052 | 0.94986 | 0.53124 | 0.97019 | 0.97019 | 0.78185 | 0.72336 | 0.97019 | 0.30441 |
| ViT | 0.9358 | 0.9358 | 0.63939 | 0.58365 | 0.9358 | 0.71193 | 0.96734 | 0.96734 | 0.74418 | 0.66295 | 0.96734 | 0.31166 |
| UNet | 0.95798 | 0.95798 | 0.76354 | 0.7289 | 0.95798 | 0.56764 | 0.98035 | 0.98035 | 0.82934 | 0.82688 | 0.98035 | 0.25301 |
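
In the table, micro precision, micro recall, and accuracy are identical for every model, while the macro scores are noticeably lower. That pattern is expected when each pixel receives exactly one label: micro averaging pools all pixels, so frequent classes dominate, whereas macro averaging gives every class equal weight and exposes weak performance on rare classes. The toy example below illustrates this with a made-up 3-class confusion matrix. The numbers and the `micro_macro` helper are illustrative assumptions, not the project's evaluation code or data.

```python
def micro_macro(confusion):
    """Compute micro/macro precision and recall from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    """
    n = len(confusion)
    tp = [confusion[c][c] for c in range(n)]
    # False positives: predicted as class c (column sum) but not truly c.
    fp = [sum(confusion[r][c] for r in range(n)) - tp[c] for c in range(n)]
    # False negatives: truly class c (row sum) but predicted as something else.
    fn = [sum(confusion[c]) - tp[c] for c in range(n)]
    total = sum(sum(row) for row in confusion)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(sum(tp), total),
        "micro_precision": ratio(sum(tp), sum(tp) + sum(fp)),
        "micro_recall": ratio(sum(tp), sum(tp) + sum(fn)),
        "macro_precision": sum(ratio(t, t + f) for t, f in zip(tp, fp)) / n,
        "macro_recall": sum(ratio(t, t + f) for t, f in zip(tp, fn)) / n,
    }


# Made-up data: class 0 is a dominant "background", classes 1 and 2 are rare.
toy = [
    [90, 0, 0],
    [6, 2, 0],
    [0, 1, 1],
]
for name, value in micro_macro(toy).items():
    print(f"{name}: {value:.4f}")
# accuracy: 0.9300
# micro_precision: 0.9300
# micro_recall: 0.9300
# macro_precision: 0.8681
# macro_recall: 0.5833
```

Every misclassified pixel counts once as a false positive for the predicted class and once as a false negative for the true class, which is why the two micro scores coincide with accuracy.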

### Training Results of DINO

![DINO test plots](assets/dino_test_plots.png)

![DINO train plots](assets/dino_train_plots.png)

### Training Results of ViT

![ViT test plots](assets/vit_test_plots.png)

![ViT train plots](assets/vit_train_plots.png)

### Training Results of UNet

![UNet test plots](assets/unet_test_plots.png)

![UNet train plots](assets/unet_train_plots.png)