# Clothes Segmentation
![Sample Images and Segmentation Masks from Dataset](assets/dataset_examples.png)
This project provides a solution for segmenting clothes into 18 categories using DINO, ViT and UNet models.
* DINO: Pretrained backbone with a segmentation head
  * https://arxiv.org/abs/2104.14294
  * https://huggingface.co/facebook/dinov2-small
* ViT: Pretrained vision transformer with a segmentation head
  * https://arxiv.org/abs/2010.11929
  * https://huggingface.co/google/vit-base-patch16-224
* UNet: Custom implementation
  * https://arxiv.org/abs/1505.04597
Gradio is used to build the web interface, and Weights & Biases is used for experiment tracking.
## Installation
1. Clone the repository:
```bash
git clone https://github.com/your-project/clothes-segmentation.git
cd clothes-segmentation
```
2. Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```
## Usage
### Training the Model
To train a model, pass one of the following values to the `--model` argument: **dino**, **vit** or **unet**.
```bash
python src/train.py --model dino
python src/train.py --model vit
python src/train.py --model unet
```
You can also adjust other parameters, such as the number of epochs, batch size, and learning rate, with additional arguments. For example:
```bash
python src/train.py --model unet --num-epochs 20 --batch-size 16 --learning-rate 0.001
```
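For orientation, the flags used above could be declared with `argparse` roughly as follows. This is only a sketch: the flag names come from the commands in this README, but the function name `build_parser`, the default values, and the help strings are assumptions, and the actual `src/train.py` may define its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(description="Train a clothes segmentation model")
    # The three model names are the ones listed in this README.
    parser.add_argument("--model", choices=["dino", "vit", "unet"], required=True,
                        help="which architecture to train")
    # Default values below are placeholders, not the project's real defaults.
    parser.add_argument("--num-epochs", type=int, default=10,
                        help="number of training epochs")
    parser.add_argument("--batch-size", type=int, default=8,
                        help="samples per training batch")
    parser.add_argument("--learning-rate", type=float, default=1e-4,
                        help="optimizer learning rate")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--model", "unet", "--num-epochs", "20", "--batch-size", "16", "--learning-rate", "0.001"]
)
print(vars(args))
```

Note that `argparse` converts dashes in flag names to underscores, so `--num-epochs` is read back as `args.num_epochs`.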
### Launching the Gradio Interface
```bash
python app.py
```
Once the interface is running, you can select a model, upload an image and view the segmentation mask.
![Web Interface Screen](assets/spaces_screen.jpg)
#### Add link
## Results
| Model | Test Micro Recall | Test Micro Precision | Test Macro Precision | Test Macro Recall | Test Accuracy | Test Loss | Train Micro Recall | Train Micro Precision | Train Macro Precision | Train Macro Recall | Train Accuracy | Train Loss |
|------------|-------------------|----------------------|----------------------|-------------------|---------------|-----------|--------------------|-----------------------|-----------------------|--------------------|----------------|------------|
| DINO | 0.94986 | 0.94986 | 0.71364 | 0.67052 | 0.94986 | 0.53124 | 0.97019 | 0.97019 | 0.78185 | 0.72336 | 0.97019 | 0.30441 |
| ViT | 0.9358 | 0.9358 | 0.63939 | 0.58365 | 0.9358 | 0.71193 | 0.96734 | 0.96734 | 0.74418 | 0.66295 | 0.96734 | 0.31166 |
| UNet | 0.95798 | 0.95798 | 0.76354 | 0.7289 | 0.95798 | 0.56764 | 0.98035 | 0.98035 | 0.82934 | 0.82688 | 0.98035 | 0.25301 |
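
In the table, micro precision, micro recall, and accuracy are identical for every model. That is expected when each pixel receives exactly one label: every misclassified pixel counts as one false positive (for the predicted class) and one false negative (for the true class), so both micro scores reduce to the fraction of correctly labelled pixels. Macro scores average per-class values with equal weight, which is why they are noticeably lower when some classes are rare or hard. The sketch below illustrates this on a toy label sequence. It is not the project's evaluation code: the function name `micro_macro_scores` and the convention of scoring an undefined per-class ratio as 0 are assumptions.

```python
def micro_macro_scores(y_true, y_pred, num_classes):
    """Compute accuracy plus micro/macro precision and recall for single-label data."""
    tp = [0] * num_classes  # true positives per class
    fp = [0] * num_classes  # false positives per class
    fn = [0] * num_classes  # false negatives per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def ratio(num, den):
        # Assumed convention: an undefined ratio (0/0) is scored as 0.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(sum(tp), len(y_true)),
        # Micro: pool the counts over all classes, then take one ratio.
        "micro_precision": ratio(sum(tp), sum(tp) + sum(fp)),
        "micro_recall": ratio(sum(tp), sum(tp) + sum(fn)),
        # Macro: take one ratio per class, then average with equal weight.
        "macro_precision": sum(ratio(tp[c], tp[c] + fp[c]) for c in range(num_classes)) / num_classes,
        "macro_recall": sum(ratio(tp[c], tp[c] + fn[c]) for c in range(num_classes)) / num_classes,
    }


# Toy example: class 1 is rare and is always missed.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 0, 2]
scores = micro_macro_scores(y_true, y_pred, num_classes=3)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Here the single missed class barely affects accuracy and the micro scores, but it pulls both macro scores down sharply, the same pattern as in the results above.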
### Training Results of DINO
![DINO_test](assets/dino_test_plots.png)
![DINO_train](assets/dino_train_plots.png)
### Training Results of ViT
![ViT_test](assets/vit_test_plots.png)
![ViT_train](assets/vit_train_plots.png)
### Training Results of UNet
![UNet_test](assets/unet_test_plots.png)
![UNet_train](assets/unet_train_plots.png)