Alexander Hortua committed: Delete README.MD

README.MD (deleted)
@@ -1,116 +0,0 @@

# 3D Person Segmentation and Anaglyph Generation

title: Object Segmentation
emoji: 👁
colorFrom: gray
colorTo: pink
sdk: gradio
sdk_version: 5.22.0
app_file: src/app.py
pinned: false

## Lab Report

### Introduction
This project implements a 3D image processing pipeline that combines person segmentation with stereoscopic and anaglyph image generation. The main objectives were to:
1. Segment people from images using a SegFormer semantic segmentation model
2. Generate stereoscopic 3D effects from 2D images
3. Create red-cyan anaglyph images for 3D viewing
4. Provide an interactive web interface for real-time processing

### Methodology

#### Tools and Technologies Used
- **SegFormer (nvidia/segformer-b0)**: Transformer-based model for semantic segmentation (B0 is the smallest SegFormer variant)
- **PyTorch**: Deep learning framework for running the SegFormer model
- **OpenCV**: Image processing operations and mask refinement
- **Gradio**: Web interface development
- **NumPy**: Efficient array operations for image manipulation
- **PIL (Python Imaging Library)**: Image loading and basic transformations
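
To make the roles of the segmentation model and the array tooling listed above more concrete, here is a minimal sketch of how a person mask might be derived from per-class logits and then refined with erosion and a Gaussian blur. It is not the project's code: it uses NumPy only, as a stand-in for the OpenCV operations, and the function names, the `(num_classes, H, W)` logits layout, and the person class index `12` (its position in the commonly used 150-class ADE20K label list) are all assumptions.

```python
import numpy as np

# Assumption: index of "person" in the 150-class ADE20K label list.
PERSON_CLASS_ID = 12


def person_mask_from_logits(logits: np.ndarray, person_id: int = PERSON_CLASS_ID) -> np.ndarray:
    """Turn (num_classes, H, W) logits into a float mask with values 0.0 or 1.0."""
    labels = logits.argmax(axis=0)  # most likely class per pixel
    return (labels == person_id).astype(np.float32)


def erode(mask: np.ndarray, iterations: int = 1) -> np.ndarray:
    """3x3 erosion: a pixel survives only if its whole 3x3 neighbourhood is set.

    Pixels outside the image are treated as background (zero padding).
    """
    out = mask.copy()
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant", constant_values=0)
        windows = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
        out = np.minimum.reduce(windows)
    return out


def gaussian_blur(mask: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Separable Gaussian blur that feathers the mask edge."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(mask, radius, mode="edge")
    # Blur along rows first, then along columns; "valid" undoes the padding.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)
```

Eroding before blurring shrinks the mask slightly so that the feathered edge falls inside the person rather than pulling in background pixels.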

#### Implementation Steps

1. **Person Segmentation**
- Used a SegFormer model fine-tuned on the ADE20K dataset
- Applied post-processing with erosion and Gaussian blur for mask refinement
- Implemented mask scaling and centering for various input sizes

2. **Stereoscopic Processing**
- Simulated depth through horizontal pixel shifting
- Implemented parallel-view stereo pair generation
- Added configurable interaxial distance for 3D effect adjustment

3. **Anaglyph Generation**
- Combined left- and right-eye views into a red-cyan anaglyph
- Implemented color channel separation and recombination
- Added background image support with proper masking

4. **User Interface**
- Developed interactive web interface using Gradio
- Added real-time parameter adjustment capabilities
- Implemented support for custom background images
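
The stereoscopic and anaglyph steps above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the project's actual code: images are `(H, W, 3)` arrays in RGB order, the interaxial distance is split between the two views, the left view is shifted right and the right view left, and the red channel is taken from the left view. All function names are hypothetical.

```python
import numpy as np


def shift_horizontal(img: np.ndarray, dx: int) -> np.ndarray:
    """Shift an (H, W, C) image by dx pixels (positive = right), zero-filling the gap."""
    out = np.zeros_like(img)
    if dx == 0:
        out[...] = img
    elif dx > 0:
        out[:, dx:] = img[:, :-dx]
    else:
        out[:, :dx] = img[:, -dx:]
    return out


def stereo_views(layer: np.ndarray, interaxial: int):
    """Return (left, right) views whose total horizontal offset equals interaxial."""
    half = interaxial // 2
    left = shift_horizontal(layer, half)
    right = shift_horizontal(layer, -(interaxial - half))
    return left, right


def side_by_side(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Place the two views next to each other for parallel viewing."""
    return np.hstack([left, right])


def make_anaglyph(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Red-cyan anaglyph: red from the left view, green and blue from the right."""
    anaglyph = right.copy()
    anaglyph[..., 0] = left[..., 0]
    return anaglyph
```

With an interaxial distance of 0 both views are identical, so the anaglyph reduces to the original image and no depth is perceived.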

### Results

The system produces three main outputs:
1. Segmentation mask showing the isolated person
2. Side-by-side stereo pair for parallel viewing
3. Red-cyan anaglyph image for 3D glasses viewing

Key Features:
- Adjustable person size (10-200%)
- Configurable interaxial distance (0-10 pixels)
- Optional custom background support
- Real-time processing and preview
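
As a rough illustration of the adjustable person size, the sketch below scales a person layer (or its mask) by a percentage and centers it on a fixed canvas, cropping when the scaled layer is larger than the canvas. The function name, the nearest-neighbour resampling, and the zero-filled canvas are assumptions made for this example; the project may well use a library resize instead.

```python
import numpy as np


def scale_and_center(layer: np.ndarray, scale_percent: float,
                     canvas_h: int, canvas_w: int) -> np.ndarray:
    """Resize layer by scale_percent (10-200) and center it on a zero canvas."""
    if not 10 <= scale_percent <= 200:
        raise ValueError("scale_percent must be between 10 and 200")
    h, w = layer.shape[:2]
    new_h = max(1, round(h * scale_percent / 100))
    new_w = max(1, round(w * scale_percent / 100))

    # Nearest-neighbour resize via integer index lookup.
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    scaled = layer[rows[:, None], cols]

    canvas = np.zeros((canvas_h, canvas_w) + layer.shape[2:], dtype=layer.dtype)
    copy_h, copy_w = min(new_h, canvas_h), min(new_w, canvas_w)
    # Offsets into the scaled layer (when it overflows) and the canvas (when it fits).
    src_top, src_left = max(0, (new_h - canvas_h) // 2), max(0, (new_w - canvas_w) // 2)
    dst_top, dst_left = max(0, (canvas_h - new_h) // 2), max(0, (canvas_w - new_w) // 2)
    canvas[dst_top:dst_top + copy_h, dst_left:dst_left + copy_w] = \
        scaled[src_top:src_top + copy_h, src_left:src_left + copy_w]
    return canvas
```

Applying the same call to the image and to its mask keeps the two aligned after scaling.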

### Discussion

#### Technical Challenges
1. **Mask Alignment**: Ensuring proper alignment between segmentation masks and background images required careful consideration of image dimensions and aspect ratios.
2. **Stereo Effect Quality**: Balancing the interaxial distance for comfortable viewing while maintaining the 3D effect.
3. **Performance Optimization**: Efficient processing of large images while maintaining real-time interaction.
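
The mask alignment challenge can be illustrated with a small compositing sketch. One plausible approach, assumed here and not confirmed by the source, is to center-crop the background to the person image's aspect ratio, resize it to the same dimensions, and only then blend the two with the soft mask. The function names and the nearest-neighbour resize are hypothetical choices for this example.

```python
import numpy as np


def fit_background(bg: np.ndarray, target_h: int, target_w: int) -> np.ndarray:
    """Center-crop bg to the target aspect ratio, then resize to (target_h, target_w)."""
    h, w = bg.shape[:2]
    if w * target_h > h * target_w:  # background is too wide: trim columns
        crop_h, crop_w = h, min(w, max(1, round(h * target_w / target_h)))
    else:                            # background is too tall: trim rows
        crop_h, crop_w = min(h, max(1, round(w * target_h / target_w))), w
    top, left = (h - crop_h) // 2, (w - crop_w) // 2
    cropped = bg[top:top + crop_h, left:left + crop_w]

    # Nearest-neighbour resize via integer index lookup.
    rows = np.arange(target_h) * crop_h // target_h
    cols = np.arange(target_w) * crop_w // target_w
    return cropped[rows[:, None], cols]


def composite(fg: np.ndarray, bg: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Alpha-blend fg over bg; mask is (H, W) with values in [0, 1]."""
    if fg.shape != bg.shape or fg.shape[:2] != mask.shape:
        raise ValueError("foreground, background and mask must share height and width")
    alpha = mask.astype(np.float32)[..., None]
    blended = alpha * fg + (1.0 - alpha) * bg
    return np.clip(blended.round(), 0, 255).astype(np.uint8)
```

Raising on a shape mismatch, rather than silently broadcasting, surfaces alignment bugs at the point where they occur.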

#### Learning Outcomes
- Deep understanding of stereoscopic image generation
- Experience with state-of-the-art segmentation models
- Practical knowledge of image processing techniques
- Web interface development for ML applications

### Conclusion

This project demonstrates the integration of AI-based segmentation with classical stereoscopic image processing techniques. The system provides an accessible way to create 3D effects from regular 2D images.

#### Future Work
- Implementation of depth-aware 3D effect generation
- Support for video processing
- Additional 3D viewing formats (side-by-side, over-under)
- Enhanced background replacement options
- Mobile device optimization

## Setup

```bash
pip install -r requirements.txt
```

## Usage

```bash
cd src
python app.py
```

## Parameters

- **Person Image**: Upload an image containing a person
- **Background Image**: (Optional) Custom background image
- **Interaxial Distance**: Adjust the 3D effect strength (0-10 pixels)
- **Person Size**: Adjust the size of the person in the output (10-200%)
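
For orientation, the parameters above could be wired into a Gradio interface roughly as follows. This is a hedged sketch, not the contents of `src/app.py`: `process` is a hypothetical stand-in that skips segmentation and ignores the background and size inputs, and the slider defaults and step sizes are assumptions. Only the labels and ranges come from the list above.

```python
import numpy as np


def process(person_img, background_img, interaxial, person_size):
    """Hypothetical stand-in for the real pipeline.

    No segmentation is run; background_img and person_size are ignored here.
    Returns (mask, stereo_pair, anaglyph) as NumPy arrays.
    """
    img = np.asarray(person_img)
    d = int(interaxial)
    left = np.roll(img, d, axis=1)    # wrap-around shift, for illustration only
    right = np.roll(img, -d, axis=1)
    mask = np.full(img.shape[:2], 255, dtype=np.uint8)
    stereo_pair = np.hstack([left, right])
    anaglyph = right.copy()
    anaglyph[..., 0] = left[..., 0]   # red from left view, assuming RGB order
    return mask, stereo_pair, anaglyph


def build_interface():
    """Build the Gradio UI; Gradio is imported lazily so process() works without it."""
    import gradio as gr

    return gr.Interface(
        fn=process,
        inputs=[
            gr.Image(type="numpy", label="Person Image"),
            gr.Image(type="numpy", label="Background Image"),
            # Default values and steps below are assumptions, not taken from the project.
            gr.Slider(minimum=0, maximum=10, step=1, value=4, label="Interaxial Distance"),
            gr.Slider(minimum=10, maximum=200, step=10, value=100, label="Person Size"),
        ],
        outputs=[
            gr.Image(label="Segmentation Mask"),
            gr.Image(label="Stereo Pair"),
            gr.Image(label="Anaglyph"),
        ],
    )
```

Calling `build_interface().launch()` would start the local web server; the three return values of `process` map, in order, onto the three output components.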

## Output Types

1. **Segmentation Mask**: Shows the isolated person
2. **Stereo Pair**: Side-by-side stereo image for parallel viewing
3. **Anaglyph**: Red-cyan 3D image viewable with anaglyph glasses