Alex Hortua committed on
Commit
80522dd
·
1 Parent(s): 572ff3e

Adding Report

Files changed (1)
  1. README.MD +93 -4
README.MD CHANGED
@@ -1,6 +1,82 @@
- # 3D Person Segmentation App
 
- This app segments a person from an image using SegFormer and creates a 3D red-cyan anaglyph image.
 
  ## Setup
 
@@ -8,10 +84,23 @@ This app segments a person from an image using SegFormer and creates a 3D red-cy
  pip install -r requirements.txt
  ```
 
-
  ## Usage
 
  ```bash
  cd src
  python app.py
- ```
+ # 3D Person Segmentation and Anaglyph Generation
 
+ ## Lab Report
+
+ ### Introduction
+ This project implements a 3D image processing system that combines person segmentation with stereoscopic and anaglyph image generation. The main objectives were to:
+ 1. Accurately segment people from images using a SegFormer model
+ 2. Generate stereoscopic 3D effects from 2D images
+ 3. Create red-cyan anaglyph images for 3D viewing
+ 4. Provide an interactive web interface for real-time processing
+
+ ### Methodology
+
+ #### Tools and Technologies Used
+ - **SegFormer (nvidia/segformer-b0)**: Transformer-based model for semantic segmentation
+ - **PyTorch**: Deep learning framework for running the SegFormer model
+ - **OpenCV**: Image processing operations and mask refinement
+ - **Gradio**: Web interface development
+ - **NumPy**: Efficient array operations for image manipulation
+ - **PIL (Python Imaging Library)**: Image loading and basic transformations
+
+ #### Implementation Steps
+
+ 1. **Person Segmentation**
+ - Used a SegFormer model fine-tuned on the ADE20K dataset
+ - Applied post-processing with erosion and Gaussian blur for mask refinement
+ - Implemented mask scaling and centering for various input sizes
+
+ 2. **Stereoscopic Processing**
+ - Simulated depth through horizontal pixel shifting
+ - Implemented parallel-view stereo pair generation
+ - Added a configurable interaxial distance for adjusting the 3D effect
+
+ 3. **Anaglyph Generation**
+ - Combined left-eye and right-eye views into a red-cyan anaglyph
+ - Implemented color channel separation and recombination
+ - Added background image support with proper masking
+
+ 4. **User Interface**
+ - Developed an interactive web interface using Gradio
+ - Added real-time parameter adjustment
+ - Implemented support for custom background images
+
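The mask refinement in step 1 (erosion followed by Gaussian blur) can be sketched as below. This is a minimal NumPy-only illustration, not the code from this commit: the report says OpenCV is used for this step, and the function names `erode`, `gaussian_blur`, and `refine_mask`, the 3x3 structuring element, and the default `sigma` are all assumptions made for the example.

```python
import numpy as np

def erode(mask, iterations=1):
    """Binary erosion with a 3x3 square element (pixels outside the image count as background)."""
    out = mask.astype(bool)
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        result = np.ones_like(out)
        # A pixel survives only if all 9 pixels in its 3x3 neighbourhood are set.
        for dy in range(3):
            for dx in range(3):
                result &= padded[dy:dy + h, dx:dx + w]
        out = result
    return out

def gaussian_blur(image, sigma=1.5):
    """Separable Gaussian blur with edge padding; returns float64."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image.astype(np.float64), radius, mode="edge")
    # Blur along rows, then along columns; "valid" trims the padding again.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def refine_mask(mask, erode_iterations=1, sigma=1.5):
    """Shrink the mask slightly, then feather its edge into a soft alpha in [0, 1]."""
    eroded = erode(mask, erode_iterations).astype(np.float64)
    return np.clip(gaussian_blur(eroded, sigma), 0.0, 1.0)

# Hypothetical input: a 5x5 "person" blob in a 9x9 mask.
mask = np.zeros((9, 9), dtype=np.uint8)
mask[2:7, 2:7] = 1
soft_mask = refine_mask(mask)
```

Eroding before blurring pulls the mask edge inward, so the feathered border tends to stay inside the person rather than picking up a halo of background pixels.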
+ ### Results
+
+ The system produces three main outputs:
+ 1. Segmentation mask showing the isolated person
+ 2. Side-by-side stereo pair for parallel viewing
+ 3. Red-cyan anaglyph image for viewing with 3D glasses
+
+ Key Features:
+ - Adjustable person size (10-200%)
+ - Configurable interaxial distance (0-10 pixels)
+ - Optional custom background support
+ - Real-time processing and preview
+
+ ### Discussion
+
+ #### Technical Challenges
+ 1. **Mask Alignment**: Aligning segmentation masks with background images required careful handling of image dimensions and aspect ratios.
+ 2. **Stereo Effect Quality**: The interaxial distance had to be balanced between comfortable viewing and a noticeable 3D effect.
+ 3. **Performance Optimization**: Large images had to be processed efficiently enough to keep the interface responsive.
+
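One way to handle the mask alignment challenge is to scale the person layer by the size percentage and paste it centered on a canvas with the background's dimensions, cropping whatever overflows. The sketch below is an assumption about how this could be done, not the project's implementation; `resize_nearest` and `place_centered` are hypothetical helpers, and nearest-neighbour resizing stands in for whatever interpolation the app actually uses.

```python
import numpy as np

def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize for 2D masks or HxWxC images."""
    h, w = image.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return image[rows[:, None], cols]

def place_centered(layer, canvas_h, canvas_w, scale_percent=100):
    """Scale `layer` and paste it centered on a zero canvas, cropping any overflow."""
    h, w = layer.shape[:2]
    new_h = max(1, round(h * scale_percent / 100))
    new_w = max(1, round(w * scale_percent / 100))
    scaled = resize_nearest(layer, new_h, new_w)
    canvas = np.zeros((canvas_h, canvas_w) + layer.shape[2:], dtype=layer.dtype)
    # Top-left corner of the scaled layer in canvas coordinates (negative if it overflows).
    top = (canvas_h - new_h) // 2
    left = (canvas_w - new_w) // 2
    # Region where the scaled layer and the canvas overlap.
    y0, y1 = max(top, 0), min(top + new_h, canvas_h)
    x0, x1 = max(left, 0), min(left + new_w, canvas_w)
    canvas[y0:y1, x0:x1] = scaled[y0 - top:y1 - top, x0 - left:x1 - left]
    return canvas

# Hypothetical input: a 4x4 mask placed at 50% size on an 8x8 background canvas.
mask = np.ones((4, 4), dtype=np.uint8)
placed = place_centered(mask, canvas_h=8, canvas_w=8, scale_percent=50)
```

Applying the same call to both the person image and its mask keeps the two aligned regardless of the background's aspect ratio.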
+ #### Learning Outcomes
+ - Deeper understanding of stereoscopic image generation
+ - Experience with transformer-based segmentation models
+ - Practical knowledge of image processing techniques
+ - Web interface development for ML applications
+
+ ### Conclusion
+
+ This project demonstrates the integration of AI-based segmentation with classical stereoscopic image processing techniques. The system provides an accessible way to create 3D effects from regular 2D images.
+
+ #### Future Work
+ - Implementation of depth-aware 3D effect generation
+ - Support for video processing
+ - Additional 3D viewing formats (side-by-side, over-under)
+ - Enhanced background replacement options
+ - Mobile device optimization
 
  ## Setup
 
@@ -8,10 +84,23 @@
  pip install -r requirements.txt
  ```
 
  ## Usage
 
  ```bash
  cd src
  python app.py
+ ```
+
+ ## Parameters
+
+ - **Person Image**: Upload an image containing a person
+ - **Background Image**: (Optional) Custom background image
+ - **Interaxial Distance**: Adjust the 3D effect strength (0-10 pixels)
+ - **Person Size**: Adjust the size of the person in the output (10-200%)
+
+ ## Output Types
+
+ 1. **Segmentation Mask**: Shows the isolated person
+ 2. **Stereo Pair**: Side-by-side stereo image for parallel viewing
+ 3. **Anaglyph**: Red-cyan 3D image viewable with anaglyph glasses
+
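The stereo pair and anaglyph outputs listed above can be sketched with plain NumPy as follows. This is an illustrative sketch under stated assumptions rather than the code in `app.py`: the helper names are hypothetical, images are assumed to be RGB `uint8` arrays, the mask is assumed to be a float alpha in [0, 1], and the shift direction (person moved right for the left eye, left for the right eye) is one possible convention that places the person in front of the background.

```python
import numpy as np

def shift_horizontal(image, shift):
    """Shift content by `shift` pixels (positive = right), zero-filling the exposed edge."""
    out = np.zeros_like(image)
    if shift > 0:
        out[:, shift:] = image[:, :-shift]
    elif shift < 0:
        out[:, :shift] = image[:, -shift:]
    else:
        out[...] = image
    return out

def composite(person, mask, background):
    """Alpha-blend the person over the background using a mask in [0, 1]."""
    alpha = mask[..., None].astype(np.float64)
    return (alpha * person + (1.0 - alpha) * background).astype(np.uint8)

def stereo_views(person, mask, background, interaxial=4):
    """Build left and right views by shifting only the person layer."""
    half = interaxial // 2
    rest = interaxial - half
    left = composite(shift_horizontal(person, half), shift_horizontal(mask, half), background)
    right = composite(shift_horizontal(person, -rest), shift_horizontal(mask, -rest), background)
    return left, right

def make_anaglyph(left, right):
    """Red channel from the left view, green and blue (cyan) from the right view."""
    anaglyph = right.copy()
    anaglyph[..., 0] = left[..., 0]  # assumes RGB channel order
    return anaglyph

def side_by_side(left, right):
    """Stereo pair for parallel viewing: left view on the left, right view on the right."""
    return np.hstack([left, right])

# Hypothetical inputs: a uniform "person" layer, a small mask, and a black background.
person = np.full((6, 8, 3), 200, dtype=np.uint8)
mask = np.zeros((6, 8), dtype=np.float64)
mask[2:4, 3:5] = 1.0
background = np.zeros((6, 8, 3), dtype=np.uint8)

left, right = stereo_views(person, mask, background, interaxial=2)
pair = side_by_side(left, right)
anaglyph = make_anaglyph(left, right)
```

Because only the person layer is shifted, the background stays at screen depth and the interaxial distance controls how far the person appears to separate from it.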