Alex Hortua committed on
Commit e98ae33 · 1 Parent(s): ef7ef27
Files changed (1)
  1. README.md +108 -4
README.md CHANGED
@@ -1,12 +1,116 @@
- ---
  title: Object Segmentation
  emoji: πŸ‘
  colorFrom: gray
  colorTo: pink
  sdk: gradio
  sdk_version: 5.22.0
- app_file: app.py
  pinned: false
- ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 3D Person Segmentation and Anaglyph Generation
+
  title: Object Segmentation
  emoji: πŸ‘
  colorFrom: gray
  colorTo: pink
  sdk: gradio
  sdk_version: 5.22.0
+ app_file: src/app.py
  pinned: false
 
+
+ ## Lab Report
+
+ ### Introduction
+ This project implements an image processing pipeline that segments a person from a 2D photo and renders the result as a stereoscopic pair and a red-cyan anaglyph. The main objectives were to:
+ 1. Accurately segment people from images using a pretrained segmentation model
+ 2. Generate stereoscopic 3D effects from 2D images
+ 3. Create red-cyan anaglyph images for 3D viewing
+ 4. Provide an interactive web interface for real-time processing
+
+ ### Methodology
+
+ #### Tools and Technologies Used
+ - **SegFormer (nvidia/segformer-b0)**: Transformer-based model for semantic segmentation
+ - **PyTorch**: Deep learning framework for running the SegFormer model
+ - **OpenCV**: Image processing operations and mask refinement
+ - **Gradio**: Web interface development
+ - **NumPy**: Efficient array operations for image manipulation
+ - **PIL (Python Imaging Library)**: Image loading and basic transformations
+
+ #### Implementation Steps
+
+ 1. **Person Segmentation**
+ - Used a SegFormer model fine-tuned on the ADE20K dataset
+ - Applied post-processing with erosion and Gaussian blur to refine the mask
+ - Implemented mask scaling and centering for various input sizes
+
+ 2. **Stereoscopic Processing**
+ - Simulated depth by shifting pixels horizontally
+ - Implemented parallel-view stereo pair generation
+ - Added a configurable interaxial distance to adjust the 3D effect
+
+ 3. **Anaglyph Generation**
+ - Combined the left-eye and right-eye views into a red-cyan anaglyph
+ - Implemented color channel separation and recombination
+ - Added background image support with masking
+
+ 4. **User Interface**
+ - Developed an interactive web interface using Gradio
+ - Added real-time parameter adjustment
+ - Implemented support for custom background images
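
The segmentation and mask-refinement step listed above can be illustrated with a small sketch. It is not the project's code: it uses NumPy only as a stand-in for the OpenCV erosion and Gaussian blur named in the tool list, the function names are hypothetical, and the class index 12 for "person" is an assumption about the ADE20K label map that should be checked against the model's own `id2label` mapping.

```python
import numpy as np

# Assumption: "person" is index 12 in the ADE20K-150 label map; verify with the model config.
PERSON_CLASS_ID = 12


def person_mask_from_logits(logits: np.ndarray, class_id: int = PERSON_CLASS_ID) -> np.ndarray:
    """Turn per-class scores of shape (C, H, W) into a binary float mask of shape (H, W)."""
    return (logits.argmax(axis=0) == class_id).astype(np.float32)


def erode(mask: np.ndarray, radius: int = 1) -> np.ndarray:
    """Binary erosion with a square window: a pixel survives only if its whole neighbourhood is set."""
    padded = np.pad(mask, radius, mode="constant", constant_values=0)
    h, w = mask.shape
    out = np.ones_like(mask)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out = np.minimum(out, padded[dy:dy + h, dx:dx + w])
    return out


def gaussian_blur(mask: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Separable Gaussian blur with edge replication; softens the mask boundary."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(mask, radius, mode="edge")
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def refine_mask(mask: np.ndarray, radius: int = 1, sigma: float = 1.0) -> np.ndarray:
    """Shrink the mask slightly, then feather its edge."""
    return gaussian_blur(erode(mask, radius), sigma)
```

Eroding before blurring trims the thin halo of background pixels that segmentation often leaves around a subject, and the blur turns the hard edge into a soft alpha ramp for compositing.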
+
+ ### Results
+
+ The system produces three main outputs:
+ 1. Segmentation mask showing the isolated person
+ 2. Side-by-side stereo pair for parallel viewing
+ 3. Red-cyan anaglyph image for viewing with 3D glasses
+
+ Key Features:
+ - Adjustable person size (10-200%)
+ - Configurable interaxial distance (0-10 pixels)
+ - Optional custom background support
+ - Real-time processing and preview
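
Custom background support comes down to alpha compositing with the segmentation mask. The sketch below shows one common way to do it, assuming the person image, the background, and the mask already share the same height and width. The function name `composite` is hypothetical and is not taken from the project.

```python
import numpy as np


def composite(person: np.ndarray, background: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Alpha-blend `person` over `background`.

    person, background: (H, W, 3) uint8 images of identical shape.
    mask: (H, W) floats in [0, 1]; 1 keeps the person pixel, 0 keeps the background pixel.
    """
    if person.shape != background.shape or person.shape[:2] != mask.shape:
        raise ValueError("person, background and mask must share height and width")
    alpha = mask[..., None].astype(np.float32)  # add a channel axis so it broadcasts over RGB
    blended = alpha * person + (1.0 - alpha) * background
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)
```

A soft (blurred) mask gives fractional alpha values along the silhouette, which avoids a hard cut-out edge.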
+
+ ### Discussion
+
+ #### Technical Challenges
+ 1. **Mask Alignment**: Aligning segmentation masks with background images required careful handling of image dimensions and aspect ratios.
+ 2. **Stereo Effect Quality**: The interaxial distance had to be balanced so that viewing stays comfortable while the 3D effect remains visible.
+ 3. **Performance Optimization**: Large images had to be processed efficiently enough to keep the interface interactive.
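
To make the interaxial trade-off concrete, here is a minimal sketch of stereo generation by horizontal shifting. It rests on assumptions: the person layer is moved in opposite directions in the two views so that the total disparity equals the interaxial value in pixels, and the sign convention (left view shifted right) is chosen arbitrarily. The project may split or sign the shift differently. All function names are hypothetical.

```python
import numpy as np


def shift_horizontal(image: np.ndarray, shift: int) -> np.ndarray:
    """Shift an (H, W, C) image by `shift` pixels (positive = right), zero-filling vacated columns."""
    w = image.shape[1]
    out = np.zeros_like(image)
    if shift == 0:
        return image.copy()
    if abs(shift) >= w:
        return out
    if shift > 0:
        out[:, shift:] = image[:, :w - shift]
    else:
        out[:, :w + shift] = image[:, -shift:]
    return out


def stereo_views(layer: np.ndarray, interaxial: int):
    """Return (left, right) views whose horizontal disparity totals `interaxial` pixels."""
    if interaxial < 0:
        raise ValueError("interaxial must be non-negative")
    left_shift = interaxial // 2
    right_shift = interaxial - left_shift
    return shift_horizontal(layer, left_shift), shift_horizontal(layer, -right_shift)


def side_by_side(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Place the left view on the left and the right view on the right for parallel viewing."""
    return np.hstack([left, right])
```

Larger disparities strengthen the depth impression but also enlarge the zero-filled bands at the image edges and make the pair harder to fuse, which is one reason to cap the slider at a few pixels.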
+
+ #### Learning Outcomes
+ - Deeper understanding of stereoscopic image generation
+ - Experience with transformer-based segmentation models
+ - Practical knowledge of image processing techniques
+ - Web interface development for ML applications
+
+ ### Conclusion
+
+ This project demonstrates how model-based person segmentation can be combined with classical stereoscopic image processing. The system provides an accessible way to create 3D effects from regular 2D images.
+
+ #### Future Work
+ - Implementation of depth-aware 3D effect generation
+ - Support for video processing
+ - Additional 3D viewing formats (side-by-side, over-under)
+ - Enhanced background replacement options
+ - Mobile device optimization
+
+ ## Setup
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ## Usage
+
+ ```bash
+ cd src
+ python app.py
+ ```
+
+ ## Parameters
+
+ - **Person Image**: Upload an image containing a person
+ - **Background Image**: (Optional) Custom background image
+ - **Interaxial Distance**: Adjust the 3D effect strength (0-10 pixels)
+ - **Person Size**: Adjust the size of the person in the output (10-200%)
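
The Person Size parameter implies rescaling the person layer and re-centering it on the output canvas. The sketch below shows one way this could work, using nearest-neighbour resampling in NumPy to stay dependency-free. The function names and the centre-crop behaviour for oversized results are assumptions, not the project's documented behaviour.

```python
import numpy as np


def resize_nearest(image: np.ndarray, scale: float) -> np.ndarray:
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array by a uniform scale factor."""
    h, w = image.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols]


def scale_and_center(image: np.ndarray, size_percent: float, canvas_hw: tuple) -> np.ndarray:
    """Scale `image` by `size_percent` (10-200) and paste it centred on a zero canvas.

    If the scaled image is larger than the canvas, its central region is kept.
    """
    if not 10 <= size_percent <= 200:
        raise ValueError("size_percent must be between 10 and 200")
    scaled = resize_nearest(image, size_percent / 100.0)
    ch, cw = canvas_hw
    sh, sw = scaled.shape[:2]
    canvas = np.zeros((ch, cw) + image.shape[2:], dtype=image.dtype)
    h, w = min(ch, sh), min(cw, sw)          # size of the overlapping region
    cy, cx = (ch - h) // 2, (cw - w) // 2    # top-left corner on the canvas
    sy, sx = (sh - h) // 2, (sw - w) // 2    # top-left corner inside the scaled image
    canvas[cy:cy + h, cx:cx + w] = scaled[sy:sy + h, sx:sx + w]
    return canvas
```

Applying the same function to the image and to its mask keeps the two aligned after resizing.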
+
+ ## Output Types
+
+ 1. **Segmentation Mask**: Shows the isolated person
+ 2. **Stereo Pair**: Side-by-side stereo image for parallel viewing
+ 3. **Anaglyph**: Red-cyan 3D image viewable with anaglyph glasses
+
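
The anaglyph output listed above can be sketched in a few lines. This follows the usual red-cyan convention (red filter over the left eye), so the red channel comes from the left view and green and blue come from the right view. It assumes RGB channel order. Arrays loaded with OpenCV are BGR and would need the channel index adjusted. The function name is hypothetical.

```python
import numpy as np


def make_anaglyph(left_rgb: np.ndarray, right_rgb: np.ndarray) -> np.ndarray:
    """Combine two (H, W, 3) RGB views into a red-cyan anaglyph."""
    if left_rgb.shape != right_rgb.shape or left_rgb.shape[-1] != 3:
        raise ValueError("both views must be RGB images of the same shape")
    anaglyph = right_rgb.copy()           # green and blue channels from the right view
    anaglyph[..., 0] = left_rgb[..., 0]   # red channel from the left view
    return anaglyph
```

Where the two views agree, the result matches the original colours. Where they differ, red and cyan fringes appear, and the glasses route each fringe to one eye.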