Alex Hortua committed on
Commit 2f4c8a9 · 1 Parent(s): 6eb4ed6

Adding Report and additional information

public/Project 4 Report.pdf ADDED
Binary file (91.5 kB).
 
public/lab_report.txt ADDED
@@ -0,0 +1,131 @@
+ 3D Person Segmentation and Anaglyph Generation - Lab Report
+ ===========================================================
+
+ Introduction
+ ------------
+ In this project, I developed a 3D image processing system that combines AI-based person segmentation with classical stereoscopic image processing. The main objectives, all of which were met, were:
+
+ 1. Accurate person segmentation using the SegFormer model
+ 2. Creation of stereoscopic 3D effects from 2D images
+ 3. Generation of red-cyan anaglyph images for 3D viewing
+ 4. Development of an interactive web interface
+ 5. Mask alignment that adapts to varying image sizes
+
+ The project is accessible at: https://huggingface.co/spaces/axelhortua/Object-segmentation
+
+ Methodology
+ -----------
+ The implementation used the following tools and process:
+
+ 1. Tools Selection:
+ - SegFormer (nvidia/segformer-b0) for semantic segmentation
+ - PyTorch as the deep learning framework
+ - OpenCV for image processing
+ - Gradio for the web interface
+ - NumPy for array operations
+ - PIL for image handling
+
+ 2. Implementation Process:
+
+ a) Person Segmentation:
+ - Used a SegFormer model fine-tuned on the ADE20K dataset
+ - Applied post-processing with erosion and Gaussian blur
+ - Implemented dynamic mask scaling and centering
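The erosion and blur post-processing can be sketched as follows. This is a minimal illustration in pure NumPy, not the project's code (the report lists OpenCV for image processing). The function names, the 3x3 erosion window, the sigma value, and the use of class index 12 for "person" in the ADE20K label set are all assumptions made for the example.

```python
import numpy as np

def erode(mask: np.ndarray, iterations: int = 1) -> np.ndarray:
    """3x3 binary erosion: a pixel stays set only if its whole 3x3 neighborhood is set.

    Pixels outside the image are treated as unset, so the mask also shrinks at the border.
    """
    out = mask.astype(bool)
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        result = np.ones_like(out)
        for dy in range(3):
            for dx in range(3):
                result &= padded[dy:dy + h, dx:dx + w]
        out = result
    return out

def gaussian_blur(img: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Separable Gaussian blur with edge replication; returns a float64 array."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img.astype(np.float64), radius, mode="edge")
    # Horizontal pass over every row, then vertical pass over every column.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def soften_mask(label_map: np.ndarray, person_id: int = 12,
                erode_iters: int = 1, sigma: float = 1.0) -> np.ndarray:
    """Turn a per-pixel class map into a soft person mask with values in [0, 1]."""
    binary = label_map == person_id          # hard mask from the class map
    shrunk = erode(binary, erode_iters)      # trim noisy boundary pixels
    return gaussian_blur(shrunk, sigma)      # feather the edge
```

Eroding before blurring pulls the mask edge slightly inside the person, so the feathered border is less likely to pick up background pixels.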
+
+ b) Mask Processing:
+ - Developed a dynamic mask resizing system
+ - Implemented transparent padding
+ - Preserved the aspect ratio during resizing
+ - Created a centered alignment algorithm
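One way to combine aspect-preserving resizing, transparent padding, and centering is sketched below. It is an illustration under stated assumptions rather than the project's implementation: `resize_nearest` and `fit_and_center` are hypothetical names, nearest-neighbor sampling stands in for whatever interpolation the project uses, and `scale` mimics the person size control.

```python
import numpy as np

def resize_nearest(img: np.ndarray, new_h: int, new_w: int) -> np.ndarray:
    """Nearest-neighbor resize for an (H, W, C) array."""
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows[:, None], cols]

def fit_and_center(rgba: np.ndarray, canvas_h: int, canvas_w: int,
                   scale: float = 1.0) -> np.ndarray:
    """Resize an RGBA cutout to fit the canvas (keeping its aspect ratio),
    apply an extra scale factor, and center it on a fully transparent canvas.
    Parts that fall outside the canvas (scale > 1) are cropped."""
    h, w = rgba.shape[:2]
    fit = min(canvas_h / h, canvas_w / w) * scale   # one factor for both axes
    new_h = max(1, round(h * fit))
    new_w = max(1, round(w * fit))
    resized = resize_nearest(rgba, new_h, new_w)

    canvas = np.zeros((canvas_h, canvas_w, 4), dtype=rgba.dtype)  # alpha = 0
    top = (canvas_h - new_h) // 2
    left = (canvas_w - new_w) // 2
    # Overlap between the placed image and the canvas, in canvas coordinates.
    y0, x0 = max(top, 0), max(left, 0)
    y1, x1 = min(top + new_h, canvas_h), min(left + new_w, canvas_w)
    canvas[y0:y1, x0:x1] = resized[y0 - top:y1 - top, x0 - left:x1 - left]
    return canvas
```

Using a single scale factor for both axes is what keeps the aspect ratio intact; the leftover canvas area stays at alpha 0, which is the transparent padding.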
+
+ c) Stereoscopic Processing:
+ - Implemented horizontal pixel shifting for depth simulation
+ - Generated parallel-view stereo pairs
+ - Added a configurable interaxial distance
+ - Improved stereo pair alignment
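The horizontal shifting step can be sketched as below. This is a minimal example, assuming the interaxial distance is a whole number of pixels that is split between the two views, and that the left view is shifted right and the right view left; the report does not state the sign convention, and the function names are hypothetical.

```python
import numpy as np

def shift_horizontal(rgba: np.ndarray, dx: int) -> np.ndarray:
    """Shift an (H, W, 4) image dx pixels to the right (negative dx: left).
    Vacated columns are left fully transparent."""
    out = np.zeros_like(rgba)
    w = rgba.shape[1]
    if dx == 0:
        out[:] = rgba
    elif 0 < dx < w:
        out[:, dx:] = rgba[:, :w - dx]
    elif -w < dx < 0:
        out[:, :w + dx] = rgba[:, -dx:]
    return out

def make_stereo_pair(person_rgba: np.ndarray, interaxial_px: int):
    """Return (left, right, side_by_side) for a parallel-view stereo pair.
    The total disparity between the views equals interaxial_px."""
    right_shift = interaxial_px // 2
    left_shift = interaxial_px - right_shift
    left = shift_horizontal(person_rgba, left_shift)
    right = shift_horizontal(person_rgba, -right_shift)
    side_by_side = np.concatenate([left, right], axis=1)
    return left, right, side_by_side
```

Splitting the shift between both views keeps the subject centered in the pair; shifting only one view by the full distance would give the same disparity with an off-center subject.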
+
+ d) Anaglyph Generation:
+ - Implemented color channel separation
+ - Created a background integration system
+ - Developed foreground-background blending
+ - Tuned the quality of the 3D effect
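For reference, the standard red-cyan channel separation takes the red channel from the left view and the green and blue channels from the right view. The sketch below assumes RGB channel order and equally sized views; `make_anaglyph` is a hypothetical name, not a function from the project.

```python
import numpy as np

def make_anaglyph(left_rgb: np.ndarray, right_rgb: np.ndarray) -> np.ndarray:
    """Combine two (H, W, 3) RGB views into one red-cyan anaglyph."""
    if left_rgb.shape != right_rgb.shape:
        raise ValueError("left and right views must have the same shape")
    anaglyph = right_rgb.copy()          # green and blue come from the right view
    anaglyph[..., 0] = left_rgb[..., 0]  # red comes from the left view
    return anaglyph
```

Through red-cyan glasses the red filter passes only the left view and the cyan filter only the right view, which is what produces the depth impression.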
+
+ Results
+ -------
+ The system successfully produces three main outputs:
+
+ 1. Segmentation Mask:
+ - Clean person isolation
+ - Proper transparency handling
+ - Accurate edge detection
+ - Smooth mask transitions
+
+ 2. Stereo Pair:
+ - Side-by-side stereo image
+ - Configurable depth effect
+ - Proper alignment between pairs
+ - Maintained image quality
+
+ 3. Anaglyph Output:
+ - Red-cyan 3D image
+ - Adjustable 3D effect strength
+ - Clean color separation
+ - Minimal ghosting artifacts
+
+ Key Features Achieved:
+ - Person size adjustment (10-200%)
+ - Interaxial distance control (0-10 pixels)
+ - Custom background support
+ - Real-time processing and preview
+ - Intelligent mask alignment
+ - Transparent background handling
+
+ Discussion
+ ----------
+ Technical Challenges Faced:
+
+ 1. Mask Alignment:
+ - Complex handling of different image dimensions
+ - Maintaining proper aspect ratios
+ - Ensuring consistent centering
+ - Handling edge cases
+
+ 2. Stereo Effect Quality:
+ - Balancing interaxial distance
+ - Minimizing visual artifacts
+ - Maintaining comfortable viewing experience
+ - Preserving image details
+
+ 3. Performance Optimization:
+ - Efficient large image processing
+ - Real-time interface responsiveness
+ - Memory management
+ - Processing speed optimization
+
+ 4. Transparency Handling:
+ - Proper alpha channel management
+ - Clean edge preservation
+ - Consistent transparency across operations
+ - Background integration
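The alpha channel management described here usually comes down to the "over" compositing operator, out = alpha * foreground + (1 - alpha) * background. The sketch below assumes 8-bit, non-premultiplied RGBA input and an RGB background of the same size; `composite_over` is a hypothetical name used only for illustration.

```python
import numpy as np

def composite_over(fg_rgba: np.ndarray, bg_rgb: np.ndarray) -> np.ndarray:
    """Blend an (H, W, 4) uint8 foreground over an (H, W, 3) uint8 background."""
    alpha = fg_rgba[..., 3:4].astype(np.float32) / 255.0  # (H, W, 1), broadcasts over RGB
    fg = fg_rgba[..., :3].astype(np.float32)
    bg = bg_rgb.astype(np.float32)
    out = alpha * fg + (1.0 - alpha) * bg
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

Doing the blend in floating point and converting back to uint8 only at the end avoids the wrap-around and rounding drift that can appear when several transparency operations are chained in integer arithmetic.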
+
+ Learning Outcomes:
+ - Deep understanding of stereoscopic image generation
+ - Practical experience with AI models
+ - Advanced image processing techniques
+ - Web interface development skills
+ - Complex system integration experience
+
+ Conclusion
+ ----------
+ The project successfully demonstrates the integration of AI-powered segmentation with classical stereoscopic techniques. The system provides an accessible way to create 3D effects from regular 2D images, with robust handling of different image sizes and proper transparency management.
+
+ Future Work:
+ 1. Implementation of depth-aware 3D effect generation
+ 2. Addition of video processing capabilities
+ 3. Support for additional 3D viewing formats
+ 4. Enhanced background replacement options
+ 5. Mobile device optimization
+ 6. Advanced depth map generation
+ 7. Multi-person segmentation support
+
+ The project has laid a strong foundation for future developments in 3D image processing and demonstrates the potential of combining AI with traditional image processing techniques.