# 3D Person Segmentation and Anaglyph Generation title: Object Segmentation emoji: 👁 colorFrom: gray colorTo: pink sdk: gradio sdk_version: 5.22.0 app_file: src/app.py pinned: false ## Lab Report ### Introduction This project implements a sophisticated 3D image processing system that combines person segmentation with stereoscopic and anaglyph image generation. The main objectives were to: 1. Accurately segment people from images using advanced AI models 2. Generate stereoscopic 3D effects from 2D images 3. Create red-cyan anaglyph images for 3D viewing 4. Provide an interactive web interface for real-time processing ### Methodology #### Tools and Technologies Used - **SegFormer (nvidia/segformer-b0)**: State-of-the-art transformer-based model for semantic segmentation - **PyTorch**: Deep learning framework for running the SegFormer model - **OpenCV**: Image processing operations and mask refinement - **Gradio**: Web interface development - **NumPy**: Efficient array operations for image manipulation - **PIL (Python Imaging Library)**: Image loading and basic transformations #### Implementation Steps 1. **Person Segmentation** - Utilized SegFormer model fine-tuned on ADE20K dataset - Applied post-processing with erosion and Gaussian blur for mask refinement - Implemented mask scaling and centering for various input sizes 2. **Stereoscopic Processing** - Created depth simulation through horizontal pixel shifting - Implemented parallel view stereo pair generation - Added configurable interaxial distance for 3D effect adjustment 3. **Anaglyph Generation** - Combined left and right eye views into red-cyan anaglyph - Implemented color channel separation and recombination - Added background image support with proper masking 4. **User Interface** - Developed interactive web interface using Gradio - Added real-time parameter adjustment capabilities - Implemented support for custom background images ### Results The system produces three main outputs: 1. Segmentation mask showing the isolated person 2. Side-by-side stereo pair for parallel viewing 3. Red-cyan anaglyph image for 3D glasses viewing Key Features: - Adjustable person size (10-200%) - Configurable interaxial distance (0-10 pixels) - Optional custom background support - Real-time processing and preview ### Discussion #### Technical Challenges 1. **Mask Alignment**: Ensuring proper alignment between segmentation masks and background images required careful consideration of image dimensions and aspect ratios. 2. **Stereo Effect Quality**: Balancing the interaxial distance for comfortable viewing while maintaining the 3D effect. 3. **Performance Optimization**: Efficient processing of large images while maintaining real-time interaction. #### Learning Outcomes - Deep understanding of stereoscopic image generation - Experience with state-of-the-art segmentation models - Practical knowledge of image processing techniques - Web interface development for ML applications ### Conclusion This project successfully demonstrates the integration of modern AI-powered segmentation with classical stereoscopic image processing techniques. The system provides an accessible way to create 3D effects from regular 2D images. #### Future Work - Implementation of depth-aware 3D effect generation - Support for video processing - Additional 3D viewing formats (side-by-side, over-under) - Enhanced background replacement options - Mobile device optimization ## Setup ```bash pip install -r requirements.txt ``` ## Usage ```bash cd src python app.py ``` ## Parameters - **Person Image**: Upload an image containing a person - **Background Image**: (Optional) Custom background image - **Interaxial Distance**: Adjust the 3D effect strength (0-10) - **Person Size**: Adjust the size of the person in the output (10-200%) ## Output Types 1. **Segmentation Mask**: Shows the isolated person 2. **Stereo Pair**: Side-by-side stereo image for parallel viewing 3. **Anaglyph**: Red-cyan 3D image viewable with anaglyph glasses