| **Component**                | **Description**                                                                                 |
|-------------------------------|-----------------------------------------------------------------------------------------------|
| **Backbone**                 | ResNet-50 with FPN (Feature Pyramid Network)                                                   |
| **Pretrained Weights**       | Trained on ImageNet for feature extraction.                                                    |
| **RPN (Region Proposal Network)** | Generates region proposals based on extracted features from the backbone.                           |
| **ROI Align**                | Pools each region proposal into a fixed-size feature map using bilinear interpolation, so proposals of any size yield features of the same shape. |
| **Box Head**                 | Fully connected layers for refining bounding boxes and classifying objects.                   |
| **Box Predictor**            | Replaced with a custom predictor: `FastRCNNPredictor` for handling custom classes.             |
| **Number of Classes**        | Configurable; the count must include the background class.                                    |
| **Loss Function**            | Multi-task loss: the sum of classification and bounding-box regression losses.                 |
| **Optimizer**                | Stochastic Gradient Descent (SGD) with momentum for optimization.                             |
| **Learning Rate Scheduler**  | StepLR, which multiplies the learning rate by a fixed factor (`gamma`) every `step_size` epochs. |
| **Batch Normalization**      | Applied within the backbone for stable training.                                               |
| **Data Format**              | Input: Tensor of shape `(Batch Size, Channels, Height, Width)` in PyTorch's NCHW format.       |
| **Output**                   | - Class probabilities for each region proposal.                                               |
|                               | - Refined bounding box coordinates for each detected object.                                  |