---
license: mit
library_name: transformers
tags:
- Aerial Image Segmentation
- Road Detection
- Semantic Segmentation
- U-Net-50
- Computer Vision
- Remote Sensing
- Urban Planning
- Geographic Information Systems (GIS)
- Deep Learning
datasets:
- balraj98/massachusetts-roads-dataset
---
# Model Card for spectrewolf8/aerial-image-road-segmentation-with-U-NET-xp
This model card describes a computer vision model that segments roads in aerial images using the U-Net-50 architecture. The model is intended to identify and segment road networks from aerial imagery, a task that supports applications such as mapping and autonomous driving.
## Model Details
### Model Description
- **Developed by:** [spectrewolf8](https://github.com/Spectrewolf8)
- **Model type:** Computer-Vision/Semantic-segmentation
- **License:** MIT
### Model Sources
- **Repository:** https://github.com/Spectrewolf8/aerial-image-road-segmentation-xp
## Uses
### Direct Use
This model can be used to segment road networks from aerial images without additional fine-tuning. It is applicable in scenarios where detailed and accurate road mapping is required.
### Downstream Use
When fine-tuned on additional datasets, this model can be adapted for other types of semantic segmentation tasks, potentially enhancing applications in various remote sensing domains.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
# Import necessary classes
import random

import matplotlib.pyplot as plt
import numpy as np
import tensorflow  # needed for tensorflow.reduce_sum in the metric functions
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

seed = 24
batch_size = 8

# Load images for the dataset generators from their respective directories. The images and masks are returned as NumPy arrays.
# Images can be resized by passing target_size=(height, width) to flow_from_directory.
# Our images are already cropped to 256x256, which matches the default target_size, so the parameter can be omitted.
def image_and_mask_generator(image_dir, label_dir):
    img_data_gen_args = dict(rescale=1 / 255.)
    mask_data_gen_args = dict()

    image_data_generator = ImageDataGenerator(**img_data_gen_args)
    image_generator = image_data_generator.flow_from_directory(
        image_dir,
        seed=seed,
        batch_size=batch_size,
        classes=["."],
        class_mode=None,  # Important: with None the generator yields only image batches, without class labels
    )

    mask_data_generator = ImageDataGenerator(**mask_data_gen_args)
    mask_generator = mask_data_generator.flow_from_directory(
        label_dir,
        classes=["."],
        seed=seed,  # Same seed as the images so that image and mask batches stay aligned
        batch_size=batch_size,
        color_mode='grayscale',  # Read masks in grayscale
        class_mode=None,
    )

    # Print the first few processed file paths as a sanity check
    print(image_generator.filenames[0:5])
    print(mask_generator.filenames[0:5])

    generator = zip(image_generator, mask_generator)
    return generator

# Method to calculate Intersection over Union Accuracy Coefficient
def iou_coef(y_true, y_pred, smooth=1e-6):
    intersection = tensorflow.reduce_sum(y_true * y_pred)
    union = tensorflow.reduce_sum(y_true) + tensorflow.reduce_sum(y_pred) - intersection
    return (intersection + smooth) / (union + smooth)

# Method to calculate Dice Accuracy Coefficient
def dice_coef(y_true, y_pred, smooth=1e-6):
    intersection = tensorflow.reduce_sum(y_true * y_pred)
    total = tensorflow.reduce_sum(y_true) + tensorflow.reduce_sum(y_pred)
    return (2. * intersection + smooth) / (total + smooth)

# Method to calculate Dice Loss
def soft_dice_loss(y_true, y_pred):
    return 1 - dice_coef(y_true, y_pred)

# Method to create generator
def create_generator(zipped):
    for (img, mask) in zipped:
        yield (img, mask)

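# Placeholder paths: replace these with the locations of your model file and test data
model_path = "path"
output_test_image_dir = "path/to/test/images"
output_test_label_dir = "path/to/test/labels"

u_net_model = load_model(model_path, custom_objects={'soft_dice_loss': soft_dice_loss, 'dice_coef': dice_coef, "iou_coef": iou_coef})
test_generator = create_generator(image_and_mask_generator(output_test_image_dir, output_test_label_dir))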
# Assuming create_generator is defined and provides images for prediction
images, ground_truth_masks = next(test_generator)
# Make predictions
predictions = u_net_model.predict(images)
# Apply threshold to predictions
thresh_val = 0.8
prediction_threshold = (predictions > thresh_val).astype(np.uint8)
# Visualize results
num_samples = min(10, len(images))  # Use at most 10 samples or the total number of images available
f = plt.figure(figsize=(15, 25))
for i in range(num_samples):
    ix = random.randint(0, len(images) - 1)  # Ensure ix is within range

    f.add_subplot(num_samples, 4, i * 4 + 1)
    plt.imshow(images[ix])
    plt.title("Image")
    plt.axis('off')

    f.add_subplot(num_samples, 4, i * 4 + 2)
    plt.imshow(np.squeeze(ground_truth_masks[ix]))
    plt.title("Ground Truth")
    plt.axis('off')

    f.add_subplot(num_samples, 4, i * 4 + 3)
    plt.imshow(np.squeeze(predictions[ix]))
    plt.title("Prediction")
    plt.axis('off')

    f.add_subplot(num_samples, 4, i * 4 + 4)
    plt.imshow(np.squeeze(prediction_threshold[ix]))
    plt.title(f"Thresholded at {thresh_val}")
    plt.axis('off')

plt.show()
```
## Training Details
### Training Data
The model was trained on the Massachusetts Roads Dataset, which includes high-resolution aerial images with corresponding road segmentation masks. The images were preprocessed by cropping into 256x256 patches and converting masks to binary format.
### Training Procedure
#### Preprocessing
- Images were cropped into 256x256 patches to manage memory usage and improve training efficiency.
- Masks were binarized to create clear road/non-road classifications.
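The two preprocessing steps above can be sketched with NumPy as follows. This is only an illustration, not the repository's actual preprocessing code: the helper names `crop_into_patches` and `binarize_mask`, the threshold of 127, the choice to drop incomplete edge tiles, and the synthetic 1500x1500 input are all assumptions made for the example.

```python
import numpy as np

def crop_into_patches(image, patch_size=256):
    """Split an H x W (x C) array into non-overlapping square tiles.

    Assumption: rows and columns that do not fill a whole tile are dropped.
    """
    height, width = image.shape[:2]
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return patches

def binarize_mask(mask, threshold=127):
    """Map a grayscale mask to {0, 1}: 1 where the pixel value exceeds the threshold.

    Assumption: 127 is an illustrative cut-off, not a value taken from the model card.
    """
    return (mask > threshold).astype(np.uint8)

# Synthetic stand-ins for one aerial image and its mask (sizes are assumed)
image = np.zeros((1500, 1500, 3), dtype=np.uint8)
mask = np.zeros((1500, 1500), dtype=np.uint8)
mask[700:720, :] = 255  # a 20-pixel-wide horizontal "road"

image_patches = crop_into_patches(image)
mask_patches = [binarize_mask(patch) for patch in crop_into_patches(mask)]

print(len(image_patches), image_patches[0].shape, mask_patches[0].shape)
```

Non-overlapping tiles keep the example short; an overlapping stride or edge padding would retain the border pixels that this sketch discards.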
#### Training Hyperparameters
- **Training regime:** FP32 precision
- **Epochs:** 2
- **Batch Size:** 8
- **Learning Rate:** 0.0001
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
The model was evaluated using a separate set of aerial images and their corresponding ground truth masks from the dataset.
#### Metrics
- **Intersection over Union (IoU):** Measures the overlap between predicted and actual road areas.
- **Dice Coefficient:** Evaluates the similarity between predicted and ground truth masks.
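To make the two metrics concrete, here is a small worked example on a pair of 4x4 binary masks. It mirrors the `iou_coef` and `dice_coef` formulas from the getting-started code but uses plain NumPy; the function names `iou_score` and `dice_score` and the toy masks are hypothetical and exist only for illustration.

```python
import numpy as np

def iou_score(y_true, y_pred, smooth=1e-6):
    """Intersection over Union for binary masks (smooth avoids division by zero)."""
    intersection = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - intersection
    return (intersection + smooth) / (union + smooth)

def dice_score(y_true, y_pred, smooth=1e-6):
    """Dice coefficient for binary masks."""
    intersection = np.sum(y_true * y_pred)
    total = np.sum(y_true) + np.sum(y_pred)
    return (2.0 * intersection + smooth) / (total + smooth)

# Toy masks: 4 road pixels in each, 3 of which overlap
ground_truth = np.array([[1, 1, 0, 0],
                         [1, 1, 0, 0],
                         [0, 0, 0, 0],
                         [0, 0, 0, 0]], dtype=np.float32)
prediction = np.array([[1, 1, 1, 0],
                       [1, 0, 0, 0],
                       [0, 0, 0, 0],
                       [0, 0, 0, 0]], dtype=np.float32)

iou = iou_score(ground_truth, prediction)    # intersection 3, union 5 -> about 0.60
dice = dice_score(ground_truth, prediction)  # 2 * 3 / (4 + 4) -> about 0.75
print(f"IoU: {iou:.2f}, Dice: {dice:.2f}")
```

For the same pair of masks the two scores are related by Dice = 2 * IoU / (1 + IoU), so Dice is never lower than IoU; here 2 * 0.6 / 1.6 = 0.75.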
### Results
The model achieved 71% accuracy in segmenting road networks from aerial images, with evaluation metrics indicating good performance in distinguishing road features from non-road areas.
#### Summary
The U-Net-50 model effectively segments road networks, demonstrating its potential for practical applications in urban planning and autonomous systems.
## Technical Specifications
### Model Architecture and Objective
- **Architecture:** U-Net-50
- **Objective:** Road segmentation in aerial images
### Compute Infrastructure
#### Software
- **Framework:** TensorFlow 2.x
- **Dependencies:** Keras, OpenCV, tifffile
**BibTeX:**
@misc{aerial-image-road-segmentation-with-U-NET-xp,
author = {spectrewolf8},
title = {Aerial Image Road Segmentation Using U-Net-50},
year = {2024},
howpublished = {\url{https://github.com/Spectrewolf8/aerial-image-road-segmentation-xp}},
}