Drone Images Semantic Segmentation

Friendly Reminder

If you use my dataset, please cite it in your work or repository with the following link: Semantic Segmentation Drone Dataset
If you use or take inspiration from this repository, please cite it with this link: santurini/Drone-Images-Semantic-Segmentation

Your support is truly appreciated. Feel free to contact me through the following links or by email:

Repository Structure

The repository is structured as follows:

code folder: contains the notebook for image preprocessing, Binary segmentation and Multi-class segmentation
plots folder: contains two subfolders, binary and multiclass, with the respective plots

Dataset

The dataset used is called Semantic Segmentation Drone Dataset and can be downloaded already processed at the following link.

The original images were reduced in resolution and their labels were remapped to support both Binary and Multi-class Classification. In the binary case the original classes are split into obstacles and landing zones; in the multi-class case the original 24 classes are grouped into 5 macro-classes, as follows:

binary_classes = {
    0: {0, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20},  # obstacles
    1: {1, 2, 3, 4, 9},  # landing zones
}

grouped_classes = {
    0: {0, 6, 10, 11, 12, 13, 14, 21, 22, 23},  # obstacles
    1: {5, 7},  # water
    2: {2, 3, 8, 19, 20},  # soft-surfaces
    3: {15, 16, 17, 18},  # moving objects
    4: {1, 4, 9},  # landable
}
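As a minimal sketch of how these mappings could be applied to a label mask, each dictionary can be inverted into a lookup table indexed by the original class id. The helper names build_lut and remap_mask, the use of NumPy, and the fill value for ids a mapping does not list are assumptions for illustration, not taken from the repository's notebooks. Note that binary_classes as listed covers ids 0 to 20 only, so ids 21 to 23 fall back to fill (here 0, obstacles) in this sketch.

```python
import numpy as np

binary_classes = {
    0: {0, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20},  # obstacles
    1: {1, 2, 3, 4, 9},  # landing zones
}

grouped_classes = {
    0: {0, 6, 10, 11, 12, 13, 14, 21, 22, 23},  # obstacles
    1: {5, 7},  # water
    2: {2, 3, 8, 19, 20},  # soft-surfaces
    3: {15, 16, 17, 18},  # moving objects
    4: {1, 4, 9},  # landable
}


def build_lut(mapping, n_original=24, fill=0):
    """Invert {new_id: {old_ids}} into an array where lut[old_id] == new_id.

    Ids missing from the mapping keep the `fill` value (an assumption of this sketch).
    """
    lut = np.full(n_original, fill, dtype=np.uint8)
    for new_id, old_ids in mapping.items():
        for old_id in old_ids:
            lut[old_id] = new_id
    return lut


def remap_mask(mask, mapping):
    """Relabel an integer mask of original class ids (0..23) with one indexing step."""
    return build_lut(mapping)[mask]


# Toy 2x3 mask holding original class ids.
mask = np.array([[0, 1, 5], [9, 15, 23]])
print(remap_mask(mask, binary_classes))   # binary labels
print(remap_mask(mask, grouped_classes))  # 5 macro-class labels
```

Indexing the lookup table with the whole mask relabels every pixel at once, which avoids looping over pixels or over classes.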

Figure: sample image alongside its Binary and 5-classes masks.

Models and Training

Three image segmentation models were compared. Each is considered state of the art and has its own distinguishing characteristic, so the comparison shows how their underlying concepts differ.

All models used an EfficientNet-B0 backbone pretrained on ImageNet, while the decoders were trained for 25 epochs on the augmented train set. Given the limited number of images (just 400), augmentation was crucial to train the models well.
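The text does not say which augmentations were used. Purely as an illustration, the sketch below shows the one property any segmentation augmentation needs: the same geometric transform must be applied to the image and to its mask so that labels stay aligned with pixels. The function name augment_pair and the choice of flips and 90-degree rotations are assumptions, not the repository's pipeline.

```python
import numpy as np


def augment_pair(image, mask, rng):
    """Apply the same random flip/rotation to an image (H, W, C) and its mask (H, W)."""
    if rng.random() < 0.5:  # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:  # vertical flip
        image, mask = image[::-1], mask[::-1]
    k = int(rng.integers(0, 4))  # number of 90-degree rotations (0 to 3)
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    # Copy so the results do not share memory with the inputs.
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)


# Toy example: the image's first channel equals the mask, so alignment is easy to check.
rng = np.random.default_rng(0)
mask = np.arange(12).reshape(3, 4)
image = np.stack([mask, mask, mask], axis=-1)
aug_image, aug_mask = augment_pair(image, mask, rng)
print(np.array_equal(aug_image[..., 0], aug_mask))
```

Colour or brightness changes, if used, would be applied to the image only, since they do not move pixels.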

The loss function was the Dice Loss (Binary and Multi-class), and the models were evaluated with Recall, False Positive Rate and image-wise IoU (in the Multi-class case all the metrics besides IoU were computed per-class).
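For reference, the quantities named above can be sketched in NumPy for the binary case. This is an illustration of the standard formulas, not the training code: the function names, the eps smoothing term, and returning NaN for an undefined ratio are assumptions of this sketch.

```python
import numpy as np


def dice_loss(probs, target, eps=1e-7):
    """Soft Dice loss: 1 - 2*sum(p*t) / (sum(p) + sum(t)), p in [0, 1], t in {0, 1}."""
    probs = np.asarray(probs, dtype=float)
    target = np.asarray(target, dtype=float)
    intersection = (probs * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (probs.sum() + target.sum() + eps)


def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return float(num) / float(den) if den else float("nan")


def binary_metrics(pred, target):
    """Recall, False Positive Rate and IoU for one pair of binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.sum(pred & target)    # positive pixels predicted positive
    fp = np.sum(pred & ~target)   # negative pixels predicted positive
    fn = np.sum(~pred & target)   # positive pixels predicted negative
    tn = np.sum(~pred & ~target)  # negative pixels predicted negative
    return {
        "recall": _ratio(tp, tp + fn),
        "fpr": _ratio(fp, fp + tn),
        "iou": _ratio(tp, tp + fp + fn),
    }


pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [1, 0]])
print(binary_metrics(pred, target))  # one TP, one FP, one FN, one TN
print(dice_loss(pred, target))
```

One reading of image-wise IoU is to call binary_metrics on each image and average the "iou" values over the set. For per-class metrics in the Multi-class case, the same function can be applied to pred == c and target == c for each class c.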