
Semantic Segmentation using PyTorch

License: Apache License 2.0

pytorch pytorch-lightning semantic-segmentation cityscapes deep-learning


Pytorch-Segmentation

For a detailed overview of the dataset, visit the Cityscapes website and the Cityscapes GitHub repository.

This repository uses the PyTorch Lightning framework for training, evaluation, and inference.

 


Script usage

Dataset Utilities

1. File parsing and decoding

Parse files located under the following directory structure:


<data_path> : the root directory of the Cityscapes dataset
|
├── gtFine_trainvaltest
│   └── gtFine
│       ├── test
│       │   ├── berlin
│       │   ├── bielefeld
│       │   ├── bonn
│       │   ├── leverkusen
│       │   ├── mainz
│       │   └── munich
│       ├── train
│       │   ├── aachen
│       │   ├── bochum
│       │   ├── bremen
│       │   ├── cologne
│       │   ├── darmstadt
│       │   ├── dusseldorf
│       │   ├── erfurt
│       │   ├── hamburg
│       │   ├── hanover
│       │   ├── jena
│       │   ├── krefeld
│       │   ├── monchengladbach
│       │   ├── strasbourg
│       │   ├── stuttgart
│       │   ├── tubingen
│       │   ├── ulm
│       │   ├── weimar
│       │   └── zurich
│       └── val
│           ├── frankfurt
│           ├── lindau
│           └── munster
└── leftImg8bit_trainvaltest
    └── leftImg8bit
        ├── test
        │   ├── berlin
        │   ├── bielefeld
        │   ├── bonn
        │   ├── leverkusen
        │   ├── mainz
        │   └── munich
        ├── train
        │   ├── aachen
        │   ├── bochum
        │   ├── bremen
        │   ├── cologne
        │   ├── darmstadt
        │   ├── dusseldorf
        │   ├── erfurt
        │   ├── hamburg
        │   ├── hanover
        │   ├── jena
        │   ├── krefeld
        │   ├── monchengladbach
        │   ├── strasbourg
        │   ├── stuttgart
        │   ├── tubingen
        │   ├── ulm
        │   ├── weimar
        │   └── zurich
        └── val
            ├── frankfurt
            ├── lindau
            └── munster

Each of the train, val, and test directories contains subdirectories named after a city. To use a whole split, subfolder='all' must be passed to the Dataset.create() method so that images are read from all the subfolders. For testing purposes, a smaller number of images can be used by passing *subfolder='<CityName>'*. For example, passing split='train' to the Dataset() constructor and subfolder='aachen' to the create() method makes the Dataset object read only the 174 images in the aachen folder when building the dataset. You can choose either all the subfolders or exactly one of them, but not an arbitrary combination. After the images (x) and the ground truth images (y) are read and decoded, they are combined into a single (x, y) pair.
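As a sketch of the file-pairing step described above, the helper below walks the two directory trees and matches each leftImg8bit image to its gtFine label by filename stem. The function name and the `subfolder` handling are illustrative, mirroring the description rather than the repository's exact API:

```python
from pathlib import Path

def list_cityscapes_pairs(root, split, subfolder="all"):
    """Pair each leftImg8bit image with its gtFine label file.

    root      -- <data_path>, the Cityscapes root directory
    split     -- 'train', 'val', or 'test'
    subfolder -- 'all' to read every city, or a single city name
    """
    img_dir = Path(root) / "leftImg8bit_trainvaltest" / "leftImg8bit" / split
    lbl_dir = Path(root) / "gtFine_trainvaltest" / "gtFine" / split
    if subfolder == "all":
        cities = sorted(p.name for p in img_dir.iterdir() if p.is_dir())
    else:
        cities = [subfolder]
    pairs = []
    for city in cities:
        for img in sorted((img_dir / city).glob("*_leftImg8bit.png")):
            # gtFine files share the image stem, with a different suffix
            lbl = lbl_dir / city / img.name.replace(
                "_leftImg8bit.png", "_gtFine_labelIds.png")
            pairs.append((img, lbl))
    return pairs
```

Decoding the two files of each pair then yields the combined (x, y) element.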

 

2. Preprocessing

Generally, images have a shape of (batch_size, channels, height, width), i.e. the "channels_first" format.

  1. Split the image into smaller patches with spatial resolution (256, 256). Because every image has a spatial resolution of (1024, 2048), 32 patches are produced and they comprise a single batch. This means that when the patching technique is used, the batch size is fixed to 32. After this operation the images have a shape of (32, 256, 256, 3) while the ground truth images have a shape of (32, 256, 256, 1). To enable patching, set the use_patches argument of the create() method to True.
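The patching step can be sketched with `torch.Tensor.unfold`; a (3, 1024, 2048) frame tiles into a 4 × 8 grid of (256, 256) patches, which is where the fixed batch size of 32 comes from. This is an illustrative implementation (channels_first, per the note above), not the repository's exact code:

```python
import torch

def to_patches(image, patch=256):
    """Split a (C, H, W) image into non-overlapping patch x patch tiles.

    A (3, 1024, 2048) Cityscapes frame yields 4 * 8 = 32 tiles, so the
    effective batch size is fixed to 32 when patching is enabled.
    """
    c, _, _ = image.shape
    # unfold along height, then width: (C, H/p, W/p, p, p)
    tiles = image.unfold(1, patch, patch).unfold(2, patch, patch)
    # bring the grid dimensions to the front and flatten them into a batch
    return tiles.permute(1, 2, 0, 3, 4).reshape(-1, c, patch, patch)
```

The same tiling would be applied to the single-channel ground truth image so that every patch keeps its matching labels.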

 

  1. Perform data augmentation

    NOTE: while all augmentations are performed on the input images, only horizontal flipping is applied to the ground truth images, because altering the pixel values of a ground truth image would change the classes its pixels belong to.
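The note above can be sketched as follows: a geometric transform (the flip) is applied identically to image and label, while a photometric transform (here, a hypothetical brightness jitter) touches the image only. Function and parameter names are illustrative:

```python
import torch

def augment(image, mask, p_flip=0.5, jitter=0.1):
    """Paired augmentation: geometric ops on both tensors, photometric
    ops on the image only, so label ids are never altered."""
    if torch.rand(()) < p_flip:
        image = torch.flip(image, dims=[-1])  # horizontal flip
        mask = torch.flip(mask, dims=[-1])    # same flip on the labels
    # brightness jitter changes pixel values, so it is image-only
    image = image * (1.0 + jitter * (2 * torch.rand(()) - 1))
    return image, mask
```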

 

  1. Normalize images:
    • The input pixel values are scaled between -1 and 1 by default
    • If using a pretrained backbone, normalize according to what the pretrained network expects at its input. To determine which preprocessing is applied to the images, the name of the pretrained network must be passed as the preprocessing argument of the Dataset constructor. For example, if a model from the EfficientNet model family (e.g. EfficientNetB0, EfficientNetB1, etc.) is used as a backbone, then preprocessing = "EfficientNet" must be passed.

 

  1. Preprocess ground truth images:
    • Map eval ids to train ids
    • Convert to one-hot encoding
    • After this operation ground truth images have a shape of (batch_size, 1024, 2048, num_classes)
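The ground-truth steps above can be sketched with a lookup table and `torch.nn.functional.one_hot`. Only a few of the 34 Cityscapes (eval id → train id) entries are shown; the full mapping lives in the official cityscapesScripts labels. Folding the void ids into the last channel is a simplification for this sketch:

```python
import torch
import torch.nn.functional as F

IGNORE = 255
# A few Cityscapes eval-id -> train-id entries, for illustration only
EVAL_TO_TRAIN = {0: IGNORE, 7: 0, 8: 1, 11: 2, 26: 13}

def encode_labels(label, num_classes=20):
    """Map eval ids to train ids, then one-hot encode a (H, W) label map."""
    lut = torch.full((256,), IGNORE, dtype=torch.long)
    for eval_id, train_id in EVAL_TO_TRAIN.items():
        lut[eval_id] = train_id
    train = lut[label.long()]                  # eval ids -> train ids
    train[train == IGNORE] = num_classes - 1   # fold void into last channel
    one_hot = F.one_hot(train, num_classes)    # (H, W) -> (H, W, num_classes)
    return one_hot.permute(2, 0, 1)            # channels_first layout
```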

Finally, the resulting dataset consists of (image, ground_truth) pairs with shape ((batch_size, height, width, 3), (batch_size, height, width, num_classes)).

 


Segmentation Models

