
aleedm / real-time-domain-adaptation-in-semantic-segmentation


This repository contains research on real-time domain adaptation in semantic segmentation, aiming at bridging the gap between synthetic and real-world imagery for urban scenes and autonomous driving, utilizing STDC models and advanced domain adaptation methods.

License: MIT License

Python 100.00%
depthwise-separable-convolutions domain-adaptation generative-adversarial-network real-time-semantic-segmentation


Real-time Domain Adaptation in Semantic Segmentation (Course Project)

This repository provides a starter-code setup for the Real-time Domain Adaptation in Semantic Segmentation project of the Advanced Machine Learning course. (Presentation of the project)

Package

  • datasets: Contains the dataset classes for the Cityscapes and GTA datasets. The training and validation images for both datasets should also be placed here.
  • model: Contains the STDCNet model and the Discriminator model.
  • runs: Contains the tensorboard logs for all the project steps.
  • saved_models: Contains the saved models for all the project steps.
  • STDCNET_weights: Contains the pre-trained weights for the STDCNet model.

Requisites

  • Download the pre-trained weights from this link and put them in the STDCNET_weights folder.
  • Download the Cityscapes dataset and the GTA dataset from this link and put them in the datasets folder.
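
Before launching a run, it can help to confirm that the downloads ended up where the commands expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_requisites` is hypothetical, the folder names come from the Package list, and the weight file name is the one passed to `--pretrain_path` in the commands of this README. The internal layout of the `datasets` folder is not specified here, so only the folder itself is checked.

```python
from pathlib import Path

# Assumed layout: folder names are taken from the Package list, and the
# weight file name is the one used with --pretrain_path in this README.
EXPECTED_PATHS = [
    Path("STDCNET_weights") / "STDCNet813M_73.91.tar",
    Path("datasets"),
]


def missing_requisites(root="."):
    """Return the expected paths (POSIX style) that do not exist under root."""
    root = Path(root)
    return [p.as_posix() for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_requisites()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All requisites found.")
```

Running the script from the repository root lists whatever is still missing.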

Steps

  • 2.A: Train the STDCNet model on the Cityscapes dataset and evaluate it on the Cityscapes dataset.
  • 2.B: Train the STDCNet model on the GTA dataset and evaluate it on the GTA dataset.
  • 2.C.1: Evaluate the best model from step 2.B on the Cityscapes dataset.
  • 2.C.2: Train the STDCNet model on the GTA augmented dataset and evaluate it on the Cityscapes dataset.
  • 3: Train the STDCNet model with unsupervised adversarial domain adaptation, using labeled synthetic data (source: GTA dataset) and unlabeled real data (target: Cityscapes dataset).
  • 4.A: Same as step 3, but using a depthwise discriminator.
  • 4.B: Same as step 3, but using a diagonalwise discriminator.
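
The adversarial steps (3, 4.A, 4.B) can be illustrated with a small, framework-free sketch of the two competing objectives. This is an assumption about the general shape of adversarial domain adaptation, not the repository's code: the label convention, the function names, and the weight `lambda_adv` are all hypothetical. Here `p` stands for the discriminator's predicted probability that a segmentation output comes from the target domain.

```python
import math

# Assumed label convention (hypothetical, not taken from the repository).
SOURCE_LABEL = 0.0  # discriminator target for outputs on GTA images
TARGET_LABEL = 1.0  # discriminator target for outputs on Cityscapes images


def bce(prob, label, eps=1e-7):
    """Binary cross-entropy for a single predicted probability."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def discriminator_loss(p_source, p_target):
    """The discriminator learns to separate source outputs from target outputs."""
    return bce(p_source, SOURCE_LABEL) + bce(p_target, TARGET_LABEL)


def adversarial_loss(p_target):
    """The segmentation network is rewarded when target outputs look like source."""
    return bce(p_target, SOURCE_LABEL)


def segmentation_objective(seg_loss_source, p_target, lambda_adv=0.001):
    """Supervised loss on labeled source data plus a weighted adversarial term."""
    return seg_loss_source + lambda_adv * adversarial_loss(p_target)


if __name__ == "__main__":
    # The adversarial term shrinks as the discriminator gets fooled.
    print(adversarial_loss(0.9), adversarial_loss(0.1))
```

Only the source images contribute a supervised segmentation loss; the target images influence the segmentation network through the adversarial term alone, which is what makes the adaptation unsupervised on the target side.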

Command Line Arguments

  • 2.A: --train_dataset Cityscapes --val_dataset Cityscapes --pretrain_path STDCNET_weights/STDCNet813M_73.91.tar --batch_size 8 --num_epochs 50 --learning_rate 0.01 --crop_height 512 --crop_width 1024 --tensorboard_path runs/2_A --save_model_path saved_models/2_A --optimizer sgd --loss crossentropy
  • 2.B: --train_dataset GTA --val_dataset GTA --pretrain_path STDCNET_weights/STDCNet813M_73.91.tar --batch_size 8 --num_epochs 50 --crop_height 512 --learning_rate 0.01 --crop_width 1024 --tensorboard_path runs/2_B --save_model_path saved_models/2_B --optimizer sgd --loss crossentropy
  • 2.C.1: --mode val --val_dataset Cityscapes --crop_height 512 --crop_width 1024 --save_model_path saved_models/2_B/best.pth
  • 2.C.2: --train_dataset GTA_aug --val_dataset Cityscapes --pretrain_path STDCNET_weights/STDCNet813M_73.91.tar --batch_size 8 --learning_rate 0.01 --num_epochs 50 --crop_height 512 --crop_width 1024 --tensorboard_path runs/2_C_2 --save_model_path saved_models/2_C_2 --optimizer sgd --loss crossentropy
  • 3: --mode train_adversarial --pretrain_path STDCNET_weights/STDCNet813M_73.91.tar --batch_size 8 --learning_rate 0.01 --discriminator_learning_rate 0.001 --num_epochs 50 --crop_height 512 --crop_width 1024 --tensorboard_path runs/3 --save_model_path saved_models/3
  • 4.A: --mode train_adversarial --depthwise_discriminator depthwise --pretrain_path STDCNET_weights/STDCNet813M_73.91.tar --batch_size 8 --learning_rate 0.01 --discriminator_learning_rate 0.001 --num_epochs 50 --crop_height 512 --crop_width 1024 --tensorboard_path runs/4_A --save_model_path saved_models/4_A
  • 4.B: --mode train_adversarial --depthwise_discriminator diagonalwise --pretrain_path STDCNET_weights/STDCNet813M_73.91.tar --batch_size 8 --learning_rate 0.01 --discriminator_learning_rate 0.001 --num_epochs 50 --crop_height 512 --crop_width 1024 --tensorboard_path runs/4_B --save_model_path saved_models/4_B
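
To show how these flags fit together, here is a hedged sketch of an `argparse` parser that accepts them. The flag names are copied from the commands above; the defaults, the types, and the `build_parser` helper are assumptions, and the repository's real parser may differ.

```python
import argparse
import shlex


def build_parser():
    """Hypothetical parser; flag names are copied from the commands above."""
    parser = argparse.ArgumentParser(description="STDCNet training / evaluation (sketch)")
    # Defaults and types below are assumptions, not taken from the repository.
    parser.add_argument("--mode", default="train")
    parser.add_argument("--train_dataset", default=None)
    parser.add_argument("--val_dataset", default=None)
    parser.add_argument("--pretrain_path", default=None)
    parser.add_argument("--batch_size", type=int, default=8)
    parser.add_argument("--num_epochs", type=int, default=50)
    parser.add_argument("--learning_rate", type=float, default=0.01)
    parser.add_argument("--discriminator_learning_rate", type=float, default=0.001)
    parser.add_argument("--crop_height", type=int, default=512)
    parser.add_argument("--crop_width", type=int, default=1024)
    parser.add_argument("--tensorboard_path", default=None)
    parser.add_argument("--save_model_path", default=None)
    parser.add_argument("--optimizer", default="sgd")
    parser.add_argument("--loss", default="crossentropy")
    parser.add_argument("--depthwise_discriminator", default=None)
    return parser


# Step 4.A arguments, exactly as listed above.
STEP_4A = (
    "--mode train_adversarial --depthwise_discriminator depthwise "
    "--pretrain_path STDCNET_weights/STDCNet813M_73.91.tar --batch_size 8 "
    "--learning_rate 0.01 --discriminator_learning_rate 0.001 --num_epochs 50 "
    "--crop_height 512 --crop_width 1024 --tensorboard_path runs/4_A "
    "--save_model_path saved_models/4_A"
)

args = build_parser().parse_args(shlex.split(STEP_4A))
print(args.mode, args.depthwise_discriminator, args.crop_height, args.crop_width)
```

Parsing a step's argument string this way is a quick check that no flag is misspelled before a 50-epoch run is started.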

Results

Figure (results): Comparative visualization of semantic segmentation results across different models and techniques. Each row shows an original Cityscapes image with the corresponding segmentation outputs. From left to right: the raw image, segmentation without data augmentation or domain adaptation (DA), with data augmentation but no DA, with DA using a classical convolutional approach, with DA employing depthwise separable convolutions, with DA utilizing diagonal depthwise separable convolutions, and the ground truth (GT) segmentation. These visual results highlight the progressive improvement in segmentation fidelity as we move from standard methods to advanced DA techniques.
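
As background for the depthwise separable variant mentioned in the caption, the sketch below compares the weight count of a standard convolution with that of a depthwise separable one (a per-channel k x k convolution followed by a 1 x 1 pointwise convolution). The layer sizes are hypothetical and are not taken from the repository's discriminator; biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise one."""
    depthwise = k * k * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 128 output channels, 4 x 4 kernel.
std = standard_conv_params(64, 128, 4)
dws = depthwise_separable_params(64, 128, 4)
print(std, dws, dws / std)  # the ratio equals 1/c_out + 1/k**2
```

The reduction in weights is the usual motivation for the lighter discriminator; whether it also changes accuracy is an empirical question that the results table addresses.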

| Training dataset | Validation dataset | Accuracy (%) | mIoU (%) | Training time (avg per epoch) |
|---|---|---|---|---|
| Cityscapes | Cityscapes | 81 | 57.8 | 2:33 minutes |
| GTA | GTA | 80.8 | 62.0 | 3:28 minutes |
| GTA | Cityscapes | 60.1 | 24.6 | None |
| GTA augmented | Cityscapes | 70.2 | 30.7 | 5:22 minutes |
| Single Layer DA (Source=GTA, Target=Cityscapes) | Cityscapes | 74.3 | 33.8 | 4:33 minutes |
| Single Layer DA (Source=GTA, Target=Cityscapes), depthwise discriminator function | Cityscapes | 73.1 | 32.7 | 4:32 minutes |
| Single Layer DA (Source=GTA, Target=Cityscapes), diagonalwise discriminator function | Cityscapes | 74.0 | 33.5 | 4:25 minutes |
[Figures: loss, mIoU, precision]
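
For readers unfamiliar with the two reported metrics, the following sketch computes pixel accuracy and mean IoU from a confusion matrix using their standard definitions. It is an illustration under stated assumptions, not the repository's evaluation code: the function names and the toy 2-class matrix are hypothetical, and rows are taken to be ground-truth classes while columns are predicted classes.

```python
def pixel_accuracy(conf):
    """Fraction of pixels whose predicted class matches the ground truth."""
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(len(conf)))
    return correct / total


def mean_iou(conf):
    """Mean over classes of TP / (TP + FP + FN); absent classes are skipped."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp  # ground-truth c predicted as another class
        fp = sum(conf[r][c] for r in range(n)) - tp  # other classes predicted as c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2-class confusion matrix (rows: ground truth, columns: prediction).
toy = [[50, 10],
       [5, 35]]
print(pixel_accuracy(toy), mean_iou(toy))
```

Because mIoU averages over classes rather than pixels, it penalizes errors on small classes more heavily than accuracy does, which is consistent with the mIoU values in the table sitting well below the accuracy values.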
