kunzhan / dssn

ACM MM 2023: Improving semi-supervised semantic segmentation with dual-level Siamese structure network

Home Page: http://arxiv.org/abs/2307.13938

Python 93.38% Shell 6.62%
cityscapes contrastive-learning image-segmentation long-tailed-learning pascal-voc semantic-segmentation semi-supervised-segmentation

dssn's Introduction

DSSN

Semi-supervised semantic segmentation (SSS) is an important task that utilizes both labeled and unlabeled data to reduce the expense of labeling training examples. However, the effectiveness of SSS algorithms is limited by the difficulty of fully exploiting the potential of unlabeled data. To address this, we propose a dual-level Siamese structure network (DSSN) for pixel-wise contrastive learning. By aligning positive pairs with a pixel-wise contrastive loss using strongly augmented views in both the low-level image space and the high-level feature space, the proposed DSSN is designed to maximize the utilization of available unlabeled data. Additionally, we introduce a novel class-aware pseudo-label selection strategy for weak-to-strong supervision, which addresses a limitation of most existing methods: they either perform no selection or apply a single predefined threshold to all classes. Specifically, our strategy selects the top high-confidence predictions of the weak view for each class to generate pseudo labels that supervise the strongly augmented views. This strategy takes class imbalance into account and improves the performance of long-tailed classes. Our proposed method achieves state-of-the-art results on two datasets, PASCAL VOC 2012 and Cityscapes, outperforming other SSS algorithms by a significant margin.
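The class-aware selection strategy described above can be sketched roughly as follows. This is a simplified illustration, not the repository's code: the `keep_ratio` parameter and the per-class quantile threshold are assumptions made for the example.

```python
import numpy as np

def class_aware_pseudo_labels(probs, keep_ratio=0.5, ignore_index=255):
    """Select high-confidence pseudo labels per class from weak-view predictions.

    probs: (C, H, W) softmax probabilities from the weakly augmented view.
    For each class, keep only the top `keep_ratio` fraction of the pixels
    predicted as that class (ranked by confidence); the rest are ignored.
    """
    num_classes, h, w = probs.shape
    conf = probs.max(axis=0)      # per-pixel confidence
    hard = probs.argmax(axis=0)   # per-pixel predicted class
    pseudo = np.full((h, w), ignore_index, dtype=np.int64)
    for c in range(num_classes):
        mask = hard == c
        if not mask.any():
            continue
        # class-specific threshold: confidence at the top keep_ratio fraction,
        # so rare (long-tailed) classes get their own cutoff rather than a
        # single global one
        thresh = np.quantile(conf[mask], 1.0 - keep_ratio)
        pseudo[mask & (conf >= thresh)] = c
    return pseudo
```

Pixels whose confidence falls below their class's own threshold are mapped to the ignore index, so they would contribute no gradient when the pseudo labels supervise the strongly augmented views.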

Getting Started

Installation

Pascal VOC

pip install -r env181.txt

Cityscapes

pip install -r env200.txt

Parts of this code are borrowed from the baseline ST++.

Pretrained Backbone

ResNet-50 | ResNet-101 | Xception-65

├── ./pretrained
    ├── resnet50.pth
    ├── resnet101.pth
    └── xception.pth

Dataset

Please modify your dataset paths in the configuration files.

The ground-truth masks have already been pre-processed; you can use them directly.

├── [Your Pascal Path]
    ├── JPEGImages
    └── SegmentationClass
    
├── [Your Cityscapes Path]
    ├── leftImg8bit
    └── gtFine
    
├── [Your COCO Path]
    ├── train2017
    ├── val2017
    └── masks

Usage

DSSN

# use torch.distributed.launch
sh scripts/train.sh <num_gpu> <port>

# or use slurm
# sh scripts/slurm_train.sh <num_gpu> <port> <partition>

To train on other datasets or splits, please modify dataset and split in train.sh.

Supervised Baseline

Modify the method from 'DSSN' to 'supervised' in train.sh, and double the batch_size in the configuration file if you use the same number of GPUs as in the semi-supervised setting (there is no need to change the lr).

Citation

If you find this project useful, please consider citing:

@InProceedings{DSSN2023,
  author    = {Tian, Zhibo and Zhang, Xiaolin and Zhang, Peng and Zhan, Kun},
  booktitle = {ACM Multimedia},
  title     = {Improving semi-supervised semantic segmentation with dual-level Siamese structure network},
  year      = {2023},
}

dssn's People

Contributors: kunzhan

dssn's Issues

Question about the loss

Hi! In this line of code, the MSE loss is computed between pred_u_s1_norm and pred_u_s2_norm. However, since the two CutMix boxes are different, img_u_s1 and img_u_s2 are definitely different. So how does it make sense to compute the MSE loss between these two kinds of predictions? Thanks!
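For readers following this issue, here is a minimal numeric sketch of an MSE consistency term between two normalized prediction maps. The function name and the softmax normalization are hypothetical illustrations, not the repository's actual loss code.

```python
import numpy as np

def consistency_mse(logits_s1, logits_s2):
    """MSE between softmax-normalized prediction maps of shape (C, H, W).

    Illustrative sketch: both strong views are supervised toward the same
    weak-view pseudo labels, so a term like this additionally pulls their
    normalized predictions toward each other.
    """
    def softmax(x):
        e = np.exp(x - x.max(axis=0, keepdims=True))  # stable softmax over C
        return e / e.sum(axis=0, keepdims=True)
    return float(np.mean((softmax(logits_s1) - softmax(logits_s2)) ** 2))
```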

Requirement of GPU

Hi! May I ask how many GPUs, and what kind of GPU, are needed for your work? Thanks!

Question about the mean-teacher setting

Hi! Have you run an ablation experiment that adopts a mean teacher (MT) to provide pseudo labels? In my opinion, after the various image and feature perturbations there is no need to adopt MT; the model itself, with the class-aware threshold applied, should be strong enough to mine out information and generate reliable pseudo labels. I think the performance boost would be mild; am I right? Thanks!
