hanchaoleng / shapeconv

ShapeConv: Shape-aware Convolutional Layer for Indoor RGB-D Semantic Segmentation (ICCV 2021)

License: Apache License 2.0

Python 98.77% Shell 1.23%
rgbd rgbd-segmentation semantic-segmentation iccv2021 3d depth-fusion nyu-depth-v2 sun-rgbd computer-vision

The official implementation of Shape-aware Convolutional Layer.

Jinming Cao, Hanchao Leng, Dani Lischinski, Danny Cohen-Or, Changhe Tu, Yangyan Li, ICCV 2021.

Introduction

We design a Shape-aware Convolutional (ShapeConv) layer that explicitly models shape information to improve RGB-D semantic segmentation accuracy. Specifically, we decompose the depth feature into a shape-component and a base-component, and then introduce two learnable weights that process the shape and the base separately.
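To make the decomposition concrete, here is a minimal plain-Python sketch; it is not the repository's implementation. It assumes that the base-component is the mean of a flattened patch and that the shape-component is the residual left after subtracting that mean. The function names, and the choice of a scalar `w_base` and a square matrix `w_shape` for the two learnable weights, are illustrative assumptions.

```python
def decompose(patch):
    """Split a flattened patch into a base (mean) and a shape (residual) part."""
    base = sum(patch) / len(patch)
    shape = [value - base for value in patch]
    return base, shape


def recombine(base, shape, w_base, w_shape):
    """Re-weight both components and add them back together.

    w_base is a scalar applied to the base; w_shape is a square matrix
    (a list of rows) applied to the shape vector. Both are assumed forms.
    """
    new_base = w_base * base
    new_shape = [sum(w * s for w, s in zip(row, shape)) for row in w_shape]
    return [new_base + s for s in new_shape]


patch = [1.0, 2.0, 3.0, 6.0]
base, shape = decompose(patch)

# With w_base = 1 and an identity w_shape the patch is reproduced unchanged.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
restored = recombine(base, shape, 1.0, identity)
```

Under these assumptions, adding a constant offset to every value in the patch changes only `base`; `shape` stays the same, which is why the two parts can be weighted independently.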

Usage

Installation

  1. Requirements
  • Linux
  • Python 3.6+
  • PyTorch 1.7.0 or higher
  • CUDA 10.0 or higher

We have tested the following versions of OS and software:

  • OS: Ubuntu 16.04.6 LTS
  • CUDA: 10.0
  • PyTorch: 1.7.0
  • Python: 3.6.9
  2. Install dependencies.
pip install -r requirements.txt
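A quick way to compare an environment against the minimum versions listed above is sketched below. The helper names `parse_version`, `meets_minimum`, and `check_python` are hypothetical and not part of this repository; the script uses only the standard library plus `torch.__version__` and `torch.version.cuda` when PyTorch is importable.

```python
import re
import sys


def parse_version(text):
    """Turn '1.7.0+cu101' into (1, 7, 0); any local '+...' suffix is ignored."""
    core = text.split("+", 1)[0]
    parts = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad '3.6' to (3, 6, 0) so tuples compare cleanly
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def check_python(minimum="3.6"):
    current = "{}.{}.{}".format(*sys.version_info[:3])
    return meets_minimum(current, minimum)


if __name__ == "__main__":
    print("Python >= 3.6:", check_python())
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch >= 1.7.0:", meets_minimum(torch.__version__, "1.7.0"))
        print("CUDA version PyTorch was built with:", torch.version.cuda)
```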

Dataset

Download the official dataset and convert it to the format this project expects. See here.

Or download the converted dataset:

Evaluation

  1. Model

    Download a trained model and put it in the ./model_zoo folder. See all trained models here.

  2. Config

    Edit the config file in ./configs. Each config file in ./configs corresponds to a model file in ./model_zoo.

    1. Set inference.gpu_id = CUDA_VISIBLE_DEVICES. CUDA_VISIBLE_DEVICES specifies which GPUs are visible to a CUDA application, e.g., inference.gpu_id = "0,1,2,3".
    2. Set dataset_root = path_to_dataset, where path_to_dataset is the path to the dataset, e.g., dataset_root = "/home/shape_conv/nyu_v2".
  3. Run

    1. Distributed evaluation:
    ./tools/dist_test.sh config_path checkpoint_path gpu_num
    • config_path is the path to the config file;
    • checkpoint_path is the path to the model file;
    • gpu_num is the number of GPUs to use; it must not exceed the number of GPU IDs listed in inference.gpu_id.

    E.g., to evaluate the shape-conv model on NYU-V2 (40 categories), run:

    ./tools/dist_test.sh configs/nyu/nyu40_deeplabv3plus_resnext101_shape.py model_zoo/nyu40_deeplabv3plus_resnext101_shape.pth 4
    2. Non-distributed evaluation:
    python tools/test.py config_path checkpoint_path
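The constraint between gpu_num and inference.gpu_id can be checked before launching. The sketch below is a hypothetical helper, not code from this repository. It assumes gpu_id is a comma-separated string such as "0,1,2,3", so the limit is the number of IDs in the string rather than the length of the string itself.

```python
def visible_gpu_ids(gpu_id):
    """Split a CUDA_VISIBLE_DEVICES-style string such as "0,1,2,3" into IDs."""
    return [part.strip() for part in gpu_id.split(",") if part.strip()]


def check_gpu_num(gpu_id, gpu_num):
    """Raise ValueError if gpu_num is not between 1 and the number of listed IDs."""
    ids = visible_gpu_ids(gpu_id)
    if gpu_num < 1:
        raise ValueError("gpu_num must be at least 1")
    if gpu_num > len(ids):
        raise ValueError(
            "gpu_num={} exceeds the {} GPU(s) listed in gpu_id={!r}".format(
                gpu_num, len(ids), gpu_id
            )
        )
    return ids


# Values taken from the example above; the variable names are illustrative.
gpu_id = "0,1,2,3"
ids = check_gpu_num(gpu_id, 4)
```

Note that `len("0,1,2,3")` is 7, which is why the helper counts IDs after splitting on commas.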

Train

  1. Config

    Edit the config file in ./configs.

    1. Set inference.gpu_id = CUDA_VISIBLE_DEVICES.

      E.g., inference.gpu_id = "0,1,2,3".

    2. Set dataset_root = path_to_dataset.

      E.g., dataset_root = "/home/shape_conv/nyu_v2".

  2. Run

    1. Distributed training:
    ./tools/dist_train.sh config_path gpu_num

    E.g., to train the shape-conv model on NYU-V2 (40 categories) with 4 GPUs, run:

    ./tools/dist_train.sh configs/nyu/nyu40_deeplabv3plus_resnext101_shape.py 4
    2. Non-distributed training:
    python tools/train.py config_path
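The two launch modes above can be wrapped in a small Python launcher when training is started from another script. This is a sketch under assumptions: `build_train_command` is a hypothetical helper, and it only assembles the argument list from the commands shown above; it does not start training itself.

```python
def build_train_command(config_path, gpu_num=None):
    """Return the argv list for distributed or non-distributed training.

    If gpu_num is given, use ./tools/dist_train.sh; otherwise fall back to
    the single-process entry point tools/train.py.
    """
    if gpu_num is not None:
        if gpu_num < 1:
            raise ValueError("gpu_num must be at least 1")
        return ["./tools/dist_train.sh", config_path, str(gpu_num)]
    return ["python", "tools/train.py", config_path]


config = "configs/nyu/nyu40_deeplabv3plus_resnext101_shape.py"
distributed_cmd = build_train_command(config, gpu_num=4)
single_cmd = build_train_command(config)

# To actually launch from the repository root, one could pass the list to
# subprocess.run(distributed_cmd, check=True).
```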

Result

For more results and pre-trained models, please see the model zoo.

NYU-V2 (40 categories)

Architecture   Backbone     MS & Flip  Shape Conv  mIOU
DeepLabv3plus  ResNeXt-101  False      False       48.9%
DeepLabv3plus  ResNeXt-101  False      True        50.2%
DeepLabv3plus  ResNeXt-101  True       False       50.3%
DeepLabv3plus  ResNeXt-101  True       True        51.3%

SUN-RGBD

Architecture   Backbone    MS & Flip  Shape Conv  mIOU
DeepLabv3plus  ResNet-101  False      False       46.9%
DeepLabv3plus  ResNet-101  False      True        47.6%
DeepLabv3plus  ResNet-101  True       False       47.6%
DeepLabv3plus  ResNet-101  True       True        48.6%

SID (Stanford Indoor Dataset)

Architecture   Backbone    MS & Flip  Shape Conv  mIOU
DeepLabv3plus  ResNet-101  False      False       54.55%
DeepLabv3plus  ResNet-101  False      True        60.6%
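The gain attributable to Shape Conv can be read off the three tables by pairing rows that share the same MS & Flip setting. The short script below does that arithmetic using only the mIOU values reported above; the dictionary layout and the function name are illustrative.

```python
# (MS & Flip, mIOU without Shape Conv, mIOU with Shape Conv), from the tables above
results = {
    "NYU-V2 (40 categories)": [(False, 48.9, 50.2), (True, 50.3, 51.3)],
    "SUN-RGBD": [(False, 46.9, 47.6), (True, 47.6, 48.6)],
    "SID": [(False, 54.55, 60.6)],
}


def shape_conv_gains(table):
    """Return, per dataset, the mIOU gain in percentage points for each setting."""
    return {
        dataset: [(ms_flip, round(with_sc - without_sc, 2))
                  for ms_flip, without_sc, with_sc in rows]
        for dataset, rows in table.items()
    }


gains = shape_conv_gains(results)
for dataset, rows in gains.items():
    for ms_flip, gain in rows:
        print("{}: MS & Flip={} -> +{} points".format(dataset, ms_flip, gain))
```

The differences are +1.3 and +1.0 points on NYU-V2, +0.7 and +1.0 points on SUN-RGBD, and +6.05 points on SID.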

Citation

If you find this repo useful, please consider citing:

@article{cao2021shapeconv,
  title={ShapeConv: Shape-aware Convolutional Layer for Indoor RGB-D Semantic Segmentation},
  author={Cao, Jinming and Leng, Hanchao and Lischinski, Dani and Cohen-Or, Danny and Tu, Changhe and Li, Yangyan},
  journal={arXiv preprint arXiv:2108.10528},
  year={2021}
}

Acknowledgments

This repository is heavily based on vedaseg.

shapeconv's Issues

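About HHA for the SUN and SID datasets

Hello, could you share links to the HHA data for the SUN and SID datasets? Alternatively, how did you convert depth to HHA? Does the conversion require any parameters?
Thank you!

About the SID dataset

Hello, when preparing labels for the SID dataset I could not find the file semantic_labels.json; the annotations are three-channel segmentation maps. Could you provide this file? Thank you very much!
@hanchaoleng

Code for the "trimap" visualization and computation in the paper

Could you provide the code that computes the number of misclassified pixels within a narrow band ("trimap") surrounding ground-truth object boundaries,
and the code that visualizes this band?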

Suggest to loosen the dependency on albumentations

Hi, your project ShapeConv (commit id: 25bee65) pins "albumentations==0.4.1" in its dependencies. After analyzing the source code, we found that albumentations 0.4.0 is also suitable: none of the functions you use from the package have changed between these versions, whether directly (6 APIs: albumentations.augmentations.functional.scale, albumentations.core.transforms_interface.to_tuple, albumentations.augmentations.functional.pad_with_params, albumentations.core.transforms_interface.DualTransform.__init__, albumentations.core.composition.Compose.__init__, albumentations.augmentations.transforms.PadIfNeeded.__init__) or indirectly (through 14 of albumentations's internal APIs and 2 external APIs), so your usage is unaffected.

Therefore, we believe it is quite safe to loosen your dependency on albumentations from "albumentations==0.4.1" to "albumentations>=0.4.0,<=0.4.1". This would improve the applicability of ShapeConv and reduce the chance of dependency conflicts with other projects.

May I open a pull request to loosen the dependency on albumentations?

By the way, could you tell us whether such an automatic dependency-analysis tool would help make maintaining dependencies easier during your development?

Thank you very much for your work; I am very interested in it. Could you explain the meaning of the following statement, on line 56 of the script test_runner.py?

pred_rgb[label == 255] = np.array((0, 0, 0))
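For readers with the same question: the statement is a NumPy boolean-mask assignment, which overwrites every pixel of the predicted colour image whose label equals 255 with black, (0, 0, 0). The sketch below spells out the same operation in plain Python on nested lists. That 255 marks ignored or unlabeled pixels is an assumption based on common segmentation practice, not something stated on this page, and the function name is hypothetical.

```python
def black_out_label(pred_rgb, label, ignore_value=255):
    """Return a copy of pred_rgb with (0, 0, 0) wherever label == ignore_value.

    pred_rgb is a list of rows of (r, g, b) tuples; label is a list of rows
    of integer class IDs with the same height and width.
    """
    return [
        [(0, 0, 0) if lab == ignore_value else pixel
         for pixel, lab in zip(rgb_row, label_row)]
        for rgb_row, label_row in zip(pred_rgb, label)
    ]


pred = [[(10, 20, 30), (40, 50, 60)],
        [(70, 80, 90), (1, 2, 3)]]
labels = [[0, 255],
          [255, 7]]
masked = black_out_label(pred, labels)
```

In the NumPy version the comparison `label == 255` builds the mask for the whole image at once, and the assignment broadcasts the three-element black colour to every selected pixel.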

SUN RGBD preprocessing

Hi, thanks for your valuable work!

Could you please provide the scripts for SUN RGBD dataset preprocessing like NYU and SID datasets?

ModuleNotFoundError: No module named 'torch._six' - Google Colab

Traceback (most recent call last):
  File "/content/ShapeConv/tools/test.py", line 7, in <module>
    from rgbd_seg.runners import TestRunner
  File "/content/ShapeConv/tools/../rgbd_seg/runners/__init__.py", line 1, in <module>
    from .inference_runner import InferenceRunner
  File "/content/ShapeConv/tools/../rgbd_seg/runners/inference_runner.py", line 3, in <module>
    from ..models import build_model
  File "/content/ShapeConv/tools/../rgbd_seg/models/__init__.py", line 1, in <module>
    from .builder import build_model
  File "/content/ShapeConv/tools/../rgbd_seg/models/builder.py", line 5, in <module>
    from rgbd_seg.models.decoders import build_brick
  File "/content/ShapeConv/tools/../rgbd_seg/models/decoders/__init__.py", line 1, in <module>
    from .bricks import FusionBlock, JunctionBlock
  File "/content/ShapeConv/tools/../rgbd_seg/models/decoders/bricks.py", line 8, in <module>
    from ..utils import ConvModule, build_module
  File "/content/ShapeConv/tools/../rgbd_seg/models/utils/__init__.py", line 2, in <module>
    from .conv_module import ConvModule, ConvModules
  File "/content/ShapeConv/tools/../rgbd_seg/models/utils/conv_module.py", line 10, in <module>
    from rgbd_seg.models.utils.shape_conv import ShapeConv2d
  File "/content/ShapeConv/tools/../rgbd_seg/models/utils/shape_conv.py", line 9, in <module>
    from torch._six import container_abcs
ModuleNotFoundError: No module named 'torch._six'
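This error appears when the installed PyTorch no longer ships the private torch._six module, which is the case for recent releases such as the ones preinstalled on Google Colab. A minimal compatibility sketch follows, assuming shape_conv.py needs only container_abcs from that module (the traceback shows just that one import). It falls back to the standard library's collections.abc, which is what container_abcs referred to on Python 3. The helper `as_pair` is hypothetical and only demonstrates that the fallback behaves the same way.

```python
try:
    # Works on older PyTorch releases that still provide this private module.
    from torch._six import container_abcs
except ImportError:
    # torch._six (or the name inside it) is gone, or PyTorch is not installed:
    # use the standard-library module instead.
    import collections.abc as container_abcs


def as_pair(value):
    """Hypothetical helper: expand 3 to (3, 3), pass iterables through as tuples."""
    if isinstance(value, container_abcs.Iterable):
        return tuple(value)
    return (value, value)


kernel_size = as_pair(3)
stride = as_pair([1, 2])
```

Applied to the repository, the same try/except would replace the single import on line 9 of rgbd_seg/models/utils/shape_conv.py; pinning PyTorch to the tested 1.7.0 release is the alternative.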

Scales for Test

Hello,

Thanks for the great work !

I am confused about the scales in the configs. Why do you use multiple scales for testing instead of scale = 1?

For example, the code in configs/nyu/nyu40:
tta=dict(
    scales=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],

Could you provide some explanation? Thanks
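As background for this question: a `scales` list under `tta` usually configures multi-scale test-time augmentation, where the same image is evaluated at several sizes (optionally also mirrored) and the per-class scores are fused, which matches the "MS & Flip" column in the result tables. The sketch below illustrates the general idea only. Fusing by simple averaging, the 480x640 input size, and all function names are assumptions, not details confirmed by this repository.

```python
def tta_variants(scales, flip=True):
    """List every (scale, flipped) combination that would be evaluated."""
    variants = []
    for scale in scales:
        variants.append((scale, False))
        if flip:
            variants.append((scale, True))
    return variants


def scaled_size(height, width, scale):
    """Image size after resizing by the given scale factor."""
    return int(height * scale + 0.5), int(width * scale + 0.5)


def average_scores(per_variant_scores):
    """Average one pixel's per-class scores over all variants."""
    count = len(per_variant_scores)
    return [sum(column) / count for column in zip(*per_variant_scores)]


scales = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]  # from the config excerpt above
variants = tta_variants(scales, flip=True)
sizes = [scaled_size(480, 640, s) for s in scales]  # 480x640 is an example size

# Two variants disagree on a pixel; averaging weighs both predictions equally.
fused = average_scores([[0.25, 0.75], [0.75, 0.25]])
```

Setting `scales=[1.0]` without flipping would correspond to the single-scale rows (MS & Flip = False) in the tables.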
