
swin3d_task's Introduction

Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding



To be Done

  1. Release the whole training scripts with CAGroup3D+Swin3D
  2. Upload the models and configs for FCAF3D+Swin3D
  3. Upload the models and configs for CAGroup3D+Swin3D


Add Object Detection code:

  1. Update Object Detection code and configs with FCAF3D+Swin3D
  2. Update patch for CAGroup3D+Swin3D


Initial commits:

  1. The supported code and models for Semantic Segmentation on ScanNet and S3DIS are provided.


This repo contains the experiment code for Swin3D



  1. Install dependencies

       pip install -r requirements.txt
  2. Refer to this repo to compile the Swin3D operators

       git clone
       cd Swin3D
       python install

If you have problems installing the package, you can use the docker we provide:

  docker pull yukichiii/torch112_cu113:swin3d

To run the object detection code, please refer to FCAF3D (which is based on mmdetection3d) and CAGroup3D (which is based on OpenPCDet). Install the requirements for mmdetection3d, then run python install to install mmdetection3d.

Data Preparation

ScanNet Segmentation Data

Please refer to for the ScanNetv2 preprocessing. Then change the data_root entry in the yaml files in SemanticSeg/config/scannetv2.

S3DIS Segmentation Data

Please refer to for S3DIS preprocessing. Then modify the data_root entry in the yaml files in SemanticSeg/config/s3dis.
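The data_root edits described above can be scripted. The following is a minimal sketch, assuming each YAML file holds a single-line `data_root: ...` entry; `set_data_root` is a hypothetical helper, not part of this repo, and the paths in the demo are placeholders. It uses a plain regular expression rather than a YAML parser, so any trailing comment on that line is overwritten.

```python
import re
import tempfile
from pathlib import Path


def set_data_root(config_dir, new_root):
    """Rewrite every `data_root:` line in the YAML files under config_dir.

    Hypothetical helper: returns the names of the files that were changed.
    """
    pattern = re.compile(r"^([ \t]*data_root[ \t]*:[ \t]*).*$", re.MULTILINE)
    changed = []
    for path in sorted(Path(config_dir).glob("*.yaml")):
        text = path.read_text()
        # A function replacement avoids backslash handling in new_root.
        new_text = pattern.sub(lambda m: m.group(1) + new_root, text)
        if new_text != text:
            path.write_text(new_text)
            changed.append(path.name)
    return changed


# Demonstrate on a throwaway file rather than on the real configs.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.yaml"
    demo.write_text("DATA:\n  data_name: s3dis\n  data_root: /old/path\n")
    print(set_data_root(tmp, "/data/s3dis"))
    print(demo.read_text())
```

To apply it to the real configs, pass a directory such as SemanticSeg/config/s3dis and review the changes before training.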

ScanNet 3D Detection Data

Please refer to for ScanNet preprocessing. Then modify the data_root entry in the config files in ObjectDet/FCAF3D/configs/scannet_det.

S3DIS 3D Detection Data

Please refer to for S3DIS preprocessing. Then modify the data_root entry in the config files in ObjectDet/FCAF3D/configs/s3dis_det.


ScanNet Segmentation

Change the working directory to SemanticSeg:

  cd SemanticSeg

To train a model on the ScanNet segmentation task with Swin3D-S or Swin3D-L from scratch:

  python --config config/scannetv2/swin3D_RGBN_S.yaml
  python --config config/scannetv2/swin3D_RGBN_L.yaml

To finetune the model pretrained on Structured3D, you can download the pretrained model with cRSE(XYZ,RGB,Norm) here, and run:

  python --config config/scannetv2/swin3D_RGBN_S.yaml args.weight PATH_TO_PRETRAINED_SWIN3D_RGBN_S
  python --config config/scannetv2/swin3D_RGBN_L.yaml args.weight PATH_TO_PRETRAINED_SWIN3D_RGBN_L

S3DIS Segmentation

Change the working directory to SemanticSeg:

  cd SemanticSeg

To train a model on S3DIS Area 5 segmentation with Swin3D-S or Swin3D-L from scratch:

  python --config config/s3dis/swin3D_RGB_S.yaml
  python --config config/s3dis/swin3D_RGB_L.yaml

To finetune the model pretrained on Structured3D, you can download the pretrained model with cRSE(XYZ,RGB) here, and run:

  python --config config/s3dis/swin3D_RGB_S.yaml args.weight PATH_TO_PRETRAINED_SWIN3D_RGB_S
  python --config config/s3dis/swin3D_RGB_L.yaml args.weight PATH_TO_PRETRAINED_SWIN3D_RGB_L

3D Object Detection

To train from scratch with FCAF3D+Swin3D:

  python -m tools.train configs/scannet_det/

To finetune the model pretrained on Structured3D, you can download the pretrained model with cRSE(XYZ,RGB), and run:

  python -m tools.train configs/scannet_det/ --load_weights PATH_TO_PRETRAINED_SWIN3D_RGB_S
  python -m tools.train configs/scannet_det/ --load_weights PATH_TO_PRETRAINED_SWIN3D_RGB_L


To run inference with a given Swin3D checkpoint using TTA (test-time augmentation: the input scan is randomly rotated and the results are voted), you can download a model below and run:

ScanNet Segmentation

  python --config config/scannetv2/swin3D_RGBN_S.yaml --vote_num 12 args.weight PATH_TO_CKPT
  python --config config/scannetv2/swin3D_RGBN_L.yaml --vote_num 12 args.weight PATH_TO_CKPT

S3DIS Area5 Segmentation

  python --config config/s3dis/swin3D_RGB_S.yaml --vote_num 12 args.weight PATH_TO_CKPT
  python --config config/s3dis/swin3D_RGB_L.yaml --vote_num 12 args.weight PATH_TO_CKPT

For faster inference, you can set vote_num to 1.
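The rotate-and-vote idea behind vote_num can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the repo's implementation: rotation about the vertical z axis, score accumulation followed by an argmax, and the names `rotate_z`, `vote_predictions` and `toy_predict` are all hypothetical. `predict_fn` stands in for the network.

```python
import math
import random


def rotate_z(points, angle):
    """Rotate (x, y, z) points about the vertical z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def vote_predictions(points, predict_fn, vote_num=12, seed=0):
    """Accumulate per-point class scores over `vote_num` randomly rotated copies.

    `predict_fn` is a hypothetical stand-in for the network: it takes a list
    of points and returns one list of class scores per point.
    """
    rng = random.Random(seed)
    totals = None
    for _ in range(vote_num):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        scores = predict_fn(rotate_z(points, angle))
        if totals is None:
            totals = [list(row) for row in scores]
        else:
            for acc, row in zip(totals, scores):
                for k, value in enumerate(row):
                    acc[k] += value
    # The class with the highest accumulated score wins the vote.
    return [max(range(len(row)), key=row.__getitem__) for row in totals]


def toy_predict(points):
    """Toy classifier: class 1 if the point lies above z = 1, else class 0."""
    return [[0.0, 1.0] if z > 1.0 else [1.0, 0.0] for _, _, z in points]


labels = vote_predictions([(1.0, 0.0, 0.5), (0.0, 2.0, 1.5)], toy_predict)
print(labels)  # [0, 1]
```

With vote_num=1 the loop runs a single forward pass, which is why lowering it trades accuracy from voting for speed.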

3D Object Detection

For Detection task with FCAF3D+Swin3D:

  python -m tools.test configs/scannet_det/ CHECKPOINT_PATH --eval mAP --show-dir OUTPUT_PATH --out OUTPUT_PATH/result.pkl

Results and models

ScanNet Segmentation

          Pretrained  mIoU(Val)   mIoU(Test)  Model  Train  Eval
Swin3D-S              75.2        -           model  log    log
Swin3D-S              75.6(76.8)  -           model  log    log
Swin3D-L              76.4(77.5)  77.9        model  log    log

S3DIS Segmentation

          Pretrained  Area 5 mIoU  6-fold mIoU  Model  Train  Eval
Swin3D-S              72.5         76.9         model  log    log
Swin3D-S              73.0         78.2         model  log    log
Swin3D-L              74.5         79.8         model  log    log

ScanNet 3D Detection

                    Pretrained  mAP@0.25  mAP@0.5  Model  Log
Swin3D-S+FCAF3D                 74.2      59.5     model  log
Swin3D-L+FCAF3D                 74.2      58.6     model  log
Swin3D-S+CAGroup3D              76.4      62.7     model  log
Swin3D-L+CAGroup3D              76.4      63.2     model  log

S3DIS 3D Detection

                    Pretrained  mAP@0.25  mAP@0.5  Model  Log
Swin3D-S+FCAF3D                 69.9      50.2     model  log
Swin3D-L+FCAF3D                 72.1      54.0     model  log


If you find Swin3D useful to your research, please cite our work:

      title={Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding}, 
      author={Yu-Qi Yang and Yu-Xiao Guo and Jian-Yu Xiong and Yang Liu and Hao Pan and Peng-Shuai Wang and Xin Tong and Baining Guo},




swin3d_task's Issues

Got error when training with S3DIS: "RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasGemmEx`"

Thank you for sharing a promising model. I'm now testing this repo with S3DIS following the instructions in the repo, but I got the following error. It seems to be caused by a wrong input dimension. Do you have any idea how to solve this?

[09/09 09:28:13 main-logger]: #Model parameters: 26495773
[09/09 09:28:14 main-logger]: => no checkpoint found at 'runs/s3dis_Swin3D_RGB_S_123/model/model_last.pth'
[09/09 09:28:14 main-logger]: augmentation all
[09/09 09:28:14 main-logger]: jitter_sigma: 0.005, jitter_clip: 0.02
Totally 204 samples in train set.
Total repeated 204 samples in train set.
[09/09 09:28:14 main-logger]: train_data samples: '1224'
/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/torch/utils/data/ UserWarning: This DataLoader will create 16 worker processes in total. Our suggested max number of worker in current system is 8, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
Totally 67 samples in val set.
Total repeated 67 samples in val set.
[09/09 09:28:14 main-logger]: scheduler: MultiStep. scheduler_update: epoch. milestones: [60, 80], gamma: 0.1
[09/09 09:28:14 main-logger]: lr: [0.0006, 5.9999999999999995e-05]
feat: torch.Size([80000, 3])
/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/torch/ UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1659484683044/work/aten/src/ATen/native/TensorShape.cpp:2894.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/Swin3D-0.0.0-py3.8-linux-x86_64.egg/Swin3D/modules/ UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
w_w_id // window_size // window_size,
/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/Swin3D-0.0.0-py3.8-linux-x86_64.egg/Swin3D/modules/ UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
w_w_id // window_size % window_size,
feats.shape: torch.Size([80000, 48])
Traceback (most recent call last):
File "SemanticSeg/", line 916, in
File "SemanticSeg/", line 114, in main
main_worker(args.train_gpu, args.ngpus_per_node, args)
File "SemanticSeg/", line 510, in main_worker
loss_train, mIoU_train, mAcc_train, allAcc_train = train(
File "SemanticSeg/", line 606, in train
output = model(feat, coord, batch)
File "/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/torch/nn/modules/", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/Swin3D/SemanticSeg/model/", line 70, in forward
return self.backbone(sp, coords_sp)
File "/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/torch/nn/modules/", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/Swin3D-0.0.0-py3.8-linux-x86_64.egg/Swin3D/models/", line 140, in forward
sp, sp_down, coords_sp = layer(sp, coords_sp)
File "/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/torch/nn/modules/", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/Swin3D-0.0.0-py3.8-linux-x86_64.egg/Swin3D/modules/", line 866, in forward
feats = blk(feats, attn_args_blk) # [N, C]
File "/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/torch/nn/modules/", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/Swin3D-0.0.0-py3.8-linux-x86_64.egg/Swin3D/modules/", line 622, in forward
feats = self.attn(feats, attn_args) # [N, c]
File "/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/torch/nn/modules/", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/Swin3D-0.0.0-py3.8-linux-x86_64.egg/Swin3D/modules/", line 504, in forward
qkv = self.qkv(feats)
File "/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/torch/nn/modules/", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/miniconda3/envs/test/lib/python3.8/site-packages/torch/nn/modules/", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)

As far as I can tell, the Window Attention in the first block seems to expect 48-dimensional data, but it receives 80000-dimensional data.
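One thing worth checking before suspecting the input dimension: the logged shape torch.Size([80000, 48]) follows the [N, C] layout that the traceback's own comments use for feats, and torch.nn.Linear is applied over the last dimension only. A minimal pure-Python sketch of that check follows; `linear_accepts` is a hypothetical helper, and 48 is assumed to be the input width of the qkv layer, as the report suggests.

```python
def linear_accepts(shape, in_features):
    """Return True if a linear layer with `in_features` inputs can consume a
    tensor of this shape; only the last dimension has to match."""
    return len(shape) >= 1 and shape[-1] == in_features


feats_shape = (80000, 48)  # value printed in the log as feats.shape
print(linear_accepts(feats_shape, 48))   # True: 80000 rows of 48 features
print(linear_accepts((80000, 3), 48))    # False: raw 3-channel input
```

If the check passes, as it does for the logged shape, the cuBLAS failure may come from somewhere other than the feature width, for example the half-precision GEMM path named in the error message (CUDA_R_16F).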

NaN values produced during training

Hi, thank you for sharing the code!

During training on the ScanNet dataset, I noticed that NaN values can be produced in the forward call of the Swin3DUnet module: model(sp, coords_sp). A similar issue has been reported on this page. I tried setting fp16_mode=0 and use_amp=False, but the NaN values persist.

Upon further investigation, I have identified a potential source of these NaN values, which appears to be in the self_attn_cuda_forward_device() function within the file. I tried to learn how to debug CUDA+python files referring to some documents, but it somehow didn't work out as intended.

I'd greatly appreciate it if you could shed some light on why these 'NaN' values are occurring and provide guidance on effectively debugging CUDA files.

Thank you and best regards.

Compile error

When I execute python install, some errors occur:

ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/", line 1717, in _run_ninja_build
  File "/opt/conda/lib/python3.8/", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "", line 13, in <module>
  File "/opt/conda/lib/python3.8/site-packages/setuptools/", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/opt/conda/lib/python3.8/distutils/", line 148, in setup
  File "/opt/conda/lib/python3.8/distutils/", line 966, in run_commands
  File "/opt/conda/lib/python3.8/distutils/", line 985, in run_command
  File "/opt/conda/lib/python3.8/site-packages/setuptools/command/", line 67, in run
  File "/opt/conda/lib/python3.8/site-packages/setuptools/command/", line 109, in do_egg_install
  File "/opt/conda/lib/python3.8/distutils/", line 313, in run_command
  File "/opt/conda/lib/python3.8/distutils/", line 985, in run_command
  File "/opt/conda/lib/python3.8/site-packages/setuptools/command/", line 167, in run
    cmd = self.call_command('install_lib', warn_dir=0)
  File "/opt/conda/lib/python3.8/site-packages/setuptools/command/", line 153, in call_command
  File "/opt/conda/lib/python3.8/distutils/", line 313, in run_command
  File "/opt/conda/lib/python3.8/distutils/", line 985, in run_command
  File "/opt/conda/lib/python3.8/site-packages/setuptools/command/", line 11, in run
  File "/opt/conda/lib/python3.8/distutils/command/", line 107, in build
  File "/opt/conda/lib/python3.8/distutils/", line 313, in run_command
  File "/opt/conda/lib/python3.8/distutils/", line 985, in run_command
  File "/opt/conda/lib/python3.8/site-packages/setuptools/command/", line 79, in run
  File "/opt/conda/lib/python3.8/site-packages/Cython/Distutils/", line 186, in run
  File "/opt/conda/lib/python3.8/distutils/command/", line 340, in run
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/", line 735, in build_extensions
  File "/opt/conda/lib/python3.8/site-packages/Cython/Distutils/", line 194, in build_extensions
  File "/opt/conda/lib/python3.8/site-packages/setuptools/command/", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "/opt/conda/lib/python3.8/distutils/command/", line 528, in build_extension
    objects = self.compiler.compile(sources,
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/", line 556, in unix_wrap_ninja_compile
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/", line 1399, in _write_ninja_file_and_compile_objects
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/", line 1733, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

Here is one of the outputs:

[6/6] /usr/local/cuda/bin/nvcc  -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/
site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH 
-I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include 
-I/opt/conda/include/python3.8 -c -c /opt/data/private/Swin3D_Task-main/Swin3D/Swin3D/src/attn/ 
-o /opt/data/private/Swin3D_Task-main/Swin3D/build/temp.linux-x86_64-3.8/Swin3D/src/attn/self_attn_aio_bwd.o 
-D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 
-gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 
-gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 
-gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 
-gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++14
FAILED: /opt/data/private/Swin3D_Task-main/Swin3D/build/temp.linux-x86_64-3.8/Swin3D/src/attn/self_attn_aio_bwd.o 
/usr/local/cuda/bin/nvcc  -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/
-I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC 
-I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c 
-c /opt/data/private/Swin3D_Task-main/Swin3D/Swin3D/src/attn/ 
-o /opt/data/private/Swin3D_Task-main/Swin3D/build/temp.linux-x86_64-3.8/Swin3D/src/attn/self_attn_aio_bwd.o 
-D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 
-gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 
-gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 
-gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 
-gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++14
/opt/data/private/Swin3D_Task-main/Swin3D/Swin3D/src/attn/attn_utils.cuh(95): error: no operator "+=" matches these operands
            operand types are: __half2 += const __half2

/opt/data/private/Swin3D_Task-main/Swin3D/Swin3D/src/attn/attn_utils.cuh(96): error: no operator "+=" matches these operands
            operand types are: __half2 += const __half2

/opt/data/private/Swin3D_Task-main/Swin3D/Swin3D/src/attn/attn_utils.cuh(97): error: no operator "+=" matches these operands
            operand types are: __half2 += const __half2

/opt/data/private/Swin3D_Task-main/Swin3D/Swin3D/src/attn/attn_utils.cuh(106): error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (__half2 *, const __half2)
... ...
54 errors detected in the compilation of "/opt/data/private/Swin3D_Task-main/Swin3D/Swin3D/src/attn/".

How can I fix this? Thanks for your time!
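Two details in the nvcc command above may be related to these errors. The build defines `-D__CUDA_NO_HALF2_OPERATORS__`, which hides the built-in `__half2` operators such as `+=`. It also targets compute_52, while `atomicAdd` on `__half2` is only available from compute capability 6.0 upward. The following is a minimal diagnostic sketch that extracts both facts from a command line. `inspect_nvcc_command` is a hypothetical helper, and the sample command is shortened from the log.

```python
import re


def inspect_nvcc_command(command):
    """Summarise the parts of an nvcc command line relevant to __half2 errors."""
    # Collect the numeric target of every -gencode flag, e.g. 52 from compute_52.
    archs = sorted({int(a) for a in re.findall(r"arch=compute_(\d+)", command)})
    return {
        "archs": archs,
        # Targets below compute capability 6.0 that the build also compiles for.
        "below_sm60": [a for a in archs if a < 60],
        # This define disables the built-in __half2 operators such as +=.
        "half2_operators_disabled": "-D__CUDA_NO_HALF2_OPERATORS__" in command,
    }


command = (
    "nvcc -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -O2 "
    "-gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 "
    "-gencode=arch=compute_86,code=sm_86 -std=c++14"
)
print(inspect_nvcc_command(command))
```

If the target GPU has a newer compute capability, restricting the build with the TORCH_CUDA_ARCH_LIST environment variable (read by torch.utils.cpp_extension) avoids compiling for 5.2. Whether that resolves all 54 errors in this report is untested.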

code and model for 3D detection task

Hi, thanks for your great work!
I am following your work to run some experiments on indoor scene 3D detection, so I'm wondering when you will release the example code and models for the 3D detection task based on Swin3D.
I really appreciate your help and look forward to your reply.

Fine-tuning on XYZ,NORM only

Thanks for providing this work!

As per the title, I am curious whether it is possible to fine-tune with your provided weights on XYZ and NORM only, as my data has no RGB available. It seems the power of Swin3D also lies in the pretraining, so I want to utilize that.
