swin3d's Introduction

Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding

Updates

27/04/2023

Initial commits:

  1. Pretrained models on Structured3D are provided.
  2. The code supporting semantic segmentation on ScanNet and S3DIS is provided.

Introduction

We present a pretrained 3D backbone, named Swin3D, that for the first time outperforms all state-of-the-art methods on downstream 3D indoor scene understanding tasks. Our backbone network is based on a 3D Swin Transformer and is carefully designed to perform self-attention on sparse voxels efficiently, with linear memory complexity, and to capture the irregularity of point signals via a generalized contextual relative positional embedding. Based on this backbone design, we pretrained a large Swin3D model on the synthetic Structured3D dataset, which is 10 times larger than ScanNet, and fine-tuned the pretrained model on various downstream real-world indoor scene understanding tasks.

(teaser figure)

Overview

Data Preparation

We pretrained Swin3D on Structured3D; please refer to this link to prepare the data.

Pretrained Models

The models pretrained on Structured3D with different cRSE are provided here.

|          | Pretrain     | #params | cRSE         | mIoU(val) | Model | Log |
| -------- | ------------ | ------- | ------------ | --------- | ----- | --- |
| Swin3D-S | Structured3D | 23.57M  | XYZ,RGB      | 77.69     | model | log |
| Swin3D-S | Structured3D | 23.57M  | XYZ,RGB,NORM | 79.15     | model | log |
| Swin3D-L | Structured3D | 60.75M  | XYZ,RGB      | 79.79     | model | log |
| Swin3D-L | Structured3D | 60.75M  | XYZ,RGB,NORM | 81.04     | model | log |

Quick Start

Install the package with:

```bash
pip install -r requirements.txt
python setup.py install
```

Build the model and load our pretrained weights; then you can fine-tune the model on various tasks.

```python
import torch
from Swin3D.models import Swin3DUNet

model = Swin3DUNet(
    depths, channels, num_heads,
    window_sizes, quant_size, up_k=up_k,
    drop_path_rate=drop_path_rate, num_classes=num_classes,
    num_layers=num_layers, stem_transformer=stem_transformer,
    upsample=upsample, first_down_stride=down_stride,
    knn_down=knn_down, in_channels=in_channels,
    cRSE='XYZ_RGB_NORM', fp16_mode=1,
)
model.load_pretrained_model(ckpt_path)
```
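
The constructor arguments above come from the config files of the downstream task repo. For orientation only, here is a sketch with purely illustrative placeholder values (every value below is a hypothetical stand-in, not a released config):

```python
# All values are hypothetical placeholders -- take the real ones from the
# task repo's config matching the checkpoint you load.
depths = [2, 4, 9, 4, 4]            # transformer blocks per stage
channels = [48, 96, 192, 384, 384]  # feature width per stage
num_heads = [6, 6, 12, 24, 24]      # attention heads per stage
window_sizes = [5, 7, 7, 7, 7]      # sparse window size per stage
quant_size = 4                      # quantization granularity for cRSE
up_k = 3                            # k for KNN-based upsampling
drop_path_rate = 0.3                # stochastic depth rate
num_classes = 13                    # e.g. S3DIS has 13 classes
num_layers = 5                      # number of stages
stem_transformer = True             # use a transformer stem
upsample = "linear"                 # upsampling mode (hypothetical name)
down_stride = 2                     # stride of the first downsampling
knn_down = True                     # GridKNNDownsample vs. GridDownsample
in_channels = 9                     # e.g. XYZ + RGB + normals
ckpt_path = "Swin3D-L.pth"          # path to the downloaded checkpoint
```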

Results and Models

To reproduce our results on downstream tasks, please follow the code in this repo. The results are provided here.

ScanNet Segmentation

|          | Pretrained | mIoU(Val)  | mIoU(Test) |
| -------- | ---------- | ---------- | ---------- |
| Swin3D-S | ✗          | 75.2       | -          |
| Swin3D-S | ✓          | 75.6(76.8) | -          |
| Swin3D-L | ✓          | 76.2(77.5) | 77.9       |

S3DIS Segmentation

|          | Pretrained | Area 5 mIoU | 6-fold mIoU |
| -------- | ---------- | ----------- | ----------- |
| Swin3D-S | ✗          | 72.5        | 76.9        |
| Swin3D-S | ✓          | 73.0        | 78.2        |
| Swin3D-L | ✓          | 74.5        | 79.8        |

ScanNet 3D Detection

|                    | Pretrained | mAP@0.25 | mAP@0.50 |
| ------------------ | ---------- | -------- | -------- |
| Swin3D-S+FCAF3D    | ✓          | 74.2     | 59.5     |
| Swin3D-L+FCAF3D    | ✓          | 74.2     | 58.6     |
| Swin3D-S+CAGroup3D | ✓          | 76.4     | 62.7     |
| Swin3D-L+CAGroup3D | ✓          | 76.4     | 63.2     |

S3DIS 3D Detection

|                 | Pretrained | mAP@0.25 | mAP@0.50 |
| --------------- | ---------- | -------- | -------- |
| Swin3D-S+FCAF3D | ✓          | 69.9     | 50.2     |
| Swin3D-L+FCAF3D | ✓          | 72.1     | 54.0     |

Citation

If you find Swin3D useful to your research, please cite our work:

```bibtex
@misc{yang2023swin3d,
      title={Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding},
      author={Yu-Qi Yang and Yu-Xiao Guo and Jian-Yu Xiong and Yang Liu and Hao Pan and Peng-Shuai Wang and Xin Tong and Baining Guo},
      year={2023},
      eprint={2304.06906},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```


swin3d's Issues

How do I run this network with my own data?

Dear all, thank you for publishing this work.
I am curious how to use this network on my own data. For example, say I have a point cloud of chairs with no ground truth; I just want to run an inference script that gives me a list of segmented objects.
What would be the best way to proceed with your network? Any help would be very much appreciated.
Thank you for the contribution!

Low Result without pretrained pth

@Yukichiii I applied epochs = 100 // loop = 30 and eval_freq = 2. On Area 5 I get:

  • Swin3D-S: 69.76
  • Swin3D-L: 69.79

Is it possible to provide the log? For the tests, do you use num_vote = 12? For training, did you change any parameters relative to the code at https://github.com/Yukichiii/Swin3D_Task?
Is the result in your paper, Swin3D-S | ✗ | 72.5 (without pretrained weights), obtained with RGB or RGB + normals?
Thanks.

Originally posted by @hpc100 in #16 (comment)

S3DIS results

Thanks for sharing your work! Is it possible to provide the Swin3D-L (without pretraining) results and the log for Swin3D-S (without pretraining)?
I ran these two experiments with 4 GPUs and 100 epochs, and I get:

  • for Swin3D-S (without pretraining): mIoU = 68.0 on Area 5 validation (4 points lower than your result of 72.5)
  • for Swin3D-L (without pretraining): mIoU = 69.4 on Area 5 validation

For both models pretrained on Structured3D, I reproduce your performance: 73.0 for Swin3D-S and 74.5 for Swin3D-L.
In the paper you announce 3000 epochs to train S3DIS from scratch. Is the number of epochs 3000 or 100 when training without the weights pretrained on Structured3D?
If the number is different from 100, which hyperparameters did you use (eval_freq = 2 or higher to save time, ...)?
Thanks.

ScanNet v2 - Normals

Thanks for providing this work!

Why use Swin3D_RGB_N.py when PointGroup's preprocessing code does not provide normals?
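
If normals are needed but the preprocessing does not produce them, one workable approach (an assumption, not necessarily the authors' pipeline) is to estimate them from the point cloud, e.g. with Open3D:

```python
# Hedged sketch: estimate per-point normals with Open3D when the dataset
# preprocessing does not provide them. The file name is hypothetical.
import numpy as np
import open3d as o3d

xyz = np.load("scene_xyz.npy")                 # (N, 3) point positions
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(xyz)
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamKNN(knn=30)  # plane fit on 30-NN
)
pcd.orient_normals_consistent_tangent_plane(k=30)  # make orientations coherent
normals = np.asarray(pcd.normals)              # (N, 3) unit normals
```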

S3DIS 6-fold

Can you provide the 6-fold evaluation code for S3DIS? I would appreciate it very much!

Feature extraction for point clouds

Hi, thanks for your great work. I am trying to use your Swin3D as a frozen feature extractor for new 3D point clouds, like how people use CLIP/DINOv2 for images.

Could you share some simple scripts or ideas about how we can achieve this with your codebase? That would be very helpful.

model.eval() causing nan values

Thanks for sharing your work!
@Yukichiii @yuxiaoguo I tried to test your code on Semantic3D. In the validation step, I get "nan" values in the output.

  • I checked the point cloud data, and there are no "nan" values in the npy files.
  • I checked the parameters with print('Nan values?', [k for k, v in model.named_parameters() if torch.isnan(v).any()]). There are no nan values in either train (model.train()) or validation (model.eval()) mode.
  • torch.where(torch.isnan(coord)), torch.where(torch.isnan(feat)) and torch.where(torch.isnan(batch)) all return empty results.
  • So: there are nan values neither in the data nor in the weights/biases, yet the output is filled with nan.

Do you have any idea where the problem could come from (layer norm, ...)?

Details about Using Swin3D Backbone with CAGroup in 3D Detection

Hi! I am curious about the training details for the Swin3D encoder with CAGroup3D. Since CAGroup3D uses a repeated ScanNet dataset during training, I am wondering whether your dataset implementation differs. Additionally, when using the 48-dimensional upsampling output as the CAGroup3D input, CUDA memory overflows even on a 24GB 4090 GPU, despite setting the batch size to 1. Could you provide some insights into the training environment for the 3D detection task? Thank you!

Code for fine-tuning on downstream tasks

According to Sec. 5.2 of the paper, it seems that for fine-tuning on the 3D detection task I just need to replace the backbone in the FCAF3D and CAGroup3D code repositories with Swin3D and load the weights. Is that correct?

As for fine-tuning on the semantic segmentation task on S3DIS and ScanNet, the paper doesn't mention which code repository it was developed on. Could you please consider releasing the fine-tuning code?

error

```
Traceback (most recent call last):
  File "***/sparse_dl/attn/attn_coff.py", line 12, in <module>
    import sparse_dl.attn_cuda as attn_module
ModuleNotFoundError: No module named 'sparse_dl.attn_cuda'
```

How to prepare custom data?

Hi, thanks for sharing your great work! I was trying to use the pretrained model to segment my own point cloud data, with only the position and color of each point provided. I expect the model to output a segmentation label for each point. How should I prepare my data? Can I just input the coords and colors of the points into the model?
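
In the meantime, here is a hedged sketch of one plausible way to shape raw points into the (feat, coord, batch) tensors consumed by the task repo's segmentation wrapper (the call output = model(feat, coord, batch) appears in a traceback further down this page; the color scaling, feature layout, and file name here are assumptions that must be matched to the training config):

```python
# Hedged sketch, not the official preprocessing. Every constant below is an
# assumption -- align it with the config used to train the checkpoint.
import numpy as np
import torch

points = np.load("my_scene.npy")                 # (N, 6): xyz + rgb, hypothetical file
coord = torch.from_numpy(points[:, :3]).float()  # continuous point coordinates
rgb = torch.from_numpy(points[:, 3:6]).float() / 127.5 - 1.0  # assumed [-1, 1] scaling
feat = torch.cat([coord, rgb], dim=1)            # per-point input features
batch = torch.zeros(coord.shape[0], dtype=torch.long)  # one scene -> batch id 0

# logits = model(feat, coord, batch)   # (N, num_classes); labels = logits.argmax(1)
```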

Question about Memory-efficient self-attention

Hi, this is nice work on applying the Swin Transformer to point clouds. However, I really don't understand the memory-efficient self-attention:

$$f_{i,h}^{*}=\frac{\sum_{j=1}^{N}\exp(e_{ij,h})\,f_{j}W_{V,h}}{\sum_{j=1}^{N}\exp(e_{ij,h})} \qquad (3)$$

How should I understand the idea of postponing the SoftMax normalization so that the weights $\{\alpha_{ij,h}\}$ never need to be constructed and stored explicitly? Computing the denominator and numerator of Eq. (3) simultaneously is also hard for me to fully understand.

Could you please give me some tips on how to grasp the idea of memory-efficient self-attention?
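
For intuition, here is a minimal dense PyTorch sketch of what Eq. (3) enables: the numerator and denominator are accumulated together in one streaming pass over the keys, so the normalized weights $\{\alpha_{ij,h}\}$ are never materialized. This is only the math, not the repository's sparse CUDA kernel, and it omits the usual max-subtraction for numerical stability:

```python
import torch

def streaming_attention(e, v):
    """Eq. (3) without storing the (N, N) attention matrix.

    e: (N, N) unnormalized logits e_{ij} for one head
    v: (N, C) value vectors f_j @ W_{V,h}
    """
    num = torch.zeros_like(v)             # running numerator,   (N, C)
    den = torch.zeros(v.shape[0], 1)      # running denominator, (N, 1)
    for j in range(v.shape[0]):           # stream over keys/values
        w = torch.exp(e[:, j : j + 1])    # exp(e_{ij}) for every query i
        num += w * v[j : j + 1]           # += exp(e_{ij}) * f_j W_{V,h}
        den += w                          # += exp(e_{ij})
    return num / den                      # SoftMax normalization, postponed to the end

# Sanity check against the standard formulation:
e, v = torch.randn(8, 8), torch.randn(8, 4)
assert torch.allclose(streaming_attention(e, v), torch.softmax(e, dim=1) @ v, atol=1e-5)
```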

MemoryError: std::bad_alloc: cudaErrorMemoryAllocation: out of memory

When I use 2 GPUs to train the S3DIS segmentation, it fails with:
```
Traceback (most recent call last):
  File "train.py", line 919, in <module>
    main()
  File "train.py", line 114, in main
    mp.spawn(
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/data/zhangqi/HZ/Swin3D_Task-main/SemanticSeg/train.py", line 514, in main_worker
    loss_train, mIoU_train, mAcc_train, allAcc_train = train(
  File "/data/zhangqi/HZ/Swin3D_Task-main/SemanticSeg/train.py", line 609, in train
    output = model(feat, coord, batch)
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1008, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 969, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/zhangqi/HZ/Swin3D_Task-main/SemanticSeg/model/Swin3D_RGB.py", line 70, in forward
    return self.backbone(sp, coords_sp)
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/Swin3D-0.0.0-py3.8-linux-x86_64.egg/Swin3D/models/Swin3D.py", line 132, in forward
    sp = self.stem_layer(sp)
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/Swin3D-0.0.0-py3.8-linux-x86_64.egg/Swin3D/modules/mink_layers.py", line 77, in forward
    x = self.conv_layers(x)
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/MinkowskiConvolution.py", line 314, in forward
    outfeat = self.conv.apply(
  File "/home/lthpc/.conda/envs/Swin3d/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/MinkowskiConvolution.py", line 72, in forward
    return fw_fn(
MemoryError: std::bad_alloc: cudaErrorMemoryAllocation: out of memory
```

Does anyone know how to fix it?
I can train on a single GPU by setting train_gpu to [0] in Swin3D_Task-main/SemanticSeg/config/s3dis/swin3D_RGB_L.yaml.

How to apply Swin3D as backbone to detect 3d objects?

Hey! I want to know how to apply Swin3D for 3D object detection. The provided code seems to be designed for semantic segmentation; can it be directly used as a backbone for object detection?

Cannot access a tensor

Hello, I'm facing a pretty weird problem with the Swin3DUNet. I can run my training code for around 20 epochs on a V100 GPU, but eventually I get the error below (I set export CUDA_LAUNCH_BLOCKING=1 when running the code):

  File "/home/chenz0f/Swin-3D/decoder/backbone/swin3dunet.py", line 212, in forward
    sp, sp_down, coords_sp = layer(sp, coords_sp)
  File "/home/chenz0f/anaconda3/envs/v100/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/chenz0f/Swin-3D/decoder/backbone/modules/swin3d_layers.py", line 866, in forward
    sp_down, coords_sp = self.downsample(sp, coords_sp)
  File "/home/chenz0f/anaconda3/envs/v100/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/chenz0f/Swin-3D/decoder/backbone/modules/swin3d_layers.py", line 302, in forward
    feats = query_knn_feature(self.k, xyz, n_xyz, sp.F, offset, n_offset)
  File "/home/chenz0f/Swin-3D/decoder/backbone/modules/swin3d_layers.py", line 45, in query_knn_feature
    grouped_feat = src_feat[idx.view(-1).long(), :].view(m, K, c)
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

This error would occur together with several lines of CUDA assertion error message:

```
/opt/conda/conda-bld/pytorch_1678402411778/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [2318,0,0], thread: [0,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1678402411778/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [2318,0,0], thread: [1,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
... (the same assertion repeats for threads [2,0,0] through [9,0,0])
```

However, when I tried to check what was wrong, I was only able to inspect the shapes of idx and src_feat. If I try to print their values, the CUDA assert error is also triggered:

  File "/home/chenz0f/Swin-3D/decoder/backbone/modules/swin3d_layers.py", line 308, in forward
    feats = query_knn_feature(self.k, xyz, n_xyz, sp.F, offset, n_offset)
  File "/home/chenz0f/Swin-3D/decoder/backbone/modules/swin3d_layers.py", line 48, in query_knn_feature
    print("idx:", idx)
  File "/home/chenz0f/anaconda3/envs/v100/lib/python3.10/site-packages/torch/_tensor.py", line 426, in __repr__
    return torch._tensor_str._str(self, tensor_contents=tensor_contents)
  File "/home/chenz0f/anaconda3/envs/v100/lib/python3.10/site-packages/torch/_tensor_str.py", line 636, in _str
    return _str_intern(self, tensor_contents=tensor_contents)
  File "/home/chenz0f/anaconda3/envs/v100/lib/python3.10/site-packages/torch/_tensor_str.py", line 567, in _str_intern
    tensor_str = _tensor_str(self, indent)
  File "/home/chenz0f/anaconda3/envs/v100/lib/python3.10/site-packages/torch/_tensor_str.py", line 327, in _tensor_str
    formatter = _Formatter(get_summarized_data(self) if summarize else self)
  File "/home/chenz0f/anaconda3/envs/v100/lib/python3.10/site-packages/torch/_tensor_str.py", line 361, in get_summarized_data
    return torch.stack([get_summarized_data(x) for x in (start + end)])
  File "/home/chenz0f/anaconda3/envs/v100/lib/python3.10/site-packages/torch/_tensor_str.py", line 361, in <listcomp>
    return torch.stack([get_summarized_data(x) for x in (start + end)])
  File "/home/chenz0f/anaconda3/envs/v100/lib/python3.10/site-packages/torch/_tensor_str.py", line 353, in get_summarized_data
    return torch.cat(
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

I've also tried using detach() and cpu(), but they all give the same error. May I ask if you've had similar problems before, or do you know what might have caused it?
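
(For anyone else hitting this, a hedged debugging sketch: once a device-side assert fires, tensors on that GPU can no longer be read, so check the indexing on CPU before the failing line; variable names follow the query_knn_feature frame above.)

```python
# Hedged debugging sketch: validate idx on CPU before
# `src_feat[idx.view(-1).long(), :]` runs on the GPU.
idx_cpu = idx.detach().cpu().long()
n = src_feat.shape[0]
bad = (idx_cpu < 0) | (idx_cpu >= n)   # indices the CUDA kernel would reject
if bad.any():
    print(f"{int(bad.sum())} KNN indices out of range [0, {n})")
    print("offending values:", idx_cpu[bad].unique())
```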

ModuleNotFoundError: No module named 'Swin3D.sparse_dl.attn_cuda'

Thanks for your work!
But when I run:

```python
import torch
from Swin3D.models import Swin3DUNet

model = Swin3DUNet(
    depths, channels, num_heads,
    window_sizes, quant_size, up_k=up_k,
    drop_path_rate=drop_path_rate, num_classes=num_classes,
    num_layers=num_layers, stem_transformer=stem_transformer,
    upsample=upsample, first_down_stride=down_stride,
    knn_down=knn_down, in_channels=in_channels,
    cRSE='XYZ_RGB_NORM', fp16_mode=1,
)
model.load_pretrained_model(ckpt_path)
```

I get this error: ModuleNotFoundError: No module named 'Swin3D.sparse_dl.attn_cuda'
What can I do?

Problem when predicting sample data input.npz

Hi, I am working with your source code and ran into a problem. With your sample data "input.npz", I got the results below, which look like a mess. I don't know where I went wrong, so could you please share your code for making predictions on one's own point cloud? Thank you very much.

Note:
image1 is the point cloud of "input.npz"
image2 is the prediction
image3 is the points with label "wall"


GPU memory and training cost on ScanNet_v2

Thanks for your inspiring work! I noticed that in the main paper the pretraining stage on Structured3D costs 488 and 703 GPU hours for Swin3D-S and Swin3D-L, respectively. May I also know the approximate GPU hours and memory cost when training from scratch on ScanNet v2, which leads to 75.2 (Swin3D-S*) / 74.2 (Swin3D-L*) val mIoU?

Request for input of inference

Hello, I'm trying to run segmentation.py and found that the input file examples/input.npz is not in this repo. Could you please upload it?

vote?

Great work! I am now following your work for my own research, but I don't understand a sentence from the paper: "On the ScanNet benchmark (test dataset), we ensembled the results of three trained models by voting the prediction on over-segmented meshes."
You set vote_num=12, but here it is 3?

My understanding, which may not be correct: vote_num means the same trained weights with different inputs, while "three trained models" means the same model with three different sets of trained weights; every trained model votes with vote_num=12, therefore 3 × 12 = 36?
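
If that reading is right, the two-level voting might look roughly like this toy sketch (an interpretation of the quoted sentence, not the authors' script; the over-segmented-mesh projection step is omitted):

```python
# Toy sketch: 3 independently trained models x vote_num=12 augmented passes
# each = 36 summed votes per point before the final argmax.
import torch

def ensemble_predict(models, augment, points, vote_num=12):
    logits = None
    for model in models:                  # three sets of trained weights
        for _ in range(vote_num):         # twelve augmented inputs per model
            out = model(augment(points))  # (N, num_classes) per-point logits
            logits = out if logits is None else logits + out
    return logits.argmax(dim=-1)          # vote by summed logits
```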

Does the pretrained model apply for both GridKNNDownsample and GridDownsample?

Hello, I have a question about the pretrained model.

When I set knn_down to False for Swin3DUNet, the IoU was pretty low even after loading the pretrained model. Does the pretrained model apply to both GridKNNDownsample and GridDownsample, or only GridKNNDownsample?

If it's the latter, may I ask for a GridDownsample version? Thanks.
