Comments (11)
Thank you for your interest in our work. May I ask whether you have set the correct data path and the correct training GPUs?
from stratified-transformer.
Thank you for your prompt response. I have modified the data path and GPUs.
The error above occurs when I specify only one GPU. When I use multiple GPUs, the error is the following:
Traceback (most recent call last):
  File "train.py", line 547, in <module>
    main()
  File "train.py", line 84, in main
    mp.spawn(main_worker, nprocs=args.ngpus_per_node, args=(args.ngpus_per_node, args))
  File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
    while not context.join():
  File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 118, in join
    raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/home/mmvc/Congcong/Stratified-Transformer/train.py", line 308, in main_worker
    loss_train, mIoU_train, mAcc_train, allAcc_train = train(train_loader, model, criterion, optimizer, epoch, scaler, scheduler)
  File "/home/mmvc/Congcong/Stratified-Transformer/train.py", line 380, in train
    output = model(feat, coord, offset, batch, neighbor_idx)
  File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mmvc/Congcong/Stratified-Transformer/model/stratified_transformer.py", line 449, in forward
    feats, xyz, offset, feats_down, xyz_down, offset_down = layer(feats, xyz, offset)
  File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mmvc/Congcong/Stratified-Transformer/model/stratified_transformer.py", line 291, in forward
    new_window_size = 2 * torch.tensor([self.window_size]*3).type_as(xyz).to(xyz.device)
RuntimeError: CUDA error: invalid device function
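Editorial note: "invalid device function" often means a CUDA kernel was not compiled for the compute capability of the GPU it runs on. Whether that is the cause here is an assumption, not something the thread confirms. A minimal sketch of a strict check follows; the helper names `sm_tag` and `has_exact_arch` are made up for illustration, and the PyTorch part runs only if PyTorch and a GPU are present.

```python
from typing import List, Tuple


def sm_tag(capability: Tuple[int, int]) -> str:
    """Turn a (major, minor) compute capability into an 'sm_XY' tag."""
    major, minor = capability
    return f"sm_{major}{minor}"


def has_exact_arch(capability: Tuple[int, int], arch_list: List[str]) -> bool:
    """Return True if the build lists a binary for exactly this GPU.

    This is deliberately strict: a build can still work without an exact
    match (for example through embedded PTX), so a False result is a hint
    to investigate, not proof of a mismatch.
    """
    return sm_tag(capability) in arch_list


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None  # allows the helpers above to be used without PyTorch

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print(f"GPU 0 capability: {sm_tag(cap)}")
        print(f"PyTorch built for: {archs}")
        print(f"Exact match: {has_exact_arch(cap, archs)}")
```

This only inspects the PyTorch build itself. Separately compiled extensions (such as pointops2 or the kernels used by torch-points3d) are built for whatever architectures were selected at their own compile time, which for PyTorch C++/CUDA extensions is typically influenced by the `TORCH_CUDA_ARCH_LIST` environment variable.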
May I ask whether you have compiled the pointops in /lib? And can you locate which line causes this error?
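Editorial note: CUDA kernels are launched asynchronously, so the Python line shown in a traceback is not necessarily the operation that failed. A minimal sketch for pinning down the failing line, assuming it is placed at the very top of the training script before anything initializes CUDA:

```python
import os

# Force synchronous kernel launches so that a CUDA error is reported at the
# operation that actually triggered it. This must be set before CUDA is
# initialized, so set it before `import torch`.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # import PyTorch only after the variable is set

print("CUDA_LAUNCH_BLOCKING =", os.environ["CUDA_LAUNCH_BLOCKING"])
```

The same effect can be had by prefixing the usual training command with `CUDA_LAUNCH_BLOCKING=1`. Training runs noticeably slower in this mode, so it is meant for debugging only.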
I only compiled pointops2, following your instructions.
With one GPU, the error is raised at:
  File "/***/Stratified-Transformer/model/stratified_transformer.py", line 357, in forward
    feats = self.kpconv(xyz, xyz, neighbor_idx, feats)
With multiple GPUs, it is raised at:
  File "/***/Stratified-Transformer/model/stratified_transformer.py", line 291, in forward
    new_window_size = 2 * torch.tensor([self.window_size]*3).type_as(xyz).to(xyz.device)
The error may be caused by the KPConv provided by torch-points3d. Did you install it successfully? Can you double-check that torch-points3d works correctly?
Thanks, I have checked it. When I use multiple GPUs, this error does not appear, and the KPConv provided by torch-points3d works well.
Can you run it successfully now? If you use one GPU, remember to add CUDA_VISIBLE_DEVICES=0 before your python command.
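Editorial note: the same restriction can be applied from inside a script. A minimal sketch, assuming it runs before any CUDA call is made; the `visible` variable is only there for illustration:

```python
import os

# Expose only physical GPU 0 to this process. Inside the process it will
# appear as device cuda:0. This must be set before CUDA is initialized.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Report which GPU indices this process is allowed to see.
visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
print(f"GPUs visible to this process: {visible}")
```

Setting the variable on the command line, as suggested above, is usually safer because it cannot be applied too late by accident.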
I have solved the single-GPU bug by modifying this.
But I still get the error at line 291, both with one GPU and with multiple GPUs.
Hi, how did you modify it?
model = torch.nn.DataParallel(model.cuda()) ----> model = model.cuda()
But that way we can't use multi-GPU training. I am also getting this error when using model = torch.nn.DataParallel(model.cuda()).
Any suggestions?