Comments (5)
I have no idea what causes this problem. I will try to create a Docker image for this project to provide a reproducible environment for tracking down errors like this.
Multi-GPU training is currently not supported, mainly because I only have one GPU. If you want to train on multiple GPUs, you need to pad the input to a fixed size (or simply not slice the array in point_to_voxel), then slice the points inside the module.
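The padding step can be sketched roughly as follows. This is only an illustration in NumPy, not code from the repository: `pad_voxels` and `max_voxels` are hypothetical names, and the array shape (voxels, points per voxel, features) is an assumption.

```python
import numpy as np


def pad_voxels(voxels, max_voxels):
    """Zero-pad a variable-length voxel array to a fixed first dimension.

    `voxels` is assumed to have shape (num_voxels, max_points, num_features).
    Returns the padded array and the number of valid voxels, so the padding
    can be sliced away again later.
    """
    voxel_num = min(voxels.shape[0], max_voxels)
    padded = np.zeros((max_voxels,) + voxels.shape[1:], dtype=voxels.dtype)
    padded[:voxel_num] = voxels[:voxel_num]
    return padded, voxel_num


# Two samples with different voxel counts become stackable after padding.
sample_a = np.random.rand(5, 35, 4).astype(np.float32)
sample_b = np.random.rand(8, 35, 4).astype(np.float32)
padded_a, num_a = pad_voxels(sample_a, max_voxels=10)
padded_b, num_b = pad_voxels(sample_b, max_voxels=10)
batch = np.stack([padded_a, padded_b])  # fixed shape (2, 10, 35, 4)
voxel_nums = np.array([num_a, num_b])   # valid voxel count per sample
```

With a fixed leading shape, the batch can be split evenly across devices, and the per-sample counts tell the model which rows are real.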
from second.pytorch.
When I train the model, the intermediate evaluation fails with the error below:
Traceback (most recent call last):
  File "vox_gluon/train_gluon.py", line 759, in <module>
    fire.Fire()
  File "/home/users/wenyong.zheng/anaconda3/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/home/users/wenyong.zheng/anaconda3/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/home/users/wenyong.zheng/anaconda3/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "vox_gluon/train_gluon.py", line 504, in train
    raise e
  File "vox_gluon/train_gluon.py", line 486, in train
    result = get_official_eval_result(gt_annos[:len(gt_annos)-1], dt_annos, class_names)
  File "/mnt/data-3/data/wenyong.zheng/vxlnet/second.pytorch/second/utils/eval.py", line 824, in get_official_eval_result
    mAPbbox, mAPbev, mAP3d, mAPaos = do_eval_v2(gt_annos, dt_annos, current_classes, min_overlaps, compute_aos, difficultys)
  File "/mnt/data-3/data/wenyong.zheng/vxlnet/second.pytorch/second/utils/eval.py", line 701, in do_eval_v2
    ret = eval_class_v3(gt_annos, dt_annos, current_classes, difficultys, 1, min_overlaps)
  File "/mnt/data-3/data/wenyong.zheng/vxlnet/second.pytorch/second/utils/eval.py", line 574, in eval_class_v3
    rets = calculate_iou_partly(dt_annos, gt_annos, metric, num_parts)
  File "/mnt/data-3/data/wenyong.zheng/vxlnet/second.pytorch/second/utils/eval.py", line 384, in calculate_iou_partly
    overlap_part = bev_box_overlap(gt_boxes, dt_boxes).astype(np.float64)
  File "/mnt/data-3/data/wenyong.zheng/vxlnet/second.pytorch/second/utils/eval.py", line 126, in bev_box_overlap
    riou = rotate_iou_gpu_eval(boxes, qboxes, criterion)
  File "/mnt/data-3/data/wenyong.zheng/vxlnet/second.pytorch/second/core/non_max_suppression/nms_gpu.py", line 652, in rotate_iou_gpu_eval
    N, K, boxes_dev, query_boxes_dev, iou_dev, criterion)
  File "/home/users/wenyong.zheng/anaconda3/lib/python3.6/site-packages/numba/cuda/compiler.py", line 484, in __call__
    sharedmem=self.sharedmem)
  File "/home/users/wenyong.zheng/anaconda3/lib/python3.6/site-packages/numba/cuda/compiler.py", line 558, in _kernel_call
    cu_func(*kernelargs)
  File "/home/users/wenyong.zheng/anaconda3/lib/python3.6/site-packages/numba/cuda/cudadrv/driver.py", line 1301, in __call__
    self.sharedmem, streamhandle, args)
  File "/home/users/wenyong.zheng/anaconda3/lib/python3.6/site-packages/numba/cuda/cudadrv/driver.py", line 1345, in launch_kernel
    None)
  File "/home/users/wenyong.zheng/anaconda3/lib/python3.6/site-packages/numba/cuda/cudadrv/driver.py", line 288, in safe_cuda_api_call
    self._check_error(fname, retcode)
  File "/home/users/wenyong.zheng/anaconda3/lib/python3.6/site-packages/numba/cuda/cudadrv/driver.py", line 323, in _check_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [400] Call to cuLaunchKernel results in CUDA_ERROR_INVALID_HANDLE
By the way, have you tried training with multiple GPUs?
What do you mean by "slice array in point_to_voxel" and "slice points inside module"?
The code puts all the points of a batch together, so how can I tell how many points belong to each sample?
The number of voxels converted from points is not fixed; you can see a slice operation in point_to_voxel. For multi-GPU training, you need to return voxel_num from point_to_voxel, use a fixed-size input before nn.DataParallel, pass voxel_num as a Tensor, and gather all valid voxels inside the nn.Module wrapped by nn.DataParallel.
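The module side of this could look roughly like the sketch below. It is only an illustration, not code from second.pytorch: `ValidVoxelEncoder` is a hypothetical module, the input layout (batch, max_voxels, max_points, features) is an assumption, and the mean over points is a stand-in for a real voxel feature extractor.

```python
import torch
from torch import nn


class ValidVoxelEncoder(nn.Module):
    """Hypothetical module: drops padded voxels, then encodes the valid ones."""

    def __init__(self, num_features=4, out_channels=16):
        super().__init__()
        self.linear = nn.Linear(num_features, out_channels)

    def forward(self, voxels, voxel_num):
        # voxels:    (B, max_voxels, max_points, num_features), zero-padded
        # voxel_num: (B,) tensor holding the valid voxel count of each sample
        max_voxels = voxels.shape[1]
        index = torch.arange(max_voxels, device=voxels.device)
        mask = index[None, :] < voxel_num[:, None]  # (B, max_voxels), True = valid
        valid = voxels[mask]                        # (total_valid, max_points, F)
        # Stand-in feature extractor: average the points in each voxel.
        return self.linear(valid.mean(dim=1))


model = ValidVoxelEncoder()
if torch.cuda.device_count() > 1:
    # Each replica receives a slice of the batch dimension of both inputs.
    model = nn.DataParallel(model).cuda()

voxels = torch.zeros(2, 10, 35, 4)   # fixed-size, padded batch
voxel_num = torch.tensor([5, 8])     # valid voxels per sample
features = model(voxels, voxel_num)  # one row per valid voxel
```

Because voxel_num is a Tensor with a batch dimension, nn.DataParallel scatters it together with the padded voxels, so each replica can rebuild its own validity mask.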
@zwyzwy have you found a solution?