jeffwang987 / OpenOccupancy
[ICCV 2023] OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception
License: Apache License 2.0
OpenOccupancy/tools/gen_data/gen_depth_gt.py
Line 135 in 2c0dc83
lacks the ['infos'] key when indexing the loaded pkl, as sketched below.
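For reference, what I believe line 135 needs (my reading, not a verified patch): the info entries sit under the 'infos' key of the loaded pkl, not at the top level.

    import pickle

    # Load the info file and iterate the entries under 'infos' (assumed layout,
    # matching the usual mmdet3d-style info pkl with 'infos' and 'metadata' keys).
    with open('./data/nuscenes_occ_infos_train.pkl', 'rb') as f:
        data = pickle.load(f)

    for info in data['infos']:      # rather than iterating `data` directly
        print(sorted(info.keys()))  # inspect one entry's fields
        break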
Also, there is a typo in the README:
mv nuScenes-Occupancy-v0.0.7z ./data
cd ./data
7za x nuScenes-Occupancy
The last command should be: 7za e nuScenes-Occupancy-v0.0.7z
Could you please provide further guidance on how to visualize the occupancy?
Hi, I am trying to adapt BEVDet4D-style multi-frame inputs to OpenOcc. However, the results are not satisfying (a slight performance drop: 8.7 -> 8.6 mIoU). I am wondering if you have tried this. Thanks.
I see that the lifting operation in the code uses the LSS method, but it has changed a lot compared with the vanilla version. Could you roughly describe the changes in your approach, or point to related documents? Thanks a lot.
Hi! What exactly is stored in the provided occupancy dataset? I find it very difficult to visualize as point clouds. Is there a loader for the provided data? My current attempt is below.
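For reference, here is how I currently try to load it, a sketch based on my reading of the LoadOccupancy pipeline (assumptions on my part: sparse [z, y, x, cls] rows, 0.2 m voxels, and point_cloud_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0] from the configs):

    import numpy as np

    # Load one sparse annotation file (path layout taken from the pipeline code).
    occ = np.load('scene_<scene_token>/occupancy/<lidar_token>.npy')
    labels = occ[:, -1]
    xyz_vox = occ[:, [2, 1, 0]].astype(np.float32) + 0.5  # reorder z,y,x -> x,y,z voxel centers
    xyz = xyz_vox * 0.2 + np.array([-51.2, -51.2, -5.0])  # voxel indices -> meters
    print(xyz.shape, np.unique(labels))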
Hi, thanks for your great work. Is there any plan to provide the training logs of the models?
Hi, can you provide the script to generate pkl files?
Great work! I would like to know whether occ_pooling supports pooling multiple frustums into a full voxel grid with z-dim > 1, which is different from pure LSS. A sketch of what I mean is below.
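For clarity, this is the kind of pooling I mean; a minimal illustration of my own, not the repo's kernel:

    import torch

    # Sum-pool per-point frustum features into a full (X, Y, Z) voxel grid,
    # i.e. keeping z-dim > 1 instead of collapsing to a BEV plane as in pure LSS.
    def voxel_pool(feats, coords, grid_size):
        X, Y, Z = grid_size
        flat = (coords[:, 0] * Y + coords[:, 1]) * Z + coords[:, 2]  # flatten (x, y, z)
        out = torch.zeros(X * Y * Z, feats.shape[1])
        out.index_add_(0, flat, feats)  # accumulate every point into its voxel
        return out.view(X, Y, Z, -1)

    feats = torch.randn(1000, 8)              # (N, C) frustum point features
    coords = torch.randint(0, 16, (1000, 3))  # (N, 3) integer voxel indices
    print(voxel_pool(feats, coords, (16, 16, 16)).shape)  # torch.Size([16, 16, 16, 8])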
Hi!
Because I only have one RTX 3090 so far, I use python ./tools/train.py projects/configs/baselines/CAM-R50_img1600_128x128x10.py to train the model.
However, I encountered the following error during validation.
2023-03-15 09:29:00,871 - mmdet - INFO - Epoch [1][28000/28130] lr: 2.000e-04, eta: 9 days, 6:50:28, time: 1.236, data_time: 0.019, memory: 12727, loss_depth: 1.0000, loss_voxel_ce_c_0: 1.0000, loss_voxel_sem_scal_c_0: 1.0000, loss_voxel_geo_scal_c_0: 1.0000, loss_voxel_lovasz_c_0: 1.0000, loss: 5.0000, grad_norm: 9.6413
2023-03-15 09:30:02,689 - mmdet - INFO - Epoch [1][28050/28130] lr: 2.000e-04, eta: 9 days, 6:49:23, time: 1.236, data_time: 0.018, memory: 12727, loss_depth: 1.0000, loss_voxel_ce_c_0: 1.0000, loss_voxel_sem_scal_c_0: 1.0000, loss_voxel_geo_scal_c_0: 1.0000, loss_voxel_lovasz_c_0: 1.0000, loss: 5.0000, grad_norm: 10.6449
2023-03-15 09:31:04,539 - mmdet - INFO - Epoch [1][28100/28130] lr: 2.000e-04, eta: 9 days, 6:48:18, time: 1.237, data_time: 0.021, memory: 12727, loss_depth: 1.0000, loss_voxel_ce_c_0: 1.0000, loss_voxel_sem_scal_c_0: 1.0000, loss_voxel_geo_scal_c_0: 1.0000, loss_voxel_lovasz_c_0: 1.0000, loss: 5.0000, grad_norm: 10.4283
2023-03-15 09:31:42,041 - mmdet - INFO - Saving checkpoint at 1 epochs
[ ] 0/6019, elapsed: 0s, ETA:
/home/re/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/utils/checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
Traceback (most recent call last):
File "./tools/train.py", line 262, in
main()
File "./tools/train.py", line 251, in main
custom_train_model(
File "/media/re/2384a6b4-4dae-400d-ad72-9b7044491b55/data/OpenOccupancy-main/projects/occ_plugin/occupancy/apis/train.py", line 27, in custom_train_model
custom_train_detector(
File "/media/re/2384a6b4-4dae-400d-ad72-9b7044491b55/data/OpenOccupancy-main/projects/occ_plugin/occupancy/apis/mmdet_train.py", line 199, in custom_train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/re/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/re/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 54, in train
self.call_hook('after_train_epoch')
File "/home/re/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
getattr(hook, fn_name)(self)
File "/home/re/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/hooks/evaluation.py", line 267, in after_train_epoch
self._do_evaluate(runner)
File "/home/re/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmdet/core/evaluation/eval_hooks.py", line 17, in _do_evaluate
results = single_gpu_test(runner.model, self.dataloader, show=False)
File "/home/re/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmdet/apis/test.py", line 59, in single_gpu_test
if isinstance(result[0], tuple):
KeyError: 0
Dear authors,
In the config files, I saw: point_cloud_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]. I assume that the occupancy range for z is -5.0 to 3.0. However, this is different from what is reported in the paper: -3.0 to 5.0. Could you please provide some explanation? Thanks a lot.
Another question is: What is the origin of the voxel grid coordinate system? Is it the same as the car coordinate system or the lidar coordinate system?
I am curious about the inference speed of the method. Could you shed some light on the three modalities (C, L, M) and the M-CONet models?
Hi, I see visible_mask is set to False in the config, so during training and evaluation the unobserved voxels are also included in the loss and metrics. Is my understanding correct? A sketch of what I mean is below.
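To illustrate what I mean (my own sketch, not the repo's evaluator):

    import numpy as np

    # With visible_mask=False every voxel is scored; a mask would exclude
    # unobserved voxels from both the intersection and the union.
    def masked_iou(pred, gt, cls, mask=None):
        if mask is None:
            mask = np.ones_like(gt, dtype=bool)  # visible_mask=False case
        inter = ((pred == cls) & (gt == cls) & mask).sum()
        union = (((pred == cls) | (gt == cls)) & mask).sum()
        return inter / max(union, 1)

    pred = np.random.randint(0, 3, (4, 4, 2))
    gt = np.random.randint(0, 3, (4, 4, 2))
    print(masked_iou(pred, gt, cls=1))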
First of all, thanks for your stunning work!
I note that the nuScenes-Occupancy annotations for the trainval split have been released, but could you provide the occupancy annotations for the mini split, or the scripts for annotation generation?
Hi! After I added "fp16 = dict(loss_scale=512.)" to the config file and added the attribute "fp16_enabled" in occnet and bevdepth, the time and memory used during training were the same as before. How can I make FP16 actually take effect? My understanding of what mmcv's FP16 path requires is sketched below.
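For reference, this is what I believe mmcv's FP16 path needs beyond the config flag (an assumption based on mmcv's auto_fp16 utilities, not this repo's code):

    import torch
    import torch.nn as nn
    from mmcv.runner import auto_fp16

    class TinyHead(nn.Module):
        def __init__(self):
            super().__init__()
            # auto_fp16 only acts when this attribute exists AND is True; in real
            # training it is flipped on by mmcv's wrap_fp16_model / FP16 hooks.
            self.fp16_enabled = True
            self.conv = nn.Conv2d(3, 8, 3, padding=1)

        @auto_fp16(apply_to=('img',))  # casts the listed tensor args to half
        def forward(self, img):
            return self.conv(img)

    head = TinyHead().cuda().half()              # requires a GPU
    out = head(torch.rand(1, 3, 32, 32).cuda())  # fp32 in, cast to fp16 by the decorator
    print(out.dtype)                             # torch.float16 if FP16 took effect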
Hi, I have a question about whether it is possible to run your project on a dataset collected from Carla. Could you please tell me how to generate the pickle file?
Hello,
You annotate voxels containing unlabeled LiDAR points from intermediate frames based on the surrounding labeled voxels to further improve data density. But how do you eliminate the noise introduced by this unlabeled information, especially for dynamic objects?
I found that the provided nuscenes_occ_infos_train.pkl file does not match the information required by gen_depth_gt.py; running the program directly reports an error.
Hi, thank you very much for your outstanding contribution to the development of surrounding occupancy perception algorithms.
While replicating your OpenOccupancy project, I want to reduce the resolution of the input, i.e. the volume size, so I need to regenerate the occupancy_nuscenes dataset.
Could you please provide the conversion code from the nuScenes dataset to the occupancy_nuscenes dataset?
Besides, I also need the code that generates the train/val pickle files.
Finally, thank you again for your work.
@JeffWang987 Thanks for your work. I'd like to know: OccDefaultFormatBundle3D only works on 'gt_occ', right?
Congratulations on the amazing work.
I am curious about the efficiency of the method. Could you shed some light on the inference time and the memory requirement?
Hi, is the voxel GT in the LiDAR coordinate system or in the ego-frame coordinate system?
Hi, I am wondering if the shape of the gt is WHD or HWD?
After running gen_depth_gt.py for a while and generating some of the ground truths (maybe not all), I suddenly get the following error:
Traceback (most recent call last):
File "./tools/gen_data/gen_depth_gt.py", line 137, in <module>
po.apply_async(func=worker, args=(info, ))
File "/home/re/anaconda3/envs/open-mmlab/lib/python3.8/multiprocessing/pool.py", line 455, in apply_async
self._check_running()
File "/home/re/anaconda3/envs/open-mmlab/lib/python3.8/multiprocessing/pool.py", line 350, in _check_running
raise ValueError("Pool not running")
ValueError: Pool not running
What is causing this problem? Is my CPU performance insufficient, or is there a bug in the project code?
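For reference, the usual cause of this error as I understand it (an assumption about the script, not a quote from it) is submitting a task after the pool has been closed:

    from multiprocessing import Pool

    def worker(x):
        return x * x

    if __name__ == '__main__':
        po = Pool(4)
        for i in range(10):
            po.apply_async(func=worker, args=(i,))
            # calling po.close() here, inside the loop, would make the next
            # apply_async raise ValueError("Pool not running")
        po.close()  # close only after ALL tasks are submitted
        po.join()

I sincerely hope you can help. Thank you!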
Hi Xiaofeng/Jeff,
First of all, great work putting together such a solid project for semantic occupancy prediction. Academia needs such a dataset and benchmark.
There is a discrepancy between the Z-range described in the paper and that in the config.
It seems to me that [-5m, 3m] should be the correct one, as the Z origin is on top of the car (aligned with the top lidar mounting position). Could you confirm?
Thanks,
Patrick
Thanks for sharing your code and pretrained models with us. I find that the IoU of mutil-modal-baseline.pth / mutil-modal-CONet.pth / lidar-baseline.pth is significantly lower than the numbers reported in your paper. Is this caused by the update of the occupancy annotations? Looking forward to your reply!
Hi,
Thanks for this amazing work. I especially enjoyed going through the experiment analysis as it was so thorough!
I have one question regarding the CONet baseline. If I understand correctly, the M-baseline reported in Table 5 is at a lower resolution of 10x128x128.
However, since the intermediate output is a feature tensor, it should be possible to interpolate in that space -- and is indeed one of the advantages of implicit representations. I am wondering if you tried a baseline where you simply upsample the feature tensor by interpolation and use that to generate high-resolution occupancy output. Or if you have any thoughts on this matter.
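For concreteness, the baseline I have in mind looks like this (my own sketch with toy shapes; the paper's coarse grid is 10x128x128 and the fine output 40x512x512, and the channel count here is an assumption):

    import torch
    import torch.nn.functional as F

    feat = torch.randn(1, 8, 10, 64, 64)  # (B, C, Z, Y, X) coarse voxel features
    feat_up = F.interpolate(feat, scale_factor=4,
                            mode='trilinear', align_corners=False)
    head = torch.nn.Conv3d(8, 17, kernel_size=1)  # 16 semantic classes + free
    logits = head(feat_up)
    print(logits.shape)                   # torch.Size([1, 17, 40, 256, 256])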
Best,
Akshay
First of all, thanks for your stunning work!
I used Mayavi to visualize some scenes and found a lot of noise points carrying non-noise labels in the visualization, as shown below:
[screenshots: front_left and front camera views]
This occurs in a large number of annotated scenes. It seems the label files are the "pseudo labels" described in the paper. How can I solve this problem?
Are there problems with the pretrained model? I evaluated the model with camera-based-Conet.pth, but the IoU and mIoU are much lower than those published in the paper:
{'SC_non-empty': 0.17, 'SSC_fine_free': 0.976, 'SSC_fine_barrier': 0.041, 'SSC_fine_bicycle': 0.031, 'SSC_fine_bus': 0.073, 'SSC_fine_car': 0.125, 'SSC_fine_construction_vehicle': 0.047, 'SSC_fine_motorcycle': 0.055, 'SSC_fine_pedestrian': 0.093, 'SSC_fine_traffic_cone': 0.07, 'SSC_fine_trailer': 0.02, 'SSC_fine_truck': 0.082, 'SSC_fine_driveable_surface': 0.309, 'SSC_fine_other_flat': 0.134, 'SSC_fine_sidewalk': 0.143, 'SSC_fine_terrain': 0.137, 'SSC_fine_manmade': 0.014, 'SSC_fine_vegetation': 0.027, 'SSC_fine_mean': 0.088}
Hi, thank you very much for your great work. Is there any plan to upload the trained model weights to Google Drive?
It is not easy to download them from Baidu disk without a Chinese telephone number.
All of my environment settings are the same as the README, which is why the weird eval performance is confusing. Reproduced based on commit,
epoch 19 (set loss_norm=False):
2023-03-15 06:03:11,898 - mmdet - INFO - SC Evaluation
2023-03-15 06:03:11,899 - mmdet - INFO - +-----------+-------+
| class | IoU |
+-----------+-------+
| non-empty | 0.163 |
+-----------+-------+
2023-03-15 06:03:11,899 - mmdet - INFO - SSC Evaluation
2023-03-15 06:03:11,899 - mmdet - INFO - +----------------------+-------+
| class | IoU |
+----------------------+-------+
| free | 0.917 |
| barrier | 0.085 |
| bicycle | 0.044 |
| bus | 0.099 |
| car | 0.116 |
| construction_vehicle | 0.048 |
| motorcycle | 0.071 |
| pedestrian | 0.07 |
| traffic_cone | 0.041 |
| trailer | 0.042 |
| truck | 0.093 |
| driveable_surface | 0.215 |
| other_flat | 0.139 |
| sidewalk | 0.134 |
| terrain | 0.126 |
| manmade | 0.062 |
| vegetation | 0.092 |
| mean | 0.092 |
+----------------------+-------+
Hi,
first of all, let me thank you for this great work. May I ask when you intend to share the pre-trained models?
Thank you very much in advance.
Best,
Antonin.
Hi, there are two questions I really need your help with.
First, where is type='Collect3D' in the pipeline? I can only find class CustomOccCollect3D.
Second, there seems to be a contradiction in LoadOccupancy:
def __call__(self, results):
    rel_path = 'scene_{0}/occupancy/{1}.npy'.format(results['scene_token'], results['lidar_token'])
    # [z y x cls] or [z y x vx vy vz cls]
    pcd = np.load(os.path.join(self.occ_path, rel_path))
    pcd_label = pcd[..., -1:]
    pcd_label[pcd_label == 0] = 255
    pcd_np_cor = self.voxel2world(pcd[..., [2, 1, 0]] + 0.5)  # x y z
    untransformed_occ = copy.deepcopy(pcd_np_cor)  # N 4
    # bevdet augmentation
    pcd_np_cor = (results['bda_mat'] @ torch.from_numpy(pcd_np_cor).unsqueeze(-1).float()).squeeze(-1).numpy()
    pcd_np_cor = self.world2voxel(pcd_np_cor)
    # make sure the point is in the grid
    pcd_np_cor = np.clip(pcd_np_cor, np.array([0, 0, 0]), self.grid_size - 1)
    transformed_occ = copy.deepcopy(pcd_np_cor)
    pcd_np = np.concatenate([pcd_np_cor, pcd_label], axis=-1)
    # velocity
    if self.use_vel:
        pcd_vel = pcd[..., [3, 4, 5]]  # x y z
        pcd_vel = (results['bda_mat'] @ torch.from_numpy(pcd_vel).unsqueeze(-1).float()).squeeze(-1).numpy()
        pcd_vel = np.concatenate([pcd_np, pcd_vel], axis=-1)  # [x y z cls vx vy vz]
        results['gt_vel'] = pcd_vel
    # 255: noise, 1-16 normal classes, 0 unoccupied
    pcd_np = pcd_np[np.lexsort((pcd_np_cor[:, 0], pcd_np_cor[:, 1], pcd_np_cor[:, 2])), :]
    pcd_np = pcd_np.astype(np.int64)
    processed_label = np.ones(self.grid_size, dtype=np.uint8) * self.unoccupied
    processed_label = nb_process_label(processed_label, pcd_np)
    results['gt_occ'] = processed_label
As far as I know, the rel_path should be the lidar-seg label.
Thanks for the good work! I want to know how long it takes to train the model using 8 GPUs,
and which GPU model you used for training.
Hi, I wonder what the effect of loss_norm is.
According to the training log, the loss value of each item is 1 when self.loss_norm is True.
Could you explain the motivation and the effect?
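My current guess at what it does (a paraphrase of my reading, not code copied from the repo): each term is divided by its own detached value, so the logged scalar is ~1 while the gradient direction is kept and all terms contribute with equal magnitude.

    import torch

    def normalize_losses(loss_dict, eps=1e-9):
        # v / v.detach() == 1 numerically, but grad(v / c) = grad(v) / c survives
        return {k: v / (v.detach() + eps) for k, v in loss_dict.items()}

    loss_dict = {'loss_depth': torch.tensor(2.3, requires_grad=True),
                 'loss_voxel_ce_c_0': torch.tensor(0.7, requires_grad=True)}
    print({k: float(v) for k, v in normalize_losses(loss_dict).items()})  # both ~1.0

Thanks!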
Hi! I found that the loss in the first epoch has not decreased for a long time and is always 9.0000. Is this reasonable?
Here are some logs from training:
2023-04-21 22:56:43,536 - mmdet - INFO - Epoch [1][8050/28130] lr: 3.000e-04, eta: 14 days, 13:10:48, time: 2.615, data_time: 0.090, memory: 17458, loss_depth: 1.0000, loss_voxel_ce_c_0: 1.0000, loss_voxel_sem_scal_c_0: 1.0000, loss_voxel_geo_scal_c_0: 1.0000, loss_voxel_lovasz_c_0: 1.0000, loss_voxel_ce_fine: 1.0000, loss_voxel_sem_scal_fine: 1.0000, loss_voxel_geo_scal_fine: 1.0000, loss_voxel_lovasz_fine: 1.0000, loss: 9.0000, grad_norm: 9.1886
2023-04-21 22:58:51,067 - mmdet - INFO - Epoch [1][8100/28130] lr: 3.000e-04, eta: 14 days, 12:47:33, time: 2.551, data_time: 0.085, memory: 17458, loss_depth: 1.0000, loss_voxel_ce_c_0: 1.0000, loss_voxel_sem_scal_c_0: 1.0000, loss_voxel_geo_scal_c_0: 1.0000, loss_voxel_lovasz_c_0: 1.0000, loss_voxel_ce_fine: 1.0000, loss_voxel_sem_scal_fine: 1.0000, loss_voxel_geo_scal_fine: 1.0000, loss_voxel_lovasz_fine: 1.0000, loss: 9.0000, grad_norm: 10.1773
2023-04-21 23:01:01,270 - mmdet - INFO - Epoch [1][8150/28130] lr: 3.000e-04, eta: 14 days, 12:26:50, time: 2.604, data_time: 0.090, memory: 17458, loss_depth: 1.0000, loss_voxel_ce_c_0: 1.0000, loss_voxel_sem_scal_c_0: 1.0000, loss_voxel_geo_scal_c_0: 1.0000, loss_voxel_lovasz_c_0: 1.0000, loss_voxel_ce_fine: 1.0000, loss_voxel_sem_scal_fine: 1.0000, loss_voxel_geo_scal_fine: 1.0000, loss_voxel_lovasz_fine: 1.0000, loss: 9.0000, grad_norm: 10.2346
2023-04-21 23:03:09,146 - mmdet - INFO - Epoch [1][8200/28130] lr: 3.000e-04, eta: 14 days, 12:04:22, time: 2.557, data_time: 0.084, memory: 17458, loss_depth: 1.0000, loss_voxel_ce_c_0: 1.0000, loss_voxel_sem_scal_c_0: 1.0000, loss_voxel_geo_scal_c_0: 1.0000, loss_voxel_lovasz_c_0: 1.0000, loss_voxel_ce_fine: 1.0000, loss_voxel_sem_scal_fine: 1.0000, loss_voxel_geo_scal_fine: 1.0000, loss_voxel_lovasz_fine: 1.0000, loss: 9.0000, grad_norm: 10.6167
2023-04-21 23:05:19,048 - mmdet - INFO - Epoch [1][8250/28130] lr: 3.000e-04, eta: 14 days, 11:43:52, time: 2.598, data_time: 0.092, memory: 17458, loss_depth: 1.0000, loss_voxel_ce_c_0: 1.0000, loss_voxel_sem_scal_c_0: 1.0000, loss_voxel_geo_scal_c_0: 1.0000, loss_voxel_lovasz_c_0: 1.0000, loss_voxel_ce_fine: 1.0000, loss_voxel_sem_scal_fine: 1.0000, loss_voxel_geo_scal_fine: 1.0000, loss_voxel_lovasz_fine: 1.0000, loss: 9.0000, grad_norm: 10.9726
2023-04-21 23:07:30,623 - mmdet - INFO - Epoch [1][8300/28130] lr: 3.000e-04, eta: 14 days, 11:24:57, time: 2.631, data_time: 0.103, memory: 17458, loss_depth: 1.0000, loss_voxel_ce_c_0: 1.0000, loss_voxel_sem_scal_c_0: 1.0000, loss_voxel_geo_scal_c_0: 1.0000, loss_voxel_lovasz_c_0: 1.0000, loss_voxel_ce_fine: 1.0000, loss_voxel_sem_scal_fine: 1.0000, loss_voxel_geo_scal_fine: 1.0000, loss_voxel_lovasz_fine: 1.0000, loss: 9.0000, grad_norm: 10.8935
fatal: not a git repository (or any of the parent directories): .git
2023-09-26 21:37:31,804 - mmdet - INFO - Environment info:
sys.platform: linux
Python: 3.8.17 (default, Jul 5 2023, 21:04:15) [GCC 11.2.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.3.r11.3/compiler.29920130_0
GCC: gcc (GCC) 6.1.0
PyTorch: 1.10.1
PyTorch compiling details: PyTorch built with:
We thought this was a SyncBN problem, so we changed it to BN in the configuration file, but encountered the following problem. (It is worth noting that while still using SyncBN, it is possible to debug through test.py.)
fatal: not a git repository (or any of the parent directories): .git
2023-09-26 21:27:49,543 - mmdet - INFO - Environment info:
sys.platform: linux
Python: 3.8.17 (default, Jul 5 2023, 21:04:15) [GCC 11.2.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.3.r11.3/compiler.29920130_0
GCC: gcc (GCC) 6.1.0
PyTorch: 1.10.1
PyTorch compiling details: PyTorch built with:
We know this is a dimension error, but we have no idea how to solve it. Do you have any good suggestions?
Hi, may I ask whether this repository supports FP16 training? I have tried to modify it myself, but the training speed becomes very slow and the loss may become NaN.
I regenerated the ground truth with the latest version of the code,
and I use
python ./tools/test.py ./projects/configs/baselines/CAM-R50_img1600_128x128x10.py ./work_dirs/CAM-R50_img1600_128x128x10/latest.pth --deterministic --eval bbox
to test the c-baseline model.
The weights I used were obtained after training for two epochs, and the test exceeded the set maximum test-dataset length.
How can I solve this problem?
Thanks.
Hello, I have a question: is the depth here generated by stacking multiple LiDAR frames? In gen_depth_gt.py, points = np.fromfile(lidar_path ...) seems to use only a single LiDAR frame, but your dataset contains sweeps. I don't quite understand this part; I hope you can clarify. My understanding of the single-frame projection step is sketched below.
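For context, my understanding of the projection step (a paraphrase of what I think gen_depth_gt.py does, not its actual code):

    import numpy as np

    # Project one LiDAR frame into a camera to get sparse (u, v, depth) targets.
    def lidar_to_depth_points(points, lidar2img, H, W):
        pts = np.concatenate([points[:, :3], np.ones((len(points), 1))], axis=1)
        cam = pts @ lidar2img.T              # homogeneous image-plane coordinates
        depth = cam[:, 2]
        front = depth > 0.1                  # keep points in front of the camera
        uv = cam[front, :2] / depth[front, None]
        inside = (uv[:, 0] >= 0) & (uv[:, 0] < W) & (uv[:, 1] >= 0) & (uv[:, 1] < H)
        return np.concatenate([uv[inside], depth[front][inside, None]], axis=1)

    pts = np.random.randn(100, 4) * 10
    print(lidar_to_depth_points(pts, np.eye(4), H=900, W=1600).shape)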
Hi, I find that samples_per_gpu is set to 1 in all your configs; is there any reason for this? I think the memory is enough in some tasks if you set samples_per_gpu to 2 or 4.