ashawkey / dimr

[ECCV 2022] Disentangled Instance Mesh Reconstruction

License: Apache License 2.0

Python 74.76% Cython 4.35% C++ 10.11% Cuda 8.32% C 2.09% Shell 0.37%

dimr's Introduction

Disentangled Instance Mesh Reconstruction (ECCV 2022)

This repository contains the official implementation for the paper: Point Scene Understanding via Disentangled Instance Mesh Reconstruction.


Installation

Clone the repository:

git clone --recursive https://github.com/ashawkey/dimr
cd dimr

pip install -r requirements.txt

Install dependent libraries:

The repository depends on a modified spconv from PointGroup for sparse convolution, which requires CUDA < 11 and PyTorch < 1.5.
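You can sanity-check your environment before building the libraries below (a minimal sketch; the version bounds are the ones stated above):

# Environment sanity check for the modified spconv build
# (assumes the CUDA < 11 / PyTorch < 1.5 constraint stated above).
import torch

print("PyTorch:", torch.__version__)   # expected < 1.5, e.g. 1.4.0
print("CUDA:", torch.version.cuda)     # expected < 11, e.g. 10.2
assert torch.cuda.is_available(), "a CUDA device is required for sparse convolution"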

  • spconv

    cd lib/spconv
    python setup.py bdist_wheel
    cd dist
    # may need to change the filename
    pip install spconv-1.0-cp37-cp37m-linux_x86_64.whl
  • pointgroup_ops

    cd lib/pointgroup_ops
    python setup.py develop
  • bspt_ops

    cd lib/bspt
    python setup.py develop
  • light-field-distance

    cd lib/light-field-distance
    python setup.py develop

Data Preparation

Full folder structure

.
├── datasets
│   ├── scannet
│   │   ├── scans # scannet scans
│   │   │   ├── scene0000_00 # only these 4 files are used.
│   │   │   │   ├── scene0000_00.txt
│   │   │   │   ├── scene0000_00_vh_clean_2.ply
│   │   │   │   ├── scene0000_00.aggregation.json
│   │   │   │   ├── scene0000_00_vh_clean_2.0.010000.segs.json
│   │   │   ├── ......
│   │   │   ├── scene0706_00
│   │   ├── scan2cad # scan2cad, only the following 1 file is used.
│   │   │   ├── full_annotations.json
│   │   ├── scannetv2-labels-combined.tsv # scannet label mappings
│   │   ├── processed_data # preprocessed data
│   │   │   ├── scene0000_00 
│   │   │   │   ├── bbox.pkl
│   │   │   │   ├── data.npz
│   │   │   ├── ......
│   │   │   ├── scene0706_00
│   │   ├── rfs_label_map.csv # generated label mappings
│   ├── ShapeNetCore.v2 # shapenet core v2 dataset
│   │   ├── 02954340
│   │   ├── ......
│   │   ├── 04554684
│   ├── ShapeNetv2_data # preprocessed shapenet dataset
│   │   ├── watertight_scaled_simplified
│   ├── bsp # the pretrained bsp model
│   │   ├── zs
│   │   ├── database_scannet.npz
│   │   ├── model.pth
│   ├── splits # data splits
│   │   ├── train.txt
│   │   ├── val.txt
│   │   ├── test.txt
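As a quick check of the raw ScanNet layout, the following sketch (ours, not part of the repo) verifies that each scan folder contains the four files referenced in the tree above:

# Verify the four raw ScanNet files used by dimr exist for every scan
# (the path layout follows the tree above).
from pathlib import Path

scans = Path("datasets/scannet/scans")
suffixes = [".txt", "_vh_clean_2.ply", ".aggregation.json",
            "_vh_clean_2.0.010000.segs.json"]

for scan in sorted(scans.iterdir()):
    missing = [s for s in suffixes if not (scan / (scan.name + s)).exists()]
    if missing:
        print(scan.name, "is missing", missing)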

Prepare the data

  • download the preprocessed data here (~3.3G) and the label map here, and put them under ./datasets/scannet.

    If you want to process the data by yourself, please:

    • download the ScanNet, Scan2CAD, and ShapeNet datasets, and put them in the corresponding locations.

    • preprocess ScanNet data by:

      python data/generate_data_relabel.py

      By default it launches 16 processes to accelerate processing and finishes in about 10 minutes. Due to mismatches between instance labels and CAD annotations, it may log some warnings, but these do not affect the results. It generates the label map ./datasets/scannet/rfs_label_map.csv and saves data.npz and bbox.pkl for each scene under ./datasets/scannet/processed_data/ (a quick inspection sketch follows this list).

  • download the preprocessed ShapeNet data (simplified watertight meshes) following RfD-Net into ShapeNetv2_data; only watertight_scaled_simplified is used for mesh retrieval and evaluation.

  • download the pretrained BSP-Net checkpoint and extracted GT latent codes here.

    If you want to generate them by yourself, please check the BSP_CVAE repository for generating the ground-truth latent shape codes (the zs folder), the pretrained model (model.pth), and the auxiliary code database (database_scannet.npz).
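The contents of data.npz and bbox.pkl are not documented above, so the following sketch only loads one scene and lists what it finds rather than assuming specific keys:

# Inspect one preprocessed scene; the file names come from the folder
# layout above, and the keys inside are printed rather than assumed.
import pickle
import numpy as np

scene = "datasets/scannet/processed_data/scene0000_00"

data = np.load(scene + "/data.npz")
print("data.npz arrays:", data.files)

with open(scene + "/bbox.pkl", "rb") as f:
    bbox = pickle.load(f)
print("bbox.pkl object type:", type(bbox))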

Training

# train phase 1 (point-wise)
python train.py --config config/rfs_phase1_scannet.yaml

# train phase 2 (proposal-wise)
python train.py --config config/rfs_phase2_scannet.yaml

Please check the config files for more options.
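The configs are YAML files (as the --config flag suggests); a minimal sketch to list the available options, assuming the file parses to a plain mapping:

# Print all options defined in a training config (assumes plain YAML).
import yaml

with open("config/rfs_phase2_scannet.yaml") as f:
    cfg = yaml.safe_load(f)

for key, value in cfg.items():
    print(key, "=", value)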

Testing

Generate completed instance meshes:

# test after training phase 2
python test.py --config config/rfs_phase2_scannet.yaml
# example path for the meshes: ./exp/scannetv2/rfs/rfs_phase2_scannet/result/epoch256_nmst0.3_scoret0.05_npointt100/val/trimeshes/

# test with a specified checkpoint
python test.py --config config/rfs_pretrained_scannet.yaml --pretrain ./checkpoint.pth

We provide the pretrained model here.

To visualize the intermediate point-wise results:

python util/visualize.py --task semantic_gt --room_name all
python util/visualize.py --task instance_gt --room_name all
# after running test.py, you may need to point `--result_root` to the output directory; check the script for more details.
python util/visualize.py --task semantic_pred --room_name all
python util/visualize.py --task instance_pred --room_name all

Evaluation

We provide 4 metrics for evaluating instance mesh reconstruction quality. For the IoU evaluation, we rely on binvox to voxelize meshes (via trimesh's API), so make sure it can be found in the system path.
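Before running the IoU evaluation, you can confirm binvox is discoverable (a minimal sketch):

# Check that the binvox executable is on the system path; trimesh shells
# out to it when voxelizing meshes for the IoU metric.
import shutil

binvox = shutil.which("binvox")
assert binvox is not None, "binvox not found in PATH; the IoU evaluation will fail"
print("using binvox at:", binvox)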

## first, prepare GT instance meshes 
python data/scannetv2_inst.py # prepare at "./datasets/gt_meshes"

## assume the generated meshes are under "./pred_meshes"
# IoU
python evaluation/iou/eval.py ./datasets/gt_meshes ./pred_meshes

# CD
python evaluation/cd/eval.py ./datasets/gt_meshes ./pred_meshes

# LFD
python evaluation/lfd/eval.py ./datasets/gt_meshes ./pred_meshes

# PCR
python evaluation/pcr/eval.py ./pred_meshes
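For reference, a generic symmetric Chamfer distance between two meshes looks like the sketch below. This illustrates the metric, not necessarily the exact formula in evaluation/cd/eval.py (some implementations average squared distances instead):

# Generic symmetric Chamfer distance between surface samples of two meshes.
import trimesh
from scipy.spatial import cKDTree

def chamfer_distance(path_a, path_b, n_samples=2048):
    a = trimesh.load(path_a, force='mesh').sample(n_samples)  # (n, 3) points
    b = trimesh.load(path_b, force='mesh').sample(n_samples)
    d_ab = cKDTree(b).query(a)[0]  # nearest-neighbor distances a -> b
    d_ba = cKDTree(a).query(b)[0]  # nearest-neighbor distances b -> a
    return d_ab.mean() + d_ba.mean()

print(chamfer_distance("gt.ply", "pred.ply"))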

We provide the meshes used in our evaluation for reproduction here; they include the output meshes from RfD-Net, Ours, Ours-projection, and Ours-retrieval.

The GT meshes can also be found here.

Citation

If you find our work useful, please use the following BibTeX entry:

@article{tang2022point,
  title={Point Scene Understanding via Disentangled Instance Mesh Reconstruction},
  author={Tang, Jiaxiang and Chen, Xiaokang and Wang, Jingbo and Zeng, Gang},
  journal={arXiv preprint arXiv:2203.16832},
  year={2022}
}

Acknowledgement

We would like to thank the RfD-Net and PointGroup authors for open-sourcing their great work!


dimr's Issues

What is the metric used in Table 3. Object Recognition Precision?

Hi author!
Thank you for this amazing work!
I have a question about Table 3 in the paper, namely Object Recognition Precision. I'm not sure what IoU is used here. Is it the IoU between proposed instance points and GT instance points, or between the proposed bounding boxes and GT boxes?

Thank you !!

There is no log.py in the util folder?

Thanks for your amazing work!
Here is the question: the line `from util.log import logger` exists in test.py and train.py, but there is no log.py in the util folder, which causes an error when running train.py or test.py.

pointgroup install error

/PG_OP.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E

Windows support

First of all, congratulations on your great work. I currently use SoftGroup for instance detection and want to replace it with your work. I only need to run inference on a 3D scene obtained with BundleFusion. I am a Windows user and have installed all the Python packages without problems. The only issue I have found is spconv, since it is not compatible with Windows (spconv2 is compatible with Windows). My question is whether the dimr method can be made compatible with Windows, given that I only need to run model inference and spconv only works on Linux. Thanks for your attention.

Error when I run train.py

Thanks for your work!
But when I try to run train.py, I get the error "RuntimeError: running_mean should contain 96 elements not 192".
It seems to be caused by a channel mismatch between layers. I want to know whether you have tested the script, or if the problem is only on my side.
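For context, this is PyTorch's BatchNorm channel check; a generic reproduction (not dimr-specific) is:

# PyTorch raises this exact error when an input's channel dimension does
# not match a BatchNorm layer's num_features.
import torch

bn = torch.nn.BatchNorm1d(192)   # layer built with 192 channels
x = torch.randn(8, 96, 10)       # input actually has 96 channels
bn(x)  # RuntimeError: running_mean should contain 96 elements not 192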

Stuck when running python test.py --config config/rfs_phase2_scannet.yaml

The log output is as follows:

[BSP] generating mesh 14 / 16 | chair | z = 0.2947016954421997
[BSP] generating mesh 15 / 16 | display | z = 0.29926255345344543
WARNING - 2022-08-31 11:13:16,860 - sample - only got 1923/2048 samples!
WARNING - 2022-08-31 11:13:17,950 - sample - only got 1282/2048 samples!
WARNING - 2022-08-31 11:13:18,048 - sample - only got 1147/2048 samples!
WARNING - 2022-08-31 11:13:18,771 - sample - only got 1440/2048 samples!
WARNING - 2022-08-31 11:13:19,764 - sample - only got 1350/2048 samples!
WARNING - 2022-08-31 11:13:19,813 - sample - only got 1274/2048 samples!
WARNING - 2022-08-31 11:13:19,851 - sample - only got 1746/2048 samples!
WARNING - 2022-08-31 11:13:19,895 - sample - only got 2028/2048 samples!
WARNING - 2022-08-31 11:13:19,986 - sample - only got 1628/2048 samples!
WARNING - 2022-08-31 11:13:20,613 - sample - only got 1719/2048 samples!
WARNING - 2022-08-31 11:13:20,655 - sample - only got 1897/2048 samples!
WARNING - 2022-08-31 11:13:20,753 - sample - only got 1522/2048 samples!
[Mesh] save 0 / 16, label = chair
[Mesh] save 1 / 16, label = sofa
[Mesh] save 2 / 16, label = display
[Mesh] save 3 / 16, label = chair
[Mesh] save 4 / 16, label = bookshelf
[Mesh] save 5 / 16, label = chair
[Mesh] save 6 / 16, label = display
[Mesh] save 7 / 16, label = sofa
[Mesh] save 8 / 16, label = trash_bin
[Mesh] save 9 / 16, label = table
[Mesh] save 10 / 16, label = trash_bin
[Mesh] save 11 / 16, label = chair
[Mesh] save 12 / 16, label = trash_bin
[Mesh] save 13 / 16, label = table
[Mesh] save 14 / 16, label = chair
[Mesh] save 15 / 16, label = display
[2022-08-31 11:13:21,015  INFO  test.py  line 160  3479014]  instance iter: 2/311 point_num: 195084 ncluster: 16 time: total 11.51s inference 11.28s save 0.23s
INFO - 2022-08-31 11:13:21,015 - test - instance iter: 2/311 point_num: 195084 ncluster: 16 time: total 11.51s inference 11.28s save 0.23s

When I interrupt with Ctrl+C, the traceback shows an exception in data loading:

^CTraceback (most recent call last):
  File "test.py", line 197, in <module>
    test(model, model_fn, data_name, cfg.test_epoch)
  File "test.py", line 62, in test
    for i, batch in enumerate(dataloader):
  File "/home/zsy/anaconda3/envs/dimr/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/home/zsy/anaconda3/envs/dimr/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1182, in _next_data
    idx, data = self._get_data()
  File "/home/zsy/anaconda3/envs/dimr/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1138, in _get_data
    success, data = self._try_get_data()
  File "/home/zsy/anaconda3/envs/dimr/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 986, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/home/zsy/anaconda3/envs/dimr/lib/python3.7/queue.py", line 179, in get
    self.not_empty.wait(remaining)
  File "/home/zsy/anaconda3/envs/dimr/lib/python3.7/threading.py", line 300, in wait
    gotit = waiter.acquire(True, timeout)
KeyboardInterrupt
^C

What could be the reason for this? Is my data incomplete?

Confusion in RfsNet.forward()

There is this code:


            # single-scale proposal gen (pointgroup)
            if self.training:

                ### BFS clustering on shifted coords
                idx_shift, start_len_shift = pointgroup_ops.ballquery_batch_p(coords_ + pt_offsets_, batch_idxs_, batch_offsets_, self.cluster_radius, self.cluster_shift_meanActive)
                proposals_idx_shift, proposals_offset_shift = pointgroup_ops.bfs_cluster(semantic_preds_, idx_shift.cpu(), start_len_shift.cpu(), self.cluster_npoint_thre)
                proposals_idx_shift[:, 1] = object_idxs[proposals_idx_shift[:, 1].long()].int() # remap: sumNPoint --> N

                # proposals_idx_shift: (sumNPoint, 2), int, dim 0 for cluster_id, dim 1 for corresponding point idxs in N
                # proposals_offset_shift: (nProposal + 1), int, start/end index for each proposal, e.g., [0, c1, c1+c2, ..., c1+...+c_nprop = sumNPoint]; same information as cluster_id, just for the convenience of the CUDA operators.

                ### BFS clustering on original coords
                idx, start_len = pointgroup_ops.ballquery_batch_p(coords_, batch_idxs_, batch_offsets_, self.cluster_radius, self.cluster_meanActive)
                proposals_idx, proposals_offset = pointgroup_ops.bfs_cluster(semantic_preds_, idx.cpu(), start_len.cpu(), self.cluster_npoint_thre)
                proposals_idx[:, 1] = object_idxs[proposals_idx[:, 1].long()].int()

                # proposals_idx: (sumNPoint, 2), int, dim 0 for cluster_id, dim 1 for corresponding point idxs in N
                # proposals_offset: (nProposal + 1), int

                ### merge two type of clusters
                proposals_idx_shift[:, 0] += (proposals_offset.size(0) - 1)
                proposals_offset_shift += proposals_offset[-1]
            
                proposals_idx = torch.cat((proposals_idx, proposals_idx_shift), dim=0)#.long().cuda()
                proposals_offset = torch.cat((proposals_offset, proposals_offset_shift[1:]), dim=0)#.cuda()
                # why [1:]: offset is (0, c1, c2), offset_shift is (0, d1, d2) + c2, output is (0, c1, c2, c2+d1, c2+d2)

            # multi-scale proposal gen (naive, maskgroup)
            else:

                idx_shift, start_len_shift = pointgroup_ops.ballquery_batch_p(coords_ + pt_offsets_, batch_idxs_, batch_offsets_, 0.01, self.cluster_shift_meanActive)
                proposals_idx_shift_0, proposals_offset_shift_0 = pointgroup_ops.bfs_cluster(semantic_preds_, idx_shift.cpu(), start_len_shift.cpu(), self.cluster_npoint_thre)
                proposals_idx_shift_0[:, 1] = object_idxs[proposals_idx_shift_0[:, 1].long()].int()
                
                idx_shift, start_len_shift = pointgroup_ops.ballquery_batch_p(coords_ + pt_offsets_, batch_idxs_, batch_offsets_, 0.03, self.cluster_shift_meanActive)
                proposals_idx_shift_1, proposals_offset_shift_1 = pointgroup_ops.bfs_cluster(semantic_preds_, idx_shift.cpu(), start_len_shift.cpu(), self.cluster_npoint_thre)
                proposals_idx_shift_1[:, 1] = object_idxs[proposals_idx_shift_1[:, 1].long()].int()
                
                idx_shift, start_len_shift = pointgroup_ops.ballquery_batch_p(coords_ + pt_offsets_, batch_idxs_, batch_offsets_, 0.05, self.cluster_shift_meanActive)
                proposals_idx_shift_2, proposals_offset_shift_2 = pointgroup_ops.bfs_cluster(semantic_preds_, idx_shift.cpu(), start_len_shift.cpu(), self.cluster_npoint_thre)
                proposals_idx_shift_2[:, 1] = object_idxs[proposals_idx_shift_2[:, 1].long()].int()
                    
                idx, start_len = pointgroup_ops.ballquery_batch_p(coords_, batch_idxs_, batch_offsets_, 0.03, self.cluster_meanActive)
                proposals_idx_0, proposals_offset_0 = pointgroup_ops.bfs_cluster(semantic_preds_, idx.cpu(), start_len.cpu(), self.cluster_npoint_thre)
                proposals_idx_0[:, 1] = object_idxs[proposals_idx_0[:, 1].long()].int()
                    
                _offset = proposals_offset_0.size(0) - 1
                proposals_idx_shift_0[:, 0] += _offset
                proposals_offset_shift_0 += proposals_offset_0[-1]
                
                _offset += proposals_offset_shift_0.size(0) - 1
                proposals_idx_shift_1[:, 0] += _offset
                proposals_offset_shift_1 += proposals_offset_shift_0[-1]
                
                _offset += proposals_offset_shift_1.size(0) - 1
                proposals_idx_shift_2[:, 0] += _offset
                proposals_offset_shift_2 += proposals_offset_shift_1[-1]
                
                proposals_idx = torch.cat((proposals_idx_0, proposals_idx_shift_0, proposals_idx_shift_1, proposals_idx_shift_2), dim=0)
                proposals_offset = torch.cat((proposals_offset_0, proposals_offset_shift_0[1:], proposals_offset_shift_1[1:], proposals_offset_shift_2[1:]))

        
            #### proposals voxelization again
            input_feats, inp_map, proposal_angle, point_center, point_scale, proposal_semantics = self.clusters_voxelization(proposals_idx, proposals_offset, output_feats, semantic_scores_CAD, pt_angles, coords, self.score_fullscale, self.score_scale, self.mode)

I would like to know: how can I find the original batch_idx corresponding to each proposal bbox?
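The `[1:]` offset-merging trick commented in the snippet can be checked with toy numbers (a standalone sketch, independent of the CUDA ops):

# Toy check of the offset merge: [0, c1, c2] and [0, d1, d2] combine into
# [0, c1, c2, c2+d1, c2+d2], matching the comment in the code above.
import torch

proposals_offset = torch.tensor([0, 3, 7])        # two clusters of sizes 3 and 4
proposals_offset_shift = torch.tensor([0, 2, 5])  # two clusters of sizes 2 and 3

proposals_offset_shift = proposals_offset_shift + proposals_offset[-1]
merged = torch.cat((proposals_offset, proposals_offset_shift[1:]), dim=0)
print(merged)  # tensor([ 0,  3,  7,  9, 12])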
