ywyue / roomformer

[CVPR 2023] RoomFormer: Two-level Queries for Single-stage Floorplan Reconstruction

Home Page: https://ywyue.github.io/RoomFormer/

License: MIT License

Python 89.47% C++ 3.21% Cuda 7.20% Shell 0.12%
transformer two-level-queries deep-learning pytorch floorplan-reconstruction floorplan

roomformer's Introduction

Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries

CVPR 2023

Yuanwen Yue, Theodora Kontogianni, Konrad Schindler, Francis Engelmann

ETH Zurich

This repository provides code, data and pretrained models for RoomFormer, a Transformer model for single-stage floorplan reconstruction.

[Project Webpage] [Paper] [Video]

Table of Contents
  1. Abstract
  2. Method
  3. Preparation
  4. Evaluation
  5. Training
  6. Semantically-rich Floorplan
  7. Citation
  8. Acknowledgment

Abstract

We address 2D floorplan reconstruction from 3D scans. Existing approaches typically employ heuristically designed multi-stage pipelines. Instead, we formulate floorplan reconstruction as a single-stage structured prediction task: find a variable-size set of polygons, which in turn are variable-length sequences of ordered vertices. To solve it we develop a novel Transformer architecture that generates polygons of multiple rooms in parallel, in a holistic manner without hand-crafted intermediate stages. The model features two-level queries for polygons and corners, and includes polygon matching to make the network end-to-end trainable. Our method achieves a new state-of-the-art for two challenging datasets, Structured3D and SceneCAD, along with significantly faster inference than previous methods. Moreover, it can readily be extended to predict additional information, i.e., semantic room types and architectural elements like doors and windows.

Method


Illustration of the RoomFormer model. Given a top-down-view density map of the input point cloud, (a) the feature backbone extracts multi-scale features, adds positional encodings, and flattens them before passing them into the (b) Transformer encoder. (c) The Transformer decoder takes as input our two-level queries, one level for the room polygons (up to M) and one level for their corners (up to N per room polygon). A feed-forward network (FFN) predicts a class c for each query to accommodate varying numbers of rooms and corners. During training, the polygon matching guarantees optimal assignment between predicted and groundtruth polygons.
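To make the two-level query structure concrete, here is a schematic sketch, not RoomFormer's actual implementation: the tensor names and head designs are illustrative assumptions. M=20 and N=40 are chosen to match the --num_polys=20 / --num_queries=800 flags that appear in the training commands reported in the issues below.

import torch
import torch.nn as nn

M, N, d = 20, 40, 256  # assumed: max rooms, max corners per room, hidden dim

# Two-level queries: one learned embedding per (room, corner) slot,
# flattened to M*N queries for the Transformer decoder.
queries = nn.Embedding(M * N, d)

# Placeholder decoder output for a batch of one; a real model would run
# cross-attention against the encoder's multi-scale image features here.
hs = queries.weight.unsqueeze(0)  # (1, M*N, d)

# Per-query heads: a class logit marks each corner slot as valid or empty
# (this is how variable room/corner counts emerge), and a 2D regression
# head predicts the corner position.
class_head = nn.Linear(d, 2)  # valid vs. empty corner
coord_head = nn.Linear(d, 2)  # (x, y), squashed to [0, 1]

logits = class_head(hs).view(1, M, N, 2)
coords = coord_head(hs).sigmoid().view(1, M, N, 2)
print(logits.shape, coords.shape)  # both: torch.Size([1, 20, 40, 2])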

Preparation

Environment

  • The code has been tested on Linux with Python 3.8, PyTorch 1.9.0, and CUDA 11.1.
  • We recommend installing through conda:
    • Create an environment:
    conda create -n roomformer python=3.8
    conda activate roomformer
    • Install PyTorch and other required packages:
    # adjust the cuda version accordingly
    pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
    pip install -r requirements.txt
    cd models/ops
    sh make.sh
    
    # unit test for deformable-attention modules (should see all checking is True)
    # python test.py
    
    cd ../../diff_ras
    python setup.py build develop
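After the build steps, a quick sanity check, a minimal sketch assuming the CUDA build above succeeded, confirms the expected PyTorch/CUDA combination is active before running the unit test:

import torch

# Expected output for the tested configuration: 1.9.0+cu111 / 11.1 / True
print(torch.__version__)
print(torch.version.cuda)
print(torch.cuda.is_available())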

Data

We directly provide the processed data in the required format below. For details on data preprocessing, please refer to data_preprocess.

Structured3D

We convert multi-view RGB-D panoramas to point clouds and project the point clouds along the vertical axis into density images. Please download our processed Structured3D dataset (updated 03/28/2023) in COCO format and organize it as follows:

code_root/
└── data/
    └── stru3d/
        ├── train/
        ├── val/
        ├── test/
        └── annotations/
            ├── train.json
            ├── val.json
            └── test.json
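For intuition, the vertical-axis projection described above can be approximated with a 2D histogram. This is only an illustrative sketch, not the repository's preprocessing code; the bin count, normalization, and axis conventions are assumptions:

import numpy as np

def density_image(points, resolution=256):
    # Project an (N, 3) point cloud along the z-axis into a density image.
    xy = points[:, :2]
    # Normalize x/y coordinates into the unit square before binning.
    xy = (xy - xy.min(axis=0)) / (np.ptp(xy, axis=0) + 1e-8)
    hist, _, _ = np.histogram2d(xy[:, 0], xy[:, 1],
                                bins=resolution, range=[[0, 1], [0, 1]])
    return hist / (hist.max() + 1e-8)  # scale densities to [0, 1]

# Example: 100k random points yield a roughly uniform 256x256 density image
img = density_image(np.random.rand(100_000, 3))
print(img.shape)  # (256, 256)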

SceneCAD

SceneCAD contains 3D room layout annotations on real-world RGB-D scans from ScanNet. We convert the layout annotations to 2D floorplan polygons and use the same procedure as for Structured3D to project the RGB-D scans to density maps. Please download our processed SceneCAD dataset in COCO format and organize it as follows:

code_root/
└── data/
    └── scenecad/
        ├── train/
        ├── val/
        └── annotations/
            ├── train.json
            ├── val.json
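Since both datasets are in COCO format, loading can be sanity-checked with pycocotools. This is a hedged sketch: the path follows the trees above, and the pycocotools dependency is an assumption if it is not already pulled in by requirements.txt:

from pycocotools.coco import COCO

# Annotation path follows the directory layout shown above.
coco = COCO('data/stru3d/annotations/train.json')
print(len(coco.getImgIds()), 'images,', len(coco.getAnnIds()), 'annotations')

The same check works for data/scenecad/annotations/train.json.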

Checkpoints

Please download and extract the checkpoints of our model from this link.

Evaluation

Structured3D

We use the same evaluation scripts as MonteFloor. Please first download the ground truth data used by MonteFloor and HEAT from this link (required by the evaluation code) and extract it as ./s3d_floorplan_eval/montefloor_data. Then run the following command to evaluate the model on the Structured3D test set:

./tools/eval_stru3d.sh

If you want to evaluate our model trained on a "tight" room layout (see paper appendix), please run:

./tools/eval_stru3d_tight.sh

Please note that the evaluation still runs on the unmodified groundtruth floorplans from MonteFloor. However, we also provide our processed "tight" room layouts here in case one wants to retrain the model on them.

SceneCAD

We adapt the evaluation scripts from MonteFloor to evaluate SceneCAD:

./tools/eval_scenecad.sh

Training

The command for training RoomFormer on Structured3D is as follows:

./tools/train_stru3d.sh

Similarly, to train RoomFormer on SceneCAD, run the following command:

./tools/train_scenecad.sh
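Both training scripts are thin wrappers around main.py. For reference, a direct invocation with the flags that appear in the issue reports below looks roughly like the following (any defaults beyond these flags are assumptions; prefer the provided scripts):

python main.py --dataset_name=stru3d --dataset_root=data/stru3d --num_queries=800 --num_polys=20 --semantic_classes=-1 --job_name=train_stru3d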

Semantically-rich Floorplan

RoomFormer can easily be extended to predict room types, doors, and windows. We provide the implementation and model for SD-TQ (the variant with minimal changes to our original architecture). To evaluate or train on the semantically-rich floorplans of Structured3D, run the following commands:

### Evaluation:
./tools/eval_stru3d_sem_rich.sh
### Train:
./tools/train_stru3d_sem_rich.sh

Citation

If you find RoomFormer useful in your research, please cite our paper:

@inproceedings{yue2023connecting,
  title     = {{Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries}},
  author    = {Yue, Yuanwen and Kontogianni, Theodora and Schindler, Konrad and Engelmann, Francis},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2023}
}

Acknowledgment

We thank the authors of HEAT and MonteFloor for providing results on Structured3D for better comparison. Theodora Kontogianni and Francis Engelmann are postdoctoral research fellows at the ETH AI Center. We also thank the authors of the excellent open-source projects our code builds on.

roomformer's People

Contributors: submagr, ywyue

roomformer's Issues

The indicators of training results are abnormal

When I trained directly with the processed Structured3D dataset, the final training result seemed to be wrong. I don't know what I did wrong that led to this.

When training, I used the command:
python main.py --dataset_name=stru3d --dataset_root=data/stru3d --num_queries=800 --num_polys=20 --semantic_classes=-1 --job_name=train_stru3d --num_workers=0

The relevant indicators obtained after the training are as follows:
Averaged stats: room_prec: 0.0000 (0.1604) room_rec: 0.0000 (0.1415) corner_prec: 0.0000 (0.0777) corner_rec: 0.0000 (0.0607) angles_prec: 0.0000 (0.0626) angles_rec: 0.0000 (0.0500) loss: 0.8760 (0.8640) loss_ce: 0.1437 (0.1420) loss_coords: 0.3569 (0.3627) loss_raster: 0.3638 (0.3593) loss_ce_unscaled: 0.0719 (0.0710) loss_coords_unscaled: 0.0714 (0.0725) loss_raster_unscaled: 0.3638 (0.3593) cardinality_error_unscaled: 9.7000 (9.8400)

The charts in wandb were attached as screenshots.

My dataset file directory was shown in an attached screenshot.

Segmentation fault (core dumped)

Hi Team,

Thank you for the code and pre-trained model.
However, when I try to load the model into GPU memory I get a "Segmentation fault (core dumped)" message. I am using a 16 GB GPU machine with the same environment the repository suggests, and I also tried a 24 GB machine but still get the same error. What could be the reason for this? If anyone has a solution, please share it.

  1. Does the DETR implementation only work on V100 GPUs?
  2. Do I need a single GPU with almost 32 GB of memory?
  3. Is there a way to run the same on CPU?
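As an aside, one way to inspect a checkpoint without touching the GPU is to load it on the CPU first. A minimal sketch; the checkpoint filename is hypothetical:

import torch

# map_location='cpu' keeps all tensors on the CPU during deserialization,
# so no GPU memory is allocated while loading the checkpoint.
ckpt = torch.load('checkpoints/roomformer_stru3d.pth', map_location='cpu')
print(type(ckpt))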

issue about corners_pad[:len(corners)] = corners

I hit this problem when training on my own dataset (error screenshot attached).

It looks like the length of corners is 298, but the length of corners_pad is only 80. Is this caused by my training data? The data looks like the attached screenshot.

Thank you very much!
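For context, the failing assignment can be reproduced in isolation. With the shapes reported above (298 corners vs. an 80-slot buffer), NumPy refuses the assignment; a minimal sketch with assumed dtypes:

import numpy as np

corners = np.zeros((298, 2), dtype=np.float32)
corners_pad = np.zeros((80, 2), dtype=np.float32)

# corners_pad[:len(corners)] clips to 80 rows, so assigning 298 rows raises:
# ValueError: could not broadcast input array from shape (298,2) into shape (80,2)
corners_pad[:len(corners)] = corners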

detectron2.data

ModuleNotFoundError: No module named 'detectron2.data'
Could you please provide the corresponding code?

Training got stuck

Hi, ywyue!
Thank you for your wonderful work.
I tried to train on the Structured3D dataset; however, the training got stuck midway without any error being reported (screenshot attached).
I tried setting --num_workers=0, but the problem was not resolved.
I've tried terminating and resuming training multiple times, but the epoch at which it gets stuck varies each time. Do you have any suggestions for a solution?

I'm using torch 1.9.0+cu111 and running main.py on WSL2 Ubuntu 20.04.

run eval_stru3d_sem_rich.sh failed

eval_stru3d.sh, eval_stru3d_tight.sh, and eval_scenecad.sh all run normally and give fine results.

But when I run eval_stru3d_sem_rich.sh, it fails (error screenshot attached).

My environment is also different:
torch 1.13.1
cuda 11.6

I don't know whether I must change my environment.

Train on my dataset

Thank you for your open-source work. When I tried to use my own three-channel image dataset, training seemed not to work: there were no errors and the loss kept decreasing, but the accuracy on the validation set was always 0. Do you know the possible reasons for this?

Some advice on training the model

Hi Yuanwen, thanks for your great work, and congratulations on RoomFormer being accepted to CVPR!
However, I have trouble training RoomFormer to reproduce the reported results, especially angle precision and angle recall; my numbers are about 2 points lower than the paper's across the six evaluation metrics. Also, the code saves a checkpoint every 20 epochs; is there a better way to save checkpoints?
Can you give me some advice on how to train the model?
Thanks a lot for your help!

About data preprocessing

Hi ywyue, thank you for your work.
When I run "generate_point_cloud_stru3d.py", I get the following error.
I should add that some files in the "Structured3D_panorama" archives I downloaded were corrupted, so I had to delete some of the folders named "scene_xxxxx". I don't know whether this has any impact. At the time of the error, the code had already processed all four of the zipped archives.

[Progress-bar output omitted; it shows "Pointcloud size: 1036436", the inner loop at 152-153/198 scenes, and the outer loop at 5/19 archives.]
Traceback (most recent call last):
  File "generate_point_cloud_stru3d.py", line 29, in <module>
    main(config())
  File "generate_point_cloud_stru3d.py", line 22, in main
    reader = PointCloudReaderPanorama(scene_path, random_level=0, generate_color=True, generate_normal=False)
  File "E:\YangMeiQi\git\RoomFormer\data_preprocess\stru3d\PointCloudReaderPanorama.py", line 25, in __init__
    self.point_cloud = self.generate_point_cloud(self.random_level, color=self.generate_color, normal=self.generate_normal)
  File "E:\YangMeiQi\git\RoomFormer\data_preprocess\stru3d\PointCloudReaderPanorama.py", line 75, in generate_point_cloud
    coords[:,:2] = np.round(coords[:,:2] / 10) * 10.
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

code

This paper is great! Could you publish the code? Thank you very much!

Actual 2D floor plan images as Density Map

Thanks for the great work!

This is not an issue but more of a question. Can your work and implementation be used to detect rooms and walls directly from a 2D floor plan image? That is, if I bypass the 3D part, can actual 2D floorplan images or PDFs somehow be used as a "density map" in your setting?

Structured3D Data preprocessing problem

Some data in Structured3D is missing, which causes point cloud generation to fail. Did you use this tool to convert the point clouds?

generate_point_cloud_stru3d.py

generate_point_cloud_stru3d.py error

An error occurred while trying to convert the Structured3D dataset to point clouds. It seems that the dataset I downloaded, after extraction, does not have the same directory layout as assumed in the code.

error:

Creating point cloud from perspective views...
  0%|          | 0/3500 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "data_preprocess/stru3d/generate_point_cloud_stru3d.py", line 29, in <module>
    main(config())
  File "data_preprocess/stru3d/generate_point_cloud_stru3d.py", line 19, in main
    scenes = os.listdir(os.path.join(data_root, part, 'Structured3D'))
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/data2/数据集/Structured3D/Structured3D/scene_02893/Structured3D'

The unzipped Structured3D dataset directory is shown in the attached screenshot.
