ywyue / roomformer

[CVPR 2023] RoomFormer: Two-level Queries for Single-stage Floorplan Reconstruction

Home Page: https://ywyue.github.io/RoomFormer/

License: MIT License

Python 89.47% C++ 3.21% Cuda 7.20% Shell 0.12%
transformer two-level-queries deep-learning pytorch floorplan-reconstruction floorplan

roomformer's Introduction

Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries

CVPR 2023

Yuanwen Yue, Theodora Kontogianni, Konrad Schindler, Francis Engelmann

ETH Zurich

This repository provides code, data and pretrained models for RoomFormer, a Transformer model for single-stage floorplan reconstruction.

[Project Webpage] [Paper] [Video]

Table of Contents
  1. Abstract
  2. Method
  3. Preparation
  4. Evaluation
  5. Training
  6. Semantically-rich Floorplan
  7. Citation
  8. Acknowledgment

Abstract

We address 2D floorplan reconstruction from 3D scans. Existing approaches typically employ heuristically designed multi-stage pipelines. Instead, we formulate floorplan reconstruction as a single-stage structured prediction task: find a variable-size set of polygons, which in turn are variable-length sequences of ordered vertices. To solve it we develop a novel Transformer architecture that generates polygons of multiple rooms in parallel, in a holistic manner without hand-crafted intermediate stages. The model features two-level queries for polygons and corners, and includes polygon matching to make the network end-to-end trainable. Our method achieves a new state-of-the-art for two challenging datasets, Structured3D and SceneCAD, along with significantly faster inference than previous methods. Moreover, it can readily be extended to predict additional information, i.e., semantic room types and architectural elements like doors and windows.

Method


Illustration of the RoomFormer model. Given a top-down-view density map of the input point cloud, (a) the feature backbone extracts multi-scale features, adds positional encodings, and flattens them before passing them into the (b) Transformer encoder. (c) The Transformer decoder takes as input our two-level queries, one level for the room polygons (up to M) and one level for their corners (up to N per room polygon). A feed-forward network (FFN) predicts a class c for each query to accommodate varying numbers of rooms and corners. During training, the polygon matching guarantees optimal assignment between predicted and groundtruth polygons.
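To make the two-level query structure concrete, here is a schematic sketch, not RoomFormer's actual implementation: the tensor names and head designs are illustrative assumptions. M=20 and N=40 are chosen to match the --num_polys=20 / --num_queries=800 flags that appear in the training commands reported in the issues below.

import torch
import torch.nn as nn

M, N, d = 20, 40, 256  # assumed: max rooms, max corners per room, hidden dim

# Two-level queries: one learned embedding per (room, corner) slot,
# flattened to M*N queries for the Transformer decoder.
queries = nn.Embedding(M * N, d)

# Placeholder decoder output for a batch of one; a real model would run
# cross-attention against the encoder's multi-scale image features here.
hs = queries.weight.unsqueeze(0)  # (1, M*N, d)

# Per-query heads: a class logit marks each corner slot as valid or empty
# (this is how variable room/corner counts emerge), and a 2D regression
# head predicts the corner position.
class_head = nn.Linear(d, 2)  # valid vs. empty corner
coord_head = nn.Linear(d, 2)  # (x, y), squashed to [0, 1]

logits = class_head(hs).view(1, M, N, 2)
coords = coord_head(hs).sigmoid().view(1, M, N, 2)
print(logits.shape, coords.shape)  # both: torch.Size([1, 20, 40, 2])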

Preparation

Environment

  • The code has been tested on Linux with Python 3.8, PyTorch 1.9.0, and CUDA 11.1.
  • We recommend installing through conda:
    • Create an environment:
    conda create -n roomformer python=3.8
    conda activate roomformer
    • Install PyTorch and other required packages:
    # adjust the cuda version accordingly
    pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
    pip install -r requirements.txt
    cd models/ops
    sh make.sh
    
    # unit test for deformable-attention modules (should see all checking is True)
    # python test.py
    
    cd ../../diff_ras
    python setup.py build develop
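After the build steps, a quick sanity check, a minimal sketch assuming the CUDA build above succeeded, confirms the expected PyTorch/CUDA combination is active before running the unit test:

import torch

# Expected output for the tested configuration: 1.9.0+cu111 / 11.1 / True
print(torch.__version__)
print(torch.version.cuda)
print(torch.cuda.is_available())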

Data

We directly provide the processed data in the required format below. For details on data preprocessing, please refer to data_preprocess.

Structured3D

We convert multi-view RGB-D panoramas to point clouds and project the point clouds along the vertical axis into density images. Please download our processed Structured3D dataset (updated 03/28/2023) in COCO format and organize it as follows:

code_root/
└── data/
    └── stru3d/
        ├── train/
        ├── val/
        ├── test/
        └── annotations/
            ├── train.json
            ├── val.json
            └── test.json
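For intuition, the vertical-axis projection described above can be approximated with a 2D histogram. This is only an illustrative sketch, not the repository's preprocessing code; the bin count, normalization, and axis conventions are assumptions:

import numpy as np

def density_image(points, resolution=256):
    # Project an (N, 3) point cloud along the z-axis into a density image.
    xy = points[:, :2]
    # Normalize x/y coordinates into the unit square before binning.
    xy = (xy - xy.min(axis=0)) / (np.ptp(xy, axis=0) + 1e-8)
    hist, _, _ = np.histogram2d(xy[:, 0], xy[:, 1],
                                bins=resolution, range=[[0, 1], [0, 1]])
    return hist / (hist.max() + 1e-8)  # scale densities to [0, 1]

# Example: 100k random points yield a roughly uniform 256x256 density image
img = density_image(np.random.rand(100_000, 3))
print(img.shape)  # (256, 256)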

SceneCAD

SceneCAD contains 3D room layout annotations on real-world RGB-D scans from ScanNet. We convert the layout annotations to 2D floorplan polygons and use the same procedure as for Structured3D to project the RGB-D scans to density maps. Please download our processed SceneCAD dataset in COCO format and organize it as follows:

code_root/
└── data/
    └── scenecad/
        ├── train/
        ├── val/
        └── annotations/
            ├── train.json
            ├── val.json
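Since both datasets are in COCO format, loading can be sanity-checked with pycocotools. This is a hedged sketch: the path follows the trees above, and the pycocotools dependency is an assumption if it is not already pulled in by requirements.txt:

from pycocotools.coco import COCO

# Annotation path follows the directory layout shown above.
coco = COCO('data/stru3d/annotations/train.json')
print(len(coco.getImgIds()), 'images,', len(coco.getAnnIds()), 'annotations')

The same check works for data/scenecad/annotations/train.json.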

Checkpoints

Please download and extract the checkpoints of our model from this link.

Evaluation

Structured3D

We use the same evaluation scripts as MonteFloor. Please first download the ground truth data used by MonteFloor and HEAT from this link (required by the evaluation code) and extract it as ./s3d_floorplan_eval/montefloor_data. Then run the following command to evaluate the model on the Structured3D test set:

./tools/eval_stru3d.sh

If you want to evaluate our model trained on a "tight" room layout (see paper appendix), please run:

./tools/eval_stru3d_tight.sh

Please note that the evaluation still runs on the unmodified groundtruth floorplans from MonteFloor. However, we also provide our processed "tight" room layouts here in case one wants to retrain the model on them.

SceneCAD

We adapt the evaluation scripts from MonteFloor to evaluate SceneCAD:

./tools/eval_scenecad.sh

Training

The command for training RoomFormer on Structured3D is as follows:

./tools/train_stru3d.sh

Similarly, to train RoomFormer on SceneCAD, run the following command:

./tools/train_scenecad.sh
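Both training scripts are thin wrappers around main.py. For reference, a direct invocation with the flags that appear in the issue reports below looks roughly like the following (any defaults beyond these flags are assumptions; prefer the provided scripts):

python main.py --dataset_name=stru3d --dataset_root=data/stru3d --num_queries=800 --num_polys=20 --semantic_classes=-1 --job_name=train_stru3d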

Semantically-rich Floorplan

RoomFormer can easily be extended to predict room types, doors, and windows. We provide the implementation and model for SD-TQ (the variant with minimal changes to our original architecture). To evaluate or train on the semantically-rich floorplans of Structured3D, run the following commands:

### Evaluation:
./tools/eval_stru3d_sem_rich.sh
### Train:
./tools/train_stru3d_sem_rich.sh

Citation

If you find RoomFormer useful in your research, please cite our paper:

@inproceedings{yue2023connecting,
  title     = {{Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries}},
  author    = {Yue, Yuanwen and Kontogianni, Theodora and Schindler, Konrad and Engelmann, Francis},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2023}
}

Acknowledgment

We thank the authors of HEAT and MonteFloor for providing results on Structured3D for better comparison. Theodora Kontogianni and Francis Engelmann are postdoctoral research fellows at the ETH AI Center. We also thank the authors of the excellent open-source projects our code builds on.

roomformer's People

Contributors: submagr, ywyue

roomformer's Issues

The indicators of training results are abnormal

When I trained directly with the processed Structured3D dataset, the final training result seemed to be wrong. I don't know what I did wrong that led to this.

When training, I used the command:
python main.py --dataset_name=stru3d --dataset_root=data/stru3d --num_queries=800 --num_polys=20 --semantic_classes=-1 --job_name=train_stru3d --num_workers=0

The relevant indicators obtained after the training are as follows:
Averaged stats: room_prec: 0.0000 (0.1604) room_rec: 0.0000 (0.1415) corner_prec: 0.0000 (0.0777) corner_rec: 0.0000 (0.0607) angles_prec: 0.0000 (0.0626) angles_rec: 0.0000 (0.0500) loss: 0.8760 (0.8640) loss_ce: 0.1437 (0.1420) loss_coords: 0.3569 (0.3627) loss_raster: 0.3638 (0.3593) loss_ce_unscaled: 0.0719 (0.0710) loss_coords_unscaled: 0.0714 (0.0725) loss_raster_unscaled: 0.3638 (0.3593) cardinality_error_unscaled: 9.7000 (9.8400)

The charts in wandb were attached as screenshots.

My dataset file directory was shown in an attached screenshot.

Segmentation fault (core dumped)

Hi Team,

Thank you for the code and pre-trained model.
However, when I try to load the model into GPU memory I get a "Segmentation fault (core dumped)" message. I am using a 16 GB GPU machine with the same environment the repository suggests, and I also tried a 24 GB machine but still get the same error. What could be the reason for this? If anyone has a solution, please share it.

  1. Does the DETR implementation only work on V100 GPUs?
  2. Do I need a single GPU with almost 32 GB of memory?
  3. Is there a way to run the same on CPU?
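As an aside, one way to inspect a checkpoint without touching the GPU is to load it on the CPU first. A minimal sketch; the checkpoint filename is hypothetical:

import torch

# map_location='cpu' keeps all tensors on the CPU during deserialization,
# so no GPU memory is allocated while loading the checkpoint.
ckpt = torch.load('checkpoints/roomformer_stru3d.pth', map_location='cpu')
print(type(ckpt))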

issue about corners_pad[:len(corners)] = corners

I hit this problem when training on my own dataset (error screenshot attached).

It looks like the length of corners is 298, but the length of corners_pad is only 80. Is this caused by my training data? The data looks like the attached screenshot.

Thank you very much!
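For context, the failing assignment can be reproduced in isolation. With the shapes reported above (298 corners vs. an 80-slot buffer), NumPy refuses the assignment; a minimal sketch with assumed dtypes:

import numpy as np

corners = np.zeros((298, 2), dtype=np.float32)
corners_pad = np.zeros((80, 2), dtype=np.float32)

# corners_pad[:len(corners)] clips to 80 rows, so assigning 298 rows raises:
# ValueError: could not broadcast input array from shape (298,2) into shape (80,2)
corners_pad[:len(corners)] = corners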

detectron2.data

ModuleNotFoundError: No module named 'detectron2.data'
Could you please provide the corresponding code?

Training got stuck

Hi, ywyue!
Thank you for your wonderful work.
I tried to train on the Structured3D dataset; however, the training got stuck midway without any error being reported (screenshot attached).
I tried setting --num_workers=0, but the problem was not resolved.
I've tried terminating and resuming training multiple times, but the epoch at which it gets stuck varies each time. Do you have any suggestions for a solution?

I'm using torch 1.9.0+cu111 and running main.py on WSL2 Ubuntu 20.04.

run eval_stru3d_sem_rich.sh failed

eval_stru3d.sh, eval_stru3d_tight.sh, and eval_scenecad.sh all run normally and give fine results.

But when I run eval_stru3d_sem_rich.sh, it fails (error screenshot attached).

My environment is also different:
torch 1.13.1
cuda 11.6

I don't know whether I must change my environment.

Train on my dataset

Thank you for your open-source work. When I tried to use my own three-channel image dataset, training seemed not to work: there were no errors and the loss kept decreasing, but the accuracy on the validation set was always 0. Do you know the possible reasons for this?

Some advice on training the model

Hi Yuanwen, thanks for your great work, and congratulations on RoomFormer being accepted to CVPR!
However, I have trouble training RoomFormer to reproduce the reported results, especially angle precision and angle recall; my numbers are about 2 points lower than the paper's across the six evaluation metrics. Also, the code saves a checkpoint every 20 epochs; is there a better way to save checkpoints?
Can you give me some advice on how to train the model?
Thanks a lot for your help!

About data preprocessing

Hi ywyue, thank you for your work.
When I run "generate_point_cloud_stru3d.py", I get the following error.
I should add that some files in the "Structured3D_panorama" archives I downloaded were corrupted, so I had to delete some of the folders named "scene_xxxxx". I don't know whether this has any impact. At the time of the error, the code had already processed all four of the zipped archives.

[Progress-bar output omitted; it shows "Pointcloud size: 1036436", the inner loop at 152-153/198 scenes, and the outer loop at 5/19 archives.]
Traceback (most recent call last):
  File "generate_point_cloud_stru3d.py", line 29, in <module>
    main(config())
  File "generate_point_cloud_stru3d.py", line 22, in main
    reader = PointCloudReaderPanorama(scene_path, random_level=0, generate_color=True, generate_normal=False)
  File "E:\YangMeiQi\git\RoomFormer\data_preprocess\stru3d\PointCloudReaderPanorama.py", line 25, in __init__
    self.point_cloud = self.generate_point_cloud(self.random_level, color=self.generate_color, normal=self.generate_normal)
  File "E:\YangMeiQi\git\RoomFormer\data_preprocess\stru3d\PointCloudReaderPanorama.py", line 75, in generate_point_cloud
    coords[:,:2] = np.round(coords[:,:2] / 10) * 10.
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

code

This paper is great! Could you publish the code? Thank you very much!

Actual 2D floor plan images as Density Map

Thanks for the great work!

This is not an issue but more of a question. Can your work and implementation be used to detect rooms and walls directly from a 2D floor plan image? That is, if I bypass the 3D part, can actual 2D floorplan images or PDFs somehow be used as a "density map" in your setting?

Structured3D Data preprocessing problem

Some data in Structured3D is missing, which causes point cloud generation to fail. Did you use this tool to convert the point clouds?

generate_point_cloud_stru3d.py

generate_point_cloud_stru3d.py error

An error occurred while trying to convert the Structured3D dataset to point clouds. It seems that the dataset I downloaded, after extraction, does not have the same directory layout as assumed in the code.

error:

Creating point cloud from perspective views...
  0%|          | 0/3500 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "data_preprocess/stru3d/generate_point_cloud_stru3d.py", line 29, in <module>
    main(config())
  File "data_preprocess/stru3d/generate_point_cloud_stru3d.py", line 19, in main
    scenes = os.listdir(os.path.join(data_root, part, 'Structured3D'))
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/data2/数据集/Structured3D/Structured3D/scene_02893/Structured3D'

The unzipped Structured3D dataset directory is shown in the attached screenshot.
