The 360bev from jackzhousz

360BEV: Panoramic Semantic Mapping for Indoor Bird’s-Eye View (IEEE/CVF WACV 2024)

Introduction

In this work, mapping from 360° panoramas to BEV semantics, the 360BEV task, is established for the first time to achieve holistic representations of indoor scenes in a top-down view. Instead of relying on narrow-FoV image sequences, a panoramic image with depth information is sufficient to generate a holistic BEV semantic map. To benchmark 360BEV, we present two indoor datasets, 360BEV-Matterport and 360BEV-Stanford, both of which include egocentric panoramic images and semantic segmentation labels, as well as allocentric semantic maps.

For more details, please check our paper.

🔥 Update

05/2023, release datasets.
04/2023, release code and models.
03/2023, init repository.

360BEV datasets

Prepare datasets:

Our extended datasets:

360BEV-Stanford (214MB, ~3GB after extracted. Download from Google Drive)
360BEV-Matterport (2GB, ~23GB after extracted. Download from Google Drive)
360FV-Matterport (50GB, ~51GB after extracted. Download from Google Drive)

Data statistics of 360BEV datasets:

Dataset	Scene	Room	Frame	Category
train	5	215	1,040	13
val	1	55	373	13
360BEV-Stanford	6	270	1,413	13
train	61	--	7,829	20
val	7	--	772	20
test	18	--	2,014	20
360BEV-Matterport	86	2,030	10,615	20

Dataset structure:

data/
├── Stanford2D3D
│   └── area_[1|2|3|4|5a|5b|6]
│       ├── rgb/*png
│       └── semantic/*png
│
├── 360BEV-Stanford
│   ├── training
│   └── valid
│       ├── data_base_with_rotationz_realdepth/*h5
│       └── ground_truth/*h5
│
├── 360BEV-Matterport
│   ├── training
│   ├── testing
│   └── valid
│       ├── smnet_training_data_zteng/*h5
│       └── topdown_gt_real_height/*h5
│
└── 360FV-Matterport
    ├── 17DRP5sb8fy
    │   ├── depth/*png
    │   ├── rgb/*png
    │   └── semantic/*png   
    └── ...

360Mapper model

Results and weights

360FV Stanford-2D3D

Model	Backbone	Input	mIoU	weights
Trans4PASS	MiT-B2	RGB	52.1
CBFC	ResNet-101	RGB	52.2
Ours	MiT-B2	RGB	54.3	B2

360FV-Matterport

Model	Backbone	Input	mIoU	weights
HoHoNet	ResNet-101	RGB-D	44.85
SegFormer	MiT-B2	RGB	45.53
Ours	MiT-B2	RGB	46.35	B2

360BEV-Stanford

Method	Backbone	Acc	mRecall	mPrecision	mIoU	weights
Trans4Map	MiT-B0	86.41	40.45	57.47	32.26
Trans4Map	MiT-B2	86.53	45.28	62.61	36.08
Ours	MiT-B0	92.07	50.14	65.37	42.42	B0
Ours	MiT-B2	92.80	53.56	67.72	45.78	B2
Ours	MSCA-B	92.67	55.02	68.02	46.44	MSCA-B

360BEV-Matterport

Method	Backbone	Acc	mRecall	mPrecision	mIoU	weights
Trans4Map	MiT-B0	70.19	44.31	50.39	31.92
Trans4Map	MiT-B2	73.28	51.60	53.02	36.72
Ours	MiT-B0	75.44	48.80	56.01	36.98	B0
Ours	MiT-B2	78.80	59.54	59.97	44.32	B2
Ours	MSCA-B	78.93	60.51	62.83	46.31	MSCA-B

Installation

#### To create conda env:
    conda create -n 360BEV python=3.8
    conda activate 360BEV
    cd /path/to/360BEV
    pip install -r requirements.txt

To make the model run successful, we need to install mmdetection.

Train

For example, use 4 2080Ti GPUs to run the experiments:

# 360BEV_Matterport
python train_360BEV_Matterport.py --config configs/model_360BEV_mp3d.yml

# 360BEV_S2d3d
python train_360BEV_S2d3d.py --config configs/model_360BEV_s2d3d.yml

# Stanford2D3D
python train_pano_360Attention_S2d3d.py --config configs/model_fv_s2d3d.yml

# 360FV-Matterport
python train_pano_360Attention_Matterport.py --config configs/model_fv_mp3d.yml

Test

# 360BEV_Matterport
python test_360BEV_Matterport.py --config configs/model_360BEV_mp3d.yml

# 360BEV_S2d3d
python test_360BEV_S2d3d.py --config configs/model_360BEV_s2d3d.yml

# Stanford2D3D
python test_pano_360Attention_S2d3d.py --config configs/model_fv_s2d3d.yml

# 360FV-Matterport
python test_pano_360Attention_Matterport.py --config configs/model_fv_mp3d.yml

References

We appreciate the previous open-source works.

License

This repository is under the Apache-2.0 license. For commercial use, please contact with the authors.

Citation

If you are interested in this work, please cite the following work:

@inproceedings{teng2024_360bev,
  title={360BEV: Panoramic Semantic Mapping for Indoor Bird's-Eye View}, 
  author={Teng, Zhifeng and Zhang, Jiaming and Yang, Kailun and Peng, Kunyu and Shi, Hao and Reiß, Simon and Cao, Ke and Stiefelhagen, Rainer},
  booktitle={2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year={2024}
}

jackzhousz / 360bev Goto Github PK

360bev's Introduction