360BEV: Panoramic Semantic Mapping for Indoor Bird’s-Eye View (IEEE/CVF WACV 2024)


[Figure: 360BEV paradigms]

Introduction

In this work, mapping from 360° panoramas to BEV semantics, the 360BEV task, is established for the first time to achieve holistic representations of indoor scenes in a top-down view. Instead of relying on narrow-FoV image sequences, a panoramic image with depth information is sufficient to generate a holistic BEV semantic map. To benchmark 360BEV, we present two indoor datasets, 360BEV-Matterport and 360BEV-Stanford, both of which include egocentric panoramic images and semantic segmentation labels, as well as allocentric semantic maps.
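To make the geometry behind the task concrete, the sketch below projects a single panorama pixel with known depth onto a top-down grid. It is only a geometric illustration under assumed conventions: an equirectangular panorama, the camera at the grid center, +z forward, and hypothetical `cell_size` and `grid_size` defaults. It is not the 360Mapper model, and the function name `pano_pixel_to_bev_cell` is made up for this example.

```python
import math


def pano_pixel_to_bev_cell(u, v, depth, width, height,
                           cell_size=0.02, grid_size=500):
    """Map one equirectangular pixel with metric depth to a top-down grid cell.

    Assumed conventions (not taken from the 360BEV code): the camera sits at
    the grid center, +z is forward (image center column), +x is right and
    +y is up. Returns (row, col), or None if the point is outside the grid.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi   # longitude in [-pi, pi)
    lat = (0.5 - v / height) * math.pi        # latitude in [-pi/2, pi/2]

    # 3D point in the camera frame; the height (y) is dropped for the BEV map.
    x = depth * math.cos(lat) * math.sin(lon)
    z = depth * math.cos(lat) * math.cos(lon)

    col = math.floor(x / cell_size + grid_size / 2)
    row = math.floor(grid_size / 2 - z / cell_size)
    if 0 <= row < grid_size and 0 <= col < grid_size:
        return row, col
    return None
```

Calling this for every pixel and writing the pixel's semantic label into the returned cell would give a naive BEV map; the function returns `None` for points that fall outside the grid.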

For more details, please check our paper.

🔥 Update

  • 05/2023: released datasets.
  • 04/2023: released code and models.
  • 03/2023: initialized repository.

360BEV datasets

Prepare datasets:

Our extended datasets:

  • 360BEV-Stanford (214MB, ~3GB after extraction; download from Google Drive)
  • 360BEV-Matterport (2GB, ~23GB after extraction; download from Google Drive)
  • 360FV-Matterport (50GB, ~51GB after extraction; download from Google Drive)

Data statistics of 360BEV datasets:

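| Dataset | Split | Scene | Room | Frame | Category |
|---|---|---|---|---|---|
| 360BEV-Stanford | train | 5 | 215 | 1,040 | 13 |
| 360BEV-Stanford | val | 1 | 55 | 373 | 13 |
| 360BEV-Stanford | total | 6 | 270 | 1,413 | 13 |
| 360BEV-Matterport | train | 61 | -- | 7,829 | 20 |
| 360BEV-Matterport | val | 7 | -- | 772 | 20 |
| 360BEV-Matterport | test | 18 | -- | 2,014 | 20 |
| 360BEV-Matterport | total | 86 | 2,030 | 10,615 | 20 |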

Dataset structure:

data/
├── Stanford2D3D
│   └── area_[1|2|3|4|5a|5b|6]
│       ├── rgb/*png
│       └── semantic/*png
│
├── 360BEV-Stanford
│   ├── training
│   └── valid
│       ├── data_base_with_rotationz_realdepth/*h5
│       └── ground_truth/*h5
│
├── 360BEV-Matterport
│   ├── training
│   ├── testing
│   └── valid
│       ├── smnet_training_data_zteng/*h5
│       └── topdown_gt_real_height/*h5
│
└── 360FV-Matterport
    ├── 17DRP5sb8fy
    │   ├── depth/*png
    │   ├── rgb/*png
    │   └── semantic/*png   
    └── ...
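
Before training, it can help to confirm that the extracted archives match this layout. The following is a minimal sketch that counts the `*.h5` files in the sub-directories named in the tree. The helper name `count_h5_files` and the `EXPECTED_H5_DIRS` table are introduced here for illustration, and the sketch assumes every split folder contains the same two sub-directories that the tree shows under `valid`.

```python
from pathlib import Path

# Sub-directories named in the dataset tree; each is expected to hold *.h5 files.
# Assumption: all splits share the sub-directories shown under `valid`.
EXPECTED_H5_DIRS = {
    "360BEV-Stanford": ["data_base_with_rotationz_realdepth", "ground_truth"],
    "360BEV-Matterport": ["smnet_training_data_zteng", "topdown_gt_real_height"],
}


def count_h5_files(data_root, dataset, split):
    """Return {subdir: number of *.h5 files} for one dataset split.

    A missing directory is reported as None instead of raising an error.
    """
    counts = {}
    for subdir in EXPECTED_H5_DIRS[dataset]:
        folder = Path(data_root) / dataset / split / subdir
        counts[subdir] = len(list(folder.glob("*.h5"))) if folder.is_dir() else None
    return counts
```

For example, `count_h5_files("data", "360BEV-Matterport", "valid")` returns a dictionary with one entry per sub-directory; a `None` value indicates a folder that is missing or named differently.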

360Mapper model

[Figure: 360BEV model]

Results and weights

360FV Stanford-2D3D

| Model | Backbone | Input | mIoU | weights |
|---|---|---|---|---|
| Trans4PASS | MiT-B2 | RGB | 52.1 | |
| CBFC | ResNet-101 | RGB | 52.2 | |
| Ours | MiT-B2 | RGB | 54.3 | B2 |

360FV-Matterport

| Model | Backbone | Input | mIoU | weights |
|---|---|---|---|---|
| HoHoNet | ResNet-101 | RGB-D | 44.85 | |
| SegFormer | MiT-B2 | RGB | 45.53 | |
| Ours | MiT-B2 | RGB | 46.35 | B2 |

360BEV-Stanford

| Method | Backbone | Acc | mRecall | mPrecision | mIoU | weights |
|---|---|---|---|---|---|---|
| Trans4Map | MiT-B0 | 86.41 | 40.45 | 57.47 | 32.26 | |
| Trans4Map | MiT-B2 | 86.53 | 45.28 | 62.61 | 36.08 | |
| Ours | MiT-B0 | 92.07 | 50.14 | 65.37 | 42.42 | B0 |
| Ours | MiT-B2 | 92.80 | 53.56 | 67.72 | 45.78 | B2 |
| Ours | MSCA-B | 92.67 | 55.02 | 68.02 | 46.44 | MSCA-B |

360BEV-Matterport

| Method | Backbone | Acc | mRecall | mPrecision | mIoU | weights |
|---|---|---|---|---|---|---|
| Trans4Map | MiT-B0 | 70.19 | 44.31 | 50.39 | 31.92 | |
| Trans4Map | MiT-B2 | 73.28 | 51.60 | 53.02 | 36.72 | |
| Ours | MiT-B0 | 75.44 | 48.80 | 56.01 | 36.98 | B0 |
| Ours | MiT-B2 | 78.80 | 59.54 | 59.97 | 44.32 | B2 |
| Ours | MSCA-B | 78.93 | 60.51 | 62.83 | 46.31 | MSCA-B |
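
For readers unfamiliar with the four metrics, the sketch below computes them from a class confusion matrix using their standard definitions. It is an illustration, not the evaluation code of this repository: the function name `segmentation_metrics` is hypothetical, and skipping classes with an empty denominator is an assumption that may differ from the protocol used for the tables (which report percentages, i.e. these values times 100).

```python
def segmentation_metrics(confusion):
    """Compute Acc, mRecall, mPrecision and mIoU from a confusion matrix.

    confusion[i][j] is the number of pixels (or BEV cells) whose ground-truth
    class is i and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = [confusion[i][i] for i in range(n)]                        # correct per class
    gt = [sum(confusion[i]) for i in range(n)]                      # row sums
    pred = [sum(confusion[i][j] for i in range(n)) for j in range(n)]  # column sums
    union = [gt[i] + pred[i] - tp[i] for i in range(n)]

    def mean_ratio(numerators, denominators):
        # Assumption: classes with an empty denominator are left out of the mean.
        ratios = [a / b for a, b in zip(numerators, denominators) if b > 0]
        return sum(ratios) / len(ratios) if ratios else 0.0

    return {
        "Acc": sum(tp) / total if total else 0.0,
        "mRecall": mean_ratio(tp, gt),
        "mPrecision": mean_ratio(tp, pred),
        "mIoU": mean_ratio(tp, union),
    }
```

Because IoU penalizes both missed and spurious predictions, mIoU is never higher than mRecall or mPrecision computed over the same classes, which matches the ordering visible in every row of the tables.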

Installation

To create the conda environment:
    conda create -n 360BEV python=3.8
    conda activate 360BEV
    cd /path/to/360BEV
    pip install -r requirements.txt

For the model to run successfully, mmdetection must also be installed.
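
A quick way to confirm the environment is complete is to check that the key packages can be imported. This is a small sketch, not part of the repository: `missing_modules` is a hypothetical helper, `mmdet` is the import name of mmdetection, and `torch` is listed because mmdetection depends on PyTorch (the exact versions required are not stated here, so check `requirements.txt`).

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed minimal set; extend it with the entries of requirements.txt as needed.
REQUIRED = ["torch", "mmdet"]

if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_modules(REQUIRED)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```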

Train

For example, to run the experiments on four 2080Ti GPUs:

# 360BEV_Matterport
python train_360BEV_Matterport.py --config configs/model_360BEV_mp3d.yml

# 360BEV_S2d3d
python train_360BEV_S2d3d.py --config configs/model_360BEV_s2d3d.yml

# Stanford2D3D
python train_pano_360Attention_S2d3d.py --config configs/model_fv_s2d3d.yml

# 360FV-Matterport
python train_pano_360Attention_Matterport.py --config configs/model_fv_mp3d.yml

Test

# 360BEV_Matterport
python test_360BEV_Matterport.py --config configs/model_360BEV_mp3d.yml

# 360BEV_S2d3d
python test_360BEV_S2d3d.py --config configs/model_360BEV_s2d3d.yml

# Stanford2D3D
python test_pano_360Attention_S2d3d.py --config configs/model_fv_s2d3d.yml

# 360FV-Matterport
python test_pano_360Attention_Matterport.py --config configs/model_fv_mp3d.yml
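
The train and test commands above follow one naming pattern, so they can be wrapped in a small launcher. The sketch below is a convenience wrapper under stated assumptions: the script and config names are copied from the commands above, while the dictionary keys, `build_command`, and `run` are hypothetical names. `CUDA_VISIBLE_DEVICES` is the standard CUDA environment variable for restricting visible GPUs; whether the scripts then use all visible GPUs depends on the repository's configs.

```python
import os
import subprocess
import sys

# Script suffix and config path for each benchmark, as listed in Train and Test.
EXPERIMENTS = {
    "360BEV-Matterport": ("360BEV_Matterport", "configs/model_360BEV_mp3d.yml"),
    "360BEV-Stanford": ("360BEV_S2d3d", "configs/model_360BEV_s2d3d.yml"),
    "Stanford2D3D": ("pano_360Attention_S2d3d", "configs/model_fv_s2d3d.yml"),
    "360FV-Matterport": ("pano_360Attention_Matterport", "configs/model_fv_mp3d.yml"),
}


def build_command(benchmark, mode):
    """Return the argument list for mode 'train' or 'test' on one benchmark."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    suffix, config = EXPERIMENTS[benchmark]
    return [sys.executable, f"{mode}_{suffix}.py", "--config", config]


def run(benchmark, mode, gpu_ids="0,1,2,3"):
    """Run one experiment from the repository root with only `gpu_ids` visible."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu_ids)
    return subprocess.run(build_command(benchmark, mode), env=env, check=True)
```

For instance, `run("360BEV-Stanford", "train")` would launch `train_360BEV_S2d3d.py` with GPUs 0 to 3 visible, mirroring the four-GPU setup mentioned in the Train section.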

References

We appreciate the previous open-source works.

License

This repository is released under the Apache-2.0 license. For commercial use, please contact the authors.

Citation

If you find this work useful, please cite the following paper:

@inproceedings{teng2024_360bev,
  title={360BEV: Panoramic Semantic Mapping for Indoor Bird's-Eye View}, 
  author={Teng, Zhifeng and Zhang, Jiaming and Yang, Kailun and Peng, Kunyu and Shi, Hao and Reiß, Simon and Cao, Ke and Stiefelhagen, Rainer},
  booktitle={2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year={2024}
}

