
Pose2Room: Understanding 3D Scenes from Human Activities
Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner
In ECCV, 2022.

Project page: https://yinyunie.github.io/pose2room-page/

[Teaser GIFs: input.gif | pred.gif | gt.gif]


Install

Our repository is developed under Ubuntu 20.04.

git clone https://github.com/yinyunie/Pose2Room.git
cd ./Pose2Room
  1. We recommend installing the dependencies with conda:
conda env create -f environment.yml
conda activate p2rnet
  2. Install the PointNet++ utilities (a quick sanity check is sketched after this list):
export CUDA_HOME=/usr/local/cuda-X.X  # replace cuda-X.X with your cuda version.
cd external/pointnet2_ops_lib
pip install .
  3. (Optional) If you would like to synthesize pose and scene data on your own, the VirtualHome platform is required. Please refer to link for installation details.
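After these steps, you can optionally verify the environment with a short check. This is a minimal sketch, assuming the pip install above registers a package named pointnet2_ops; adjust it if your build differs.

# verify_install.py -- illustrative sanity check, not part of the repository
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # should print True for GPU training

# Importing the compiled PointNet++ ops confirms the CUDA extension built correctly.
import pointnet2_ops
print("pointnet2_ops loaded from:", pointnet2_ops.__file__)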

Demo

The pretrained model can be downloaded here. Put script_level.pth under out/p2rnet/train/pretrained_weight/.
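If you prefer to script this step, here is a minimal sketch; it assumes the downloaded weight sits in ~/Downloads, so adjust the source path to wherever you saved the file.

# place_pretrained_weight.py -- illustrative helper, not part of the repository
from pathlib import Path
import shutil

src = Path.home() / "Downloads" / "script_level.pth"   # assumed download location
dst_dir = Path("out/p2rnet/train/pretrained_weight")   # folder expected by the demo config
dst_dir.mkdir(parents=True, exist_ok=True)
shutil.copy(src, dst_dir / "script_level.pth")

With the weight in place, the demo below illustrates how our method works.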

python main.py --config configs/config_files/p2rnet_test.yaml --mode demo

VTK is used to visualize the 3D scene. If everything goes smoothly, a GUI window will pop up and you can interact with the scene.

[Figure: demo.png]


Dataset

We synthesize our dataset using the VirtualHome platform. You can either download and extract the dataset from link to

/home/ynie/Projects/pose2room/datasets/virtualhome/samples/*.hdf5

or synthesize the dataset with our scripts (please follow link).
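To get a feel for the samples, you can list the contents of one .hdf5 file with h5py. This is a minimal sketch; it makes no assumption about the group or dataset names inside the file and simply prints whatever is stored.

# inspect_sample.py -- illustrative, not part of the repository
import glob
import h5py

sample_files = sorted(glob.glob("datasets/virtualhome/samples/*.hdf5"))
if not sample_files:
    raise SystemExit("No samples found; check the dataset path.")

with h5py.File(sample_files[0], "r") as f:
    # Recursively print every group/dataset and, for datasets, the array shape.
    f.visititems(lambda name, obj: print(name, getattr(obj, "shape", "")))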

After obtaining the dataset, you can visualize a GT sample with

python utils/virtualhome/vis_gt_vh.py

If everything is working so far, a GT sample will be visualized as below.

[Figure: verify_dataset.png]


Training and Testing

We use configuration files (see configs/config_files/*.yaml) to fully control the training and testing process. You can check and modify the configurations in the specific files to suit your needs.

Training

Here is an example of training on the sequence-level split.

For training on multiple GPUs, we use distributed data parallel (DDP) and run

python -m torch.distributed.launch --nproc_per_node=4 --use_env --master_port=$((RANDOM + 9000)) main.py --config configs/config_files/p2rnet_train.yaml --mode train

You can also train on a single GPU by

python main.py --config configs/config_files/p2rnet_train.yaml --mode train

If you would like to train on the room-level split, modify the data split in the p2rnet_train.yaml file:

data:
  split: datasets/virtualhome_22_classes/splits/room_level

It will save the network weights to ./out/p2rnet/train/a_folder_with_time_stamp/model_best.pth
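Before moving on to evaluation, you can sanity-check that a checkpoint loads. This is a minimal sketch; the internal layout of the file is whatever the training code saved, so it only lists the top-level keys.

# check_checkpoint.py -- illustrative, not part of the repository
import torch

ckpt_path = "out/p2rnet/train/a_folder_with_time_stamp/model_best.pth"  # replace with your run folder
ckpt = torch.load(ckpt_path, map_location="cpu")  # load on CPU so no GPU is required
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))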

You can monitor the training process using tensorboard --logdir=runs. The training log is saved in ./out/p2rnet/train/a_folder_with_time_stamp/log.txt.

Testing

After training, copy the trained weight path into the configs/config_files/p2rnet_test.yaml file:

weight: ['out/p2rnet/train/a_folder_with_time_stamp/model_best.pth']
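If you prefer not to edit the file by hand, the entry can also be patched with PyYAML. This is a minimal, hypothetical helper; it assumes weight is a top-level key of the test config as shown above, and note that safe_dump rewrites the file without preserving comments.

# set_test_weight.py -- illustrative helper, not part of the repository
import yaml

cfg_path = "configs/config_files/p2rnet_test.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

# Point the test config at your trained checkpoint.
cfg["weight"] = ["out/p2rnet/train/a_folder_with_time_stamp/model_best.pth"]

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f, default_flow_style=False, sort_keys=False)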

Then evaluate it by

python main.py --config configs/config_files/p2rnet_test.yaml --mode test

It will save the evaluation scores and the prediction results to ./out/p2rnet/test/a_folder_with_time_stamp/log.txt and ./out/p2rnet/test/a_folder_with_time_stamp/visualization, respectively.

You can visualize a prediction result by

python ./utils/virtualhome/vis_results.py pred --pred-path out/p2rnet/test/a_folder_with_time_stamp/visualization

If everything goes smoothly, it will output a visualization window as below.

[Figure: verify_pred.png]

(Optional) We also provide virtually scanned VirtualHome scenes in link. You can download and extract them to datasets/virtualhome_22_classes/scenes/* and visualize them with poses and GT boxes by

python ./utils/virtualhome/vis_results.py gt --pred-path out/p2rnet/test/a_folder_with_time_stamp/visualization --vis_scene_geo

Citation

If you find our code and data helpful, please consider citing

@article{nie2021pose2room,
  title={Pose2Room: Understanding 3D Scenes from Human Activities},
  author={Yinyu Nie and Angela Dai and Xiaoguang Han and Matthias Nie{\ss}ner},
  journal={arXiv preprint arXiv:2112.03030},
  year={2021}
}

Acknowledgments

We synthesize our data using the VirtualHome platform. If you find our data helpful, please also cite VirtualHome properly.

License

This repository is released under the MIT License. See the LICENSE file for more details.
