GithubHelp home page GithubHelp logo

zhouhaonanshens / robo3d-a_demo Goto Github PK

View Code? Open in Web Editor NEW
0.0 1.0 0.0 10.97 MB

This repository is some demo code from the Robo3D project, which only includes some code of the Central Host, while the robot's code is not included.

License: MIT License

Jupyter Notebook 51.43% Python 36.00% C++ 0.28% Cuda 11.05% C 0.74% Shell 0.51%

robo3d-a_demo's Introduction

Robo3D-Central Host Part

This repository is some demo code from the Robo3D project, which only includes some code of the Central Host, while the robot's code is not included.

Tested on Ubuntu-20.04---Nvidia RTX2080---cuda toolkit 11.3

Dependencies (click to expand)

Dependencies

  • PyTorch 1.4
  • matplotlib
  • numpy
  • imageio
  • imageio-ffmpeg
  • configargparse

Project:

This project explores the exciting field of 3D scene reconstruction from 2D images, with a focus on active mapping and planning using implicit representations. Leveraging the power of Neural Radiance Fields (NeRF), we've developed an online 3D reconstruction and planning system for active vision tasks.

NeRF has shown tremendous potential in photorealistic rendering and offline 3D reconstruction. Recently, it's also been used for online reconstruction and localization, forming the backbone of implicit SLAM systems. However, its use in active vision tasks, particularly active mapping and planning, is largely unexplored.

In this project, we introduce a novel RGB-D active vision framework that utilizes the radiance field representation for active 3D reconstruction and planning. This framework is implemented as an iterative dual-stage optimization problem, with alternating optimization for the radiance field representation and path planning.

We extend the existing work in several significant ways:

  • Depth Supervision: We incorporate depth supervision into our model to further refine the 3D reconstruction process. This addition helps the model generate more accurate and detailed 3D structures.

  • Active Policy: We introduce more active policy to our framework to guide the path planning process. The policy is designed to optimize for effective exploration and data collection in the 3D environment.

  • Robot's Reachable Space Model: We incorporate a model of the robot's reachable space into our system. This model helps in path planning by taking into account the robot's physical constraints when exploring the environment.

Our experimental results indicate that our proposed method performs competitively with existing offline methods and even surpasses active reconstruction methods using NeRFs. We believe our project will contribute to the ongoing research in active 3D reconstruction and planning and open up new possibilities for real-world applications.

For one sentence summarization, our training and reconstructing process only spends less than 10 minutes, while the original version of NeRF spends hours or even tens of hours.

File Structure

├── nerf
│   ├── gui.py
│   ├── clip_utils.py
│   ├── network.py      # NeRFNetwork class
│   ├── provider.py     # Dataset provider
│   ├── renderer.py     # NeRFRenderer class for rendering the rays
│   ├── utils.py        # Main body, including Trainers and test Meters
├── scripts
│   ├── colmap2nerf.py  # Convert COLMAP poses to NeRF poses
│   ├── run_COLMAP.sh   # Script to run COLMAP
│   ├── run_dInstant.sh # Script to start training or testing
├── environment.yml
├── hz_nerf.py          # Entry script. 
├── requirements.txt
└── LICENSE

Installation

Clone the repository

git clone --recursive https://github.com/ZhouHaonanShens/Robo3D-A_Demo.git
cd Robo3D-A_Demo

Install with pip

pip install -r requirements.txt

Install with conda

conda env create -f environment.yml
conda activate Robo3D

Usage

Run COLMAP for images only data to get pose

python scripts/colmap2nerf.py --images path/to/images --hold 0  --run_colmap
# or
./scripts/run_COLMAP.sh # Please edit parameters in the script

Run with only 2D images dataset

python hz_nerf.py ${data_path} --workspace ${workspace_path}  --HZ --bound 1.0 --scale 0.25

Run with 2D images and depths (KL Divergence)

python hz_nerf.py ${data_path} --workspace ${workspace_path}  --HZ --bound 1.0 --scale 0.25 --depth_supervise

Run with 2D images and depths (L2 Loss)

python hz_nerf.py ${data_path} --workspace ${workspace_path}  --HZ --bound 1.0 --scale 0.25 --depth_supervise_E2

Parameters for hz_train.py script

Parameter Type Description Default Value
@path str/path The path to data N/A
@workspace str/path The path to workspace for logs and results 'workspace'
@ckpt str/path The path to the checkpoint 'latest'
@bound float Assume the scene is bounded in box[-bound, bound]^3, if > 1, will invoke adaptive ray marching 2
@scale float Scale camera location into box[-bound, bound]^3 0.33
@max_ray_batch int Batch size of rays at inference to avoid OOM, only valid when NOT using '--cuda_ray' 4096
@cuda_ray store true Use CUDA raymarching instead of pytorch if true False
@depth_supervise store true Introduce depth supervision via KL divergence False
@test_entropy store true Enable entropy test only mode False
@hz_train store true Enable the active selecting policy False

robo3d-a_demo's People

Contributors

jasonchow97 avatar zhouhaonanshens avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.