
3DRegNet: A Deep Neural Network for 3D Point Registration

G. Dias Pais1, Pedro Miraldo1,2, Srikumar Ramalingam3,
Jacinto C. Nascimento1, Venu Madhav Govindu4, and Rama Chellappa5

1Instituto Superior Técnico, Lisboa 2KTH Royal Institute of Technology 3University of Utah
4Indian Institute of Science, Bengaluru 5University of Maryland, College Park
E-Mail: [email protected]


This project provides the open-source code for 3DRegNet, a deep learning algorithm for the registration of 3D scans. Given a set of 3D point correspondences, we build a deep neural network using deep residual layers and convolutional layers to achieve two tasks: (1) classification of the point correspondences into correct/incorrect ones, and (2) regression of the motion parameters that align the scans into a common reference frame.
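For a rough illustration of what the regressed motion parameters represent (a minimal NumPy sketch written for this README, not code from the repository; all variable names are hypothetical), the six outputs can be converted into a rotation and translation via Rodrigues' formula and applied to align a scan:

import numpy as np

def axis_angle_to_rotation(v):
    # Rodrigues' formula: axis-angle 3-vector -> 3x3 rotation matrix.
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return np.eye(3)
    k = v / theta                          # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])     # cross-product (skew) matrix
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

# Hypothetical 6-vector output: 3 axis-angle + 3 translation parameters.
params = np.array([0.1, 0.0, 0.2, 0.5, -0.3, 1.0])
R, t = axis_angle_to_rotation(params[:3]), params[3:]

src = np.random.rand(1000, 3)              # Nx3 source scan (random stand-in)
aligned = src @ R.T + t                    # p' = R p + t for every point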

If you use this open-source code, please cite:

@article{pais19,
  title={3DRegNet: A Deep Neural Network for 3D Point Registration},
  author={G. Dias Pais and Pedro Miraldo and Srikumar Ramalingam and
            Jacinto C. Nascimento and Venu Madhav Govindu and Rama Chellappa},
  journal={arXiv:1904.01701},
  year={2019}
}

Method

We build a deep neural network using deep residual layers and convolutional layers to achieve the two tasks described above: inlier/outlier classification and motion regression. The network architecture is shown below.

Classification Block: The input to our network is a set of 3D point correspondences. Each point correspondence (a 6-tuple) is processed by a fully connected layer with 128 ReLU activation units. There is weight sharing across the N individual point correspondences, and the output is of dimension Nx128, i.e., we generate a 128-dimensional feature from every point correspondence. The Nx128 output is then passed through 12 deep ResNet blocks, with weight-shared fully connected layers instead of convolutional layers. At the end, we use another fully connected layer, with ReLU followed by tanh units, to produce the weights.
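For concreteness, here is a minimal TensorFlow sketch of this block, written from the description above rather than taken from the repository's code (the toy batch size and N below are arbitrary):

import tensorflow as tf

def resnet_block(x, units=128):
    # ResNet block built from weight-shared fully connected layers.
    y = tf.keras.layers.Dense(units, activation="relu")(x)
    y = tf.keras.layers.Dense(units)(y)
    return tf.nn.relu(x + y)

def classification_block(corr):
    # corr: (batch, N, 6) tensor, one 6-tuple per point correspondence.
    # Dense acts on the last axis, so weights are shared across the N points.
    x = tf.keras.layers.Dense(128, activation="relu")(corr)  # (batch, N, 128)
    features = [x]                     # stage 0: before the first ResNet block
    for _ in range(12):
        x = resnet_block(x)
        features.append(x)             # 13 feature maps in total
    # Per-correspondence weights: FC, then ReLU followed by tanh -> [0, 1).
    w = tf.tanh(tf.nn.relu(tf.keras.layers.Dense(1)(x)))
    return w, features

# Toy usage in eager mode:
corr = tf.random.uniform((2, 512, 6))  # batch of 2, N = 512 correspondences
weights, features = classification_block(corr)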

Registration Block: The input to this block is the set of features extracted from the point correspondences. We use pooling to extract meaningful features of dimension 128x1 from each layer of the classification block. We extract features at 13 stages of the classification, i.e., the first is extracted before the first ResNet block and the last after the last (12th) ResNet block. After pooling, we apply context normalization and concatenate the 13 feature maps (size 13x128), which are then passed to a convolutional layer with 8 channels. The output of the convolution is then fed into two fully connected layers with 256 filters each, with a ReLU between them, which generate the six transformation parameters; the rotation is parameterized by the axis-angle representation (other rotation representations are available in config.py).
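A matching sketch of the registration block, under the same caveats; the pooling operator (max), the convolution kernel size, and the exact placement of context normalization are assumptions, since the description above only fixes the channel counts:

def context_norm(x, eps=1e-6):
    # Normalize each feature channel to zero mean / unit variance.
    mean, var = tf.nn.moments(x, axes=1, keepdims=True)
    return (x - mean) / tf.sqrt(var + eps)

def registration_block(features):
    # features: the 13 (batch, N, 128) maps from the classification block.
    pooled = [tf.reduce_max(f, axis=1) for f in features]  # pool over N
    x = context_norm(tf.stack(pooled, axis=1))             # (batch, 13, 128)
    x = tf.keras.layers.Conv1D(8, 3)(x)                    # 8-channel conv
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)   # FC-ReLU-FC
    x = tf.keras.layers.Dense(256)(x)
    # Final projection to the 6 transformation parameters:
    # 3 axis-angle rotation parameters and 3 translations.
    return tf.keras.layers.Dense(6)(x)

# Continuing the toy example above:
params = registration_block(features)  # (batch, 6)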

Configuration

The project requires pip and CUDA 9 or 10. To install the required Python libraries, run the command below:

pip install -r requirements.txt

Train

The network hyperparameters and other configurations can be changed in config.py. To train the network, run the following commands:

mkdir logs

python3 main.py --log_dir=[NAME] 

The weights will be saved in the log directory and can be later used for testing.

The SUN3D dataset ready for training, i.e., the 3D correspondences given by FPFH, and the weights of a network pre-trained on this dataset are available for download via this link.

To train and test the network with this dataset, unzip the sun3d.zip file into the data folder so that the folder tree is:

data/sun3d/(test, train)

Test

To test the trained network, use the code below:

python3 main.py --log_dir=[NAME] --run_mode=test

where [NAME] is the log directory containing the trained weights you wish to test.

Some Results

Examples of the registration of 30 scans using 3DRegNet, FGR, and RANSAC. Instead of considering only the alignment of a pair of 3D scans, we aim at aligning 30 3D scans. We use three scenes from the SUN3D dataset, the MIT, Brown, and Harvard sequences (no additional datasets were used, and no additional parameter tuning was done). These sequences were not used in training.
