ethz-asl / hierarchical_loc

Deep image retrieval for efficient 6-DoF localization

License: BSD 3-Clause "New" or "Revised" License

Python 82.49% Makefile 0.05% Shell 0.18% CMake 0.77% C++ 16.51%
localization learning descriptors mobilenetvlad

Hierarchical Localization

⚠️ ⚠️ For a clean and research-friendly implementation of Hierarchical Localization, please refer to our CVPR 2019 paper at ethz-asl/hfnet. ⚠️ ⚠️

This repository contains the training and deployment code used in our paper Leveraging Deep Visual Descriptors for Hierarchical Efficient Localization presented at CoRL 2018. This work introduces MobileNetVLAD, a mobile-friendly image retrieval deep neural network that significantly improves the performance of classical 6-DoF visual localization through a hierarchical search.


The approach is described in detail in our video.

We introduce here two main features:

  • The deployment code of MobileNetVLAD: global-loc, a C++ ROS/Catkin package that can
    • load any trained image retrieval model,
    • efficiently perform the inference on GPU or CPU,
    • index a given map and save it as a protobuf,
    • and retrieve keyframes given a query image;
  • The training code: retrievalnet, a modular Python+TensorFlow package that lets you
    • train the model on any target image domain,
    • using the supervision of any existing teacher network.

The modularity of our system makes it possible to train a model and index a map on a powerful workstation while performing the retrieval on a mobile platform. Our code has thus been extensively tested on an NVIDIA Jetson TX2, which is widely used for robotics research.
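The retrieval step listed above (retrieving keyframes given a query image) can be thought of as a nearest-neighbor search over global descriptors. The following is a minimal Python sketch of that idea, not the global-loc implementation: the function names `l2_normalize` and `retrieve_keyframes`, the use of Euclidean distance on L2-normalized descriptors, and the toy 3-dimensional descriptors are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each descriptor (last axis) to unit L2 norm."""
    x = np.asarray(x, dtype=np.float64)
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def retrieve_keyframes(query_desc, index_descs, num_neighbors=3):
    """Return indices and distances of the closest indexed keyframes.

    query_desc:  (D,) global descriptor of the query image.
    index_descs: (N, D) global descriptors of the map keyframes.
    Both names are hypothetical; they do not come from global-loc.
    """
    query = l2_normalize(query_desc)
    index = l2_normalize(index_descs)
    dists = np.linalg.norm(index - query, axis=1)
    order = np.argsort(dists)[:num_neighbors]
    return order, dists[order]


# Toy index of three keyframes with 3-D descriptors (real ones are much larger).
index = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]])
query = np.array([0.9, 0.1, 0.0])
ids, dists = retrieve_keyframes(query, index, num_neighbors=2)
print(ids, dists)
```

A brute-force search like this is only meant to show the data flow; an actual deployment would likely rely on a faster index structure.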


Retrieval on our Zurich dataset: strong illumination and viewpoint changes.

Deployment

The package relies on map primitives provided by maplab, but can be easily adapted to other SLAM frameworks. We thus do not release the code performing the local matching. The trained MobileNetVLAD is provided in global-loc/models/ and is loaded using tensorflow_catkin.

Installation

Both Ubuntu 14.04 and 16.04 are supported. First install the system packages required by maplab.

Then set up the Catkin workspace:

export ROS_VERSION=kinetic #(Ubuntu 16.04: kinetic, Ubuntu 14.04: indigo)
export CATKIN_WS=~/maplab_ws
mkdir -p $CATKIN_WS/src
cd $CATKIN_WS
catkin init
catkin config --merge-devel # Necessary for catkin_tools >= 0.4.
catkin config --extend /opt/ros/$ROS_VERSION
catkin config --cmake-args \
	-DCMAKE_BUILD_TYPE=Release \
	-DENABLE_TIMING=1 \
	-DENABLE_STATISTICS=1 \
	-DCMAKE_CXX_FLAGS="-fext-numeric-literals -msse3 -msse4.1 -msse4.2" \
	-DCMAKE_CXX_STANDARD=14
cd src

If you want to perform the inference on GPU (see the requirements of tensorflow_catkin), add:

catkin config --append-args --cmake-args -DUSE_GPU=ON

Finally, clone the repository and build:

git clone https://github.com/ethz-asl/hierarchical_loc.git --recursive
touch hierarchical_loc/catkin_dependencies/maplab_dependencies/3rd_party/eigen_catkin/CATKIN_IGNORE
touch hierarchical_loc/catkin_dependencies/maplab_dependencies/3rd_party/protobuf_catkin/CATKIN_IGNORE
cd $CATKIN_WS && catkin build global_loc

Run the test examples:

./devel/lib/global_loc/test_inference
./devel/lib/global_loc/test_query_index

Indexing

Given a VI map in global-loc/maps/, an index of global descriptors can be created in global-loc/data/:

./devel/lib/global_loc/build_index \
	--map_name <map_name> \
	--model_name mobilenetvlad_depth-0.35 \
	--proto_name <index_name.pb>

As an example, we provide the Zurich map used in our paper. Several indexing options are available in place-retrieval.cc, such as subsampling or mission selection.

Retrieval

An example query is provided in test_query_index.cc. Descriptor indexes for the Zurich dataset are included in global-loc/data/ and can be used to time the queries:

./devel/lib/global_loc/time_query \
	--map_name <map_name> \
	--model_name mobilenetvlad_depth-0.35 \
	--proto_name lindenhof_afternoon_aligned_mobilenet-d0.35.pb \
	--query_mission f6837cac0168580aa8a66be7bbb20805 \
	--use_pca --pca_dims 512 --max_num_queries 100

Use the same indexes to evaluate and visualize the retrieval: install retrievalnet, generate the Python protobuf interface, and refer to tango_evaluation.ipynb and tango_visualize_retrieval.ipynb.
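The time_query command above passes `--use_pca --pca_dims 512`, which reduces the descriptor dimension before matching. As a rough illustration of what such a reduction involves, here is a minimal NumPy sketch of PCA projection. It is not the code in global-loc: the functions `fit_pca` and `apply_pca` are hypothetical, re-normalizing after projection is an assumption, and the toy data uses 16 dimensions reduced to 4 instead of 512.

```python
import numpy as np


def fit_pca(descriptors, num_dims):
    """Fit a PCA projection on (N, D) descriptors; returns (mean, components)."""
    mean = descriptors.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(descriptors - mean, full_matrices=False)
    return mean, vt[:num_dims]


def apply_pca(descriptors, mean, components):
    """Project descriptors onto the components, then re-normalize (assumption)."""
    projected = (descriptors - mean) @ components.T
    norms = np.linalg.norm(projected, axis=-1, keepdims=True)
    return projected / np.maximum(norms, 1e-12)


# Random stand-in descriptors: 200 images, 16 dimensions, reduced to 4.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 16))
mean, components = fit_pca(train, num_dims=4)
reduced = apply_pca(train, mean, components)
print(reduced.shape)
```

The same `mean` and `components` must be applied to both the index and the query descriptors so that distances remain comparable.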

Training

We use distillation to compress the original NetVLAD model into a smaller MobileNetVLAD with mobile real-time inference capability.
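To make the idea of distillation concrete: the student is trained so that its global descriptors match those exported from the teacher. The sketch below is an illustration in plain NumPy, not the retrievalnet training code. The loss form (average squared L2 distance), the toy linear "student", the learning rate, and all variable names are assumptions chosen to keep the example small.

```python
import numpy as np


def distillation_loss(student_descs, teacher_descs):
    """Average squared L2 distance between student and teacher descriptors."""
    student = np.asarray(student_descs, dtype=np.float64)
    teacher = np.asarray(teacher_descs, dtype=np.float64)
    return float(np.mean(np.sum((student - teacher) ** 2, axis=-1)))


rng = np.random.default_rng(0)
features = rng.normal(size=(64, 8))  # stand-in for student encoder features
teacher = rng.normal(size=(64, 4))   # stand-in for exported teacher descriptors
weights = np.zeros((8, 4))           # toy linear student (hypothetical)

initial_loss = distillation_loss(features @ weights, teacher)
for _ in range(100):
    # Gradient of the loss with respect to the linear student weights.
    residual = features @ weights - teacher
    grad = 2.0 * features.T @ residual / len(features)
    weights -= 0.05 * grad
final_loss = distillation_loss(features @ weights, teacher)
print(initial_loss, final_loss)
```

No labels appear anywhere in this loop: the teacher descriptors are the only supervision, which is why the student can be trained on any image domain for which teacher descriptors can be exported.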

Installation

Python 3.5 is required. It is advised to run the following installation commands within a virtual environment. You will be prompted to provide the path to a data folder (subsequently referred to as $DATA_PATH) containing the datasets and pre-trained models, and to an experiment folder ($EXPER_PATH) containing the trained models, training logs, and exported descriptors for evaluation.

cd retrievalnet && make install

Exporting the target descriptors

If you wish to train MobileNetVLAD on the Google Landmarks dataset as done in our paper, you first need to download the index of images and then download the dataset itself with download_google_landmarks.py. The weights of the original NetVLAD model are provided by netvlad_tf_open and should be extracted in $DATA_PATH/weights/.

Finally export the descriptors of Google Landmarks:

python export_descriptors.py config/netvlad_export_distill.yaml google_landmarks/descriptors --as_dataset

Training MobileNetVLAD

Extract the MobileNet encoder pre-trained on ImageNet in $DATA_PATH/weights/ and run:

python train.py config/mobilenetvlad_train_distill.yaml mobilenetvlad

The training can be interrupted at any time using Ctrl+C and can be monitored with TensorBoard summaries saved in $EXPER_PATH/mobilenetvlad/. The weights are also saved there.

Exporting the model for deployment

python export_model.py config/mobilenetvlad_train_distill.yaml mobilenetvlad

will export the model in $EXPER_PATH/saved_models/mobilenetvlad/.

Evaluating on the NCLT dataset

Download the NCLT sequences in $DATA_PATH/nclt/ along with the corresponding pose files (generated with nclt_generate_poses.ipynb). Export the NCLT descriptors, e.g. for MobileNetVLAD:

python export_descriptors.py configs/mobilenetvlad_export_nclt.yaml mobilenetvlad

These can be used to evaluate and visualize the retrieval (see nclt_evaluation.ipynb and nclt_visualize_retrieval.ipynb).
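A common way to evaluate retrieval with exported descriptors and ground-truth poses is recall@N: the fraction of queries for which at least one of the N nearest database images was taken close to the query position. The sketch below illustrates that metric under stated assumptions and does not reproduce nclt_evaluation.ipynb: the function `recall_at_n`, the 10-unit distance threshold, and the toy 2-D descriptors and positions are all hypothetical.

```python
import numpy as np


def recall_at_n(query_descs, db_descs, query_pos, db_pos,
                num_neighbors=1, max_dist=10.0):
    """Fraction of queries with a top-N retrieved image within max_dist."""
    hits = 0
    for desc, pos in zip(query_descs, query_pos):
        # Rank database images by descriptor distance to the query.
        order = np.argsort(np.linalg.norm(db_descs - desc, axis=1))
        retrieved = order[:num_neighbors]
        # A hit if any retrieved image lies close enough to the query position.
        if np.any(np.linalg.norm(db_pos[retrieved] - pos, axis=1) <= max_dist):
            hits += 1
    return hits / len(query_descs)


db_descs = np.array([[1.0, 0.0], [0.0, 1.0]])
db_pos = np.array([[0.0, 0.0], [100.0, 0.0]])
query_descs = np.array([[0.9, 0.1], [0.8, 0.2]])
query_pos = np.array([[2.0, 0.0], [98.0, 0.0]])

r1 = recall_at_n(query_descs, db_descs, query_pos, db_pos, num_neighbors=1)
r2 = recall_at_n(query_descs, db_descs, query_pos, db_pos, num_neighbors=2)
print(r1, r2)
```

In this toy case the second query retrieves the wrong place first, so it only counts as a hit once two neighbors are considered.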

Citation

Please consider citing the corresponding publication if you use this work in an academic context:

@inproceedings{sarlin2018leveraging,
  title={Leveraging Deep Visual Descriptors for Hierarchical Efficient Localization},
  author={Sarlin, Paul-Edouard and Debraine, Frederic and Dymczyk, Marcin and Siegwart, Roland and Cadena, Cesar},
  booktitle={Conference on Robot Learning (CoRL)},
  year={2018}
}

hierarchical_loc's Issues

Where to apply descriptor dimension reduction using PCA?

I am not clear about where the descriptor dimension reduction takes place in the training process of MobileNetVLAD. Is it applied to the global descriptor computed by the teacher network, with the reduced descriptor then used as the supervision for the student network? Or is it applied somewhere else?

Looking forward to your reply. Thanks!

Is there any Covisibility Clustering code?

Dear Paul,

I am having trouble understanding the Covisibility Clustering, and I cannot find the related code because I am not very experienced at reading code. Could you help me? Please point out where the Covisibility Clustering code is located, or an algorithm that can be run independently. Thanks so much.

How to calculate pose by a query image

Hi skydes,
Thank you for making your code public. After reading it, I found that it does not calculate the pose of the image. I tried to use PnP, but there is a mismatch of points. How do you calculate the camera pose?
