OrienterNet
Visual Localization in 2D Public Maps
with Neural Matching

Paul-Edouard Sarlin · Daniel DeTone · Tsun-Yi Yang · Armen Avetisyan · Julian Straub
Tomasz Malisiewicz · Samuel Rota Bulo · Richard Newcombe · Peter Kontschieder · Vasileios Balntas

CVPR 2023

OrienterNet is a deep neural network that can accurately localize an image
using the same 2D semantic maps that humans use to orient themselves.

This repository hosts the source code for OrienterNet, a research project by Meta Reality Labs. OrienterNet leverages the power of deep learning to provide accurate positioning of images using free and globally-available maps from OpenStreetMap. As opposed to complex existing algorithms that rely on 3D point clouds, OrienterNet estimates a position and orientation by matching a neural Bird's-Eye-View with 2D maps.

Installation

OrienterNet requires Python >= 3.8 and PyTorch. To run the demo, clone this repo and install the minimal requirements:

git clone https://github.com/facebookresearch/OrienterNet
cd OrienterNet
python -m pip install -r requirements/demo.txt

To run the evaluation and training, install the full requirements:

python -m pip install -r requirements/full.txt

Demo

Check out the Jupyter notebook demo.ipynb (run it on Colab!) for a minimal demo - take a picture with your phone in any city and find its exact location in a few seconds!

OrienterNet positions any image within a large area - try it with your own images!

Evaluation

Mapillary Geo-Localization dataset

To obtain the dataset:

  1. Create a developer account at mapillary.com and obtain a free access token.
  2. Run the following script to download the data from Mapillary and prepare it:
python -m maploc.data.mapillary.prepare --token $YOUR_ACCESS_TOKEN

By default the data is written to the directory ./datasets/MGL/. Then run the evaluation with the pre-trained model:

python -m maploc.evaluation.mapillary --experiment OrienterNet_MGL model.num_rotations=256

This downloads the pre-trained models if necessary. The results should be close to the following:

Recall xy_max_error: [14.37, 48.69, 61.7] at (1, 3, 5) m/°
Recall yaw_max_error: [20.95, 54.96, 70.17] at (1, 3, 5) m/°
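
To make the numbers above easier to read, here is a minimal sketch of how a recall at several thresholds can be computed from per-example errors. It is not the repository's implementation: the function names, the inclusive `<=` comparison, and the sample errors are assumptions made for illustration.

```python
from typing import List, Sequence


def recall_at_thresholds(
    errors: Sequence[float], thresholds: Sequence[float] = (1, 3, 5)
) -> List[float]:
    """Percentage of examples whose error is at or below each threshold."""
    if not errors:
        raise ValueError("errors must not be empty")
    n = len(errors)
    return [round(100.0 * sum(e <= t for e in errors) / n, 2) for t in thresholds]


def yaw_error_deg(pred_deg: float, gt_deg: float) -> float:
    """Smallest absolute angular difference in degrees, in [0, 180]."""
    diff = (pred_deg - gt_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


# Made-up position errors in meters, only to show the call.
position_errors_m = [0.4, 2.5, 0.9, 7.2, 4.1]
print(recall_at_thresholds(position_errors_m))  # [40.0, 60.0, 80.0]

# Angles wrap around, so 350 degrees and 10 degrees are 20 degrees apart.
print(yaw_error_deg(350.0, 10.0))  # 20.0
```

Read this way, each reported list gives the percentage of test images localized within 1, 3, and 5 meters (for position) or degrees (for yaw).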

This requires a GPU with 11GB of memory. If you run into OOM issues, consider reducing the number of rotations (the default is 256):

python -m maploc.evaluation.mapillary --experiment OrienterNet_MGL \
    model.num_rotations=128
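
Reducing the number of rotations trades memory for angular resolution. The sketch below shows the arithmetic under the assumption that the rotation hypotheses are spaced evenly over the full 360 degrees; the helper name is hypothetical and not part of the codebase.

```python
def rotation_step_deg(num_rotations: int) -> float:
    """Angular spacing between hypotheses, assuming they evenly cover 360 degrees."""
    if num_rotations <= 0:
        raise ValueError("num_rotations must be positive")
    return 360.0 / num_rotations


for n in (256, 128, 64):
    # 256 -> 1.40625, 128 -> 2.8125, 64 -> 5.625 degrees per step
    print(f"model.num_rotations={n}: {rotation_step_deg(n)} degrees per step")
```

Under that assumption, halving the number of rotations doubles the angular step, so the recall at the tightest yaw threshold may drop slightly.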

To export visualizations for the first 100 examples:

python -m maploc.evaluation.mapillary --experiment OrienterNet_MGL \
    --output_dir ./viz_MGL/ --num 100 

To run the evaluation in sequential mode (by default with 10 frames):

python -m maploc.evaluation.mapillary --experiment OrienterNet_MGL --sequential

KITTI dataset

  1. Download and prepare the dataset to ./datasets/kitti/:
python -m maploc.data.kitti.prepare
  2. Run the evaluation with the model trained on MGL:
python -m maploc.evaluation.kitti --experiment OrienterNet_MGL

You should expect the following results:

Recall directional_error: [[50.33, 85.18, 92.73], [24.38, 56.13, 67.98]] at (1, 3, 5) m/°
Recall yaw_max_error: [29.22, 68.2, 84.49] at (1, 3, 5) m/°
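
The `directional_error` entry holds two lists of recalls. Assuming they correspond to the lateral and longitudinal components of the position error, as is commonly reported for driving datasets, the sketch below shows one way to split a 2D position error along the vehicle heading. The function name and the yaw convention (counter-clockwise from the +x axis) are assumptions, not taken from the codebase.

```python
import math
from typing import Tuple


def split_position_error(dx: float, dy: float, yaw_deg: float) -> Tuple[float, float]:
    """Return (lateral, longitudinal) absolute errors in the vehicle frame.

    (dx, dy) is the predicted minus the true position in map coordinates.
    yaw_deg is the true heading, counter-clockwise from +x (assumed convention).
    """
    yaw = math.radians(yaw_deg)
    forward_x, forward_y = math.cos(yaw), math.sin(yaw)
    # Projection onto the heading direction and onto its perpendicular.
    longitudinal = abs(dx * forward_x + dy * forward_y)
    lateral = abs(-dx * forward_y + dy * forward_x)
    return lateral, longitudinal


# A 3 m error along the direction of travel is purely longitudinal.
lateral, longitudinal = split_position_error(3.0, 0.0, 0.0)
print(lateral, longitudinal)  # 0.0 3.0
```

If that reading is right, the gap between the two lists reflects that position along a street is harder to pin down than position across it.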

You can similarly export some visual examples:

python -m maploc.evaluation.kitti --experiment OrienterNet_MGL \
    --output_dir ./viz_KITTI/ --num 100 

Aria Detroit & Seattle

We are currently unable to release the dataset used to evaluate OrienterNet in the CVPR 2023 paper.

Training

MGL dataset

We trained the model on the MGL dataset using 3x 3090 GPUs (24GB VRAM each) and a total batch size of 12 for 340k iterations (about 3-4 days) with the following command:

python -m maploc.train experiment.name=OrienterNet_MGL_reproduce

Feel free to use any other experiment name. Configurations are managed by Hydra and OmegaConf so any entry can be overridden from the command line. You may thus reduce the number of GPUs and the batch size via:

python -m maploc.train experiment.name=OrienterNet_MGL_reproduce \
  experiment.gpus=1 data.loading.train.batch_size=4
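
To illustrate what such a command-line override does, here is a simplified, standard-library stand-in for dotted-key overrides. It is not how Hydra or OmegaConf are implemented, and the default values in `defaults` are placeholders rather than the repository's actual configuration.

```python
import copy
from typing import Any, Dict, List


def parse_value(raw: str) -> Any:
    """Interpret a raw string as int or float when possible, else keep it as a string."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Return a copy of config with each 'a.b.c=value' override applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        key, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        *parents, leaf = key.split(".")
        node = result
        for name in parents:
            node = node.setdefault(name, {})  # create missing levels on the way down
        node[leaf] = parse_value(raw)
    return result


# Placeholder defaults, for illustration only.
defaults = {
    "experiment": {"name": "default", "gpus": 3},
    "data": {"loading": {"train": {"batch_size": 12}}},
}
cfg = apply_overrides(
    defaults,
    [
        "experiment.name=OrienterNet_MGL_reproduce",
        "experiment.gpus=1",
        "data.loading.train.batch_size=4",
    ],
)
print(cfg["experiment"], cfg["data"]["loading"]["train"])
```

Each dotted key walks down one level of the nested configuration, which is why any entry can be changed without editing the configuration files.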

Be aware that this can reduce the overall performance. The checkpoints are written to ./experiments/experiment_name/. Then run the evaluation:

# the best checkpoint:
python -m maploc.evaluation.mapillary --experiment OrienterNet_MGL_reproduce
# a specific checkpoint:
python -m maploc.evaluation.mapillary \
    --experiment OrienterNet_MGL_reproduce/checkpoint-step=340000.ckpt
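
If you want to pick a specific checkpoint programmatically, the following sketch finds the one with the highest step in an experiment directory. It assumes that checkpoint files follow the `checkpoint-step=<N>.ckpt` naming shown above; the helper names are hypothetical and not part of the codebase.

```python
import re
from pathlib import Path
from typing import Optional

# Matches file names such as "checkpoint-step=340000.ckpt".
CKPT_PATTERN = re.compile(r"checkpoint-step=(\d+)\.ckpt$")


def checkpoint_step(name: str) -> Optional[int]:
    """Extract the training step from a checkpoint file name, or None if it does not match."""
    match = CKPT_PATTERN.search(name)
    return int(match.group(1)) if match else None


def latest_checkpoint(experiment_dir: Path) -> Optional[Path]:
    """Return the checkpoint with the highest step in experiment_dir, if any."""
    candidates = []
    for path in experiment_dir.glob("*.ckpt"):
        step = checkpoint_step(path.name)
        if step is not None:
            candidates.append((step, path))
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]


print(checkpoint_step("checkpoint-step=340000.ckpt"))  # 340000
print(latest_checkpoint(Path("./experiments/OrienterNet_MGL_reproduce")))
```

The path of the selected file relative to ./experiments/ can then be passed to --experiment as in the command above.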

KITTI

To fine-tune a trained model on the KITTI dataset:

python -m maploc.train experiment.name=OrienterNet_MGL_kitti data=kitti \
    training.finetune_from_checkpoint='"experiments/OrienterNet_MGL_reproduce/checkpoint-step=340000.ckpt"'

Interactive development

We provide several visualization notebooks:

OpenStreetMap data

To make sure that the results are consistent over time, we used OSM data downloaded from Geofabrik in November 2021. By default, the dataset scripts maploc.data.[mapillary,kitti].prepare download pre-generated raster tiles. If you wish to use different OSM classes, you can pass --generate_tiles, which will download and use our prepared raw .osm XML files. You may alternatively download more recent files.

License

The MGL dataset is made available under the CC-BY-SA license following the data available on the Mapillary platform. The model implementation and the pre-trained weights follow a CC-BY-NC license. Keep in mind that OpenStreetMap follows a different license.

BibTeX citation

Please consider citing our work if you use any code from this repo or ideas presented in the paper:

@inproceedings{sarlin2023orienternet,
  author    = {Paul-Edouard Sarlin and
               Daniel DeTone and
               Tsun-Yi Yang and
               Armen Avetisyan and
               Julian Straub and
               Tomasz Malisiewicz and
               Samuel Rota Bulo and
               Richard Newcombe and
               Peter Kontschieder and
               Vasileios Balntas},
  title     = {{OrienterNet: Visual Localization in 2D Public Maps with Neural Matching}},
  booktitle = {CVPR},
  year      = {2023},
}
