Rethinking Visual Geo-localization for Large-Scale Applications

This is the official repository for the CVPR 2022 paper Rethinking Visual Geo-localization for Large-Scale Applications. The paper presents a new dataset called San Francisco eXtra Large (SF-XL, go here to download it), and a highly scalable training method (called CosPlace), which achieves state-of-the-art (SOTA) results with compact descriptors.

The images below represent respectively:

  1. the map of San Francisco eXtra Large
  2. a visualization of how CosPlace Groups (read datasets) are formed
  3. results with CosPlace vs other methods on Pitts250k (CosPlace trained on SF-XL, others on Pitts30k)

Train

After downloading the SF-XL dataset, simply run

$ python3 train.py --dataset_folder path/to/sf-xl/processed

The script automatically splits SF-XL into CosPlace Groups and saves the resulting object in the folder cache. By default, training uses a ResNet-18 with 512-dimensional descriptors, which fits in less than 4GB of VRAM.
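To make the idea of CosPlace Groups more concrete, here is a minimal sketch of the grouping scheme as described in the paper: each image is assigned to a class according to its UTM east/north cell and its heading cell, and classes are then distributed over groups so that classes within one group are not adjacent. This is not the code used by train.py. The function names, the input tuple layout, and the default values (M=10 meters, alpha=30 degrees, N=5, L=2) are assumptions made for illustration.

```python
from collections import defaultdict


def class_id(utm_east, utm_north, heading_deg, M=10, alpha=30):
    """Map an image to a class: (east cell, north cell, heading cell).

    M is the cell side in meters and alpha the heading bin in degrees
    (both values are assumptions for this sketch).
    """
    return (
        int(utm_east // M),
        int(utm_north // M),
        int((heading_deg % 360) // alpha),
    )


def group_id(cls, N=5, L=2):
    """Map a class to its group, so that two classes in the same group
    are at least N cells apart in space or L bins apart in heading."""
    east_cell, north_cell, heading_cell = cls
    return (east_cell % N, north_cell % N, heading_cell % L)


def build_groups(images, M=10, alpha=30, N=5, L=2):
    """images: iterable of (name, utm_east, utm_north, heading_deg).

    Returns {group_id: {class_id: [image names]}}.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for name, east, north, heading in images:
        cls = class_id(east, north, heading, M, alpha)
        groups[group_id(cls, N, L)][cls].append(name)
    return groups


if __name__ == "__main__":
    # Hypothetical images: two fall in the same group but different classes.
    images = [
        ("a.jpg", 12.0, 27.0, 45.0),
        ("b.jpg", 62.0, 27.0, 45.0),
        ("c.jpg", 22.0, 27.0, 45.0),
    ]
    for gid, classes in build_groups(images).items():
        print(gid, dict(classes))
```

Training on one group at a time means that neighbouring classes, which may contain nearly identical views, never compete against each other within the same classification problem.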

To change the backbone or the dimensionality of the output descriptors, simply run

$ python3 train.py --dataset_folder path/to/sf-xl/processed --backbone resnet50 --fc_output_dim 128

You can also speed up your training with Automatic Mixed Precision (note that none of the results/statistics in the paper were obtained with AMP)

$ python3 train.py --dataset_folder path/to/sf-xl/processed --use_amp16

Run $ python3 train.py -h to have a look at all the hyperparameters that you can change. You will find all hyperparameters mentioned in the paper.
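As a rough illustration of how the flags used in the commands above fit together, the following sketch defines a small argparse parser with the same flag names. It is a hypothetical stand-in, not the repository's actual parser: the help strings, the default spelling "resnet18", and the function name build_parser are assumptions, and the real script exposes more hyperparameters.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(
        description="Sketch of a CosPlace-style training CLI (illustrative only)"
    )
    parser.add_argument("--dataset_folder", type=str, required=True,
                        help="path to the processed SF-XL folder")
    # README default: ResNet-18 (the exact string is an assumption)
    parser.add_argument("--backbone", type=str, default="resnet18",
                        help="backbone architecture")
    # README default: descriptors of dimensionality 512
    parser.add_argument("--fc_output_dim", type=int, default=512,
                        help="dimensionality of the output descriptors")
    parser.add_argument("--use_amp16", action="store_true",
                        help="enable Automatic Mixed Precision")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--dataset_folder", "path/to/sf-xl/processed",
         "--backbone", "resnet50", "--fc_output_dim", "128"]
    )
    print(args.backbone, args.fc_output_dim, args.use_amp16)
```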

Reproducibility

Results from the paper are fully reproducible, and we followed deep learning's best practices (average over multiple runs for the main results, validation and hyperparameter search on the val set).

Test

You can test a trained model as follows

$ python3 eval.py --dataset_folder path/to/sf-xl/processed --backbone resnet50 --fc_output_dim 128 --resume_model path/to/best_model.pth
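This README does not spell out which metrics eval.py reports. Visual geo-localization results are commonly given as recall@N, i.e. the percentage of queries for which at least one of the top-N retrieved database images is a correct match. The sketch below computes that metric under this assumption; the function name and the input layout are hypothetical and not taken from the repository.

```python
def recall_at_n(predictions, positives, n_values=(1, 5, 10)):
    """Compute recall@N as a percentage.

    predictions: one ranked list of database indices per query (best first).
    positives:   one set of correct database indices per query.
    """
    recalls = {}
    for n in n_values:
        # A query counts as a hit if any of its top-n predictions is correct.
        hits = sum(
            1
            for preds, pos in zip(predictions, positives)
            if any(p in pos for p in preds[:n])
        )
        recalls[n] = 100.0 * hits / len(predictions)
    return recalls


if __name__ == "__main__":
    # Four hypothetical queries with three ranked predictions each.
    predictions = [[3, 1, 2], [0, 4, 5], [7, 8, 9], [2, 6, 0]]
    positives = [{3}, {4}, {1}, {0}]
    print(recall_at_n(predictions, positives, n_values=(1, 2, 3)))
```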

You can download plenty of trained models below.

Model Zoo

Pretrained networks with different backbones and descriptor dimensionalities, trained on SF-XL.

Model        Dimension of Descriptors
             32           64           128          256          512          1024         2048
ResNet-18    Coming Soon  Coming Soon  Coming Soon  Coming Soon  Coming Soon  -            -
ResNet-50    Coming Soon  Coming Soon  Coming Soon  Coming Soon  Coming Soon  Coming Soon  Coming Soon
ResNet-101   Coming Soon  Coming Soon  Coming Soon  Coming Soon  Coming Soon  Coming Soon  Coming Soon
ResNet-152   Coming Soon  Coming Soon  Coming Soon  Coming Soon  Coming Soon  Coming Soon  Coming Soon
VGG-16       Coming Soon  Coming Soon  Coming Soon  Coming Soon  Coming Soon  -            -

Cite

Here is the bibtex to cite our paper

@inProceedings{Berton_CVPR_2022_cosPlace,
  author = {Berton, Gabriele and Masone, Carlo and Caputo, Barbara},
  title = {Rethinking Visual Geo-localization for Large-Scale Applications},
  booktitle = {CVPR},
  month = {June},
  year = {2022},
}

Issues

If you find any problems in our code, or have any advice or questions, feel free to open an issue or send an email to [email protected]
