amaralibey / gsv-cities

GSV-Cities: a dataset and framework for visual place recognition

Python 73.33% Jupyter Notebook 26.67%
computer-vision dataset deep-learning image-based-localisation image-retrieval metric-learning visual-localization visual-place-recognition

gsv-cities's Introduction

GSV-Cities

Official repo for the Neurocomputing 2022 paper GSV-Cities: Toward Appropriate Supervised Visual Place Recognition

[ArXiv] [ScienceDirect] [Bibtex] [Dataset]

  • The dataset is hosted on [Kaggle].
  • Training can be run from main.py. The code is commented and should be clear. Feel free to open an issue if you have any questions.
  • To evaluate trained models, we rewrote the evaluation script for 8 benchmarks; see the included Jupyter Notebook.

Summary of the paper

  1. We collected GSV-Cities, a large-scale dataset for the task of Visual Place Recognition, with highly accurate ground truth.
    • It contains ~530k images.
    • There are more than 62k different places, spread across multiple cities around the globe.
    • Each place is depicted by at least 4 images (up to 20 images).
    • All places are physically distant (at least 100 meters between any pair of places).
  2. We proposed a fully convolutional aggregation technique (called Conv-AP) that outperforms NetVLAD and most existing SotA techniques.
  3. We consider representation learning for visual place recognition as a three-component pipeline, as follows:

pipeline

What can we do with the GSV-Cities dataset and the code base in this repo?

  • Obtain new state-of-the-art performance.
  • Train visual place recognition models extremely rapidly.
  • No offline triplet mining: GSV-Cities contains highly accurate ground truth. Batches are formed in a straightforward way, bypassing all the hassle of triplet preprocessing.
  • Rapid prototyping: no need to wait days for convergence (expect 10-15 minutes per epoch).
  • All existing techniques can benefit from training on GSV-Cities.
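
The straightforward batch formation mentioned above could look roughly like the sketch below: pick a number of places at random, draw a fixed number of images from each, and use the place ID as the label. This is only an illustration under assumptions, not the repository's dataloader. `sample_batch`, `places_per_batch`, `images_per_place` and the toy `toy_index` dictionary are hypothetical names standing in for the real dataframes.

```python
import random

def sample_batch(index, places_per_batch, images_per_place, rng=random):
    """Form one batch as (image_name, place_id) pairs.

    `index` maps each place ID to the list of its image names.
    """
    # Only places with enough images can contribute a full group.
    eligible = [p for p, imgs in index.items() if len(imgs) >= images_per_place]
    chosen = rng.sample(eligible, places_per_batch)
    batch = []
    for place_id in chosen:
        for name in rng.sample(index[place_id], images_per_place):
            batch.append((name, place_id))
    return batch

# Toy index: 6 places with 5 images each (hypothetical file names).
toy_index = {p: [f"place{p}_img{i}.JPG" for i in range(5)] for p in range(6)}
batch = sample_batch(toy_index, places_per_batch=3, images_per_place=4,
                     rng=random.Random(0))
```

In such a batch, images that share a place ID can serve as positives for each other and images from different places as negatives, which is why no triplets need to be precomputed.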

Trained models

Please refer to the following Jupyter Notebook for evaluation.

Backbone  Output dimension  Pitts250k-test  Pitts30k-test  MSLS-val      Nordland
                            R@1    R@5      R@1    R@5     R@1    R@5    R@1    R@5
ResNet50  8192 [2048x2x2]   92.8   97.7     90.5   95.2    83.1   90.3   42.7   58.8   LINK
ResNet50  4096 [1024x2x2]   92.5   97.7     90.5   95.3    83.5   89.7   42.6   59.8
ResNet50  2048 [512x2x2]    92.3   97.5     90.6   95.1    83.4   90.3   40.3   56.6
ResNet50  512 [128x2x2]     90.7   96.6     89.1   94.6    82.6   90.0   36.3   53.1
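
For readers unfamiliar with the R@1 and R@5 columns: in visual place recognition, Recall@K is commonly computed as the percentage of queries for which at least one of the top K retrieved images is a correct match. The sketch below illustrates that definition on toy data. It is not the repository's evaluation script, and `recall_at_k` is a hypothetical helper.

```python
def recall_at_k(predictions, ground_truth, k):
    """Percentage of queries with at least one correct match in the top k.

    predictions[i]  : database indices retrieved for query i, best first
    ground_truth[i] : collection of database indices that are correct for query i
    """
    hits = 0
    for ranked, positives in zip(predictions, ground_truth):
        if any(idx in positives for idx in ranked[:k]):
            hits += 1
    return 100.0 * hits / len(predictions)

# Toy example with three queries.
preds = [[4, 1, 7], [2, 9, 3], [5, 6, 8]]
gt = [{1}, {2, 3}, {0}]
r1 = recall_at_k(preds, gt, k=1)  # only the second query is correct at rank 1
r3 = recall_at_k(preds, gt, k=3)  # the first two queries have a hit in the top 3
```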

Code to load the pretrained weights is as follows:

import torch

from main import VPRModel

# Note that these models have been trained with images resized to 320x320
# Also, either use BILINEAR or BICUBIC interpolation when resizing.
# The model with 4096-dim output has been trained with images resized with bicubic interpolation
# The model with 8192-dim output with bilinear interpolation
# ConvAP works with all image sizes, but best performance can be achieved when resizing to the training resolution

model = VPRModel(backbone_arch='resnet50', 
                 layers_to_crop=[],
                 agg_arch='ConvAP',
                 agg_config={'in_channels': 2048,
                            'out_channels': 1024,
                            's1' : 2,
                            's2' : 2},
                )


state_dict = torch.load('./LOGS/resnet50_ConvAP_1024_2x2.ckpt')
model.load_state_dict(state_dict)
model.eval()

GSV-Cities dataset overview

  • GSV-Cities contains ~530,000 images representing ~62,000 different places, spread across multiple cities around the globe.
  • All places are physically distant (at least 100 meters between any pair of places).
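
To make the 100-meter separation concrete, the distance between two places can be estimated from their coordinates with the haversine formula. This is a generic sketch, not necessarily how the authors enforced the constraint. `haversine_m` is a hypothetical helper, and the Earth radius of 6,371 km is the usual spherical approximation. The coordinates are taken from two rows of the London.csv excerpt shown further down.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# London places 9292 and 8713 from the metadata excerpt.
d = haversine_m(51.531, -0.12702, 51.5281, -0.127114)
far_enough = d >= 100
```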

example

Database organisation

Unlike existing visual place recognition datasets, where images are organised in a way that is not easy for a human to explore, images in GSV-Cities are named as follows:

city_placeID_year_month_bearing_latitude_longitude_panoid.JPG

This naming scheme makes it possible to explore the dataset with the default image viewer of the OS. It also stores the metadata redundantly, in case the dataframes get lost or corrupted.
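
Because the metadata is encoded in the file name, it can be recovered without the dataframes. The sketch below assumes the field order given above and assumes that the city prefix contains no underscore. The panoid may contain underscores, so only the first seven are used as separators. `parse_image_name` and `FIELDS` are hypothetical names.

```python
from pathlib import Path

FIELDS = ("city", "place_id", "year", "month", "bearing", "lat", "lon", "panoid")

def parse_image_name(filename):
    """Recover the metadata encoded in a GSV-Cities image name."""
    stem = Path(filename).stem  # drops the trailing ".JPG"
    # Split on the first 7 underscores only: the panoid may itself contain '_'.
    parts = stem.split("_", 7)
    if len(parts) != len(FIELDS):
        raise ValueError(f"unexpected file name: {filename}")
    meta = dict(zip(FIELDS, parts))
    for key in ("place_id", "year", "month", "bearing"):
        meta[key] = int(meta[key])
    meta["lat"], meta["lon"] = float(meta["lat"]), float(meta["lon"])
    return meta

meta = parse_image_name(
    "PRS_0000003_2015_05_584_48.79733778544615_2.231461206488333_7P0FnGV3k4Fmtw66b8_-Gg.JPG"
)
```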

The dataset is organised as follows:

├── Images
│   ├── Paris
│   │   ├── ...
│   │   ├── PRS_0000003_2015_05_584_48.79733778544615_2.231461206488333_7P0FnGV3k4Fmtw66b8_-Gg.JPG
│   │   ├── PRS_0000003_2018_05_406_48.79731397404108_2.231417994064803_R2vU9sk2livhkYbhy8SFfA.JPG
│   │   ├── PRS_0000003_2019_07_411_48.79731121699659_2.231424930041198_bu4vOZzw3_iU5QxKiQciJA.JPG
│   │   ├── ...
│   ├── Boston
│   │   ├── ...
│   │   ├── Boston_0006385_2015_06_121_42.37599246498178_-71.06902130162344_2MyXGeslIiua6cMcDQx9Vg.JPG
│   │   ├── Boston_0006385_2018_09_117_42.37602467319898_-71.0689666533628_NWx_VsRKGwOQnvV8Gllyog.JPG
│   │   ├── ...
│   ├── Quebec
│   │   ├── ...
│   ├── ...
└── Dataframes
    ├── Paris.csv
    ├── London.csv
    ├── Quebec.csv
    ├── ...

Each dataframe contains the metadata of its corresponding city. This makes it possible to access the dataset almost instantly using Pandas. For example, here are 5 rows from London.csv:

place_id year month northdeg city_id lat lon panoid
130 2018 4 15 London 51.4861 -0.0895151 6jFjb3wGyCkcBfq4k559ag
6793 2016 7 2 London 51.5187 -0.160767 Ff3OtsS4ihGSPdPjtlpEUA
9292 2018 1 289 London 51.531 -0.12702 0t-xcCsazIGAjdNC96IF0w
7660 2015 6 97 London 51.5233 -0.158693 zFbmpj8jt8natu7IPYrh_w
8713 2008 9 348 London 51.5281 -0.127114 W3KMPec54NBqLMzmZmGv-Q

If we want only places that are depicted by at least 8 images each, we can simply filter the dataset using pandas as follows:

import pandas as pd

df = pd.read_csv('London.csv')
df = df[df.groupby('place_id')['place_id'].transform('size') >= 8]

Note that, given a dataframe row, we can directly read its corresponding image (the first row of the above example corresponds to the image named ./Images/London/London_0000130_2018_04_015_51.4861_-0.0895151_6jFjb3wGyCkcBfq4k559ag.JPG).
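
A minimal sketch of that mapping follows. The zero-padding widths (7, 4, 2 and 3 digits) are inferred from the example file names rather than taken from the repository, and `image_name_from_row` is a hypothetical helper. The latitude and longitude must be reproduced exactly as stored, so keeping them as strings (for example by reading those columns as text) is an assumption that avoids float formatting surprises. The folder has to be supplied separately, since the Paris folder holds files prefixed with PRS.

```python
def image_name_from_row(row):
    """Build the image file name from one metadata row (a dict or a pandas row)."""
    return "_".join([
        str(row["city_id"]),
        str(row["place_id"]).zfill(7),
        str(row["year"]).zfill(4),
        str(row["month"]).zfill(2),
        str(row["northdeg"]).zfill(3),
        str(row["lat"]),
        str(row["lon"]),
        str(row["panoid"]),
    ]) + ".JPG"

# First row of the London.csv excerpt, with lat/lon kept as strings.
row = {"place_id": 130, "year": 2018, "month": 4, "northdeg": 15,
       "city_id": "London", "lat": "51.4861", "lon": "-0.0895151",
       "panoid": "6jFjb3wGyCkcBfq4k559ag"}
name = image_name_from_row(row)
```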

We can, for example, query the dataset with only places that are in the northern hemisphere, taken between 2012 and 2016 during the month of July, each depicted by at least 16 images.
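
Such a query could be written roughly as follows, using the column names from the London.csv excerpt. This sketch assumes that "northern hemisphere" means a positive latitude and that the minimum image count applies to the rows remaining after the year and month filters. `query_places` is a hypothetical helper, and the toy dataframe uses a lower threshold so the example stays small.

```python
import pandas as pd

def query_places(df, min_images=16):
    """Rows in the northern hemisphere, taken in July of 2012-2016,
    restricted to places that still have at least `min_images` such rows."""
    mask = (df["lat"] > 0) & df["year"].between(2012, 2016) & (df["month"] == 7)
    subset = df[mask]
    counts = subset.groupby("place_id")["place_id"].transform("size")
    return subset[counts >= min_images]

# Toy metadata: place 1 qualifies, place 2 has too few matching rows,
# place 3 lies in the southern hemisphere.
toy = pd.DataFrame({
    "place_id": [1, 1, 1, 2, 2, 3, 3],
    "year":     [2012, 2014, 2016, 2013, 2019, 2013, 2014],
    "month":    [7, 7, 7, 7, 7, 7, 6],
    "lat":      [51.5, 51.5, 51.5, 51.5, 51.5, -33.9, -33.9],
})
result = query_places(toy, min_images=2)
```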

Stay tuned for tutorials in the coming weeks.

Cite

Use the following BibTeX entry to cite our paper:

@article{ali2022gsv,
  title={GSV-Cities: Toward appropriate supervised visual place recognition},
  author={Ali-bey, Amar and Chaib-draa, Brahim and Gigu{\`e}re, Philippe},
  journal={Neurocomputing},
  volume={513},
  pages={194--203},
  year={2022},
  publisher={Elsevier}
}

gsv-cities's People

Contributors

amaralibey

gsv-cities's Issues

about aggregators

Hi, Amar
Thanks for the great work.
I just noticed that the implementation of the NetVLAD aggregator is missing in the current code.
I would like to ask how to use GSV-Cities to train NetVLAD.
Would you please release the code for it?

pitts dataset

Hi, may I ask how the ground truth (pitts30k_test_gt.npy) is arranged? It doesn't seem to be sorted by distance.

main.py's bug?

Hi, Amar
Thanks for the great work.
I found a bug. In main.py, in validation_epoch(), the length of feats is far less than num_references, which makes the divisor in utils.get_validation_recall() zero and produces NaN.

A Question About Batch-Sizes

Hi There,

Thank you for this excellent code base.

I wanted to ask about batch sizes. In the paper you report results on a batch of 100*4 = 400.

I can't achieve that due to GPU Memory. I am using a batch size of 32 but cannot get my models to converge. Did you experiment with smaller batch sizes ?

lacking of MSLS files

Hi, Amar
Thanks for the great work.
I just noticed that /datasets/msls_val/msls_val_dbImages.npy, msls_val_qImages.npy and msls_val_pIdx.npy are missing in the current code.
Would you please release the code for it?

LICENSE

Hi, I did not find a license file in your repo. Can you add one, please? Is it the MIT license?
Fantastic work!

Dataset availability

The paper says that the dataset and code are available. But I don't see the dataset link anywhere on this repo. Am I missing something?

about running problem

Hello, I have a question about training. I ran into a problem when I tried to run main.py. Following the prompt, I tried modifying the value of max_steps, but it still shows max_steps=1. Can you give me some suggestions about this problem?

No Implementation of ConvAP module

Hi, Amar
Thanks for the great work.
I just noticed that the implementation of the ConvAP module is missing in the current code.
Would you please release the code for it?

About training and migration

Hello, since I am a novice, I have some questions about training. I got an error when running main.py: Missing Logger folder: LOGS/RESNET50/lightning. Do I need to change the default path, and if so, how? Also, can I use the MLP feature aggregation method for image retrieval on other datasets?

Full release of code and dataset

Hi amar,

Thank you for the amazing repo.

I am working on a similar project involving VPR, and the data introduced in your paper fits my needs well.

I am just wondering when the code and dataset will be fully released.

Thx again.

traindataset

Hello @amaralibey.
When constructing the training set, the order of places within a city is shuffled, whereas there is an order between cities because buildings within a city share a similar architectural style, whereas architectural styles differ between cities.

some questions about this experiments

On my device, the result on Nordland is not good. I think some parameters may be incorrect, but I don't know which ones to adjust. In addition, I would like to ask how the ground truth of the dataset is made. Thank you!

Datasets construction

Hi,

Thanks for the awesome work! I am new here, and I ran into some questions while reading your code.

What does the GT_ROOT variable refer to? In your default setting, GT_ROOT is "/home/USER/work/gsv-cities/datasets/"

But how to construct this directory? I am confused.

E.g., I downloaded ESSEX3IN1 dataset. But there are no ESSEX_dbImages.npy, ESSEX_qImages.npy and ESSEX/ESSEX_gt.npy. What should I do?

release pre-trained model

Hi amar,

Thank you for the amazing repo.

I'm interested in VPR. However, since I am limited by my machine (GPU), it is a little hard for me to train on all the data. So I would like to know whether you will release the pre-trained model.

Thanks

Dataset query image generation rule

Hi, may I ask how to select the query images and create the following npy files, using the MSLS dataset as an example?

self.dbImages = np.load(GT_ROOT+'msls_val/msls_val_dbImages.npy')
self.qIdx = np.load(GT_ROOT+'msls_val/msls_val_qIdx.npy')
self.qImages = np.load(GT_ROOT+'msls_val/msls_val_qImages.npy')
self.ground_truth = np.load(GT_ROOT+'msls_val/msls_val_pIdx.npy', allow_pickle=True)

I am trying to build my own datasets and test your models

Some questions in the reproduction process

Hello @amaralibey, first of all, many thanks for your great work, I really appreciate it :) While reproducing the results, I ran into some problems and need help.

1. How can I test on the pitts250k val/test dataset with the trained weights?
2. If something unexpected stops the training, how can I resume training from the existing trained weights?
3. In the test stage, I want to obtain the list of the top N images retrieved for each query in the test dataset. It just needs to output a txt file (including the prediction results). How can I achieve this?
4. I noticed that PCA was used in the paper, but I didn't find it in the code. May I ask how to implement it?

I am always looking forward to your kind response:)
Best regards.

Using the model

Hi! Thank you so much for your work! I am very new to the space of VPR and would like to ask how a VPR model can be deployed. I have some experience with YOLO computer vision where there is a script to run predictions. May I ask how I can do the same for VPR?

Thank you so much for your help!
