glichill / cosplace-extended-insights-into-architectural-and-feature-choices-for-visual-geo-localization

Extend CosPlace for Visual Geo-localization research, exploring architectures, data augmentation, and optimization for improved performance.

License: MIT License

Python 100.00%
adda data-augmentation deep-neural-networks domain-adaptation geolocation machine-learning netvlad pytorch resnet-18

CosPlace Extended: Insights into Visual Geo-localization

This project builds upon CosPlace, originally created by Gabriele Berton. For installation and usage details, please refer to the original repository.

Abstract

Our research aims to enhance the CosPlace framework for Visual Geo-localization (VG) by exploring various architectural and feature choices. We focus on constructing, training, and evaluating different models to gain insights into the effectiveness of specific design decisions in the VG pipeline. Our objective is to establish a systematic evaluation protocol for method comparison. Leveraging our framework, we conduct extensive experiments to benchmark and optimize model parameters while assessing the impact of engineering techniques on model performance.

Implementations

Data Augmentation

We've implemented several data augmentation techniques, including:

  • Horizontal flipping
  • Blurring
  • Color jittering

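Each technique was tested with the parameter settings listed in the table below. Color jitter with contrast in [1.0, 1.5] gave the largest R@1 gain over the baseline on both datasets (19.7 vs. 16.3 on SF_XS, 37.8 vs. 28.9 on Tokyo_XS), while horizontal flipping and Gaussian blur lowered R@1.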

| Augmentation | SF_XS R@1 | SF_XS R@5 | SF_XS R@10 | SF_XS R@20 | Tokyo_XS R@1 | Tokyo_XS R@5 | Tokyo_XS R@10 | Tokyo_XS R@20 |
|---|---|---|---|---|---|---|---|---|
| Baseline | 16.3 | 28.1 | 34.0 | 40.1 | 28.9 | 46.0 | 59.0 | 71.1 |
| Random Horizontal Flip | 15.1 | 27.1 | 32.6 | 37.9 | 27.6 | 51.7 | 61.9 | 72.1 |
| Gaussian Blur (kernel_size=5, sigma=(0.5,1)) | 14.5 | 25.3 | 32.1 | 38.3 | 26.1 | 49.8 | 60.0 | 70.1 |
| Color-Jitter with contrast [1.0, 1.5] | 19.7 | 33.0 | 37.9 | 43.6 | 37.8 | 53.7 | 59.0 | 70.2 |
| Color-Jitter with contrast [1.5, 2.0] | 19.5 | 32.1 | 38.3 | 43.8 | 36.5 | 52.7 | 62.2 | 70.8 |
| Color-Jitter with contrast [3.0, 4.0] | 16.9 | 30.1 | 35.8 | 41.9 | 30.8 | 49.2 | 54.0 | 66.0 |

Table: Augmentation results
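
As a rough illustration of two of the augmentations listed above, the following NumPy sketch applies a random horizontal flip followed by contrast jitter. It is not the repository's code: the function names are hypothetical, the contrast adjustment blends each pixel with the global mean intensity (a simplification), and Gaussian blur is left out for brevity.

```python
import numpy as np

def random_horizontal_flip(img, rng, p=0.5):
    """Flip an H x W x C image left-to-right with probability p."""
    if rng.random() < p:
        return img[:, ::-1, :]
    return img

def contrast_jitter(img, rng, low=1.0, high=1.5):
    """Scale contrast by a factor drawn uniformly from [low, high].

    Pixels are pushed away from the mean intensity; a factor of 1.0
    leaves the image unchanged. Values are clipped back to [0, 1].
    """
    factor = rng.uniform(low, high)
    mean = img.mean()
    return np.clip(mean + factor * (img - mean), 0.0, 1.0)

def augment(img, rng):
    """Apply flip, then contrast jitter, to a float image in [0, 1]."""
    return contrast_jitter(random_horizontal_flip(img, rng), rng)

rng = np.random.default_rng(0)
image = rng.random((4, 6, 3))   # stand-in for an RGB training image
augmented = augment(image, rng)
print(augmented.shape)
```

The `low` and `high` defaults mirror the [1.0, 1.5] contrast range from the table; the other ranges can be tried by changing those two arguments.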

Aggregation Layer

We explore three final aggregation layers:

  1. GeM Pooling Layer: The default aggregation layer in the original CosPlace.
  2. MixVPR Pooling Layer: Based on Amar Ali-bey, Brahim Chaib-draa, and Philippe Giguère, "MixVPR: Feature Mixing for Visual Place Recognition" (2023), with customizations to suit our requirements.
  3. NetVLAD Layer: This layer is initialized using an unsupervised learning approach to determine cluster centroids. The optimal number of clusters (15) is established through an elbow search using the k-means algorithm.
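
For readers unfamiliar with GeM (generalized-mean) pooling, here is a minimal NumPy sketch of the operation. It is an illustration under assumptions, not the CosPlace implementation: the exponent `p` is fixed here (PyTorch implementations commonly make it learnable), and the function name and default values are chosen for the example.

```python
import numpy as np

def gem_pool(features, p=3.0, eps=1e-6):
    """Generalized-mean pooling over the spatial dimensions.

    features: array of shape (C, H, W), e.g. a backbone's last feature map.
    Returns one value per channel, shape (C,).
    """
    clamped = np.clip(features, eps, None)   # avoid zero or negative bases
    return (clamped ** p).mean(axis=(1, 2)) ** (1.0 / p)

feature_map = np.array([[[1.0, 2.0], [3.0, 4.0]]])   # one channel, 2 x 2
print(gem_pool(feature_map, p=1.0))    # p = 1 reduces to average pooling
print(gem_pool(feature_map, p=3.0))    # in between average and max
print(gem_pool(feature_map, p=100.0))  # large p approaches max pooling
```

The single exponent lets the layer interpolate between average pooling and max pooling, which is why it is a common default for image retrieval descriptors.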

The table below shows that MixVPR achieves the highest recall on every metric for both datasets; for example, R@1 on SF_XS rises from 19.1 (GeM) and 22.8 (NetVLAD) to 30.9.

| Aggregator | SF_XS R@1 | SF_XS R@5 | SF_XS R@10 | SF_XS R@20 | Tokyo_XS R@1 | Tokyo_XS R@5 | Tokyo_XS R@10 | Tokyo_XS R@20 |
|---|---|---|---|---|---|---|---|---|
| GeM | 19.1 | 30.4 | 36.2 | 43.1 | 38.1 | 55.9 | 63.5 | 71.4 |
| NetVLAD | 22.8 | 39.0 | 44.7 | 51.0 | 36.2 | 55.2 | 64.4 | 72.7 |
| MixVPR | 30.9 | 44.4 | 51.8 | 57.4 | 48.9 | 68.9 | 75.9 | 81.6 |

Table: Aggregation results
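
The NetVLAD initialization described above (k-means centroids, with the number of clusters chosen by an elbow search) can be sketched as follows. This is a simplified NumPy illustration, not the project's code: the descriptors are synthetic 2-D points, `kmeans` is a plain Lloyd's algorithm, and picking the elbow by the largest second difference of the inertia curve is one heuristic assumed here; the source does not say how the elbow was identified.

```python
import numpy as np

def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's algorithm. Returns (centroids, inertia)."""
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, float(dists.min(axis=1).sum())

def elbow(ks, inertias):
    """Pick the k where the inertia curve bends most (largest second difference)."""
    bends = [inertias[i - 1] - 2 * inertias[i] + inertias[i + 1]
             for i in range(1, len(ks) - 1)]
    return ks[1 + int(np.argmax(bends))]

rng = np.random.default_rng(0)
# Hypothetical stand-in for local descriptors: three well-separated 2-D blobs.
descriptors = np.concatenate(
    [rng.normal(c, 0.1, size=(50, 2)) for c in (0.0, 5.0, 10.0)]
)
ks = list(range(1, 7))
inertias = [kmeans(descriptors, k, rng)[1] for k in ks]
print("chosen number of clusters:", elbow(ks, inertias))
```

In a NetVLAD setting, the centroids returned for the chosen k would serve as the initial cluster centers of the layer.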

Domain Adaptation (ADDA)

To address domain shift, we incorporated Adversarial Discriminative Domain Adaptation (ADDA), using Tokyo_XS and St_Lucia as target domains and testing on SF_XS. In this setup, we used the NetVLAD aggregation layer and trained a domain discriminator network alongside the model, with the goal of minimizing the discrepancy between the source and target feature distributions.
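
The adversarial objective behind this setup can be sketched with two loss functions. The NumPy code below follows the standard ADDA formulation (a discriminator that separates source from target features, and a target encoder trained with inverted labels); whether the repository uses exactly these losses is an assumption, and all function names are hypothetical.

```python
import numpy as np

def bce_with_logits(logits, targets):
    """Numerically stable binary cross-entropy on raw logits."""
    return float(np.mean(
        np.maximum(logits, 0) - logits * targets + np.log1p(np.exp(-np.abs(logits)))
    ))

def discriminator_loss(src_logits, tgt_logits):
    """The discriminator learns to label source features 1 and target features 0."""
    return (bce_with_logits(src_logits, np.ones_like(src_logits))
            + bce_with_logits(tgt_logits, np.zeros_like(tgt_logits)))

def target_encoder_loss(tgt_logits):
    """Inverted labels: the encoder is rewarded when target features look like source."""
    return bce_with_logits(tgt_logits, np.ones_like(tgt_logits))

# Discriminator outputs for a batch where the two domains are easy to tell apart.
src_logits = np.full(8, 5.0)
tgt_logits = np.full(8, -5.0)
print("discriminator loss:", discriminator_loss(src_logits, tgt_logits))
print("target encoder loss:", target_encoder_loss(tgt_logits))
```

When the discriminator separates the domains confidently, its own loss is small while the encoder loss is large; training alternates between the two so that target features drift toward the source distribution.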

The table below shows that using Tokyo_XS as the target domain yields higher recall on SF_XS at R@5, R@10, and R@20 than using St_Lucia, while R@1 is essentially tied (24.7 vs. 24.8).

| Source Domain | Target Domain | SF_XS Test R@1 | SF_XS Test R@5 | SF_XS Test R@10 | SF_XS Test R@20 |
|---|---|---|---|---|---|
| SF_XS train database | Tokyo_XS | 24.7 | 39.0 | 46.4 | 54.0 |
| SF_XS train database | St_Lucia | 24.8 | 38.3 | 44.2 | 51.3 |

Table: Domain Adaptation results
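
All tables in this document report Recall@N (R@N): the percentage of queries for which at least one of the N nearest database images is a correct match. The NumPy sketch below shows one way to compute it. The function name and toy data are hypothetical, and the set of correct matches per query is supplied directly; how positives are defined (for example, by a distance threshold on geographic coordinates) is not stated in this document.

```python
import numpy as np

def recall_at_n(query_desc, db_desc, positives, n_values=(1, 5, 10, 20)):
    """Percentage of queries with at least one correct match in the top N.

    positives[i] is the set of database indices counted as correct for query i.
    """
    # Pairwise squared Euclidean distances, shape (num_queries, num_db).
    dists = ((query_desc[:, None, :] - db_desc[None, :, :]) ** 2).sum(axis=2)
    ranking = np.argsort(dists, axis=1)
    recalls = {}
    for n in n_values:
        hits = sum(
            1 for i, ranked in enumerate(ranking)
            if positives[i] & set(ranked[:n].tolist())
        )
        recalls[n] = 100.0 * hits / len(query_desc)
    return recalls

database = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [10.0, 10.0]])
queries = np.array([[0.1, 0.0], [9.0, 9.0]])
positives = [{1}, {3}]   # correct database indices for each query
print(recall_at_n(queries, database, positives, n_values=(1, 2)))
# → {1: 50.0, 2: 100.0}
```

The first query's nearest neighbor is database image 0, which is not in its positive set, so it only counts as a hit from N = 2 onward; the second query is matched correctly at N = 1.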

Optimizers

In our final set of experiments, we evaluated the performance of the Adam and AdamW optimizers using both the NetVLAD and MixVPR aggregation layers. However, the results did not demonstrate a significant improvement in performance. For detailed optimizer results, please refer to the ReportML_DL.pdf file.
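
For context on what distinguishes the two optimizers, the sketch below implements a single update step in NumPy. It illustrates the general algorithms rather than the training code used here: with Adam, weight decay is added to the gradient as an L2 penalty and is then rescaled by the adaptive step, whereas AdamW applies the decay directly to the weights. The function name, the `decoupled` flag, and the hyperparameter values are chosen for the example.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, betas=(0.9, 0.999), eps=1e-8,
              weight_decay=0.0, decoupled=False):
    """One Adam update (decoupled=False) or AdamW update (decoupled=True)."""
    if weight_decay and not decoupled:
        grad = grad + weight_decay * w       # Adam: L2 penalty folded into the gradient
    m = betas[0] * m + (1 - betas[0]) * grad
    v = betas[1] * v + (1 - betas[1]) * grad ** 2
    m_hat = m / (1 - betas[0] ** t)          # bias correction for step t (1-based)
    v_hat = v / (1 - betas[1] ** t)
    if weight_decay and decoupled:
        w = w - lr * weight_decay * w        # AdamW: decay applied directly to the weights
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w0 = np.array([1.0])
zero = np.zeros(1)
# With a zero gradient, only the weight-decay term moves the weights.
w_adam, _, _ = adam_step(w0, zero, zero, zero, t=1, lr=0.1, weight_decay=0.1)
w_adamw, _, _ = adam_step(w0, zero, zero, zero, t=1, lr=0.1, weight_decay=0.1,
                          decoupled=True)
print("Adam: ", w_adam)
print("AdamW:", w_adamw)
```

In this toy step, Adam's decay term is normalized like any other gradient and moves the weight by roughly the full learning rate, while AdamW shrinks the weight by `lr * weight_decay`.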

This research extends the CosPlace framework, providing valuable insights into architectural and feature choices for visual geo-localization. Our experiments and findings contribute to a better understanding of how different design decisions impact model performance in this domain.
