
anyloc / anyloc

399 stars, 9 watchers, 35 forks, 45.75 MB

AnyLoc: Universal Visual Place Recognition (RA-L 2023)

Home Page: https://anyloc.github.io/

License: BSD 3-Clause "New" or "Revised" License

Languages: Python 90.22%, Shell 9.78%
Topics: deep-learning, image-retrieval, localization, place-recognition, robotics, descriptors, vlad, anyloc

anyloc's People

Contributors

gmberton, nik-v9, oravus, TheProjectsGuy


anyloc's Issues

NetVLAD Results

Hi,

Thank you so much for releasing the code for this interesting paper and documenting it. Quick question: you include code for the MixVPR and CosPlace approaches that you used for comparison in the paper, but I can't find any references to NetVLAD. How did you produce the results for that approach? Is there code for it in this repository?

How to achieve the demo localization result?

Hi, I'm a newbie. How can I use AnyLoc locally to reproduce the localization result shown in the demo? Specifically, given the query image on the left, find its location in the image database, as shown below.
(image)

vpr bench error

Hi, thanks for your great work.
I am trying to run dino_v2_global_vpr.py from scripts with my own dataset (database + queries), but I ran into an error in the generate_positives_and_utms function from datasets_ws.py: it asks for ground_truth_new.npy.

join(self.dataset_folder,'ground_truth_new.npy'))

Could you please tell me how to get/generate this file from my dataset and what it should contain?
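
For illustration only (the exact contents expected by datasets_ws.py are best confirmed by the authors), per-query positives within a distance threshold could be derived from database/query UTM coordinates and saved roughly like this:

import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical sketch: db_utms and q_utms are (N, 2) arrays of UTM (east, north)
# coordinates for the database and query images (assumed to be available).
knn = NearestNeighbors(radius=25.0)          # 25 m positive threshold (an assumption)
knn.fit(db_utms)
positives = knn.radius_neighbors(q_utms, return_distance=False)  # one index array per query

out = np.empty(len(positives), dtype=object)
out[:] = list(positives)
np.save('ground_truth_new.npy', out)         # assumed file layout, not a confirmed format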

Need help when running python dino_v2_vlad.py on the terminal

There is an error. I added print(db_vlads) at line 368 of scripts/dino_v2_vlad.py,

which shows tensor([nan]) after running python dino_v2_vlad.py on the terminal,

when I modify line 93 of scripts/dino_v2_vlad.py, changing desc_facet: Literal["query", "key", "value", "token"] from "query" to "value",
and modify line 82 of config.py (in the main AnyLoc directory) to vg_dataset_name = "st_lucia".

The dataset was downloaded from the OneDrive link mentioned in the README.md.

The output of tree ./st_lucia is:

./st_lucia
|-- [  26]  images
|   `-- [  49]  test
|       |-- [148K]  database [1549 entries exceeds filelimit, not opening dir]
|       `-- [140K]  queries [1464 entries exceeds filelimit, not opening dir]
`-- [695K]  map_st_lucia.png

I would greatly appreciate it if you could reply

VLAD Caching for the database and query

Hi,
I am trying to cache my own database and query, but I ran into this error:

Unhandled exception
Traceback (most recent call last):
  File "/home/yazan/.local/lib/python3.8/site-packages/torch/serialization.py", line 423, in save
    _save(obj, opened_zipfile, pickle_module, pickle_protocol)
  File "/home/yazan/.local/lib/python3.8/site-packages/torch/serialization.py", line 650, in _save
    zip_file.write_record(name, storage.data_ptr(), num_bytes)
RuntimeError: [enforce fail at inline_container.cc:445] . PytorchStreamWriter failed writing file data/0: file write failed

During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "dino_v2_global_vocab_vlad.py", line 737, in <module>
    main(largs)
  File "/home/yazan/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "dino_v2_global_vocab_vlad.py", line 528, in main
    db_vlads, qu_vlads = build_vlads_fm_global(largs, vpr_ds,
  File "/home/yazan/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "dino_v2_global_vocab_vlad.py", line 406, in build_vlads_fm_global
    db_vlads: torch.Tensor = vlad.generate_multi(full_db,
  File "/home/yazan/workspace/AnyLoc/utilities.py", line 1188, in generate_multi
    res = [self.generate(q, c) \
  File "/home/yazan/workspace/AnyLoc/utilities.py", line 1188, in <listcomp>
    res = [self.generate(q, c) \
  File "/home/yazan/workspace/AnyLoc/utilities.py", line 1107, in generate
    residuals = self.generate_res_vec(query_descs, cache_id)
  File "/home/yazan/workspace/AnyLoc/utilities.py", line 1242, in generate_res_vec
    torch.save(residuals,f"{self.cache_dir}/{cache_id}_r.pt")
  File "/home/yazan/.local/lib/python3.8/site-packages/torch/serialization.py", line 424, in save
    return
  File "/home/yazan/.local/lib/python3.8/site-packages/torch/serialization.py", line 290, in __exit__
    self.file_like.write_end_of_file()
RuntimeError: [enforce fail at inline_container.cc:325] . unexpected pos 320 vs 245 

Any ideas how to fix this?

Question about MSLS

Great work!
May I ask why the MSLS dataset was not used in your paper? Will you use it in future work?

Some questions that need help

I have been reading the code recently. I noticed that the images passed to extract_patch_descriptors are processed one by one. Would it be better for efficiency to process them in batches (if more GPU memory is available)?
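
For reference, a batched variant could look roughly like the sketch below (not the repository's extract_patch_descriptors; it assumes a DINOv2 model from torch.hub, a GPU, and images already resized to a common H x W divisible by 14):

import torch

model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitg14').eval().cuda()

@torch.no_grad()
def extract_patch_descriptors_batched(images, batch_size=8):
    # images: (N, 3, H, W) tensor, already normalized, H and W divisible by 14
    descs = []
    for i in range(0, images.shape[0], batch_size):
        batch = images[i:i + batch_size].cuda()
        tokens = model.forward_features(batch)['x_norm_patchtokens']  # (B, num_patches, 1536)
        descs.append(tokens.cpu())
    return torch.cat(descs, dim=0)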

Another question: instead of training DINOv2 on the benchmark's training split (e.g. Pitts30K/images/train) and then testing on the benchmark's test split (e.g. Pitts30K/images/test), does the AnyLoc project just load the pretrained dinov2_vitg14 model to extract patch descriptors?

I would greatly appreciate it if you could reply

Recall reproduce

Hi, thanks for your amazing work; the performance is absolutely impressive. Just one question: I am trying to reproduce the recall results, but I can't find the code for it. Any plan to release this part of the code? Your reply would be much appreciated.
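
For reference, recall@N for retrieval is commonly computed along these lines (a sketch, not the repository's evaluation code; it assumes L2-normalized float32 descriptors and known ground-truth positives per query):

import numpy as np
import faiss

def recall_at_n(db_descs, q_descs, positives_per_query, n_values=(1, 5, 10)):
    # db_descs: (num_db, D), q_descs: (num_q, D); positives_per_query[i] lists the
    # database indices that count as correct matches for query i.
    index = faiss.IndexFlatIP(db_descs.shape[1])      # inner product == cosine on unit vectors
    index.add(db_descs)
    _, preds = index.search(q_descs, max(n_values))   # ranked database indices per query
    recalls = {}
    for n in n_values:
        hits = sum(len(np.intersect1d(preds[i, :n], positives_per_query[i])) > 0
                   for i in range(len(q_descs)))
        recalls[n] = 100.0 * hits / len(q_descs)
    return recalls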

Rotation invariant features

Good day! At first I want to say thank you for your impressive results.

I am currently working with a dataset of UAV images and I am wondering: have you investigated the case where the images are randomly rotated and you still want to find your location?

Recall reproduce on Baidu Mall dataset

Thanks for sharing this great work!
While trying to reproduce the recall rates for the Baidu Mall dataset, I observed a drop in performance and couldn't match the recall rates reported in the paper:
top@1 = 43.1% (75.2%)
top@5 = 62.3% (87.6%)
The recall rates reported in the paper for AnyLoc-VLAD-DINOv2 are in parentheses. I am using the dino_v2_vlad.py script from the scripts directory with 32 clusters, descriptor layer 31, the dinov2_vitg14 architecture, the key facet, and hard assignment. I did get satisfactory results on the St. Lucia dataset with the same config, though.

I also tried using the vocabulary from the cache directory and loading the VLAD cluster centers as given in the demo folder, as well as the indoor vocabulary for dinov2_vitg14 with 32 clusters, but the performance on the Baidu Mall dataset decreased further. I am not sure what is causing this drop in performance; any ideas on how to fix it?

Question about SAM

Amazing job!

I noticed that in addition to DINO, you also tried other foundation models such as SAM and CLIP, and SAM's model selection is reflected in the following lines in the documentation:

model_types=("vit_b")

out_layer_numbers=(12)

num_clusters=(128)

I'm curious what the considerations behind these choices are. Why isn't the number of cluster centers set to 32 as for DINOv2? And have you considered using a larger SAM, like sam-vit-L?

Looking forward to your reply.

About GT.npy of the datasets in the paper

Hello @TheProjectsGuy, thank you very much for your great work!
At present, the GT.npy files of the datasets in the repo are not very clear to me.
Could you provide GT.npy files for all datasets in the format of the Gardens Point dataset, or provide scripts for the other datasets that generate a GT.npy similar to Gardens Point? (Such a GT.npy should reflect the relationship between the query and the refs, i.e. one query corresponds to several refs.)

Hope to receive your reply soon :)
Thanks & Best Regards!
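
For illustration only (the exact format the repo expects may differ), a query-to-reference mapping like the one described above can be stored and read with numpy roughly as follows:

import numpy as np

# Hypothetical example: gt[i] lists the reference indices that are correct matches for query i
gt = [np.array([12, 13, 14]),   # query 0 matches refs 12-14
      np.array([57]),           # query 1 matches ref 57
      np.array([98, 99])]       # query 2 matches refs 98-99

gt_arr = np.empty(len(gt), dtype=object)
gt_arr[:] = gt
np.save('GT.npy', gt_arr)

loaded = np.load('GT.npy', allow_pickle=True)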

AttributeError: 'DinoV2ExtractFeatures' object has no attribute 'fh_handle'

With custom data, running python ./demo/anyloc_vlad_generate.py raises:

Exception ignored in: <function DinoV2ExtractFeatures.__del__ at 0x7f8ce6d56b80>
Traceback (most recent call last):
  File "./AnyLoc/demo/utilities.py", line 101, in __del__
    self.fh_handle.remove()
AttributeError: 'DinoV2ExtractFeatures' object has no attribute 'fh_handle'
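
A defensive fix (a sketch of one possible workaround, not necessarily how the maintainers resolved it) is to guard the handle removal in DinoV2ExtractFeatures.__del__, since fh_handle is never created when __init__ fails early:

def __del__(self):
    # Only remove the handle if it was actually created
    if hasattr(self, "fh_handle"):
        self.fh_handle.remove()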

Distance used for query (image retrieval)

Thanks for sharing your work! The paper is very interesting and this repo looks extremely good and detailed.

I have a question that I couldn't find the answer in the paper.

I see there are several option for an image descriptor:

  • DINOv2 CLS token
  • GeM pooling over DINOv2 ViT features
  • VLAD over DINOv2 ViT features

My question is:
For each of these descriptors, what distance did you use for image retrieval (to fetch the most relevant image in the dataset)?
Cosine distance? L2 distance?
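
For what it's worth, with L2-normalized global descriptors, cosine similarity and L2 distance produce the same ranking; a minimal retrieval sketch (assumed variable names, not the repository's code):

import torch
import torch.nn.functional as F

# db_descs: (num_db, D) and q_descs: (num_q, D) global descriptors, assumed precomputed
db = F.normalize(db_descs, dim=-1)
q = F.normalize(q_descs, dim=-1)
sims = q @ db.T                   # cosine similarity; on unit vectors this matches the L2 ranking
top1 = sims.argmax(dim=1)         # most similar database image per query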

Missing datasets

The public release link does not contain the datasets. Could you please share the dataset you used?

Best,
Robert

Question about VP_AIR Dataset and Table 4 results.

Thank you for sharing your great work.
I have a question about the vpair_gt.npy file and the experimental results (Table 4).

< Question 1 >
I want to use the dino_v2_vlad.py code to check the recall results for the VPAir dataset, but the VPAir dataset does not provide the vpair_gt.npy file.
How can I get this file?

./VPAir
├── [ 677]  camera_calibration.yaml
├── [420K]  distractors [10000 entries exceeds filelimit, not opening dir]
├── [4.0K]  distractors_temp
├── [ 321]  License.txt
├── [177K]  poses.csv
├── [ 72K]  queries [2706 entries exceeds filelimit, not opening dir]
├── [160K]  reference_views [2706 entries exceeds filelimit, not opening dir]
├── [ 96K]  reference_views_npy [2706 entries exceeds filelimit, not opening dir]
└── [ 82K]  vpair_gt.npy

< Question 2 >
In Table 4, recall results computed using DINOv2 are presented. Since the VLAD layer is not used there, I wonder how the recall was obtained from the features produced by the DINO feature extractor.
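
For context, one standard way to turn patch features into a single global descriptor without VLAD is GeM pooling over the patch tokens; a minimal sketch (an illustration, not necessarily the exact procedure behind Table 4):

import torch

def gem_pool(patch_tokens, p=3, eps=1e-6):
    # patch_tokens: (N, num_patches, D) DINOv2 patch features
    # Generalized-mean pooling over the patch dimension -> (N, D) global descriptors
    return patch_tokens.clamp(min=eps).pow(p).mean(dim=1).pow(1.0 / p)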

How to reproduce Fig.3 Qualitative result about similarity map?

Hello. Thank you for sharing your great work!
I've been trying to visualize a similarity map like the one in Fig. 3 of the AnyLoc paper for a few days now, but I have not been able to get the right result.
The code below is what I am working with: I perform PCA on the x_norm_patchtokens obtained from DINO and generate a similarity map from the resulting feature map. However, the result is as shown below.

AnyLoc's repository doesn't seem to include the code that generates this similarity map. Could you provide the code to reproduce the visualization result in Fig. 3, or tell me how to visualize it?
Thank you.

Sample image

dog_2

Feature map after PCA, Interpolation

pca_resized_dog_2

Similarity map

dog_2_similarity_map

import torch
import torch.nn.functional as F
import torchvision.transforms as T
import os
import cv2
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image

patch_h = 16
patch_w = 16
feat_dim = 1536 # vitg14

transform = T.Compose([
    T.Resize((patch_h * 14, patch_w * 14)),
    T.CenterCrop((patch_h * 14, patch_w * 14)),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

dinov2_vitg14 = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitg14')

print(dinov2_vitg14)

features = torch.zeros(1, patch_h * patch_w, feat_dim)
imgs_tensor = torch.zeros(1, 3, patch_h * 14, patch_w * 14)
for i in range(1):
    img_path = "/root/workspace/test_debug/dog_2.jpg"
    img = Image.open(img_path).convert('RGB')
    imgs_tensor[i] = transform(img)[:3]
with torch.no_grad():
    # features_dict = dinov2_vits14.forward_features(imgs_tensor) # torch.Size([1, 256, 384])
    features_dict = dinov2_vitg14.forward_features(imgs_tensor) # x_norm_patchtokens: torch.Size([1, 256, 1536]) for vitg14
    features = features_dict['x_norm_patchtokens']

from sklearn.decomposition import PCA

features = features.reshape(1 * patch_h * patch_w, feat_dim)

pca = PCA(n_components=3)
pca.fit(features)
pca_features = pca.transform(features).reshape(patch_h, patch_w, 3)  # (256, 1536) -> (256, 3) -> (16, 16, 3)
print(f"pca_feature_shape : {pca_features.shape}")

# (16, 16, 3) -> (1, 3, 16, 16) so F.interpolate receives a channel-first batch
pca_features_tensor = torch.tensor(pca_features).permute(2, 0, 1).unsqueeze(0)
print(f"pca_features_tensor.shape : {pca_features_tensor.shape}")
pca_features_resized = F.interpolate(pca_features_tensor, size=(576,1024), mode='bilinear', align_corners=True)
pca_features_resized = pca_features_resized.squeeze(0).permute(1, 2, 0).numpy()

pca_features_resized = (pca_features_resized - pca_features_resized.min()) / (pca_features_resized.max() - pca_features_resized.min())

print(f"pca_features_resized shape ?? : {pca_features_resized.shape}")
pca_image = (pca_features_resized * 255).astype(np.uint8)
pca_image = Image.fromarray(pca_image)
pca_image.save('/root/workspace/result/pca_resized_dog_2.png')

def compute_similarity_map(feature_map, ref_point):
    # Similarity of each pixel to the reference pixel, averaged over the 3 PCA channels
    ref_value = feature_map[ref_point]                         # (3,)
    diff_map = np.abs(feature_map - ref_value).mean(axis=-1)   # (H, W)
    similarity_map = 1 - (diff_map / diff_map.max())
    return similarity_map

ref_point = (280, 800)
similarity_map = compute_similarity_map(pca_features_resized, ref_point)
output_dir = "/root/workspace/aerial_pr/dinov2_mixvpr_template/test_debug/result/"

plt.imshow(similarity_map, cmap='hot')
plt.colorbar()
plt.axis('off')
similarity_map_path = os.path.join(output_dir, 'dog_2_similarity_map.png')
plt.savefig(similarity_map_path, bbox_inches='tight', pad_inches=0)
plt.close()
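
For comparison, a similarity map can also be computed directly on the raw patch tokens with cosine similarity, before any PCA; the sketch below reuses features, patch_h, patch_w and output_dir from the snippet above and is only one possible approach, not the authors' code:

tokens = F.normalize(features.float(), dim=-1)        # (256, 1536) unit-norm patch tokens
ref_idx = 8 * patch_w + 8                             # hypothetical reference patch (row 8, col 8)
sims = tokens @ tokens[ref_idx]                       # cosine similarity to the reference patch
sim_grid = sims.reshape(1, 1, patch_h, patch_w)       # back onto the patch grid
sim_up = F.interpolate(sim_grid, size=(576, 1024), mode='bilinear', align_corners=True)

plt.imshow(sim_up.squeeze().numpy(), cmap='hot')
plt.axis('off')
plt.savefig(os.path.join(output_dir, 'dog_2_token_similarity_map.png'),
            bbox_inches='tight', pad_inches=0)
plt.close()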

Question about Nardo-Air datasets

Thanks for sharing your great work. This isn't a question about AnyLoc itself, but I'd like to know how you were able to access the Nardo-Air dataset you used for testing.
I couldn't find an access link when I searched for it; is there any chance you could shed some light on that?

Thanks in advance.

Conda environment installation error

Hi,
I am installing the environment via setup_conda.sh, and when it reaches the command conda install -y -c pytorch faiss-gpu==1.7.2, the following error is reported:

Collecting package metadata (current_repodata.json): done                       
Solving environment: failed with initial frozen solve. Retrying with flexible solve.                                                                            
Collecting package metadata (repodata.json): done                               
Solving environment: failed with initial frozen solve. Retrying with flexible solve.                                                                            
Solving environment: \                                                          
Found conflicts! Looking for incompatible packages.                             
This can take several minutes.  Press CTRL-C to abort.                          
failed                                                                          
                                                                                
UnsatisfiableError: The following specifications were found                     
to be incompatible with the existing python installation in your environment:   
                                                                                
Specifications:                                                                 
                                                                                
  - faiss-gpu==1.7.2 -> python[version='>=3.6,<3.7.0a0|>=3.8,<3.9.0a0|>=3.7,<3.8.0a0']                                                                          
                      
Your python: python=3.9

If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.

The following specifications were found to be incompatible with your system:

  - feature:/linux-64::__glibc==2.35=0
  - faiss-gpu==1.7.2 -> libgcc-ng[version='>=9.3.0'] -> __glibc[version='>=2.17']
  - python=3.9 -> libgcc-ng[version='>=11.2.0'] -> __glibc[version='>=2.17']

Your installed version is: 2.35

When I instead install via conda install -c conda-forge faiss-gpu=1.7.2, the above error does not occur, but it installs a second CUDA toolkit in my environment, as follows:

The following packages will be downloaded:

    package             | build                      | size     | channel
    --------------------|----------------------------|----------|------------
    cudatoolkit-11.8.0  | h4ba93d1_12                | 682.8 MB | conda-forge
    faiss-1.7.2         | py39cuda112h460e57a_4_cuda | 1.2 MB   | conda-forge
    faiss-gpu-1.7.2     | h788eb59_4                 | 15 KB    | conda-forge
    libfaiss-1.7.2      | cuda112hb18a002_4_cuda     | 51.4 MB  | conda-forge
    libfaiss-avx2-1.7.2 | cuda112h1234567_4_cuda     | 51.4 MB  | conda-forge

Here's part of my conda list:

cuda-cudart               11.7.99                       0    nvidia
cuda-cupti                11.7.101                      0    nvidia
cuda-libraries            11.7.1                        0    nvidia
cuda-nvrtc                11.7.99                       0    nvidia
cuda-nvtx                 11.7.91                       0    nvidia
cuda-runtime              11.7.1                        0    nvidia
cudatoolkit               11.8.0              h4ba93d1_12    conda-forge

How do I install packages so that my environment doesn't conflict?

Windows

Pretty amazing work! Will Windows be supported in the future?

Downloading cache.zip

I am currently facing an issue downloading the cache.zip file locally, which appears to be crucial for the remaining code to function properly. I eagerly await your response.

Thank you,
Boni

Question about the "Anyloc vocabularies"

Thank you for sharing your great work.

I have some questions about how to get the "AnyLoc vocabularies", which are different from the "VLAD vocabularies".
According to the issue linked below, the vocabulary produced by the AnyLoc method is said to be different from the VLAD vocabulary.
(#22)

I would like to reproduce the results of the paper using the vocabulary generated by the AnyLoc method (I understood this to be the cluster centers corresponding to each domain) together with AnyLoc-VLAD-DINOv2.

To sum up, my questions are as below.

  1. How can I get the vocabularies made by the "AnyLoc method"?
     I'm wondering whether I should download the vocabularies that have already been created (e.g. from the provided links), or whether I should do the PCA projection of GeM-pooled global descriptors as mentioned in the paper and build the vocabulary for each domain myself. If I need to create a vocabulary, can you tell me how?
  2. In Table 4 of the paper, recall values are reported for each dataset. For AnyLoc-VLAD-DINOv2, I am curious which vocabulary was used to obtain each of these results; in particular, I am interested in the results for the datasets in the aerial domain, since I would like to reproduce the results in the paper.
     Additionally, after creating the clusters, can I use anyloc_vlad_generate.py to get the AnyLoc-VLAD-DINOv2 results?

Thank you.

demo for image retrieval

Hi, thanks for your great work. Would you release a demo for image retrieval? I'm new to this area, and when reading your demos I couldn't see how the cluster centers would be used for image retrieval.

Releasing the model on torch.hub?

Are there any plans to release the trained AnyLoc model on torch.hub? It is quite simple to do and lets people use your model with two lines of code, allowing more people to use your model and helping to spread your work!
For example, I did this for CosPlace, and the trained model can be downloaded automatically from anywhere, without cloning the repo, just like this:

import torch
model = torch.hub.load("gmberton/cosplace", "get_trained_model", backbone="ResNet50", fc_output_dim=2048)

I'd be happy to help if needed :-)

Minor issues for the next release

Listing minor proposed changes/discovered issues for the next release

In README.md

  • HuggingFace Paper badge HF-Paper points to the wrong URL. Use the correct URL.
  • Dataset Setup > ... Download (and unzip) the datasets from here into this folder ...: From where? Add public release link (already there in the badge). Use the modified link.
  • Change the website badge to anyloc.github.io (no left label) from the present badge.
  • Issue tracker shield

Repository

File is not a zip file

Hey everyone,

I am trying to run
python ./anyloc_vlad_generate.py --in-dir <path_to_images> --imgs-ext jpg --out-dir <path_to_output> --domain urban
locally, and I am running into an issue when downloading the cache.zip.

... raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

And this cache seems to be important to use the rest of the code.
Have you seen this problem before?
Thanks!

vlad cluster center for custom dataset

Hi @TheProjectsGuy, thank you for such a great project.
I can run the anyloc_vlad_generate.py file, but now I want to extract features with DINOv2 + VLAD on my custom dataset, and I am struggling with creating the VLAD cluster centers. Can you guide me on how to create the cluster centers?
Thanks!
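
For reference, VLAD cluster centers are typically obtained by running k-means over patch descriptors sampled from the map/database images; a sketch using faiss (assumed shapes and file name, not the repository's exact pipeline):

import numpy as np
import faiss

# descs: (num_patches_total, D) float32 patch descriptors aggregated from the custom database images
num_clusters = 32                                    # the paper uses 32 clusters for DINOv2 VLAD
kmeans = faiss.Kmeans(d=descs.shape[1], k=num_clusters, niter=100, verbose=True)
kmeans.train(descs)
centers = kmeans.centroids                           # (num_clusters, D) cluster centers for VLAD
np.save('c_centers.npy', centers)                    # hypothetical output file name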

Nordland dataset

Hi, I am unable to use download_nordland.py since the download link in that Python file, which points to the host of the Nordland dataset, is not working. Is the dataset hosted by AnyLoc available at a different link?

Error:

raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='cloudstor.aarnet.edu.au', port=443): Max retries exceeded with url: /plus/s/8L7loyTZjK0FsWT/download?path=%2F&files=summer.tar.gz (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x0000021D03DC58B0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
