
dinov2-retrieval

A CLI program for image retrieval using DINOv2.

Some random results (left: query image; others: images retrieved from Caltech 256):

How to install

pip install dinov2_retrieval

Tested on:

  • macOS 13.4
  • Windows 11
  • Ubuntu 20.04

How to use

All available options:

$ dinov2_retrieval -h
usage: dinov2_retrieval [-h] [-s {small,base,large,largest}] [-p MODEL_PATH] [-o OUTPUT_ROOT] -q QUERY -d DATABASE [-n NUM] [--size SIZE]
                        [-m {0,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100}] [--disable-cache] [-v]

optional arguments:
  -h, --help            show this help message and exit
  -s {small,base,large,largest}, --model-size {small,base,large,largest}
                        DinoV2 model type
  -p MODEL_PATH, --model-path MODEL_PATH
                        path to dinov2 model, useful when github is unavailable
  -o OUTPUT_ROOT, --output-root OUTPUT_ROOT
                        root folder to save output results
  -q QUERY, --query QUERY
                        path to a query image file or image folder
  -d DATABASE, --database DATABASE
                        path to the database image file or image folder
  -n NUM, --num NUM     How many images to show in retrieval results
  --size SIZE           image output size
  -m {0,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100}, --margin {0,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100}
                        margin size (in pixel) between concatenated images
  --disable-cache       don't cache database features, will extract features each time, quite time-consuming for large database
  -v, --verbose         show detailed logs

Generally, you only need to set -q (or --query) and -d (or --database) to run this program:

dinov2_retrieval -q /path/to/query/image -d /path/to/database

NOTE: both can be a path to a single image or to a folder of images.

The results will be saved to the output folder (configurable with -o/--output-root).
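For example, to retrieve the top 5 matches for each query image using the base model and write the results to a custom folder (the paths below are placeholders):

dinov2_retrieval -q /path/to/queries -d /path/to/database -n 5 -s base -o results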

Debug info

Add -v or --verbose to show detailed logs.

Change model size

DINOv2 provides four sizes of pretrained models. Use --model-size with one of small, base, large, or largest to choose which model to use (see the sketch below for how these names might map to the published checkpoints).
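For context, retrieval with DINOv2 boils down to embedding every image with the backbone and ranking the database by cosine similarity against the query. Below is a minimal Python sketch of that idea using the public torch.hub checkpoints. The mapping from size names to checkpoint names (small → dinov2_vits14, etc.), the preprocessing, and the file paths are assumptions for illustration, not necessarily this tool's internals:

import torch
import torchvision.transforms as T
from PIL import Image

# Assumed mapping of --model-size names to the published torch.hub checkpoints
HUB_NAMES = {
    "small": "dinov2_vits14",
    "base": "dinov2_vitb14",
    "large": "dinov2_vitl14",
    "largest": "dinov2_vitg14",
}

model = torch.hub.load("facebookresearch/dinov2", HUB_NAMES["small"]).eval()

# Standard ImageNet preprocessing; 224 is divisible by the 14-pixel patch size
preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

def embed(path):
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        feat = model(img)  # (1, D) image-level embedding
    return torch.nn.functional.normalize(feat, dim=-1)

query = embed("query.jpg")  # placeholder paths
database = torch.cat([embed(p) for p in ["a.jpg", "b.jpg", "c.jpg"]])
scores = database @ query.T  # cosine similarity of unit vectors
print(scores.squeeze(1).argsort(descending=True))  # best matches first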

Use cached model

If your connection to GitHub (where the DINOv2 models are hosted) is unstable, you can set --model-path to the folder containing the model that was cached during the first run.
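For example (the model folder below is a placeholder; use wherever the checkpoint was cached on your machine, typically under your torch.hub cache directory):

dinov2_retrieval -q /path/to/query -d /path/to/database -p /path/to/cached/model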

TODO

  • [ ] Optimize the visualization results, e.g., a better background color
  • [ ] Add more technical details

License

MIT


dinov2-retrieval's Issues

size mismatch for pos_embed

Hello, I have a question about DINOv2. Could you please help me? I instantiated a vit_small ViT model and tried to load the pretrained weights using the load_pretrained_weights function from utils. Here's the code I wrote:
self.vit_model = vits.__dict__['vit_small']()
load_pretrained_weights(self.vit_model, 'https://dl.fbaipublicfiles.com/dinov2/dinov2_vits14/dinov2_vits14_pretrain.pth', None)
However, I encountered the following error:
Traceback (most recent call last):
  File "/data/PycharmProjects/train.py", line 124, in <module>
    model = model(aff_classes=args.num_classes)
  File "/data/PycharmProjects/models/locate.py", line 89, in __init__
    load_pretrained_weights(self.vit_model, pretrained_url, None)
  File "/data/PycharmProjects/models/dinov2/dinov2/utils/utils.py", line 32, in load_pretrained_weights
    msg = model.load_state_dict(state_dict, strict=False)
  File "/home/ustc/anaconda3/envs/locate/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1605, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DinoVisionTransformer:
size mismatch for pos_embed: copying a param with shape torch.Size([1, 1370, 384]) from checkpoint, the shape in current model is torch.Size([1, 257, 384]).

Could you please help me understand what might be causing this issue? Thank you for your assistance.
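For reference, the two sequence lengths in the error are consistent with a ViT using 14×14 patches at different input resolutions: the released DINOv2 checkpoints store a position embedding sized for 518×518 inputs (a 37×37 patch grid plus one class token = 1370 positions), while a model instantiated for 224×224 inputs has a 16×16 grid plus one class token = 257 positions. A quick sanity check of that arithmetic:

def num_tokens(img_size, patch_size=14):
    # patch-grid positions plus one class token
    return (img_size // patch_size) ** 2 + 1

assert num_tokens(518) == 1370  # matches the checkpoint's pos_embed length
assert num_tokens(224) == 257   # matches the instantiated model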
