ducha-aiki / affnet

Code and weights for local feature affine shape estimation paper "Repeatability Is Not Enough: Learning Discriminative Affine Regions via Discriminability"

License: MIT License

deep-learning local-features convolutional-neural-networks convolutional-networks pytorch computer-vision hessian image-retrieval image-matching affine-shape-estimator

affnet's Introduction

AffNet model implementation

CNN-based affine shape estimator.

AffNet model implementation in PyTorch for the ECCV 2018 paper "Repeatability Is Not Enough: Learning Discriminative Affine Regions via Discriminability"

Update: PyTorch 1.4 version

The master branch is the one that produced the ECCV-paper results; it requires Python 2.7 and PyTorch 0.4.0.

Here is the branch that successfully runs on Python 3.7 and PyTorch 1.4.0.

AffNet generates up to twice as many correspondences as the Baumberg iteration (comparison images: HesAff vs. HesAffNet).

Retrieval on Oxford5k, mAP

| Detector + Descriptor | BoW | BoW + SV | BoW + SV + QE | HQE + MA |
| --- | --- | --- | --- | --- |
| HesAff + RootSIFT | 55.1 | 63.0 | 78.4 | 88.0 |
| HesAff + HardNet++ | 60.8 | 69.6 | 84.5 | 88.3 |
| HesAffNet + HardNet++ | 68.3 | 77.8 | 89.0 | 89.5 |

Datasets and Training

To download the datasets and start training AffNet:

git clone https://github.com/ducha-aiki/affnet
./run_me.sh

Paper figures reproduction

To reproduce Figure 1 in the paper, run the notebook

To reproduce Figures 2-3 in the paper, run the notebooks here


Pre-trained models

Pre-trained models can be found in the folder pretrained: AffNet.pth

Usage example

We provide two examples of how to estimate affine shape with AffNet. First, on a patch-column file in HPatches format, i.e. a grayscale image with w = patchSize and h = nPatches * patchSize

cd examples/just_shape
python detect_affine_shape.py imgs/face.png out.txt

The output file contains one upright affine frame per patch: a11 0 a21 a22
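For clarity, here is a minimal sketch (not part of the repository) of how the patch-column input and the out.txt shape file relate; the patch size of 32 and the use of OpenCV/NumPy for loading are assumptions.

import cv2
import numpy as np

patch_size = 32                                            # assumed; must equal the patch-column width
col = cv2.imread('imgs/face.png', cv2.IMREAD_GRAYSCALE)    # shape: (nPatches * patchSize, patchSize)
patches = col.reshape(-1, patch_size, patch_size)          # one patch per vertical block

A = np.loadtxt('out.txt').reshape(-1, 2, 2)                # per patch: [[a11, 0], [a21, a22]]
assert len(A) == len(patches)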

Second, AffNet inside a PyTorch implementation of Hessian-Affine.

2000 is the number of regions to detect.

cd examples/hesaffnet
python hesaffnet.py img/cat.png ells-affnet.txt 2000
python hesaffBaum.py img/cat.png ells-Baumberg.txt 2000

The output file ells-affnet.txt is in the Oxford affine region format:

1.0
128
x y a b c 
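A hedged sketch (not the repository's code) of reading this output: the first line is a header value (1.0 here), the second line is the number of regions, and each following line holds x y a b c, where the ellipse is commonly defined by a(u-x)^2 + 2b(u-x)(v-y) + c(v-y)^2 = 1.

import numpy as np

with open('ells-affnet.txt') as f:
    tokens = f.read().split()
n_regions = int(float(tokens[1]))                              # second header value: number of regions
ells = np.array(tokens[2:], dtype=np.float32).reshape(-1, 5)   # columns: x y a b c
assert ells.shape[0] == n_regions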

WBS example

An example is in the [notebook](examples/hesaffnet/WBS demo.ipynb)

Citation

Please cite us if you use this code:

@inproceedings{AffNet2017,
    author = {Dmytro Mishkin and Filip Radenovic and Jiri Matas},
    title = "{Repeatability Is Not Enough: Learning Discriminative Affine Regions via Discriminability}",
    year = 2018,
    month = sep,
    booktitle = {Proceedings of ECCV}
}

affnet's People

Contributors

ducha-aiki, jukindle


affnet's Issues

About the AFFNET

Hi, I am a little confused about AffNet. Is the output of AffNet the predicted affine transformation? Looking through the code, I found that the input to AffNet is only one image. How can AffNet predict the affine transformation from a single input image?

How to extract patches in Oxford Dataset?

Hi, I cannot understand the details of extracting patches, and I have some questions:

  1. How many patches did you extract from each image in the Oxford5k retrieval experiment?
  2. How are patches extracted in the function extract_patches_from_pyr? Does it have something to do with "Spatial Transformer Networks"? I ask because you use torch.nn.functional.affine_grid and torch.nn.functional.grid_sample in LAF.py (see the sketch below).
  3. What are the parameters scale_pyramid and pyr_inv_idxs in extract_patches_from_pyramid_with_inv_index?
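A minimal sketch of the mechanism referred to in question 2, assuming a generic spatial-transformer-style warp (this is not the repository's exact code; the function name and shapes are illustrative):

import torch
import torch.nn.functional as F

def extract_patch(img, theta, PS=32):
    # img: 1 x C x H x W tensor; theta: 1 x 2 x 3 affine matrix in normalized coordinates.
    grid = F.affine_grid(theta, torch.Size((1, img.size(1), PS, PS)), align_corners=False)
    return F.grid_sample(img, grid, align_corners=False)

# Identity warp: resamples the whole image into a 32 x 32 patch.
img = torch.rand(1, 1, 256, 256)
theta = torch.tensor([[[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]]])
patch = extract_patch(img, theta)   # 1 x 1 x 32 x 32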

About train_OriNet_test_on_graffity.py

I have a question about train_OriNet_test_on_graffity.py, specifically about the test function.
I think that for "input_img_fname1" and "input_img_fname2", the patch around the same 3D point is extracted as a 32x32 image and fed to the model (AffNet, OriNet).
However, I don't know how the same 3D points are identified and extracted.
Perhaps this is done in line 116 and line 118 of https://github.com/ducha-aiki/affnet/blob/master/SparseImgRepresenter.py; can you explain the detailed mechanism?

Aligning 2 images using affnet

How can I use AffNet and HardNet++ to align two images,

similar to this:

import cv2
import numpy as np

MAX_FEATURES = 500          # assumed value; not defined in the original snippet
GOOD_MATCH_PERCENT = 0.15   # assumed value; not defined in the original snippet

def alignImages(im1, im2):
  # Convert images to grayscale
  im1Gray = cv2.cvtColor(im1, cv2.COLOR_BGR2GRAY)
  im2Gray = cv2.cvtColor(im2, cv2.COLOR_BGR2GRAY)

  # Detect ORB features and compute descriptors.
  orb = cv2.ORB_create(MAX_FEATURES)
  keypoints1, descriptors1 = orb.detectAndCompute(im1Gray, None)
  keypoints2, descriptors2 = orb.detectAndCompute(im2Gray, None)

  # Match features.
  matcher = cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMING)
  matches = list(matcher.match(descriptors1, descriptors2, None))

  # Sort matches by score and keep only the best ones.
  matches.sort(key=lambda x: x.distance, reverse=False)
  numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
  matches = matches[:numGoodMatches]

  # Draw top matches
  imMatches = cv2.drawMatches(im1, keypoints1, im2, keypoints2, matches, None)
  cv2.imwrite("matches.jpg", imMatches)

  # Extract locations of good matches
  points1 = np.zeros((len(matches), 2), dtype=np.float32)
  points2 = np.zeros((len(matches), 2), dtype=np.float32)
  for i, match in enumerate(matches):
    points1[i, :] = keypoints1[match.queryIdx].pt
    points2[i, :] = keypoints2[match.trainIdx].pt

  # Find homography and warp im1 onto im2
  h, mask = cv2.findHomography(points1, points2, cv2.RANSAC)
  height, width, channels = im2.shape
  im1Reg = cv2.warpPerspective(im1, h, (width, height))

  return im1Reg, h

LAFs2ell

There is no description of the functions LAFs2ell, LAFs2ellT, and ells2LAFsT. Can you provide a reference on the relationship between a LAF and an ell? It is difficult to understand the relation from the source code alone. Thank you in advance! @ducha-aiki
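A hedged sketch of the LAF-to-ellipse relation as commonly defined (not a verbatim copy of LAFs2ell): the 2x2 part A of the LAF maps the unit circle to the region, so the ellipse matrix is E = (A A^T)^(-1) = [[a, b], [b, c]], giving the Oxford-format row x y a b c with a(u-x)^2 + 2b(u-x)(v-y) + c(v-y)^2 = 1.

import numpy as np

def LAF_to_ell(LAF):
    # LAF: 2 x 3 array [A | (x; y)] -> [x, y, a, b, c]
    A, xy = LAF[:, :2], LAF[:, 2]
    E = np.linalg.inv(A @ A.T)
    return np.array([xy[0], xy[1], E[0, 0], E[0, 1], E[1, 1]])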

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation


parsed options:
{'dataroot': 'dataset/6Brown', 'log_dir': './logs', 'num_workers': 8, 'pin_memory': True, 'resume': '', 'start_epoch': 0, 'epochs': 20, 'batch_size': 1024, 'test_batch_size': 1024, 'n_pairs': 1000, 'n_test_pairs': 50000, 'lr': 0.005, 'wd': 0.0001, 'no_cuda': False, 'gpu_id': '0,1', 'expname': 'AffNetFast_lr005_10M_20ep_aswap', 'seed': 0, 'log_interval': 10, 'descriptor': 'HardNet', 'loss': 'HardNegC', 'arch': 'AffNetFast', 'cuda': True}

train_AffNet_test_on_graffity.py:249: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  var_image = torch.autograd.Variable(torch.from_numpy(img.astype(np.float32)), volatile = True)
train_AffNet_test_on_graffity.py:249: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  var_image = torch.autograd.Variable(torch.from_numpy(img.astype(np.float32)), volatile = True)
0.5302860736846924 detection multiscale
affnet_time 0.08162307739257812
pe_time 0.038355112075805664
0.12385892868041992 affine shape iters
0.23496747016906738 detection multiscale
affnet_time 0.02823162078857422
pe_time 0.04812979698181152
0.08057141304016113 affine shape iters
Test epoch -1
Test on graf1-6, 196 tentatives 10 true matches 0.051  inl.ratio
Now native ori
0.07808423042297363 detection multiscale
affnet_time 0.025239229202270508
pe_time 0.03278350830078125
0.06878876686096191 affine shape iters
0.07655787467956543 detection multiscale
affnet_time 0.025121450424194336
pe_time 0.0314483642578125
0.0674288272857666 affine shape iters
Test epoch -1
Test on ori graf1-6, 107 tentatives 9 true matches 0.084  inl.ratio
0it [00:00, ?it/s]Traceback (most recent call last):
  File "train_AffNet_test_on_graffity.py", line 416, in <module>
    main(train_loader, test_loader, model)
  File "train_AffNet_test_on_graffity.py", line 380, in main
    train(train_loader, model, optimizer1, epoch)
  File "train_AffNet_test_on_graffity.py", line 235, in train
    loss.backward()
  File "/root/anaconda3_py3.6_torch0.4.1/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/root/anaconda3_py3.6_torch0.4.1/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
0it [00:02, ?it/s]

out of memory when extracting patches from Oxford dataset

Hi,
I used your code from examples/hesaffnet/WBS demo.ipynb to extract patches and descriptors from the Oxford dataset; however, when I set USE_CUDA = True and num_features = 3000, I get the following error:

Traceback (most recent call last):
  File "/home/dyx/workspace/affnet/examples/hesaffnet/123.py", line 103, in <module>
    detector_time, descriptors_time = get_geometry_and_descriptors(img, detector, descriptor)
  File "/home/dyx/workspace/affnet/examples/hesaffnet/123.py", line 69, in get_geometry_and_descriptors
    patches = detector.extract_patches_from_pyr(LAFs, PS = 32)
  File "/home/dyx/workspace/affnet/examples/hesaffnet/SparseImgRepresenter.py", line 178, in extract_patches_from_pyr
    PS = PS)
  File "/home/dyx/workspace/affnet/examples/hesaffnet/LAF.py", line 218, in extract_patches_from_pyramid_with_inv_index
    patches[cur_lvl_idxs,:,:,:] = extract_patches(scale_pyramid[i][j], LAFs[cur_lvl_idxs, :,:], PS )
  File "/home/dyx/workspace/affnet/examples/hesaffnet/LAF.py", line 200, in extract_patches
    return torch.nn.functional.grid_sample(img.expand(grid.size(0), ch, h, w),  grid)  
  File "/home/dyx/anaconda2/lib/python2.7/site-packages/torch/nn/functional.py", line 995, in grid_sample
    return GridSampler.apply(input, grid)
  File "/home/dyx/anaconda2/lib/python2.7/site-packages/torch/nn/_functions/vision.py", line 27, in forward
    input = input.contiguous()
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1503966894950/work/torch/lib/THC/generic/THCStorage.cu:66

Process finished with exit code 1

When I set num_features = 1000 it runs fine. Could you tell me how to solve this?
Thanks a lot!

Does the HardNet part need to be retrained after the AffNet part is finished?

Thanks for your work! I find that in the training and testing code, the "hardnet" descriptor is fixed: its weights are loaded from "HardNet++.pth" and not changed during training. I wonder how the original "HardNet++.pth" was obtained and why the HardNet model does not need to be trained again.
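As a minimal sketch of what "the descriptor is fixed" means in practice (stand-in modules and illustrative hyperparameters, not the repository's exact training code): the descriptor is put in eval mode with gradients disabled, and only AffNet's parameters are passed to the optimizer.

import torch
import torch.nn as nn

affnet = nn.Sequential(nn.Conv2d(1, 3, 3))          # stand-in for the AffNet model
descriptor = nn.Sequential(nn.Conv2d(1, 128, 32))   # stand-in for the HardNet++ descriptor

# Freeze the descriptor: eval mode and no gradients, so only AffNet is trained.
descriptor.eval()
for p in descriptor.parameters():
    p.requires_grad = False

# The optimizer only sees AffNet's parameters (learning rate / weight decay illustrative).
optimizer = torch.optim.SGD(affnet.parameters(), lr=0.005, weight_decay=1e-4)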

Does my training process look OK?

Hi, thanks for the repo!

I want to train the network, so I just called run_me.sh without any changes (but with PyTorch 0.4.1).
However, the process is very slow (and the GPU load is very low), the losses are not decreasing, and the test results are getting worse.
So I would like to ask whether the training process looks OK.

Below are parts of the training and validation logs.

For epoch -1:

train_AffNet_test_on_graffity.py:250: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
var_image = torch.autograd.Variable(torch.from_numpy(img.astype(np.float32)), volatile = True)
train_AffNet_test_on_graffity.py:250: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
var_image = torch.autograd.Variable(torch.from_numpy(img.astype(np.float32)), volatile = True)
0.641245126724 detection multiscale
/media/iouiwc/0596f94c-b314-4162-80b4-79b3a602c9a2/iouiwc/github/affnet/SparseImgRepresenter.py:151: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
if (num_features > 0) and (num_survived.data[0] > num_features):
affnet_time 0.101545810699
pe_time 0.0597500801086
0.166574954987 affine shape iters
0.0878648757935 detection multiscale
/media/iouiwc/0596f94c-b314-4162-80b4-79b3a602c9a2/iouiwc/github/affnet/SparseImgRepresenter.py:151: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
if (num_features > 0) and (num_survived.data[0] > num_features):
affnet_time 0.0334780216217
pe_time 0.0607059001923
0.108034133911 affine shape iters
Test epoch -1
Test on graf1-6, 217 tentatives 11 true matches 0.050 inl.ratio
Now native ori
0.066300868988 detection multiscale
/media/iouiwc/0596f94c-b314-4162-80b4-79b3a602c9a2/iouiwc/github/affnet/SparseImgRepresenter.py:151: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
if (num_features > 0) and (num_survived.data[0] > num_features):
affnet_time 0.0342180728912
pe_time 0.0577509403229
0.11149096489 affine shape iters
0.101871013641 detection multiscale
/media/iouiwc/0596f94c-b314-4162-80b4-79b3a602c9a2/iouiwc/github/affnet/SparseImgRepresenter.py:151: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
if (num_features > 0) and (num_survived.data[0] > num_features):
affnet_time 0.0336909294128
pe_time 0.0553169250488
0.103847026825 affine shape iters
Test epoch -1
Test on ori graf1-6, 147 tentatives 10 true matches 0.068 inl.ratio

For epoch 0:

Train Epoch: 0 [9984000/10000000 (100%)] Loss: 0.9201, 1.5074,0.9073: : 9760it [32:02:26, 11.82s/it]
Train Epoch: 0 [9994240/10000000 (100%)] Loss: 0.9484, 1.5387,0.9369: : 9760it [32:02:38, 11.82s/it]
Train Epoch: 0 [9994240/10000000 (100%)] Loss: 0.9484, 1.5387,0.9369: : 9761it [32:02:38, 11.82s/it]
Train Epoch: 0 [9994240/10000000 (100%)] Loss: 0.9484, 1.5387,0.9369: : 9762it [32:02:49, 11.82s/it]
Train Epoch: 0 [9994240/10000000 (100%)] Loss: 0.9484, 1.5387,0.9369: : 9763it [32:03:01, 11.82s/it]
Train Epoch: 0 [9994240/10000000 (100%)] Loss: 0.9484, 1.5387,0.9369: : 9764it [32:03:12, 11.82s/it]
Train Epoch: 0 [9994240/10000000 (100%)] Loss: 0.9484, 1.5387,0.9369: : 9765it [32:03:24, 11.82s/it]
Train Epoch: 0 [9994240/10000000 (100%)] Loss: 0.9484, 1.5387,0.9369: : 9766it [32:03:32, 11.82s/it]
train_AffNet_test_on_graffity.py:250: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
var_image = torch.autograd.Variable(torch.from_numpy(img.astype(np.float32)), volatile = True)
train_AffNet_test_on_graffity.py:250: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
var_image = torch.autograd.Variable(torch.from_numpy(img.astype(np.float32)), volatile = True)
0.0655670166016 detection multiscale
/media/iouiwc/0596f94c-b314-4162-80b4-79b3a602c9a2/iouiwc/github/affnet/SparseImgRepresenter.py:151: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
if (num_features > 0) and (num_survived.data[0] > num_features):
affnet_time 0.0328919887543
pe_time 0.0553648471832
0.103418111801 affine shape iters
0.0645890235901 detection multiscale
/media/iouiwc/0596f94c-b314-4162-80b4-79b3a602c9a2/iouiwc/github/affnet/SparseImgRepresenter.py:151: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
if (num_features > 0) and (num_survived.data[0] > num_features):
affnet_time 0.0329079627991
pe_time 0.0524799823761
0.100947141647 affine shape iters
Test epoch 0
Test on graf1-6, 183 tentatives 13 true matches 0.071 inl.ratio
Now native ori
0.0709731578827 detection multiscale
/media/iouiwc/0596f94c-b314-4162-80b4-79b3a602c9a2/iouiwc/github/affnet/SparseImgRepresenter.py:151: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
if (num_features > 0) and (num_survived.data[0] > num_features):
affnet_time 0.033175945282
pe_time 0.0535531044006
0.103495836258 affine shape iters
0.100589036942 detection multiscale
/media/iouiwc/0596f94c-b314-4162-80b4-79b3a602c9a2/iouiwc/github/affnet/SparseImgRepresenter.py:151: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
if (num_features > 0) and (num_survived.data[0] > num_features):
affnet_time 0.0331048965454
pe_time 0.0523760318756
0.100074052811 affine shape iters
Test epoch 0
Test on ori graf1-6, 155 tentatives 9 true matches 0.058 inl.ratio

For epoch 1:
Train Epoch: 1 [9984000/10000000 (100%)] Loss: 0.9505, 2.0144,0.9437: : 9759it [33:31:50, 12.37s/it]
Train Epoch: 1 [9984000/10000000 (100%)] Loss: 0.9505, 2.0144,0.9437: : 9760it [33:32:02, 12.37s/it]
Train Epoch: 1 [9994240/10000000 (100%)] Loss: 0.9703, 1.9384,0.9606: : 9760it [33:32:14, 12.37s/it]
Train Epoch: 1 [9994240/10000000 (100%)] Loss: 0.9703, 1.9384,0.9606: : 9761it [33:32:14, 12.37s/it]
Train Epoch: 1 [9994240/10000000 (100%)] Loss: 0.9703, 1.9384,0.9606: : 9762it [33:32:26, 12.37s/it]
Train Epoch: 1 [9994240/10000000 (100%)] Loss: 0.9703, 1.9384,0.9606: : 9763it [33:32:39, 12.37s/it]
Train Epoch: 1 [9994240/10000000 (100%)] Loss: 0.9703, 1.9384,0.9606: : 9764it [33:32:51, 12.37s/it]
Train Epoch: 1 [9994240/10000000 (100%)] Loss: 0.9703, 1.9384,0.9606: : 9765it [33:33:03, 12.37s/it]
Train Epoch: 1 [9994240/10000000 (100%)] Loss: 0.9703, 1.9384,0.9606: : 9766it [33:33:11, 12.37s/it]
train_AffNet_test_on_graffity.py:250: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
var_image = torch.autograd.Variable(torch.from_numpy(img.astype(np.float32)), volatile = True)
train_AffNet_test_on_graffity.py:250: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
var_image = torch.autograd.Variable(torch.from_numpy(img.astype(np.float32)), volatile = True)
0.0826170444489 detection multiscale
/media/iouiwc/0596f94c-b314-4162-80b4-79b3a602c9a2/iouiwc/github/affnet/SparseImgRepresenter.py:151: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
if (num_features > 0) and (num_survived.data[0] > num_features):
affnet_time 0.0436849594116
pe_time 0.0609710216522
0.110808134079 affine shape iters
0.0830068588257 detection multiscale
/media/iouiwc/0596f94c-b314-4162-80b4-79b3a602c9a2/iouiwc/github/affnet/SparseImgRepresenter.py:151: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
if (num_features > 0) and (num_survived.data[0] > num_features):
affnet_time 0.0333650112152
pe_time 0.054986000061
0.103302001953 affine shape iters
Test epoch 1
Test on graf1-6, 165 tentatives 6 true matches 0.036 inl.ratio
Now native ori
0.066458940506 detection multiscale
/media/iouiwc/0596f94c-b314-4162-80b4-79b3a602c9a2/iouiwc/github/affnet/SparseImgRepresenter.py:151: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
if (num_features > 0) and (num_survived.data[0] > num_features):
affnet_time 0.0339682102203
pe_time 0.0546021461487
0.103893041611 affine shape iters
0.104158878326 detection multiscale
/media/iouiwc/0596f94c-b314-4162-80b4-79b3a602c9a2/iouiwc/github/affnet/SparseImgRepresenter.py:151: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
if (num_features > 0) and (num_survived.data[0] > num_features):
affnet_time 0.0339379310608
pe_time 0.0541059970856
0.104035139084 affine shape iters
/home/iouiwc/anaconda2/envs/pytorch/lib/python2.7/site-packages/matplotlib/pyplot.py:537: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (matplotlib.pyplot.figure) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam figure.max_open_warning).
max_open_warning, RuntimeWarning)
Test epoch 1
Test on ori graf1-6, 148 tentatives 9 true matches 0.060 inl.ratio

How to get the A matrix which satisfies the geometric constraint EP=-AEP?

I hope everything is well with you. I am a Ph.D. student currently researching image matching and pose estimation. Recently, I read your work "Repeatability Is Not Enough: Learning Affine Regions via Discriminability" and ran the code provided at the paper's link (github.com/ducha-aiki/affnet/tree/pytorch1-4_python3) multiple times, as well as reviewing related materials. However, I encountered some issues I could not solve, so I am writing to ask for your advice. It would be greatly appreciated if you could reply at your convenience.
Firstly, I ran the demo for image matching and, as I understand it, obtained a set of n affine transformation parameters (work_LAFs, an n x 2 x 3 classical [A, (x;y)] matrix). I think these parameters include the A matrix and the coordinates of the feature points, but I am not sure my understanding is correct, and I would like to know how to obtain the A matrix if it is not.
Secondly, we obtained the affine transformation parameters for the two views (work_LAFs1 and work_LAFs2), whose linear parts are 2x2 matrices. I am unsure how to obtain the so-called A matrix from these two matrices.
Thirdly, we obtained the A matrix and performed image matching. However, we found that its accuracy is unsatisfactory when compared to the ground truth. Therefore, I would like to know how to compare it with the ground truth and what accuracy a correctly obtained A matrix should reach.
Last but not least, according to this method, we can obtain the A matrix that satisfies the geometric constraint EP=-AEP, where E is the essential matrix and A is the local affine transformation matrix.
Thank you very much for your consideration
Best Wishes!
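For reference, a hedged sketch (illustrative notation, not the repository's API) of the usual way to obtain the local affine transformation between two views from matched LAFs: if [A1 | t1] and [A2 | t2] are the corresponding 2x3 local affine frames, the 2x2 map taking the neighbourhood of t1 in image 1 to the neighbourhood of t2 in image 2 is A = A2 * inv(A1).

import numpy as np

LAF1 = np.array([[12.0, 0.0, 100.0],
                 [ 3.0, 9.0, 150.0]])   # 2 x 3: [A1 | (x1; y1)] in view 1
LAF2 = np.array([[10.0, 2.0, 220.0],
                 [-1.0, 8.0, 140.0]])   # 2 x 3: [A2 | (x2; y2)] in view 2

A1, A2 = LAF1[:, :2], LAF2[:, :2]
A = A2 @ np.linalg.inv(A1)              # local affine transformation between the two views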

nmsed_resp

What's the shape of nmsed_resp in the function https://github.com/ducha-aiki/affnet/blob/f46f0dcee547fb571c1c2d20adcd5e85cd6317b0/HandCraftedModules.py#L260 ? In nmsed_resp, idxs = torch.topk(nmsed_resp, k = num_features, dim = 0), the shape of the input nmsed_resp should be batchsize*3*height*width, so is the shape of the output nmsed_resp batchsize*3*height*k? Is that right?
What is the role of this small piece of code? https://github.com/ducha-aiki/affnet/blob/f46f0dcee547fb571c1c2d20adcd5e85cd6317b0/HandCraftedModules.py#L278

Especially the line sc_y_x = F.conv2d(resp3d, self.grid, padding = 1) / (F.conv2d(resp3d, self.grid_ones, padding = 1) + 1e-8)

Originally posted by @yunyundong in #13 (comment)
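The quoted line computes a response-weighted local centroid (a soft refinement of the peak position). Below is a hedged 2D simplification of that idea with hand-built kernels; it is not the exact self.grid / self.grid_ones used in HandCraftedModules.py, which also covers the scale dimension.

import torch
import torch.nn.functional as F

resp = torch.rand(1, 1, 16, 16)                 # detector response map
offs = torch.tensor([-1.0, 0.0, 1.0])
ys = offs.view(3, 1).expand(3, 3)               # row (y) offsets of the 3x3 neighbourhood
xs = offs.view(1, 3).expand(3, 3)               # column (x) offsets
grid = torch.stack([ys, xs]).unsqueeze(1)       # 2 x 1 x 3 x 3 kernel of coordinate offsets
ones = torch.ones(1, 1, 3, 3)

num = F.conv2d(resp, grid, padding=1)           # 1 x 2 x H x W: sum of offset * response
den = F.conv2d(resp, ones, padding=1) + 1e-8    # 1 x 1 x H x W: sum of response
offset_yx = num / den                           # response-weighted mean (y, x) offset per pixel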

RuntimeError: [enforce fail at CPUAllocator.cpp:47]

Hello, I tried to run AffNet and got an error:
RuntimeError: [enforce fail at CPUAllocator.cpp:47] ((ptrdiff_t)nbytes) >= 0. alloc_cpu() seems to have been called with negative number: 18446744072009756672
At first I thought it was a memory problem, but my machine has 64 GB of RAM.
With two datasets it runs successfully, but with more than three this problem appears.
Can you tell me the reason? Thank you very much!

Unexpected patch extraction phenomenon

Hi, I noticed a strange phenomenon when generating randomly affine-transformed training patch pairs here.

The synthesized affine transformation is a composition of two matrices, rot_LAFs and TA.
As TA is rectified to be up-is-up, it keeps vertical lines in the images vertical. And the same rot_LAF is applied to the anchor image and the positive image, so both images are rotated by the same angle.
Therefore, after applying rot_LAFs and TA, vertical lines in the anchor image and the positive image should be oriented at the same angle.

But the observed phenomenon is: when rot_LAF and TA are applied sequentially, this holds; when they are applied jointly (as in your code), it does not.

Code for applying rot_LAF and TA sequentially:
def extract_random_LAF(data, max_rot = math.pi, max_tilt = 1.0, crop_size = 32):
    st = int((data.size(2) - crop_size)/2)
    fin = st + crop_size
    if type(max_rot) is float:
        rot_LAFs, inv_rotmat = get_random_rotation_LAFs(data, max_rot)
    else:
        rot_LAFs = max_rot
        inv_rotmat = None
    aff_LAFs, inv_TA = get_random_norm_affine_LAFs(data, max_tilt)
    # aff_LAFs[:,0:2,0:2] = torch.bmm(rot_LAFs[:,0:2,0:2], aff_LAFs[:,0:2,0:2])
    # pdb.set_trace()
    data_aff = extract_patches(data, aff_LAFs, PS = data.size(2))
    data_aff = extract_patches(data_aff, rot_LAFs, PS = data.size(2))
    data_affcrop = data_aff[:,:, st:fin, st:fin].contiguous()
    return data_affcrop, data_aff, rot_LAFs, inv_rotmat, inv_TA

Below are some examples; note that they were obtained in different runs, so the random angles and tilts differ.

Applying TA only (note TA is rectified to be up-is-up): [images data_a0, data_p0, data_a_aff0, data_p_aff0]

Applying rot_LAF only (note the same rot_LAFs is applied to both the anchor and the positive image): [images data_a0, data_p0, data_a_aff0, data_p_aff0]

Applying rot_LAF and TA sequentially: [images data_a0, data_p0, data_a_aff0, data_p_aff0]

Applying rot_LAF and TA jointly: [images data_a0, data_p0, data_a_aff0, data_p_aff0]

Why does the validation of the handcrafted Baumberg iteration run without orientation?

Hi,
in the "train_AffNet_test_on_graffity.py", the AffFastnet is trained and validated on the 1th-6th image of Oxford viewpoint dataset.
For a test image, the features/affshape/orientation/descriptor are extracted and computed by using the following function (line 254-259)

def get_geometry_and_descriptors(img, det, desc, do_ori = True):
    with torch.no_grad():
        LAFs, resp = det(img,do_ori = do_ori)
        patches = det.extract_patches_from_pyr(LAFs, PS = 32)
        descriptors = desc(patches)
    return LAFs, descriptors

Note that do_ori is set to True by default.

For the learned part, the method first detects features and uses the learned affine module for affine shape estimation, as shown in the test() function (validation after each epoch, lines 264-267):

    model.eval()
    detector = ScaleSpaceAffinePatchExtractor( mrSize = 5.192, num_features = 3000,
                                          border = 5, num_Baum_iters = 1, 
                                          AffNet = model)

and (lines 286-288):

LAFs1, descriptors1 = get_geometry_and_descriptors(img1, detector, descriptor)
torch.cuda.empty_cache()
LAFs2, descriptors2 = get_geometry_and_descriptors(img2, detector, descriptor)

The class ScaleSpaceAffinePatchExtractor handles either the learned or the handcrafted way of doing feature detection, affine shape estimation, and orientation estimation for an input image.

For the handcrafted way, the processing is as follows (lines 316-317):

        LAFs1, descriptors1 = get_geometry_and_descriptors(img1, detector, descriptor, False)
        LAFs2, descriptors2 = get_geometry_and_descriptors(img2, detector, descriptor, False)

As do_ori is switched off here, the whole process skips the orientation step, as shown in the forward() function of the class ScaleSpaceAffinePatchExtractor (SparseImgRepresenter.py):

        if do_ori:
            LAFs = self.getOrientation(LAFs, final_pyr_idxs, final_level_idxs)

If I understand correctly, this means that for the validation of the handcrafted way, no orientation module is used, but the affine module used here is the learned affine net. [If I am wrong, please tell me and ignore the following question.]

Here is my question: why is the orientation for the learned affine module assigned with the handcrafted orientation method, while for the handcrafted way the orientation step is skipped? I think a fairer comparison would be the learned AffNet against the handcrafted affine estimation method. Thank you very much for your attention.

role of orientation net

Hi thanks for the repo!

I would like to ask: is it really necessary to use a dedicated OriNet to rotate the patch and then extract the descriptor from the rotated patch? I think most descriptors are, or at least claim to be, rotation invariant.

Do descriptor(rotate(raw_patch)) and descriptor(raw_patch) really differ much?
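A small hedged sketch of the empirical check this suggests; `descriptor` here stands for any patch descriptor callable (for example HardNet) taking an N x 1 x 32 x 32 tensor, and is not a specific API of this repository:

import torch

def rotation_sensitivity(descriptor, patches):
    d0 = descriptor(patches)
    d90 = descriptor(torch.rot90(patches, 1, dims=(2, 3)))   # rotate each patch by 90 degrees
    return torch.norm(d0 - d90, dim=1).mean()                # 0 would mean perfect rotation invariance

# Example with a trivial "descriptor" that just flattens the patch (clearly not rotation invariant):
patches = torch.rand(8, 1, 32, 32)
print(rotation_sensitivity(lambda p: p.reshape(p.size(0), -1), patches))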

Just a curiosity on the tech report...

Hi,
I found your work very interesting. Well done!

Just one question/curiosity about the results of Fig. 6 in the tech report: could you also report what would happen if the Hessian matrix of the keypoint were used "as is" to define the elliptical shape (i.e. a sort of 0-th Baumberg iteration), perhaps with only a threshold on the ratio between the ellipse axes to discard overly elongated ellipses (0.75 or below might be a good value)?

Thanks in advance for your reply

Question about evaluating the model on roxford dataset

I have tested the released model on the rOxford dataset, but the final result is not as good as expected:

roxford5k: mAP E: 67.17, M: 53.37, H: 30.37
roxford5k: mP@k[1 5 10] E: [95.59 72.06 55.88], M: [97.14 81.43 62.86], H: [78.57 31.43 14.29]
My scheme is:

  1. Use the Hessian detector and AffNet to extract 3000 features per image and HardNet++ to extract descriptors (3000 x 128).
  2. Use RANSAC to find good matches between each query and reference image and use the sum of matchMask as the score (a sketch of this step follows below).
  3. Use revisitop to evaluate the result.

Hope to get some suggestions.
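A hedged sketch of step 2 above (ranking by RANSAC inlier count); the function and variable names are illustrative, not the repository's API:

import cv2
import numpy as np

def inlier_score(desc_q, pts_q, desc_r, pts_r):
    # desc_*: N x 128 descriptors, pts_*: N x 2 keypoint coordinates.
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = matcher.match(desc_q.astype(np.float32), desc_r.astype(np.float32))
    if len(matches) < 4:
        return 0
    src = np.float32([pts_q[m.queryIdx] for m in matches])
    dst = np.float32([pts_r[m.trainIdx] for m in matches])
    _, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return 0 if mask is None else int(mask.sum())   # sum of the inlier mask is the image score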

Extract_Patches

After detecting a feature point with the Hessian detector, we get an initial x0, y0 and sigma0. The range of x0 is [0, width-1], the range of y0 is [0, height-1], and sigma0 = power(2, i/L), where i is the index of the level and L is the number of levels in an octave. We then further fine-tune the location x0, y0 and optimize the value of sigma0 to get the final x1, y1 and sigma1.
Now assume the feature point is located in the 0th octave and 2nd level. According to x1, y1, and sigma1, we crop a 64x64 patch from the DoG in the 0th octave and 2nd level. Here an affine transform matrix is constructed as LAF = [sigma1, 0, x1; 0, sigma1, y1] to crop the patch. Note that the range of sigma1 should be [1, 2].
About sigma1: can I understand it as a radius? That is, centered at point (x1, y1) in the DoG, the area within that radius is cropped and then resampled to a 64x64 patch?

Dataset used for pre-training?

Hi,
(Thank you for recommending HardNet to me. Now I use HesAffNet + HardNet and it works much better!)

Can you tell me about the datasets you used for pre-training?
What dataset did you use to train each model (HardNet++.pth and pretrained/AffNet.pth)?

Use pytorch-sift

To use the pytorch-sift module, what format should the patch be in? Is scaling the values to [0, 1] sufficient, or does the patch have to have mean 0 and std 1?

About AffNet training(Fig.5)

Hi,
I read "Repeatability Is Not Enough" and found it very interesting.
Therefore, I would like to try the AffNet training program from Fig. 5 of the report:
AffNet > ST > Desc > HardNegC loss (using the Phototour dataset).
Is there sample code for this?

Does the order of applying the orientation matrix and the up-is-up affine shape matrix matter?

Hi, I am confused: should we apply the up-is-up affine shape matrix first and then apply the orientation matrix to the patch?

The up-is-up affine shape matrix only preserves the y-axis of the patch. So, say patch1 in image1 is an arrow pointing to the left, while patch2 in image1 has the arrow pointing up. After applying the up-is-up affine shape matrix, the arrow in patch1 will be twisted while the arrow in patch2 should remain pointing up. If we then apply the orientation matrix to both patch1 and patch2, even the ground-truth orientation matrices will not make them similar, except in the case where the up-is-up affine shape matrix applied to patch1 is an identity matrix.

So I am wondering: shouldn't we apply the orientation matrix first, to rotate both patches to the same direction, and then apply the up-is-up affine shape matrix to adjust the appearance?

Comparison to vl_covdet()

In the vl_covdet function there are some parameters; do they correspond to parameters of the LAF?

PatchResolution 15 (SIFT) or 20 (LIOP, Patch)
The size of the patch R in pixel. Specifically, the patch is a square image of side 2*R+1 pixels.

It is the patch size (PS), right?

PatchRelativeExtent 7.5 (SIFT), 10 (LIOP), or 6 (Patch)
The extent E of the patch in the normalized feature frame. The normalized feature frame is mapped to the feature frame F detected in the image by a certain affine transformation (A,T) (see VL_PLOTFRAME() for details). The patch is a square [-E, E]^2 in the normalized frame, and its shape in the original image is its image under (A,T).

For the PatchRelativeExtent parameter, is it the measurement region scale, namely mrScale?

Besides, I want to know: what is the difference between the normalized feature frame in vl_covdet and the normalized LAF?
Thank you in advance.

OriNet

Why didn't you discuss OriNet in the paper? How did you use it in Tables 1 and 2?

Extract Patches

Dear Dmytro,
I wonder what the difference is between extract_patches_from_pyr (a method in SparseImgRepresenter.py) and extract_patches (a function in LAF.py)?
Thanks in advance.
