magicleap / SuperPointPretrainedNetwork
PyTorch pre-trained model for real-time interest point detection, description, and sparse tracking (https://arxiv.org/abs/1712.07629)
License: Other
Hello, I want to use this model with libtorch. Can you tell me how to convert the .pth file to a .pt file?
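One common route is TorchScript tracing. Here is a minimal sketch, assuming demo_superpoint.py is importable and superpoint_v1.pth is the released weights file (both paths are illustrative):

import torch
from demo_superpoint import SuperPointNet

net = SuperPointNet()
net.load_state_dict(torch.load('superpoint_v1.pth', map_location='cpu'))
net.eval()

# Trace with a dummy grayscale input at the size you plan to use from libtorch.
example = torch.zeros(1, 1, 480, 640)
traced = torch.jit.trace(net, example)
traced.save('superpoint_v1.pt')  # load in C++ with torch::jit::load

Note the traced module returns only the raw (semi, desc) pair, so the NMS and heatmap decoding from demo_superpoint.py would still need to be reimplemented on the C++ side.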
@ddetone
Hi, my classmate and I ran an experiment comparing SuperPoint and ORB (three different groups of data were used), and strangely we found that ORB's results are not so bad compared with SuperPoint (while in the paper, M.Score, a metric indicating overall performance, is much higher than ORB's). In terms of the difference in the matched points' y coordinates, the proportion of delta y (y2 - y1) equal to zero is higher for ORB than for SuperPoint for each pair of test images.
Thank you for your great work!
I wonder why the descriptor size is 256 floats. Have you tested other sizes?
Thanks in advance! Really hope to get your reply!
The keypoint coordinates are extracted from the heatmap via:
SuperPointPretrainedNetwork/demo_superpoint.py
Lines 251 to 257 in 1fda796
The indices returned by np.where (equivalently, np.nonzero) are ordered by dimension. In a NumPy array, the first dimension runs along the rows (i.e. the y coordinates in an image) and the second along the columns (i.e. the x coordinates in an image).
Hence, syntactically this should be:
ys, xs = np.where(heatmap >= self.conf_thresh) # Confidence threshold.
# [...]
pts[0, :] = xs
pts[1, :] = ys
pts[2, :] = heatmap[ys, xs]
The actual variable name does not matter later, as those values are correctly normalised.
Hi @ddetone, I'm curious about the time cost of SuperPoint feature extraction.
You mention in the paper that the total runtime of the system on a GPU is about 13 ms. But when I extract SuperPoint features (640x480, RGB with 3 channels) using a single GPU (Tesla P100 PCIe 16GB), the time cost is about 60 ms. Is that normal? Or, if you have measured the time yourself, can you guess what's wrong with my 60 ms?
Thanks in advance! Really hope to get your reply!
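One thing worth checking before comparing numbers: CUDA kernels launch asynchronously, and the first call pays one-off initialization costs, so naive timing can overstate per-frame cost. A minimal timing sketch, assuming SuperPointFrontend from demo_superpoint.py and a CUDA-enabled build (also note the frontend takes a single grayscale channel, so RGB conversion adds CPU time on top of the network itself):

import time
import numpy as np
import torch
from demo_superpoint import SuperPointFrontend

fe = SuperPointFrontend(weights_path='superpoint_v1.pth', nms_dist=4,
                        conf_thresh=0.015, nn_thresh=0.7, cuda=True)
img = np.random.rand(480, 640).astype('float32')  # stand-in grayscale frame in [0, 1]

fe.run(img)                       # warm-up: the first call pays one-off CUDA init costs
torch.cuda.synchronize()
t0 = time.time()
for _ in range(100):
    fe.run(img)
torch.cuda.synchronize()          # kernels are asynchronous; wait before stopping the clock
print('mean per frame: %.1f ms' % ((time.time() - t0) / 100 * 1000))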
How do I create a .pth file that contains the weights?
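A minimal sketch, assuming the SuperPointNet class from demo_superpoint.py; a .pth file here is just a serialized state_dict:

import torch
from demo_superpoint import SuperPointNet

net = SuperPointNet()
# ... train the network, or copy weights in, however you like ...
torch.save(net.state_dict(), 'my_superpoint.pth')    # write the weights

# demo_superpoint.py loads it back the same way:
net.load_state_dict(torch.load('my_superpoint.pth'))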
Hello, I have a question: how can I run inference on a batch of images at once?
Can anyone give some advice?
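A minimal batching sketch, assuming SuperPointNet from demo_superpoint.py: the raw network accepts an (N, 1, H, W) batch, but the post-processing in SuperPointFrontend.run is written for a single image, so it still has to be applied per image afterwards:

import numpy as np
import torch
from demo_superpoint import SuperPointNet

net = SuperPointNet()
net.load_state_dict(torch.load('superpoint_v1.pth', map_location='cpu'))
net.eval()

imgs = np.random.rand(8, 480, 640).astype('float32')  # 8 stand-in grayscale frames in [0, 1]
batch = torch.from_numpy(imgs).unsqueeze(1)            # (8, 1, 480, 640)
with torch.no_grad():
    semi, desc = net(batch)  # semi: (8, 65, 60, 80), desc: (8, 256, 60, 80)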
@ddetone my results after trying to retrain your model based on your released code:
It is now able to cope with rotation angles greater than 45 degrees.
The threshold I set was 0.3.
I modified your training a bit and trained only the descriptor part (1000 COCO images + 1000 HPatches images, 21 epochs).
I'm using the following code to estimate the keypoints and matches using ONNX:

import onnxruntime
import numpy as np
import cv2

path = "output/rgb.png"
img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)  # the network expects a single-channel image
img = cv2.resize(img, dsize=(640, 480), interpolation=cv2.INTER_AREA)
data = (img.astype('float32') / 255.0).reshape(1, 1, 480, 640)  # NCHW; note dsize is (W, H)

session = onnxruntime.InferenceSession("output/superpoint_640x480.onnx", None)
input_name = session.get_inputs()[0].name
output_names = [o.name for o in session.get_outputs()]
print(input_name)
print(output_names)

result = session.run(output_names, {input_name: data})
print(result)
How do I interpret the result? Or is this the proper way of doing it?
Hi,
I computed the repeatability score of your pre-trained network using the repeatability code provided at https://github.com/rpautrat/SuperPoint/blob/master/superpoint/evaluations/detector_evaluation.py
I am getting the following results:
Whereas in the paper, the reported repeatability scores are 0.652 and 0.503 on the illumination and viewpoint scenes, respectively.
I found that your pretrained model performs much better than other open models on our images. May I ask whether you used any training data other than COCO? Thanks!
Hi, I was trying to train on one of my own datasets for a student research project. Can you please tell me whether you applied NMS during the homographic adaptation phase?
Thanks a lot
Do you use pixel shuffle in the SuperPoint model? If not, can you clarify the meaning of the following sentence from the SuperPoint paper?
Unfortunately, upsampling layers tend to add a high amount of computation and can introduce unwanted checkerboard artifacts, thus we designed the interest point detection head with an explicit decoder to reduce the computation of the model. This decoder has no parameters and is known as "sub-pixel convolution" or "depth to space" in TensorFlow or "pixel shuffle" in PyTorch.
thank you.
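For reference, that parameter-free "depth to space" decode is what demo_superpoint.py performs in NumPy on the 65-channel detector output. A sketch of the decode, assuming semi is the raw (65, H/8, W/8) output for one image (the 65th channel is the "no interest point" dustbin):

import numpy as np

def decode_heatmap(semi):
    dense = np.exp(semi) / (np.sum(np.exp(semi), axis=0) + 1e-5)  # softmax over channels
    nodust = dense[:-1, :, :]                # drop the 65th "dustbin" channel
    Hc, Wc = nodust.shape[1], nodust.shape[2]
    nodust = nodust.transpose(1, 2, 0)       # (Hc, Wc, 64)
    heatmap = nodust.reshape(Hc, Wc, 8, 8)   # each cell encodes an 8x8 pixel block
    heatmap = heatmap.transpose(0, 2, 1, 3)  # interleave cells and blocks
    return heatmap.reshape(Hc * 8, Wc * 8)   # full-resolution heatmap

No learned weights are involved, which is why the paper calls it an explicit, parameter-free decoder.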
Hello,
I'm trying to build my own SuperPoint network from scratch, and I want to implement the Matching Score metric. I am finding it hard to understand how the metric is computed. What do you mean by "shared viewpoint region"? Is the matching score the ratio of the number of matches to the number of keypoints detected in the two views?
Could you explain with an example from HPatches?
Thank you!
Sorry, I have a question about batch norm layers. I didn't find any norm layers in demo_superpoint.py, e.g.:

# Shared Encoder.
x = self.relu(self.conv1a(x))
x = self.relu(self.conv1b(x))
x = self.pool(x)
x = self.relu(self.conv2a(x))
x = self.relu(self.conv2b(x))
Starting with version 1.4.0 of PyTorch, the default behavior of torch.nn.functional.grid_sample was changed to align_corners=False. This can change the output descriptors for certain input image sizes compared to PyTorch 1.3 and earlier. To fix it, one simply needs to modify L281 as follows:
desc = torch.nn.functional.grid_sample(coarse_desc, samp_pts, align_corners=True)
How do I train my own custom SuperPoint and SuperGlue models on my own data instead of using an already available dataset? Is there any git repo or package for this?
When I run the demo with nyu_snippet.mp4 in GPU mode, loading the pre-trained network takes a very long time, the demo displays the warning "PointTracker: no points were added to tracker", and no SuperPoint keypoints are shown in the visualisation. What's the problem?
When I simply run the demo command from the README.md:
./demo_superpoint.py assets/icl_snippet/
I got this:
Namespace(H=120, W=160, camid=0, conf_thresh=0.015, cuda=False, display_scale=2, img_glob='*.png', input='assets/icl_snippet/', max_length=5, min_length=2, nms_dist=4, nn_thresh=0.7, no_display=False, show_extra=False, skip=1, waitkey=1, weights_path='superpoint_v1.pth', write=False, write_dir='tracker_outputs/')
==> Processing Image Directory Input.
==> Loading pre-trained network.
==> Successfully loaded pre-trained network.
==> Running Demo.
Segmentation fault (core dumped)
In addition to the above problem, when I ran this test phase on my own images, it finished abnormally like this:
Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
I debugged and found the error location: the crash happens when the code runs dmat = np.dot(desc1.T, desc2) in the method nn_match_two_way.
Has anyone else seen the same situation?
Hi,
I wanted to get the HPatches benchmark score for SuperPoint. But to eliminate detector-related factors, I want a feature vector describing the 65x65 patches from the dataset.
How can I use the provided pre-trained model to get a descriptor for a 65x65 image patch?
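One possible approach (a sketch, not an official recipe): the encoder has a stride of 8, so a 65x65 patch is not grid-aligned. You could resize the patch to 64x64, run the raw network, and bilinearly sample the coarse descriptor map at the patch centre, which is how demo_superpoint.py samples descriptors at keypoint locations:

import cv2
import numpy as np
import torch
from demo_superpoint import SuperPointNet

net = SuperPointNet()
net.load_state_dict(torch.load('superpoint_v1.pth', map_location='cpu'))
net.eval()

patch = np.random.rand(65, 65).astype('float32')  # stand-in 65x65 patch in [0, 1]
patch = cv2.resize(patch, (64, 64))               # make the size a multiple of 8
inp = torch.from_numpy(patch)[None, None]         # (1, 1, 64, 64)
with torch.no_grad():
    _, coarse_desc = net(inp)                     # (1, 256, 8, 8)

centre = torch.tensor([[[[0.0, 0.0]]]])           # patch centre in normalised [-1, 1] coords
desc = torch.nn.functional.grid_sample(coarse_desc, centre, align_corners=True)
desc = desc.reshape(256)
desc = desc / desc.norm()                         # L2-normalise, as the demo does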
Recommend going from:
parser.add_argument('input', type=str, default='',
To:
parser.add_argument('--input', type=str, default='',
It's amazing work. Could you share the code? I would appreciate it!
Apart from resizing the image and reducing the Sinkhorn iteration count, what are the other ways to optimize inference speed on older GPUs?
How can I use the code to track custom points through a whole video? Suppose I want only 4 specific points to be tracked through the whole video; how can I specify them in the code?
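One possible approach (a sketch using the classes in demo_superpoint.py, not a built-in option): extract points per frame as usual, but keep only the descriptors of your 4 chosen reference points and match them against each new frame with the tracker's nearest-neighbour matcher:

import numpy as np
from demo_superpoint import SuperPointFrontend, PointTracker

fe = SuperPointFrontend(weights_path='superpoint_v1.pth', nms_dist=4,
                        conf_thresh=0.015, nn_thresh=0.7, cuda=False)
tracker = PointTracker(max_length=5, nn_thresh=0.7)

first = np.random.rand(480, 640).astype('float32')  # stand-in first frame (grayscale, [0, 1])
pts, desc, _ = fe.run(first)
keep = np.argsort(-pts[2])[:4]                      # e.g. the 4 most confident detections
ref_desc = desc[:, keep]                            # (256, 4) reference descriptors

frame = np.random.rand(480, 640).astype('float32')  # stand-in next frame
pts2, desc2, _ = fe.run(frame)
matches = tracker.nn_match_two_way(ref_desc, desc2, 0.7)
# matches[0]: index into the 4 reference points; matches[1]: column into pts2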
Although the README states that neither the Synthetic Shapes dataset nor the training code is included, is there any plan to release the initial MagicPoint detector?
Forgive me if this has been released already, or if there's mention elsewhere of it not being released.
Is it somehow possible to convert or retrain the float descriptor into a binary one?
My pipeline is fully optimized for Hamming-distance matching, so it would be nice to get a binary descriptor out of SuperPoint.
thanks,
Francesco
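A rough sketch of the conversion route (not something the released model was trained for, so expect some loss in matching quality versus L2 on the float descriptors): binarise each of the 256 dimensions by sign and pack the bits into 32 bytes, which any Hamming matcher can consume:

import numpy as np

def binarise(desc):
    # desc: (256, N) float descriptors as returned by SuperPointFrontend.run.
    bits = (desc > 0).astype(np.uint8)  # 1 bit per dimension
    return np.packbits(bits, axis=0).T  # (N, 32) uint8 rows, Hamming-ready

The resulting (N, 32) uint8 rows can be matched with, e.g., OpenCV's cv2.BFMatcher(cv2.NORM_HAMMING).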
It seems like the pretrained model provided is not robust against rotation, right?