truongkhang / topicfm

[AAAI2023] TopicFM: Robust, Efficient, and Interpretable Topic-Assisted Feature Matching

License: Apache License 2.0

Python 98.90% Shell 1.10%
camera-pose-estimation feature-matching image-matching visual-localization

topicfm's Introduction

TopicFM+: Boosting Accuracy and Efficiency of Topic-Assisted Feature Matching

This code implements TopicFM+, an extension of TopicFM. For the implementation of the previous version, TopicFM, please check out the aaai23_ver branch.

Requirements

All experiments in this paper were run on Ubuntu with an NVIDIA driver of at least 430.64 and CUDA 10.1.

First, create a virtual environment with Anaconda as follows:

conda create -n topicfm python=3.8 
conda activate topicfm
conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch
pip install -r requirements.txt
# use pip to install any packages that are still missing
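
To see which packages are still missing after the steps above, a small check like the following may help. It is only a sketch: `find_missing_packages` is a hypothetical helper, not part of this repository, and the package list is an assumption that you should extend with the names in your copy of requirements.txt.

```python
import importlib.util


def find_missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # torch and torchvision come from the conda command above; add the
    # packages listed in requirements.txt as needed (assumed list).
    missing = find_missing_packages(["torch", "torchvision"])
    print("missing packages:", missing or "none")
```

Anything it reports can then be installed with pip inside the activated topicfm environment.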

Data Preparation

The proposed method is trained on the MegaDepth dataset and evaluated on the MegaDepth test set, ScanNet, HPatches, Aachen Day and Night (v1.1), and InLoc. These datasets are all large, so we cannot include them with this code. The descriptions below explain how to download them.

MegaDepth

This dataset is used for both training and evaluation (Li and Snavely 2018). To use this dataset with our code, please follow the instructions of LoFTR.

ScanNet

We use only 1500 image pairs of ScanNet (Dai et al. 2017) for evaluation. Please download and prepare the ScanNet test data provided by LoFTR.

Training

To train our model, we recommend using as many GPUs as possible, each with at least 12GB of memory. In our setting, we train on 4 GPUs with 12GB each. Please set up your hardware environment in scripts/reproduce_train/outdoor.sh, then run this command to start training:

bash scripts/reproduce_train/outdoor.sh <path to the training config file>
# for example,
bash scripts/reproduce_train/outdoor.sh configs/megadepth_train_topicfmfast.py

We provide the pretrained models that were used in the paper (TopicFM-fast, TopicFM+).

Evaluation

MegaDepth (relative pose estimation)

bash scripts/reproduce_test/outdoor.sh <path to the config file in the folder configs> <path to pretrained model>
# For example, to evaluate TopicFM-fast 
bash scripts/reproduce_test/outdoor.sh configs/megadepth_test_topicfmfast.py pretrained/topicfm_fast.ckpt

ScanNet (relative pose estimation)

bash scripts/reproduce_test/indoor.sh <path to the config file in the folder configs> <path to pretrained model>

HPatches, Aachen v1.1, InLoc

To evaluate on these datasets, we integrated our code into the image-matching-toolbox provided by Patch2Pix. The updated code and detailed evaluations are available here.

Image Matching Challenge 2023

Our method TopicFM+ achieved a high ranking (silver medal) on the Kaggle IMC2023 here.

Efficiency comparison

The efficiency evaluation reported in the paper was measured by averaging the runtime over the 1500 image pairs of the ScanNet evaluation dataset. The image size can be changed in configs/data/scannet_test_topicfmfast.py.

We computed computational costs in GFLOPs and runtimes in ms for LoFTR, MatchFormer, QuadTree, and AspanFormer. However, this required minor modifications to the code of each method. Please contact us if you need evaluations for those methods.

Here, we provide the runtime measurement for our method, TopicFM-fast:

python visualization.py --method topicfmv2 --dataset_name scannet --config_file configs/scannet_test_topicfmfast.py  --measure_time --no_viz

Runtime report at two image resolutions (measured on an NVIDIA TITAN V with 32GB of memory):

| Model | 640 x 480 | 1200 x 896 |
|---|---|---|
| TopicFM-fast | 56 ms | 346 ms |
| TopicFM+ | 90 ms | 388 ms |
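
For readers who want to time another matcher in the same spirit (average runtime per image pair), the sketch below shows one common pattern. It is not the code behind visualization.py --measure_time: `average_runtime_ms`, the warm-up count, and the `sync` hook are all assumptions made for illustration.

```python
import time


def average_runtime_ms(fn, inputs, warmup=5, sync=None):
    """Average wall-clock time of fn(x) over `inputs`, in milliseconds.

    sync: optional zero-argument callable run before the clock is read,
    for example torch.cuda.synchronize when fn launches asynchronous GPU work.
    """
    inputs = list(inputs)
    if not inputs:
        raise ValueError("inputs must not be empty")
    # Warm-up calls are executed but excluded from the reported average.
    for x in inputs[:warmup]:
        fn(x)
    if sync is not None:
        sync()
    total = 0.0
    for x in inputs:
        start = time.perf_counter()
        fn(x)
        if sync is not None:
            sync()  # wait for pending work so the interval covers it
        total += time.perf_counter() - start
    return 1000.0 * total / len(inputs)
```

On a GPU, passing a synchronization callable matters because kernel launches return before the work finishes, so an unsynchronized timer can report times that are too low.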

Citations

If you find this code useful, please cite the following works:

@misc{giang2023topicfm,
  title={TopicFM+: Boosting Accuracy and Efficiency of Topic-Assisted Feature Matching}, 
  author={Khang Truong Giang and Soohwan Song and Sungho Jo},
  year={2023},
  eprint={2307.00485},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

or

@inproceedings{giang2023topicfm,
    title={TopicFM: Robust and interpretable topic-assisted feature matching},
    author={Giang, Khang Truong and Song, Soohwan and Jo, Sungho},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={37},
    number={2},
    pages={2447--2455},
    year={2023}
}

Acknowledgement

This code is built on LoFTR. We thank the authors for their useful source code.

topicfm's Issues

About reproduce performances

Dear author, thanks for your great work! I have recently been trying to reproduce your results, and I have run into some problems.

For training, I used your default settings (changing only bs=8) to train TopicFMfast and TopicFMplus on the unprocessed MegaDepth dataset. For testing, I also used the default settings to test on MegaDepth-1500 and Scannet-1500.

Here are the results. "Paper" means the numbers reported in your paper; "Pretrained" means tested with the weights you provided; "reproduce" means tested with the weights I trained:

MegaDepth-1500

| name | auc@5 | auc@10 | auc@20 | prec@5e-04 |
|---|---|---|---|---|
| TopicFM-fast(Paper) | 56.2 | 71.9 | 82.9 | - |
| TopicFM-fast(Pretrained) | 0.5606952 | 0.71539569 | 0.82653555 | 0.93822512 |
| TopicFM-fast(reproduce) | 0.52482495 | 0.68158841 | 0.79560471 | 0.94266812 |
| TopicFM+(paper) | 58.2 | 72.8 | 83.2 | - |
| TopicFM+(Pretrained) | 0.56792696 | 0.72138904 | 0.83045042 | 0.95216762 |
| TopicFM+(reproduce) | 0.56648886 | 0.71682282 | 0.82511755 | 0.94402297 |

Scannet

| name | auc@5 | auc@10 | auc@20 | prec@5e-04 |
|---|---|---|---|---|
| TopicFM-fast(Paper) | 19.7 | 36.7 | 52.7 | - |
| TopicFM-fast(Pretrained) | 0.18852710073235984 | 0.3612222633840676 | 0.5254263432725385 | 0.7259958315195613 |
| TopicFM-fast(reproduce) | 0.19810854534909592 | 0.3723126894809127 | 0.5360527855679162 | 0.7568452620821726 |
| TopicFM+(paper) | 20.4 | 38.5 | 54.5 | |
| TopicFM+(Pretrained) | 0.19944785341139995 | 0.3751533539225028 | 0.5389162160798258 | 0.6811763544889461 |
| TopicFM+(reproduce) | 0.20168166013885477 | 0.38085154323096326 | 0.5433350728906183 | 0.6783286089835198 |

Here are some questions:

  1. The "pretrained" models are not as accurate as reported (on both MegaDepth and Scannet, for both fast and plus). Did you use any special parameters or tricks at test time to reach the reported performance?
  2. There are still gaps between "pretrained" and "reproduce" on MegaDepth, even though both were tested under the same settings, while the "pretrained" and "reproduce" results on Scannet are close. I don't know what causes the gaps on MegaDepth.
  3. An unrelated question: when I run the test, I change batch_size=2 in the scripts (e.g. "scripts/reproduce_test/outdoor.sh"). The change seems to be accepted, but the tests are still performed one by one (batch_size=1). Is this normal?
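
Because the paper reports AUC in percent while the test script prints fractions, the gaps in the first question are easier to read after conversion. The sketch below does this for the MegaDepth-1500 "Paper" and "Pretrained" rows quoted above; the dictionary layout and the helper name `gaps_in_points` are illustrative assumptions, not part of the repository.

```python
# AUC@5 / AUC@10 / AUC@20 copied from the MegaDepth-1500 rows in this issue.
PAPER = {
    "TopicFM-fast": (56.2, 71.9, 82.9),          # reported in percent
    "TopicFM+": (58.2, 72.8, 83.2),
}
PRETRAINED = {
    "TopicFM-fast": (0.5606952, 0.71539569, 0.82653555),   # printed as fractions
    "TopicFM+": (0.56792696, 0.72138904, 0.83045042),
}


def gaps_in_points(paper, measured):
    """Return measured minus paper, in percentage points, per model."""
    return {
        name: tuple(round(100.0 * m - p, 2) for p, m in zip(paper[name], measured[name]))
        for name in paper
    }


if __name__ == "__main__":
    for name, gaps in gaps_in_points(PAPER, PRETRAINED).items():
        print(name, gaps)
```

Negative values mean the measured number falls below the paper's figure.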

Thank you again!

Can you provide D2-Net preprocessed images?

Hi authors,
Thank you for your time.
As the title says, can you provide the D2-Net preprocessed images?
I have confirmed that this data has been removed from LoFTR.
Best regards

Data type error in DynamicFineMatching forward()

Hello,

With the latest model weights (topicfm_fast.ckpt) and default config, I am getting an error at the very end of the DynamicFineMatching forward(...) function here

The exception reads:

scale0_f, scale1_f = scale0[true_matches], scale1[true_matches]
TypeError: 'float' object is not subscriptable

I believe this is because self.scale0 and self.scale1 are not properly set as tensors and remain floats here. I've looked all around to see where these might be set but I cannot find them.
Honestly, I don't really understand how the code/architecture works, but I got an older version of the codebase working out of the box, using image-matching-webgui

Could you explain what these scale attributes are and if there is a workaround for this issue?
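
One possible stopgap, offered only as a sketch: guard the indexing so that a plain number is passed through unchanged. `select_scale` is a hypothetical helper, and it rests on the unverified assumption that a float scale means the same factor applies to every match; it has not been checked against the repository.

```python
def select_scale(scale, index):
    """Index `scale` only when it is indexable; pass plain numbers through.

    Assumption: a Python int/float scale stands for one factor shared by
    all matches, so selecting a subset of matches leaves it unchanged.
    """
    if isinstance(scale, (int, float)):
        return scale
    return scale[index]


# Hypothetical replacement for the failing line:
#   scale0_f = select_scale(scale0, true_matches)
#   scale1_f = select_scale(scale1, true_matches)
```

If the scales are meant to be per-image tensors, the real fix is more likely in how the input batch is built, so treat this as a way to get past the exception rather than a confirmed solution.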

Thank you! This is very impressive work and super helpful to my project

Contact Author

Hello author. I learned a lot from reading your article. I am a graduate student, and I am also studying image feature matching. Could you share an email address or another way to contact you so that I can learn from you? If so, thank you very much!

fine level correspondence

Great work!
Could you explain a little bit why the self-supervised loss (symmetric epipolar distance) will encourage the model to find the most informative point in patch A?
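
For reference, the quantity named in this question can be written out as below. This is the textbook squared form of the symmetric epipolar distance on homogeneous points; whether the training loss uses exactly this form (squared or not, normalized coordinates or pixels) is an assumption, and the helper names are illustrative.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(m):
    """Transpose a 3x3 matrix given as a list of rows."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def symmetric_epipolar_distance(F, x0, x1):
    """Squared symmetric epipolar distance for one correspondence.

    F maps a homogeneous point x0 = (x, y, 1) in image 0 to its epipolar
    line in image 1; x1 is the matched homogeneous point in image 1.
    Raises ZeroDivisionError for degenerate lines (both a and b zero).
    """
    line1 = mat_vec(F, x0)              # epipolar line of x0 in image 1
    line0 = mat_vec(transpose(F), x1)   # epipolar line of x1 in image 0
    residual = sum(a * b for a, b in zip(x1, line1))  # x1^T F x0
    return residual ** 2 * (
        1.0 / (line1[0] ** 2 + line1[1] ** 2)
        + 1.0 / (line0[0] ** 2 + line0[1] ** 2)
    )
```

The value is zero when each point lies exactly on the epipolar line of the other, and it grows with the squared point-to-line distance in both images.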

this is an error

Traceback (most recent call last):
File "D:\Project\image_match_code\TopicFM-main-n\demo\demo.py", line 112, in
matcher(batch)
File "D:\software\anaconda\envs\pytorch-gpu\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "D:\Project\image_match_code\TopicFM-main-n\src\models\topic_fm.py", line 43, in forward
feats_c, feats_f = self.backbone(torch.cat([data['image0'], data['image1']], dim=0))
File "D:\software\anaconda\envs\pytorch-gpu\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "D:\Project\image_match_code\TopicFM-main-n\src\models\backbone\fpn.py", line 99, in forward
x3_out = self.layer3_outconv2(x3_out+x4_out_2x)
RuntimeError: The size of tensor a (81) must match the size of tensor b (82) at non-singleton dimension 3

Process finished with exit code 1
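
A mismatch of 81 versus 82 in fpn.py is the pattern one would expect when an image side is not divisible by the backbone's total downsampling factor: an odd feature size is halved with rounding up, and upsampling by 2 then overshoots by one. The sketch below illustrates that arithmetic. The four stride-2 stages, the factor of 16, and the example width of 648 are assumptions inferred from the traceback, not values read from the code.

```python
def feature_sizes(size, num_stride2_layers=4):
    """Sizes after successive stride-2 convolutions with 'same'-style padding.

    Each such layer maps a length n to ceil(n / 2).
    """
    sizes = []
    for _ in range(num_stride2_layers):
        size = (size + 1) // 2
        sizes.append(size)
    return sizes


def round_down_to_multiple(size, divisor=16):
    """Largest multiple of `divisor` that does not exceed `size`."""
    return size - size % divisor


if __name__ == "__main__":
    width = 648                                   # hypothetical input width
    print(feature_sizes(width))                   # last two levels: 81 and 41
    print(feature_sizes(round_down_to_multiple(width)))
```

With a width of 648 the last two levels are 81 and 41, and 41 upsampled by 2 gives 82, which cannot be added to 81. Resizing or cropping the input so that both sides are multiples of 16 would avoid this under the stated assumptions.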

Dataset Issues

Could the author share the dataset privately? I can't download it from here. QAQ

When will the code be released?

Hello author. I just read your paper and it's really good. When will the code be released? I want to learn from it!

mask

Hi author,
I'm a little confused about the mask here. I understand that this mask marks the valid area after padding (padding (bool): If set to 'True', zero-pad resized images to squared size.). If the images in my dataset are already square, can I skip these steps (resize, padding, ...) and set the mask to None?
I look forward to your reply.
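
To make the role of such a mask concrete, here is a minimal pure-Python sketch of zero-padding to a square with a validity mask. It only illustrates the idea described in the question; `pad_to_square` is a hypothetical helper, the bottom/right padding layout is an assumption, and whether the repository's data loader accepts a mask of None has not been verified.

```python
def pad_to_square(image, fill=0):
    """Zero-pad a 2D image (list of rows) at the bottom/right to a square.

    Returns (padded, mask), where mask is True on original pixels and
    False on padded ones.
    """
    height, width = len(image), len(image[0])
    side = max(height, width)
    padded = [list(row) + [fill] * (side - width) for row in image]
    padded += [[fill] * side for _ in range(side - height)]
    mask = [[r < height and c < width for c in range(side)] for r in range(side)]
    return padded, mask
```

For an image that is already square, no pixels are added and the mask is True everywhere, so in this sketch the mask carries no extra information in that case.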

Runtime

Hi,
Thank you for your excellent work.

I have a question about the inference time compared to LoFTR.
Because I do not have a computer with a GPU, I cannot run the "Runtime comparison" section as suggested in the readme.
Could you please share the inference times of TopicFM and LoFTR?

I am looking forward to hearing from you soon.

this is a question

Hello author, which version of the code corresponds to TopicFM+, and which version corresponds to TopicFM_fast? When I use the code for TopicFM to load these two weight files, it reports an error; that code only works with the model_best weight file.

Feature Extraction

Hello author, I would like to ask why feature extraction is often performed on single-channel grayscale images. What are the advantages of this approach, and is it possible to use three-channel color images for feature extraction?
