GIM: Learning Generalizable Image Matcher From Internet Videos (ICLR 2024 Spotlight)

Home Page: https://xuelunshen.com/gim

License: MIT License

GIM: Learning Generalizable Image Matcher From Internet Videos

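| Method | Mean AUC@5° (%) ↑ | GL3 | BLE | ETI | ETO | KIT | WEA | SEA | NIG | MUL | SCE | ICL | GTA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Handcrafted | | | | | | | | | | | | | |
| RootSIFT | 31.8 | 43.5 | 33.6 | 49.9 | 48.7 | 35.2 | 21.4 | 44.1 | 14.7 | 33.4 | 7.6 | 14.8 | 35.1 |
| Sparse Matching | | | | | | | | | | | | | |
| SuperGlue (in) | 21.6 | 19.2 | 16.0 | 38.2 | 37.7 | 22.0 | 20.8 | 40.8 | 13.7 | 21.4 | 0.8 | 9.6 | 18.8 |
| SuperGlue (out) | 31.2 | 29.7 | 24.2 | 52.3 | 59.3 | 28.0 | 28.4 | 48.0 | 20.9 | 33.4 | 4.5 | 16.6 | 29.3 |
| GIM_SuperGlue (50h) | 34.3 | 43.2 | 34.2 | 58.7 | 61.0 | 29.0 | 28.3 | 48.4 | 18.8 | 34.8 | 2.8 | 15.4 | 36.5 |
| LightGlue | 31.7 | 28.9 | 23.9 | 51.6 | 56.3 | 32.1 | 29.5 | 48.9 | 22.2 | 37.4 | 3.0 | 16.2 | 30.4 |
| GIM_LightGlue (100h) | 38.3 | 46.6 | 38.1 | 61.7 | 62.9 | 34.9 | 31.2 | 50.6 | 22.6 | 41.8 | 6.9 | 19.0 | 43.4 |
| Semi-dense Matching | | | | | | | | | | | | | |
| LoFTR (in) | 10.7 | 5.6 | 5.1 | 11.8 | 7.5 | 17.2 | 6.4 | 9.7 | 3.5 | 22.4 | 1.3 | 14.9 | 23.4 |
| LoFTR (out) | 33.1 | 29.3 | 22.5 | 51.1 | 60.1 | 36.1 | 29.7 | 48.6 | 19.4 | 37.0 | 13.1 | 20.5 | 30.3 |
| GIM_LoFTR (50h) | 39.1 | 50.6 | 43.9 | 62.6 | 61.6 | 35.9 | 26.8 | 47.5 | 17.6 | 41.4 | 10.2 | 25.6 | 45.0 |
| 🟩 GIM_LoFTR (100h) | TODO | | | | | | | | | | | | |
| Dense Matching | | | | | | | | | | | | | |
| DKM (in) | 46.2 | 44.4 | 37.0 | 65.7 | 73.3 | 40.2 | 32.8 | 51.0 | 23.1 | 54.7 | 33.0 | 43.6 | 55.7 |
| DKM (out) | 45.8 | 45.7 | 37.0 | 66.8 | 75.8 | 41.7 | 33.5 | 51.4 | 22.9 | 56.3 | 27.3 | 37.8 | 52.9 |
| GIM_DKM (50h) | 49.4 | 58.3 | 47.8 | 72.7 | 74.5 | 42.1 | 34.6 | 52.0 | 25.1 | 53.7 | 32.3 | 38.8 | 60.6 |
| GIM_DKM (100h) | 51.2 | 63.3 | 53.0 | 73.9 | 76.7 | 43.4 | 34.6 | 52.5 | 24.5 | 56.6 | 32.2 | 42.5 | 61.6 |
| RoMa (in) | 46.7 | 46.0 | 39.3 | 68.8 | 77.2 | 36.5 | 31.1 | 50.4 | 20.8 | 57.8 | 33.8 | 41.7 | 57.6 |
| RoMa (out) | 48.8 | 48.3 | 40.6 | 73.6 | 79.8 | 39.9 | 34.4 | 51.4 | 24.2 | 59.9 | 33.7 | 41.3 | 59.2 |
| 🟩 GIM_RoMa | TODO | | | | | | | | | | | | |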

The data in this table comes from the ZEB: Zero-shot Evaluation Benchmark for Image Matching proposed in the paper. This benchmark consists of 12 public datasets that cover a variety of scenes, weather conditions, and camera models, corresponding to the 12 test sequences starting from GL3 in the table. We will release ZEB as soon as possible.
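
As a quick check on how to read the table, the Mean column appears to be the unweighted average of the 12 per-dataset AUC@5° values. The sketch below recomputes it for two rows copied from the table. The helper name `mean_auc` is hypothetical and is not part of the gim code base.

```python
# Per-dataset AUC@5° values (GL3 ... GTA) copied from two rows of the table.
ROWS = {
    "RootSIFT": [43.5, 33.6, 49.9, 48.7, 35.2, 21.4,
                 44.1, 14.7, 33.4, 7.6, 14.8, 35.1],
    "GIM_DKM (100h)": [63.3, 53.0, 73.9, 76.7, 43.4, 34.6,
                       52.5, 24.5, 56.6, 32.2, 42.5, 61.6],
}


def mean_auc(values):
    """Unweighted mean over datasets, rounded to one decimal like the table."""
    return round(sum(values) / len(values), 1)


for method, values in ROWS.items():
    print(method, mean_auc(values))
```

Both results (31.8 and 51.2) agree with the Mean values reported for those rows.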

✅ TODO List

  • Inference code
    • gim_roma
    • gim_dkm
    • gim_loftr
    • gim_lightglue
  • Training code

We are actively continuing with the remaining open-source work and appreciate everyone's attention.

🤗 Online demo

Go to Huggingface to quickly try our model online.

⚙️ Environment

I set up the running environment on a new machine using the commands listed below.

conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install albumentations==1.0.1 --no-binary=imgaug,albumentations
pip install pytorch-lightning==1.5.10
pip install opencv-python==4.5.3.56
pip install imagesize==1.2.0
pip install kornia==0.6.10
pip install einops==0.3.0
pip install loguru==0.5.3
pip install joblib==1.0.1
pip install yacs==0.1.8
pip install h5py==3.1.0
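
Because every dependency above is pinned, a version mismatch is a plausible first thing to rule out when the demo fails. The following is a minimal sketch, not part of the repository, that compares installed versions with the pip pins listed above using the standard-library `importlib.metadata` (Python 3.8+). The names `PINS`, `installed_version`, and `find_mismatches` are hypothetical. The conda-installed packages are left out on the assumption that their reported version strings can vary by build.

```python
from importlib import metadata

# Pins copied from the pip commands above.
PINS = {
    "albumentations": "1.0.1",
    "pytorch-lightning": "1.5.10",
    "opencv-python": "4.5.3.56",
    "imagesize": "1.2.0",
    "kornia": "0.6.10",
    "einops": "0.3.0",
    "loguru": "0.5.3",
    "joblib": "1.0.1",
    "yacs": "0.1.8",
    "h5py": "3.1.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, get_version=installed_version):
    """Map each package whose installed version differs from its pin
    to a (pinned, installed) tuple; installed is None when missing."""
    mismatches = {}
    for name, pinned in pins.items():
        found = get_version(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches


if __name__ == "__main__":
    for name, (pinned, found) in find_mismatches(PINS).items():
        print(f"{name}: pinned {pinned}, installed {found}")
```

An empty result means every pip-installed package matches its pin. The `get_version` parameter exists only so the comparison can be exercised without touching the real environment.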

🔨 Usage

Clone the repository

git clone https://github.com/xuelunshen/gim.git
cd gim

Download the gim_dkm model weights from Google Drive.

Put the file in the weights folder.

Run the following command

python demo.py --model gim_dkm

or

python demo.py --model gim_lightglue

The code will match a1.png and a2.png in the assets/demo folder and output a1_a2_match.png and a1_a2_warp.png.

a1_a2_match.png visualizes the matches between the two images.

a1_a2_warp.png shows the result of projecting image a2 onto image a1 using a homography.
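
To make the warp step concrete: a homography is a 3×3 matrix H that maps a pixel (x, y) to a new position through homogeneous coordinates. The pure-Python sketch below shows that mapping for a single point. It illustrates the general technique only and is not taken from demo.py; the function name `apply_homography` is hypothetical, and the example matrix is arbitrary rather than one estimated from a1.png and a2.png.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography given as nested lists."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return xh / w, yh / w


# Example: a pure translation by (10, 20) pixels.
H_shift = [[1, 0, 10],
           [0, 1, 20],
           [0, 0, 1]]
print(apply_homography(H_shift, 1, 2))  # (11.0, 22.0)
```

Scaling H by any nonzero constant leaves the result unchanged, which is why a homography has 8 degrees of freedom. Warping a whole image applies this kind of mapping to every pixel.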

There are more images in the assets/demo folder that you can try.

📌 Citation

If the paper and code from gim help your research, please cite our paper ❤️. If you find this repository useful, giving it a star ⭐️ would also be a wonderful way to support our work. Thank you very much.

@inproceedings{
xuelun2024gim,
title={GIM: Learning Generalizable Image Matcher From Internet Videos},
author={Xuelun Shen and Zhipeng Cai and Wei Yin and Matthias Müller and Zijun Li and Kaixuan Wang and Xiaozhi Chen and Cheng Wang},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024}
}

License

This repository is under the MIT License. This content/model is provided here for research purposes only. Any use beyond this is your sole responsibility and subject to your securing the necessary rights for your purpose.

gim's Issues

GIM implement in hloc?

Thank you all for the amazing work!
As you said in #3: “The code for SfM is based on hloc. In detail, we have imitated the LoFTR in hloc to implement the SfM for DKM.” To support GIM, hloc needs a new interface in hloc/extractors/ or hloc/matchers/. Could you release the code for that?

Hugging Face is in maintenance

Hi guys, thanks for your nice work! It seems that the Hugging Face server is down and we can no longer try the online demo. Is there any way to fix this?

python demo.py
/home/yuanweizhong/anaconda3/envs/gim/lib/python3.8/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension: libc10_cuda.so: cannot open shared object file: No such file or directory
  warn(f"Failed to load image Python extension: {e}")
Traceback (most recent call last):
  File "demo.py", line 316, in <module>
    state_dict = torch.load(checkpoints_path, map_location='cpu')
  File "/home/yuanweizhong/anaconda3/envs/gim/lib/python3.8/site-packages/torch/serialization.py", line 608, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/yuanweizhong/anaconda3/envs/gim/lib/python3.8/site-packages/torch/serialization.py", line 777, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_arg

ZEB release timeline?

Hi, great work!

I'm wondering if you have a timeline for the release of your proposed benchmark.
Personally I think this is more important than releasing the training code.

I've personally had some issues even downloading the related training sets (e.g. GL3D seems to be down now?).

Maybe authors of those datasets could allow you to bundle only the data used for the benchmark into an easy download (perhaps with some LICENSE restrictions)?
