This project forked from autonomousvision/unimatch

[TPAMI'23] Unifying Flow, Stereo and Depth Estimation

Home Page: https://haofeixu.github.io/unimatch/

License: MIT License

Unifying Flow, Stereo and Depth Estimation

Haofei Xu · Jing Zhang · Jianfei Cai · Hamid Rezatofighi · Fisher Yu · Dacheng Tao · Andreas Geiger

TPAMI 2023

A unified model for three motion and 3D perception tasks.

We achieve 1st place on the Sintel (clean), Middlebury (rms metric) and Argoverse benchmarks.

This project is developed based on our previous works:

Installation

Our code was developed with PyTorch 1.9.0, CUDA 10.2 and Python 3.8. Newer PyTorch versions should also work.

We recommend using conda for installation:

conda env create -f conda_environment.yml
conda activate unimatch

Alternatively, we also support installing with pip:

bash pip_install.sh
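
After either install route, a quick check can confirm that the active environment is at least as new as the versions listed above. The following is a minimal sketch, not part of the repository: the names `REFERENCE`, `parse_version` and `report_environment` are hypothetical, and the script only reports what it finds without changing anything.

```python
import importlib.util
import re
import sys

# Versions the README reports the code was developed with.
REFERENCE = {"python": (3, 8), "torch": (1, 9, 0)}


def parse_version(text):
    """Turn a string such as '1.9.0+cu102' into (1, 9, 0)."""
    core = text.split("+")[0]  # drop a local build suffix like '+cu102'
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)  # keep only the leading digits
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def report_environment():
    """Return human-readable lines describing Python and PyTorch versions."""
    lines = []
    py = sys.version_info[:3]
    status = "ok" if py[:2] >= REFERENCE["python"] else "older than reference"
    lines.append(f"python {'.'.join(map(str, py))}: {status}")

    if importlib.util.find_spec("torch") is None:
        lines.append("torch: not installed")
        return lines

    import torch  # imported lazily so the check also runs without PyTorch

    installed = parse_version(torch.__version__)
    status = "ok" if installed >= REFERENCE["torch"] else "older than reference"
    lines.append(f"torch {torch.__version__}: {status}")
    lines.append(
        f"CUDA build: {torch.version.cuda}, available: {torch.cuda.is_available()}"
    )
    return lines


if __name__ == "__main__":
    print("\n".join(report_environment()))
```

Run it inside the activated `unimatch` environment. An "older than reference" line only means the installed version predates the one used during development, not that the code is known to fail.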

Model Zoo

A large number of pretrained models with different speed-accuracy trade-offs for flow, stereo and depth are available at MODEL_ZOO.md.

We assume the downloaded weights are located under the pretrained directory.

Otherwise, you may need to change the corresponding paths in the scripts.
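
Before running any script, it can help to verify that the weights are where the scripts expect them. The sketch below is illustrative only: `find_weights` is a hypothetical helper, and the `.pth` suffix is an assumption about how the downloaded checkpoints are named, so adjust it if your files differ.

```python
from pathlib import Path


def find_weights(directory="pretrained", suffix=".pth"):
    """Return checkpoint files under `directory`, sorted by path."""
    root = Path(directory)
    if not root.is_dir():
        return []
    # rglob also picks up weights placed in subdirectories.
    return sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())


if __name__ == "__main__":
    weights = find_weights()
    if not weights:
        print("No weights found under 'pretrained'; "
              "download them or edit the paths in the scripts.")
    for path in weights:
        size_mb = path.stat().st_size / 1e6
        print(f"{path}  ({size_mb:.1f} MB)")
```

Run it from the repository root so that the relative `pretrained` path resolves as the scripts assume.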

Demo

Given an image pair or a video sequence, our code supports generating prediction results of optical flow, disparity and depth.

Please refer to scripts/gmflow_demo.sh, scripts/gmstereo_demo.sh and scripts/gmdepth_demo.sh for example usage.

Datasets

The datasets used to train and evaluate our models for all three tasks are listed in DATASETS.md.

Evaluation

The evaluation scripts used to reproduce the numbers in our paper are given in scripts/gmflow_evaluate.sh, scripts/gmstereo_evaluate.sh and scripts/gmdepth_evaluate.sh.

For submission to KITTI, Sintel, Middlebury and ETH3D online test sets, you can run scripts/gmflow_submission.sh and scripts/gmstereo_submission.sh to generate the prediction results. The results can be submitted directly.

Training

All training scripts for different model variants on different datasets can be found in scripts/*_train.sh.
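
To see which training scripts a checkout contains, the glob pattern above can be expanded programmatically. This is a small sketch, assuming it is run from the repository root; `list_training_scripts` is a hypothetical helper and makes no assumption about the individual script names.

```python
from pathlib import Path


def list_training_scripts(scripts_dir="scripts"):
    """Return all files matching *_train.sh in `scripts_dir`, sorted by name."""
    root = Path(scripts_dir)
    if not root.is_dir():
        return []
    return sorted(root.glob("*_train.sh"))


if __name__ == "__main__":
    for script in list_training_scripts():
        print(script)
```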

We support TensorBoard for monitoring and visualizing the training process. First start a TensorBoard session with

tensorboard --logdir checkpoints

and then access http://localhost:6006 in your browser.

Citation

@article{xu2023unifying,
  title={Unifying Flow, Stereo and Depth Estimation},
  author={Xu, Haofei and Zhang, Jing and Cai, Jianfei and Rezatofighi, Hamid and Yu, Fisher and Tao, Dacheng and Geiger, Andreas},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2023}
}

This work is a substantial extension of our previous conference paper GMFlow (CVPR 2022, Oral). Please consider citing GMFlow as well if you find this work useful in your research.

@inproceedings{xu2022gmflow,
  title={GMFlow: Learning Optical Flow via Global Matching},
  author={Xu, Haofei and Zhang, Jing and Cai, Jianfei and Rezatofighi, Hamid and Tao, Dacheng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={8121-8130},
  year={2022}
}

Acknowledgements

This project would not have been possible without relying on some awesome repos: RAFT, LoFTR, DETR, Swin, mmdetection and Detectron2. We thank the original authors for their excellent work.
