
[CVPR2024] The official implementation of "MoCha-Stereo: Motif Channel Attention Network for Stereo Matching".

License: MIT License


MoCha-Stereo 抹茶算法


V1 Version

     

MoCha-Stereo: Motif Channel Attention Network for Stereo Matching
Ziyang Chen†, Wei Long†, He Yao†, Yongjun Zhang✱, Bingshu Wang, Yongbin Qin, Jia Wu
CVPR 2024
Correspondence: [email protected]; [email protected]

@inproceedings{chen2024mocha,
  title={MoCha-Stereo: Motif Channel Attention Network for Stereo Matching},
  author={Chen, Ziyang and Long, Wei and Yao, He and Zhang, Yongjun and Wang, Bingshu and Qin, Yongbin and Wu, Jia},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={27768--27777},
  year={2024}
}

Requirements

Python = 3.8

CUDA = 11.3

conda create -n mocha python=3.8
conda activate mocha
pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113

The following libraries are also required:

tqdm
tensorboard
opt_einsum
einops
scipy
imageio
opencv-python-headless
scikit-image
timm
six
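
Before training or evaluating, it can help to confirm that every library listed above is importable. The following is a minimal sketch, not part of the repository: the helper name missing_packages is hypothetical, and the mapping assumes the usual import names (cv2 for opencv-python-headless, skimage for scikit-image).

```python
import importlib.util

# Map each pip package name from the list above to the module name it is
# imported under. Only cv2 and skimage differ from the pip name.
REQUIRED = {
    "tqdm": "tqdm",
    "tensorboard": "tensorboard",
    "opt_einsum": "opt_einsum",
    "einops": "einops",
    "scipy": "scipy",
    "imageio": "imageio",
    "opencv-python-headless": "cv2",
    "scikit-image": "skimage",
    "timm": "timm",
    "six": "six",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found (hypothetical helper)."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Suggest a single pip command covering everything that is absent.
        print("Missing packages; install with: pip install " + " ".join(missing))
    else:
        print("All listed packages are importable.")
```

Run it inside the activated mocha environment; it only checks that the modules can be located and does not verify versions.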

Dataset

To evaluate or train MoCha-Stereo, you will need to download the required datasets.

By default, stereo_datasets.py searches for the datasets in the locations shown below. If the datasets were downloaded elsewhere, you can create symbolic links to them inside the datasets folder.

├── datasets
    ├── FlyingThings3D
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── disparity
    ├── Monkaa
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── disparity
    ├── Driving
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── disparity
    ├── KITTI
        ├── KITTI_2015
            ├── testing
            ├── training
        ├── KITTI_2012
            ├── testing
            ├── training
    ├── Middlebury
        ├── MiddEval3
    ├── ETH3D
        ├── two_view_training
        ├── two_view_training_gt
        ├── two_view_testing
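
The layout above can be set up and checked with a short script. This is a sketch under assumptions rather than repository code: link_dataset and missing_dirs are hypothetical helpers, the default root "datasets" is taken from the tree above, and creating symbolic links may require extra privileges on Windows.

```python
import os

# Expected sub-directories relative to the datasets root, copied from the tree above.
EXPECTED = [
    "FlyingThings3D/frames_cleanpass",
    "FlyingThings3D/frames_finalpass",
    "FlyingThings3D/disparity",
    "Monkaa/frames_cleanpass",
    "Monkaa/frames_finalpass",
    "Monkaa/disparity",
    "Driving/frames_cleanpass",
    "Driving/frames_finalpass",
    "Driving/disparity",
    "KITTI/KITTI_2015/testing",
    "KITTI/KITTI_2015/training",
    "KITTI/KITTI_2012/testing",
    "KITTI/KITTI_2012/training",
    "Middlebury/MiddEval3",
    "ETH3D/two_view_training",
    "ETH3D/two_view_training_gt",
    "ETH3D/two_view_testing",
]


def link_dataset(src, name, root="datasets"):
    """Symlink an already-downloaded dataset directory into the datasets folder."""
    os.makedirs(root, exist_ok=True)
    dst = os.path.join(root, name)
    if not os.path.lexists(dst):  # leave an existing directory or link untouched
        os.symlink(os.path.abspath(src), dst, target_is_directory=True)
    return dst


def missing_dirs(root="datasets", expected=EXPECTED):
    """Return the expected sub-directories that do not exist under root."""
    return [p for p in expected if not os.path.isdir(os.path.join(root, p))]
```

For example, link_dataset("/data/ETH3D", "ETH3D") followed by print(missing_dirs()) lists whatever is still absent; the path /data/ETH3D is only a placeholder for wherever the dataset was downloaded.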

Training

python train_stereo.py --batch_size 8 --mixed_precision

Evaluation

To evaluate a trained model on a validation set (e.g. Middlebury full resolution), run

python evaluate_stereo.py --restore_ckpt models/mocha-stereo.pth --dataset middlebury_F

The pretrained weights are available here.

FAQ (Same question asked ≥ 3 times)

Q1. Where are the weights for "tf_efficientnetv2_l"?

A1: Please refer to issue #6 "Question about the tf_efficientnetv2_l checkpoint" (关于tf_efficientnetv2_l检查点的问题), #8 "Pretrained weights" (预训练权重), and #9 "code error".

Todo List

  • [CVPR2024] V1 version
    • Paper
    • Code of MoCha-Stereo
  • V2 version
    • Preprint manuscript
    • Code of MoCha-V2

Acknowledgements

  • This project borrows code from IGEV, RAFT-Stereo, and GwcNet. We thank the original authors for their excellent work!
  • We are grateful to Prof. Wenting Li, Prof. Huamin Qu, Dr. Junda Cheng, Mr./Mrs. "DLUTTengYH", and the anonymous reviewers for their comments on "MoCha-Stereo: Motif Channel Attention Network for Stereo Matching" (V1 version of MoCha-Stereo).
  • This project is supported by the Science and Technology Planning Project of Guizhou Province, Department of Science and Technology of Guizhou Province, China (Project No. [2023]159).
  • This project is supported by the Natural Science Research Project of Guizhou Provincial Department of Education, China (QianJiaoJi[2022]029, QianJiaoHeKY[2021]022).


mocha-stereo's Issues

code error

  File "MoCha-Stereo/evaluate_stereo.py", line 14, in <module>
    from mocha_stereo import Mocha, autocast
  File "MoCha-Stereo/core/mocha_stereo.py", line 5, in <module>
    from core.extractor import MultiBasicEncoder, Feature, SpatialInfEncoder
ImportError: cannot import name 'SpatialInfEncoder' from 'core.extractor' (MoCha-Stereo/core/extractor.py)

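How does MCA pay more attention to edge information?

In the paper you write f_pre = F(f - G(f)), where G is a high-pass filter, so only high-frequency information, that is, the edges, should pass through it. In that case f - G(f) sets the edge information to 0 and leaves the low-frequency information unchanged, so when summing, the response of the high-frequency information should be lower. Why, then, does the module pay more attention to repeated geometric structures? I do not quite understand this point. Could you please explain? Many thanks!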
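
The premise of this question can be checked on a toy signal. The sketch below is not the paper's implementation and does not answer the question. It assumes a 1-D step edge and a hypothetical high-pass filter G defined as the signal minus its 3-tap moving average. The names box_blur and high_pass are illustrative only.

```python
def box_blur(signal):
    """3-tap moving average with edge replication: a simple low-pass filter."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3 for i in range(len(signal))]


def high_pass(signal):
    """Hypothetical stand-in for G: the signal minus its low-pass component."""
    return [s - b for s, b in zip(signal, box_blur(signal))]


def max_jump(signal):
    """Largest absolute difference between neighbouring samples."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))


f = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]        # step edge between index 2 and 3
g = high_pass(f)                           # non-zero only next to the edge
residual = [a - b for a, b in zip(f, g)]   # f - G(f)

print("G(f)     :", [round(v, 3) for v in g])
print("f - G(f) :", [round(v, 3) for v in residual])
print("edge jump before/after:", max_jump(f), round(max_jump(residual), 3))
```

With this particular G, f - G(f) is simply the blurred signal, so the edge is softened rather than removed. Whether the same holds for the filter used in the paper depends on how G is defined there.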

License

Hello,

Thank you for your great work and for publishing the code for your model. Could you please add a license so that it is clear how the code in this repository can be used? You state that you borrowed code from IGEV, RAFT-Stereo, and GwcNet. All of them use the MIT license, so this might be a good choice, but it is, of course, at your discretion.

Thank you!
