
Self-Supervised Correspondence and Optimization-Based Scene Flow (CVPR 2023)

Home Page: https://itailang.github.io/SCOOP/

License: Other

Cuda 4.65% C++ 0.84% Python 90.90% Shell 3.61%
correspondence deep-learning point-cloud pytorch scene-understanding run-time-optimization

scoop's Introduction

SCOOP: Self-Supervised Correspondence and Optimization-Based Scene Flow

[Project Page] [Paper] [Video] [Slides] [Poster]

Created by Itai Lang^1,2, Dror Aiger^2, Forrester Cole^2, Shai Avidan^1, and Michael Rubinstein^2.
^1Tel Aviv University   ^2Google Research

[Figure: SCOOP scene flow estimation result]

Abstract

Scene flow estimation is a long-standing problem in computer vision, where the goal is to find the 3D motion of a scene from its consecutive observations. Recently, there have been efforts to compute the scene flow from 3D point clouds. A common approach is to train a regression model that consumes source and target point clouds and outputs the per-point translation vector. An alternative is to learn point matches between the point clouds concurrently with regressing a refinement of the initial correspondence flow. In both cases, the learning task is very challenging since the flow regression is done in the free 3D space, and a typical solution is to resort to a large annotated synthetic dataset.

We introduce SCOOP, a new method for scene flow estimation that can be learned on a small amount of data without employing ground-truth flow supervision. In contrast to previous work, we train a pure correspondence model focused on learning point feature representation and initialize the flow as the difference between a source point and its softly corresponding target point. Then, in the run-time phase, we directly optimize a flow refinement component with a self-supervised objective, which leads to a coherent and accurate flow field between the point clouds. Experiments on widespread datasets demonstrate the performance gains achieved by our method compared to existing leading techniques while using a fraction of the training data.
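
The flow initialization described above can be made concrete with a small sketch. The following is an illustrative, minimal PyTorch version of soft-correspondence flow initialization, not the repository's actual implementation; the function name, tensor shapes, and temperature value are assumptions for the example.

import torch

def soft_correspondence_flow(xyz1, xyz2, feat1, feat2, temperature=0.03):
    # Illustrative sketch (hypothetical helper, not SCOOP's exact code).
    # xyz1:  (B, N, 3) source points;  xyz2:  (B, M, 3) target points
    # feat1: (B, N, C) source features; feat2: (B, M, C) target features
    f1 = torch.nn.functional.normalize(feat1, dim=-1)
    f2 = torch.nn.functional.normalize(feat2, dim=-1)
    similarity = torch.bmm(f1, f2.transpose(1, 2))             # (B, N, M) cosine similarity
    weights = torch.softmax(similarity / temperature, dim=-1)  # soft matching over target points
    soft_targets = torch.bmm(weights, xyz2)                    # softly corresponding target points
    return soft_targets - xyz1                                 # initial flow per source point

At run time, SCOOP keeps such a correspondence-based initialization and optimizes only a flow refinement component with a self-supervised objective.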

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{lang2023scoop,
  author = {Lang, Itai and Aiger, Dror and Cole, Forrester and Avidan, Shai and Rubinstein, Michael},
  title = {{SCOOP: Self-Supervised Correspondence and Optimization-Based Scene Flow}},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages = {5281--5290},
  year = {2023}
}

Installation

The code has been tested with Python 3.6.13, PyTorch 1.6.0, CUDA 10.1, and cuDNN 7.6.5 on Ubuntu 16.04.

Clone this repository:

git clone https://github.com/itailang/SCOOP.git
cd SCOOP/

Create a conda environment:

# create and activate a conda environment
conda create -n scoop python=3.6.13 --yes
conda activate scoop

Install required packages:

sh install_environment.sh

Compile the Chamfer Distance op, implemented by Groueix et al. The op is located under the auxiliary/ChamferDistancePytorch/chamfer3D folder. The following compilation script uses a CUDA 10.1 path. If needed, modify the script to point to your CUDA path. Then, use:

sh compile_chamfer_distance_op.sh

The compilation results should be created under the auxiliary/ChamferDistancePytorch/chamfer3D/build folder.
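
If the compiled op is unavailable, or you want a quick sanity check of the environment, a Chamfer-style distance can also be computed with plain PyTorch. The following is a slow reference sketch using torch.cdist, not the compiled CUDA op from the repository:

import torch

def chamfer_distance_reference(pc1, pc2):
    # Slow reference Chamfer distance for sanity checks.
    # pc1: (B, N, 3), pc2: (B, M, 3). Returns the sum of mean squared
    # nearest-neighbor distances in both directions.
    dist = torch.cdist(pc1, pc2) ** 2       # (B, N, M) squared distances
    d1 = dist.min(dim=2).values.mean()      # pc1 -> pc2
    d2 = dist.min(dim=1).values.mean()      # pc2 -> pc1
    return d1 + d2

if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    a = torch.rand(1, 2048, 3, device=device)
    b = torch.rand(1, 2048, 3, device=device)
    print(chamfer_distance_reference(a, b).item())

Note that torch.cdist materializes the full N x M distance matrix, so this reference is only suitable for small point clouds; the compiled CUDA op is the efficient path for larger inputs.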

Usage

Data

Create folders for the data:

mkdir ./data/
mkdir ./data/FlowNet3D/
mkdir ./data/HPLFlowNet/

We use the point cloud data version prepared by Liu et al. in their work FlowNet3D. Please follow their code to acquire the data.

  • Put the preprocessed FlyingThings3D dataset at ./data/FlowNet3D/data_processed_maxcut_35_20k_2k_8192/. This data is denoted as FT3Do.
  • Put the preprocessed KITTI dataset at ./data/FlowNet3D/kitti_rm_ground/. This dataset is denoted as KITTIo, and its subsets are denoted as KITTIv and KITTIt.

We also use the point cloud data version prepared by Gu et al. in their work HPLFlowNet. Please follow their code to acquire the data.

  • Put the preprocessed FlyingThings3D dataset at ./data/HPLFlowNet/FlyingThings3D_subset_processed_35m/. This dataset is denoted as FT3Ds.
  • Put the preprocessed KITTI dataset at ./data/HPLFlowNet/KITTI_processed_occ_final/. This dataset is denoted as KITTIs.

Note that you may put the data elsewhere and create a symbolic link to the actual location. For example:

ln -s /path/to/the/actual/dataset/location ./data/FlowNet3D/data_processed_maxcut_35_20k_2k_8192  
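
As a quick sanity check that the datasets (or symbolic links) ended up in the expected locations, a small script along these lines can be used; the folder names are the ones listed above:

from pathlib import Path

# Expected dataset folders, as listed above
expected_dirs = [
    "./data/FlowNet3D/data_processed_maxcut_35_20k_2k_8192",   # FT3Do
    "./data/FlowNet3D/kitti_rm_ground",                        # KITTIo
    "./data/HPLFlowNet/FlyingThings3D_subset_processed_35m",   # FT3Ds
    "./data/HPLFlowNet/KITTI_processed_occ_final",             # KITTIs
]

for d in expected_dirs:
    status = "OK" if Path(d).is_dir() else "MISSING"
    print(f"{status:8s} {d}")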

Training and Evaluation

Switch to the scripts folder:

cd ./scripts

FT3Do / KITTIo

To train a model on 1,800 examples from the train set of FT3Do, run the following command:

sh train_on_ft3d_o.sh

Evaluate this model on KITTIo with 2,048 points per point cloud using the following command:

sh evaluate_on_kitti_o.sh

The results will be saved to the file ./experiments/ft3d_o_1800_examples/log_evaluation_kitti_o.txt.

Evaluate this model on KITTIo with all the points in the point clouds using the following command:

sh evaluate_on_kitti_o_all_point.sh

The results will be saved to the file ./experiments/ft3d_o_1800_examples/log_evaluation_kitti_o_all_points.txt.
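
The evaluation logs contain scene flow error metrics. For reference, the metrics commonly reported on these benchmarks (end-point error, strict and relaxed accuracy, and outlier ratio) can be computed from predicted and ground-truth flow as in the sketch below; the thresholds are the ones standard in prior work (e.g., FlowNet3D, HPLFlowNet), not values read from SCOOP's scripts:

import torch

def scene_flow_metrics(pred_flow, gt_flow):
    # pred_flow, gt_flow: (N, 3) per-point flow vectors.
    error = torch.norm(pred_flow - gt_flow, dim=-1)    # per-point end-point error (EPE3D)
    gt_norm = torch.norm(gt_flow, dim=-1)
    rel_error = error / (gt_norm + 1e-4)                # relative error
    return {
        "EPE3D": error.mean().item(),
        "AccS": ((error < 0.05) | (rel_error < 0.05)).float().mean().item(),
        "AccR": ((error < 0.1) | (rel_error < 0.1)).float().mean().item(),
        "Outliers": ((error > 0.3) | (rel_error > 0.1)).float().mean().item(),
    }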

KITTIv / KITTIt

To train a model on the 100 examples of KITTIv, run the following command:

sh train_on_kitti_v.sh

Evaluate this model on KITTIt with 2,048 points per point cloud using the following command:

sh evaluate_on_kitti_t.sh

The results will be saved to the file ./experiments/kitti_v_100_examples/log_evaluation_kitti_t.txt.

Evaluate this model on KITTIt with all the points in the point clouds using the following command:

sh evaluate_on_kitti_t_all_points.sh

The results will be saved to the file ./experiments/kitti_v_100_examples/log_evaluation_kitti_t_all_points.txt.

FT3Ds / KITTIs, FT3Ds / FT3Ds

To train a model on 1,800 examples from the train set of FT3Ds, run the following command:

sh train_on_ft3d_s.sh

Evaluate this model on KITTIs with 8,192 points per point cloud using the following command:

sh evaluate_on_kitti_s.sh

The results will be saved to the file ./experiments/ft3d_s_1800_examples/log_evaluation_kitti_s.txt.

Evaluate this model on the test set of FT3Ds with 8,192 points per point cloud using the following command:

sh evaluate_on_ft3d_s.sh

The results will be saved to the file ./experiments/ft3d_s_1800_examples/log_evaluation_ft3d_s.txt.

Visualization

First, save results for visualization by adding the flag --save_pc_res 1 when running an evaluation script, for example, the script for evaluating on KITTIt. The results will be saved to the folder ./experiments/kitti_v_100_examples/pc_res/.

Then, select the scene index that you would like to visualize and run the visualization script. For example, to visualize scene index #1 from KITTIt:

python visualize_scoop.py --res_dir ./../experiments/kitti_v_100_examples/pc_res --res_idx 1 

The visualizations will be saved to the folder ./experiments/kitti_v_100_examples/pc_res/vis/.
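
If you prefer to render the saved point clouds yourself, a sketch like the following plots a source/target pair together with the estimated flow using matplotlib. The file name and array keys here are hypothetical placeholders; the actual format written under pc_res/ is defined by the evaluation script:

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (needed on older matplotlib)

# Hypothetical file name and keys; adjust to the actual contents of pc_res/.
data = np.load("./experiments/kitti_v_100_examples/pc_res/scene_0001.npz")
pc1, pc2, flow = data["pc1"], data["pc2"], data["est_flow"]

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection="3d")
ax.scatter(pc1[:, 0], pc1[:, 1], pc1[:, 2], s=1, c="tab:blue", label="source")
ax.scatter(pc2[:, 0], pc2[:, 1], pc2[:, 2], s=1, c="tab:red", label="target")
warped = pc1 + flow
ax.scatter(warped[:, 0], warped[:, 1], warped[:, 2], s=1, c="tab:green", label="source + flow")
ax.legend()
plt.savefig("scene_0001_flow.png", dpi=200)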

Evaluation with Pretrained Models

First, download our pretrained models with the following command:

bash download_pretrained_models.sh

The models (about 2 MB) will be saved under the pretrained_models folder in the following structure:

pretrained_models/
├── ft3d_o_1800_examples/model_e100.tar
├── ft3d_s_1800_examples/model_e060.tar
├── kitti_v_100_examples/model_e400.tar

Then, use the evaluation commands mentioned in the Training and Evaluation section above, after changing the experiments folder in the evaluation scripts to the pretrained_models folder.
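
To quickly inspect a downloaded checkpoint before wiring it into the scripts, a minimal check like the following loads it on the CPU and prints its top-level keys; the contents of the archive depend on how the model was saved and are not assumed here:

import torch

# Path taken from the structure listed above
ckpt_path = "pretrained_models/kitti_v_100_examples/model_e400.tar"
checkpoint = torch.load(ckpt_path, map_location="cpu")

# A checkpoint saved with torch.save() is typically a dict;
# list its top-level keys to see what it contains.
if isinstance(checkpoint, dict):
    print(list(checkpoint.keys()))
else:
    print(type(checkpoint))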

License

This project is licensed under the terms of the MIT license (see the LICENSE file for more details).

Acknowledgment

Our code builds upon the code provided by Puy et al., Groueix et al., Liu et al., and Gu et al. We thank the authors for sharing their code.

scoop's Issues

Test sample indices differ from Rigid3DSceneFlow

Hi, thank you for releasing the code.
I find that the stereo-kitti test sample indices differ from those of Rigid3DSceneFlow's stereo-kitti and lidar-kitti. Could you please elaborate? Also, could you please confirm which KITTI dataset is adopted in the evaluation, stereo-kitti or lidar-kitti? I find it hard to distinguish the two.
Thank you!

Question regarding coord system and ego correction

Hi, thank you for open-sourcing the repo! I have a few questions regarding the coordinate system and ego-motion correction. I have made a visualization (attached below) for the provided kitti_v dataset. The dataset seems to have nonzero GT flow for almost all points (including some static points like walls).

  1. Which coordinate system do you use for the KITTI dataset?
  2. Is there a design reason for not applying ego-motion correction before training SCOOP?
  3. Which ground-removal algorithm are you using? What other preprocessing have you applied?

Thank you so much!

[Figure: attached visualization of the kitti_v data]

Evaluation Concern Part 2

Continuing issue #3,

You compare against models that were evaluated using either 2,048 or 8,192 points, whereas your model is evaluated on the whole point cloud but runs inference on chunks of 2,048 points.

If we consider all the points for evaluation, won't our results be better by default? If we don't take chunks and just input a randomly sampled pair of 2,048-point clouds, will the results be the same as reported?

Evaluation Concern

In your evaluation file, you divide the point cloud into chunks of 2,048 points, say with batch size b. Then you estimate the flow for all these batches independently and concatenate the flow results. The full point cloud is evaluated, right? Not just a randomly sampled 2,048 points?

Please correct me if I am wrong.

Question on details with code and paper

Thanks for your work and for making it open source. I have some questions from reading the paper and reviewing the code. Looking forward to your reply.

  1. The paper's Matching Cost section says that C is limited to a distance of 10 m, and that beyond 10 m the cost is infinity. But the code does not change C itself (or the returned S); the 10 m limit only enters through the support mask applied to K.
    Here is the code part:

    SCOOP/tools/ot.py

    Lines 33 to 49 in 9f41e8b

    # Squared l2 distance between points points of both point clouds
    distance_matrix = torch.sum(pcloud1 ** 2, -1, keepdim=True)
    distance_matrix = distance_matrix + torch.sum(
        pcloud2 ** 2, -1, keepdim=True
    ).transpose(1, 2)
    distance_matrix = distance_matrix - 2 * torch.bmm(pcloud1, pcloud2.transpose(1, 2))
    # Force transport to be zero for points further than 10 m apart
    support = (distance_matrix < 10 ** 2).float()
    # Transport cost matrix
    feature1 = feature1 / torch.sqrt(torch.sum(feature1 ** 2, -1, keepdim=True) + 1e-8)
    feature2 = feature2 / torch.sqrt(torch.sum(feature2 ** 2, -1, keepdim=True) + 1e-8)
    S = torch.bmm(feature1, feature2.transpose(1, 2))
    C = 1.0 - S
    # Entropic regularisation
    K = torch.exp(-C / epsilon) * support


  2. The scripts at https://github.com/itailang/SCOOP/tree/master/scripts show that the maximum number of points is 8,192. When I tried raw points from a high-channel LiDAR, the memory could not be allocated, especially for the BxNxM distance matrix. Did you try with around 107,000 points?

Learning curve and hyperparameter tuning on sparse data

Hi, thanks for the great work! I have a few questions regarding hyperparameter tuning:

  1. Why do the train_corr_conf_loss and val_corr_conf_loss increase while the other three losses decrease over time when training with the default settings on kitti_v? Is this expected? Can you provide the learning curve on your side as a reference?
  2. I am trying to adapt SCOOP to the nuScenes dataset. The learning curve seems to converge much more slowly and reach much higher absolute loss values than on KITTI. The nuScenes dataset is sparser and has many static points with zero GT flow. Do you have any suggestions for adjusting --nb_neigh_smooth_flow, --nb_neigh_cross_recon, or any other hyperparameters accordingly? Should --nb_neigh_cross_recon always be double the value of --nb_neigh_smooth_flow?

[Figure: attached learning curves]
