
vincentfung13 / mine

406 stars · 16 watchers · 43 forks · 7.41 MB

Code and models for our ICCV 2021 paper "MINE: Towards Continuous Depth MPI with NeRF for Novel View Synthesis"

License: MIT License

Python 98.64% Shell 1.36%
deep-learning novel-view-synthesis nerf depth-estimation 3d-reconstruction computer-vision 3d-vision

mine's People

Contributors: vincentfung13

mine's Issues

KITTI training code

Hi,
I was wondering whether you plan to release the KITTI training code at some point.
Apart from this, are the released model checkpoints all pretrained on ImageNet? Thanks!

Question about KITTI raw dataset

Hi,
Thanks for sharing your work! I am wondering when you will release the dataset pipeline for KITTI Raw and the other datasets.
By the way, how should the network be evaluated on each dataset? And what is the reported performance on the LLFF dataset? I couldn't find it in the paper.
Thanks!

Correspondence of formula and code (torch.cumprod)

Dear authors,
Thanks for your impressive work. I found the operation torch.cumprod in the code:

```python
def plane_volume_rendering(rgb_BS3HW, sigma_BS1HW, xyz_BS3HW, is_bg_depth_inf):
    ...
    transparency_acc = torch.cumprod(transparency + 1e-6, dim=1)  # BxSx1xHxW
```

However, I can't find an equation containing a cumulative product in the paper "MINE: ...". Which formula does this operation correspond to?
Thanks a lot.
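For readers with the same question: the cumulative product is the discrete accumulated transmittance from the volume-rendering equation, T_s = ∏_{j≤s} exp(−σ_j δ_j); papers often write the equivalent form exp(−Σ_j σ_j δ_j), so no explicit cumprod appears in print. A minimal sketch of the correspondence, with illustrative names rather than the repo's exact code:

```python
import torch

def accumulated_transmittance(sigma_BS1HW: torch.Tensor, dist_BS1HW: torch.Tensor) -> torch.Tensor:
    # Per-plane transparency: exp(-sigma_s * delta_s) for each of the S planes
    transparency = torch.exp(-sigma_BS1HW * dist_BS1HW)      # BxSx1xHxW
    # Accumulated transmittance T_s = prod_{j<=s} transparency_j -- a running
    # product over the plane dimension, which is exactly torch.cumprod
    return torch.cumprod(transparency + 1e-6, dim=1)         # BxSx1xHxW
```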

Hyperparameters for training

Hello,
I would like to congratulate you on such great work!

Are the hyperparameters for the kitti_raw dataset included in params_kitti_raw.yaml the same ones used to reach the results in the paper or should they be changed?

Training on multiple images per scene

Hi,

I noticed in your code that there is an option to train MINE with multiple images as input. In that case, there is no scale ambiguity, right? Can you give an example of a data-loader for that case?
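For readers looking for a starting point: below is a minimal, hypothetical sketch of a multi-view-per-scene loader. The field names and tensor layout are illustrative assumptions, not the repo's actual interface.

```python
import torch
from torch.utils.data import Dataset

class MultiViewSceneDataset(Dataset):
    """Hypothetical loader returning several posed views per scene.

    Each scene dict is assumed to hold:
      images:     float array, N x H x W x 3, in [0, 1]
      intrinsics: float array, N x 3 x 3
      extrinsics: float array, N x 4 x 4 (camera-from-world)
    """

    def __init__(self, scenes):
        self.scenes = scenes

    def __len__(self):
        return len(self.scenes)

    def __getitem__(self, idx):
        s = self.scenes[idx]
        return {
            "images": torch.as_tensor(s["images"]).permute(0, 3, 1, 2).float(),  # N x 3 x H x W
            "K": torch.as_tensor(s["intrinsics"]).float(),
            "G_cam_world": torch.as_tensor(s["extrinsics"]).float(),
        }
```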

How to prepare my dataset?

Hi,
thanks for your great work!
If I want to train on my own data, how should I process it?
I see the LLFF data has cameras.bin, images.bin, points3D.bin, etc. How do I generate these files?
Could you share the code for that?
Thanks.
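For context, cameras.bin, images.bin, and points3D.bin are COLMAP's sparse-reconstruction outputs; LLFF-style datasets are typically produced by running COLMAP over the captured images. A minimal sketch of the generic COLMAP CLI pipeline, driven from Python (paths are placeholders; this is not the repo's own preprocessing script):

```python
import os
import subprocess

def run_colmap(image_dir: str, workspace: str) -> None:
    """Generic COLMAP sparse reconstruction; writes cameras.bin, images.bin,
    and points3D.bin under <workspace>/sparse/0."""
    db = os.path.join(workspace, "database.db")
    sparse = os.path.join(workspace, "sparse")
    os.makedirs(sparse, exist_ok=True)
    # 1. Detect SIFT features in every image
    subprocess.run(["colmap", "feature_extractor",
                    "--database_path", db, "--image_path", image_dir], check=True)
    # 2. Match features across all image pairs
    subprocess.run(["colmap", "exhaustive_matcher", "--database_path", db], check=True)
    # 3. Incremental sparse reconstruction (bundle adjustment included)
    subprocess.run(["colmap", "mapper", "--database_path", db,
                    "--image_path", image_dir, "--output_path", sparse], check=True)
```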

Reproducibility Discrepancy

I've been trying to reproduce your results on the KITTI Raw dataset using the published code, and I used the code here to create the same splits and preprocessing described in the paper. I ran an evaluation using the pretrained KITTI weights (32 layers), but I got the following results, which don't align with those in the paper.

```
[2021-11-10 18:40:17,526 synthesis_task_kitti.py] Evaluation finished, average losses:
[2021-11-10 18:40:17,531 synthesis_task_kitti.py] val_loss_rgb_src 0.011722 (0.013352)
[2021-11-10 18:40:17,531 synthesis_task_kitti.py] val_loss_ssim_src 0.018813 (0.022328)
[2021-11-10 18:40:17,531 synthesis_task_kitti.py] val_loss_rgb_tgt 0.058807 (0.064112)
[2021-11-10 18:40:17,531 synthesis_task_kitti.py] val_loss_ssim_tgt 0.343890 (0.348406)
[2021-11-10 18:40:17,531 synthesis_task_kitti.py] val_lpips_tgt 0.214107 (0.253099)
[2021-11-10 18:40:17,531 synthesis_task_kitti.py] val_psnr_tgt 18.623119 (18.415305)
```

I've also attached the synthesized target and source images. There could be an issue with the KITTI data loader I created, so I can share it with you to help pinpoint what's causing the discrepancy. If not, I would appreciate it if you could share your KITTI data loader so I can trace the error myself.

[attachments: src_final, tgt_1]

Why is the image normalized twice?

Hi,
Image normalization is performed by "img_transforms" when an image is loaded in "nerf_dataset.py". Why is the input image normalized again in the "ResnetEncoder forward step"?
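For context, the pattern described here is common in monodepth2-style encoders: the dataset transform scales images to [0, 1], and the encoder re-centers them inside its forward pass to roughly match ImageNet statistics. A minimal sketch of the two steps, assuming the monodepth2 convention (the exact values are not verified against this repo):

```python
import torch
import torchvision.transforms as T

# Step 1: dataset-side transform (the img_transforms stage): PIL image -> [0, 1] tensor
img_transforms = T.Compose([T.ToTensor()])  # ToTensor alone already maps pixels to [0, 1]

# Step 2: encoder-side re-normalization at the start of forward()
def encoder_input_norm(x: torch.Tensor) -> torch.Tensor:
    # Re-centers [0, 1] inputs to roughly zero mean / unit variance,
    # a cheap stand-in for full per-channel ImageNet normalization
    return (x - 0.45) / 0.225
```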

Question about training on my own photo set

This is great work!
When I train on my own photo set with the LLFF params, the code raises one of two errors: "assert len(xyzs) >= visible_points_count" in nerf_dataset.py, or "Matrix inverse contains nan!" in utils.py. Some data trains successfully, some raises the first error, and some the second.
I would like to know whether the problem lies in how the training data was captured, or whether it can be addressed in the code. If the captured dataset is the problem, how should the images be captured correctly?

Question about Training Data Requirements

Hi,
Thanks for your interesting work.

I have a question regarding the training data, but I can't seem to find the answer in the paper.
Do you need ground-truth depth maps during training or not?
Say I give you a purely image dataset like CIFAR-10: can you run your method on this data, or does it need to contain "additional" information? If so, what is this "additional" information?

I know that during inference you only need the image, but I want to know what information is required during training.

Sincerely,
Hadi.

out of memory

I trained on the LLFF dataset with two 2080 Ti GPUs, but it reports "out of memory". I changed the batch size from 2 to 1 in the config file, but it still doesn't work. What should I do?
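One generic workaround once the batch size is already 1 is activation checkpointing, which recomputes activations in the backward pass instead of storing them; if the config exposes it, reducing the number of MPI planes (e.g. 64 to 32) is another knob to try. Below is a sketch of the standard PyTorch pattern — a general technique, not something this repo necessarily exposes (use_reentrant=False requires a recent PyTorch):

```python
import torch
from torch.utils.checkpoint import checkpoint

class CheckpointedBlock(torch.nn.Module):
    """Wraps any sub-module so its activations are recomputed during backward
    instead of stored, trading extra compute for lower peak memory."""

    def __init__(self, block: torch.nn.Module):
        super().__init__()
        self.block = block

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return checkpoint(self.block, x, use_reentrant=False)
```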

KITTI split and LPIPS computation

Hi,

Thank you for the fantastic work! I have two small questions regarding model evaluation.

  1. KITTI raw data split
    Section 4.1 mentions that 20 city sequences from KITTI Raw are used for training and 4 sequences for testing. However, there are 28 city sequences in KITTI Raw in total. Do you use the remaining 4 sequences anywhere in the pipeline? And are the 20 training sequences and 4 test sequences exactly the same as those used in Tulsiani 2018, as implemented here?

  2. LPIPS computation
    You computed LPIPS here. According to the dataloader implemented here, your inputs to LPIPS are in the range [0, 1], while LPIPS expects inputs in the range [-1, 1], as mentioned in their docs. Am I missing something, or should the inputs indeed be normalized to obtain the correct LPIPS score? (See the rescaling sketch after this issue.)

Thank you in advance for the time.
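For reference, the pip lpips package (richzhang/PerceptualSimilarity) expects inputs in [-1, 1] by default, and its forward pass also accepts a normalize flag that rescales [0, 1] inputs internally. A minimal sketch of both options, assuming the lpips package is installed:

```python
import torch
import lpips

loss_fn = lpips.LPIPS(net='vgg')  # or net='alex'

img0 = torch.rand(1, 3, 256, 256)  # in [0, 1], as a ToTensor-style loader produces
img1 = torch.rand(1, 3, 256, 256)

# Option 1: rescale [0, 1] -> [-1, 1] manually before calling LPIPS
d_manual = loss_fn(img0 * 2 - 1, img1 * 2 - 1)

# Option 2: let the library rescale (normalize=True tells it the inputs are in [0, 1])
d_auto = loss_fn(img0, img1, normalize=True)
```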

Questions about Eq. (3), Eq. (8), and Eq. (12) in the paper

I have some questions about the equations in the paper.
I think those equations should be corrected.
If I misunderstood something, please let me know.

(3)
In the paper: [equation screenshot]
Expected: [equation screenshot]

(8) The parenthesis position looks wrong.
In the paper: [equation screenshot]
Expected: [equation screenshot]

(12) The scale factors defined in MPI and in MINE are reciprocals of each other, but the equations do not reflect the difference.
In the paper: [equation screenshot]
Expected: [equation screenshot]

Qualitative comparison on KITTI

Hi,
your paper includes a qualitative comparison with single-view MPI on the KITTI dataset,
but I cannot find their pretrained KITTI model in their repository.
Did you train their model yourself to obtain the qualitative results?
Could you provide me a copy of these qualitative results (for academic purposes only)?
Thank you.

Unable to train

[screenshot]
Without the ResNet-50 and VGG-16 pretrained models downloaded, the loss is 0 and training cannot proceed.

Implementation detail of plane homography warping between the src and tgt cameras

In the operations/homography_sampler.py file,
[code screenshot]
lines 107-108 compute the plane homography warping matrix between the src camera and the tgt camera, following the equation:
[equation screenshot]
However, K_inv should be K_tgt_inv, not K_src_inv, and K should be K_src. The issue does not surface when K_tgt = K_src, but it causes errors when the intrinsics are not equal:

```python
H_tgt_src = torch.matmul(K_src, torch.matmul(R_tnd, K_tgt_inv))
```
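For context, the proposed fix matches the textbook plane-induced homography for backward warping: to sample source pixels at target pixel coordinates, H_tgt→src = K_src (R − t nᵀ / a) K_tgt⁻¹, where (n, a) parametrize the MPI plane and (R, t) is the relative pose. A hedged sketch with illustrative shapes (frame conventions and the sign of t must match the repo's; R_tnd below stands for the bracketed term in the snippet above):

```python
import torch

def plane_homography(K_src, K_tgt_inv, R, t, n, a):
    """Textbook plane-induced homography (illustrative, not the repo's code).

    Shapes: K_src, K_tgt_inv, R: Bx3x3; t: Bx3x1; n: Bx1x3; a: Bx1x1,
    for the plane n^T x = a expressed in the appropriate camera frame.
    """
    R_tnd = R - torch.matmul(t, n) / a          # R - t n^T / a, Bx3x3
    return torch.matmul(K_src, torch.matmul(R_tnd, K_tgt_inv))
```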

Preprocessing and Training Flow for Other Datasets

Hello authors, thank you for your great work.

You noted in the README:

Apart from the LLFF dataset, we experimented on the RealEstate10K, KITTI Raw and the Flowers Light Fields datasets - the data pre-processing codes and training flow for these datasets will be released later.

I believe the last update on this was in October 2021, so I am following up. Will you be able to release the dataloaders/code soon?

All the best,

Minimum hardware requirements

Thank you for your nice work!

If I want to run your code, do I need the 48 V100 GPUs you mentioned in the paper?

What are the minimum requirements to run this code?

Thanks in advance.

Real-time rendering question

It is a great pleasure to read your excellent work, but I have one question: can this method process video and live streams in real time?
Thanks.
