
liuyuan-pal / neuray


[CVPR2022] Neural Rays for Occlusion-aware Image-based Rendering

License: GNU General Public License v3.0

Languages: Python 100.00%
Topics: nerf, neural-rendering, novel-view-synthesis, radiance-field

neuray's People

Contributors

cwchenwang, liuyuan-pal


neuray's Issues

License

Very impressive results from your paper! And thank you for releasing the code! I was wondering if you could add a LICENSE file to your repo? Thank you!

depth

Hi. In the DTU dataset, I found that there is a big difference between the depth map estimated using COLMAP (which you provided) and the ground-truth depth map. Have you checked this?

scan1, view 0, 300x400:
ground-truth depth map: (image attached)
depth map estimated with COLMAP: (image attached)

Could you check it? Thanks!!!
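For anyone who wants to quantify the discrepancy, a minimal sketch of comparing the two depth maps, assuming both have been exported to NumPy arrays of the same resolution (the file names are placeholders, not files shipped with the repo):

import numpy as np

# Placeholder file names, for illustration only.
gt_depth = np.load('scan1_view0_gt_depth.npy')        # ground-truth depth, H x W
est_depth = np.load('scan1_view0_colmap_depth.npy')   # COLMAP-estimated depth, H x W

# Compare only pixels where both depth maps are valid (non-zero).
mask = (gt_depth > 0) & (est_depth > 0)
abs_err = np.abs(gt_depth[mask] - est_depth[mask])
print('mean abs error:', abs_err.mean(), 'median abs error:', np.median(abs_err))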

What is the difference between using use_src_imgs and not using it?

Hello. This is great work, but I have some trouble understanding the code.

In dataset/train_dataset.py, there is some code I do not understand.

ref_imgs_info, ref_cv_idx, ref_real_idx = build_src_imgs_info_select(database,ref_ids,ref_ids_all,self.cfg['cost_volume_nn_num'])
....
if self.cfg['use_src_imgs']:
    src_imgs_info = ref_imgs_info.copy()
    ref_imgs_info = imgs_info_slice(ref_imgs_info, ref_real_idx)
    ref_imgs_info['nn_ids'] = ref_cv_idx
else:
    # 'nn_ids' used in constructing cost volume (specify source image ids)
    ref_imgs_info['nn_ids'] = ref_idx.astype(np.int64)

What do src_imgs and ref_imgs mean, respectively? Why do you directly assign a copy of ref_imgs_info to src_imgs_info?

And what do ref_cv_idx and ref_real_idx mean, respectively?

How to get the accuracy presented in Tab. 1?

Thank you so much for releasing the code.

I noticed that only one PSNR/SSIM number is reported in Table 1. However, because there are N scenes in DTU/LLFF, the evaluation code produces N PSNR/SSIM numbers.

Do you get the final PSNR in Tab. 1 as "(PSNR_1 + PSNR_2 + ... + PSNR_N) / N"?

Or do you use some other approach?

Thank you for your attention.
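For what it's worth, a minimal sketch of the simple per-scene average the question assumes (the numbers are illustrative, not results from the paper):

import numpy as np

# Illustrative per-scene PSNR values produced by the evaluation code.
psnr_per_scene = [28.1, 27.4, 29.0, 26.8]
print('mean PSNR over scenes:', np.mean(psnr_per_scene))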

Kind reminder: there may be some minor mistakes in the 'configs/' folder

Hi, thanks for your excellent work!
Sorry to bring to your attention that in "configs/train/ft/neuray_ft_depth_birds.yaml",
the "database_name" should be "dtu_test/birds/black_800", not "llff_colmap/birds/high".
Also, in "configs/train/ft/neuray_ft_depth_fern.yaml", the "database_name" should be "llff_colmap/fern/high", not "dtu_test/fern/black_800".
Otherwise an error is thrown when fine-tuning on the DTU and LLFF datasets.

An error about inplace-abn

Hello, after I configured the environment, I encountered the following error when running your test code.
Do you know what the reason is?
(error screenshot attached)

Question about the rendering

I saw two rendering methods in the code.

  1. visibility, hitting probability -> IBRNet -> density, color -> alpha -> hitting probability -> images
  2. visibility -> alpha -> hitting probability -> images

However, in the following code:
alpha_values, visibility, hit_prob = self.dist_decoder.compute_prob(prj_dict['depth'].squeeze(-1), que_dists.unsqueeze(0), prj_mean, prj_var, prj_vis, prj_aw, True, ref_imgs_info['depth_range'])
we can obtain the hitting probability. Can we directly render images via hit_prob?
Thanks
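To make the question concrete, a minimal sketch of what rendering directly from per-sample hitting probabilities would look like, assuming hit_prob already sums to (at most) one along the sample dimension; the shapes and names are illustrative, not the repository's API:

import torch

qn, dn = 1024, 64                                      # rays, samples per ray
hit_prob = torch.rand(qn, dn)
hit_prob = hit_prob / hit_prob.sum(-1, keepdim=True)   # toy normalization
colors = torch.rand(qn, dn, 3)                         # per-sample colors

# If hit_prob already acts as the blending weight of each sample,
# the pixel color is simply the probability-weighted sum of sample colors.
pixel_colors = (hit_prob.unsqueeze(-1) * colors).sum(dim=1)   # (qn, 3)
print(pixel_colors.shape)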

What is the purpose of the "get_diff_feats" function?

Hi Authors,

Thanks for sharing the code of the great work.
Could you please explain a little about the purpose of the "get_diff_feats" function?

  1. Why do you invert the near/far planes?
    near_inv, far_inv = -1 / near[..., None], -1 / far[..., None]
  2. Why do you renormalize the input depth? (See the sketch after this list.)
    depth_in = depth_in * (far_inv - near_inv) + near_inv
    depth = -1 / depth_in
  3. Why do you unproject the depth to a point cloud and then project the points back (function "project_points_ref_views")?

Thanks
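For reference, a minimal sketch of the inverse-depth (disparity) normalization asked about in points 1 and 2, assuming near and far are positive scalars and depth_in lies in [0, 1]:

import numpy as np

near, far = 2.0, 6.0
depth_in = np.linspace(0.0, 1.0, 5)   # normalized depth in [0, 1]

# Work in negative inverse depth so that depth_in = 0 maps to near and 1 maps to far.
near_inv, far_inv = -1.0 / near, -1.0 / far
# Linearly interpolate in inverse-depth space, then map back to metric depth.
depth = -1.0 / (depth_in * (far_inv - near_inv) + near_inv)
print(depth)   # monotonically goes from near (2.0) to far (6.0)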

A question about color c

Great work! After reading your paper, I have a question: how is the aggregation of the local features f_{i,j} with v_{i,j} achieved to compute the alpha values and the color c? While I can understand how the alpha values are calculated in Section 3.6, I'm curious about the process for computing the color c.

A question that loss values are all NaN

Have you ever encountered cases where all loss values become NaN?
I ran your code on the ShapeNet dataset. Training was fine at first, but after a few thousand steps all losses became NaN, with a warning that the input tensor might contain NaN or Inf.
Do you have any experience with this?
I implemented the ShapeNet dataset class myself, modeled after the other dataset classes in database.py. In it, the masks are all set to 1 and the depth maps are all set to 0, because I use your cost-volume model.
The parameter I am not sure about is the background: I set the background to white, while your training and testing seem to use black. Does this matter?
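As a debugging aid (not code from this repository), a minimal sketch of catching the first non-finite loss so the offending batch can be inspected:

import torch

def check_finite(name, tensor):
    # Stop as soon as a NaN/Inf appears, instead of letting it propagate.
    if not torch.isfinite(tensor).all():
        raise RuntimeError(f'{name} contains NaN or Inf')

loss = torch.tensor(float('nan'))   # illustrative value
try:
    check_finite('loss', loss)
except RuntimeError as e:
    print(e)   # here one would dump the batch, learning rate, gradients, etc.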

Possible bug in equation (9)

Hi! Thank you for your work!

In the equation (9) in the paper:

\tilde{\alpha}(z_0,z_1)=\frac{t(z_1)-t(z_0)}{1-t(z_0)}

the denominator is 1 - t(z_0).

While in the code:
alpha_value = torch.log(hit_prob / (visibility - hit_prob + eps) + eps)
if we omit the log, the denominator is visibility - hit_prob = 1 - t(z_0) - (t(z_1) - t(z_0)) = 1 - t(z_1).

The denominators in the paper and in the code do not match.

Is this intended behaviour? If not, is the error in the paper, in the code, or neither?
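Writing the two quantities side by side (this only restates the discrepancy, with hit_prob = t(z_1) - t(z_0) and visibility = 1 - t(z_0); it does not say which one is intended):

\text{paper, Eq. (9):}\quad \tilde{\alpha}(z_0,z_1)=\frac{t(z_1)-t(z_0)}{1-t(z_0)}
\qquad
\text{code (argument of the log, dropping }\epsilon\text{):}\quad \frac{\text{hit\_prob}}{\text{visibility}-\text{hit\_prob}}=\frac{t(z_1)-t(z_0)}{1-t(z_1)}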

ImportError: cannot import name 'select_working_views_by_overlap' from 'utils.view_select'

Thank you for your sharing!
I downloaded the nerf_synthetic dataset and your pre-trained model. When I run

python render.py --cfg configs/gen/neuray_gen_depth.yaml --database nerf_synthetic/lego/black_800 --pose_type eval

There is an error:

Traceback (most recent call last):
File "render.py", line 12, in
from network.renderer import name2network
File "/home/hyx/NeuRay/network/renderer.py", line 21, in
from utils.view_select import compute_nearest_camera_indices, select_working_views, select_working_views_by_overlap
ImportError: cannot import name 'select_working_views_by_overlap' from 'utils.view_select'

I found that the function 'select_working_views_by_overlap' is commented out, so I uncommented it and ran again. But then there is another error:

Traceback (most recent call last):
File "render.py", line 210, in
render_video_gen(flags.database_name, cfg_fn=flags.cfg, pose_type=flags.pose_type, pose_fn=flags.pose_fn,
File "render.py", line 113, in render_video_gen
ref_ids_list = select_working_views_db(database, ref_ids_all, que_poses, render_cfg['min_wn'])
File "/home/hyx/NeuRay/utils/view_select.py", line 86, in select_working_views_db
indices = select_working_views(ref_poses, que_poses, work_num, exclude_self)
File "/home/hyx/NeuRay/utils/view_select.py", line 21, in select_working_views
dists = np.linalg.norm(ref_cam_pts[None, :, :] - render_cam_pts[:, None, :], 2, 2) # qn,rfn
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

Could you please help me solve this problem? Thank you for your time!

A question about "depth_range"

Hello, I noticed that in your database.py, DTUTrainDatabase's "depth_range" is a fixed value, while GoogleScannedObjectDatabase's "depth_range" is calculated from the pose. What is the difference between these two approaches? And if I want to run your code on the ShapeNet dataset, which way do you recommend for getting depth_range?

DTUTrain: (code screenshot attached)
GoogleScannedObject: (code screenshot attached)
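For illustration, a minimal sketch of one common way to derive a per-view depth range from a camera pose, assuming the object sits roughly at the world origin inside a sphere of known radius. This is an assumption for illustration, not necessarily what GoogleScannedObjectDatabase does:

import numpy as np

def depth_range_from_pose(pose_w2c, object_radius=1.0, margin=0.05):
    # pose_w2c: 3x4 world-to-camera matrix [R | t].
    R, t = pose_w2c[:, :3], pose_w2c[:, 3:]
    # Camera centre in world coordinates, then its distance to the origin.
    cam_center = -R.T @ t
    dist = np.linalg.norm(cam_center)
    near = max(dist - object_radius - margin, 1e-3)
    far = dist + object_radius + margin
    return near, far

pose = np.concatenate([np.eye(3), np.array([[0.0], [0.0], [3.0]])], axis=1)
print(depth_range_from_pose(pose))   # roughly (1.95, 4.05)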

Why are inconsistent parameters set in the general-model config and the fine-tune-model config?

(1)
In configs/train/gen/train_gen_cost_volume_train.yaml, use_vis is false.

fine_dist_decoder_cfg:
    use_vis: false

while in configs/train/ft/neuray_ft_cv_lego.yaml, the use_vis of fine_dist_decoder_cfg falls back to the default value of dist_decoder, which is true.

This means the pretrained general model cannot be loaded into the fine-tune model, because the vis-encoder setting in the dist-decoder differs. So what should use_vis of fine_dist_decoder_cfg be?
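As a stopgap while the configs disagree, one can load only the weights whose names and shapes match. A minimal sketch, assuming a standard PyTorch checkpoint and that model is the already-constructed fine-tune network (the checkpoint path and key name are placeholders):

import torch

# Placeholder checkpoint path and key name, for illustration only.
ckpt = torch.load('model_gen.pth', map_location='cpu')
state = ckpt.get('network_state_dict', ckpt)

model_state = model.state_dict()
# Keep only parameters whose name and shape both match the fine-tune model.
matched = {k: v for k, v in state.items()
           if k in model_state and v.shape == model_state[k].shape}
print(f'loaded {len(matched)} / {len(model_state)} tensors')
model.load_state_dict(matched, strict=False)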

(2) I noticed that use_self_hit_prob is only set to true in the fine-tune model. Why not set it to true consistently in the general model as well?

A question about rendering a custom dataset

In the documentation, for "python run_colmap.py --example_name desktop --colmap ", what should --colmap point to? The directory structure of my model and data exactly follows the documentation.

Environment problem

Thanks to the author for sharing. I have tried many approaches, and all of them raise errors.

(py38) root@23504c294479:/cmdata/docker/yfq/NeuRay# python render.py 
Traceback (most recent call last):
  File "render.py", line 12, in <module>
    from network.renderer import name2network
  File "/cmdata/docker/yfq/NeuRay/network/renderer.py", line 13, in <module>
    from network.init_net import name2init_net, DepthInitNet, CostVolumeInitNet
  File "/cmdata/docker/yfq/NeuRay/network/init_net.py", line 5, in <module>
    from inplace_abn import ABN
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/inplace_abn/__init__.py", line 1, in <module>
    from .abn import ABN, InPlaceABN, InPlaceABNSync
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/inplace_abn/abn.py", line 8, in <module>
    from .functions import inplace_abn, inplace_abn_sync
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/inplace_abn/functions.py", line 8, in <module>
    from . import _backend
ImportError: /opt/conda/envs/py38/lib/python3.8/site-packages/inplace_abn/_backend.cpython-38-x86_64-linux-gnu.so: undefined symbol: THPVariableClass
(py38) root@23504c294479:/cmdata/docker/yfq/NeuRay# 

Could you help take a look, or provide the exact Python version, CUDA version, and versions of the third-party libraries?
A ready-to-use Docker image would be even better.

Custom dataset question

I noticed that your desktop example data contains only images, and I can already render from those images. However, in other NeRF projects, datasets such as DTU come with a file of camera parameters and world coordinates. Do you know how this file is generated?
Does it ship with the dataset, or is it produced with tools such as COLMAP?

Custom scene rendering

Hi

I am trying to render a custom scene

python run_colmap.py --example_name desktop
--colmap # note we need the dense reconstruction

Here, what do you mean by "path-to-your-colmap"?

DTU testing dataset

Hi,

Thank you for your amazing work. I understand that for DTU you evaluate only 4 scans (birds, tools, bricks, and snowman). However, I wanted to evaluate NeuRay on all the scans included in dtu_test_scans.txt, so I'd be grateful if you could share the corresponding processed dataset, if you happen to have it.

Thanks in advance

Difference between the IBRNetWithNeuRay model and the NeuRay model

Hi, thanks for sharing the code for this amazing work!

If I understand correctly, IBRNet does not consider the visibility in each view, while NeuRay considers it with help from the depth map. In the implementation, I saw there is a class called IBRNetWithNeuRay:

class IBRNetWithNeuRay(nn.Module):

In terms of implementation, would this model have the same performance as the NeuRay model itself (NeuralRayGenRenderer)?

class NeuralRayGenRenderer(NeuralRayBaseRenderer):

Or does the NeuRay model have other designs that make it even better?

Thank you!

Best,
Wenzheng

About the depth used in init_net.py

Thank you for your contribution!

I noticed that "DepthInitNet" and "CostVolumeInitNet" both use the depth image. Does this mean the model is not end-to-end for datasets without depth information, because the depth must first be obtained with COLMAP?

If my dataset is relatively large and has no depth information, or computing depth would be time-consuming, how should I run your code on it?

Thank you very much.

Fine-tuning on a custom dataset?

Hi,

I've run the general model on some custom data; it does quite a nice job on my relatively sparse views compared with the few other tools I've been trialling. Is it possible to run fine-tuning on custom data to further improve performance? I'm not quite sure how the configs should be adapted.

colmap image_undistorter

After running colmap image_undistorter, the image sizes are inconsistent:
the depth maps (*geometric.bin) do not have a uniform size, and the code does not work.

finished

Hello. In this function, my understanding is that pose stores the camera's position and orientation in the world coordinate system, so the first cam_xyz should already be coordinates in the camera frame. Why is there still a world-to-camera transformation (cam_xyz = rot @ cam_xyz + trans)? Or is my understanding wrong? Could you please explain this to me?
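For context, a minimal sketch of the two conventions involved, assuming pose holds a 3x4 world-to-camera matrix [R | t] (the numbers are illustrative):

import numpy as np

# Illustrative world-to-camera pose [R | t].
pose = np.concatenate([np.eye(3), np.array([[0.0], [0.0], [2.0]])], axis=1)
R, t = pose[:, :3], pose[:, 3:]

# The camera centre expressed in world coordinates is C = -R^T t ...
cam_center_world = -R.T @ t
# ... while a world-space point is mapped into the camera frame by x_cam = R x_world + t.
x_world = np.zeros((3, 1))
x_cam = R @ x_world + t
print(cam_center_world.ravel(), x_cam.ravel())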

Error when rendering custom data

I downloaded the model and the test data as described in the documentation, and I verified that python render.py on the DTU data does generate images. However, rendering custom data raises an error (the data is still the desktop example you provided, and the directory structure follows the documentation).

python run_colmap.py --example_name desktop --colmap data/example/desktop

Traceback (most recent call last):
  File "run_colmap.py", line 27, in <module>
    process_example_dataset(flags.example_name,flags.same_camera,flags.colmap_path)
  File "/cmdata/docker/yfq/NeuRay/colmap_scripts/process.py", line 38, in process_example_dataset
    db.add_image(img_fn.name, cam_id)
  File "/cmdata/docker/yfq/NeuRay/colmap/database.py", line 181, in add_image
    prior_q[3], prior_t[0], prior_t[1], prior_t[2]))
sqlite3.IntegrityError: UNIQUE constraint failed: images.name

Could you help take a look? Many thanks!

Question about the coordinates

Are the coordinates used here OpenCV coordinates, different from the OpenGL coordinates used by NeRF?

When I load the poses of the NeRF Synthetic dataset, why do I need to multiply by diag[1, -1, -1]?

The code is in dataset/database.py, class NeRFSyntheticDatabase:

def parse_info(self, split='train'):
    with open(f'{self.root_dir}/transforms_{split}.json', 'r') as f:
        # load the metadata
        img_info = json.load(f)
    # field-of-view angle (converted to a focal length below)
    focal = float(img_info['camera_angle_x'])
    # store image ids and poses
    img_ids, poses = [], []
    for frame in img_info['frames']:
        img_ids.append('-'.join(frame['file_path'].split('/')[1:]))
        pose = np.asarray(frame['transform_matrix'], np.float32)
        # rotation matrix
        R = pose[:3, :3].T
        t = -R @ pose[:3, 3:]
        R = np.diag(np.asarray([1, -1, -1])) @ R
        t = np.diag(np.asarray([1, -1, -1])) @ t
        poses.append(np.concatenate([R, t], 1))
    h, w, _ = imread(f'{self.root_dir}/{self.img_id2img_path(img_ids[0])}.png').shape
    focal = .5 * w / np.tan(.5 * focal)
    # intrinsics
    K = np.asarray([[focal, 0, w / 2], [0, focal, h / 2], [0, 0, 1]], np.float32)
    return img_ids, poses, K
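For context, a minimal sketch of why the diag([1, -1, -1]) shows up, assuming the usual conventions: the NeRF/Blender transforms_*.json stores OpenGL-style camera-to-world poses (x right, y up, z backward), whereas OpenCV-style code expects x right, y down, z forward, so the camera y and z axes have to be flipped after inverting the pose:

import numpy as np

def opengl_c2w_to_opencv_w2c(pose_c2w):
    # pose_c2w: 4x4 OpenGL-style camera-to-world matrix, as in transforms_*.json.
    R_c2w, t_c2w = pose_c2w[:3, :3], pose_c2w[:3, 3:]
    # Invert the rigid transform to get world-to-camera rotation/translation.
    R_w2c = R_c2w.T
    t_w2c = -R_w2c @ t_c2w
    # Flip the y and z camera axes (OpenGL -> OpenCV).
    flip = np.diag([1.0, -1.0, -1.0])
    return np.concatenate([flip @ R_w2c, flip @ t_w2c], axis=1)

print(opengl_c2w_to_opencv_w2c(np.eye(4)))   # identity pose example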

Difference between 'pixel_colors_nr' and 'pixel_colors_nr_fine'

Thank you for sharing the source code.
However, I have a question about the difference between 'pixel_colors_nr' and 'pixel_colors_nr_fine,' which appear to represent the rendered color values.
It seems that both are obtained by sampling 64 points and rendering 'c.'
Are the values recorded in 'Table 1' obtained using only one of these methods, or is there a different sampling method used to obtain those values?

Code for training baseline models on your dataset?

Thanks for the excellent work. I noticed that you trained other baselines (e.g. PixelNeRF) on the same dataset as yours, but I was not able to adapt the extra dataset to PixelNeRF successfully. Could you also provide the code for training the baseline models?
