liuyuan-pal / neuray
[CVPR2022] Neural Rays for Occlusion-aware Image-based Rendering
License: GNU General Public License v3.0
Very impressive results from your paper, and thank you for releasing the code! I was wondering if you could add a LICENSE file to your repo? Thank you!
This is great work, but I have a question about the code: are the poses in the data world-to-camera or camera-to-world transforms?
Hello. This is great work, but I am having some trouble understanding the code. In dataset/train_dataset.py, there is some code I do not understand:
ref_imgs_info, ref_cv_idx, ref_real_idx = build_src_imgs_info_select(database, ref_ids, ref_ids_all, self.cfg['cost_volume_nn_num'])
...
if self.cfg['use_src_imgs']:
    src_imgs_info = ref_imgs_info.copy()
    ref_imgs_info = imgs_info_slice(ref_imgs_info, ref_real_idx)
    ref_imgs_info['nn_ids'] = ref_cv_idx
else:
    # 'nn_ids' used in constructing cost volume (specify source image ids)
    ref_imgs_info['nn_ids'] = ref_idx.astype(np.int64)
What do src_imgs and ref_imgs mean, respectively? Why do you directly assign a copy of ref_imgs_info to src_imgs_info? And what do ref_cv_idx and ref_real_idx mean?
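As a generic illustration of the indexing pattern being asked about (all names and values here are hypothetical; the repo's build_src_imgs_info_select may work differently), one can keep a larger source-image pool while slicing out the true reference subset and recording, per reference, which pool images feed its cost volume:

```python
import numpy as np

# Suppose 5 candidate source images; images at positions 1 and 3 are the
# actual reference images (illustrative values only).
src_ids = np.array([10, 11, 12, 13, 14])
ref_real_idx = np.array([1, 3])          # positions of references in the pool
ref_cv_idx = np.array([[0, 2], [2, 4]])  # per-reference neighbor positions
                                         # used to build each cost volume

ref_ids = src_ids[ref_real_idx]  # the reference images themselves
nn_ids = src_ids[ref_cv_idx]     # cost-volume source images per reference
print(ref_ids)  # -> [11 13]
print(nn_ids)   # rows: [10 12] and [12 14]
```

Under this reading, src_imgs_info holds the full pool and ref_imgs_info the sliced subset, which would explain why the former is a copy taken before slicing.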
Thank you so much for releasing the code.
I noticed that only one PSNR/SSIM number is reported in Table 1. However, since there are N scenes in DTU/LLFF, the evaluation code produces N PSNR/SSIM numbers.
Do you obtain the final PSNR in Table 1 as (PSNR_1 + PSNR_2 + ... + PSNR_N)/N, or with some other approach?
Thank you for your attention.
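For reference, the common protocol (an assumption here, not confirmed by the authors) is the unweighted per-scene mean:

```python
import numpy as np

# Hypothetical per-scene PSNR values for N = 4 scenes (illustrative only).
scene_psnrs = [28.1, 30.4, 26.7, 29.2]

# Unweighted average over scenes, i.e. (PSNR_1 + ... + PSNR_N) / N.
mean_psnr = float(np.mean(scene_psnrs))
print(round(mean_psnr, 2))  # -> 28.6
```

An alternative some papers use is averaging over all test images regardless of scene, which weights scenes by their image counts; the two can differ.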
Hi, thanks for your excellent work!
Sorry to bring to your attention that in "configs/train/ft/neuray_ft_depth_birds.yaml", the "database_name" should be "dtu_test/birds/black_800", not "llff_colmap/birds/high".
Also, in "configs/train/ft/neuray_ft_depth_fern.yaml", the "database_name" should be "llff_colmap/fern/high", not "dtu_test/fern/black_800".
Otherwise an error is thrown when fine-tuning on the DTU and LLFF datasets.
I saw two rendering methods in the code.
However, in the following code:
alpha_values, visibility, hit_prob = self.dist_decoder.compute_prob( prj_dict['depth'].squeeze(-1), que_dists.unsqueeze(0), prj_mean, prj_var, prj_vis, prj_aw, True, ref_imgs_info['depth_range'])
we can obtain the hitting probability. Can we directly render images via hit_prob?
Thanks
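In principle, a hit-probability distribution along a ray can be composited directly: the pixel color is the probability-weighted sum of the per-sample colors, as in standard volume rendering. A minimal NumPy sketch (function and tensor names are hypothetical, standing in for the tensors in the snippet above):

```python
import numpy as np

def render_with_hit_prob(hit_prob, sample_colors):
    """Composite sample colors along each ray using hitting probabilities.

    hit_prob:      (num_rays, num_samples) probability each sample is the hit point
    sample_colors: (num_rays, num_samples, 3) RGB color predicted at each sample
    """
    # Expected color under the hit distribution: sum_i p_i * c_i per ray.
    return (hit_prob[..., None] * sample_colors).sum(axis=1)

# Toy ray: the second sample is certain to be hit, so its color is returned.
p = np.array([[0.0, 1.0, 0.0]])
c = np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]])
print(render_with_hit_prob(p, c))  # -> [[0. 1. 0.]]
```

Whether this matches the released renderer's quality is a separate question, since the code may refine alpha values with learned features after compute_prob.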
Hi Authors,
Thanks for sharing the code of the great work.
Could you please explain a little bit about the usage of the "get_diff_feats" function?
Thanks
Hi, thanks for the wonderful work. I was wondering whether you could provide the links to download your DTU training dataset, the COLMAP depth for the DTU training images, and the COLMAP depth for the forward-facing scenes? I found that clicking on "here" directs me to the same page each time. Thank you so much!
Great work! But after reading your paper I have a question: how are the local features f_{i,j} aggregated with v_{i,j} to compute the alpha values and the color c? I can understand how the alpha values are calculated in Section 3.6, but I'm curious about the process for computing the color c.
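For context, image-based rendering methods in this family typically predict per-view blending weights and take a weighted sum of the colors sampled from the source views. A generic sketch of that pattern (an assumption about the general approach, not the paper's exact network):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def blend_colors(view_logits, view_colors):
    """Blend per-view colors with softmax weights.

    view_logits: (num_samples, num_views) scores from the aggregated features
    view_colors: (num_samples, num_views, 3) colors sampled from source views
    """
    w = softmax(view_logits, axis=-1)                # weights sum to 1 per sample
    return (w[..., None] * view_colors).sum(axis=1)  # (num_samples, 3)

logits = np.array([[10.0, -10.0]])  # first view dominates
colors = np.array([[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]])
print(blend_colors(logits, colors))  # approximately [[1. 0. 0.]]
```

In NeuRay's setting, the visibility v_{i,j} would plausibly enter these weights so that occluded views contribute less; the exact aggregation is what the question asks the authors to confirm.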
Have you ever encountered cases where all the loss values become NaN?
I ran your code on the ShapeNet dataset; training was fine at first, but after a few thousand steps all the losses became NaN, with a warning that the input tensor might contain NaN or Inf.
I wonder if you have any experience with this?
I implemented the ShapeNet dataset class myself, modeled after the other dataset classes in database.py. In it, the masks are all set to 1 and the depth maps are all set to 0, because I use your cost volume model.
The parameter I am not sure about is the background: I set it to white, while your training and testing seem to use black. Does this matter?
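As a debugging aid (not part of the released code), it can help to assert on NaN/Inf right where tensors enter the loss, so the first offending tensor is named instead of the failure surfacing thousands of steps later. A framework-agnostic sketch with NumPy (check_finite is a hypothetical helper):

```python
import numpy as np

def check_finite(name, arr):
    """Raise early with a descriptive message if arr contains NaN or Inf."""
    arr = np.asarray(arr)
    if not np.isfinite(arr).all():
        bad = int((~np.isfinite(arr)).sum())
        raise ValueError(f"{name}: {bad} non-finite entries (NaN/Inf)")
    return arr

check_finite("rgb_loss", [0.1, 0.2])          # passes silently
# check_finite("depth", [0.1, float("nan")])  # would raise ValueError
```

One plausible (unconfirmed) source in this setup: with depth maps all set to 0, any operation that divides by depth, such as converting depth to disparity, produces Inf, which then propagates as NaN through the network.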
Hi! Thank you for your work!
In equation (9) of the paper:
\tilde{\alpha}(z_0,z_1)=\frac{t(z_1)-t(z_0)}{1-t(z_0)}
the denominator is 1 - t(z_0).
Meanwhile, in the code:
alpha_value = torch.log(hit_prob / (visibility - hit_prob + eps) + eps)
If we omit the log, the denominator is visibility - hit_prob = 1 - t(z_0) - (t(z_1) - t(z_0)) = 1 - t(z_1).
The denominators in the paper and the code do not match.
Is this intended behaviour? Is the error in the paper, in the code, or in neither?
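To make the comparison explicit, with hit probability $t(z_1)-t(z_0)$ and visibility $1-t(z_0)$ as in the question above, the code's ratio (before the log) expands as:

```latex
\frac{\text{hit\_prob}}{\text{visibility}-\text{hit\_prob}}
= \frac{t(z_1)-t(z_0)}{\bigl(1-t(z_0)\bigr)-\bigl(t(z_1)-t(z_0)\bigr)}
= \frac{t(z_1)-t(z_0)}{1-t(z_1)},
```

whereas equation (9) has $1-t(z_0)$ in the denominator, so the two expressions do differ exactly by that denominator term.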
Thank you for sharing!
I downloaded the nerf_synthetic dataset and your pre-trained model. When I run
python render.py --cfg configs/gen/neuray_gen_depth.yaml --database nerf_synthetic/lego/black_800 --pose_type eval
There is an error:
Traceback (most recent call last):
File "render.py", line 12, in <module>
from network.renderer import name2network
File "/home/hyx/NeuRay/network/renderer.py", line 21, in <module>
from utils.view_select import compute_nearest_camera_indices, select_working_views, select_working_views_by_overlap
ImportError: cannot import name 'select_working_views_by_overlap' from 'utils.view_select'
I found that the function 'select_working_views_by_overlap' is commented out, so I uncommented it and ran again. But then there is another error:
Traceback (most recent call last):
File "render.py", line 210, in <module>
render_video_gen(flags.database_name, cfg_fn=flags.cfg, pose_type=flags.pose_type, pose_fn=flags.pose_fn,
File "render.py", line 113, in render_video_gen
ref_ids_list = select_working_views_db(database, ref_ids_all, que_poses, render_cfg['min_wn'])
File "/home/hyx/NeuRay/utils/view_select.py", line 86, in select_working_views_db
indices = select_working_views(ref_poses, que_poses, work_num, exclude_self)
File "/home/hyx/NeuRay/utils/view_select.py", line 21, in select_working_views
dists = np.linalg.norm(ref_cam_pts[None, :, :] - render_cam_pts[:, None, :], 2, 2) # qn,rfn
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
Could you please help me solve this problem? Thank you for your time!
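One plausible cause of the IndexError (a guess, since the full context isn't shown) is that one of the camera-center arrays arrives as a flat (3,) vector rather than an (N, 3) matrix. A sketch of a shape guard around that distance computation (pairwise_cam_dists is a hypothetical helper, not the repo's function):

```python
import numpy as np

def pairwise_cam_dists(ref_cam_pts, render_cam_pts):
    """Pairwise distances between render and reference camera centers.

    Guarding with np.atleast_2d avoids the 'too many indices for array'
    IndexError when a single camera center arrives as a flat (3,) array.
    """
    ref_cam_pts = np.atleast_2d(ref_cam_pts)        # (rfn, 3)
    render_cam_pts = np.atleast_2d(render_cam_pts)  # (qn, 3)
    # (qn, rfn) matrix of Euclidean distances, as in select_working_views.
    return np.linalg.norm(ref_cam_pts[None, :, :] - render_cam_pts[:, None, :], 2, 2)

d = pairwise_cam_dists(np.zeros(3), np.array([3.0, 4.0, 0.0]))
print(d)  # -> [[5.]]
```

The real fix may of course lie upstream, e.g. in how the uncommented select_working_views_by_overlap builds its pose arrays.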
Hello, I noticed that in your database.py, DTUTrainDatabase's "depth_range" is a fixed value, while GoogleScannedObjectDatabase's "depth_range" is calculated from the pose. What is the difference between these two approaches? And if I want to run your code on the ShapeNet dataset, which approach do you recommend for obtaining depth_range?
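For a bounded object roughly centered at the world origin (the typical ShapeNet setup), one common heuristic (an assumption here, not necessarily the repo's exact method) derives depth_range from the camera-to-origin distance and an assumed object radius:

```python
import numpy as np

def depth_range_from_pose(pose_w2c, object_radius=1.0):
    """Estimate (near, far) from a world-to-camera pose [R|t].

    For an object inside a sphere of object_radius centered at the origin,
    depths along rays fall roughly in [dist - r, dist + r], where dist is
    the camera-center-to-origin distance.
    """
    R, t = pose_w2c[:3, :3], pose_w2c[:3, 3]
    cam_center = -R.T @ t                    # camera center in world coords
    dist = np.linalg.norm(cam_center)
    near = max(dist - object_radius, 1e-3)   # keep the near plane positive
    far = dist + object_radius
    return near, far

pose = np.concatenate([np.eye(3), [[0.0], [0.0], [4.0]]], axis=1)  # camera 4 units away
print(depth_range_from_pose(pose))  # -> (3.0, 5.0)
```

A fixed range works when all cameras sit at a known distance (as on DTU); the pose-based variant adapts per view, which matters when camera radii vary.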
Hi, the Google Drive link for the pre-trained model is down. Could you please update the link?
Thanks!
Thanks for your wonderful work. Is it possible to train the model using multiple GPUs?
(1) In configs/train/gen/train_gen_cost_volume_train.yaml, use_vis is false:
fine_dist_decoder_cfg:
  use_vis: false
while in configs/train/ft/neuray_ft_cv_lego.yaml, the use_vis of fine_dist_decoder_cfg takes the default value from dist_decoder: true.
This causes the pretrained generalization model to fail to load into the fine-tuning model, because the visibility-encoder settings in the distribution decoder differ. So what should the use_vis of fine_dist_decoder_cfg be?
(2) I noticed that use_self_hit_prob is set to true only in the fine-tuning model. Why not set it to true consistently in the generalization model as well?
In the documentation, for "python run_colmap.py --example_name desktop --colmap", where should --colmap point? My model and data directory structure exactly matches the documentation.
This is great work! But I still have a question: why does train/ft/neuray_ft_cv_lego.yaml use neuray_gen_cost_volume.yaml as the pretrained model setting instead of configs/train/gen/neuray_gen_cost_volume_train.yaml?
Thank you for sharing your work. I have tried many approaches, but they all raise errors:
(py38) root@23504c294479:/cmdata/docker/yfq/NeuRay# python render.py
Traceback (most recent call last):
File "render.py", line 12, in <module>
from network.renderer import name2network
File "/cmdata/docker/yfq/NeuRay/network/renderer.py", line 13, in <module>
from network.init_net import name2init_net, DepthInitNet, CostVolumeInitNet
File "/cmdata/docker/yfq/NeuRay/network/init_net.py", line 5, in <module>
from inplace_abn import ABN
File "/opt/conda/envs/py38/lib/python3.8/site-packages/inplace_abn/__init__.py", line 1, in <module>
from .abn import ABN, InPlaceABN, InPlaceABNSync
File "/opt/conda/envs/py38/lib/python3.8/site-packages/inplace_abn/abn.py", line 8, in <module>
from .functions import inplace_abn, inplace_abn_sync
File "/opt/conda/envs/py38/lib/python3.8/site-packages/inplace_abn/functions.py", line 8, in <module>
from . import _backend
ImportError: /opt/conda/envs/py38/lib/python3.8/site-packages/inplace_abn/_backend.cpython-38-x86_64-linux-gnu.so: undefined symbol: THPVariableClass
(py38) root@23504c294479:/cmdata/docker/yfq/NeuRay#
Could you help take a look, or provide the exact Python version, CUDA version, and versions of the third-party libraries?
A working Docker image would be even better.
I found that your desktop example data contains only images, and I can render from these images. But in other NeRF projects, datasets such as DTU come with a file containing camera parameters and world coordinates. Do you know how this file is generated?
Does it come with the dataset originally, or is it generated with a tool such as COLMAP?
Hi
I am trying to render a custom scene
python run_colmap.py --example_name desktop
--colmap # note we need the dense reconstruction
What do you mean here by "path-to-your-colmap"?
Hi! I do not understand why you add a negative sign here.
Line 37 in 939af16
colmap_scripts/process.py:
cmd=[colmap_path,'patch_match_stereo',
'--workspace_path',f'{project_dir}/dense']
print(' '.join(cmd))
subprocess.run(cmd,check=True)
Hi,
Thank you for your amazing work. I understand that for DTU, you're evaluating only for 4 scans (birds, tools, bricks and snowman). However, I wanted to evaluate NeuRay for all the scans included in dtu_test_scans.txt. Hence, I'd be grateful if you could share the corresponding processed dataset, if you happen to have it.
Thanks in advance
Dear authors,
when will the code be released?
Thanks
Hi, thanks for sharing the code for this amazing work!
If I understand correctly, IBRNet didn't consider the visibility in each view, while NeuRay considers it with help from the depth map. In the implementation, I saw there is a class called IBRNetWithNeuRay:
Line 239 in a877129
Line 256 in a877129
Thank you!
Best,
Wenzheng
I tested the instant-ngp fox data (https://github.com/NVlabs/instant-ngp/tree/master/data/nerf/fox) following your instructions in https://github.com/liuyuan-pal/NeuRay/blob/main/custom_rendering.md, but I get a poor result.
Raw image:
dense pointcloud:
render result:
Thank you for your contribution!
I noticed that "DepthInitNet" and "CostVolumeInitNet" both use the depth image. Does this mean the model is not end-to-end for datasets without depth information, since the depth must first be obtained with COLMAP?
If my dataset is relatively large and has no depth information, or computing the depth would be time-consuming, how should I run your code?
Thank you very much.
Hi,
I've run the generalization model on some custom data, and it does quite a nice job on my relatively sparse views compared with the few other tools I've been trialling. Is it possible to run fine-tuning on custom data to further improve performance? I'm not quite sure I understand how the configs should be adapted.
After colmap image_undistorter, the image sizes are inconsistent: the depth maps (*geometric.bin) do not have a uniform size, and the code does not work.
Hello, in this function my understanding is that pose stores the camera's position and orientation in the world coordinate system, so the first cam_xyz should already be coordinates in the camera frame. Why does it then go through another world-to-camera transform (cam_xyz = rot @ cam_xyz + trans)? Or is my understanding wrong? I would appreciate an explanation.
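For context, the two pose conventions differ only by an inversion, and which one `pose` stores decides whether `rot @ xyz + trans` is correct. A minimal sketch (symbols generic, not tied to the repo's exact function):

```python
import numpy as np

# Camera-to-world pose: maps points from camera coords into world coords.
R_c2w = np.eye(3)
t_c2w = np.array([1.0, 2.0, 3.0])  # camera center in world coordinates

# Its inverse, world-to-camera: x_cam = R_w2c @ x_world + t_w2c.
R_w2c = R_c2w.T
t_w2c = -R_c2w.T @ t_c2w

x_world = np.array([1.0, 2.0, 4.0])
x_cam = R_w2c @ x_world + t_w2c
print(x_cam)  # -> [0. 0. 1.]  (one unit in front of the camera)

# If `pose` stores world-to-camera [R|t], then rot @ xyz + trans converts
# world points into the camera frame; applying it to points that are already
# in camera coordinates would indeed be wrong.
```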
I downloaded the models and test data following the documentation, and python render.py does generate images on the DTU data. However, rendering custom data raises an error (still using the desktop data you provide, with the directory structure organized as documented):
python run_colmap.py --example_name desktop --colmap data/example/desktop
Traceback (most recent call last):
File "run_colmap.py", line 27, in <module>
process_example_dataset(flags.example_name,flags.same_camera,flags.colmap_path)
File "/cmdata/docker/yfq/NeuRay/colmap_scripts/process.py", line 38, in process_example_dataset
db.add_image(img_fn.name, cam_id)
File "/cmdata/docker/yfq/NeuRay/colmap/database.py", line 181, in add_image
prior_q[3], prior_t[0], prior_t[1], prior_t[2]))
sqlite3.IntegrityError: UNIQUE constraint failed: images.name
Could you help take a look? Many thanks!
Are the coordinates used here OpenCV coordinates, different from the OpenGL coordinates used by NeRF?
When I load the poses of the NeRF synthetic dataset, why do I need to multiply by diag[1, -1, -1]?
The code is in class NeRFSyntheticDatabase in dataset/database.py:
def parse_info(self, split='train'):
    with open(f'{self.root_dir}/transforms_{split}.json', 'r') as f:
        # load the metadata
        img_info = json.load(f)
    # horizontal field of view (converted to a focal length below)
    focal = float(img_info['camera_angle_x'])
    # collect image ids and poses
    img_ids, poses = [], []
    for frame in img_info['frames']:
        img_ids.append('-'.join(frame['file_path'].split('/')[1:]))
        pose = np.asarray(frame['transform_matrix'], np.float32)
        # rotation matrix
        R = pose[:3, :3].T
        t = -R @ pose[:3, 3:]
        R = np.diag(np.asarray([1, -1, -1])) @ R
        t = np.diag(np.asarray([1, -1, -1])) @ t
        poses.append(np.concatenate([R, t], 1))
    h, w, _ = imread(f'{self.root_dir}/{self.img_id2img_path(img_ids[0])}.png').shape
    focal = .5 * w / np.tan(.5 * focal)
    # intrinsic matrix
    K = np.asarray([[focal, 0, w/2], [0, focal, h/2], [0, 0, 1]], np.float32)
    return img_ids, poses, K
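The diag([1, -1, -1]) factor converts between the OpenGL/Blender camera convention used by the NeRF synthetic JSON (x right, y up, camera looking down -z) and the OpenCV convention (x right, y down, camera looking down +z). A small standalone check (illustrative, not from the repo):

```python
import numpy as np

flip = np.diag([1.0, -1.0, -1.0])

# In OpenGL camera coords, a point straight ahead has z < 0 and "up" is +y.
p_ahead_gl = np.array([0.0, 0.0, -2.0])
p_up_gl = np.array([0.0, 1.0, 0.0])

# After flipping the y and z axes we get OpenCV coords:
# straight ahead becomes +z, and "up" becomes -y.
print(flip @ p_ahead_gl)  # -> [0. 0. 2.]
print(flip @ p_up_gl)     # -> [ 0. -1.  0.]
```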
Thank you for sharing the source code.
However, I have a question about the difference between 'pixel_colors_nr' and 'pixel_colors_nr_fine', which both appear to represent rendered color values.
It seems that both are obtained by sampling 64 points and rendering 'c'.
Are the values reported in Table 1 obtained using only one of these, or is a different sampling method used to obtain them?
Thanks for the excellent work. I noticed that you trained other baselines (e.g. PixelNeRF) on the same dataset as yours, but I found that adapting the extra dataset to PixelNeRF is not successful. Could you also provide the code for training the baseline models?
Maybe there is something wrong with the link to the LLFF test set.