
nerfplusplus's People

Contributors

dkasuga, kai-46, maximevandegar


nerfplusplus's Issues

How to visualise the cameras?

Thank you for the script you have provided for visualizing the cameras in the scene.

Could you please share the usage instructions for that script too?

Thanks in advance.

What is the main point of the shape-radiance ambiguity?

Dear author,

While reading your NeRF++ paper, I couldn't fully understand the shape-radiance ambiguity (Section 3 of the paper).

  1. Is the purpose of the Figure 2 experiment illustrating the ambiguity to show that the NeRF model can fit training data for an arbitrary 3D shape?
    If so (as verified by the Figure 2 experiment), how is this fact related to Factor 1 ("c" must become a high-frequency function as "sigma" deviates from the correct shape)?

  2. Why does Factor 2 (the NeRF MLP structure implicitly regularizes "c" toward a smooth BRDF prior w.r.t. "d") help NeRF avoid the shape-radiance ambiguity?

  3. How are Factors 1 and 2 logically related? They seem unrelated, since Factor 1 argues the NeRF MLP has limited capacity to model high complexity given an incorrect shape, while Factor 2 argues the NeRF MLP implicitly regularizes "c" to be smooth w.r.t. "d" at any given "x".

Thank you.

Best regards,
YJHong.

How to understand the intersect_sphere in your code?

def intersect_sphere(ray_o, ray_d):
    '''
    ray_o, ray_d: [..., 3]
    compute the depth of the intersection point between this ray and unit sphere
    '''
    # note: d1 becomes negative if this mid point is behind camera
    d1 = -torch.sum(ray_d * ray_o, dim=-1) / torch.sum(ray_d * ray_d, dim=-1)
    p = ray_o + d1.unsqueeze(-1) * ray_d
    # consider the case where the ray does not intersect the sphere
    ray_d_cos = 1. / torch.norm(ray_d, dim=-1)
    p_norm_sq = torch.sum(p * p, dim=-1)
    if (p_norm_sq >= 1.).any():
        raise Exception('Not all your cameras are bounded by the unit sphere; please make sure the cameras are normalized properly!')
    d2 = torch.sqrt(1. - p_norm_sq) * ray_d_cos

    return d1 + d2
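
In case it helps anyone reading this, here is a quick numeric check (my own sketch, assuming the intersect_sphere definition above and torch are available) that the returned depth matches the closed-form root of ||ray_o + t * ray_d|| = 1:

    import torch

    ray_o = torch.tensor([[0.1, 0.2, -0.3]])   # camera origin inside the unit sphere
    ray_d = torch.tensor([[0.0, 0.0, 1.0]])    # ray direction (need not be unit length)

    t = intersect_sphere(ray_o, ray_d)

    # solve ||o + t*d||^2 = 1 directly: ||d||^2 t^2 + 2 (o.d) t + ||o||^2 - 1 = 0
    a = (ray_d * ray_d).sum(-1)
    b = 2.0 * (ray_o * ray_d).sum(-1)
    c = (ray_o * ray_o).sum(-1) - 1.0
    t_quad = (-b + torch.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

    print(torch.allclose(t, t_quad))                              # True
    print(torch.norm(ray_o + t.unsqueeze(-1) * ray_d, dim=-1))    # ~1.0, point lies on the sphere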

General use case

Does this method only work for 360° unbounded scenes? Does it work on, for example, forward-facing scenes as in NeRF? Has anyone tested this?
I recently tried applying it to a driving scene, where the images are photos taken from a forward-moving car. I defined the sphere center as the last camera position and the radius as 8 times the distance travelled (as for the T&T dataset); the poses look like the image below.

image

When I use NeRF, it works well with the NDC setting since everything lies inside the frustum in front of camera 0. However, with NeRF++ it fails to distinguish the foreground (fg) and background (bg): when I check the training output, it learns everything as fg and the bg is all black. And since the faraway scenery is bg, it learns it very badly. So I wonder whether this only works for 360° unbounded scenes, where the fg/bg split is easier to make?

Training on my own dataset, the loss is NaN

Hi @Kai-46, after using COLMAP to get the poses and intrinsics of my own dataset and training from scratch, the loss becomes NaN; when I print the network's output, the values in ret['rgb'] are all NaN.

I wonder whether the poses and intrinsics are wrong (data[key]['K'] stores the intrinsics and data[key]['W2C'] the pose, right?) or whether I need to adjust some training hyperparameters.

Hope you can help, thanks~
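
One sanity check worth trying here (a sketch of my own, assuming data[key]['W2C'] is a flattened 4x4 world-to-camera matrix as the question suggests) is to confirm that every camera center lands inside the unit sphere after normalization, since the foreground sampling assumes that:

    import numpy as np

    def camera_center_norms(cam_dict):
        norms = {}
        for name, cam in cam_dict.items():
            # assumes 'W2C' is a flattened 4x4 world-to-camera matrix [R | t; 0 1]
            W2C = np.array(cam['W2C'], dtype=np.float64).reshape(4, 4)
            R, t = W2C[:3, :3], W2C[:3, 3]
            norms[name] = np.linalg.norm(-R.T @ t)   # camera center in world coordinates
        return norms

    # for name, n in camera_center_norms(data).items():
    #     if n >= 1.0:
    #         print(name, 'camera center lies outside the unit sphere:', n)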

About shape-radiance ambiguity

Thanks for your work!

I have some questions about the solution you construct to demonstrate the shape-radiance ambiguity in Paragraph 2 of Section 3:

To illustrate this ambiguity, imagine that for a given scene we represent the geometry as a unit
sphere. In other words, let us fix NeRF’s opacity field to be 1 at the surface of the unit sphere,
and 0 elsewhere. Then, for each pixel in each training image, we intersect a ray through that pixel
with the sphere, and define the radiance value at the intersection point (and along the ray direction)
to be the color of that pixel. This artificially constructed solution is a valid NeRF reconstruction
that perfectly fits the input images.

1. Does this mean that the opacity field inside the unit sphere is fixed to 0?
2. If the opacity field is 1 only at the surface, the integral in Eq. (2) should be zero, since there are at most two non-zero points along the ray.
3. Or do you set $dt$ (the step size of the numerical integration) to 1?

So I cannot figure out why this is a valid solution... Can you help me?
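
Not the author, but one way to read that construction is that "opacity 1 at the surface" means a surface-like (delta) density rather than a finite sigma at isolated points, so the transmittance drops from 1 to 0 exactly at the sphere and the quadrature in Eq. (2) returns the radiance assigned at the intersection point. A minimal numeric sketch (my own illustration, not code from the repo):

    import torch

    t = torch.linspace(0.0, 2.0, 1001)             # sample depths along one ray
    dt = t[1:] - t[:-1]
    t_mid = 0.5 * (t[1:] + t[:-1])

    surface_depth = 1.2                            # where this ray hits the sphere
    sigma = torch.where((t_mid - surface_depth).abs() < 0.002,
                        torch.tensor(1e4), torch.tensor(0.0))      # delta-like shell at the surface
    color = torch.full_like(t_mid, 0.7)            # radiance assigned at the surface

    alpha = 1.0 - torch.exp(-sigma * dt)
    T = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha[:-1]]), dim=0)
    print((T * alpha * color).sum())               # ~0.7: the pixel color is reproduced exactly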

question on camera position

Thanks for open-sourcing this great repo!

In my situation, I sample camera positions on the surface of a unit sphere centered at the world origin. The sampled cameras are distributed along the x-axis and look at the world origin, and the object/scene is supposed to be around the world origin. Then, with these camera positions, I use the look-at rule to calculate camera-to-world transform matrices (a minimal look-at sketch is included at the end of this question). My question is whether this camera setting is compatible with the requirements of nerf++, because I noticed that "Opencv camera coordinate system is adopted, i.e., x--->right, y--->down, z--->scene." Here's an example of 32 sampled camera positions in (x, y, z) format.

[[ 0. , 0. , 1. ],
[-0.011, 0. , 1. ],
[-0.016, 0. , 1. ],
[ 0.02 , 0. , 1. ],
[ 0.023, 0. , 1. ],
[ 0.025, 0. , 1. ],
[ 0.028, 0. , 1. ],
[-0.03 , 0. , 1. ],
[-0.032, 0. , 0.999],
[ 0.034, 0. , 0.999],
[-0.036, 0. , 0.999],
[ 0.038, 0. , 0.999],
[-0.039, 0. , 0.999],
[-0.041, 0. , 0.999],
[-0.042, 0. , 0.999],
[ 0.044, 0. , 0.999],
[-0.045, 0. , 0.999],
[ 0.047, 0. , 0.999],
[-0.048, 0. , 0.999],
[ 0.049, 0. , 0.999],
[ 0.051, 0. , 0.999],
[ 0.052, 0. , 0.999],
[ 0.053, 0. , 0.999],
[ 0.054, 0. , 0.999],
[-0.056, 0. , 0.998],
[ 0.057, 0. , 0.998],
[ 0.058, 0. , 0.998],
[ 0.059, 0. , 0.998],
[-0.06 , 0. , 0.998],
[-0.061, 0. , 0.998],
[ 0.062, 0. , 0.998],
[ 0.063, 0. , 0.998]]

Thank you so much!
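
For reference, here is a minimal look-at sketch in the OpenCV convention quoted above (x right, y down, z into the scene); the world-up vector and the helper name are my own assumptions, not code from the repo:

    import numpy as np

    def lookat_c2w(cam_pos, target=np.zeros(3), world_up=np.array([0.0, 1.0, 0.0])):
        # OpenCV convention: x -> right, y -> down, z -> into the scene.
        z = target - cam_pos
        z = z / np.linalg.norm(z)              # forward (toward the scene)
        x = np.cross(z, world_up)
        x = x / np.linalg.norm(x)              # right
        y = np.cross(z, x)                     # roughly opposite world_up, i.e. "down"
        c2w = np.eye(4)
        c2w[:3, 0], c2w[:3, 1], c2w[:3, 2], c2w[:3, 3] = x, y, z, cam_pos
        return c2w

    print(lookat_c2w(np.array([0.0, 0.0, 1.0])))   # camera on the unit sphere looking at the origin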

Running run_colmap.py failed in the last step.

image

image

Hi,

Great work!

When I ran run_colmap.py, I encountered this error:
NameError: name 'mesh' is not defined.

Besides, I find that in your current code, tf is also not defined, because both in_geometry_file and out_geometry_file are None.

Could you fix these problems?

Run on custom data

I have run COLMAP on my own data. Then I convert it to json format by running extract_sfm.py, and then I run normalize_cam_dict.py to get the normalized json. But when I am training the model, it requires intrinsics.txt and other files that are not generated by my pipeline. How can I run your code on our custom dataset?

Process 1 terminated with the following error

2022-11-18 23:46:13,697 [INFO] root: tat_training_Truck step: 0 resolution: 1.000000 level_0/loss: 0.064675 level_0/pnsr: 11.892565 level_1/loss: 0.064430 level_1/pnsr: 11.909071 iter_time: 0.250360
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/~~/anaconda3/envs/nerfplusplus/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/~~/anaconda3/envs/nerfplusplus/lib/python3.6/site-packages/tensorboardX/event_file_writer.py", line 202, in run
    data = self._queue.get(True, queue_wait_duration)
  File "/~~/anaconda3/envs/nerfplusplus/lib/python3.6/multiprocessing/queues.py", line 108, in get
    res = self._recv_bytes()
  File "/~~/anaconda3/envs/nerfplusplus/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/~~/anaconda3/envs/nerfplusplus/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/~~/anaconda3/envs/nerfplusplus/lib/python3.6/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError

Traceback (most recent call last):
  File "ddp_train_nerf.py", line 604, in <module>
    train()
  File "ddp_train_nerf.py", line 599, in train
    join=True)
  File "/~~/anaconda3/envs/nerfplusplus/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/~~/anaconda3/envs/nerfplusplus/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
    while not context.join():
  File "/~~/anaconda3/envs/nerfplusplus/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 119, in join
    raise Exception(msg)
Exception:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/~~/anaconda3/envs/nerfplusplus/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 20, in _wrap
    fn(i, *args)
  File "/~~/nerfplusplus-master/ddp_train_nerf.py", line 488, in ddp_train_nerf
    idx = what_val_to_log % len(val_ray_samplers)
ZeroDivisionError: integer division or modulo by zero

How can I resolve this problem? Could you help me?

Model output for demo examples are blurry

After training with the default configs listed in the repo, I got much blurrier rendered images than what has been demonstrated. In particular, the truck scene was trained for 500K iterations; same situation for the train scene. I might be missing something important in the training phase.

What are the training params for the best model performance with high resolutions?

For reference, see the two images below: the first is rendered, the second is the ground truth.

image

image

Question about the background net

Hi, amazing work! I have a small question about the model:
At L119:

bg_dists = torch.cat((bg_dists, HUGE_NUMBER * torch.ones_like(bg_dists[..., 0:1])), dim=-1) # [..., N_samples]

why is "dists" the inverse distance instead of the real distance? It seems wrong for volume rendering, or am I missing something?
Thanks so much!
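
Not the author, but my reading is that the background network is integrated in the inverted-sphere coordinate, where samples are placed linearly in inverse depth 1/r in (0, 1], so "dists" are spacings in that coordinate (with HUGE_NUMBER standing in for the final interval out toward infinity) rather than spacings in metric depth. A small illustration of the correspondence (my own sketch, not repo code):

    import torch

    N_samples = 8
    u = torch.linspace(1.0, 1.0 / N_samples, N_samples)   # inverse depth 1/r: 1 -> 1/8
    r = 1.0 / u                                            # corresponding real depth: 1 -> 8

    du = u[:-1] - u[1:]    # uniform spacing in the inverted coordinate
    dr = r[1:] - r[:-1]    # spacing in metric depth grows quickly
    print(du)              # constant steps of 0.125
    print(dr)              # 0.14, 0.19, 0.27, 0.40, 0.67, 1.33, 4.00

Uniform steps in 1/r keep the quadrature bounded all the way out to infinity, whereas metric-depth steps would blow up.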

About run_colmap

Thanks for your work, first of all. I am now working on my own outdoor datasets and have had bad results with previous NeRF methods. I want to try your code, but I fail to build the dataset in your format because of run_colmap.py. I wonder where install/bin/colmap is supposed to be under the colmap/build directory. I only know how to use cmake to compile COLMAP and cannot find the above path. Could you give me some advice?

What LICENSE is used for nerfplusplus?

Thanks for a great repository !!!

I'd like to use the nerfplusplus code for daily tasks at my job, if possible.
So, what kind of license applies to this repository?

Code for initializing NeRF's geometry as a sphere

Hi! Thanks for the elaborate analysis and informative experiments!

I noticed that you conducted an experiment where the geometry (opacity field) of NeRF is fixed (or initialized) as a sphere. Could you please also share the code for that experiment?

Thanks in advance!

COLMAP pipeline gives faulty results on custom and vanilla data

Thanks for all the hard work on this!

Summary

The provided COLMAP pipeline is giving apparently faulty results, making it difficult to use my own custom data. I've confirmed this by running your given data through the pipeline with minimal changes to the code.

Overview

I'm attempting to run NeRF++ on my own custom dataset and I ran into very blurry and unusable results after running it through the COLMAP pipeline and training step. To isolate the issue, I ran the full dataset conversion on your dataset, specifically tat_training_Truck in your provided tanks and temples dataset.

I ran run_colmap.py on a new directory with just the rgb images of tat_training_Truck. Multiple issues arose when I visualized the results.

1. Focal point is out of frame in epipolar geometry visualization

I'm not terribly familiar with epipolar geometry, but I assume that the epipolar lines should converge within the view of the given frame (I assume this is the focal point? Please correct me if I'm wrong). This does not occur in the given dataset despite the camera pose pointing at the object of interest, which tells me that the outputted intrinsic matrix is incorrect.
image
green camera is visible on left side of image, seemingly oriented and positioned correctly
Screenshot from 2020-12-15 12-16-25
visualization of epipolar geometry of this pose

This tells me that there's some bug in the run_colmap.py pipeline that results in a bad intrinsic matrix.

2. Camera path not fully normalized to unit sphere

This was not an issue with my custom dataset, but it seems to be here. I visualized the automatic normalization that your script performed and the camera track did not get bound to the unit sphere. Additionally, there seems to be no built-in support for normalizing the kai_points.ply pointcloud. You seemed to have successfully normalized it in the example you gave, so I have two questions on this point:

  1. How do you successfully normalize these camera poses within the unit sphere?
  2. How do you normalize the kai_points.ply pointcloud and convert it to a mesh like you did in your example?

image
This comes straight out of the vanilla COLMAP pipeline, which is very different from the posted example

3. Blurry training results

I figure that this is a consequence of issue 1. However, I can't demonstrate this for the vanilla data, since its poses aren't successfully normalized per issue 2. Here's a sample of the blur experienced after training on a chair for many, many hours:
Screenshot from 2020-12-15 12-55-58

I also wrote my own converter that takes this outputted COLMAP data and transforms it into NeRF++-readable format. I figure no bugs from there are present here since this is before that conversion even takes place. On that note, if you have official code for this process I'd also love to take a look.

End

Since I performed minimal modifications upon the code and I'm using vanilla data, I figure there's either a bug in the system or I'm doing this fundamentally improperly. Do you have any suggestions on how to fix this so that I can use my own custom data without running into these same issues?

COLMAP code providing bad camera poses on provided data

Hi,
I am having trouble working with the camera geometries generated by the run_colmap.py script, and thus have not generated any usable model for nerfplusplus so far. I have downloaded the tanks and temples data and rerun the COLMAP generation on just the images in the training_truck directory, and obtained the following image:
Screenshot from 2021-05-07 22-26-51

This was after several guesses in the code, such as commenting out this line here, which gave me an error because 'mesh' didn't exist, and guessing that my generated file at $DATASET_PATH/posed_images/kai_cameras_normalized.json is the file expected as train/cam_dict_norm.json. So please let me know if I've done anything wrong in that regard.

I cannot even run the camera visualisation code as I get this error:
GLX: Failed to create context: BadValue (integer parameter out of range for operation)

I appreciate any help you can provide.

Colmap creates 0 and 1 directory under sfm/sparse

I just have a scene captured from multiple views and placed all 39 images in one directory...

Once I run the script run_colmap.py, I see two folders being created under sfm/sparse, namely 0 and 1. The "0" folder contains info for 35 of the images, and the "1" folder contains the rest.

May I know the reason behind the creation of the 0 and 1 folders?

For MVS, which folder path do I need to give?

LPIPS version

Hi,

Thanks for the great work. Could you please tell me what version of LPIPS was used to obtain the results as stated in the paper?
i.e. AlexNet, VGG or SqueezeNet?

Thanks in advance

What is autoexposure?

Hi, I am YJHong and thanks for your great work!

I wonder what the autoexposure option for NeRF is.

Is it a necessary option for running the NeRF++ code?

Thank you,
YJHong.

about run_colmap.py

The cameras, images, and points3D files do not exist at “...”\output\sfm\sparse.
How can I solve this problem?

run_colmap json files

Hi,

Thanks for sharing your code. I am running the COLMAP script on my own data and it produces json files, but all the example scenes you provide have the camera parameters and poses in txt files. Is there a utility to convert the jsons to txt files, or can the main training script understand both?

George
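
Not an official utility, but a conversion along these lines seems workable. It assumes each json entry stores flattened 4x4 'K' and 'W2C' matrices and that the training code expects per-image intrinsics/ and pose/ txt files holding flattened 4x4 matrices with camera-to-world poses, as in the released tanks-and-temples scenes; please double-check those assumptions against a released scene before training:

    # hypothetical converter sketch: json (from run_colmap.py) -> per-image txt files
    import json, os
    import numpy as np

    def json_to_txt(json_path, out_dir):
        with open(json_path) as f:
            cam_dict = json.load(f)
        os.makedirs(os.path.join(out_dir, 'intrinsics'), exist_ok=True)
        os.makedirs(os.path.join(out_dir, 'pose'), exist_ok=True)
        for img_name, cam in cam_dict.items():
            base = os.path.splitext(img_name)[0]
            K = np.array(cam['K']).reshape(4, 4)                       # assumed flattened 4x4
            C2W = np.linalg.inv(np.array(cam['W2C']).reshape(4, 4))    # world-to-camera -> camera-to-world
            np.savetxt(os.path.join(out_dir, 'intrinsics', base + '.txt'), K.reshape(1, 16))
            np.savetxt(os.path.join(out_dir, 'pose', base + '.txt'), C2W.reshape(1, 16))

    # json_to_txt('posed_images/kai_cameras_normalized.json', 'my_scene/train')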

about scene normalization

In your implementation, scene normalization is just camera position normalization. In my understanding, this is equivalent to scaling the size of the world. So should the intrinsics of the camera also be scaled?
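
Not the author, but a quick way to convince yourself either way is to check whether a uniform world scaling changes the projected pixel; a minimal sketch (my own, not from the repo) suggests the intrinsics can stay as they are:

    import numpy as np

    s = 0.1                                          # normalization scale
    K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
    R = np.eye(3)
    C = np.array([0., 0., -4.])                      # camera center
    X = np.array([0.5, -0.2, 2.0])                   # a scene point

    def project(K, R, C, X):
        x = K @ (R @ (X - C))
        return x[:2] / x[2]

    print(project(K, R, C, X))                       # original world
    print(project(K, R, s * C, s * X))               # uniformly scaled world: same pixel

Since the scaling multiplies the camera-space coordinates uniformly, the perspective division cancels it out.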

Explanation of intersect_sphere and a faster implementation

This function computes the intersection depth, but there is no explanation either in the paper or in the code.

def intersect_sphere(ray_o, ray_d):
    '''
    ray_o, ray_d: [..., 3]
    compute the depth of the intersection point between this ray and unit sphere
    '''
    # note: d1 becomes negative if this mid point is behind camera
    d1 = -torch.sum(ray_d * ray_o, dim=-1) / torch.sum(ray_d * ray_d, dim=-1)
    p = ray_o + d1.unsqueeze(-1) * ray_d
    # consider the case where the ray does not intersect the sphere
    ray_d_cos = 1. / torch.norm(ray_d, dim=-1)
    p_norm_sq = torch.sum(p * p, dim=-1)
    if (p_norm_sq >= 1.).any():
        raise Exception('Not all your cameras are bounded by the unit sphere; please make sure the cameras are normalized properly!')
    d2 = torch.sqrt(1. - p_norm_sq) * ray_d_cos
    return d1 + d2

So in case it's not clear to somebody, I'd like to provide some insight into how it is calculated, plus a faster implementation based on my approach:
We have the origin o and the direction d, and we want the intersection depth with the unit sphere.
A straightforward method is to find t such that ||o + t*d|| = 1.
Squaring both sides gives a quadratic equation in t:

||d||^2 * t^2 + 2*(o·d)*t + ||o||^2 - 1 = 0

Then we can solve for t using the quadratic formula.

It results in the following implementation:

def intersect_sphere(rays_o, rays_d):
    odotd = torch.sum(rays_o*rays_d, 1)
    d_norm_sq = torch.sum(rays_d**2, 1)
    o_norm_sq = torch.sum(rays_o**2, 1)
    determinant = odotd**2+(1-o_norm_sq)*d_norm_sq
    assert torch.all(determinant>=0), \
        'Not all your cameras are bounded by the unit sphere; please make sure the cameras are normalized properly!'
    return (torch.sqrt(determinant)-odotd)/d_norm_sq

which I have verified to yield the same result (epsilon-close) as the original implementation, but 5-10x faster (11ms vs 2ms for 100k rays on my PC, not that significant though).

Another possible code optimization is to normalize rays_d from the beginning; that way we can get rid of d_norm_sq in intersect_sphere, and also here:

ray_d_norm = torch.norm(ray_d, dim=-1, keepdim=True) # [..., 1]
viewdirs = ray_d / ray_d_norm # [..., 3]
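
For completeness, here is a sketch of that simplification (my own, assuming rays_d has been normalized upstream so that d_norm_sq == 1):

    import torch

    def intersect_sphere_unit_dirs(rays_o, rays_d_unit):
        # same as the quadratic-formula version above, with d_norm_sq == 1 dropped;
        # rays_d_unit must be unit-length
        odotd = torch.sum(rays_o * rays_d_unit, dim=-1)
        o_norm_sq = torch.sum(rays_o ** 2, dim=-1)
        determinant = odotd ** 2 + 1.0 - o_norm_sq
        assert torch.all(determinant >= 0), \
            'Not all your cameras are bounded by the unit sphere; please make sure the cameras are normalized properly!'
        return torch.sqrt(determinant) - odotd

    # usage: rays_d_unit = rays_d / torch.norm(rays_d, dim=-1, keepdim=True)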

How do I use camera poses I already have?

I have a new set of custom images. I also have the camera parameters.

Can you please let me know what the structure of the camera poses should be, both the intrinsic and extrinsic parameters, so that I can run the code on them?

question about unit sphere

During rendering, must the camera position (i.e., ray_o) lie within the unit sphere? What if we want to render from beyond the sphere?
image

Camera path meaning and using of poses

Hello!

I've got two questions for you, hope it's fine.

First one: what exactly is the camera path and how do I obtain it? I'm thinking that you need a video forming a path, and for that you need to extract camera poses and intrinsics? Correct me if I'm wrong; I want to understand the concept.

Second one: do we need to operate with normalized or unnormalized poses? The two .json files both store the cameras, but I don't know which one to choose and use for my custom dataset.

Thank you and stay safe!

Add a Google Colab

Hi,

does it run on Google Colab?
Will add a version myself, if I have the time. 👍

split_size error when training

Thank you for your perfect work. When I train on my own dataset, I get the following error:
Screenshot 2021-03-17, 1:18:58 AM

Could you please help me deal with it? Thanks so much!

Colmap

Hello,

I am having problems running the script: I can't find the proper path for colmap_bin in the run_colmap.py script when using COLMAP 3.6 for Windows. I tried to reproduce your path, but without success. Can you please help with the colmap_bin path?
Thanks in advance!

Preprocessing data

Hello!

First of all, nice job! I was wondering how we can preprocess new data and build our own dataset with generated poses and intrinsics. Thanks in advance!

about inverted sphere parameterization

To do volume rendering, we need to get x', y', z'.
image

And in the paper, in order to find x', y', z', it is said that they are obtained by rotating point a in the figure.

If you just divide x, y, z by r, isn't that x', y', z'? Why do I have to compute it the hard way, as in the figure? Am I misunderstanding something?

Tanks and Temples pretrained models

Hello, it seems that the pretrained models don't correspond to the code, because when I load the train scene the PSNR at testing time is 7.

Could you please provide the pretrained models again? Thanks!!! I really appreciate it!!!

Sara
