chenhsuanlin / bundle-adjusting-nerf

BARF: Bundle-Adjusting Neural Radiance Fields 🤮 (ICCV 2021 oral)

License: MIT License

Python 100.00%

bundle-adjusting-nerf's Issues

Documentation on the LLFF data scene?

Hi Chen-Hsuan,

Do you have any documentation on how to set up an experiment for another scene? I did not find anything in the README or in the paper.

I'm referring to these folders and files:

Thanks in advance,

Why do you use Xavier init instead of Kaiming init for the parameters of NeRF?

I tried the built-in initialization (Kaiming init), and the network could not converge. Why is Xavier init necessary?

How was focal length calculated for the iPhone 12 in the example?

In the example in data/iphone.py the focal length is computed as self.focal = self.raw_W * 4.2 / (12.8 / 2.55). I am wondering what the specific constants mean?
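The expression has the shape of the usual pinhole conversion from a focal length in millimetres to one in pixels: image width in pixels, times focal length, divided by sensor width. Reading 4.2 as the lens focal length in mm and 12.8/2.55 as the sensor width in mm is an assumption here, not something the source confirms. `focal_in_pixels` and the 1920-pixel width are hypothetical, used only for illustration.

```python
def focal_in_pixels(image_width_px, focal_mm, sensor_width_mm):
    """Pinhole conversion: focal length in pixels from physical quantities.

    Assumes square pixels and that the image spans the full sensor width.
    """
    return image_width_px * focal_mm / sensor_width_mm


# Assumed reading of the constants in data/iphone.py (not confirmed by the source).
FOCAL_MM = 4.2
SENSOR_WIDTH_MM = 12.8 / 2.55

# Hypothetical image width; in the repo this would be self.raw_W.
raw_W = 1920
focal = focal_in_pixels(raw_W, FOCAL_MM, SENSOR_WIDTH_MM)
print(f"focal length: {focal:.1f} px")
```

Under that reading, a different camera would need its own focal length and sensor width in place of the two constants.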

How to use train data in class Graph

Hi, I am experimenting with the BARF code so that I can load another data format.

I want to load train_data (for example, extra data that I added in iphone.py besides the camera poses and images) and use it in def forward in nerf.py.
Could you tell me how I can load the data I want?

experiment

Could you please share the code for the "Planar Image Alignment" experiment? Thanks.

camera pose refinement starting from relatively correct initial poses instead of identity transformation matrices

Thank you @chenhsuanlin for sharing this impressive work. I wanted to ask whether it is possible to start BARF training from initial camera poses instead of identity matrices by changing only the configuration, or whether I have to adapt the code.

AttributeError: 'Model' object has no attribute 'train_data'

When I run extract_mesh.py, it seems that no data is loaded. What is the problem?

The command I ran, with its arguments, is as follows:
python extract_mesh.py --group=G1 --model=barf --yaml=barf_blender --name=First --data.scene=chair --data.val_sub= --resume

Training BARF with another real data

Hello Chen-Hsuan,

Thank you again for releasing the code.
I have tried to train BARF with Freiburg Cars dataset. (please see the attached mp4 file)
As you can see, it is a real dataset with a spherical sequence. (hybrid between LLFF and NeRF-Synthetic)

I ran COLMAP to obtain camera pose, and have tried to train the model but it was not successful.
I have tried the following:

Starting from zero 6-dimensional se3 vectors (as in LLFF), but the model could not find the proper poses.
Adding noise (0.15) to the GT poses, but the rotation and translation errors keep increasing. As far as I understand, the model should find the correct poses first and then optimize the NeRF parameters, but this does not happen. Please see the images. The validation error keeps increasing because the pose estimation was not successful.

Separately, NeRF could be trained with this dataset successfully.
Please advise me on how I should proceed.

Cheers,
Wonbong

car001.mp4

Weird training behavior of BARF

Hi, @chenhsuanlin

Thanks for your great work. I'm trying barf on other scenes. However, the training behavior seems weird. As you can see from the image below, the rotation error is decreasing, but the translation error keeps increasing.

When I looked at the synthesized validation image, the result seemed to be shifted by several pixels from the original image, and the scale was not consistent with the original image either.

For my experimental setting, I used COLMAP to compute the ground-truth camera poses and intrinsics. The initial camera poses for BARF are not identities; instead, I perturb the ground-truth poses with a noise level of 0.15. Are there any parameters we need to fine-tune?

Part of the scene looks like this:

And the reconstructed scene:

scale calculation of sim3

def procrustes_analysis(X0,X1): # [N,3]
    # translation
    t0 = X0.mean(dim=0,keepdim=True)
    t1 = X1.mean(dim=0,keepdim=True)
    X0c = X0-t0
    X1c = X1-t1
    # scale
    s0 = (X0c**2).sum(dim=-1).mean().sqrt()
    s1 = (X1c**2).sum(dim=-1).mean().sqrt()
    X0cs = X0c/s0
    X1cs = X1c/s1
    # rotation (use double for SVD, float loses precision)
    U,S,V = (X0cs.t()@X1cs).double().svd(some=True)
    R = (U@V.t()).float()
    if R.det()<0: R[2] *= -1
    # align X1 to X0: X1to0 = (X1-t1)/s1@R.t()*s0+t0
    sim3 = edict(t0=t0[0],t1=t1[0],s0=s0,s1=s1,R=R)
    return sim3

These are lines 278 to 295 of camera.py.
@chenhsuanlin
Hello, I have a question: why do we take the average instead of the square when calculating s0 and s1? Here s represents the scale of the scene, so it seems more reasonable to square first and then average.
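For reference, the scale lines quoted above square each centered coordinate, sum over x, y, z to get a squared distance per point, average over the points, and only then take the square root. In other words, s0 and s1 are root-mean-square distances to the centroid. The pure-Python sketch below mirrors just that step; `rms_scale` is a hypothetical name, not a function from camera.py.

```python
import math


def rms_scale(points):
    """Root-mean-square distance of 3D points to their centroid.

    Mirrors (Xc**2).sum(dim=-1).mean().sqrt() from the quoted snippet,
    using plain lists of (x, y, z) tuples instead of tensors.
    """
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    # Squared distance of each point to the centroid (square, then sum over axes).
    sq_dists = [sum((p[i] - centroid[i]) ** 2 for i in range(3)) for p in points]
    # Mean over points, then square root.
    return math.sqrt(sum(sq_dists) / n)


unit_points = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
doubled_points = [(2 * x, 2 * y, 2 * z) for x, y, z in unit_points]
print(rms_scale(unit_points), rms_scale(doubled_points))
```

Doubling every coordinate doubles the result, which is the behavior expected of a scale estimate.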

how to visualize positional encoding in visdom?

Figure 5 in the paper shows the naive positional encoding result for NeRF, displaying the GT poses and the refined poses with full positional encoding.

I looked into the NeRF part of the code, but found nothing relevant.

bundle-adjusting-NeRF/model/nerf.py

Line 89 in 803291b

def visualize(self,opt,var,step=0,split="train",eps=1e-10):

The visualization function in nerf.py only supports TensorBoard and does not include visdom, unlike barf.py.

It would be great if you could offer some suggestions about this issue. Thanks a lot.

What do the numbers in get_camera() mean?

Thanks for the wonderful project.
I wonder what these numbers mean, and whether I should change them when testing on my own dataset.

bundle-adjusting-NeRF/data/iphone.py

Line 65 in 803291b

self.focal = self.raw_W*4.2/(12.8/2.55)

Cannot reproduce results on llff:fern

Thanks for sharing the code.

I followed all the tips in the README and tried to reproduce the results of your paper.
However, the rotation error in my test is much higher than the one reported in the paper.

llff:fern    rotation    translation (x100)
paper        0.191       0.192
reproduce    0.689       0.193

Can you give some advice?
Besides, the depth map and the rendered RGB seem good.

Thanks!

Question about camera pose transformation for LLFF

Hi, Chen-Hsuan Lin.
Thank you for sharing the great work!

I have been reading the code, and I did not fully understand the camera pose transformation applied when calling the __getitem__ method of the LLFF dataset:
https://github.com/chenhsuanlin/bundle-adjusting-NeRF/blob/main/data/llff.py#L104

In my understanding, the camera pose returned by parse_cameras_and_bounds is a camera-to-world matrix whose coordinate system is [right, up, backwards].
https://github.com/chenhsuanlin/bundle-adjusting-NeRF/blob/main/data/llff.py#L42

Then, the camera pose is transformed by parse_raw_camera when calling __getitem__, but I could not follow what the transformation did:
https://github.com/chenhsuanlin/bundle-adjusting-NeRF/blob/main/data/llff.py#L104
Could you please let me know?

CUDA error when running the code in Docker

Thank you for your code. I tried to run it in a Docker environment. I disabled the visdom visualization and ran the following command:
CUDA_VISIBLE_DEVICES=1 python3 train.py --group=llff --model=barf --yaml=barf_llff --name=orchids --data.scene=orchids --barf_c2f=[0.1,0.5] --visdom!

Then I ran into the following problem:

Could you please help me? What did I do wrong?

Misaligned axes when converting LLFF data format to BARF coordinate frame

In data/llff.py, in the parse_cameras_and_bounds method, we are converting the poses_bounds.npy and ingesting it to use the given camera poses. Per LLFF's specification, it seems like the convention of this dataset has the transformation matrix for axes [down, right, backward] (i.e. positive x is down, positive y is right, positive z is backward).

Per line 49 in the aforementioned file, it seems like we are swapping these axes to switch to a new convention

poses_raw[...,0],poses_raw[...,1] = poses_raw[...,1],-poses_raw[...,0]

moving from [down, right, backward] to [right, up, backward] (i.e. positive x is right, positive y is up, positive z is backward).

However, the translation vector doesn't seem to be receiving the same modification. Is this behavior intended, as it seems inconsistent with the rest of the change?
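If the stored matrix is a camera-to-world pose [R | t], as the LLFF convention describes, then the first three columns are the camera axes expressed in world coordinates and the fourth is the camera center in world coordinates. Relabeling the camera axes only permutes and negates the axis columns; the camera center itself does not move. The sketch below checks this on a made-up pose using plain lists. `relabel_camera_axes` and `cam_to_world` are hypothetical helpers, and whether this matches the author's intent is an assumption.

```python
def relabel_camera_axes(pose):
    """Convert a 3x4 camera-to-world pose from [down, right, backward]
    camera axes to [right, up, backward].

    Same column swap as the quoted line: new col 0 = old col 1,
    new col 1 = -old col 0. Columns 2 (backward) and 3 (translation) are kept.
    """
    return [[row[1], -row[0], row[2], row[3]] for row in pose]


def cam_to_world(pose, point_cam):
    """Apply a 3x4 camera-to-world pose to a point given in camera coordinates."""
    return [
        sum(pose[i][j] * point_cam[j] for j in range(3)) + pose[i][3]
        for i in range(3)
    ]


# Made-up pose: a 90-degree rotation about the z axis and translation (5, 6, 7).
pose_old = [
    [0.0, -1.0, 0.0, 5.0],
    [1.0, 0.0, 0.0, 6.0],
    [0.0, 0.0, 1.0, 7.0],
]
pose_new = relabel_camera_axes(pose_old)

# The same physical point: (down, right, backward) = (1, 2, 3)
# becomes (right, up, backward) = (2, -1, 3) under the new labels.
world_from_old = cam_to_world(pose_old, [1.0, 2.0, 3.0])
world_from_new = cam_to_world(pose_new, [2.0, -1.0, 3.0])
print(world_from_old, world_from_new)
```

Both calls land on the same world point with the translation column untouched, which is consistent with the translation not needing the same modification under this reading.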

Multi-GPU training (DataParallel)

Hi @chenhsuanlin

Thank you for sharing this nice work. I'm just curious whether you happen to have multi-GPU training code on hand. I was trying to train BARF with multiple GPUs but got stuck on a weird OOM issue: GPU memory usage explodes to over 50 GB, while your original codebase takes less than 10 GB on blender/lego.

Here's the edit I made: Ir1d@904228c
The command to run: CUDA_VISIBLE_DEVICES=0,1 python train.py --group=blender --model=barf --yaml=barf_blender --name=lego_baseline --data.scene=lego --gpu=1 --visdom! --batch_size=2

Do you know what might be leading to the OOM here?
Thank you!

When I try BARF on my own sequence, some error(s) occurred in test while loading state_dict for NeRF

Hi Chenhsuan!

Thank you for sharing the code! I tried BARF on my own sequence with the following commands:

train script: python train.py --group='barf' --model=barf --yaml=barf_iphone --name='iphone' --data.scene='cats' --arch.posenc!
test script: python evaluate.py --group='barf' --model=barf --yaml=barf_iphone --name='iphone' --data.scene='cats' --resume

Some error(s) occurred in the test while loading the state_dict for NeRF. Have you encountered this problem before? Thank you.

Errors:
restoring nerf...
Traceback (most recent call last):
File "evaluate.py", line 34, in <module>
main()
File "evaluate.py", line 28, in main
m.restore_checkpoint(opt)
File "/data1/lvxinbi/barf/model/base.py", line 53, in restore_checkpoint
epoch_start,iter_start = util.restore_checkpoint(opt,self,resume=opt.resume)
File "/data1/lvxinbi/barf/util.py", line 131, in restore_checkpoint
child.load_state_dict(child_state_dict)
File "/home/hotel_ai/anaconda3/envs/torch1.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for NeRF:
size mismatch for mlp_feat.0.weight: copying a param with shape torch.Size([256, 3]) from checkpoint, the shape in current model is torch.Size([256, 63]).
size mismatch for mlp_feat.4.weight: copying a param with shape torch.Size([256, 259]) from checkpoint, the shape in current model is torch.Size([256, 319]).
size mismatch for mlp_rgb.0.weight: copying a param with shape torch.Size([128, 259]) from checkpoint, the shape in current model is torch.Size([128, 283]).
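The shapes in the error are consistent with a checkpoint trained without positional encoding (the train command passes --arch.posenc!) being loaded into a model built with it (the test command does not). With NeRF-style encoding, a 3-vector grows to 3 + 3*2*L values. Assuming L = 10 for positions and L = 4 for view directions (values chosen because they reproduce the numbers in the traceback, not read from the repo config), the arithmetic works out as follows; `encoded_dim` is a hypothetical helper.

```python
def encoded_dim(input_dim, num_freqs, use_posenc=True):
    """Size of a vector after NeRF-style positional encoding.

    The raw input is kept, and each coordinate gains a sin and a cos
    term per frequency: input_dim + input_dim * 2 * num_freqs.
    """
    if not use_posenc:
        return input_dim
    return input_dim + input_dim * 2 * num_freqs


FEAT = 256  # hidden width, taken from the shapes in the traceback

for posenc in (False, True):
    xyz = encoded_dim(3, 10, posenc)   # assumed L = 10 for positions
    view = encoded_dim(3, 4, posenc)   # assumed L = 4 for view directions
    print(
        f"posenc={posenc}: mlp_feat.0 in={xyz}, "
        f"mlp_feat.4 in={FEAT + xyz}, mlp_rgb.0 in={FEAT + view}"
    )
```

If this reading is right, running evaluate.py with the same --arch.posenc! flag used for training should make the shapes match.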

how to plot the camera pose

Launching experiments in AzureML

Hi Chen-Hsuan,

Thanks for the great repo and a great piece of research! I think it would have even more impact if any user who wants to try it could launch training in the cloud, with automated hyperparameter sampling.

I have just made a PR which includes all the necessary infrastructure to be able to launch experiments in AzureML #14. This should enable faster and easier experimentation with your code for anyone who might want to use it. Let me know if you have any suggestions of how to make the PR better.

Best wishes,
Stan

question about llff_leaves dataset

I set barf_c2f = [0.1,0.5] but can't reproduce the result in the paper. Is this a hyperparameter issue?
Thank you.

Camera Pose Optimization with Geometry Prior

Hey,

I am trying to optimize the camera positions given a non textured mesh. Right now NeRF implementation does not consider this prior and samples random rays. I was wondering how do I provide my mesh as initial input to BARF? Since I already have the mesh it should be instant to fix the camera poses!

I've tried to use PyTorch3D as differentiable renderer with BARF's additional parameters for camera pose optimisation, but it doesn't work. The cameras drift away and loss becomes NaN.

How can I export the camera poses estimated by BARF?

When I use my own dataset to train the model, how can I export the camera pose file? Or where can I find it?

Question about test and evaluate

I would like to ask about the difference between test and evaluate. I notice that in the "test-optim" mode you still perform some refinement, while the "eval" part evaluates the results. If I have trained a model and want to test it and regenerate the figures, should I use the "test-optim" mode or the "eval" mode? My understanding is that I should use "test-optim". I would appreciate it if you could explain the difference between test and evaluate.

By the way, great work, and thank you in advance!

How to train with my own images?

Thanks for the code! That's really nice work!
But there seems to be little description of how to use my own images for training. Could you tell me more about it?

How to create a 3D-model from the output of iphone experiment?

Hi Chen-Hsuan,

I successfully created an experiment in the iphone.py style. That is, without any information of camera poses. Is it possible to create a 3D model from the output?

Thanks in advance,

questions about how to do the experiments on my own dataset

Hello Chenhsuan!
Thanks for sharing your code!
I have a problem running your experiments on my own dataset, which was shot on an iPhone.
I watched your YouTube video introducing BARF. At the end, you show results on your own everyday sequences, such as a living room or kitchen.
Could you tell me how I can apply your approach to my dataset, so that I can try novel view synthesis by just providing an image folder with no pre-computed camera poses?

initial camera parameters

Hi @chenhsuanlin,

Thanks for the great work. I have some initial estimates of camera parameters from some of my own images. I wanted to use the blender dataset format, and I was wondering whether I understand correctly that the transform_matrix in transforms_train.json for each frame is the camera extrinsic parameter in the form of [R | t]?
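In the NeRF synthetic (Blender) format, each frame's transform_matrix is a 4x4 camera-to-world matrix whose top-left 3x3 block is the rotation and whose last column holds the camera position. A minimal sketch of splitting one such entry follows. The JSON snippet is a made-up stand-in for transforms_train.json, and `split_pose` is a hypothetical helper.

```python
import json

# Made-up stand-in for one frame of transforms_train.json.
sample = """
{
  "camera_angle_x": 0.6911,
  "frames": [
    {
      "file_path": "./train/r_0",
      "transform_matrix": [
        [1.0, 0.0, 0.0, 0.5],
        [0.0, 1.0, 0.0, -1.0],
        [0.0, 0.0, 1.0, 4.0],
        [0.0, 0.0, 0.0, 1.0]
      ]
    }
  ]
}
"""


def split_pose(transform_matrix):
    """Split a 4x4 camera-to-world matrix into rotation R (3x3) and translation t (3)."""
    R = [row[:3] for row in transform_matrix[:3]]
    t = [row[3] for row in transform_matrix[:3]]
    return R, t


meta = json.loads(sample)
R, t = split_pose(meta["frames"][0]["transform_matrix"])
print("R =", R)
print("t =", t)
```

Estimates expressed as world-to-camera extrinsics would need to be inverted before being written in this form.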

Extract mesh or rendering on Custom Data?

Hi, thanks for sharing such good work!

I have a question regarding custom data. I used custom data to train my network, but I found that there is no code to render images or extract a mesh from the trained network.

Would you mind explaining how to represent custom data? Thank you.

Any help will be greatly appreciated!

graph.pose_noise not saved for models with blender dataset?

I hope this is not too dumb of a question, but I'm having trouble understanding how evaluate.py can correctly derive the mean rotation and translation errors from a model checkpoint file if the camera noise is not saved as part of the checkpoint?

As I understand it, self.graph.se3_refine is learning pose corrections from the identity (for non-blender models) or from randomly-initialized noise (for blender models). In the blender case, when evaluating a trained model, wouldn't the camera noise be different between training and evaluation?
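One way noise can match between training and evaluation without being stored in the checkpoint is to regenerate it from the same random seed. Whether evaluate.py relies on this is an assumption here, not something the source states. The sketch uses Python's random module rather than the repo's code; `make_pose_noise` is a hypothetical helper, and the camera count and noise level are example values.

```python
import random


def make_pose_noise(num_cameras, noise_std, seed):
    """Draw one 6-vector of Gaussian noise per camera from a seeded generator."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, noise_std) for _ in range(6)] for _ in range(num_cameras)]


train_noise = make_pose_noise(num_cameras=100, noise_std=0.15, seed=0)
eval_noise = make_pose_noise(num_cameras=100, noise_std=0.15, seed=0)
other_noise = make_pose_noise(num_cameras=100, noise_std=0.15, seed=1)

print(train_noise == eval_noise)   # same seed, same draws
print(train_noise == other_noise)  # different seed, different draws
```

Under this scheme the evaluation would only see the training noise if the seed and the order of random draws are identical in both scripts.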

Thanks for your help!

Camera pose perturbation in every iteration

Hi @chenhsuanlin,

Thanks for sharing the great work. I have a question: why are the camera poses perturbed after every iteration during training?

Image from validation cameras 'moves away'

Hi Chen-Hsuan,

Thanks for the great work.
Every time the validation code is run, the resulting renders seem to move away from the camera (see images).

The underlying reason is that the arguments passed to preprocess_camera.py are passed by reference, so preprocess_camera.py changes the intrinsics of the underlying camera. This change occurs every time the validation code is run, so the intrinsics change, and the validation images become smaller and smaller. To avoid this issue, I have created a PR that detaches and clones the pose before preprocessing, therefore passing a copy of the intrinsics to be modified appropriately. #12
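The effect described above is ordinary aliasing: a function that modifies its argument in place keeps compounding the change when the same object is passed again. The sketch below reproduces that with a plain dict standing in for the intrinsics. `preprocess_inplace`, `preprocess_copy`, and the 0.5 factor are hypothetical and not taken from the repo or from PR #12.

```python
import copy


def preprocess_inplace(intrinsics, scale=0.5):
    """Scale the focal length by mutating the caller's object (the buggy pattern)."""
    intrinsics["focal"] *= scale
    return intrinsics


def preprocess_copy(intrinsics, scale=0.5):
    """Scale the focal length on a copy, leaving the caller's object untouched."""
    out = copy.deepcopy(intrinsics)
    out["focal"] *= scale
    return out


shared = {"focal": 800.0}
inplace_results = [preprocess_inplace(shared)["focal"] for _ in range(3)]

shared = {"focal": 800.0}
copy_results = [preprocess_copy(shared)["focal"] for _ in range(3)]

print(inplace_results)  # shrinks on every validation pass
print(copy_results)     # stays the same on every pass
```

For PyTorch tensors the analogous copy is tensor.clone().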

Let me know if we can pull this into main - it avoids the objects 'moving away', making the validation images easier to analyse.

Kind regards,
Stan

LLFF Fern produces zoomed-in views

Thank you for sharing the code!

We are trying to reproduce the paper results, and run into an issue where the LLFF Fern seems to produce zoomed-in views. Is this expected, or should we change something in our procedure?

Thanks!

Output:

Reference:

Unreasonable training result with nerf_blender_repr.yaml

Hi,
I wish to replicate the results from the original NeRF paper, but when I train with nerf_blender_repr.yaml, I only get a PSNR of 26.38 on the lego scene. That is lower than with nerf_blender.yaml, which reaches a PSNR of 29.19.
Is there a mistake in nerf_blender_repr.yaml? I hope to hear from you.

Result seems not right

Hello, thanks for your great work!
I ran the BARF code on the chair scene of the blender dataset, but the validation result doesn't seem right in TensorBoard. Is this normal?
Train:

Test:

implement on registration

Dear Chen-Hsuan Lin, thanks for sharing your great work!
I have some difficulty understanding the main idea of the "camera pose registration" part of your paper. Is it correct to understand it as optimizing the initial camera poses through backpropagation? And where can I find this part in your codebase?

(You can answer in Chinese if that's more comfortable. I just saw your name, but I'm not sure. If not, ignore this line.)

Multiprocessing issue on Windows

Hi Chen-Hsuan,

There is an issue with multiprocessing on Windows (see screenshot below). Wrapping the code in train.py and evaluate.py in a main() function resolves it. I made a PR that fixes this issue: #11

Let me know if you think you can merge it into main :)

Kind regards,
Stan

What computer configuration are you running on?

Thanks for your great work. What computer configuration are you running on?

Train loss converged, but val loss does not

I use nuScenes, an autonomous-driving dataset that provides camera-to-world transform matrices, to train BARF. I normalized the translations to between 1 and 10. I used TensorBoard to visualize the training process and found that the train loss converged, but the val loss went up. Do you have any idea what the reason might be?

How to train for the case of a real life object and blender-like camera setting

Hello, thanks for sharing!
I'm trying to train BARF on my custom dataset. I don't know the camera intrinsics. The camera is placed at 3 positions on a sphere (pitch 0, 30, and 60 degrees); the object sits on a turntable, and I took a picture of it every 15 degrees.
So I have 72 pictures: (pitch 0, yaw 15, 30, 45, ..., 360), (pitch 30, yaw 15, 30, 45, ..., 360), and (pitch 60, yaw 15, 30, 45, ..., 360).
So, even though the camera is fixed at each pitch, the setup is similar to Blender-style camera movement.
BARF supports blender and llff, but I failed to provide the camera pose information (.json or .npz) that the blender or llff data loaders need, so I tried to use the iphone configuration.
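For reference, the capture described above amounts to 3 pitch angles times 24 yaw steps, that is, 72 views on a sphere around the object. A minimal sketch of enumerating those camera centers follows. The unit radius, the z-up axis convention, and the name `turntable_centers` are assumptions for illustration; BARF's data loaders may expect a different convention.

```python
import math


def turntable_centers(pitches_deg=(0, 30, 60), yaw_step_deg=15, radius=1.0):
    """Camera centers for a turntable capture, treated as a camera orbiting the object.

    Assumes z is up and the object sits at the origin.
    Returns a list of (pitch_deg, yaw_deg, (x, y, z)) tuples.
    """
    centers = []
    for pitch in pitches_deg:
        for yaw in range(yaw_step_deg, 360 + 1, yaw_step_deg):
            p, y = math.radians(pitch), math.radians(yaw)
            xyz = (
                radius * math.cos(p) * math.cos(y),
                radius * math.cos(p) * math.sin(y),
                radius * math.sin(p),
            )
            centers.append((pitch, yaw, xyz))
    return centers


centers = turntable_centers()
print(len(centers), "camera centers")
```

These are positions only; a full pose would also need an orientation that points each camera at the object.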

It seems to train to some degree, but it is still far from succeeding after 200000 steps.

I tried to initialize the camera poses on a sphere manually, but that gives worse results.

Please give me some hints on how to solve this.

Thanks.

Distorted visualization in visdom

Hi Chenhsuan!

Thank you for sharing the code! I ran into a small issue regarding the visdom visualization: the camera meshes are sometimes distorted, as shown in the figure below. Have you encountered this problem before? Thank you.

Hello, what is your experimental environment like?

How to optimize intrinsics as well?

Hi @chenhsuanlin great work!

I'm trying BARF on my custom data, and it shows promising results. However, I'm wondering whether the performance would be better if we optimized the intrinsics as well. Do you feel it's doable? If so, could you please point me to where I should modify the code? Thanks!

Out of memory problem

Hi! Thank you for sharing your fantastic work!
When I tried to train the network on a 3090 in barf mode with positional encoding on the blender dataset, it always shows the following after validation:
RuntimeError: CUDA out of memory. Tried to allocate 5.25 GiB (GPU 0; 23.70 GiB total capacity; 15.75 GiB already allocated; 4.71 GiB free; 15.77 GiB reserved in total by PyTorch)
Could you please tell me which parameter I should change to avoid this problem? Thank you very much!

Training BARF w/o camera pose

Hi, thank you for releasing the code.
I want to train BARF with photos I took myself, for which there are no camera poses.
Please advise me on how I should proceed.

Thank you very much.

About training GPU type and training time?

Thanks for your great work! I would like to know the GPU type and training time for BARF and NeRF.

A question

After I train a BARF model and an image alignment model, I get a .ckpt file. How can I use or test the model?

rotation and translation losses

Hi @zhenpeiyang, and thanks for your interesting paper and code!
A detail that is bothering me is that when you start with pose_GT=pose, you have a non-zero rotation loss, and it decreases during training!

Matthieu.

Error in magma_getdevice_arch: MAGMA not initialized (call magma_init() first) or bad device

Hi Chen-Hsuan

First of all thank you very much for uploading your work here.

I cloned your repo to try to run the experiments on my local machine (Ubuntu 20.04). I tried to run the chair dataset with BARF with these 2 lines:

python3 train.py --group=G0 --model=barf --yaml=barf_blender --name=Test1 --data.scene=chair --barf_c2f=[0.1,0.5] --max_iter=2000 --visdom!

python3 evaluate.py --group=G0 --model=barf --yaml=barf_blender --name=Test1 --data.scene=chair --data.val_sub= --resume

But when running the evaluate.py file I got an error regarding MAGMA. Do you know the cause of this error? How can I fix it?

Thanks in advance,

Training is very sensitive to network initialization

Hello,
Thank you for the great work! I really like the neat project structure so I am experimenting with NeRF on this codebase. However, I find the training is very sensitive to network initialization.

First, if I trained on the hotdog scene for 2000 iterations using default configs in the options/nerf_blender_repr.yaml, I would get a normal render from self.nerf, but an empty render from self.nerf_fine.

The command is

python train.py --group=nerf-debug --model=nerf --yaml=nerf_blender_repr --name=debug-hotdog-nerf-9 --data.scene=hotdog

The result shown in tensorboard is

After some investigation, I found that the reason lay in the different initialization weights of the two networks. Then I tried changing the random seed from 0 to 233, this time both networks rendered an empty scene.

The command is

python train.py --group=nerf-debug --model=nerf --yaml=nerf_blender_repr --name=debug-hotdog-nerf-10 --data.scene=hotdog --seed=233

The result is

Finally, I tried copying the self.nerf's weights to self.nerf_fine, using the following code

          if opt.nerf.fine_sampling:
              self.nerf_fine = NeRF(opt)
              self.nerf_fine.load_state_dict(self.nerf.state_dict())  # here

and set the random seed back to 0. This time the result was fine.

The command is

python train.py --group=nerf-debug --model=nerf --yaml=nerf_blender_repr --name=debug-hotdog-nerf-11 --data.scene=hotdog

The result is

Here, I want to post the results to discuss the general stability of NeRF training. I wonder if other NeRF repositories all have this kind of sensitivity to network initialization, or are we missing some initialization trick in this repo?

I believe this can relate to a more fundamental nature of NeRF. Do you have any ideas about this phenomenon? Thank you!
