bennyguo / instant-nsr-pl
Neural Surface reconstruction based on Instant-NGP. Efficient and customizable boilerplate for your research projects. Train NeuS in 10min!
License: MIT License
Hi,
I'm wondering whether the showcase result for the chair uses the same configuration as the neus-blender.yml provided in the repo, and whether you could share the training parameters for several scenes as guidance for hyperparameter tuning. Thanks in advance!
Currently, the chair model I trained with the provided .yml does not reproduce a mesh as good as the one in your showcase. Here is my test result (256 resolution mesh).
I was wondering whether it is feasible to extract a very high-resolution mesh with the Marching Cubes algorithm, say 2048**3. In theory, it shouldn't require excessive memory, but we are running into difficulties on both the CPU and the GPU, ending in CUDA out-of-memory errors. Could you help us with this problem?
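As a rough check on the memory question above, the sketch below computes the size of a dense grid of float32 values at 2048**3 and shows how the volume could be split into smaller blocks that are evaluated one at a time. The helper names and the block size of 512 are illustrative assumptions, not part of this repo, and the sketch ignores the overlap between neighbouring blocks that a real block-wise Marching Cubes run would need to avoid seams.

```python
from itertools import product


def dense_grid_bytes(resolution: int, bytes_per_value: int = 4) -> int:
    """Bytes needed to hold one scalar per voxel of a resolution**3 grid."""
    return resolution ** 3 * bytes_per_value


def block_origins(resolution: int, block: int):
    """Lower corners of the cubic blocks that tile the full grid."""
    starts = range(0, resolution, block)
    return list(product(starts, starts, starts))


# Hypothetical settings: float32 values, blocks of 512**3 voxels.
full_gib = dense_grid_bytes(2048) / 1024 ** 3
block_gib = dense_grid_bytes(512) / 1024 ** 3
num_blocks = len(block_origins(2048, 512))

print(f"full grid: {full_gib} GiB, one block: {block_gib} GiB, blocks: {num_blocks}")
```

Under these assumptions the scalar field alone needs 32 GiB when held densely, before any intermediate buffers, while each 512**3 block needs 0.5 GiB.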
I tried to train on the DTU dataset using this implementation, but GPU memory is exhausted during the backward pass after a few steps.
If I remove the RGB loss, the OOM disappears. The shape of the tensor rgb_ground_truth looks correct.
Any idea why this happens? Thanks in advance!
/lib/python3.10/site-packages/torch/autograd/__init__.py", line 197, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.54 GiB (GPU 0; 39.45 GiB total capacity; 21.52 GiB already allocated; 1.53 GiB free; 27.09 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
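The error message itself points at `max_split_size_mb` and `PYTORCH_CUDA_ALLOC_CONF`. A minimal sketch of trying that suggestion is shown below; the value 128 is an arbitrary illustration rather than a tuned recommendation, and whether fragmentation is the actual cause here is an assumption.

```python
import os

# Set this before PyTorch initialises its CUDA caching allocator,
# most safely before the first `import torch` in the launch script.
# 128 is an illustrative value only.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# Numbers copied from the error message above, in GiB.
reserved_gib = 27.09
allocated_gib = 21.52

# Memory that PyTorch holds in its cache but has not handed out to tensors.
cached_but_unused_gib = round(reserved_gib - allocated_gib, 2)
print(cached_but_unused_gib)
```

The cached-but-unused amount is larger than the 1.54 GiB request that failed, so fragmentation may play a part, but this does not explain why removing the RGB loss avoids the problem.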
Hi guys, thanks for the excellent work!
When I ran the training code, I encountered the following error after Epoch 0 finished:
Global seed set to 42
Warning: FullyFusedMLP is not supported for the selected architecture 70. Falling back to CutlassMLP. For maximum performance, raise the target GPU architecture to 75+.
Warning: FullyFusedMLP is not supported for the selected architecture 70. Falling back to CutlassMLP. For maximum performance, raise the target GPU architecture to 75+.
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used..
Global seed set to 42
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 1 processes
----------------------------------------------------------------------------------------------------
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
------------------------------------
0 | model | NeuSModel | 12.6 M
------------------------------------
12.6 M Trainable params
0 Non-trainable params
12.6 M Total params
25.220 Total estimated model params size (MB)
Epoch 0: : 10000it [06:59, 23.83it/s, loss=0.000965, train/inv_s=1.47e+3, train/num_rays=8192.0] Traceback (most recent call last): | 0/2 [00:00<?, ?it/s]
File "/home/zhouyiren/code/instant-nsr-pl/launch.py", line 123, in <module>
main()
File "/home/zhouyiren/code/instant-nsr-pl/launch.py", line 112, in main
trainer.fit(system, datamodule=dm)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
self._call_and_handle_interrupt(
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 648, in _call_and_handle_interrupt
return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
return function(*args, **kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1166, in _run
results = self._run_stage()
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1252, in _run_stage
return self._run_train()
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1283, in _run_train
self.fit_loop.run()
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 271, in advance
self._outputs = self.epoch_loop.run(self._data_fetcher)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 201, in run
self.on_advance_end()
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 241, in on_advance_end
self._run_validation()
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 299, in _run_validation
self.val_loop.run()
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 155, in advance
dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 143, in advance
output = self._evaluation_step(**kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 240, in _evaluation_step
output = self.trainer._call_strategy_hook(hook_name, *kwargs.values())
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1704, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/strategies/ddp.py", line 358, in validation_step
return self.model(*args, **kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
output = self._run_ddp_forward(*inputs, **kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
return module_to_run(*inputs[0], **kwargs[0])
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/pytorch_lightning/overrides/base.py", line 90, in forward
return self.module.validation_step(*inputs, **kwargs)
File "/home/zhouyiren/code/instant-nsr-pl/systems/neus.py", line 137, in validation_step
out = self(batch)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/zhouyiren/code/instant-nsr-pl/systems/neus.py", line 46, in forward
return self.model(batch['rays'])
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/zhouyiren/code/instant-nsr-pl/models/neus.py", line 167, in forward
out = chunk_batch(self.forward_, self.config.ray_chunk, rays)
File "/home/zhouyiren/code/instant-nsr-pl/models/utils.py", line 22, in chunk_batch
out_chunk = func(*[arg[i:i+chunk_size] if isinstance(arg, torch.Tensor) else arg for arg in args], **kwargs)
File "/home/zhouyiren/code/instant-nsr-pl/models/neus.py", line 137, in forward_
rgb, opacity, depth = rendering(
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/nerfacc/vol_rendering.py", line 115, in rendering
rgbs, alphas = rgb_alpha_fn(t_starts, t_ends, ray_indices.long())
File "/home/zhouyiren/code/instant-nsr-pl/models/neus.py", line 121, in rgb_alpha_fn
rgb = self.texture(feature, t_dirs, normal)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/zhouyiren/code/instant-nsr-pl/models/texture.py", line 26, in forward
color = self.network(network_inp).view(*features.shape[:-1], self.n_output_dims).float()
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/tinycudann/modules.py", line 145, in forward
output = _module_function.apply(
File "/home/zhouyiren/anaconda3/envs/instant-nsr/lib/python3.10/site-packages/tinycudann/modules.py", line 57, in forward
native_ctx, output = native_tcnn_module.fwd(input, params)
RuntimeError: /tmp/pip-req-build-8gaiqjyg/include/tiny-cuda-nn/cutlass_matmul.h:332 status failed with error Error Internal
Epoch 0: : 10000it [07:05, 23.49it/s, loss=0.000965, train/inv_s=1.47e+3, train/num_rays=8192.0]
I followed the instructions and trained the model on COLMAP-format data, but got this result.
I noticed that the preprocessing in NeuS (https://github.com/Totoro97/NeuS/tree/main/preprocess_custom_data) has another step:
So how can I define the region of interest in this repo?
Hi, thanks for the great work.
When I tried to train on the synthetic drums data with your framework (for the first time), the program froze like this:
I understand that some scripts are compiled the first time the code is run, but it has been frozen for more than 2 hours.
Any advice would be much appreciated. Thanks in advance.
ImportError: /home/anaconda3/envs/instant-nsr-pl/lib/python3.8/site-packages/tinycudann_bindings_61/_C.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor6deviceEv
I get this error after following the steps in the README. I could not find a similar problem on Google or in the existing issues. Have you encountered it?
Hi, bennyguo
I got this error when implementing your code.
ImportError: /home/eason/anaconda3/envs/nerf/lib/python3.10/site-packages/tinycudann_bindings/_86_C.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZNSt15__exception_ptr13exception_ptr9_M_addrefEv
system version:
linux 20.04
cuda 11.7
python 3.10
pytorch 1.13.1
Do you have any ideas to fix it? Thanks very much.
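To see which library each of the undefined symbols in the two ImportError reports above belongs to, the mangled names can be decoded. The sketch below is a deliberately tiny decoder for this one shape of Itanium C++ ABI name (a nested name with an empty parameter list); `demangle_simple` is a hypothetical helper written for illustration, and a real tool such as `c++filt` handles the general case.

```python
import re


def demangle_simple(symbol: str) -> str:
    """Decode a small subset of Itanium-mangled names.

    Handles only nested names (_ZN...E) built from length-prefixed
    identifiers, with an optional const qualifier (K), an optional
    std:: prefix (St) and an empty parameter list (v).
    """
    if not symbol.startswith("_ZN"):
        raise ValueError("only nested names (_ZN...) are handled")
    pos = 3
    is_const = symbol[pos] == "K"
    if is_const:
        pos += 1
    parts = []
    if symbol.startswith("St", pos):
        parts.append("std")
        pos += 2
    while symbol[pos] != "E":
        match = re.match(r"\d+", symbol[pos:])
        if match is None:
            raise ValueError(f"unsupported encoding at offset {pos}")
        length = int(match.group())
        pos += match.end()
        parts.append(symbol[pos:pos + length])
        pos += length
    if symbol[pos + 1:] != "v":
        raise ValueError("only empty parameter lists are handled")
    return "::".join(parts) + "()" + (" const" if is_const else "")


first = demangle_simple("_ZNK2at6Tensor6deviceEv")
second = demangle_simple("_ZNSt15__exception_ptr13exception_ptr9_M_addrefEv")
print(first)
print(second)
```

The first name lives in the `at` namespace used by PyTorch's C++ tensor library, and the second is in `std`. One possible reading, offered as an assumption, is that the tinycudann bindings were built against a different PyTorch or C++ runtime than the one loaded at import time.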
Hello, thanks for your great code! I am using your code to get textured meshes. I have low PSNR and the rendered RGB images have very high quality. I was able to get very good colors for my mesh with your previous version, but the colors from the current version seem to be wrong. Could this be related to the alphas that you are calculating? Should I multiply the RGB values by a parameter to get the correct colors (similar to the rendered RGB)?
Hi, I ran both neus and nerf, and got the same ZeroDivisionError in systems\neus.py and systems\nerf.py.
Here is the command-line output from running neus:
Global seed set to 42
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used..
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
12.6 M Trainable params
0 Non-trainable params
12.6 M Total params
25.221 Total estimated model params size (MB)
Traceback (most recent call last):
File "G:\GitHub\instant-nsr-pl\launch.py", line 123, in <module>
main()
File "G:\GitHub\instant-nsr-pl\launch.py", line 112, in main
trainer.fit(system, datamodule=dm)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 696, in fit
self._call_and_handle_interrupt(
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 650, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 735, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1166, in _run
results = self._run_stage()
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1252, in _run_stage
return self._run_train()
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1283, in _run_train
self.fit_loop.run()
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\loop.py", line 200, in run
self.advance(*args, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\fit_loop.py", line 271, in advance
self._outputs = self.epoch_loop.run(self._data_fetcher)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\loop.py", line 200, in run
self.advance(*args, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\epoch\training_epoch_loop.py", line 203, in advance
batch_output = self.batch_loop.run(kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\loop.py", line 200, in run
self.advance(*args, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\batch\training_batch_loop.py", line 87, in advance
outputs = self.optimizer_loop.run(optimizers, kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\loop.py", line 200, in run
self.advance(*args, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py", line 201, in advance
result = self._run_optimization(kwargs, self._optimizers[self.optim_progress.optimizer_position])
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py", line 248, in _run_optimization
self._optimizer_step(optimizer, opt_idx, kwargs.get("batch_idx", 0), closure)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py", line 358, in _optimizer_step
self.trainer._call_lightning_module_hook(
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1550, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\core\module.py", line 1705, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\core\optimizer.py", line 168, in step
step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\strategies\strategy.py", line 216, in optimizer_step
return self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\plugins\precision\native_amp.py", line 85, in optimizer_step
closure_result = closure()
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py", line 146, in __call__
self._result = self.closure(*args, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py", line 132, in closure
step_output = self._step_fn()
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py", line 407, in _training_step
training_step_output = self.trainer._call_strategy_hook("training_step", *kwargs.values())
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1704, in _call_strategy_hook
output = fn(*args, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\strategies\dp.py", line 134, in training_step
return self.model(*args, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\parallel\data_parallel.py", line 169, in forward
return self.module(*inputs[0], **kwargs[0])
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\overrides\data_parallel.py", line 65, in forward
output = super().forward(*inputs, **kwargs)
File "C:\Users\halbe\AppData\Local\Programs\Python\Python310\lib\site-packages\pytorch_lightning\overrides\base.py", line 79, in forward
output = self.module.training_step(*inputs, **kwargs)
File "G:\GitHub\instant-nsr-pl\systems\neus.py", line 86, in training_step
train_num_rays = int(self.train_num_rays * (self.train_num_samples / out['num_samples'].sum().item()))
ZeroDivisionError: division by zero
Epoch 0: : 0it [01:22, ?it/s]
[W ..\torch\csrc\CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: driver shutting down (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: invalid device ordinal (function uncheckedSetDevice)
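The ZeroDivisionError in the traceback above is raised on the line that rescales `train_num_rays` by `train_num_samples / out['num_samples'].sum().item()`, so the summed sample count was presumably zero for that batch. A minimal sketch of guarding that update is shown below; `update_train_num_rays` is a hypothetical standalone helper, not the repo's code, and it does not address why no samples were produced in the first place.

```python
def update_train_num_rays(train_num_rays: int,
                          train_num_samples: int,
                          num_samples_sum: int) -> int:
    """Rescale the ray count so the sample count stays near the target.

    If the batch produced no samples at all, keep the previous ray count
    instead of dividing by zero.
    """
    if num_samples_sum <= 0:
        return train_num_rays
    return int(train_num_rays * (train_num_samples / num_samples_sum))


# Illustrative numbers only.
print(update_train_num_rays(256, 1000, 500))
print(update_train_num_rays(256, 1000, 0))
```

If the guard triggers on every step, the underlying cause (for example, rays that never hit the region being sampled) still needs to be investigated.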
Hi @bennyguo, thanks for your great code! Have you evaluated this repo on the geometry reconstruction task and compared it with the original neus or Instant-NSR?
The exported .obj file contains only the shape, not the corresponding material and color.
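One possible workaround for the missing color, sketched under assumptions, is to store a per-vertex color on each `v` line of the OBJ file. This is a widely supported but unofficial extension of the format, and the function below is a hypothetical illustration with made-up inputs rather than this repo's exporter. It expects colors as floats in the range 0 to 1 and faces as zero-based vertex indices.

```python
def obj_lines_with_vertex_colors(vertices, colors, faces):
    """Build OBJ text lines of the form 'v x y z r g b' and 'f i j k'."""
    lines = []
    for (x, y, z), (r, g, b) in zip(vertices, colors):
        lines.append(f"v {x:.6f} {y:.6f} {z:.6f} {r:.6f} {g:.6f} {b:.6f}")
    for i, j, k in faces:
        # OBJ face indices are one-based.
        lines.append(f"f {i + 1} {j + 1} {k + 1}")
    return lines


# A single red, green and blue cornered triangle as example data.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 1, 2)]

lines = obj_lines_with_vertex_colors(vertices, colors, faces)
print("\n".join(lines))
```

How the colors are obtained, for instance by querying a trained color network at the mesh vertices, is outside the scope of this sketch.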
Hi, bennyguo
I hope to find a simple way to save the trained model and the hashmap. Do you know how to do that?
Thank you for the great work.
I can run nerf but cannot run neus:
The terminal output is as follows:
Epoch 0: : 0it [00:00, ?it/s]
Traceback (most recent call last):
File "launch.py", line 123, in <module>
main()
File "launch.py", line 112, in main
trainer.fit(system, datamodule=dm)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
self._call_and_handle_interrupt(
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 648, in _call_and_handle_interrupt
return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
return function(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1166, in _run
results = self._run_stage()
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1252, in _run_stage
return self._run_train()
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1283, in _run_train
self.fit_loop.run()
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 271, in advance
self._outputs = self.epoch_loop.run(self._data_fetcher)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 203, in advance
batch_output = self.batch_loop.run(kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 87, in advance
outputs = self.optimizer_loop.run(optimizers, kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 201, in advance
result = self._run_optimization(kwargs, self._optimizers[self.optim_progress.optimizer_position])
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 248, in _run_optimization
self._optimizer_step(optimizer, opt_idx, kwargs.get("batch_idx", 0), closure)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 358, in _optimizer_step
self.trainer._call_lightning_module_hook(
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1550, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/core/module.py", line 1705, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 168, in step
step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/strategies/ddp.py", line 289, in optimizer_step
optimizer_output = super().optimizer_step(optimizer, opt_idx, closure, model, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 216, in optimizer_step
return self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/native_amp.py", line 85, in optimizer_step
closure_result = closure()
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 146, in __call__
self._result = self.closure(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 141, in closure
self._backward_fn(step_output.closure_loss)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 304, in backward_fn
self.trainer._call_strategy_hook("backward", loss, optimizer, opt_idx)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1704, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 191, in backward
self.precision_plugin.backward(self.lightning_module, closure_loss, optimizer, optimizer_idx, *args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 80, in backward
model.backward(closure_loss, optimizer, optimizer_idx, *args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/pytorch_lightning/core/module.py", line 1450, in backward
loss.backward(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/torch/_tensor.py", line 363, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/root/miniconda3/lib/python3.8/site-packages/torch/autograd/__init__.py", line 173, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: trying to differentiate twice a function that was marked with @once_differentiable
Exception ignored in: <function tqdm.__del__ at 0x7fb7ef681af0>
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.8/site-packages/tqdm/std.py", line 1152, in __del__
File "/root/miniconda3/lib/python3.8/site-packages/tqdm/std.py", line 1306, in close
File "/root/miniconda3/lib/python3.8/site-packages/tqdm/std.py", line 1499, in display
File "/root/miniconda3/lib/python3.8/site-packages/tqdm/std.py", line 1155, in __str__
File "/root/miniconda3/lib/python3.8/site-packages/tqdm/std.py", line 1457, in format_dict
TypeError: cannot unpack non-iterable NoneType object
.
Thank you in advance. ^ ^
Hi, when I change sdf_bias from 0 to -0.5 according to your latest updates, the PSNR degrades. Could you explain the reason?
Awesome job Yuanchen! I was thinking of doing an example in nerfacc with NeuS. But you already got it working, even with pytorch-lightning!!
Do you want to be linked to nerfacc repo for showcase?
BTW I noticed you are relying on alpha marching & rendering, which is currently from your fork of nerfacc. I just added this support in nerfacc>=0.2.2, along with relaxing the pytorch requirement. Check it out if interested! nerfstudio-project/nerfacc#94
Hi @bennyguo, thank you for your excellent work. During the experiment, I found that the PSNR and SSIM are very low, even lower than NeuS. I then found that the viewpoint of the rendered image differs from that of the original image, so the two images are not aligned.
Hi,
Thanks for your great work!
I just noticed that at test time, the "materials" scene cannot export the mesh after training NeuS. Here is the error message:
File "/local-scratch/localhome/***/models/geometry.py", line 90, in isosurface
vmin, vmax = mesh_coarse['v_pos'].amin(dim=0), mesh_coarse['v_pos'].amax(dim=0)
IndexError: amin(): Expected reduction dim 0 to have non-zero size.
I am testing NeuS with the DTU dataset (scan122 particularly). I am only using its RGB images without masks and computed new camera poses using COLMAP. However, it is not able to extract mesh and the NeRF is in low quality. It converged fine with masks on though. All other configs are defaults. What could I change to improve quality and extract the NeuS mesh without masks as input? Thanks
I have been trying out some scenes with the COLMAP-based pipeline you provide in the repo and I've been running into the issue of the model learning only the input views and not being able to generalize at all.
I first suspected that this was an issue with my data, but even on the synthetic NeRF datasets that work very well with that respective pipeline (e.g. the lego bulldozer or the chair), I face this same issue.
This might look like this:
One thing that I noticed is that the generated masks tend to be inverted compared to what one would expect - one extreme example for this is the drums scene (image from the training dataset again):
Generally, I have experienced these issues with a wide variety of scenes (both from synthetic-NeRF & various real-world samples such as the dog used in the original NeuS publication) and with both your COLMAP preprocessing script & our internal one. I'm using the latest state of the repo with default configs except for the fused MLP being replaced with the vanilla one and the image resolutions being adjusted if needed.
This issue could also be related to #17, as some (but not all) of the resulting meshes I get look quite similar.
Have you been able to get some good samples out of that pipeline in your testing?
It would be awesome if you had some insights into this issue.
Hi,
Where does 1.732 * 2 come from in the line below?
Line 58 in 6ab0c3d
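One plausible reading, offered as an assumption rather than the author's confirmed reasoning: 1.732 is approximately sqrt(3), and 2 * sqrt(3) * radius is the length of the main diagonal of the cube [-radius, radius]^3, i.e. the longest segment a ray can travel inside the bounding box. Dividing that length by a per-ray sample count would give a step size. The function names and the `num_samples_per_ray` parameter below are illustrative, not taken from the referenced line.

```python
import math


def cube_diagonal(radius: float) -> float:
    """Length of the main diagonal of the axis-aligned cube [-radius, radius]^3."""
    side = 2.0 * radius
    return math.sqrt(3.0) * side


def render_step_size(radius: float, num_samples_per_ray: int) -> float:
    """Hypothetical step size: longest possible ray segment split into equal steps."""
    return cube_diagonal(radius) / num_samples_per_ray


diag = cube_diagonal(1.0)            # close to 1.732 * 2
step = render_step_size(1.0, 1024)   # illustrative sample count
```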
Thanks for your nice work. My reconstruction targets are real outdoor scenes spanning about 10 to 50 meters in each dimension. NeRF can be trained properly with hash encoding, but NeuS does not train correctly and ends up with train/inv_s=5 when the model has converged. I have changed model.radius to 10 so that the rays pass through the correct space.
Should I change other hyper-parameters accordingly to make the NeuS model fit large scenes? Should I increase sphere_init_radius accordingly as well?
Hi there, thank you for sharing, good work.
I want to run the code on Windows, and it reports an NCCL error.
So I changed the backend from NCCL to GLOO, and an invalid scalar type error popped up.
Do you have any idea why? What environment do you run the code in? Mine is python3.10 Cudatoolkit11.3 with torch 1.12.1+cu113
Appreciate!
Hello, thanks for the work.
I cloned the code and ran it on my PC. Since my graphics card is a GTX1660S, I had to reduce 'train_num_rays' to 128 to avoid OOM errors (other parameters remain default). Then I extracted the mesh and found it has some noise and holes. I thought maybe it's hard for my card to train a good result:(
So, would you mind sharing some pretrained models for, for example, the nerf synthetic datasets? Or sharing some ideas for tuning those parameters so that a 6GB card can get good results? Thanks!
Thanks so much for this project!
Any chance to add an implementation of NeuRIS?
https://jiepengwang.github.io/NeuRIS/
https://arxiv.org/pdf/2206.13597.pdf
The basic premise is that one incorporates a surface normal prior to generating higher fidelity meshes. The currently available implementation https://github.com/jiepengwang/NeuRIS takes many hours to train (around 9 on an A6000) so using the optimization you've added here would be amazing.
I would love to know your thoughts!
Hi, thanks for sharing your code.
I've been trying out several things and found something weird.
When using sphere initialization of the vanilla MLP, I expected the initial shape to be a sphere.
If you render the outputs of the initialized model by setting val_check_interval=1, the images (rgb, normal, depth) indeed resemble a sphere.
However, the marching cubes fail with the following error message
vmin, vmax = mesh_coarse['v_pos'].amin(dim=0), mesh_coarse['v_pos'].amax(dim=0)
IndexError: amin(): Expected reduction dim 0 to have non-zero size.
I guess this means that the aabb cube is empty.
When I looked into the code, I found that the VanillaMLP does not initialize the constants of the layers, which is different from the initialization of the paper "SAL: Sign Agnostic Learning of Shapes from Raw Data".
I think the make_linear function should be as follows:
def make_linear(self, dim_in, dim_out, bias, is_first, is_last):
    layer = nn.Linear(dim_in, dim_out)
    if self.sphere_init:
        if is_last:
            torch.nn.init.constant_(layer.bias, -bias)
            torch.nn.init.normal_(layer.weight, mean=math.sqrt(math.pi) / math.sqrt(dim_in), std=0.0001)
        elif is_first:
            torch.nn.init.constant_(layer.bias, 0.0)
            torch.nn.init.constant_(layer.weight[:, 3:], 0.0)
            torch.nn.init.normal_(layer.weight[:, :3], 0.0, math.sqrt(2) / math.sqrt(dim_out))
        else:
            torch.nn.init.constant_(layer.bias, 0.0)
            torch.nn.init.normal_(layer.weight, 0.0, math.sqrt(2) / math.sqrt(dim_out))
    else:
        torch.nn.init.kaiming_uniform_(layer.weight, nonlinearity='relu')
    if self.weight_norm:
        layer = nn.utils.weight_norm(layer)
    return layer
Also, in the forward and forward_level methods of class VolumeSDF:
if 'sdf_activation' in self.config:
    sdf = get_activation(self.config.sdf_activation)(sdf + float(self.config.sdf_bias))
The if statement is True even when you simply set sdf_activation to None in the config, since the key is still in the config. I found that this leads the sdf values to be all positive at the start of training. I just removed sdf_activation from the config.
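The difference between testing for the key and testing for its value can be sketched as follows. This is a minimal illustration that uses a plain dict in place of the repo's config object (an assumption), and the two helper names are hypothetical.

```python
def applies_with_key_check(config: dict) -> bool:
    """Mirrors the check quoted above: True whenever the key exists."""
    return 'sdf_activation' in config


def applies_with_value_check(config: dict) -> bool:
    """Alternative check: True only when the key holds a non-None value."""
    return config.get('sdf_activation') is not None


# A config where the activation is explicitly disabled by setting it to None.
config = {'sdf_activation': None, 'sdf_bias': 0.0}

key_check = applies_with_key_check(config)      # branch is still taken
value_check = applies_with_value_check(config)  # branch is skipped
```

With the value check, setting the entry to None and deleting it from the config behave the same way.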
After changing this part and setting the bias of the SDF to 0.6, the initial model output is as follows:
And the result of marching cubes is indeed a sphere.
However, I found that changing the model like this leads to very poor training results.
Also, the mesh is completely broken
So, I guess you had a reason for this design choice? Otherwise, I think this might be the reason why training the model on my custom dataset fails.
Hi, thanks for your awesome work. I get a weird error during validation:
terminate called after throwing an instance of 'std::runtime_error' what(): /tmp/pip-req-build-z4954kz1/include/tiny-cuda-nn/cuda_graph.h:124 cudaGraphExecUpdate(m_graph_instance, m_graph, &error_node, &update_result) failed with error the graph update was not performed because it included changes which violated constraints specific to instantiated graph update Aborted
Do you know what causes this problem and how to solve it? Thank you in advance!
Hi, thanks for your great code! I am wondering if I can run the original version of NeuS (which is much slower) in this repo? Are there any configs that can achieve this?
def contract_to_unisphere(x, radius, contraction_type):
    if contraction_type == ContractionType.AABB:
        x = scale_anything(x, (-radius, radius), (0, 1))
    elif contraction_type == ContractionType.UN_BOUNDED_SPHERE:
        x = scale_anything(x, (-radius, radius), (0, 1))
        x = x * 2 - 1  # aabb is at [-1, 1]
        mag = x.norm(dim=-1, keepdim=True)
        mask = mag.squeeze(-1) > 1
        x[mask] = (2 - 1 / mag[mask]) * (x[mask] / mag[mask])
        x = x / 4 + 0.5  # [-inf, inf] is at [0, 1]
    else:
        raise NotImplementedError
    return x
I don't understand why you apply x = x / 4 + 0.5 # [-inf, inf] is at [0, 1] here. Could you explain it a little bit? :)
Thank you!
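A possible explanation, read off the quoted code rather than confirmed by the author: in the unbounded branch, a point with norm greater than 1 is mapped to (2 - 1/|x|) * x/|x|, whose norm lies between 1 and 2. Every contracted coordinate therefore lies in [-2, 2], and x / 4 + 0.5 rescales that range to [0, 1]. The sketch below re-implements the branch in pure Python for a single 3D point; `contract_point` is a hypothetical name.

```python
import math


def contract_point(p, radius):
    """Contract one 3D point following the unbounded branch quoted above."""
    # Scale so that the inner box [-radius, radius] maps to [-1, 1].
    x = [c / radius for c in p]
    mag = math.sqrt(sum(c * c for c in x))
    if mag > 1.0:
        # Points outside the unit ball are squashed into the shell 1 < |x| < 2.
        scale = (2.0 - 1.0 / mag) / mag
        x = [c * scale for c in x]
    # Coordinates now lie in [-2, 2]; shift and scale them into [0, 1].
    return [c / 4.0 + 0.5 for c in x]


origin = contract_point((0.0, 0.0, 0.0), 1.0)  # centre of the unit cube
near = contract_point((0.5, 0.0, 0.0), 1.0)    # inside the unit ball, not squashed
far = contract_point((1e6, 0.0, 0.0), 1.0)     # very distant point, approaches 1
```

Under this reading, the inner unit ball occupies [0.25, 0.75] along each axis and the rest of the unit cube is left for the contracted background.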
Hi,
I found that the result of neus-colmap is really bad. I used colmap on nerf_synthetic data then trained the model, and the result is very bad.
I have also tried to change radius in neus-colmap.yaml file as in #20. I have tried 0.5, 1, 1.5 which all gave me bad results.
I tested nerf-colmap, which seems to be working, but neus-colmap isn't working on the nerf_synthetic data. Any idea what went wrong here?
Thanks
Hi, Benny. Have you faced the ghost floater problem when using NeuS+HashEncoding?
The rendered image is good and converges to the GT. But the mesh and normals bulge or sink in some areas, and many floaters hang in the air.
encoding_config = {
    "otype": "HashGrid",
    "n_levels": 16,
    "n_features_per_level": 2,
    "log2_hashmap_size": 19,
    "base_resolution": 16,
    "per_level_scale": 1.447269237440378,
    "include_xyz": True,
}
SDF Network is nn.Linear(encoding.n_output_dims, 65)
Any idea is welcome~
Trainer(limit_train_batches=1.0) was configured so 100% of the batches per epoch will be used..
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
12.6 M Trainable params
0 Non-trainable params
12.6 M Total params
25.220 Total estimated model params size (MB)
Epoch 0: : 0it [00:00, ?it/s]
After this line, the code gets stuck.
Hi Yuanchen,
When I train the model on my own data, I noticed that the model overfits to my validation set. So I printed out the split source for each batch and found that self.dataset didn't switch back to train_dataloader().dataset after the first validation process finished. In other words, after the first validation process, self.dataset remains val_dataloader().dataset, so the model keeps training on my val set, which causes the overfitting.
I think the reason is that you manually switch self.dataset inside on_train_start(), which is called only once at the very beginning, outside of the training loop. So the training loop has no chance to use the training set once on_val_start() is called.
For now, my quick fix is moving self.dataset = self.trainer.datamodule.train_dataloader().dataset to on_train_batch_start(), but it is not very efficient. Otherwise, I think we need to refactor the datamodule somehow (move self.preprocess_data() to the dataloader) to avoid a manual switch.
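The hook ordering described above can be reproduced with a toy simulation that does not depend on PyTorch Lightning. As an alternative to switching on every batch, the sketch restores the training split once when validation ends; it assumes a hook such as Lightning's on_validation_end fires once per validation run. `ToySystem` and `run` are hypothetical names, and the strings stand in for real datasets.

```python
class ToySystem:
    """Toy stand-in for the training module; only tracks which split is active."""

    def __init__(self, train_dataset, val_dataset):
        self.train_dataset = train_dataset
        self.val_dataset = val_dataset
        self.dataset = None
        self.seen = []  # split used by each training step

    def on_train_start(self):
        # Called once, before the training loop begins.
        self.dataset = self.train_dataset

    def on_validation_start(self):
        self.dataset = self.val_dataset

    def on_validation_end(self):
        # Proposed fix: switch back to the training split after validation.
        self.dataset = self.train_dataset

    def training_step(self):
        self.seen.append(self.dataset)


def run(system, restore_after_val):
    """One training step, one validation run, then another training step."""
    system.on_train_start()
    system.training_step()
    system.on_validation_start()
    if restore_after_val:
        system.on_validation_end()
    system.training_step()
    return system.seen


buggy = run(ToySystem("train", "val"), restore_after_val=False)
fixed = run(ToySystem("train", "val"), restore_after_val=True)
```

In the buggy run the second training step sees the validation split; restoring in the end-of-validation hook costs one assignment per validation run instead of one per batch.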
Can you please let me know if it supports LLFF data for customized images?
Thanks for the great work. When I train the lego example with neus_blender.yml, I get the non-zero size error. I tried changing the seed, but that did not help.
vmin, vmax = mesh_coarse['v_pos'].amin(dim=0), mesh_coarse['v_pos'].amax(dim=0)
IndexError: amin(): Expected reduction dim 0 to have non-zero size.
Here are some findings about improving the reconstruction quality.
In my original implementation, I omitted the bias terms in the geometry MLP for simplicity, as they're initialized to 0. However, I found that these bias terms are important for producing high-quality surfaces, especially in detailed regions. A possible reason is that the "shifting" brought by these biases acts as some form of normalization, making the high-frequency signals easier to learn. Thanks to @terryryu for mentioning this problem in #22. Fixed in the latest commits.
Although the original NeuS paper adopts L1 as the photometric loss for its "robustness to outliers", we found that L1 could lead to suboptimal results in certain cases, like the Lego bulldozer:
L1, 20k iters | MSE, 20k iters |
---|---|
Therefore, we simultaneously adopt L1 and MSE loss in NeuS training.
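A minimal sketch of combining the two photometric terms, written in pure Python over flat lists of pixel values. The weights `lambda_l1` and `lambda_mse` and their default values are assumptions for illustration; the weights used in the repo's configs may differ.

```python
def l1_loss(pred, target):
    """Mean absolute error over paired pixel values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mse_loss(pred, target):
    """Mean squared error over paired pixel values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def photometric_loss(pred, target, lambda_l1=1.0, lambda_mse=1.0):
    """Weighted sum of the L1 and MSE terms (weights are hypothetical)."""
    return lambda_l1 * l1_loss(pred, target) + lambda_mse * mse_loss(pred, target)


pred = [0.5, 0.0, 1.0]
target = [0.0, 0.0, 1.0]
loss = photometric_loss(pred, target)
```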
Training NeuS without a background model can lead to floaters (uncontrolled surfaces) in free space. This is because floaters in the background color do no harm to rendering quality, and therefore cannot be removed when training with only a photometric loss. We alleviate this problem with random background augmentation (masks needed) and the sparsity loss proposed in SparseNeuS (no masks needed).
For NeRF, we adopt the distortion loss proposed by MipNeRF 360 to alleviate the floater problem in training unbounded 360 scenes.
In the given config files, the number of training steps is set to 20000 by default. This works well for objects with simple geometry, like the chair scene in the NeRF-Synthetic dataset. However, for more complicated cases with many thin structures, more training iterations are needed to get high-quality results. This can be done simply by setting trainer.max_steps to a higher value, like 50000 or 100000.
A simple way to tell whether the training is converging is to check the value of inv_s (which is shown in the progress bar by default). If inv_s is steadily increasing (it often ends up >1000), then we are good. If the training diverges, inv_s typically drops below the initialized value and gets stuck. There are many reasons that could lead to divergence. To alleviate divergence caused by unstable optimization, we adopt a learning rate warm-up strategy following the original NeuS in the latest commits.
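As a rough illustration of learning rate warm-up, the sketch below ramps a multiplier linearly from 0 to 1 over the first `warmup_steps` iterations and holds it at 1 afterwards. This is a generic linear warm-up under assumed values, not necessarily the exact schedule used in the repo or in the original NeuS; `base_lr` and `warmup_steps` are hypothetical.

```python
def warmup_factor(step: int, warmup_steps: int) -> float:
    """Linear ramp from 0 to 1 over warmup_steps, then constant 1."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, step / warmup_steps)


def learning_rate(step: int, base_lr: float = 0.01, warmup_steps: int = 500) -> float:
    """Learning rate at a given step (both defaults are illustrative)."""
    return base_lr * warmup_factor(step, warmup_steps)


lr_start = learning_rate(0)      # no update at the very first step
lr_middle = learning_rate(250)   # halfway through the ramp
lr_after = learning_rate(1000)   # warm-up finished
```

A multiplier of this form can be supplied to a scheduler that accepts a per-step scaling function.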
Hi Yuanchen,
Great work on developing this. Could you point out the differences between this and Instant-NSR?
Hi,
This is some great work!
One thing I would like to understand: the Wavefront (.obj) file obtained has no texture. How do I get the texture?
Hello! I see that in your code it is: comp_rgb = comp_rgb + self.background_color * (1.0 - opacity)
I thought it should be comp_rgb = comp_rgb*opacity + self.background_color * (1.0 - opacity)
Did I miss something? Could somebody explain this?
Thank you in advance!
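One way to see why the first form can be correct: in standard volume rendering, the accumulated colour is already a sum of per-sample colours multiplied by their weights, and opacity is the sum of those same weights, so multiplying by opacity again would darken the foreground. The single-channel sketch below assumes the repo's renderer follows this standard accumulation; `composite` is a hypothetical name.

```python
def composite(alphas, colors, background):
    """Front-to-back alpha compositing along one ray (single channel)."""
    transmittance = 1.0  # fraction of light not yet absorbed
    comp = 0.0           # accumulated colour, already weighted
    opacity = 0.0        # sum of the weights
    for alpha, color in zip(alphas, colors):
        weight = transmittance * alpha
        comp += weight * color
        opacity += weight
        transmittance *= 1.0 - alpha
    # The foreground is pre-multiplied; only the background needs (1 - opacity).
    return comp + background * (1.0 - opacity), opacity


# Two white samples in front of a white background should render pure white.
pixel, opacity = composite([0.5, 0.5], [1.0, 1.0], background=1.0)

# Multiplying the accumulated colour by opacity once more double-counts the weights.
double_weighted = (pixel - (1.0 - opacity)) * opacity + (1.0 - opacity)
```

In this example the first form gives a white pixel, while the doubly weighted form gives a darker value even though every sample and the background are white.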
Thanks to the author! When I tried it on pytorch-lightning 1.9.0, I ran into several bugs caused by changes in pytorch-lightning. I would like to share my revised code here:
# from pytorch_lightning.utilities.rank_zero import _get_rank
# -->
from lightning_fabric.utilities.rank_zero import _get_rank #corrected by yy
# from pytorch_lightning.callbacks.base import Callback
# -->
from pytorch_lightning.callbacks import Callback
Note that this also helps fix the error ValueError("Expected a parent").
Neus:
shufujia-1
cameras_sphere.npz image mask
ours:
ls ./load/nerf_synthetic/lego*
test train transforms_test.json transforms_train.json transforms_val.json val
From the output of colmap, how do I construct my input?
Hi, can you help me translate the format of IDR's .npz file into the transformation.json file?
Actually, I do not understand the relationship between these two formats.
First of all, thank you so much for providing such wonderful work.
I have two questions.
First, what is sphere_init_radius in NeuS? I wonder what exactly its role is, and how it relates to radius.
Second, the NeuS implementation here only provides a cubic bounding box (defined by radius). Why not provide a rectangular bounding box? Are there any issues with that kind of implementation?
Thanks!