donydchen / mvsplat
[ECCV'24] MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images
Home Page: https://donydchen.github.io/mvsplat
License: Other
Hi, thanks for the great work. I ran your DTU evaluation with the view sampler "evaluation_index_dtu_nctx3.json" and it results in the following:
qkv = rearrange(qkv, "(v b) n t -> b n (v t)", v=self.n_frames)
File "/home/shubhendujena/anaconda3/envs/mvsplat/lib/python3.10/site-packages/einops/einops.py", line 591, in rearrange
return reduce(tensor, pattern, reduction="rearrange", **axes_lengths)
File "/home/shubhendujena/anaconda3/envs/mvsplat/lib/python3.10/site-packages/einops/einops.py", line 533, in reduce
raise EinopsError(message + "\n {}".format(e))
einops.EinopsError: Error while processing rearrange-reduction pattern "(v b) n t -> b n (v t)".
Input tensor shape: torch.Size([3, 384, 256]). Additional info: {'v': 2}.
Shape mismatch, can't divide axis of length 3 in chunks of 2
I'd be grateful if you could help me fix this.
Thanks in advance
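The error message says the leading axis of the tensor has length 3, while the pattern "(v b) n t" with v=2 needs a length divisible by 2. One plausible reading (an assumption, not confirmed by the authors) is that the model was configured for 2 context views while the nctx3 index supplies 3. The divisibility check can be sketched in plain Python; split_view_batch is a hypothetical helper, not part of mvsplat or einops:

```python
def split_view_batch(leading_axis: int, num_views: int) -> int:
    """Return b such that leading_axis == num_views * b, mirroring the
    "(v b) ..." decomposition; raise if the axis cannot be split."""
    if leading_axis % num_views != 0:
        raise ValueError(
            f"can't divide axis of length {leading_axis} in chunks of {num_views}"
        )
    return leading_axis // num_views


# Shape from the traceback: torch.Size([3, 384, 256]) with v=2 fails ...
try:
    split_view_batch(3, 2)
except ValueError as err:
    print(err)

# ... while v=3 (three context views, batch size 1) would be consistent.
print(split_view_batch(3, 3))
```

If this reading is right, the number of context views in the model configuration would need to match the view sampler; which config key controls that is not shown in the issue.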
Hi, thanks for your amazing work! I ran into a problem: CUDA runs out of memory. I have 4 A100s (40 GB each), but it still reports out of memory, even after reducing the batch size. I would actually like to train on a single A100, but I don't know how to modify the code to use only one GPU, and with 40 GB of memory I guess the batch size needs to be changed too.
I'm still a first-year student with no one to ask and no clue how to run it successfully. (Please forgive my poor English.)
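A minimal sketch of a single-GPU launch, assuming the trainer uses whatever devices CUDA exposes (an assumption about the training script, not verified here). The overrides +experiment=re10k and data_loader.train.batch_size are the ones quoted in commands elsewhere on this page; single_gpu_launch is a hypothetical helper and the batch size of 4 is only an illustrative starting point:

```python
import os


def single_gpu_launch(batch_size: int, gpu_index: int = 0):
    """Build the environment and argv for a one-GPU training run."""
    env = dict(os.environ)
    # Expose only one device to the training process.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    argv = [
        "python", "-m", "src.main",
        "+experiment=re10k",
        f"data_loader.train.batch_size={batch_size}",
    ]
    return env, argv


env, argv = single_gpu_launch(batch_size=4)
print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(argv))
# To actually start training: subprocess.run(argv, env=env, check=True)
```

If memory is still exhausted, lowering batch_size further is the first knob to try.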
Hi, what a great job!
I restricted the visible CUDA devices with set CUDA_VISIBLE_DEVICES=0
and ran the evaluation command: python -m src.main +experiment=acid checkpointing.load=checkpoints/acid.ckpt mode=test dataset/view_sampler=evaluation dataset.view_sampler.index_path=assets/evaluation_index_acid.json test.compute_scores=true
It fails with the following error:
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Loading model from: C:\Users\user.conda\envs\mvsplat\lib\site-packages\lpips\weights\v0.1\vgg.pth
Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/2
[W socket.cpp:663] [c10d] The client socket has failed to connect to [DESKTOP-M7IQG5O]:2765 (system error: 10049
Error executing job with overrides: ['+experiment=acid', 'checkpointing.load=checkpoints/acid.ckpt', 'mode=test', 'dataset/view_sampler=evaluation', 'dataset.view_sampler.index_path=assets/evaluation_index_acid.json', 'test.compute_scores=true']
Traceback (most recent call last):
File "F:\Github\mvsplat\src\main.py", line 143, in train
trainer.test(
File "C:\Users\user.conda\envs\mvsplat\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 754, in test
return call._call_and_handle_interrupt(
File "C:\Users\user.conda\envs\mvsplat\lib\site-packages\pytorch_lightning\trainer\call.py", line 43, in _call_and_handle_interrupt
return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
File "C:\Users\user.conda\envs\mvsplat\lib\site-packages\pytorch_lightning\strategies\launchers\subprocess_script.py", line 105, in launch
return function(*args, **kwargs)
File "C:\Users\user.conda\envs\mvsplat\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 794, in _test_impl
results = self._run(model, ckpt_path=ckpt_path)
File "C:\Users\user.conda\envs\mvsplat\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 943, in _run
self.strategy.setup_environment()
File "C:\Users\user.conda\envs\mvsplat\lib\site-packages\pytorch_lightning\strategies\ddp.py", line 154, in setup_environment
self.setup_distributed()
File "C:\Users\user.conda\envs\mvsplat\lib\site-packages\pytorch_lightning\strategies\ddp.py", line 203, in setup_distributed
_init_dist_connection(self.cluster_environment, self._process_group_backend, timeout=self._timeout)
File "C:\Users\user.conda\envs\mvsplat\lib\site-packages\lightning_fabric\utilities\distributed.py", line 291, in _init_dist_connection
torch.distributed.init_process_group(torch_distributed_backend, rank=global_rank, world_size=world_size, **kwargs)
File "C:\Users\user.conda\envs\mvsplat\lib\site-packages\torch\distributed\c10d_logger.py", line 74, in wrapper
func_return = func(*args, **kwargs)
File "C:\Users\user.conda\envs\mvsplat\lib\site-packages\torch\distributed\distributed_c10d.py", line 1148, in init_process_group
default_pg, _ = _new_process_group_helper(
File "C:\Users\user.conda\envs\mvsplat\lib\site-packages\torch\distributed\distributed_c10d.py", line 1268, in _new_process_group_helper
raise RuntimeError("Distributed package doesn't have NCCL built in")
RuntimeError: Distributed package doesn't have NCCL built in
I followed the instructions and don't know why this happens. Please help!
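The paths in the traceback are Windows paths, and PyTorch builds for Windows do not include NCCL, which is what the final RuntimeError reports. A hedged sketch of a backend choice follows; pick_backend is a hypothetical helper, and in a real setup the nccl_available flag would come from torch.distributed.is_nccl_available(). How to pass the chosen backend to mvsplat's trainer is not shown in the issue and would need checking in the repository:

```python
import sys


def pick_backend(platform: str = sys.platform, nccl_available: bool = False) -> str:
    """Choose a torch.distributed backend name.

    NCCL is used only when it is reported as available and the platform is
    not Windows; otherwise fall back to "gloo".
    """
    if nccl_available and not platform.startswith("win"):
        return "nccl"
    return "gloo"


print(pick_backend("win32", nccl_available=False))
print(pick_backend("linux", nccl_available=True))
```

Note also that the log shows "MEMBER: 2/2", so two processes were started despite the CUDA_VISIBLE_DEVICES setting; with a single process the distributed setup may not be needed at all.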
Hi, when running the following test command:
python -m src.main +experiment=re10k checkpointing.load=checkpoints/re10k.ckpt mode=test dataset/view_sampler=evaluation dataset.view_sampler.index_path=assets/evaluation_index_re10k_video.json test.save_video=true test.save_image=false test.compute_scores=false
I get this error:
Saving outputs to /home/ali/git/mvsplat/outputs/2024-04-15/11-22-42.
rm: cannot remove 'outputs/local': No such file or directory
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Loading model from: /opt/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Restoring states from the checkpoint path at checkpoints/re10k.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from the checkpoint at checkpoints/re10k.ckpt
Testing: | | 0/? [00:00<?, ?it/s]ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
Error executing job with overrides: ['+experiment=re10k', 'checkpointing.load=checkpoints/re10k.ckpt', 'mode=test', 'dataset/view_sampler=evaluation', 'dataset.view_sampler.index_path=assets/evaluation_index_re10k_video.json', 'test.save_video=true', 'test.save_image=false', 'test.compute_scores=false']
Traceback (most recent call last):
File "/home/ali/git/mvsplat/src/main.py", line 143, in train
trainer.test(
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 754, in test
return call._call_and_handle_interrupt(
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 794, in _test_impl
results = self._run(model, ckpt_path=ckpt_path)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 987, in _run
results = self._run_stage()
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1026, in _run_stage
return self._evaluation_loop.run()
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/loops/utilities.py", line 182, in _decorator
return loop_run(self, *args, **kwargs)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 128, in run
batch, batch_idx, dataloader_idx = next(data_fetcher)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/loops/fetchers.py", line 133, in __next__
batch = super().__next__()
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/loops/fetchers.py", line 60, in __next__
batch = next(self.iterator)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/utilities/combined_loader.py", line 341, in __next__
out = next(self._iterator)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/utilities/combined_loader.py", line 142, in __next__
out = next(self.iterators[0])
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
data = self._next_data()
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
return self._process_data(data)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
data.reraise()
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/_utils.py", line 694, in reraise
raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 42, in fetch
return self.collate_fn(data)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 265, in default_collate
return collate(batch, collate_fn_map=default_collate_fn_map)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 127, in collate
return elem_type({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 127, in <dictcomp>
return elem_type({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 127, in collate
return elem_type({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 127, in <dictcomp>
return elem_type({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 119, in collate
return collate_fn_map[elem_type](batch, collate_fn_map=collate_fn_map)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 160, in collate_tensor_fn
storage = elem._typed_storage()._new_shared(numel, device=elem.device)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/storage.py", line 866, in _new_shared
untyped_storage = torch.UntypedStorage._new_shared(size * self._element_size(), device=device)
File "/opt/conda/envs/mvsplat/lib/python3.10/site-packages/torch/storage.py", line 262, in _new_shared
return cls._new_using_fd_cpu(size)
**RuntimeError: unable to write to file </torch_14471_3616880733_1>: No space left on device (28)**
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Testing: |
I am using an RTX 3070Ti to run the test with 8GB of VRAM. I reckon this is not enough memory for the test to run.
Is it possible to make the test run on a GPU with less memory (for example by reducing the batch size or the number of test videos loaded into GPU memory)?
Thanks for this great work.
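The log above fails with "insufficient shared memory (shm)" and "No space left on device" inside a DataLoader worker, which points at host shared memory rather than GPU VRAM. A small sketch for checking how much shared memory is free, assuming a Linux system where it is mounted at /dev/shm (shared_memory_free_bytes is a hypothetical helper):

```python
import os
import shutil


def shared_memory_free_bytes(path: str = "/dev/shm"):
    """Return free bytes on the shared-memory mount, or None if it is absent."""
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).free


free = shared_memory_free_bytes()
if free is None:
    print("no /dev/shm on this system")
else:
    print(f"/dev/shm free: {free / 2**20:.0f} MiB")
```

If the value is small, common remedies in general PyTorch practice (not verified for this repository) are enlarging the shared-memory mount, for example with Docker's --shm-size option when running in a container, or reducing the number of DataLoader workers.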
Hello, I tested on the n3dv dataset using the model and weights you provided and found that all NVS images have a ghosting issue. Since the Gaussians are obtained by back-projecting from two views into the world coordinate system, I suspect the problem may be due to the accuracy of the camera parameters (the camera parameters in the n3dv dataset are float64 and I converted them to float32), or image scaling may be causing the back-projected Gaussians to misalign in depth (I scaled the images from 2k×2k to 512×512). Looking forward to your reply!
The following images are, respectively, the NVS RGB image and the depth map from the source view:
Hello. When you apply a center crop to the image, I think you should adjust cx and cy in the image intrinsics. But your implementation adjusts fx and fy instead. Is this a bug in the implementation?
mvsplat/src/dataset/shims/patch_shim.py
Lines 15 to 21 in 378ff81
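Whether scaling fx and fy is correct depends on the units of the intrinsics. A minimal sketch, assuming the intrinsics are stored normalized by image size (another issue on this page mentions "normalized intrinsic parameters"); this is a reading of the convention, not a line-by-line statement about patch_shim.py. Both helper functions and the numeric values are hypothetical:

```python
def crop_pixel_intrinsics(fx, cx, w_in, w_out):
    """Center crop in pixel units: focal length unchanged, principal point shifts."""
    offset = (w_in - w_out) / 2
    return fx, cx - offset


def crop_normalized_intrinsics(fx_n, cx_n, w_in, w_out):
    """Same crop with intrinsics divided by image width."""
    scale = w_in / w_out
    return fx_n * scale, 0.5 + (cx_n - 0.5) * scale


w_in, w_out = 640, 512
fx, cx = 500.0, 320.0  # illustrative values, principal point at the centre

fx_p, cx_p = crop_pixel_intrinsics(fx, cx, w_in, w_out)
fx_n, cx_n = crop_normalized_intrinsics(fx / w_in, cx / w_in, w_in, w_out)

# Both routes describe the same camera once expressed in the same units.
print(fx_p / w_out, cx_p / w_out)
print(fx_n, cx_n)
```

With a centred principal point the normalized cx stays at 0.5 and only the normalized focal length changes, so adjusting fx and fy is the expected update in that convention. In pixel units the opposite holds: fx and fy stay fixed and cx, cy shift.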
Thank you for MVSplat, I like it very much!
How can the LLFF and Mip-NeRF 360 datasets be supported?
Hi Yuedong, thank you for open-sourcing your great work!
When I trained the model using 3 Nvidia RTX 3090s (batch size 4 per GPU), I got significantly worse results on re10k.
psnr 22.12379274863242
ssim 0.7298626045353773
lpips 0.22073094525619313
Will a smaller batch size or multi-GPU training significantly affect the performance of the model?
By the way, with the official weights I can get results consistent with the paper.
psnr 26.386906073201686
ssim 0.8690403559103327
lpips 0.12837660807718004
Hi, thanks for the great work. Could you please give some details on what changed in the new diff-gaussian-rasterization package? Would a model trained with the new package be compatible with the old one?
Hi, thanks for the great work. The paper mentions D=128, and "After obtaining the multi-view depth predictions, we directly unproject them to 3D point clouds using the camera parameters."
So is the initial number of 3D Gaussians 128*K? Isn't this number too small? Approximately how many 3D Gaussians are there after the training is completed?
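For orientation, D=128 in the quoted passage is the number of depth candidates, which is a different quantity from the number of Gaussians. A rough count can be sketched under the assumption of one Gaussian per pixel per context view (the pixel-aligned design; treat this as an assumption rather than a confirmed detail), using the 256 × 256 input size mentioned elsewhere on this page. The helper name is hypothetical:

```python
def num_pixel_aligned_gaussians(num_views: int, height: int, width: int,
                                per_pixel: int = 1) -> int:
    """Count Gaussians when every pixel of every view yields `per_pixel` Gaussians."""
    return num_views * height * width * per_pixel


# Two context views at 256 x 256 resolution.
print(num_pixel_aligned_gaussians(2, 256, 256))  # 131072
```

Under that assumption the count is fixed by the input views and resolution and does not grow during training, since the network predicts the Gaussians in a single forward pass.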
Hi, thanks for the great work. I have some questions about custom data training.
In the paper, re10k training takes only 2 context-view RGB images with their intrinsics and extrinsics as input, and outputs a novel-view RGB image.
Hi, dear author! I followed the instructions in README.md to run the evaluation. I downloaded the pretrained models and sub-datasets and saved them to checkpoints and datasets respectively, but I got wrong results. It seems that I missed something important. Do you have any advice on how to deal with it?
python -m src.main +experiment=acid checkpointing.load=checkpoints/acid.ckpt mode=test dataset/view_sampler=evaluation dataset.view_sampler.index_path=assets/evaluation_index_acid.json test.compute_scores=true
Saving outputs to /data5/ly/mmdet/mvsplat/outputs/2024-04-07/09-31-02.
rm: cannot remove 'outputs/local': No such file or directory
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/8
Saving outputs to /data5/ly/mmdet/mvsplat/outputs/2024-04-07/09-31-02.
rm: cannot remove 'outputs/local': No such file or directory
Saving outputs to /data5/ly/mmdet/mvsplat/outputs/2024-04-07/09-31-02.
rm: cannot remove 'outputs/local': No such file or directory
Saving outputs to /data5/ly/mmdet/mvsplat/outputs/2024-04-07/09-31-02.
rm: cannot remove 'outputs/local': No such file or directory
Saving outputs to /data5/ly/mmdet/mvsplat/outputs/2024-04-07/09-31-02.
Saving outputs to /data5/ly/mmdet/mvsplat/outputs/2024-04-07/09-31-02.
rm: cannot remove 'outputs/local': No such file or directory
rm: cannot remove 'outputs/local': No such file or directory
Saving outputs to /data5/ly/mmdet/mvsplat/outputs/2024-04-07/09-31-02.
rm: cannot remove 'outputs/local': No such file or directory
Saving outputs to /data5/ly/mmdet/mvsplat/outputs/2024-04-07/09-31-02.
rm: cannot remove 'outputs/local': No such file or directory
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Initializing distributed: GLOBAL_RANK: 5, MEMBER: 6/8
Initializing distributed: GLOBAL_RANK: 4, MEMBER: 5/8
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Initializing distributed: GLOBAL_RANK: 2, MEMBER: 3/8
Initializing distributed: GLOBAL_RANK: 3, MEMBER: 4/8
Initializing distributed: GLOBAL_RANK: 6, MEMBER: 7/8
Initializing distributed: GLOBAL_RANK: 7, MEMBER: 8/8
Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/8
distributed_backend=nccl
All distributed processes registered. Starting with 8 processes
Restoring states from the checkpoint path at checkpoints/acid.ckpt
LOCAL_RANK: 5 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
LOCAL_RANK: 4 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
LOCAL_RANK: 7 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
LOCAL_RANK: 2 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
LOCAL_RANK: 6 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
LOCAL_RANK: 3 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
Loaded model weights from the checkpoint at checkpoints/acid.ckpt
Testing DataLoader 0: 0%| | 0/16 [00:00<?, ?it/s]Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Testing DataLoader 0: 6%|█ | 1/16 [00:04<01:04, 0.23it/s]
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Testing DataLoader 0: 12%|█ | 2/16 [00:04<00:31, 0.45it/s]
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Testing DataLoader 0: 31%|███ | 5/16 [00:04<00:10, 1.02it/s]
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Testing DataLoader 0: 38%|████ | 6/16 [00:05<00:08, 1.18it/s]
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Testing DataLoader 0: 44%|████ | 7/16 [00:05<00:06, 1.34it/s]
Loading model from: /data2/ly/conda/envs/mvsplat/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Testing DataLoader 0: 81%|████████ | 13/16 [00:06<00:01, 2.13it/s]
psnr 5.61385152890132
ssim 0.0009663698885840579
lpips 0.7411693197030288
encoder: 8 calls, avg. 0.061416834592819214 seconds per call
decoder: 24 calls, avg. 0.0018823047478993733 seconds per call
Testing DataLoader 0: 81%|████████ | 13/16 [00:06<00:01, 2.12it/s]
psnr 5.61385152890132
ssim 0.0009663698885840579
lpips 0.7411693197030288
encoder: 8 calls, avg. 0.06180933117866516 seconds per call
decoder: 24 calls, avg. 0.0019558072090148926 seconds per call
psnr 5.61385152890132
ssim 0.0009663698885840579
lpips 0.7411693197030288
encoder: 8 calls, avg. 0.06272295117378235 seconds per call
decoder: 24 calls, avg. 0.002048651377360026 seconds per call
psnr 5.61385152890132
ssim 0.0009663698885840579
lpips 0.7411693197030288
encoder: 8 calls, avg. 0.0654122531414032 seconds per call
decoder: 24 calls, avg. 0.0019436180591583252 seconds per call
psnr 5.61385152890132
ssim 0.0009663698885840579
lpips 0.7411693197030288
encoder: 8 calls, avg. 0.09683313965797424 seconds per call
decoder: 24 calls, avg. 0.002124359210332235 seconds per call
psnr 5.61385152890132
ssim 0.0009663698885840579
lpips 0.7411693197030288
psnr 5.61385152890132
ssim 0.0009663698885840579
encoder: 8 calls, avg. 0.09448182582855225 seconds per call
decoder: 24 calls, avg. 0.0020943681399027505 seconds per call
lpips 0.7411693197030288
encoder: 8 calls, avg. 0.10512921214103699 seconds per call
decoder: 24 calls, avg. 0.0021263360977172847 seconds per call
psnr 5.61385152890132
ssim 0.0009663698885840579
lpips 0.7411693197030288
encoder: 8 calls, avg. 0.10539361834526062 seconds per call
decoder: 24 calls, avg. 0.002041985591252645 seconds per call
Hi, I encountered a problem. When I run the code with multiple GPUs distributed across different nodes on Slurm, I find that I cannot use GPUs on different nodes. Does the code support distributed training across nodes?
Will this have an MIT license similar to pixelSplat and UniMatch?
Thanks for your great work! @donydchen
I tried to train MVSplat using the processed RealEstate10K dataset provided by pixelSplat's author, but the following error occurred.
The training loop ran successfully for 10K steps.
I have no idea what this is. Maybe a zero division? Have you faced this error before?
Error executing job with overrides: ['+experiment=re10k', 'data_loader.train.batch_size=8']
Traceback (most recent call last):
File "/home/liang/mvsplat/src/main.py", line 141, in train
trainer.fit(model_wrapper, datamodule=data_module, ckpt_path=checkpoint_path)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
call._call_and_handle_interrupt(
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 987, in _run
results = self._run_stage()
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1033, in _run_stage
self.fit_loop.run()
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 205, in run
self.advance()
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 363, in advance
self.epoch_loop.run(self._data_fetcher)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 140, in run
self.advance(data_fetcher)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 223, in advance
batch = call._call_strategy_hook(trainer, "batch_to_device", batch, dataloader_idx=0)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 309, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/strategies/strategy.py", line 278, in batch_to_device
return model._apply_batch_transfer_handler(batch, device=device, dataloader_idx=dataloader_idx)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/core/module.py", line 347, in _apply_batch_transfer_handler
batch = self._call_batch_hook("transfer_batch_to_device", batch, device, dataloader_idx)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/core/module.py", line 336, in _call_batch_hook
return trainer_method(trainer, hook_name, *args)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 157, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/pytorch_lightning/core/hooks.py", line 613, in transfer_batch_to_device
return move_data_to_device(batch, device)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/lightning_fabric/utilities/apply_func.py", line 103, in move_data_to_device
return apply_to_collection(batch, dtype=_TransferableDataType, function=batch_to)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/lightning_utilities/core/apply_func.py", line 72, in apply_to_collection
return _apply_to_collection_slow(
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/lightning_utilities/core/apply_func.py", line 104, in _apply_to_collection_slow
v = _apply_to_collection_slow(
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/lightning_utilities/core/apply_func.py", line 104, in _apply_to_collection_slow
v = _apply_to_collection_slow(
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/lightning_utilities/core/apply_func.py", line 96, in _apply_to_collection_slow
return function(data, *args, **kwargs)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/lightning_fabric/utilities/apply_func.py", line 97, in batch_to
data_output = data.to(device, **kwargs)
File "/home/liang/anaconda3/envs/mvsplat/lib/python3.10/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 1201923) is killed by signal: Floating point exception.
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Hi,
Thank you for the work. Could you please clarify the meaning behind "times_per_scene" in dataset_re10k.py?
Thanks in advance
Hi, I sincerely appreciate you sharing this amazing work!
I am currently working on reproducing the DTU cross-generalization test results as described in your recent publication.
Despite my efforts to follow the experimental setup outlined in the paper, including ensuring the camera requirements are met (with normalized intrinsic parameters and cam2world matrices for extrinsic parameters), I've encountered difficulties in replicating the results presented in your paper, specifically the quality of the images.
For reference, here are the results I obtained:
Would it be possible for you to share the specific dataloader used for the DTU evaluation or provide any guidance or recommendations that could aid in accurately evaluating the tests?
Thanks in advance.
Sincerely,
Hi, thanks for your great work!
The CUDA version on my machine is 11.6. Does this mean I have to upgrade CUDA to at least 11.8 to match torch==2.1.2? Or is it possible to use cu116 with torch==1.13.1 and the matching torchvision and torchaudio?
Hoping for your reply!
Great work.
I tried to quickly validate the effectiveness of the network by "overfitting" on a small re10k subset, and the results fall short of my expectations. I wonder if I am missing some key points of your work. Below are the settings.
Dataset: re10k subset
Training platform: 4 RTX 4090 GPUs, total batch size 16 (4 per GPU).
Hyper-params: same as your newly released code, I didn't change any.
Training command keys:
+experiment=re10k
data_loader.train.batch_size=4
checkpointing.every_n_train_steps=5000
Test command keys:
+experiment=re10k
checkpointing.load=outputs/2024-04-15/17-57-20/checkpoints/epoch_1499-step_15000.ckpt
mode=test
dataset/view_sampler=evaluation
test.compute_scores=true
The results:
Testing DataLoader 0: 93%|█████████ | 38/41 [00:06<00:00, 6.03it/s]
psnr 21.53944028051276
ssim 0.7834970355033875
lpips 0.21417335558094477
encoder: 33 calls, avg. 0.0347950892014937 seconds per call
decoder: 99 calls, avg. 0.0010259873939282966 seconds per call
That is, after "overfitting" on the re10k subset for 1499 epochs / 15000 steps, the model reaches a PSNR of 21.54 on this subset (the rendered visualizations are also not good), much lower than I expected. Generally, I would expect the model to reach PSNR ~30 after "overfitting" on a small subset for 100 epochs.
diff-gaussian-rasterization-modified is correctly built and installed. I have checked the test results of your released re10k model, which are consistent with Table 1 (PSNR = 26+).
I have referred to issue 14, i.e., the test results are good after large-scale training.
Maybe your proposed model is not suitable for "overfitting" on a small subset? If so, why? That seems counter-intuitive in this field.
I would rather believe that I am missing some key points. I look forward to your clarification. Thanks.
Attached is the training log for you to check.
20240415_175717.log
Hi @donydchen, kudos on your work. I am keenly interested in MVSplat and particularly in using it with custom datasets, but I'm having trouble with a few things.
1. Do I have to upload a custom dataset to YouTube and use the URL, as in your dataset? Is there another approach I can take? If so, could you please describe it?
2. How do you generate timestamps, camera poses, images, and keys for a particular video?
Thank you in advance.
Hi, I tested the pre-trained model on the CO3D dataset. However, the results seem very bad. I checked the following: (1) the intrinsic and extrinsic parameters of the input, with the epipolar model; (2) that the images are resized to 256 × 256; (3) that the near and far depths are adjusted carefully. I wonder whether this is due to the generalizability of the pre-trained model? Thank you so much.
Thank you for your excellent work!
I noticed that your evaluation mainly focuses on rendering speed. How many hours does it take to train the model?
Thanks for your great contribution to this promising and interesting field.
I noticed that the paper's main experiment focused on two-view inputs, similar to pixelSplat. However, as you mentioned in the article, the MVS-based method can naturally be applied to multiple views (>2). Can the current pre-trained model be directly extended to multi-view (>2) input?
Besides, the cost volume used in the paper needs the (near, far) planes for discrete depth sampling. When we extend to other datasets without ground-truth (near, far) as input, how should we deal with it? Also, since each view has a separate cost volume, how should the increased parameters and the need for cross-view information exchange be handled when the views become denser and the resolution becomes larger?
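To make the role of (near, far) concrete, here is a minimal sketch of discrete depth sampling for a plane-sweep cost volume. Sampling uniformly in inverse depth is one common choice in multi-view stereo; whether MVSplat uses exactly this scheme is not stated in the question, so treat it as an assumption. The function name and the bounds 1.0 and 100.0 are illustrative; 128 matches the D=128 quoted from the paper elsewhere on this page:

```python
def depth_candidates(near: float, far: float, num: int, inverse: bool = True):
    """Return `num` depth values from near to far.

    With inverse=True the samples are uniform in 1/depth, which places more
    candidates close to the camera; otherwise they are uniform in depth.
    """
    if num < 2:
        raise ValueError("need at least two candidates")
    steps = [i / (num - 1) for i in range(num)]
    if inverse:
        inv_near, inv_far = 1.0 / near, 1.0 / far
        return [1.0 / (inv_near + t * (inv_far - inv_near)) for t in steps]
    return [near + t * (far - near) for t in steps]


depths = depth_candidates(near=1.0, far=100.0, num=128)
print(depths[0], depths[-1], len(depths))
```

Because the candidates are defined only by the two bounds, a dataset without given bounds needs some estimate of the scene's depth range before this step, and bounds that are too wide spread the same D candidates more thinly.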
Hi, thanks for this great work. I don't understand why a multiplier generated from intrinsics and pixel_size is used for scales.
scale_min = self.cfg.gaussian_scale_min
scale_max = self.cfg.gaussian_scale_max
scales = scale_min + (scale_max - scale_min) * scales.sigmoid()
h, w = image_shape
pixel_size = 1 / torch.tensor((w, h), dtype=torch.float32, device=device)
multiplier = self.get_scale_multiplier(intrinsics, pixel_size)
scales = scales * depths[..., None] * multiplier[..., None]
Is this to convert the scale factor from the image space to the camera space?
Thanks in advance!
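One way to read the quoted snippet is as a pinhole-camera footprint calculation: a pixel of fixed size in the image covers a larger region of space the farther away the surface is. The sketch below illustrates that geometry only. It assumes intrinsics normalized by image size (focal length expressed as a fraction of the image width), and the function name pixel_footprint is hypothetical; it is not the repository's get_scale_multiplier.

```python
def pixel_footprint(depth, focal_normalized, image_width):
    """Approximate camera-space width covered by one pixel at a given depth.

    Assumptions: pinhole camera, focal length normalized by image width.
    Illustrative helper only, not code from MVSplat.
    """
    pixel_size = 1.0 / image_width  # pixel width in normalized image coordinates
    multiplier = pixel_size / focal_normalized  # inverse intrinsics applied to the pixel size
    return depth * multiplier  # similar triangles: footprint grows linearly with depth


# Doubling the depth doubles the region a single pixel covers.
near_footprint = pixel_footprint(depth=2.0, focal_normalized=1.0, image_width=256)
far_footprint = pixel_footprint(depth=4.0, focal_normalized=1.0, image_width=256)
print(near_footprint, far_footprint)
```

Under this reading, multiplying the sigmoid-bounded scales by depth and by an intrinsics-derived multiplier expresses each Gaussian's size in units of "pixels at that depth", so the predicted scale range stays meaningful across depths and focal lengths.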
Hi, I have a question about the training data. Have you trained a single checkpoint with all the training data? I mean, I don't want to train different checkpoints for different datasets. Is that possible, and how should I organize the datasets? Do I need to write a different dataloader? Thanks.
Hi, thanks for the great work. I was trying to understand your code, and I have a question about the following operation.
# Create world-space covariance matrices.
covariances = build_covariance(scales, rotations)
c2w_rotations = extrinsics[..., :3, :3]
covariances = c2w_rotations @ covariances @ c2w_rotations.transpose(-1, -2)
Could you please clarify why this is being done?
Thanks in advance
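For context, the quoted lines match the standard change-of-basis rule for a covariance matrix: if points transform as x_world = R x_cam + t, the covariance transforms as R Sigma R^T (the translation drops out). The sketch below checks this with plain Python lists on a hand-picked example; the helper names are hypothetical and the example assumes the covariance was built in the camera frame, as the code comment in the snippet suggests.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    """Transpose a 3x3 matrix given as nested lists."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def to_world_covariance(cov_cam, c2w_rotation):
    """Rotate a camera-space covariance into world space: R @ cov @ R^T."""
    return matmul(matmul(c2w_rotation, cov_cam), transpose(c2w_rotation))


# A Gaussian elongated along the camera x-axis (variance 4 in x, 1 in y and z).
cov_cam = [[4, 0, 0], [0, 1, 0], [0, 0, 1]]
# Assumed camera-to-world rotation: 90 degrees about the z-axis.
rot_z_90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]

cov_world = to_world_covariance(cov_cam, rot_z_90)
print(cov_world)
```

In the world frame the elongation ends up along the y-axis, which is where the camera's x-axis points after the rotation. Without this step the Gaussians would be oriented as if every camera were aligned with the world axes.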
Hi, I am writing to bring to your attention an issue I encountered while attempting to export .ply files from scenes generated by MVSplat after running the Evaluation phase.
Background:
I utilized the following inputs and steps:
Inputs:
--Dataset: re10k
--Location: datasets/re10k/test
Output:
Scenes generated after the evaluation phase: test -> re10k List of scenes: [ 0c4c5d5f751aabf5 28e8300e004ab30b 57d25dafabb5a238 67a69088a2695987 a56ba2efb5e3fdd9... so on ]
Issue Details:
After processing the Evaluation phase on the re10k dataset, I attempted to export .ply files from the generated scenes using the provided script:
python -m src.paper.generate_point_cloud_figure_mvsplat +experiment=re10k checkpointing.load=checkpoints/re10k.ckpt mode=test dataset/view_sampler=evaluation
I made a modification to load index.json in generate_point_cloud_figure_mvsplat.py as follows:
with open("datasets/re10k/test/index.json") as f:
    test_cfgs = json.load(f)
However, upon running the script, I encountered errors.
Questions:
1. How can I specify scenes as inputs in the script?
2. What steps are necessary to successfully export .ply files from the generated scenes after the Evaluation phase?
I appreciate your assistance in resolving this issue. Please let me know if there are any further steps or information required from my end.
Thank you
Hi,
I was wondering if there were any resources for training on custom data (video/images + COLMAP camera poses)?
Also, is there a way to export the model to a renderable PLY format?
For the DTU dataset, I noticed that your code normalizes the camera intrinsics in convert.py. As a result, the poses and intrinsics in the code are hard to understand, and it is not convenient to use this code with third-party datasets (Waymo or Mip-NeRF 360).
Thank you for the incredible work. I have 2 questions:
Thank you for your time.
Hi, thanks for the great work. I have some questions about the advantages and limitations of the feature matching cost volume formulation:
Thanks
As in the paper, Sect. 3.2, the opacity is predicted from the matching confidence with 2 convolution layers. However, I can only find the function map_pdf_to_opacity in the code that maps densities to opacities.
I wonder which one is the final implementation. Looking forward to your reply!