eladrich / latent-nerf
Official Implementation for "Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures"
License: MIT License
Is there a specific conda environment you could provide? I am getting errors on a p3.2xlarge, the AWS EC2 instance type that has a V100.
It seems that encoding a high-resolution image (over 3000 x 4000) still runs out of memory. This is not for training; I am just testing the limits of encoding.
I tried to reproduce the result with default parameters. However, I got worse results than the reported results.
Dear eladrich,
Thanks for your great repo! I tried adding a start_shading_iter. However, the result with "lambertian" shading looks rather strange. What might cause this? I also find that when I increase the number of training epochs from 5000 to 10000, the result is sometimes entirely black. Has anybody faced the same problem?
What is the easiest way to convert a NeRF generated using this project into a mesh (.obj, for example)?
Can you share the training cost of your awesome work? thx~
Hi, I increased the number of epochs from 5000 to 10000 and lowered the learning rate from 1e-3 to 5e-4, and got quite good results. However, this learning rate does not seem to work well for every object. I am a bit confused about how to set the lr and the number of epochs. Can somebody give me some advice? Thanks a lot!
Hi, I keep getting CUDA out of memory...
Which parameters should I change for this?
Thanks for the great research
Seems that diffuse reflectance (a.k.a dot product shading) should work with any number of channels, but
I'm wondering what the light color, and ambient light color should be in latent space. Dream Fusion uses light color [.9, .9, .9] and ambient light color [.1, .1, .1] in RGB space. How does this translate to latent space?
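For reference, the dot-product shading mentioned above can be written so that it is independent of the channel count. The following is only a minimal sketch: the function name `lambertian_shade` is hypothetical, the vectors are assumed to be unit length, and applying the RGB values 0.9 and 0.1 uniformly to every latent channel is an assumption for illustration, not an answer to how these colors should translate to latent space.

```python
def lambertian_shade(albedo, normal, light_dir, light_color=0.9, ambient_color=0.1):
    """Diffuse shading per channel: albedo * (light * max(0, n . l) + ambient).

    albedo:    list of per-channel values (3 for RGB, 4 for a latent, etc.)
    normal:    unit surface normal (x, y, z)
    light_dir: unit vector pointing from the surface toward the light
    The scalar light/ambient values are applied to every channel (an assumption).
    """
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    diffuse = max(0.0, n_dot_l)  # surfaces facing away receive no direct light
    factor = light_color * diffuse + ambient_color
    return [a * factor for a in albedo]


# A 4-channel "latent" albedo lit head-on, then lit from the side.
latent_albedo = [1.0, -0.5, 0.2, 0.0]
head_on = lambertian_shade(latent_albedo, (0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
from_side = lambertian_shade(latent_albedo, (0.0, 0.0, 1.0), (1.0, 0.0, 0.0))
print(head_on, from_side)
```

Note that latent channels can be negative, so a uniform scale like this dims them toward zero rather than toward "black"; whether that is the desired behaviour is exactly the open question in this issue.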
Please let me know if you are still planning to release a Google Colab; in September it was said to be coming soon.
When running the demo command for unconstrained Latent-NeRF text-to-3D (below), I get a runtime error during the tilegrid encoding
run command: python3 -m scripts.train_latent_nerf --config_path demo_configs/latent_nerf/sand_castle.yaml
Error:
import _gridencoder as _backend
ModuleNotFoundError: No module named '_gridencoder'
During handling of the above exception, another exception occurred:
21 errors detected in the compilation of "/tmp/tmpxft_000039fb_00000000-6_gridencoder.cpp1.ii".
ninja: build stopped: subcommand failed.
Hi,
The results look great. However, they seem to miss high-frequency details. Do you know why?
This is great work! If I want to use my own data (e.g., some images) as input, how should I modify nerf_dataset.py? Can you give me some advice? Thank you very much!
Thanks for your great research.
Compared with the original DreamFusion (or Stable DreamFusion), it seems that the orientation loss has been left out of Latent-NeRF. This loss on the normals is designed to produce better 3D geometry. Was it omitted for a particular reason?
I use the default codebase and command for training:
python -m scripts.train_latent_nerf --log.exp_name 'sand_castle' --guide.text 'a highly detailed sand castle' --render.nerf_type latent
And I get this strange result '5001_rgb.mp4'. Does anyone have some advice?
Hi, thanks for your great work! When I tried training the sketch-guided Latent-NeRF myself, I found that "teddy.obj" could not be loaded properly, for reasons that were unclear, and the result I got with demo_configs/lego_man.yaml looked as if it had been trained from scratch, with no relation to the sketch shape. I then tried loading the "teddy.obj" file with the trimesh library and found that it was loaded as a Scene object instead of a Trimesh object.
I fixed this issue by converting the Scene into a Trimesh and exporting the result to a new obj file, following this link. I still hope you can check the "teddy.obj" file in case someone else runs into the same issue.
Thanks for your interesting work again!
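The Scene-to-Trimesh conversion described above could look roughly like the sketch below. It is not the code from the linked answer: the helper names `merge_meshes` and `load_as_single_mesh` and the output filename are hypothetical, and the sketch assumes every geometry in the scene is a triangle mesh and ignores scene-graph transforms and materials.

```python
def merge_meshes(parts):
    """Concatenate (vertices, faces) pairs into one vertex list and one face list.

    Face indices of each part are shifted by the number of vertices
    already collected, so they keep pointing at the right vertices.
    """
    vertices, faces = [], []
    for part_vertices, part_faces in parts:
        offset = len(vertices)
        vertices.extend(part_vertices)
        faces.extend(tuple(i + offset for i in face) for face in part_faces)
    return vertices, faces


def load_as_single_mesh(path):
    """Load a mesh file with trimesh, flattening a Scene into one Trimesh."""
    import trimesh  # third-party; imported lazily so merge_meshes runs without it

    loaded = trimesh.load(path)
    if isinstance(loaded, trimesh.Scene):
        # Assumption: all scene geometries are Trimesh objects with no transforms.
        parts = [(g.vertices.tolist(), g.faces.tolist())
                 for g in loaded.geometry.values()]
        vertices, faces = merge_meshes(parts)
        return trimesh.Trimesh(vertices=vertices, faces=faces, process=False)
    return loaded


if __name__ == "__main__":
    # Two separate triangles become one mesh with re-indexed faces.
    tri_a = ([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)])
    tri_b = ([(0, 0, 1), (1, 0, 1), (0, 1, 1)], [(0, 1, 2)])
    print(merge_meshes([tri_a, tri_b])[1])
```

With trimesh installed, `load_as_single_mesh("teddy.obj").export("teddy_single.obj")` would write the flattened mesh to a new file that can then be referenced from the config.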
Hello, I'm trying to reproduce the German shepherd example from the paper using the animal.obj file in the shapes folder, but the result is far from the quality presented in the paper. I'm only modifying the lego man demo config to use animal.obj. Is there anything else I need to add to reproduce it?
Hi,
Thank you for your nice work and open source.
Could you give me an example command to run Textual Inversion? I ran it with the command python -m scripts.train_latent_nerf --log.exp_name 'textual inversion backpack' --guide.text 'a backpack that looks like *' --render.nerf_type latent --guide.concept_name=cat-toy, but could not get a good result. Is there a problem with my command?
Hope to receive your reply. Thanks.
I tried to run the sand castle command from the README, which was:
python -m scripts.train_latent_nerf --config_path demo_configs/latent_nerf/sand_castle.yaml
I have installed gridencoder from the stable-dreamfusion repo, but the run throws a TypeError at the grid_encode_forward call, and I am not sure how to resolve it.
The trace output is below:
`/usr/lib/python3.8/runpy.py:192 in _run_module_as_main │
│ │
│ 189 │ main_globals = sys.modules["__main__"].__dict__ │
│ 190 │ if alter_argv: │
│ 191 │ │ sys.argv[0] = mod_spec.origin │
│ ❱ 192 │ return _run_code(code, main_globals, None, │
│ 193 │ │ │ │ │ "__main__", mod_spec) │
│ 194 │
│ 195 def run_module(mod_name, init_globals=None, │
│ │
│ /usr/lib/python3.8/runpy.py:85 in _run_code │
│ │
│ 82 │ │ │ │ │ loader = loader, │
│ 83 │ │ │ │ │ package = pkg_name, │
│ 84 │ │ │ │ │ spec = mod_spec) │
│ ❱ 85 │ exec(code, run_globals) │
│ 86 │ return run_globals │
│ 87 │
│ 88 def _run_module_code(code, init_globals=None, │
│ │
│ /home/vghorpad/stable-diff/latent-nerf/scripts/train_latent_nerf.py:17 in <module> │
│ │
│ 14 │ │ trainer.train() │
│ 15 │
│ 16 if __name__ == '__main__': │
│ ❱ 17 │ main() │
│ │
│ /home/vghorpad/stable-diff/venv_sdfusion20/lib/python3.8/site-packages/pyrallis/argparsing.py:15 │
│ 8 in wrapper_inner │
│ │
│ 155 │ │ │ argspec = inspect.getfullargspec(fn) │
│ 156 │ │ │ argtype = argspec.annotations[argspec.args[0]] │
│ 157 │ │ │ cfg = parse(config_class=argtype, config_path=config_path) │
│ ❱ 158 │ │ │ response = fn(cfg, *args, **kwargs) │
│ 159 │ │ │ return response │
│ 160 │ │ │
│ 161 │ │ return wrapper_inner │
│ │
│ /home/vghorpad/stable-diff/latent-nerf/scripts/train_latent_nerf.py:14 in main │
│ │
│ 11 │ if cfg.log.eval_only: │
│ 12 │ │ trainer.full_eval() │
│ 13 │ else: │
│ ❱ 14 │ │ trainer.train() │
│ 15 │
│ 16 if __name__ == '__main__': │
│ 17 │ main() │
│ │
│ /home/vghorpad/stable-diff/latent-nerf/src/latent_nerf/training/trainer.py:124 in train │
│ │
│ 121 │ def train(self): │
│ 122 │ │ logger.info('Starting training ^^') │
│ 123 │ │ # Evaluate the initialization │
│ ❱ 124 │ │ self.evaluate(self.dataloaders['val'], self.eval_renders_path) │
│ 125 │ │ self.nerf.train() │
│ 126 │ │ │
│ 127 │ │ pbar = tqdm(total=self.cfg.optim.iters, initial=self.train_step, │
│ │
│ /home/vghorpad/stable-diff/latent-nerf/src/latent_nerf/training/trainer.py:173 in evaluate │
│ │
│ 170 │ │ │
│ 171 │ │ for i, data in enumerate(dataloader): │
│ 172 │ │ │ with torch.cuda.amp.autocast(enabled=self.cfg.optim.fp16): │
│ ❱ 173 │ │ │ │ preds, preds_depth, preds_normals = self.eval_render(data) │
│ 174 │ │ │ │
│ 175 │ │ │ pred, pred_depth, pred_normals = tensor2numpy(preds[0]), tensor2numpy(preds │
│ 176 │ │ │ │ preds_normals[0]) │
│ │
│ /home/vghorpad/stable-diff/latent-nerf/src/latent_nerf/training/trainer.py:261 in eval_render │
│ │
│ 258 │ │ ambient_ratio = data['ambient_ratio'] if 'ambient_ratio' in data else 1.0 │
│ 259 │ │ light_d = data['light_d'] if 'light_d' in data else None │
│ 260 │ │ │
│ ❱ 261 │ │ outputs = self.nerf.render(rays_o, rays_d, staged=True, perturb=perturb, light_d │
│ 262 │ │ │ │ │ │ │ │ ambient_ratio=ambient_ratio, shading=shading, force_a │
│ 263 │ │ │
│ 264 │ │ pred_depth = outputs['depth'].reshape(B, H, W) │
│ │
│ /home/vghorpad/stable-diff/latent-nerf/src/latent_nerf/models/renderer.py:410 in render │
│ │
│ 407 │ │ │ results['weights_sum'] = weights_sum │
│ 408 │ │ │
│ 409 │ │ else: │
│ ❱ 410 │ │ │ results = _run(rays_o, rays_d, **kwargs) │
│ 411 │ │ │
│ 412 │ │ return results │
│ │
│ /home/vghorpad/stable-diff/latent-nerf/src/latent_nerf/models/renderer.py:282 in run_cuda │
│ │
│ 279 │ │ │ │ │
│ 280 │ │ │ │ xyzs, dirs, deltas = self.raymarching.march_rays(n_alive, n_step, rays_a │
│ 281 │ │ │ │ │
│ ❱ 282 │ │ │ │ sigmas, rgbs, normals = self(xyzs, dirs, light_d, ratio=ambient_ratio, s │
│ 283 │ │ │ │ self.raymarching.composite_rays(n_alive, n_step, rays_alive, rays_t, sig │
│ 284 │ │ │ │ │
│ 285 │ │ │ │ rays_alive = rays_alive[rays_alive >= 0] │
│ │
│ /home/vghorpad/stable-diff/venv_sdfusion20/lib/python3.8/site-packages/torch/nn/modules/module.p │
│ y:1190 in _call_impl │
│ │
│ 1187 │ │ # this function, and just call forward. │
│ 1188 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1189 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1190 │ │ │ return forward_call(*input, **kwargs) │
│ 1191 │ │ # Do not call functions when jit is used │
│ 1192 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1193 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/vghorpad/stable-diff/latent-nerf/src/latent_nerf/models/network_grid.py:108 in forward │
│ │
│ 105 │ │ │
│ 106 │ │ if shading == 'albedo': │
│ 107 │ │ │ # no need to query normal │
│ ❱ 108 │ │ │ sigma, color = self.common_forward(x) │
│ 109 │ │ │ normal = None │
│ 110 │ │ │
│ 111 │ │ else: │
│ │
│ /home/vghorpad/stable-diff/latent-nerf/src/latent_nerf/models/network_grid.py:62 in │
│ common_forward │
│ │
│ 59 │ │ # x: [N, 3], in [-bound, bound] │
│ 60 │ │ │
│ 61 │ │ # sigma │
│ ❱ 62 │ │ h = self.encoder(x, bound=self.bound) │
│ 63 │ │ │
│ 64 │ │ h = self.sigma_net(h) │
│ 65 │
│ │
│ /home/vghorpad/stable-diff/venv_sdfusion20/lib/python3.8/site-packages/torch/nn/modules/module.p │
│ y:1190 in _call_impl │
│ │
│ 1187 │ │ # this function, and just call forward. │
│ 1188 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1189 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1190 │ │ │ return forward_call(*input, **kwargs) │
│ 1191 │ │ # Do not call functions when jit is used │
│ 1192 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1193 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/vghorpad/stable-diff/latent-nerf/src/latent_nerf/models/encoders/gridencoder/grid.py:149 │
│ in forward │
│ │
│ 146 │ │ prefix_shape = list(inputs.shape[:-1]) │
│ 147 │ │ inputs = inputs.view(-1, self.input_dim) │
│ 148 │ │ │
│ ❱ 149 │ │ outputs = grid_encode(inputs, self.embeddings, self.offsets, self.per_level_scal │
│ 150 │ │ outputs = outputs.view(prefix_shape + [self.output_dim]) │
│ 151 │ │ │
│ 152 │ │ #print('outputs', outputs.shape, outputs.dtype, outputs.min().item(), outputs.ma │
│ │
│ /home/vghorpad/stable-diff/venv_sdfusion20/lib/python3.8/site-packages/torch/cuda/amp/autocast_m │
│ ode.py:97 in decorate_fwd │
│ │
│ 94 │ def decorate_fwd(*args, **kwargs): │
│ 95 │ │ if cast_inputs is None: │
│ 96 │ │ │ args[0]._fwd_used_autocast = torch.is_autocast_enabled() │
│ ❱ 97 │ │ │ return fwd(*args, **kwargs) │
│ 98 │ │ else: │
│ 99 │ │ │ autocast_context = torch.is_autocast_enabled() │
│ 100 │ │ │ args[0]._fwd_used_autocast = False │
│ │
│ /home/vghorpad/stable-diff/latent-nerf/src/latent_nerf/models/encoders/gridencoder/grid.py:49 in │
│ forward │
│ │
│ 46 │ │ else: │
│ 47 │ │ │ dy_dx = None │
│ 48 │ │ │
│ ❱ 49 │ │ _backend.grid_encode_forward(inputs, embeddings, offsets, outputs, B, D, C, L, S │
│ 50 │ │ │
│ 51 │ │ # permute back to [B, L * C] │
│ 52 │ │ outputs = outputs.permute(1, 0, 2).reshape(B, L * C) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: grid_encode_forward(): incompatible function arguments. The following argument types are supported:
1. (arg0: at::Tensor, arg1: at::Tensor, arg2: at::Tensor, arg3: at::Tensor, arg4: int, arg5: int, arg6: int, arg7: int, arg8: float, arg9: int, arg10: Optional[at::Tensor], arg11: int, arg12: bool, arg13: int) -> None
Invoked with: tensor([[0.5000, 0.5000, 0.5000],
[0.5000, 0.5000, 0.5000],
[0.5000, 0.5000, 0.5000],
...,
[0.5000, 0.5000, 0.5000],
[0.5000, 0.5000, 0.5000],
[0.5000, 0.5000, 0.5000]], device='cuda:0'), tensor([[-7.7486e-07, 5.3644e-05],
[-8.2314e-05, -7.3612e-05],
[-3.8505e-05, 2.6822e-05],
...,
[ 2.7418e-05, -5.0962e-05],
[ 6.2227e-05, 7.5281e-05],
[ 4.2677e-05, 9.2626e-05]], device='cuda:0', dtype=torch.float16), tensor([ 0, 4920, 18744, 51512, 136696, 352696, 876984, 1401272,
1925560, 2449848, 2974136, 3498424, 4022712, 4547000, 5071288, 5595576,
6119864], device='cuda:0', dtype=torch.int32), tensor([[[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]],
[[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]],
[[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]],
...,
[[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]],
[[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]],
[[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]]], device='cuda:0', dtype=torch.float16), 16512, 3, 2, 16, 0.4666666666666666, 16, None, 1, False`
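Reading the error message above, the compiled grid_encode_forward lists parameters arg0 through arg13, while the call supplies values only up to the final False. The short check below counts both sides; it is a sketch for reading the message, with the signature string copied from the error and the first four argument names taken from the call shown in grid.py line 49. One possible reading, an assumption rather than a confirmed diagnosis, is that the installed _gridencoder extension was built from a different version of the encoder than the grid.py in this repository.

```python
import re

# Supported signature, copied from the TypeError message.
SUPPORTED = (
    "(arg0: at::Tensor, arg1: at::Tensor, arg2: at::Tensor, arg3: at::Tensor, "
    "arg4: int, arg5: int, arg6: int, arg7: int, arg8: float, arg9: int, "
    "arg10: Optional[at::Tensor], arg11: int, arg12: bool, arg13: int) -> None"
)

# Values listed after "Invoked with:"; the four tensors are named after
# the variables in the grid.py call, the rest are the printed literals.
SUPPLIED = [
    "inputs", "embeddings", "offsets", "outputs",
    "16512", "3", "2", "16", "0.4666666666666666", "16", "None", "1", "False",
]


def count_expected_args(signature):
    """Count the argN: entries in a pybind11-style signature string."""
    return len(re.findall(r"\barg\d+:", signature))


expected = count_expected_args(SUPPORTED)
supplied = len(SUPPLIED)
print(f"expected {expected} arguments, call supplied {supplied}")
```

The mismatch is in the trailing int (arg13), which the Python side of the call never passes.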