haofengac / monodepth-fpn-pytorch
Single Image Depth Estimation with Feature Pyramid Network
License: MIT License
Thanks for your great work!
Have you experimented with an arccos-based normal loss? Does the performance vary when using arccos instead of (1 - normalized inner product)?
The code I am referring to is:
prod = ( grad_fake[:,:,None,:] @ grad_real[:,:,:,None] ).squeeze(-1).squeeze(-1)
fake_norm = torch.sqrt( torch.sum( grad_fake**2, dim=-1 ) )
real_norm = torch.sqrt( torch.sum( grad_real**2, dim=-1 ) )
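For comparison, here is a minimal sketch of the two loss variants under discussion. The function names and the clamping in the arccos variant are my own assumptions, not the repo's code; grad_fake/grad_real are per-pixel depth gradients of shape (B, N, 2), as in the snippet above.

```python
import torch

def cosine_normal_loss(grad_fake, grad_real, eps=1e-8):
    # inner product between predicted and real gradient vectors, per pixel
    prod = (grad_fake[:, :, None, :] @ grad_real[:, :, :, None]).squeeze(-1).squeeze(-1)
    fake_norm = torch.sqrt(torch.sum(grad_fake ** 2, dim=-1))
    real_norm = torch.sqrt(torch.sum(grad_real ** 2, dim=-1))
    cos = prod / (fake_norm * real_norm + eps)
    return 1 - torch.mean(cos)          # (1 - normalized inner product)

def arccos_normal_loss(grad_fake, grad_real, eps=1e-8):
    # same cosine, but penalize the angle itself
    prod = (grad_fake[:, :, None, :] @ grad_real[:, :, :, None]).squeeze(-1).squeeze(-1)
    fake_norm = torch.sqrt(torch.sum(grad_fake ** 2, dim=-1))
    real_norm = torch.sqrt(torch.sum(grad_real ** 2, dim=-1))
    cos = prod / (fake_norm * real_norm + eps)
    # clamp keeps acos (and its gradient) finite at cos = ±1
    return torch.mean(torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7)))
```

One practical difference worth noting: the gradient of acos blows up as the cosine approaches ±1, which is exactly the regime a well-trained model sits in, so the arccos variant usually needs the clamp above to train stably.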
I am training your code on the Stanford 2D-3D-S dataset, but the predicted depth map is between 0 and 1 while the real depth can reach 128. Do you know why?
Hi, I can't find constants.py in the repository, but it is present in the imports. What can I do about that?
I can't find the code for self-supervised training.
I can read from the KITTI dataset code that a kitti_training_images.txt file is required, but I can't find any such file in the KITTI depth or raw dataset downloads. Can you clarify how to prepare the KITTI dataset so that the code can be used directly?
Hi, is a pretrained model available? It would greatly help with development times. Thank you in advance!
d5, d4, d3, d2 = self.up1(self.agg1(p5)), self.up2(self.agg2(p4)), self.up3(self.agg3(p3)), self.agg4(p2)
_, _, H, W = d2.size()
vol = torch.cat( [ F.upsample(d, size=(H,W), mode='bilinear') for d in [d5,d4,d3,d2] ], dim=1 )
Could you explain what these few lines mean and what their purpose is?
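For anyone else puzzling over the same snippet: it bilinearly upsamples the three coarser decoder outputs to the spatial size of d2 and concatenates all four along the channel dimension. A standalone sketch of the same operation (the tensor shapes are illustrative assumptions, and F.interpolate replaces the deprecated F.upsample used in the repo):

```python
import torch
import torch.nn.functional as F

# Illustrative decoder outputs at different resolutions: (B, C, H, W)
d5 = torch.randn(1, 32, 15, 20)
d4 = torch.randn(1, 32, 30, 40)
d3 = torch.randn(1, 32, 60, 80)
d2 = torch.randn(1, 32, 120, 160)

_, _, H, W = d2.size()
# Resize every map to (H, W), then stack along channels: 4 * 32 = 128 channels
vol = torch.cat([F.interpolate(d, size=(H, W), mode='bilinear', align_corners=False)
                 for d in [d5, d4, d3, d2]], dim=1)
print(vol.shape)  # torch.Size([1, 128, 120, 160])
```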
Hi @xanderchf,
Thanks for your great work. It would be a great help if you could release the full code.
Hi,
Thanks for posting your code!
I have a question about the depth maps in the KITTI dataset. You mentioned that before training you use the NYU toolbox to fill in the sparse depth maps of the KITTI dataset. I am wondering:
Are the sparse depth images the ones from the official website under the section "depth completion/depth prediction" (the ~14 GB download)?
Can you explain more specifically how to use the toolbox to fill in the sparse depth maps? And how accurate is it to use the filled-in depth maps as ground-truth labels?
Thank you very much!
Hi,
Thanks for the code.
I am about to use this repo to train a model that estimates depth for images acquired from a stereo endoscope underwater. As far as I can see, this and most monocular depth methods are applied to streets and cars. Is there anything I should do, or avoid doing, when training the model, given that my aim is depth for underwater scenes and small, distant objects?
I also noticed that the overlap between my stereo images is smaller than for typical street-view images, so my problem is a smaller amount of overlap.
Thanks for reading
Now I face a couple of problems: there are no such modules as "constants", "utils", etc., but that is not a major problem. I'm trying to run main_fpn.py with python main_fpn.py --epochs 40 --cuda --bs 4 --num_workers 3 --output_dir output_dir/, but I only get NaNs as the network output and as the loss values.
Could you please write more detailed instructions on how to run your code?
Hi @xanderchf,
How many training images in NYUv2 did you use to train the network? I trained my model with the small dataset and got very bad results.
Hi @xanderchf
Thanks for your code!
But I can't find utils.py. It would be helpful if you could release this part of the code.
Thanks!
Hi @xanderchf
Following your post, I tried to fill in the missing values in the ground truth using the NYU toolbox's fill_depth_colorization in MATLAB,
and applied the colorized map,
like this.
But I'm curious about one thing: when you compute errors between ground truth and prediction, did you use the filled depth or the projected depth?
In other words, does your network output filled depth or projected depth?
Thanks
Hi, I just want to know if you made any modifications to the file process_raw.m from https://github.com/janivanecky/Depth-Estimation/tree/master/dataset, or to any other file in the official NYU dataset toolbox. I'm getting an error about get_synched_frames: it seems it cannot go through the scene subfolder (like basement_0001a); it returns 0 files for the scene even though all the files are there. I ran it in Octave on Linux Mint and used the Raw dataset, Single File (~428 GB) from https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html.
Thanks a lot for your help.
There is a lot of information about the evaluation of this network and its benchmarks.
But could you explain how you calculate these metrics (ARD, SRD, the threshold-based metrics, etc.) and release your code for them?
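These are the standard monocular-depth metrics popularized by Eigen et al.; a minimal sketch of how they are typically computed (my own implementation, not the repo's released code):

```python
import torch

def depth_metrics(pred, gt):
    # pred, gt: strictly positive depth tensors of the same shape
    abs_rel = torch.mean(torch.abs(pred - gt) / gt)        # ARD: absolute relative difference
    sq_rel  = torch.mean((pred - gt) ** 2 / gt)            # SRD: squared relative difference
    rmse    = torch.sqrt(torch.mean((pred - gt) ** 2))
    ratio   = torch.max(pred / gt, gt / pred)              # per-pixel max of the two ratios
    # threshold accuracies: fraction of pixels with ratio < 1.25, 1.25^2, 1.25^3
    deltas  = [torch.mean((ratio < 1.25 ** k).float()) for k in (1, 2, 3)]
    return abs_rel.item(), sq_rel.item(), rmse.item(), [d.item() for d in deltas]
```

In published evaluations these are usually computed only over valid ground-truth pixels and within a depth cap (e.g. 80 m on KITTI), so a mask would normally be applied before the means.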
In your repo, the grad loss factor is set to 10.
Is the reason to bring the depth loss and the grad loss into the same range, for example 0.0x?
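For concreteness, the kind of weighting being asked about looks like this (the factor value is quoted from the question; the function and variable names are my own, not the repo's):

```python
GRAD_LOSS_FACTOR = 10.0  # value quoted from the repo, per the question above

def total_loss(depth_loss, grad_loss, normal_loss):
    # scale grad_loss so its magnitude is comparable to depth_loss
    return depth_loss + GRAD_LOSS_FACTOR * grad_loss + normal_loss
```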
I'm training your model on the NYU Depth V2 dataset but can't avoid the model's predictions quickly degenerating to a single depth value for the entire image. How did you avoid this? The model doesn't appear to learn at all: the loss doesn't decrease, it just oscillates around the same value, and very little gradient, if any, reaches the parameters. I've tried different learning rates with the same result.
First of all, thanks for your code. I don't understand the settings for the params DOUBLE_BIAS and WEIGHT_DECAY. It shows me:
[epoch 0][iter 10] loss: nan RMSElog: nan grad_loss: nan normal_loss: nan
[epoch 0][iter 20] loss: nan RMSElog: nan grad_loss: nan normal_loss: nan
[epoch 0][iter 30] loss: nan RMSElog: nan grad_loss: nan normal_loss: nan
[epoch 0][iter 40] loss: nan RMSElog: nan grad_loss: nan normal_loss: nan
[epoch 0][iter 50] loss: nan RMSElog: nan grad_loss: nan normal_loss: nan
[epoch 0][iter 60] loss: nan RMSElog: nan grad_loss: nan normal_loss: nan
[epoch 0][iter 70] loss: nan RMSElog: nan grad_loss: nan normal_loss: nan
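One common cause of all-NaN losses like these (an assumption on my part, not a confirmed diagnosis of this repo) is taking sqrt or log of zero, e.g. in the gradient-norm or RMSE-log terms: the derivative of sqrt at 0 is infinite, and log(0) is -inf, so one bad pixel poisons every loss. A typical epsilon guard looks like:

```python
import torch

EPS = 1e-8

def safe_rmse_log(pred, gt):
    # clamping keeps log() finite when a predicted depth hits zero
    return torch.sqrt(torch.mean(
        (torch.log(pred.clamp(min=EPS)) - torch.log(gt.clamp(min=EPS))) ** 2))

def safe_norm(grad):
    # EPS inside the sqrt avoids the infinite derivative of sqrt at 0
    return torch.sqrt(torch.sum(grad ** 2, dim=-1) + EPS)
```

Checking the very first batch with torch.isnan on the inputs and each loss term usually localizes which term produces the first NaN.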
You apply transforms.ToTensor() on the depth after loading it. I found that PyTorch actually does depth = depth/255 for 8-bit inputs.
I guess that when computing the loss it would be better to multiply by 255 in order to get a correct loss. I haven't read your code thoroughly and don't know what you did when preprocessing the data. I am also confused about why fake and real must be multiplied by 10 when computing the loss.
I'm a beginner in this field. Thanks for your great work.