Comments (8)
@a-jahani Please correct me if I am wrong. There are two ways to obtain GT depth for the KITTI test setup (Eigen split or KITTI split doesn't matter): 1) calibration + Velodyne data (LiDAR), or 2) the official ground-truth depth images (uninterpolated) provided by the official KITTI maintainers.
People so far (including this work) have followed 1), but the idea is to slowly shift towards 2)?
If I am not wrong, there is an interpolated version (completed depth) of the GT depth images from the official KITTI maintainers; was it meant for qualitative comparisons only? Should the sparse (uninterpolated) GT depth (obtained from either 1) or 2)) alone be used for quantitative evaluation, as everyone in this field has reported?
from sfmlearner-pytorch.
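For context, route 1) amounts to projecting the Velodyne scan into the rectified camera using the calibration matrices. Below is a minimal NumPy sketch of that projection; the matrix names mirror those in the KITTI raw-data calibration files, but the exact shapes and conventions here are assumptions, not the repo's actual code:

```python
import numpy as np

def velo_to_depth(velo, T_cam_velo, R_rect, P_rect, h, w):
    """Build a sparse depth map from a Velodyne scan (route 1).
    velo: (N, 4) x, y, z, reflectance.  T_cam_velo, R_rect: (4, 4),
    P_rect: (3, 4) -- names follow the KITTI calibration files (assumed)."""
    pts = velo[velo[:, 0] > 0]                        # keep points in front of the car
    pts = np.hstack([pts[:, :3], np.ones((len(pts), 1))])
    proj = (P_rect @ R_rect @ T_cam_velo @ pts.T).T   # (N, 3): u*z, v*z, z
    z = proj[:, 2]
    u = np.round(proj[:, 0] / z).astype(int)
    v = np.round(proj[:, 1] / z).astype(int)
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h) & (z > 0)
    depth = np.zeros((h, w), dtype=np.float32)
    order = np.argsort(-z[ok])                        # write far points first,
    depth[v[ok][order], u[ok][order]] = z[ok][order]  # so closer ones overwrite
    return depth
```

Pixels that receive no LiDAR point stay at 0, which is exactly why the resulting ground truth is sparse.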
The main criterion is the sparse ground truth. The key idea is that interpolated data is not real data, so comparing your prediction with it does not make much sense.
The interpolation can be used for qualitative results, where you can subjectively decide whether your prediction looks like the interpolated ground truth.
The problem with quantitative results on interpolated data lies at plane boundaries. Between a pixel belonging to a foreground plane (say, the car) and the next one on the background there is a discontinuity, but you don't know exactly where. Interpolation blurs the discontinuity, and some actually good points in your prediction may appear wrong because the interpolated value is a midpoint between foreground and background when it should not be.
Hope I was clear enough! For depth evaluation on KITTI, you can look at the first paper using the now-standard metrics: https://papers.nips.cc/paper/5539-depth-map-prediction-from-a-single-image-using-a-multi-scale-deep-network.pdf
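The metrics from that paper, restricted to the valid pixels of the sparse ground truth (which is the point being made above), can be sketched as follows. This is a minimal illustration: the 80 m depth cap and the median scaling used by some works are left out for brevity, and only the first threshold accuracy is shown:

```python
import numpy as np

def eigen_metrics(gt, pred):
    """Standard single-image depth metrics from Eigen et al. (NIPS 2014),
    computed only where the sparse ground truth is valid."""
    mask = gt > 0                                   # sparse LiDAR: valid pixels only
    gt, pred = gt[mask], pred[mask]
    thresh = np.maximum(gt / pred, pred / gt)
    a1 = float((thresh < 1.25).mean())              # "delta < 1.25" accuracy
    abs_rel = float(np.mean(np.abs(gt - pred) / gt))
    sq_rel = float(np.mean((gt - pred) ** 2 / gt))
    rmse = float(np.sqrt(np.mean((gt - pred) ** 2)))
    rmse_log = float(np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2)))
    return dict(abs_rel=abs_rel, sq_rel=sq_rel, rmse=rmse,
                rmse_log=rmse_log, a1=a1)
```

Because the `gt > 0` mask skips empty pixels, the score says nothing about regions the LiDAR never hit, which is the sparsity caveat discussed in this thread.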
Hi, @ClementPinard
Yes your answer is very clear! Thank you very much!
Another question: some papers use the sparse depth ground truth provided by the KITTI depth benchmark, while this paper computes the depth ground truth from the raw LiDAR data and calibration parameters. Will there be large differences between these two kinds of depth annotations? From the visualizations alone, the two ground truths appear very similar.
Officially they are supposed to be exactly the same. The depth benchmark is just a ready-to-go depth image instead of LiDAR data + calibration.
Now, if you look at other datasets, you can see slight differences, especially with Odometry, where the ground-truth pose has probably been smoothed compared to raw data + calibration.
I think it's safe to say that the evaluation is pretty much the same here, because the LiDAR and fixed calibration are pretty reliable.
I do not suggest validating against interpolated points (I haven't seen anyone doing that), but you can use interpolated depth to train your network, and my experiments show a boost from it if your interpolation is good and free of weird artifacts.
@ClementPinard I have evaluated both the "LiDAR data + calibration" depth and the post-processed KITTI depth data for the Eigen split. In theory, as you said, they should be exactly the same, but various noises and artifacts affect the LiDAR measurements: the LiDAR and post-processed depths are not the same, and the raw LiDAR is not a reliable measurement. Depth-estimation research on KITTI is now at the point where using raw LiDAR for evaluation should be revisited.
I have also shown that if you use ground truth from the new KITTI benchmark for training, you get a huge performance boost (compare rows one and three).
more info was discussed here:
#mrharicot/monodepth#166 (comment)
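Densifying the sparse LiDAR map for training, as suggested above, can be done with off-the-shelf interpolation. The sketch below is a hypothetical minimal version using SciPy's `griddata`, not the exact method used in those experiments; note that linear interpolation will still blur depth discontinuities, which is why it is only suitable for training or visualization:

```python
import numpy as np
from scipy.interpolate import griddata

def densify(sparse_depth, method="linear"):
    """Fill a sparse LiDAR depth map by interpolating between valid pixels
    (value > 0).  Fine for training or visualization, not for evaluation."""
    h, w = sparse_depth.shape
    vs, us = np.nonzero(sparse_depth)              # valid (row, col) coordinates
    grid_v, grid_u = np.mgrid[0:h, 0:w]
    dense = griddata((vs, us), sparse_depth[vs, us],
                     (grid_v, grid_u), method=method)
    # Outside the convex hull of valid points, griddata returns NaN;
    # fall back to nearest-neighbour values there.
    nearest = griddata((vs, us), sparse_depth[vs, us],
                       (grid_v, grid_u), method="nearest")
    return np.where(np.isnan(dense), nearest, dense)
```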
@koutilya40192 Yes, you are right on all points. Evaluating on 1) is not good because the ground truth is wrong; 2) is better, but it is still not dense, so your algorithm might predict very wrong results in those regions while still getting good numbers.
There is no interpolated version (completed depth) of the GT depth images from the official KITTI maintainers. Some researchers interpolate it themselves using different methods; some use the interpolated depth for visualization only, some use it for training, and none (as far as I know) use it for quantitative evaluation. For quantitative evaluation it's either 1) or 2), and I suggest using 2) and submitting your results to the benchmark.
Interpolated points are not good, especially at depth discontinuities between foreground and background, where the interpolated value will be very wrong. The only way to get a dense ground truth is to interpolate the 3D point cloud into a mesh and then project the mesh to the camera frame.
However, you then need a much denser 3D point cloud, and from different points of view, because here we only have the POV of the car.
It's going to take some work to build a dataset with truly dense depth ground truth to validate these algorithms against.
As for applying 2), I'll see what we can do to provide a script that does exactly that, be it only a note in the README indicating where to get the data, or a brand-new testing script.
Thanks for your responses, @a-jahani and @ClementPinard. That clears up a lot of my doubts.