
Regarding Ground truth poses · cmrnet (closed)

cattaneod commented on July 22, 2024
Regarding Ground truth poses


Comments (9)

cattaneod commented on July 22, 2024

Hi,
Since the ground truth poses provided with the KITTI dataset are not consistent around loops, we generated our ground truth using a SLAM system, so there is no fixed transformation between our poses and the original KITTI poses.

They should be poses of the LiDAR, but I will double-check to be sure.


cattaneod commented on July 22, 2024

The GT poses provided in data/ are poses of the LiDAR.
When using a different dataset, make sure that you change the preprocessing script (e.g., change the velo2cam transformation).

Keep in mind that:
"The Data Loader requires a local point cloud for each camera frame, the point cloud must be expressed with respect to the camera_2 reference frame, BUT (very important) with a different axes representation: X-forward, Y-right, Z-down."
Make sure to adapt everything according to your dataset.

To check if your data is correct, you can visualize the LiDAR map projected in the ground truth pose (by setting H_init to the identity), and verify that the projected map and the image are aligned.
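
A minimal sketch of that sanity check, assuming the local map is a 4xN homogeneous numpy array already expressed in the camera_2 frame with the X-forward, Y-right, Z-down convention, and that K is the camera_2 intrinsic matrix; the function name and arguments are illustrative, not CMRNet's own API:

  import numpy as np
  import matplotlib.pyplot as plt

  def show_map_on_image(local_map, K, image):
      # local_map: 4xN homogeneous points, camera_2 origin, x forward / y right / z down
      # K: 3x3 camera_2 intrinsics; image: HxW(x3) array
      pts = local_map[:3, :]
      # rotate back to the standard camera frame (x right, y down, z forward)
      cam = np.stack([pts[1], pts[2], pts[0]])
      cam = cam[:, cam[2] > 1.0]              # keep only points in front of the camera
      uv = K @ cam
      uv = uv[:2] / uv[2]
      h, w = image.shape[:2]
      ok = (uv[0] >= 0) & (uv[0] < w) & (uv[1] >= 0) & (uv[1] < h)
      plt.imshow(image)
      plt.scatter(uv[0, ok], uv[1, ok], s=0.5, c=cam[2, ok], cmap='jet')
      plt.show()

If the ground truth pose and calibration are correct, the projected points should lie on the corresponding structures in the image; a constant offset or rotation usually points to a wrong velo2cam transformation or pose convention.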


RaviBeagle commented on July 22, 2024
  #  BUT (very important) with a different axes representation: 
  #  X-forward, Y-right, Z-down
  #           ^ z                .x
  #           |                 /
  #           |           to:  +-------> y
  #           | .x             |
  #           |/               |
  # y <-------+                v z

  local_map_interchanged = torch.tensor([np.array(local_map[0]),
                                         np.array(local_map[1]) * -1,
                                         np.array(local_map[2]) * -1,
                                         np.array(local_map[3])])

What exactly is the purpose of the different axes representation? Is the purpose to store the lidar point cloud for each frame in the camera coordinate system, or something else?


RaviBeagle commented on July 22, 2024

"After converting the lidar into the camera frame (L118), the point cloud has this frame (x right, y down, z forward)."

@cattaneod: Thank you so much for taking the time to explain it clearly. My bad that I did not generate the Tr values in calib.txt correctly; I was only translating the coordinates without considering the camera coordinate frame.

With that corrected, I can see that the training is going well: the training loss is decreasing epoch by epoch.
Now I will run the full training and evaluate the results.
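
As a side note on the Tr fix: in a KITTI-style calib.txt, Tr is the full rigid velodyne-to-camera transform (rotation and translation), not a translation alone. A rough sketch of how it could be assembled for a custom CARLA setup; R_velo_to_cam and t_velo_to_cam are placeholders for whatever extrinsics the simulator provides:

  import numpy as np

  R_velo_to_cam = np.eye(3)      # placeholder 3x3 rotation (lidar axes -> camera axes)
  t_velo_to_cam = np.zeros(3)    # placeholder translation in meters (camera frame)

  # KITTI stores Tr as the 12 row-major values of the 3x4 matrix [R | t]
  Tr = np.hstack([R_velo_to_cam, t_velo_to_cam.reshape(3, 1)])
  print('Tr: ' + ' '.join(f'{v:.12e}' for v in Tr.flatten()))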


RaviBeagle commented on July 22, 2024

Hello @cattaneod ,

Just to confirm on this point:
"They should be poses of LiDAR, but i will double check to be sure."
Are the GT poses the poses of the LIDAR ? I am training on the synthetic data set generated from CARLA and the training loss is not reducing. Just wanted to confirm. Thank you


RaviBeagle commented on July 22, 2024

Hello @cattaneod. Thanks!

  1. Poses: Yes, now I see I made a mistake; I used vehicle poses instead of LiDAR poses. I will correct this.

  2. Regarding "BUT (very important) with a different axes representation: X-forward, Y-right, Z-down. Make sure to adapt everything according to your dataset": as I understood it, there is nothing I have to do for this. It is already done in line L119.

  3. Now that you mentioned the velo2cam transformation, I looked carefully at the pykitti code and I think I found a bug in it (possible pykitti issue). For now I will proceed with a minor fix and hope the training goes well.

Thanks for this hint! "To check if your data is correct, you can visualize the LiDAR map projected in the ground truth pose (by setting H_init to the identity), and verify that the projected map and the image are aligned."
I will work it out.


RaviBeagle commented on July 22, 2024

Hello @cattaneod

"The Data Loader requires a local point cloud for each camera frame, the point cloud must be expressed with respect to the camera_2 reference frame, BUT (very important) with a different axes representation: X-forward, Y-right, Z-down."
Make sure to adapt everything according to your dataset.

Well, it appears that I was wrong about the preprocessing script at L119: that line interchanges axes, but it does not achieve the required representation.

Now I am a bit confused: at what point should I have the local point cloud in the required representation?


cattaneod commented on July 22, 2024

Hi,

I made some screenshots for better understanding.
The lidar point clouds in the KITTI dataset have this representation (x forward, y left, z up):
[screenshot]

After converting the lidar into the camera frame (L118), the point cloud has this frame (x right, y down, z forward):
[screenshot]

And finally, after applying the transformation in L119, the final point cloud looks like this (x forward, y right, z down):
[screenshot]

So, just make sure that your final point cloud is in the camera frame (not the lidar frame) and has that final axes representation.
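
Putting the two steps together, a rough sketch of the full chain (the names are illustrative; Tr is the 4x4 velo-to-cam transform from calib.txt, and the constant matrix encodes the axes swap done at L119):

  import numpy as np

  # camera frame (x right, y down, z forward) -> CMRNet frame (x forward, y right, z down):
  # new_x = old_z, new_y = old_x, new_z = old_y
  CAM_TO_CMRNET = np.array([[0., 0., 1., 0.],
                            [1., 0., 0., 0.],
                            [0., 1., 0., 0.],
                            [0., 0., 0., 1.]])

  def lidar_to_cmrnet_frame(points_lidar, Tr):
      # points_lidar: 4xN homogeneous points in the velodyne frame (x forward, y left, z up)
      points_cam = Tr @ points_lidar        # x right, y down, z forward
      return CAM_TO_CMRNET @ points_cam     # x forward, y right, z down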


cattaneod commented on July 22, 2024
  #  BUT (very important) with a different axes representation: 
  #  X-forward, Y-right, Z-down
  #           ^ z                .x
  #           |                 /
  #           |           to:  +-------> y
  #           | .x             |
  #           |/               |
  # y <-------+                v z

  local_map_interchanged = torch.tensor([np.array(local_map[0]),
                                         np.array(local_map[1]) * -1,
                                         np.array(local_map[2]) * -1,
                                         np.array(local_map[3])])

What exactly is the purpose of the different axes representation? Is the purpose to store the lidar point cloud for each frame in the camera coordinate system, or something else?

Honestly, there is no real purpose for this representation; you can change it, but then you must also change the function that projects the lidar point cloud into the virtual image plane.

