
Comments (12)

cattaneod commented on August 24, 2024

Yes, it is done in kitti_maps.py#L119
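
For reference, the change of axes is a fixed permutation; a minimal sketch, assuming the points start in the standard camera convention (X-right, Y-down, Z-forward):

import numpy as np

# Map camera axes (X-right, Y-down, Z-forward) to the frame CMRNet
# expects (X-forward, Y-right, Z-down):
# new_x = old_z, new_y = old_x, new_z = old_y
R = np.array([[0., 0., 1.],
              [1., 0., 0.],
              [0., 1., 0.]])

points_cam = np.random.rand(1000, 3)  # placeholder Nx3 points in the camera frame
points_cmrnet = points_cam @ R.T      # same points, X-forward / Y-right / Z-down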


cattaneod commented on August 24, 2024

Hi @RaviBeagle,

although CMRNet is scene-agnostic, it is not camera-agnostic, which means it can only be deployed with the same camera sensor used during training. To overcome this limitation, we proposed CMRNet++, which is both scene- and sensor-agnostic, but its code is not publicly available at the moment.

To train CMRNet on a different dataset, the first and most important step is to check the quality of the ground-truth poses: if the GT poses are not accurate, you probably won't be able to train CMRNet.
Then you should preprocess the dataset by generating the map of every sequence and a local map for each camera image, as done in https://github.com/cattaneod/CMRNet/blob/master/preprocess/kitti_maps.py. As mentioned in the README, the local maps must use this reference frame: X-forward, Y-right, Z-down.

After the dataset is preprocessed, you should adapt the data loader (https://github.com/cattaneod/CMRNet/blob/master/DatasetVisibilityKitti.py) and make sure to change the camera calibration to your own calibration parameters (the images should also be undistorted beforehand).
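
For the undistortion, a minimal sketch using OpenCV; the intrinsics and distortion coefficients below are placeholders for your own calibration:

import cv2
import numpy as np

# Placeholder calibration: replace with your own camera parameters
K = np.array([[700.0,   0.0, 672.0],
              [  0.0, 700.0, 256.0],
              [  0.0,   0.0,   1.0]])
dist = np.array([-0.1, 0.01, 0.0, 0.0, 0.0])  # k1, k2, p1, p2, k3

img = cv2.imread('frame_000000.png')
undistorted = cv2.undistort(img, K, dist)
cv2.imwrite('frame_000000_undistorted.png', undistorted)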

Finally, you should change the image_size parameter to a size suitable for your images (take into account that both width and height must be multiples of 64, due to the architecture of the network).
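
As a quick check, one way to round an arbitrary resolution down to the nearest multiples of 64 is to crop:

def crop_to_multiple_of_64(width, height):
    # Both dimensions must be divisible by 64 due to the network architecture
    return (width // 64) * 64, (height // 64) * 64

print(crop_to_multiple_of_64(1400, 500))  # -> (1344, 448)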


RaviBeagle commented on August 24, 2024

Thanks a lot for the hints.
We are now preparing our vehicle setup to capture the dataset. The sensor setup is as follows:

  1. Velodyne HDL-16 as our LiDAR for localization. The driving area already has point cloud maps generated with the same sensor.
  2. MYNTEYE S1030 stereo camera, which provides grayscale images.

The plan is to run the ROS HDL localization package to generate the GT poses and to capture the sensor data with approximate time synchronization. Do you see any limitations of our setup?
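
(For context, approximate time synchronization in ROS1 is usually handled with message_filters; a minimal sketch, with hypothetical topic names for the sensors above:)

import rospy
import message_filters
from sensor_msgs.msg import Image, PointCloud2

def callback(img_msg, cloud_msg):
    # img_msg and cloud_msg arrive with approximately matching timestamps
    pass

rospy.init_node('sync_capture')
img_sub = message_filters.Subscriber('/mynteye/left/image_raw', Image)    # hypothetical topic
cloud_sub = message_filters.Subscriber('/velodyne_points', PointCloud2)   # hypothetical topic
sync = message_filters.ApproximateTimeSynchronizer([img_sub, cloud_sub],
                                                   queue_size=10, slop=0.05)
sync.registerCallback(callback)
rospy.spin()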


cattaneod commented on August 24, 2024

I'm not familiar with the ROS HDL package, but if the generated GT poses are accurate enough, I don't see any big limitations.
CMRNet is intended for monocular localization; since you have a stereo setup, I'd guess you can further improve the localization performance by including the second camera.


RaviBeagle commented on August 24, 2024

Hello @cattaneod,
Due to delays in setting up our vehicle and sensors, we plan in the meantime to train on a synthetic dataset from CARLA (https://github.com/jedeschaud/kitti_carla_simulator). With some modifications to the scripts, we hope to obtain a dataset similar to KITTI. In that case CMRNet would not need any modifications, if I am right?

Thanks


RaviBeagle commented on August 24, 2024

Hello @cattaneod,

I have managed to generate a KITTI-style dataset from the CARLA simulator. Here is one question about a comment you put in the documentation:

"The Data Loader requires a local point cloud for each camera frame, the point cloud must be expressed with respect to the camera_2 reference frame, BUT (very important) with a different axes representation: X-forward, Y-right, Z-down."

Is this done by preprocess/kitti_maps.py, or is it something I have to do at the time of generation?


RaviBeagle commented on August 24, 2024

Finally, you should change the image_size parameter to a size suitable for your images (take into account that both width and height must be multiples of 64, due to the architecture of the network).

I have set the RGB image size in CARLA to 1344x512, which is a multiple of 64.

Yet I get an error during training:

File "/home/sxv1kor/Temp/CMRNet/DatasetVisibilityKitti.py", line 180, in __getitem__
    img = self.custom_transform(img, img_rotation, h_mirror)
  File "/home/sxv1kor/Temp/CMRNet/DatasetVisibilityKitti.py", line 135, in custom_transform
    rgb = normalization(rgb)
  File "/home/sxv1kor/anaconda3/envs/cmrnet2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sxv1kor/anaconda3/envs/cmrnet2/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 269, in forward
    return F.normalize(tensor, self.mean, self.std, self.inplace)
  File "/home/sxv1kor/anaconda3/envs/cmrnet2/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 360, in normalize
    return F_t.normalize(tensor, mean=mean, std=std, inplace=inplace)
  File "/home/sxv1kor/anaconda3/envs/cmrnet2/lib/python3.7/site-packages/torchvision/transforms/functional_tensor.py", line 959, in normalize
    tensor.sub_(mean).div_(std)
RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 0


cattaneod commented on August 24, 2024

As the error says, your images have 4 channels instead of 3 [R,G,B].

Sorry, I can't really help you with different datasets.


RaviBeagle commented on August 24, 2024

Yes, indeed that was the problem. Thanks.
I have saved the images as RGB, and now the training is running.
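
(A minimal sketch of that conversion, assuming the frames were written with an alpha channel, as CARLA's camera sensor does, and are loaded with PIL:)

from PIL import Image

# Drop the alpha channel so the tensor has 3 channels, matching the
# 3-element mean/std of the normalization transform
img = Image.open('frame_000000.png').convert('RGB')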


whu-lyh commented on August 24, 2024

Yes, it is done in kitti_maps.py#L119

Hi @cattaneod, thanks for your amazing work. I wonder how you fetch the surrounding point cloud that forms the local submap? Specifically, I mean here. The code confuses me quite a lot.


cattaneod commented on August 24, 2024

Hi @cattaneod, thanks for your amazing work. I wonder how you fetch the surrounding point cloud that forms the local submap? Specifically, I mean here. The code confuses me quite a lot.

What exactly is not clear?
To generate a local submap around the camera pose, I first transform the global map into a local map, such that the camera pose becomes the origin of the point cloud:

local_map = torch.mm(pose, local_map).t()  # pose: 4x4 world-to-camera transform; local_map: 4xN homogeneous points

Then I crop the point cloud around the origin (which is now the camera pose). Specifically, I take 25 meters to the left and right, 100 meters to the front, and 10 meters to the back, to account for the random initial pose H_init that could be behind the real pose.

indexes = local_map[:, 1] > -25.  # Crop 25 meters to the left
indexes = indexes & (local_map[:, 1] < 25.)  # Crop 25 meters to the right
indexes = indexes & (local_map[:, 0] > -10.)  # Crop 10 meters to the back
indexes = indexes & (local_map[:, 0] < 100.)  # Crop 100 meters to the front
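
The boolean mask is then applied to keep only the points inside the crop box (a hedged completion; the exact line in kitti_maps.py may differ):

local_map = local_map[indexes]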


whu-lyh commented on August 24, 2024

Ohhhhh, I figured it out. Thanks very much!

