
Comments (7)

puzzlepaint commented on July 18, 2024

I just added basic support to replay .mkv files (re-using the live input code) directly in badslam in this commit: 10751c6


tkircher commented on July 18, 2024

The way to do this is to open a handle to the .mkv and use k4a_playback_get_next_capture() instead of k4a_device_get_capture(). The rest of the processing pipeline is the same.
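
For reference, a minimal sketch of that replay loop using the Azure Kinect SDK's k4arecord playback API (illustrative only, not badslam's actual code; error handling kept minimal):

```cpp
#include <k4a/k4a.h>
#include <k4arecord/playback.h>

#include <cstdio>

int main(int argc, char** argv) {
  if (argc < 2) {
    std::fprintf(stderr, "Usage: %s <recording.mkv>\n", argv[0]);
    return 1;
  }

  // Open a handle to the .mkv recording instead of a live device.
  k4a_playback_t playback = nullptr;
  if (k4a_playback_open(argv[1], &playback) != K4A_RESULT_SUCCEEDED) {
    std::fprintf(stderr, "Failed to open %s\n", argv[1]);
    return 1;
  }

  // Pull captures from the file; this replaces k4a_device_get_capture().
  k4a_capture_t capture = nullptr;
  k4a_stream_result_t result;
  while ((result = k4a_playback_get_next_capture(playback, &capture)) ==
         K4A_STREAM_RESULT_SUCCEEDED) {
    k4a_image_t depth = k4a_capture_get_depth_image(capture);
    k4a_image_t color = k4a_capture_get_color_image(capture);
    // ... hand depth/color to the same processing pipeline as the live input ...
    if (depth) k4a_image_release(depth);
    if (color) k4a_image_release(color);
    k4a_capture_release(capture);
  }
  if (result == K4A_STREAM_RESULT_FAILED) {
    std::fprintf(stderr, "Failed to read a capture from the recording\n");
  }

  k4a_playback_close(playback);
  return 0;
}
```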


puzzlepaint commented on July 18, 2024

There was some related discussion on this previously in #35, but unfortunately I am not aware of existing public code to do this.


brookman commented on July 18, 2024

Thanks for the response.
I have found several tools to export image sequences from a k4arecorder .mkv file; one of them is k4a_mkv2image.
It didn't generate the necessary text files like "rgb.txt", "depth.txt" and "calibration.txt", so I forked it and added the missing pieces: https://github.com/brookman/k4a_mkv2image.
It already generates useful data which I can use to run badslam, but I'm not 100% sure about a few things:

  • Do we really have to add 0.5 to the cx and cy values for the calibration?
  • Does "Use photometric residuals" really have to be turned off?
  • Why does lowering the "Keyframe interval" to low values (like 1) produce more errors?
  • What should "Raw to metric depth conversion factor" be set to? As far as I know, the K4A produces 16-bit depth values measured in mm from the camera origin, i.e. 0 to 65.536 m. The default scaling value is 0.0002.
  • What is a good value for "Maximum depth to use"?

I could provide you with mkv files or image file based data-sets if that helps.

An impression from a first test recording:
[image]


puzzlepaint commented on July 18, 2024
  • Depends on the coordinate system convention that your source for cx/cy uses. For calibration.txt in datasets, badslam assumes that (0, 0) is the center of the top-left image pixel. For the PinholeCamera4f class that badslam uses to store the parameters at runtime, the convention is instead that (0, 0) is the top-left corner of the top-left image pixel. Thus, when loading a dataset, badslam adds 0.5 to cx and cy (see here). It therefore seems very unlikely that you have to add 0.5 when saving: combined with the loading step, that would add 1 overall, which would imply a third, very unusual convention. Most likely you either have to subtract 0.5 (if your source uses the "pixel-corner" convention) or do nothing at all (if it uses the "pixel-center" convention); see the sketch after this list. Getting this wrong has so little effect that it almost does not matter, but it is of course still nice to have it correct.
  • During testing with a global-shutter camera, enabling photometric residuals usually helped very little; it only helped when the geometry alone was absolutely not sufficient to localize the camera. At the same time, it makes the system slower. Also, the color camera on the new Kinect is a rolling-shutter camera, which makes photometric residuals even less useful than with a global-shutter camera, given that badslam does not model the rolling-shutter effect. So, I would hardly expect a benefit from enabling photometric residuals, and they might even be harmful, but feel free to experiment with it; maybe I am wrong.
  • Maybe the system gets too busy then and cannot keep up with new camera images? Hard to say without looking into it.
  • If the depth is in millimeters, then --depth_scaling should be 1000, and correspondingly the "Raw to metric depth conversion factor" should be 1 / 1000 = 0.001.
  • I don't have enough experience with the new Kinect, but I had the impression that its depth quality is very good. So, a good choice is probably to set this to an arbitrarily high value (e.g., 9999) to ensure that no depth values are removed. This setting was mainly intended for the old first-generation Kinect, where anything beyond 3 meters of depth could have really large errors, such that it could make sense to simply cut it off rather than use it.
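
To make the 0.5 handling from the first bullet point concrete, here is a minimal sketch of a writer for calibration.txt. The function name is hypothetical, and it assumes badslam's TUM-style dataset format with a single line "fx fy cx cy", with intrinsics coming from a source that uses the pixel-corner convention:

```cpp
#include <cstdio>

// Hypothetical helper: write calibration.txt in badslam's assumed single-line
// "fx fy cx cy" format (pixel-center convention). If the input cx/cy use the
// pixel-corner convention, subtract 0.5; badslam adds 0.5 back when loading.
// If the input already uses the pixel-center convention, write it unchanged.
void WriteCalibrationTxt(const char* path, float fx, float fy,
                         float cx_corner, float cy_corner) {
  FILE* f = std::fopen(path, "w");
  if (!f) return;
  std::fprintf(f, "%f %f %f %f\n", fx, fy, cx_corner - 0.5f, cy_corner - 0.5f);
  std::fclose(f);
}
```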

As a general comment, badslam applies some preprocessing to live Kinect images before using them: it undistorts the images and reprojects them into the same point of view. This is not applied when loading datasets, so your pipeline would need to perform these steps as well. I saw that you transform the depth images into the color images, but I am not sure whether you also undistort the images. Does the color intrinsics printing here yield any non-zero distortion coefficients? If yes, then the images also need to be undistorted (if they aren't yet), and you would need to write out the calibration.txt parameters of the undistorted images, not the original parameters.
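
To check this, a small sketch along the following lines could read the calibration that k4arecorder stores in the .mkv and print the color camera's distortion coefficients (PrintColorDistortion is an illustrative helper; the field names come from the Azure Kinect SDK's k4a_calibration_intrinsic_parameters_t):

```cpp
#include <k4a/k4a.h>
#include <k4arecord/playback.h>

#include <cstdio>

// Print the color camera's Brown-Conrady distortion coefficients stored in an
// opened .mkv recording. Non-zero values mean the color images are distorted
// and need undistortion before pinhole parameters in calibration.txt are valid.
void PrintColorDistortion(k4a_playback_t playback) {
  k4a_calibration_t calibration;
  if (k4a_playback_get_calibration(playback, &calibration) !=
      K4A_RESULT_SUCCEEDED) {
    std::fprintf(stderr, "Failed to read calibration from the recording\n");
    return;
  }
  const auto& p = calibration.color_camera_calibration.intrinsics.parameters.param;
  std::printf("k1=%f k2=%f k3=%f k4=%f k5=%f k6=%f p1=%f p2=%f\n",
              p.k1, p.k2, p.k3, p.k4, p.k5, p.k6, p.p1, p.p2);
}
```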

By the way, I also remembered some old code that I once used to convert a single .mkv file to the dataset format. However, it was written as throw-away code, only to be used for that single conversion, so it is horribly messy and has some values hard-coded in. So, if the k4a_mkv2image tool works, then that might be better (edit: or just adapt the preprocessing from the live input code instead).


brookman commented on July 18, 2024

Thank you very much for the detailed response.
I have managed to run badslam (offline, non-real-time) on my datasets with rather good results, using the following command:
badslam folder_name --depth_scaling 1000 --target_frame_rate 0 --restrict_fps_to 0 --keyframe_interval 1 --max_num_ba_iterations_per_keyframe 50 --num_scales 10 --max_depth 6 --gui_run
I haven't looked into the distortion yet, but will try to do so later.

Generally, it should not be too hard to extend badslam to read from the .mkv file instead of the live device. I did some C++ tests but currently have no time to integrate it.


chuong commented on July 18, 2024

@brookman Thanks for the k4a_mkv2image tool.
Just a note: generating the associated.txt file with the command below can produce a UTF-16 encoded text file (for example, Windows PowerShell's > redirection writes UTF-16 by default):
python associate.py my_file/rgb.txt my_file/depth.txt > my_file/associated.txt

and BADSLAM will fail to read it:
rgbd_video_io_tum_datas:183 ERR| Cannot read association line!

Converting this text file to UTF-8 encoding fixes the reading problem.
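
If in doubt about the encoding, a small check like this can detect the UTF-16 byte order mark that such files typically start with (an illustrative helper, not part of badslam):

```cpp
#include <cstdio>

// Return true if the file starts with a UTF-16 BOM (0xFF 0xFE little-endian or
// 0xFE 0xFF big-endian), which badslam's plain-text reader cannot handle.
bool LooksLikeUtf16(const char* path) {
  FILE* f = std::fopen(path, "rb");
  if (!f) return false;
  unsigned char bom[2] = {0, 0};
  size_t n = std::fread(bom, 1, 2, f);
  std::fclose(f);
  return n == 2 && ((bom[0] == 0xFF && bom[1] == 0xFE) ||
                    (bom[0] == 0xFE && bom[1] == 0xFF));
}
```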

