GithubHelp home page GithubHelp logo

tianyilim / range_libc Goto Github PK

View Code? Open in Web Editor NEW

This project forked from kctess5/range_libc

0.0 0.0 0.0 18.38 MB

Optimized ray cast methods for 2D occupancy grids including the CDDT algorithm. Written in C++ and CUDA with Python wrappers.

License: Other

Shell 0.75% C++ 61.51% Python 15.66% Cuda 7.62% CMake 1.32% Cython 13.14%

range_libc's Introduction

RangeLibc: Modernized

NOTE @maintainers (Nic/Edo): This remote points to a fork of RangeLibc on TY's personal github for now. But, creating a submodule on the git.ee servers when possible would be ideal. (or, you can give me free GH stars)

Repo Roadmap

  • Add tests
  • Instructions to compile with/without CUDA
  • Add benchmark code
  • Make improvements/modernization to underlying algos
  • Update code structure

This library provides for different implementations of 2D raycasting for 2D occupancy grids, including the Compressed Directional Distance Transform (CDDT) algorithm as proposed in this publication. The code is written and optimized in C++, and Python wrappers are also provided.

Building the Code

This code was built in Ubuntu 22.04 with and without GPU support.

Requirements

For GPU support, execute the following if using Ubuntu 22.04. Taken from Nvidia installation instructions for CUDA. Follow the guide if you have another distribution. Running these commands may take a while!

# Follow the instructions to install CUDA and dependencies
sudo apt-get install linux-headers-$(uname -r)	# installs required linux headers for your linux version
sudo apt-key del 7fa2af80

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.2.2/local_installers/cuda-repo-ubuntu2204-12-2-local_12.2.2-535.104.05-1_amd64.deb -O ~/Downloads/cuda_ubuntu2204-amd64.deb
sudo dpkg -i ~/Downloads/cuda_ubuntu2204-amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
sudo apt-get -y install nvidia-gds

# Add CUDA to paths. These will also be appended to your bashrc.
export PATH=/usr/local/cuda-12.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64\
                         ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

echo 'export PATH=/usr/local/cuda-12.2/bin${PATH:+:${PATH}}' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64 ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc

Now, reboot your system: sudo reboot

Then, verify the installation by calling nvcc --version. Sample output:

$ nvcc --version
  nvcc: NVIDIA (R) Cuda compiler driver
  Copyright (c) 2005-2023 NVIDIA Corporation
  Built on Tue_Aug_15_22:02:13_PDT_2023
  Cuda compilation tools, release 12.2, V12.2.140
  Build cuda_12.2.r12.2/compiler.33191640_0

C++ code

# clone the repository
git clone [email protected]:tianyilim/range_libc.git
cd range_libc
mkdir build
cd build
cmake ..
make

Python Wrappers

To build the code and its associated Python wrappers for use in Python code, do the following. You may have to install Cython if you do not already have it on your system.

# clone the repository
git clone [email protected]:tianyilim/range_libc.git
cd range_libc_dist/pywrapper
# for an in place build, do this:
python setup.py build_ext --inplace
# for a system wide install, do this:
python setup.py install
# to compile with the GPU kernels, do this:
WITH_CUDA=ON python setup.py install
# this should take a few seconds to run
python test.py

To see example usage of the Python wrappers (using the ROS specific helpers) see https://github.com/mit-racecar/particle_filter. See the /docs folder for documentation.

Cite

This repo is a modification to the original rangelibc publication.

@article{walsh17,
    author = {Corey Walsh and 
              Sertac Karaman},
    title  = {CDDT: Fast Approximate 2D Ray Casting for Accelerated Localization},
    volume = {abs/1705.01167},
    url    = {http://arxiv.org/abs/1705.01167},
    year   = {2017}}

Code structure

TODO

RangeLibc Algorithms Overview

Range Method Performance Comparison

The above benchmarks were performed on an NVIDIA Jetson TX1 with a single thread. For a better treatment, see the paper associated with this library: http://arxiv.org/abs/1705.01167

Bresenham's Line (BL)

Bresenham's line algorithm [1] is one of the most widely used methods for two dimensional ray casting in occupancy grids. The algorithm incrementally determines the set of pixels that approximate the trajectory of a query ray starting from the query point (x,y){query} and progressing in the theta{query} direction one pixel at a time. The algorithm terminates once the nearest occupied pixel is discovered, and the euclidean distance between that occupied pixel and (x,y)_{query} is reported. This algorithm is widely implemented in particle filters due to its simplicity and ability to operate on a dynamic map. The primary disadvantage is that it is slow, potentially requiring hundreds of memory accesses for a single ray cast. While average performance is highly environment dependent, Bresenham's Line algorithm is linear in map size in the worst case.

Ray Marching (RM/RMGPU)

Ray marching [2] is a well known algorithm, frequently used to accelerate fractal or volumetric graphical rendering applications. The basic idea can be understood very intuitively. Imagine that you are in an unknown environment, with a blindfold on. If an oracle tells you the distance to the nearest obstacle, you can surely move in any direction by at most that distance without colliding with any obstacle. By applying this concept recursively, one can step along a particular ray by the minimum distance to the nearest obstacle until colliding with some obstacle. The following figure demonstrates this idea graphically (from [3]).

Ray Marching

In the occupancy grid world, it is possible to precompute the distance to the nearest obstacle for every discrete state in the grid via the euclidean distance transform.

This method is implemented both for the CPU and GPU in RangeLibc. The GPU implementation is the fastest available method for large batches of queries.

Pseudocode:

# compute the distance transform of the map
def precomputation(omap):
	distance_transform = euclidean_dt(omap)

# step along the (x,y,theta) ray until colliding with an obstacle
def calc_range(x,y,theta):
	t = 0.0
	coeff = 0.99
	while t < max_range:
		px, py = x + cos(theta) * t, y + sin(theta) * t

		if px or py out of map bounds:
			return max_range

		dist = distance_transform[px,py]
		if dist == 0.0:
			return sqrt((x - px)^2 + (y - py)^2)

		t += max(dist*coeff, 1.0)

Analysis

Precomputation: O(|theta_discretization|*|edge pixels in occupancy grid|+|theta_discretization|*|occupied pixels|*log(|occupied pixels|)) Pruning: O(|map width|*|map height|*|theta_discretization|*log(min(|occupied pixels|, longest map dimension))) Calc range: O(log(min(|occupied pixels|, longest map dimension))) Memory: O(|theta_discretization|*|edge pixels|) - in practice much smaller, due to pruning

Pros

  • Fast calc_range on average
  • Space efficent
  • Easy to implement, and easy to implement on a GPU
  • Fairly fast to compute distance transform
  • Extends easily to 3D

Cons

  • Poor worst case performance - degenerate case similar to Bresenham's line
  • High degree of control flow divergence (for parallelization)
  • Not great for incrementally changing the map

Compressed Directional Distance Transform (CDDT/PCDDT) (ours)

The Compressed Directional Distance Transform (CDDT) algorithm uses a compressed data structure to represent map geometry in a way which allows for fast queries. An optional pruning step removes unneeded elements in the data structure for slightly faster operation (PCDDT). For a full description of the algorithm, see the associated paper: http://arxiv.org/abs/1705.01167

Pseudocode:

# for the given theta, determine a translation that will ensure the 
# y coordinate of every pixel in the rotated map will be positive
def y_offset(theta):
	pass

# give the range of y coordinates that the pixel overlaps with
def y_bounds(pixel):
	return range(min(pixel.corners.y), max(pixel.corners.y))

# build the CDDT datastructure
def precomputation(omap):
	# prune any unimportant geometry from the map
	edgeMap = morphological_edge_transform(omap)

	# build the empty LUT data structure
	compressed_lut = []
	for each theta in |theta_discretization|:
		projection_lut = []
		for each i in range(lut_widths[theta]):
			projection_lut.append([])
		compressed_lut.append(projection_lut)

	# populate the LUT data structure
	for each theta in |theta_discretization|:
		for each occupied pixel (x,y) in omap:
			pixel.rotate(theta)
			pixel.translate(y_offset(theta))
			lut_indices = y_bounds(pixel)
			
			for each index in lut_indices:
				compressed_lut[theta][index].append(pixel.center.x)

	# sort each LUT bin for faster access via binary search
	for each theta in |theta_discretization|:
		for each i in compressed_lut[theta].size():
			sort(compressed_lut[theta][i])

# (optional) remove unused entries from the LUT to save space
# highly recommended for static maps
def prune():
	# build an empty table of sets to keep track of which
	# indices in the CDDT data structure are used
	collision_table = []
	for theta in range(theta_discretization):
		collision_row = []
		for i in range(compressed_lut[theta].size()):
			collision_row.append(set())
		collision_table.append(collision_row)

	# ray cast from every possible (x,y,theta) state, keeping track
	# of which LUT entries are used
	for x in range(omap.width):
		for y in range(omap.height):
			for theta in range(theta_discretization):
				# keep track of which object in the LUT is found to collide
				# with the following ray cast query
				calc_range(x,y,theta) implies (lut_bin, lut_bin_index)
				collision_table[theta][lut_bin].add(lut_bin_index)

	# remove any element of the LUT that is not in the collision table
	for theta in range(theta_discretization):
		for i in range(compressed_lut[theta].size()):
			new_lut_bin = []
			for obstacle in compressed_lut[theta][i]:
				if obstacle in collision_table:
					new_lut_bin.append(obstacle)
				else: continue
			compressed_lut[theta][i] = new_lut_bin

# compute the distance to the nearest obstacle in the (x,y,theta) direction
def calc_range(x,y,theta):
	angle_index, discrete_angle, flipped_search = discretize_theta(theta)
	lut_x, lut_y = rotate(x, y, discrete_angle)

	if omap.occupied(x,y):
		return 0.0

	lut_bin = compressed_lut[angle_index][(int)lut_y]
	if flipped_search:
		nearest_obstacle_x = lut_bin.next_lesser_element(lut_x)
	else:
		nearest_obstacle_x = lut_bin.next_greater_element(lut_x)
	
	distance = abs(nearest_obstacle_x - lut_x)
	return distance

Analysis

Precomputation: O(|width|*|height|) for 2D grid. In general O(dk) where d is the dimensionality of the grid, and k is the number of grid locations. Calc range: worst case O(|longest map dimension|), on average much faster (close to logarithmic performance in scene size) Memory: O(|width|*|height|) for 2D grid. In general O(k) where k is the number of grid locations.

Pros

  • Fast calc_range, in practice nearly constant time
  • Radial symmetry optimizations can provide additional speed in the right context
  • Potential for online incremental compressed LUT modification for use in SLAM (would need to store additional metadata)
  • Space efficent
  • Fast construction time

Cons

  • Slow pruning time (optional)
  • Approximate due to the discrete theta space
  • Can be difficult to implement well
  • Curse of dimensionality in higher dimensions

Giant Lookup Table (GLT)

Precompute distances for all possible (x,y,theta) states and store the results in a big table for fast lookup.

Pseudocode:

# For every (x,y,theta) in a predefined grid, use Besenham's line or 
# ray maching to build the table
def precomputation(omap):
	giant_LUT[width][height][theta_discretization] = -1
	for x in range(omap.width):
		for y in range(omap.height):
			for theta in range(theta_discretization):
				giant_LUT[x][y][theta] = calc_range(x,y,theta)

# simply read from the table
# note: interpolation between the two closest discrete 
#       thetas would be more accurate but slower
def calc_range(x,y,theta):
	return giant_LUT[int(x), int(y), discrete(theta)]

Analysis

Precomputation: O(|theta_discretization|*|width|*|height|*O(calc_range)) Memory: O(|theta_discretization|*|width|*|height|) Calc range: O(1)

Pros

  • Very fast calc_range
  • Easy to implement

Cons

  • Very slow construction time
  • Approximate due to the discrete theta space
  • Very large memory requirement
  • Curse of dimensionality in higher dimensions

References

  1. J. Bresenham. "Algorithm for Computer Control of a Digital Plotter," IBM Systems Journal, vol. 4, no. 1, pp. 25-30, 1965.
  2. K. Perlin and E. M. Hoffert. "Hypertexture," Computer Graphics, vol 23, no. 3, pp. 297-306, 1989.
  3. M. Pharr, and R. Fernando. "Chapter 8. Per-Pixel Displacement Mapping with Distance Functions" in GPU gems 2: Programming techniques for high-performance graphics and general-purpose computation, 3rd ed. United States: Addison-Wesley Educational Publishers, 2005.

range_libc's People

Contributors

kctess5 avatar tianyilim avatar sertackaraman avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.