drprojects / point_geometric_features

Python wrapper around C++ utility to compute local geometric features of a point cloud

License: MIT License

Languages: Python 9.81%, C++ 87.76%, CMake 2.43%
Topics: 3d, python, numpy, machine-learning, geometric-features, point-cloud, cpu, fast, features, nanoflann

point_geometric_features's Introduction

Point Geometric Features


📌 Description

The pgeof library provides utilities for fast, parallelized computation ⚡ of local geometric features for 3D point clouds ☁️ on CPU.

List of available features 👇
  • linearity
  • planarity
  • scattering
  • verticality (two formulations)
  • normal_x
  • normal_y
  • normal_z
  • length
  • surface
  • volume
  • curvature
  • optimal neighborhood size

pgeof allows computing features in several ways: an on-the-fly subset of features à la jakteristics, a full array of features, or multiscale features. Moreover, pgeof also offers functions for fast k-NN and radius-NN searches 🔍.

Behind the scenes, the library is a Python wrapper around C++ utilities. The code is intentionally neither DRY nor generic: it aims to provide implementations that are as efficient as possible for a limited scope of usages.

🧱 Installation

From binaries

python -m pip install pgeof 

or

python -m pip install git+https://github.com/drprojects/point_geometric_features

Building from sources

pgeof depends on the Eigen, Taskflow, nanoflann and nanobind libraries. The library adheres to PEP 517 and uses scikit-build-core as its build backend. Build dependencies (nanobind, scikit-build-core, ...) are fetched at build time. C++ third-party libraries are embedded as submodules.

# Clone project
git clone --recurse-submodules https://github.com/drprojects/point_geometric_features.git
cd point_geometric_features

# Build and install the package
python -m pip install .

🚀 Using Point Geometric Features

Here we summarize the very basics of pgeof usage. Users are invited to use help(pgeof) for further details on parameters.

At its core, pgeof provides three functions to compute a set of features given a 3D point cloud and some precomputed neighborhoods.

import pgeof

# Compute a set of 11 predefined features per point
pgeof.compute_features(
    xyz,            # The point cloud. A numpy array of shape (n, 3)
    nn,             # CSR data structure, see below
    nn_ptr,         # CSR data structure, see below
    k_min=1,        # Minimum number of neighbors to consider for feature computation
    verbose=False,  # Basic verbose output, for debug purposes
)
# Sequence of feature computations at n scales
pgeof.compute_features_multiscale(
    ...
    k_scale  # array of neighborhood sizes
)
# Feature computation with optimal neighborhood selection, as exposed in Weinmann et al., 2015.
# Returns a set of 12 features per point (the 11 above + the optimal neighborhood size)
pgeof.compute_features_optimal(
    ...
    k_min=1,         # Minimum number of neighbors to consider for feature computation
    k_step=1,        # Step size to use when searching for the optimal neighborhood size
    k_min_search=1,  # Starting size for the optimal neighborhood search. Should be >= k_min
)
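
To make this concrete, here is a minimal sketch chaining the three calls on a synthetic cloud. It assumes the multiscale function takes an array of scales as its fourth argument and that each scale uses up to that many of the provided neighbors, per the signatures sketched above; check help(pgeof) for the exact parameter names.

import pgeof
import numpy as np

# Synthetic cloud and a 50-NN graph, converted to CSR as described below
num_points = 1000
k = 50
xyz = np.random.rand(num_points, 3).astype("float32")
knn, _ = pgeof.knn_search(xyz, xyz, k)
nn = knn.flatten().astype("uint32")
nn_ptr = (np.arange(num_points + 1) * k).astype("uint32")

# One set of 11 features per point, using the full neighborhoods
features = pgeof.compute_features(xyz, nn, nn_ptr, k_min=1)

# One set of 11 features per point and per scale
multiscale = pgeof.compute_features_multiscale(xyz, nn, nn_ptr, [10, 25, 50])

# One set of 12 features per point, each computed at the neighborhood
# size selected as optimal within the available neighbors
optimal = pgeof.compute_features_optimal(xyz, nn, nn_ptr, k_min=1, k_step=1, k_min_search=1)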

⚠️ Please note that for these three functions the neighbors are expected in CSR format. This allows expressing neighborhoods of varying sizes with dense arrays (e.g. the output of a radius search).
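
Concretely, in this layout the neighbors of point i are the slice nn[nn_ptr[i]:nn_ptr[i + 1]]:

import numpy as np

# Toy CSR neighborhoods: point 0 has neighbors [1, 2],
# point 1 has [0], and point 2 has [0, 1, 3]
nn_ptr = np.array([0, 2, 3, 6], dtype="uint32")
nn = np.array([1, 2, 0, 0, 1, 3], dtype="uint32")

for i in range(len(nn_ptr) - 1):
    print(i, nn[nn_ptr[i]:nn_ptr[i + 1]])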

We provide very small and specialized k-NN and radius-NN search routines. They rely on the nanoflann C++ library and should be faster and lighter than the scipy and sklearn alternatives.

Here are some examples of how to easily compute and convert typical k-NN or radius-NN neighborhoods to CSR format (nn and nn_ptr are two flat uint32 arrays):

import pgeof
import numpy as np

# Generate a random synthetic point cloud and k-nearest neighbors
num_points = 10000
k = 20
xyz = np.random.rand(num_points, 3).astype("float32")
knn, _ = pgeof.knn_search(xyz, xyz, k)

# Converting k-nearest neighbors to CSR format
nn_ptr = np.arange(num_points + 1) * k
nn = knn.flatten()

# You may need to convert nn/nn_ptr to uint32 arrays
nn_ptr = nn_ptr.astype("uint32")
nn = nn.astype("uint32")

features = pgeof.compute_features(xyz, nn, nn_ptr)

The same conversion for radius-NN neighborhoods, whose sizes vary from point to point (the search is capped at k neighbors, and missing neighbors are marked with negative indices):

import pgeof
import numpy as np

# Generate a random synthetic point cloud and k-nearest neighbors
num_points = 10000
radius = 0.2
k = 20
xyz = np.random.rand(num_points, 3).astype("float32")
knn, _ = pgeof.radius_search(xyz, xyz, radius, k)

# Converting radius neighbors to CSR format
nn_ptr = np.r_[0, (knn >= 0).sum(axis=1).cumsum()]
nn = knn[knn >= 0]

# You may need to convert nn/nn_ptr to uint32 arrays
nn_ptr = nn_ptr.astype("uint32")
nn = nn.astype("uint32")

features = pgeof.compute_features(xyz, nn, nn_ptr)

Finally, as a by-product, we also provide a function to compute a subset of features on the fly. It is inspired by the jakteristics python package (less complete, but faster). The list of features to compute is given as an array of EFeatureID values.

import pgeof
from pgeof import EFeatureID
import numpy as np

# Generate a random synthetic point cloud
num_points = 10000
radius = 0.2
k = 20
xyz = np.random.rand(num_points, 3)

# Compute verticality and curvature
features = pgeof.compute_features_selected(
    xyz, radius, k, [EFeatureID.Verticality, EFeatureID.Curvature]
)

Known limitations

Some functions only accept float scalar types and uint32 index types, and we avoid implicit casts / conversions. This could be a limitation in some situations (e.g. point clouds with double coordinates or very large integer indices). Some C++ functions could be templated to accept other types without conversion. For now, this is not enabled everywhere, in order to reduce compilation time and keep the code readable. Please let us know if you need this feature!
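
For instance, a cloud loaded with double coordinates must be cast explicitly before being passed to the feature functions; a minimal sketch:

import numpy as np

# pgeof will not silently convert dtypes: coordinates must be float32
# and CSR indices uint32, so cast (and make contiguous) explicitly.
xyz = np.random.rand(1000, 3)                      # float64 by default
xyz = np.ascontiguousarray(xyz.astype("float32"))  # explicit conversion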

By convention, our normal vectors are forced to point towards positive Z values. We make this design choice in order to return consistently-oriented normals.
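
If your application needs another convention (e.g. normals pointing towards a sensor), you can flip the returned normals yourself. A hedged sketch: the column indices 4:7 for (normal_x, normal_y, normal_z) are an assumption for illustration; check the actual feature layout with help(pgeof).

import numpy as np

def orient_normals_towards(features, xyz, sensor):
    # Assumed layout: columns 4:7 hold (normal_x, normal_y, normal_z)
    normals = features[:, 4:7]
    # Flip the normals whose dot product with the point-to-sensor
    # direction is negative, i.e. those pointing away from the sensor
    flip = ((sensor - xyz) * normals).sum(axis=1) < 0
    normals[flip] *= -1.0
    return features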

Testing

Some basic tests and benchmarks are provided in the tests directory. Tests can be run in a clean and reproducible environment via tox (tox run and tox run -e bench).

💳 Credits

This implementation was largely inspired by Superpoint Graph. The main modifications here allow:

  • parallel computation on all points' local neighborhoods, with neighborhoods of varying sizes
  • more geometric features
  • optimal neighborhood search from this paper
  • some corrections on geometric features computation

Some heavy refactoring (port to nanobind, tests, benchmarks), packaging, speed optimization, and feature additions (NN search, on-the-fly feature computation, ...) were funded by:

Centre of Wildfire Research of Swansea University (UK) in collaboration with the Research Institute of Biodiversity (CSIC, Spain) and the Department of Mining Exploitation of the University of Oviedo (Spain).

Funding provided by the UK NERC project (NE/T001194/1):

'Advancing 3D Fuel Mapping for Wildfire Behaviour and Risk Mitigation Modelling'

and by the Spanish Knowledge Generation project (PID2021-126790NB-I00):

‘Advancing carbon emission estimations from wildfires applying artificial intelligence to 3D terrestrial point clouds’.

License

Point Geometric Features is licensed under the MIT License.

point_geometric_features's People

Contributors

drprojects, loicland, rjanvier


point_geometric_features's Issues

pgeof as a pip package

Packaging pgeof as a pip package, free from the conda dependency on eigen3, would be great.

@rjanvier I think you were thinking of working on this at some point. Is this still something you are considering?

Migrate from distutils to setuptools

Hi Damien,

For a bit of context, I currently use @drprojects/superpoint_transformer to perform forest segmentation for the @3DFin project. It works pretty well, but we want to lower the technical cost of using it (in an inference context) for scientists interested in the field of forest management.

I started to review the compiled dependencies and to evaluate whether we can offer pre-built wheels (via https://github.com/pypa/cibuildwheel). I first noticed that all your C++-based Python modules/dependencies use distutils as a build system. distutils is deprecated, and I think it does not even work anymore with Python 3.12, so future users could have a hard time installing your dependencies. Would you agree if I tried to port pgeof to the setuptools build system (which is the "drop-in" replacement for distutils)?

Multi scale features computation

Hi Damien,
I have a little window to try to implement the multi-scale feature computation we talked about (see #4).
I would like to know how you see it: would you want to rely on an aggregation scheme (I don't think a mean makes sense for all features), or to output the whole multi-scale feature map?
Thanks in advance,
Romain

Memory leak of pgeof

Hello,
as mentioned in this issue, I think pgeof has a memory leak too.
If we take the demo.py script with a slightly faster knn:

from pgeof import pgeof
import numpy as np
import tracemalloc
import gc
from scipy.spatial import cKDTree

tracemalloc.start()
num_iter = 200
marqueur = 0
snapshot = tracemalloc.take_snapshot()
def get_x(n_points=4000000):
    import torch
    x_min = [-15, 46]
    y_min = [-45, 30]
    z_min = [0, 7]
    x = torch.rand((n_points, 3))
    x[:, 0] = x[:, 0] * (x_min[1] - x_min[0]) + x_min[0]
    x[:, 1] = x[:, 1] * (y_min[1] - y_min[0]) + y_min[0]
    x[:, 2] = x[:, 2] * (z_min[1] - z_min[0]) + z_min[0]
    return x


def Query_CPU(xyz_query, xyz_search, K, r):
    kdtree = cKDTree(xyz_search)
    distances, neighbors = kdtree.query(xyz_query, k=K, distance_upper_bound=r, workers=-1)
    neighbors[distances == float('inf')] = -1
    return distances, neighbors

for j in range(num_iter):
    # Generate a random synthetic point cloud
    num_points = int(1e5)
    xyz = get_x(num_points).numpy()

    # Manually generating random neighbors in CSR format (overwritten by the knn below)
    nn_ptr = np.r_[0, np.random.randint(low=0, high=30, size=num_points).cumsum()]
    nn = np.random.randint(low=0, high=num_points, size=nn_ptr[-1])

    # Converting k-nearest neighbors to CSR format
    k = 20
    kneigh = Query_CPU(xyz, xyz, k, 20)
    nn_ptr = np.arange(num_points + 1) * k
    nn = kneigh[1].flatten()

    # Converting radius neighbors to CSR format
    # from sklearn.neighbors import NearestNeighbors
    # radius = 0.1
    # rneigh = NearestNeighbors(radius=radius).fit(xyz).radius_neighbors(xyz)
    # nn_ptr = np.r_[0, np.array([x.shape[0] for x in rneigh[1]]).cumsum()]
    # nn = np.concatenate(rneigh[1])

    # Make sure xyz are float32 and nn and nn_ptr are uint32
    xyz = xyz.astype('float32')
    nn_ptr = nn_ptr.astype('uint32')
    nn = nn.astype('uint32')

    # Make sure arrays are contiguous (C-order) and not Fortran-order
    xyz = np.ascontiguousarray(xyz)
    nn_ptr = np.ascontiguousarray(nn_ptr)
    nn = np.ascontiguousarray(nn)


    geof = pgeof(
        xyz, nn, nn_ptr, k_min=10, k_step=1, k_min_search=15,
        verbose=True)
    marqueur += 1
    gc.collect()

    if marqueur > 1e1:
        snapshot2 = tracemalloc.take_snapshot()
        marqueur = 0
        Top_stats = snapshot2.compare_to(snapshot, 'lineno')
        print("TOP 10 differences")
        for stat in Top_stats[:10]:
            print(stat)
        current, peak = tracemalloc.get_traced_memory()
        print(f"Current memory usage is {current/10**6}MB; Peak was {peak/10**6}MB")
        top_stats = snapshot2.statistics('traceback')
        stat = top_stats[0]
        print("%s memory blocks: %.1f MiB" % (stat.count, stat.size / 1024**2))
        for line in stat.traceback.format():
            print(line)

The memory increases slightly at each iteration. I also ran the script with valgrind, and it indeed reports some memory as definitely lost.
