ccinc / 3d-ml

A versatile framework for 3D machine learning built on PyTorch Lightning and Hydra [looking for contributors!]

Shell 1.79% Makefile 1.63% Python 96.58%
3d 3d-deep-learning deep-learning point-cloud pytorch s3dis segmentation

3d-ml's People

Contributors

aaronfderybel, ccinc, dependabot[bot], jaswanthbjk, leo-stan, pre-commit-ci[bot]

3d-ml's Issues

Add Paris-Lille-3D Dataset

Paris-Lille-3D is a dataset and benchmark for point cloud classification. The data was produced by a Mobile Laser System (MLS) in two different cities in France (Paris and Lille).

The Point Cloud has been labeled entirely by hand with 50 different classes to help the research community on automatic point cloud segmentation and classification algorithms.

https://npm3d.fr/paris-lille-3d

Add TorchSparse/SparseConv3d models

This adds additional complexity due to the collation of the datasets. OpenPoints datasets are batched "densely", i.e. 16 samples of shape [2048, 3] are stacked into a single tensor of shape [16, 2048, 3] (implemented in from_data_list(data_list), based on the original TP3D code). Some models/backends, such as sparse convolutions, require the data to be batched differently, i.e. into a shape of [2048*16, 4], where the 4th column is a "batch index". This can be done by using the PyTorch Geometric collation functions (https://pytorch-geometric.readthedocs.io/en/latest/modules/data.html#torch_geometric.data.Batch.from_data_list), which collate in this manner by default.

TorchPoints accomplishes this by setting a configuration option in the model to define whether it uses "dense" or "sparse" data. We would likely need to do the same, and have the dataloader batch according to this configuration option. Ref: https://github.com/torch-points3d/torch-points3d/blob/66e8bf22b2d98adca804c753ac3f0013ff4ec731/torch_points3d/datasets/base_dataset.py#L160-L174
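
To make the two layouts concrete, here is a minimal, framework-free sketch using nested Python lists and tiny shapes (2 samples of 3 points instead of 16 samples of 2048 points). The function names collate_dense and collate_sparse are hypothetical and are not part of OpenPoints, TP3D, or PyG. In real code the same result would more likely come from torch.stack and torch.cat or from the PyG collation utilities.

```python
from typing import List, Sequence

Point = Sequence[float]  # one (x, y, z) point


def collate_dense(samples: List[List[Point]]) -> List[List[List[float]]]:
    """Stack B samples of N points each into a [B, N, 3] nested list."""
    num_points = len(samples[0])
    if any(len(sample) != num_points for sample in samples):
        raise ValueError("dense batching needs the same point count per sample")
    return [[list(point) for point in sample] for sample in samples]


def collate_sparse(samples: List[List[Point]]) -> List[List[float]]:
    """Concatenate samples into [sum(N_i), 4]; column 4 is the batch index."""
    rows = []
    for batch_index, sample in enumerate(samples):
        for point in sample:
            rows.append([*point, float(batch_index)])
    return rows


# Two tiny samples of three points each (stand-ins for 16 x [2048, 3]).
samples = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(2.0, 2.0, 2.0), (3.0, 2.0, 2.0), (2.0, 3.0, 2.0)],
]
dense = collate_dense(samples)    # shape [2, 3, 3]
sparse = collate_sparse(samples)  # shape [6, 4]
```

Note that the sparse layout does not require every sample to have the same number of points, while the dense layout does. A dataloader driven by a "dense"/"sparse" configuration option could pick between two such collate functions.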

Unit and Integration Testing

Issue for tracking how to incorporate testing within the repo.

Unit testing

  • Ensure datasets remain downloadable and usable
  • Ensure datasets are loaded in the correct torch_geometric.data.Data format
  • Unit testing for custom transforms
  • Test that models get built correctly. The PyG repository has some good examples.

Integration testing

  • Test data->model pipeline end-to-end for combinations of datasets and models

Add DALES Aerial Lidar Dataset

We present the Dayton Annotated Laser Earth Scan (DALES) data set, a new large-scale aerial LiDAR data set with nearly a half-billion points spanning 10 square kilometers of area. DALES contains forty scenes of dense, labeled aerial data spanning multiple scene types, including urban, suburban, rural, and commercial. The data was hand-labeled by a team of expert LiDAR technicians into eight categories: ground, vegetation, cars, trucks, poles, power lines, fences, and buildings. We present the entire data set, split into testing and training, and provided in 3 different data formats. The goal of this data set is to help advance the field of deep learning within aerial LiDAR.

https://udayton.edu/engineering/research/centers/vision_lab/research/was_data_analysis_and_processing/dale.php

Installation issues

Hi @CCInc,

I tried installing this repo and encountered an error while running ./install_openpoints.sh.

My versions of software components:
Using pip in virtual environment.
Python version: Python 3.10.8
pip version: pip 22.3.1
nvcc version:
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
gcc version: 7.5.
OS info:
linux version:
Description: Ubuntu 18.04.6 LTS
Release: 18.04
Codename: bionic
kernel version:
5.4.0-125-generic

Commands run:

#add the --recursive flag, otherwise the openpoints folder content is not included (because it's a submodule)
git clone https://github.com/CCInc/3d-ml.git --recursive

#create virtual env with python, go inside it.
cd 3d-ml
python -m virtualenv env_3d
source env_3d/bin/activate

#install pytorch with pip
pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu116
#install pytorch geo with pip, ${CUDA} = cu116
pip install pyg-lib torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-1.13.0+cu116.html
pip install torch-geometric

#install additional requirements
pip install -r requirements.txt

#install openpoints as root
sudo ./install_openpoints.sh

I receive the following errors:

cuda/emd_kernel.cu(178): error: identifier "CHECK_EQ" is undefined

cuda/emd_kernel.cu(265): error: identifier "CHECK_EQ" is undefined

cuda/emd_kernel.cu(382): error: identifier "CHECK_EQ" is undefined

I also receive some warnings:

.local/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.

.local/lib/python3.10/site-packages/setuptools/command/easy_install.py:160: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.

What is the recommended way to install this repo if not using conda?
I also noticed that Python <= 3.8 is not supported.

Add full S3DIS dataset

TP3D and PointNeXt allow processing the full S3DIS dataset, which allows sampling by room, boxes, cylinders and spheres (while the PyG dataset is already preprocessed to 1x1m boxes).

Refactor Generic Model Code by Task

Create generic base models for tasks such as classification and segmentation, which commonly share the same metrics (such as loss, accuracy, and IoU). This would likely entail moving most methods other than the step and forward logic into the shared base, so that each base model preprocesses the data for the downstream model plugin as needed.
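
As a rough illustration of the idea, the sketch below puts shared metric logic in a task-level base class and leaves only forward to the model plugin. All names here (BaseSegmentationTask, ThresholdModel, accuracy, mean_iou) are hypothetical and not taken from the repo. It uses plain Python lists instead of tensors, and loss is omitted because it depends on the framework.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy(pred: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of points whose predicted label matches the target."""
    correct = sum(p == t for p, t in zip(pred, target))
    return correct / len(target)


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union over classes that occur at least once."""
    ious = []
    for c in range(num_classes):
        inter = sum(p == c and t == c for p, t in zip(pred, target))
        union = sum(p == c or t == c for p, t in zip(pred, target))
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


class BaseSegmentationTask:
    """Task-level base: owns the step and metric logic shared by all models."""

    def __init__(self, num_classes: int) -> None:
        self.num_classes = num_classes

    def forward(self, batch: List[Tuple[float, float, float]]) -> List[int]:
        raise NotImplementedError  # supplied by the model plugin

    def step(self, batch, target: Sequence[int]) -> Dict[str, float]:
        pred = self.forward(batch)
        return {
            "accuracy": accuracy(pred, target),
            "iou": mean_iou(pred, target, self.num_classes),
        }


class ThresholdModel(BaseSegmentationTask):
    """Toy plugin: labels a point 1 if its z coordinate is above 0.5."""

    def forward(self, batch):
        return [1 if point[2] > 0.5 else 0 for point in batch]


points = [(0.0, 0.0, 0.1), (0.0, 0.0, 0.9), (0.0, 0.0, 0.7), (0.0, 0.0, 0.2)]
labels = [0, 1, 0, 0]
metrics = ThresholdModel(num_classes=2).step(points, labels)
```

With this split, a classification base class could reuse accuracy and drop the IoU metric without touching any model plugin.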

Rewrite Modelnet2048 using PyG InMemoryDataset

Currently, custom code is written to handle the downloading/processing of ModelNet2048. It should be rewritten as a PyTorch Geometric InMemoryDataset, which has helper functions to handle downloading and processing of the data from the h5py input format. This would also let us remove the custom download functions.

See:
https://pytorch-geometric.readthedocs.io/en/latest/notes/create_dataset.html
https://github.com/pyg-team/pytorch_geometric/blob/master/torch_geometric/datasets/s3dis.py <- the pyg s3dis dataset also comes from a h5py source, very similar to modelnet2048

Add Dataset Transforms Pipeline

Add support for PyTorch Geometric data transformation pipelines for train, test, and validation datasets. Reimplement some TP3D transform methods within the new repo, such as AddFeatsByKeys and GridSampling3d. Investigate using OpenPoints transforms.
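
The pipeline pattern could look roughly like the sketch below: each transform is a callable that takes and returns a sample, and a Compose object chains them per split. This is a framework-free sketch on plain dicts of lists. The classes Compose, GridSample, and Center are hypothetical stand-ins written for illustration. In particular, GridSample keeps the first point per voxel, which is a simplification and not necessarily how TP3D's GridSampling3d behaves.

```python
import math
from typing import Callable, Dict, List

Sample = Dict[str, list]  # e.g. {"pos": [(x, y, z), ...], "y": [label, ...]}


class Compose:
    """Apply a list of transforms in order."""

    def __init__(self, transforms: List[Callable[[Sample], Sample]]) -> None:
        self.transforms = list(transforms)

    def __call__(self, sample: Sample) -> Sample:
        for transform in self.transforms:
            sample = transform(sample)
        return sample


class GridSample:
    """Keep the first point that falls in each cubic voxel of edge `size`."""

    def __init__(self, size: float) -> None:
        self.size = size

    def __call__(self, sample: Sample) -> Sample:
        seen = set()
        keep = []
        for i, (x, y, z) in enumerate(sample["pos"]):
            voxel = tuple(math.floor(c / self.size) for c in (x, y, z))
            if voxel not in seen:
                seen.add(voxel)
                keep.append(i)
        # Filter every per-point attribute with the same indices.
        return {key: [values[i] for i in keep] for key, values in sample.items()}


class Center:
    """Translate positions so that their centroid sits at the origin."""

    def __call__(self, sample: Sample) -> Sample:
        pos = sample["pos"]
        centroid = [sum(p[d] for p in pos) / len(pos) for d in range(3)]
        out = dict(sample)
        out["pos"] = [tuple(p[d] - centroid[d] for d in range(3)) for p in pos]
        return out


# Separate pipelines per split; here the test split skips downsampling.
train_transform = Compose([GridSample(size=1.0), Center()])
test_transform = Compose([Center()])

sample = {
    "pos": [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (1.5, 0.1, 0.1), (1.6, 0.2, 0.1)],
    "y": [0, 0, 1, 1],
}
out = train_transform(sample)  # two points remain, one per occupied voxel
```

Keeping labels and positions in one sample object means a transform that drops points, such as the grid sampler, filters all per-point attributes consistently.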

Run tests in Docker image

  • Move CI tests to docker image
  • Test various CUDA versions
  • Test both conda and pip installation
  • Try to remove --user from open_points conda install

Add documentation

  • Convert docstrings to Read the Docs
  • How to get started
  • How to run simple example
  • How to add new model/dataset
  • How to contribute
