
A project demonstrating lidar-related AI solutions, including three GPU-accelerated lidar/camera DL networks (PointPillars, CenterPoint, BEVFusion) and related libraries (cuPCL, 3D SparseConvolution, YUV2RGB, cuOSD).

License: Other


Lidar AI Solution

This repository provides highly optimized solutions for self-driving 3D-lidar perception, significantly speeding up sparse convolution, CenterPoint, BEVFusion, on-screen display (OSD), and image-format conversion.

Pipeline overview

[Pipeline diagram]

Get Started

$ git clone --recursive https://github.com/NVIDIA-AI-IOT/Lidar_AI_Solution
$ cd Lidar_AI_Solution
  • For each specific task, please refer to the README in the corresponding sub-folder.

3D Sparse Convolution

A tiny inference engine for 3D sparse convolutional networks using INT8/FP16 (a rulebook sketch follows the feature list below).

  • Tiny Engine: Tiny Lidar-Backbone inference engine independent of TensorRT.
  • Flexible: Build execution graph from ONNX.
  • Easy To Use: Simple interface and onnx export solution.
  • High Fidelity: Low accuracy drop on nuScenes validation.
  • Low Memory: 422MB@SCN FP16, 426MB@SCN INT8.
  • Compact: Built on custom CUDA kernels, independent of cutlass.
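
Engines of this kind typically drive their gather/scatter through a "rulebook" built over the active voxel coordinates (see also the rulebook question in the Issues below). A purely illustrative NumPy sketch of submanifold rulebook construction, using our own names and not the library's actual code:

import numpy as np

def build_rulebook(coords, kernel_size=3):
    """Submanifold rulebook: for each active input voxel and kernel offset,
    record which active output site receives the contribution."""
    lut = {tuple(c): i for i, c in enumerate(coords)}  # hash table of active sites
    r = kernel_size // 2
    offsets = [(dx, dy, dz)
               for dx in range(-r, r + 1)
               for dy in range(-r, r + 1)
               for dz in range(-r, r + 1)]
    rules = []  # (input_index, kernel_offset_index, output_index) triplets
    for i, c in enumerate(coords):
        for k, off in enumerate(offsets):
            out = (c[0] + off[0], c[1] + off[1], c[2] + off[2])
            if out in lut:  # submanifold: the output site must itself be active
                rules.append((i, k, lut[out]))
    return rules

# Example: three active voxels along a line under a 3x3x3 kernel.
coords = np.array([[0, 0, 0], [0, 0, 1], [0, 0, 2]])
print(len(build_rulebook(coords)))  # 7 gather/scatter pairs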

CUDA BEVFusion

CUDA & TensorRT solution for BEVFusion inference, including:

  • Camera Encoder: ResNet50 and finetuned BEV pooling with TensorRT and onnx export solution.
  • Lidar Encoder: Tiny Lidar-Backbone inference independent of TensorRT and onnx export solution.
  • Feature Fusion: Camera & Lidar feature fuser with TensorRT and onnx export solution.
  • Pre/Postprocess: Interval precomputing, lidar voxelization, and feature decoding with CUDA kernels (a BEV-pooling sketch follows this list).
  • Easy To Use: Preparation, inference, and evaluation all in one to reproduce the accuracy of the PyTorch implementation.
  • PTQ: Easy-to-understand quantization solutions for mmdet3d/spconv.
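
To illustrate the interval-precomputing idea behind BEV pooling: lifted camera features are sorted by their target BEV cell so that each cell owns a contiguous interval, which can then be reduced in a single pass. A minimal NumPy sketch, with our own names and layout rather than the repo's:

import numpy as np

def bev_pool(feats, bev_idx, num_cells):
    """Sum per-point camera features into their BEV cells using precomputed intervals."""
    order = np.argsort(bev_idx)                 # group points of the same cell together
    feats, bev_idx = feats[order], bev_idx[order]
    cells, starts = np.unique(bev_idx, return_index=True)   # interval boundaries
    pooled = np.zeros((num_cells, feats.shape[1]), feats.dtype)
    pooled[cells] = np.add.reduceat(feats, starts, axis=0)  # one sum per interval
    return pooled

# Example: 4 lifted features falling into 2 of 6 BEV cells.
feats = np.ones((4, 8), np.float32)
print(bev_pool(feats, np.array([3, 1, 3, 3]), 6)[3, 0])  # 3.0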

CUDA CenterPoint

CUDA & TensorRT solution for CenterPoint inference, including:

  • Preprocess: Voxelization with a CUDA kernel (see the voxelization sketch after this list).
  • Encoder: 3D backbone with NV spconv-scn and onnx export solution.
  • Neck & Header: RPN & CenterHead with TensorRT and onnx export solution.
  • Postprocess: Decode & NMS with CUDA kernels.
  • Easy To Use: Preparation, inference, and evaluation all in one to reproduce the accuracy of the PyTorch implementation.
  • QAT: Easy-to-understand quantization solutions for traveller59/spconv.
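
For reference, the voxelization step quantizes each point into a grid cell and groups points per occupied voxel. A minimal NumPy sketch of that logic (illustrative only; the CUDA kernel's exact bounds and layout live in the sub-folder):

import numpy as np

def voxelize(points, voxel_size, pc_range):
    """Quantize points into voxel grid indices and group them per voxel."""
    xyz = points[:, :3]
    keep = np.all((xyz >= pc_range[:3]) & (xyz < pc_range[3:]), axis=1)  # range filter
    ijk = ((xyz[keep] - pc_range[:3]) / voxel_size).astype(np.int32)     # grid coords
    voxels, inverse = np.unique(ijk, axis=0, return_inverse=True)        # point -> voxel map
    return voxels, inverse, points[keep]

# Example with assumed nuScenes-like bounds.
pc_range = np.array([-54.0, -54.0, -5.0, 54.0, 54.0, 3.0], np.float32)
pts = np.random.rand(1000, 5).astype(np.float32) * 20 - 10
voxels, inverse, kept = voxelize(pts, np.array([0.075, 0.075, 0.2]), pc_range)
print(voxels.shape, inverse.shape)  # (V, 3) (N_kept,)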

CUDA PointPillars

CUDA & TensorRT solution for PointPillars inference, including:

  • Preprocess: Voxelization & feature extension with CUDA kernels (a feature-extension sketch follows this list).
  • Detector: 2.5D backbone with TensorRT and onnx export solution.
  • Postprocess: Parse bounding boxes, class types, and directions.
  • Easy To Use: Preparation, inference, and evaluation all in one to reproduce the accuracy of the PyTorch implementation.
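
The feature-extension step augments each pillar point's [x, y, z, r] with offsets to the pillar's point mean and to the pillar cell center, as in the PointPillars paper. A minimal NumPy sketch (the deployed kernel's exact feature layout may differ):

import numpy as np

def extend_pillar_features(pts, pillar_center_xy):
    """Augment [x, y, z, r] per-point features of one pillar with offsets to the
    pillar's point mean (xc, yc, zc) and to the pillar cell center (xp, yp)."""
    to_mean = pts[:, :3] - pts[:, :3].mean(axis=0)   # xc, yc, zc
    to_center = pts[:, :2] - pillar_center_xy        # xp, yp
    return np.hstack([pts, to_mean, to_center])      # (M, 9) extended features

pillar = np.array([[1.0, 2.0, 0.5, 0.3],
                   [1.1, 2.1, 0.4, 0.7]], np.float32)
print(extend_pillar_features(pillar, np.array([1.04, 2.04])).shape)  # (2, 9)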

CUDA-V2XFusion

Training and inference solutions for V2XFusion.

  • Easy To Use: Provides easily reproducible solutions for training, quantization, and ONNX export.
  • Quantization friendly: PointPillars-based backbone with pre-normalization, which reduces quantization error.
  • Feature Fusion: Camera & Lidar feature fuser and onnx export solution.
  • PTQ: Quantization solutions for V2XFusion, easy to understand.
  • Sparsity: 2:4 structured sparsity support.
  • DeepStream sample: Sample inference using CUDA and TensorRT/Triton in NVIDIA DeepStream SDK 7.0.

cuOSD (CUDA On-Screen Display Library)

Draws all elements using a single CUDA kernel.

  • Line: Plots lines with nearest or linear interpolation.
  • RotateBox: Drawn with configurable border and fill colors.
  • Circle: Drawn with configurable border and fill colors.
  • Rectangle: Drawn with configurable border and fill colors.
  • Text: Supports stb_truetype and pango-cairo backends, allowing fonts to be loaded from TTF files or selected by font-family.
  • Arrow: Composed of three lines.
  • Point: Plots points with nearest or linear interpolation.
  • Clock: Time display built on the text support.

cuPCL (CUDA Point Cloud Library)

Provides several GPU-accelerated point cloud operations with both high accuracy and high performance: cuICP, cuFilter, cuSegmentation, cuOctree, cuCluster, cuNDT, and Voxelization (upcoming).

  • cuICP: CUDA-accelerated iterative closest point (point-to-point) registration.
  • cuFilter: CUDA-accelerated PassThrough and VoxelGrid filtering.
  • cuSegmentation: CUDA-accelerated RandomSampleConsensus segmentation with a plane model.
  • cuOctree: CUDA-accelerated approximate nearest-neighbor search and radius search.
  • cuCluster: CUDA-accelerated clustering based on the distance between points.
  • cuNDT: CUDA-accelerated 3D Normal Distributions Transform registration for point cloud data.

YUVToRGB (CUDA Conversion)

YUV to RGB conversion that fuses resize, padding, conversion, and normalization into a single kernel function (a reference sketch of the fused math follows the feature list below).

  • Most of the time, it can be bit-aligned with OpenCV.
    • It will give an exact result when the scaling factor is a rational number.
    • Better performance is usually achieved when the stride is divisible by 4.
  • Supported Input Format:
    • NV12BlockLinear
    • NV12PitchLinear
    • YUV422Packed_YUYV
  • Supported Interpolation methods:
    • Nearest
    • Bilinear
  • Supported Output Data Type:
    • Uint8
    • Float32
    • Float16
  • Supported Output Layout:
    • CHW_RGB/BGR
    • HWC_RGB/BGR
    • CHW16/32/4/RGB/BGR for DLA input
  • Supported Features:
    • Resize
    • Padding
    • Conversion
    • Normalization
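
For reference, a NumPy sketch of the math the fused kernel performs for NV12 input, assuming BT.601 full-range coefficients (the library's exact coefficients and rounding may differ, which is why results are only usually bit-aligned with OpenCV):

import numpy as np

def nv12_to_rgb_normalized(y, uv, mean, std):
    """Reference for the fused conversion: NV12 -> RGB (BT.601 full range),
    then (x - mean) / std normalization, all in one pass."""
    u = np.repeat(np.repeat(uv[:, :, 0], 2, 0), 2, 1).astype(np.float32) - 128.0
    v = np.repeat(np.repeat(uv[:, :, 1], 2, 0), 2, 1).astype(np.float32) - 128.0
    yf = y.astype(np.float32)
    r = yf + 1.402 * v                      # BT.601 full-range coefficients
    g = yf - 0.344136 * u - 0.714136 * v
    b = yf + 1.772 * u
    rgb = np.clip(np.stack([r, g, b], -1), 0.0, 255.0)
    return (rgb - mean) / std               # normalization folded into the same pass

y = np.full((4, 4), 128, np.uint8)
uv = np.full((2, 2, 2), 128, np.uint8)
print(nv12_to_rgb_normalized(y, uv, 0.0, 255.0)[0, 0])  # ~[0.5, 0.5, 0.5]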

Thanks

This project makes use of a number of awesome open source libraries, including:

  • stb_image for PNG and JPEG support
  • pybind11 for seamless C++ / Python interop
  • and others! See the dependencies folder.

Many thanks to the authors of these brilliant projects!

Contributors

byte-deve, hopef, mchi-zg, rahulk4102, sunnyqgg


Issues

Bug in Lidar_AI_Solution/CUDA-BEVFusion/qat/test-mAP-for-cuda.py

I found that the "show" option in Lidar_AI_Solution/CUDA-BEVFusion/qat/test-mAP-for-cuda.py is not effective. Could you please advise me on how to use this project to run on the nuScenes test set and visualize the results? Is there any related code available?

Export ONNX error

Thanks to the authors for this excellent repo. When exporting my own trained BEVFusion model, running export-scn.py produces a lidar.backbone.xyz.onnx as shown below:
[screenshot]

Console output from running the export-scn.py script:
[screenshot]

Segmentation fault (core dumped) when running tool/run.sh

I followed the README, prepared the models and data, and built the engines, but when running tool/run.sh the following error occurs:

==================BEVFusion===================
[⏰ [NoSt] CopyLidar]: 0.44275 ms
[⏰ [NoSt] ImageNrom]: 5.44451 ms
[⏰ Lidar Backbone]: 3.04624 ms
[⏰ Camera Depth]: 0.03789 ms
[⏰ Camera Backbone]: 2.55898 ms
[⏰ Camera Bevpool]: 0.35421 ms
[⏰ VTransform]: 0.61219 ms
[⏰ Transfusion]: 1.63635 ms
[⏰ Head BoundingBox]: 3.45373 ms
Total: 11.700 ms

==================BEVFusion===================
[⏰ [NoSt] CopyLidar]: 0.44605 ms
[⏰ [NoSt] ImageNrom]: 5.41389 ms
[⏰ Lidar Backbone]: 3.03587 ms
[⏰ Camera Depth]: 0.03789 ms
[⏰ Camera Backbone]: 2.55693 ms
[⏰ Camera Bevpool]: 0.36045 ms
[⏰ VTransform]: 0.60826 ms
[⏰ Transfusion]: 1.62582 ms
[⏰ Head BoundingBox]: 3.45293 ms
Total: 11.678 ms

tool/run.sh: line 41: 2643 Segmentation fault (core dumped) ./build/bevfusion $DEBUG_DATA $DEBUG_MODEL $DEBUG_PRECISION

Thanks for your reply!

Cuda failure: invalid configuration argument in file Lidar_AI_Solution/CUDA-CenterPoint/src/preprocess.cpp:106 error status: 9

GPU has cuda devices: 2
----device id: 0 info----
GPU : NVIDIA RTX A6000
Capbility: 8.6
Global memory: 48651MB
Const memory: 64KB
SM in a block: 48KB
warp size: 32
threads in a block: 1024
block dim: (1024,1024,64)
grid dim: (2147483647,65535,65535)
----device id: 1 info----
GPU : NVIDIA RTX A6000
Capbility: 8.6
Global memory: 48685MB
Const memory: 64KB
SM in a block: 48KB
warp size: 32
threads in a block: 1024
block dim: (1024,1024,64)
grid dim: (2147483647,65535,65535)

Total 2

<<<<<<<<<<<
load file: ../data/test/3615d82e7e8546fea5f181157f42e30b.bin
find points num: 6
Cuda failure: invalid configuration argument in file Lidar_AI_Solution/CUDA-CenterPoint/src/preprocess.cpp:106 error status: 9
Aborted (core dumped)

disable quant layer

In the CenterPoint example, why do you only disable quantization for conv_input and for conv1 of the first sparse basic block? What about the remaining convs?

SparseConvolution.cu

Thank you very much for your outstanding work.
I see that the deployment-side implementation of SparseConvolution is shipped as an .so file, and I was wondering whether its source code (.cu) will be released.

CenterPoint inference time

The official test case reports an inference time of 42 ms per frame, but when I tested on my own Orin device the inference time was 61 ms per frame. Are there any details I should consider to improve the inference speed?

Inference results issue

I have tried my own data with both CUDA-BEVFusion and BEVFusion directly, without retraining. BEVFusion gives good inference results, but CUDA-BEVFusion cannot detect persons, while its vehicle detection is good. Any clue as to why?
Best regards

CenterPoint training

Thank you very much for open-sourcing this. Which project should I use to train on my own data?

plugin for SparseConvolution

Hi,
I can see that there is a custom ONNX node "SparseConvolution" in the CenterPoint middle-encoder ONNX file. However, I can't find its plugin or how this ONNX is converted to a TensorRT engine. Can you point me to where to find it?

Question about customized dataset

Hi, thanks for sharing the great work and congratulations on the achievements!

I have a question regarding customized datasets:

This repo is mainly designed to implement model inference highly efficiently, so if I want to follow your work, the first thing to do is to train on my customized dataset with the official BEVFusion repo in PyTorch, right?

ONNX input voxel num

In the generated ONNX, the input seems to have a static shape. However, during inference the input voxel count always changes. Does the sparse convolution inference not depend on the ONNX input shape?

load_tensor to load in_features.torch.fp16.tensor

I got an error when using load_tensor from funcs to load 3DSparseConvolution/workspace/centerpoint/in_features.torch.fp16.tensor;
it shows this error:

from funcs import load_tensor
b = load_tensor("/vol_slow/Lidar_AI_Solution/libraries/3DSparseConvolution/workspace/centerpoint/in_features.torch.fp16.tensor")
print(b)

[screenshot]

When I print the data type, it shows int32; is this an error?

CenterPoint ONNX file

I can't find the script to convert a .pth to an ONNX file. If I want to use my own weights, what do I have to do? Keep the same names and input/output shapes as the ONNX file you provide?

Hello, how can I resolve this error?

/usr/bin/ld: warning: libprotobuf.so.17, needed by /home/chenyang/Lidar_AI_Solution_3/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: /home/chenyang/Lidar_AI_Solution_3/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::RegisterAllTypes(google::protobuf::Metadata const*, int)' /usr/bin/ld: /home/chenyang/Lidar_AI_Solution_3/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::io::CodedOutputStream::WriteVarint32SlowPath(unsigned int)'
/usr/bin/ld: /home/chenyang/Lidar_AI_Solution_3/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::io::CodedOutputStream::WriteVarint64SlowPath(unsigned long)' /usr/bin/ld: /home/chenyang/Lidar_AI_Solution_3/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::WireFormat::SerializeUnknownFields(google::protobuf::UnknownFieldSet const&, google::protobuf::io::CodedOutputStream*)'
/usr/bin/ld: /home/chenyang/Lidar_AI_Solution_3/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::ArenaImpl::AllocateAligned(unsigned long)' /usr/bin/ld: /home/chenyang/Lidar_AI_Solution_3/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::AssignDescriptors(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, google::protobuf::internal::MigrationSchema const*, google::protobuf::Message const* const*, unsigned int const*, google::protobuf::Metadata*, google::protobuf::EnumDescriptor const**, google::protobuf::ServiceDescriptor const**)'
/usr/bin/ld: /home/chenyang/Lidar_AI_Solution_3/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::WireFormat::SerializeUnknownFieldsToArray(google::protobuf::UnknownFieldSet const&, unsigned char*)' /usr/bin/ld: /home/chenyang/Lidar_AI_Solution_3/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::Message::ParseFromIstream(std::istream*)'
/usr/bin/ld: /home/chenyang/Lidar_AI_Solution_3/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to `google::protobuf::MessageFactory::InternalRegisterGeneratedFile(char const*, void (*)(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&))'
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/bevfusion.dir/build.make:678: bevfusion] Error 1
make[1]: *** [CMakeFiles/Makefile2:111: CMakeFiles/bevfusion.dir/all] Error 2
make: *** [Makefile:91: all] Error 2

Generate tiny engine for spconv

Thanks for the great work. I have already tested inference on the example data. In issue #4 you mentioned that you generated a tiny engine in a standalone way for spconv. Can you elaborate on how I can generate the tiny engine if I make some changes to the spconv code in CUDA-BEVFusion?

rulebook

Does every sparse convolution have to have a rulebook?

Error when executing bash tool/run.sh

An error occurs when executing bash tool/run.sh. How can I solve it?
fatal error: stb_image.h: No such file or directory
#include <stb_image.h>
^~~~~~~~~~~~~
compilation terminated.
CMakeFiles/bevfusion.dir/build.make:504: recipe for target 'CMakeFiles/bevfusion.dir/src/main.cpp.o' failed
make[2]: *** [CMakeFiles/bevfusion.dir/src/main.cpp.o] Error 1
CMakeFiles/Makefile2:104: recipe for target 'CMakeFiles/bevfusion.dir/all' failed
make[1]: *** [CMakeFiles/bevfusion.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

CenterPoint engine setup failed

hello Lidar_AI_Solution team @hopef ,

After running bash tool/build.trt.sh, the following errors occur:
[05/24/2023-16:26:52] [E] Error[10]: [optimizer.cpp::computeCosts::1855] Error Code 10: Internal Error (Could not find any implementation for node ConvTranspose_56 + BatchNormalization_57 + Relu_58.)
[05/24/2023-16:26:52] [E] Error[2]: [builder.cpp::buildSerializedNetwork::417] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed.)
[05/24/2023-16:26:52] [E] Engine could not be created from network
[05/24/2023-16:26:52] [E] Building engine failed
[05/24/2023-16:26:52] [E] Failed to create engine from model or file.
[05/24/2023-16:26:52] [E] Engine set up failed
rpn_centerhead_sim.8503.log

My environment is
CUDA 11.4
TensorRT 8.5.3.1
GPU: 3060
Ubuntu 20.04

The full log is attached as well.
It seems the error comes from the onnx file.
Please help figure it out.
Thanks in advance.

Precision of SparseConvolution

Hello,

I see in your PTQ.onnx file and export-scn.py that the precision of the first three SparseConvolution layers is 'fp16'. Is this because 'int8' precision loses too much accuracy?

Thanks

Version of TensorRT

Hello

I see in the README of CUDA-CenterPoint that you use 'CUDA 11.4 + cuDNN 8.4.1 + TensorRT 8.4.12.5' on the Tesla platform, but I can't find TensorRT version 8.4.12.5. In addition, I see that you set CUDA_TOOLKIT_ROOT_DIR=/root/.kiwi/lib/cuda-11.8 and TENSORRT_ROOT=/root/.kiwi/lib/TensorRT-8.5.3.1-cuda11x in the CMakeLists of CUDA-CenterPoint. So should I follow the versions in the CMakeLists or the README?

Thanks

spconv version

Hi,
In the CenterPoint quantization you used _conv_forward as the forward function, so I assume you used spconv version 2.3.0+?

Hello, when running CUDA-BEVFusion I ran into a missing-library problem, even though the library is already installed

(lssEnv) shijie@shijie-ThinkStation-K:~/Lidar_AI_Solution/CUDA-BEVFusion$ bash tool/run.sh

|| MODEL: resnet50int8
|| PRECISION: int8
|| DATA: example-data
|| USEPython: OFF
||
|| TensorRT: /home/shijie/TensorRT-8.6.1.6/lib
|| CUDA: /usr/local/cuda-11.6/
|| CUDNN: /usr/local/cuda-11.6/lib64

Try to get the current device SM
Current CUDA SM: 86
Configuration done!
-- Configuring done
-- Generating done
-- Build files have been written to: /home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/build
[ 63%] Built target bevfusion_core
[ 68%] Linking CXX executable bevfusion
/usr/bin/ld: warning: libprotobuf.so.17, needed by /home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so, not found (try using -rpath or -rpath-link)
/home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::RegisterAllTypes(google::protobuf::Metadata const*, int)' /home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::OnShutdownRun(void ()(void const), void const*)'
/home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::Arena::OnArenaAllocation(std::type_info const*, unsigned long) const' /home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::fixed_address_empty_string[abi:cxx11]'
/home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::io::CodedOutputStream::WriteVarint64SlowPath(unsigned long)' /home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::DestroyMessage(void const*)'
/home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::ArenaImpl::AllocateAlignedAndAddCleanup(unsigned long, void (*)(void*))' /home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::ArenaImpl::AllocateAligned(unsigned long)'
/home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::WireFormatLite::UInt64Size(google::protobuf::RepeatedField<unsigned long> const&)' /home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::WireFormatLite::WriteFloatArray(float const*, int, google::protobuf::io::CodedOutputStream*)'
/home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::InitSCCImpl(google::protobuf::internal::SCCInfoBase*)' /home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::WireFormatLite::WriteDoubleArray(double const*, int, google::protobuf::io::CodedOutputStream*)'
/home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::WireFormatLite::Int32Size(google::protobuf::RepeatedField<int> const&)' /home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::internal::AssignDescriptors(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, google::protobuf::internal::MigrationSchema const*, google::protobuf::Message const* const*, unsigned int const*, google::protobuf::Metadata*, google::protobuf::EnumDescriptor const**, google::protobuf::ServiceDescriptor const**)'
/home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::Message::SpaceUsedLong() const' /home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to google::protobuf::io::CodedInputStream::SkipFallback(int, int)'
/home/shijie/Lidar_AI_Solution/CUDA-BEVFusion/../libraries/3DSparseConvolution/libspconv/lib/x86_64/libspconv.so: undefined reference to `google::protobuf::internal::WireFormatLite::Int64Size(google::protobuf::RepeatedField const&)'
collect2: error: ld returned 1 exit status
CMakeFiles/bevfusion.dir/build.make:651: recipe for target 'bevfusion' failed
make[2]: *** [bevfusion] Error 1
CMakeFiles/Makefile2:104: recipe for target 'CMakeFiles/bevfusion.dir/all' failed
make[1]: *** [CMakeFiles/bevfusion.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

(lssEnv) shijie@shijie-ThinkStation-K:~/Lidar_AI_Solution/CUDA-BEVFusion$ sudo apt install libprotobuf-dev
[sudo] password for shijie:
Reading package lists... Done
Building dependency tree
Reading state information... Done
libprotobuf-dev is already the newest version (3.0.0-9.1ubuntu1.1).
The following packages were automatically installed and are no longer required:
gir1.2-goa-1.0 gir1.2-snapd-1
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 103 not upgraded.

Could you take a look at how to solve this? I have also tried installing this library from source.

Can not find any matched kernel 4 x 16

Hello, I have converted my CenterPoint model to ONNX. I only use 4 features as input while your code uses 5, and then this error occurs:

Pass argument constructor
Assert failed 💀. false in file src/spconv/implicit-gemm.cu:346, message: Can not find any matched kernel 4 x 16

Is this a bug in my code, or does the project not support other input shapes?

Jetson AGX Xavier inference error

Lidar_AI_Solution/CUDA-BEVFusion/src/common/tensor.cu(135): error: calling a device function("__half") from a host function("arange_kernel_host") is not allowed

1 error detected in the compilation of "/tmp/tmpxft_00006d55_00000000-4_tensor.cpp4.ii".
CMake Error at bevfusion_core_generated_tensor.cu.o.Release.cmake:280 (message):
Error generating file
/work/share/usr/cyy/BEV/Lidar_AI_Solution/CUDA-BEVFusion/build/CMakeFiles/bevfusion_core.dir/src/common/./bevfusion_core_generated_tensor.cu.o

Support for custom datasets

Hi,
Does your sparse convolution support inference with input shape N×4? When I try to run inference, it says:

[screenshot]

CUDA-BEVFusion build head.bbox model failed

Thanks for the great work.

In CUDA-BEVFusion, when I run "bash tool/build_trt_engine.sh" to build the engine files, the head.bbox model build fails, but there is no error info.
Could you please help me check what the problem is?

Hello, I get Detection NUM = 0 when running the CUDA-CenterPoint program

GPU has cuda devices: 1
----device id: 0 info----
  GPU : NVIDIA GeForce GTX 1080 Ti 
  Capbility: 6.1
  Global memory: 11156MB
  Const memory: 64KB
  SM in a block: 48KB
  warp size: 32
  threads in a block: 1024
  block dim: (1024,1024,64)
  grid dim: (2147483647,65535,65535)

Total 2

<<<<<<<<<<<
load file: ../data/test/291e7331922541cea98122b607d24831.bin
find points num: 239911
[TIME] Voxelization:            3.09219 ms
valid_num: 85179
[TIME] 3D Backbone:             0.35184 ms
[TIME] RPN + Head:              24.59392 ms
CUDA kernel failed : no kernel image is available for execution on the device
[TIME] Decode + NMS:            0.43117 ms
Detection NUM: 0
Saved prediction in: ../data/prediction/291e7331922541cea98122b607d24831.txt
>>>>>>>>>>>

<<<<<<<<<<<
load file: ../data/test/3615d82e7e8546fea5f181157f42e30b.bin
find points num: 267057
[TIME] Voxelization:            2.24326 ms
valid_num: 106004
[TIME] 3D Backbone:             0.20611 ms
[TIME] RPN + Head:              23.10170 ms
CUDA kernel failed : no kernel image is available for execution on the device
[TIME] Decode + NMS:            0.55421 ms
Detection NUM: 0
Saved prediction in: ../data/prediction/3615d82e7e8546fea5f181157f42e30b.txt
>>>>>>>>>>>

Perf Report: 
    Voxelization: 2.66773 ms.
    3D Backbone: 0.278976 ms.
    RPN + Head: 23.8478 ms.
    Decode + NMS: 0.492688 ms.
    Total: 27.2872 ms.

build.trt.sh code

trt_version=8406

if [ ! -f "model/rpn_centerhead_sim.plan.${trt_version}" ]; then
    echo Building the model: model/rpn_centerhead_sim.plan.${trt_version}, this will take 2 minutes. Wait a moment 🤗🤗🤗~.
    trtexec --onnx=model/rpn_centerhead_sim.onnx \
        --saveEngine=model/rpn_centerhead_sim.plan.${trt_version} \
        --workspace=4096 --fp16 --outputIOFormats=fp16:chw \
        --inputIOFormats=fp16:chw --verbose --dumpLayerInfo \
        --dumpProfile --separateProfileRun \
        --profilingVerbosity=detailed > model/rpn_centerhead_sim.${trt_version}.log 2>&1

    rm -rf model/rpn_centerhead_sim.plan
    dir=`pwd`
    ln -s ${dir}/model/rpn_centerhead_sim.plan.${trt_version} model/rpn_centerhead_sim.plan
else
    echo Model model/rpn_centerhead_sim.plan.${trt_version} already build 🙋🙋🙋.
fi

Hello, a quick question: I followed the README procedure; my TensorRT version is 8.4.0.6, and the log generated by build.trt.sh shows the conversion completed without problems.
Could the error be caused by a TensorRT version mismatch in the bash tool/build.trt.sh step?
Thanks.

Compilation error

make[2]: *** [CMakeFiles/bevfusion_core.dir/build.make:112: CMakeFiles/bevfusion_core.dir/src/bevfusion/bevfusion_core_generated_camera-vtransform.cu.o] Error 1
/usr/include/c++/11/type_traits:79:52: error: redefinition of ‘constexpr const _Tp std::integral_constant<_Tp, __v>::value’
79 | template<typename _Tp, _Tp __v>
| ^
/usr/include/c++/11/type_traits:67:29: note: ‘constexpr const _Tp value’ previously declared here
67 | static constexpr _Tp value = __v;
| ^~~~~
/usr/include/c++/11/type_traits:79:52: error: redefinition of ‘constexpr const _Tp std::integral_constant<_Tp, __v>::value’
79 | template<typename _Tp, _Tp __v>
| ^
/usr/include/c++/11/type_traits:67:29: note: ‘constexpr const _Tp value’ previously declared here
67 | static constexpr _Tp value = __v;
| ^~~~~
/usr/include/c++/11/type_traits:79:52: error: redefinition of ‘constexpr const _Tp std::integral_constant<_Tp, __v>::value’
79 | template<typename _Tp, _Tp __v>
| ^
/usr/include/c++/11/type_traits:67:29: note: ‘constexpr const _Tp value’ previously declared here
67 | static constexpr _Tp value = __v;
| ^~~~~
CMake Error at bevfusion_core_generated_lidar-voxelization.cu.o.Release.cmake:280 (message):
Error generating file
/home/chenyang/Lidar_AI_Solution_2/CUDA-BEVFusion/build/CMakeFiles/bevfusion_core.dir/src/bevfusion/./bevfusion_core_generated_lidar-voxelization.cu.o

BEVFUSION, PTX JIT compilation failed, code = cudaErrorInvalidPtx

root@fe3435b23fa0:/mnt/Lidar_AI_Solution-master/CUDA-BEVFusion# ./tool/run.sh 
==========================================================
||  MODEL: resnet50int8
||  PRECISION: int8
||  DATA: example-data
||  USEPython: OFF
||
||  TensorRT: /usr/lib/x86_64-linux-gnu/
||  CUDA: /usr/local/cuda
||  CUDNN: /usr/lib/x86_64-linux-gnu/
==========================================================
Try to get the current device SM
Current CUDA SM: 80
Configuration done!
-- Configuring done
-- Generating done
-- Build files have been written to: /mnt/Lidar_AI_Solution-master/CUDA-BEVFusion/build
Consolidate compiler generated dependencies of target bevfusion_core
[ 63%] Built target bevfusion_core
Consolidate compiler generated dependencies of target bevfusion
[ 68%] Building CXX object CMakeFiles/bevfusion.dir/src/main.cpp.o
[ 72%] Linking CXX executable bevfusion
[100%] Built target bevfusion
Create by resnet50int8, int8
CUDA Runtime error create_frustum_kernel # a PTX JIT compilation failed, code = cudaErrorInvalidPtx [ 218 ] in file /mnt/Lidar_AI_Solution-master/CUDA-BEVFusion/src/bevfusion/camera-geometry.cu:209
./tool/run.sh: line 41:   583 Aborted                 (core dumped) ./build/bevfusion $DEBUG_DATA $DEBUG_MODEL $DEBUG_PRECISION

How can I solve this?

Compilation of ".plan" file

Hi,

As you said in another issue, you implemented a tiny engine in a standalone way and built some ".plan" files from ONNX models. I want to know whether these ".plan" model files can be combined with other ".engine" or ".trt" files.

Thanks

How can I convert a lidar PCD file to points.tensor?

Thanks for this great work. With CUDA-BEVFusion I can run on Orin at 45 ms per frame, which is very impressive.
Now I want to try my own data. How can I convert a lidar PCD file to points.tensor?
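
A hedged sketch of one way to do this, assuming open3d for reading the PCD and the [x, y, z, intensity, time] float16 layout of the example data; the repo's own tensor utilities define the authoritative .tensor container, so check them before relying on a raw dump:

import numpy as np
import open3d as o3d  # assumption: open3d is available for reading .pcd files

# Read the cloud and assemble an (N, 5) float16 array in the
# [x, y, z, intensity, time] layout the example points.tensor appears to use.
pcd = o3d.io.read_point_cloud("my_scan.pcd")
xyz = np.asarray(pcd.points, dtype=np.float16)            # (N, 3)
extra = np.zeros((xyz.shape[0], 2), dtype=np.float16)     # fill intensity/time if known
points = np.hstack([xyz, extra])

# Raw dump -- NOTE: the repo's .tensor container may carry a header (shape/dtype);
# adapt this to the tensor save/load helpers shipped with the repo.
points.tofile("points.tensor")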

core dumped by CUDA Runtime error spconv::setup_hash_and_route

(base) qiu@qiu-System-Product-Name:~/Lidar_AI_Solution/CUDA-BEVFusion$ bash tool/run.sh

|| MODEL: resnet50
|| PRECISION: fp16
|| DATA: example-data
|| USEPython: OFF
||
|| TensorRT: /usr/local/TensorRT-8.5.1.7/lib
|| CUDA: /usr/local/cuda
|| CUDNN: /usr/local//cuda/lib64

Try to get the current device SM
Current CUDA SM: 75
Configuration done!
-- Configuring done
-- Generating done
-- Build files have been written to: /home/qiu/Lidar_AI_Solution/CUDA-BEVFusion/build
[ 63%] Built target bevfusion_core
[100%] Built target bevfusion
Create by resnet50, fp16

Camerea Backbone 🌱 is Static Shape model
Inputs: 2
0.img : {1 x 6 x 3 x 256 x 704} [Float16]
1.depth : {1 x 6 x 1 x 256 x 704} [Float16]
Outputs: 2
0.camera_depth_weights : {6 x 118 x 32 x 88} [Float16]
1.camera_feature : {6 x 32 x 88 x 80} [Float16]


Camerea VTransform 🌱 is Static Shape model
Inputs: 1
0.feat_in : {1 x 80 x 360 x 360} [Float16]
Outputs: 1
0.feat_out : {1 x 80 x 180 x 180} [Float16]


Transfusion 🌱 is Static Shape model
Inputs: 2
0.camera : {1 x 80 x 180 x 180} [Float16]
1.lidar : {1 x 256 x 180 x 180} [Float16]
Outputs: 1
0.middle : {1 x 512 x 180 x 180} [Float16]


BBox 🌱 is Static Shape model
Inputs: 1
0.middle : {1 x 512 x 180 x 180} [Float16]
Outputs: 6
0.reg : {1 x 2 x 200} [Float16]
1.height : {1 x 1 x 200} [Float16]
2.dim : {1 x 3 x 200} [Float16]
3.rot : {1 x 2 x 200} [Float16]
4.vel : {1 x 2 x 200} [Float16]
5.score : {1 x 10 x 200} [Float16]

==================BEVFusion===================
[⏰ [NoSt] CopyLidar]: 1.13885 ms
[⏰ [NoSt] ImageNrom]: 12.50054 ms

CUDA Runtime error spconv::setup_hash_and_route<<<(num_input + 1023) / 1024, 1024, 0, stream>>>( hash_.get(), route_mask_.ptr(), route_.ptr(), indices, num_input, prob.kv) # no kernel image is available for execution on the device, code = cudaErrorNoKernelImageForDevice [ 209 ] in file src/spconv/rulebook.cu:138
tool/run.sh: line 41: 28104 Aborted (core dumped) ./build/bevfusion $DEBUG_DATA $DEBUG_MODEL $DEBUG_PRECISION

Thanks for the reply!

[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:9: Message type "onnx2trt_onnx.ModelProto" has no field named "version".

Thanks for your great work! I hit a problem when I run "bash tool/build.trt.sh"; the error is:
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:9: Message type "onnx2trt_onnx.ModelProto" has no field named "version".

My environment:
CUDA 11.7
TensorRT 8.5.3
Ubuntu 18
protobuf 3.6.1

Does this project require a higher version of protobuf?
