
derryhub / bevformer_tensorrt

399 stars, 4 watchers, 64 forks, 413 KB

BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).

License: Apache License 2.0

CMake 0.12% C++ 15.94% Cuda 30.20% Shell 0.01% Python 53.15% C 0.14% Dockerfile 0.43%
quantization bevformer tensorrt-plugins cuda int8-inference pytorch

bevformer_tensorrt's People

Contributors

derryhub, firestonelib



bevformer_tensorrt's Issues

TypeError: deserialize_cuda_engine(): incompatible function arguments. The following argument types are supported: 1. (self: tensorrt.tensorrt.Runtime, serialized_engine: buffer) -> tensorrt.tensorrt.ICudaEngine

Hello,

I got an error when converting the ONNX model to a TensorRT engine with sh samples/bevformer/tiny/onnx2trt.sh -d 0.
The detailed log is as follows:

[02/25/2023-12:57:54] [TRT] [V] Loaded shared library libcublasLt.so.11
[02/25/2023-12:57:54] [TRT] [V] Using cublasLt as core library tactic source
[02/25/2023-12:57:54] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +650, GPU +274, now: CPU 2219, GPU 2942 (MiB)
[02/25/2023-12:57:54] [TRT] [V] Trying to load shared library libcudnn.so.8
[02/25/2023-12:57:54] [TRT] [V] Loaded shared library libcudnn.so.8
[02/25/2023-12:57:54] [TRT] [V] Using cuDNN as plugin tactic source
[02/25/2023-12:57:55] [TRT] [V] Using cuDNN as core library tactic source
[02/25/2023-12:57:55] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +178, GPU +266, now: CPU 2397, GPU 3208 (MiB)
[02/25/2023-12:57:55] [TRT] [W] TensorRT was linked against cuDNN 8.6.0 but loaded cuDNN 8.1.1
[02/25/2023-12:57:55] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[02/25/2023-12:57:55] [TRT] [V] Constructing optimization profile number 0 [1/1].
[02/25/2023-12:57:55] [TRT] [E] 1: [constantBuilder.cpp::addSupportedFormats::32] Error Code 1: Internal Error (Constant output type does not support bool datatype.)
[02/25/2023-12:57:55] [TRT] [E] 2: [builder.cpp::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
Traceback (most recent call last):
  File "tools/bevformer/onnx2trt.py", line 262, in <module>
    main()
  File "tools/bevformer/onnx2trt.py", line 257, in main
    calibrator=calibrator,
  File "/root/paddlejob/workspace/env_run/BEVFormer_Tensorrt/det2trt/convert/onnx2tensorrt.py", line 63, in build_engine
    engine = runtime.deserialize_cuda_engine(plan)
TypeError: deserialize_cuda_engine(): incompatible function arguments. The following argument types are supported:
    1. (self: tensorrt.tensorrt.Runtime, serialized_engine: buffer) -> tensorrt.tensorrt.ICudaEngine

Invoked with: <tensorrt.tensorrt.Runtime object at 0x7fae1710bd30>, None
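Note: the TypeError at the bottom is a downstream symptom. The two [TRT] [E] lines show that buildSerializedNetwork() failed, so the returned plan is None, and deserialize_cuda_engine(None) then rejects the argument ("Invoked with: ..., None"). A minimal sketch, assuming the builder/network/config objects from onnx2tensorrt.py, that surfaces the real failure:

```python
# Sketch: fail fast when the build step itself fails, instead of passing
# None into deserialize_cuda_engine() and getting the confusing TypeError.
plan = builder.build_serialized_network(network, config)
if plan is None:
    raise RuntimeError("TensorRT engine build failed; see the [TRT] [E] log lines above")
engine = runtime.deserialize_cuda_engine(plan)
```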

Error executing samples/test_trt_ops.sh

Hi, I got this output when executing sh samples/test_trt_ops.sh. I noticed that I was using PyTorch 1.11, so I upgraded to PyTorch 1.12.0, but the error remains. Any ideas? Thank you!

Loaded tensorrt plugins from /home/sly/NEXT/BEVFormer_tensorrt/TensorRT/lib/libtensorrt_ops.so
WARNING: Logging before flag parsing goes to stderr.
W0424 21:00:52.183286 281473776302240 collision.py:11] No FCL -- collision checking will not work

#################### Running GridSampler2DTestCase ####################
test_fp16_bicubic_border_NoAlignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... /home/sly/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/onnx/utils.py:294: UserWarning: `add_node_names' can be set to True only when 'operator_export_type' is `ONNX`. Since 'operator_export_type' is not set to 'ONNX', `add_node_names` argument will be ignored.
  warnings.warn("`{}' can be set to True only when 'operator_export_type' is "
/home/sly/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/onnx/utils.py:294: UserWarning: `do_constant_folding' can be set to True only when 'operator_export_type' is `ONNX`. Since 'operator_export_type' is not set to 'ONNX', `do_constant_folding` argument will be ignored.
  warnings.warn("`{}' can be set to True only when 'operator_export_type' is "
/home/sly/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/nn/modules/module.py:1402: UserWarning: positional arguments and argument "destination" are deprecated. nn.Module.state_dict will not accept them in the future. Refer to https://pytorch.org/docs/master/generated/torch.nn.Module.html#torch.nn.Module.state_dict for details.
  warnings.warn(
/home/sly/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/tensorrt/__init__.py:75: DeprecationWarning: Context managers for TensorRT types are deprecated. Memory will be freed automatically when the reference count reaches 0.
  warnings.warn(
ERROR
test_fp16_bicubic_border_alignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... ERROR
test_fp16_bicubic_reflection_NoAlignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... ERROR
test_fp16_bicubic_reflection_alignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... ERROR
test_fp16_bicubic_zeros_NoAlignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... ERROR
test_fp16_bicubic_zeros_alignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... ERROR
test_fp16_bilinear_border_NoAlignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... ERROR
test_fp16_bilinear_border_alignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... ERROR
test_fp16_bilinear_reflection_NoAlignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... ERROR
test_fp16_bilinear_reflection_alignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... ERROR
(Omitted)

How to fix the ImportError?

When I run sh samples/bevformer/base/pth2onnx.sh -d 0,
I get this error:
ImportError: cannot import name 'iou3d_cuda' from partially initialized module 'third_party.bevformer.ops.iou3d' (most likely due to a circular import) (/workspace/BEVFormer_tensorrt/./third_party/bevformer/ops/iou3d/__init__.py)
Thanks.
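A common cause of this "circular import" pattern (an assumption, not confirmed for this repo) is that the compiled CUDA extension was never built, so the package's __init__ falls back into itself. A quick check, a sketch assuming the repo layout from the path above:

```python
# Sketch: verify the compiled iou3d_cuda extension (.so) actually exists.
# If nothing is printed, rebuild the ops (e.g. python setup.py build develop
# inside the third_party package) before rerunning pth2onnx.sh.
import glob

print(glob.glob("third_party/**/iou3d_cuda*.so", recursive=True))
```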

trt evaluate error

Hi, I hit an error while running sh samples/bevformer/plugin/tiny/trt_evaluate.sh.
Something seems to go wrong in context = engine.create_execution_context(): I checked, and context is None.

  File "tools/bevformer/evaluate_trt.py", line 173, in <module>
    main()
  File "tools/bevformer/evaluate_trt.py", line 119, in main
    inputs, outputs, bindings = allocate_buffers(
  File "/root/autodl-tmp/BEVFormer_tensorrt/./det2trt/utils/tensorrt.py", line 51, in allocate_buffers
    context.set_binding_shape(binding_id, dims)
AttributeError: 'NoneType' object has no attribute 'set_binding_shape'
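create_execution_context() returning None is usually reported through the TensorRT logger rather than as a Python exception. A minimal sketch, with a hypothetical engine path, that runs with a verbose logger so the reason gets printed:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.VERBOSE)   # verbose: print why creation fails
trt.init_libnvinfer_plugins(logger, "")

with open("checkpoints/tensorrt/bevformer_tiny.trt", "rb") as f:  # hypothetical path
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
assert engine is not None, "engine failed to deserialize (see log above)"

context = engine.create_execution_context()
assert context is not None, "context creation failed (often out of GPU memory)"
```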

Install project error

Hello, thanks for your great work.
When I install this project, I hit the following error during make -j$(nproc):

/root/BEVFormer_tensorrt/TensorRT/plugin/multi_scale_deformable_attn/multiScaleDeformableAttnKernel.cu(883): error: expression must be a modifiable lvalue
          detected during instantiation of "void ms_deformable_im2col_cuda_int8(const int8_4 *, float, const int32_t *, const T *, const int8_t *, float, const int8_t *, float, int, int, int, int, int, int, int, int, int8_4 *, float, cudaStream_t) [with T=float]"
(965): here

1 error detected in the compilation of "/root/BEVFormer_tensorrt/TensorRT/plugin/multi_scale_deformable_attn/multiScaleDeformableAttnKernel.cu".
CMakeFiles/tensorrt_ops.dir/build.make:159: recipe for target 'CMakeFiles/tensorrt_ops.dir/plugin/multi_scale_deformable_attn/multiScaleDeformableAttnKernel.cu.o' failed
make[2]: *** [CMakeFiles/tensorrt_ops.dir/plugin/multi_scale_deformable_attn/multiScaleDeformableAttnKernel.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....

Docker install and mmlabs fail on git clone

Something seems to be off with git clone during the docker build phase. I'm told not to expect it to work, since docker builds don't come with SSH keys and GitHub doesn't like anonymity.

I tried alternative installs for OpenMMLab, such as OpenMIM and mim install commands, but those fail as well with the timeout error below:

WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fe28d672910>: Failed to establish a new connection: [Errno 111] Connection refused'))': /openmim/

Is this something that someone else has experienced and solved?

INT8 accuracy issue for multi-scale deformable attention

Hello,
I'm curious about the accuracy of the function ms_deformable_im2col_cuda_int8():
channels /= 4;
const int value_step = num_heads * spatial_size * channels;
const int output_step = num_heads * num_query * channels;
const int points_step = num_query * points_per_group;
const int weight_step = num_heads * num_query * num_levels * num_point;
const int offset_step = weight_step * 2;

for (int batch_index = 0; batch_index < batch_size; batch_index++) {
  ms_deformable_im2col_gpu_kernel_int8<__half2>
      <<<GET_BLOCKS(num_kernels), THREADS_PER_BLOCK, 0, stream>>>(
          num_kernels, data_value, scale_value, data_spatial_shapes,
          data_reference_points, data_sampling_offsets, scale_offset,
          data_attn_weight, scale_weight, 1, spatial_size, num_heads,
          channels, num_levels, num_query, num_point, points_per_group,
          data_col, scale_out);
  data_value += value_step;
  data_col += output_step;
  data_reference_points += points_step;
  data_sampling_offsets += offset_step;
  data_attn_weight += weight_step;
}
Regarding data_value += value_step (and likewise output_step = num_heads * num_query * channels): channels has already been divided by 4 at this point, so doesn't data_value += value_step point at something other than the next batch's value data?
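For reference, the stepping arithmetic does appear consistent if data_value is a packed int8_4 pointer; a sketch with hypothetical sizes (names mirror the kernel code above):

```python
# Hypothetical sizes, purely to sanity-check the pointer stepping above.
num_heads, spatial_size, channels = 8, 2500, 256

packed_channels = channels // 4                       # channels /= 4 in the kernel
value_step = num_heads * spatial_size * packed_channels

# One batch of `value` holds num_heads * spatial_size * channels int8 values,
# i.e. a quarter as many int8_4 elements, so advancing an int8_4* pointer by
# value_step lands exactly on the start of the next batch.
assert value_step == (num_heads * spatial_size * channels) // 4
```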

Has anyone tried to deploy this great repo on a Jetson Orin? Unsupported operator Gridsampler2DTRT

First of all, thanks for this great job. I deployed this project on an x86 Ubuntu system successfully; however, I encountered lots of problems when I tried to deploy it on a Jetson Orin. For example, it showed Unsupported operator Gridsampler2DTRT while running test_trt_ops.py, even though I installed onnx 1.12.0 and torch 1.12.0 on the Orin. I noticed someone encountered a similar problem, which was solved by installing onnx 1.12 and torch 1.12 together; the difference is that I installed TensorRT 8.5.2.2 on the Orin.

I don't know whether the difference in TensorRT version could cause such an issue. By the way, I also encountered a nan_to_num plugin issue. Thanks a lot if you can help me.

When running yolox/trt_evaluate.sh on an FP32 TensorRT model, why did this error occur, and how can I fix it?

trt_model checkpoints/tensorrt/yolox_x_8x8_300e_coco_20211126_140254-1ef88d67.trt
loading annotations into memory...
Done (t=0.55s)
creating index...
index created!
[ ] 0/4952, elapsed: 0s, ETA:
Traceback (most recent call last):
  File "tools/2d/evaluate_trt.py", line 142, in <module>
    main()
  File "tools/2d/evaluate_trt.py", line 103, in main
    engine, context, input_shapes=input_shapes_, output_shapes=output_shapes_
  File "./det2trt/utils/tensorrt.py", line 49, in allocate_buffers
    context.set_binding_shape(binding_id, dims)
AttributeError: 'NoneType' object has no attribute 'set_binding_shape'

ImportError: cannot import name 'iou3d_cuda'

ImportError: cannot import name 'iou3d_cuda' from partially initialized module 'third_party.bev_mmdet3d.ops.iou3d' (most likely due to a circular import) (/workspace/BEVFormer_tensorrt/./third_party/bev_mmdet3d/ops/iou3d/__init__.py)

ScatterND error when running onnx2trt.sh with TensorRT 8.5.2

Hi, I hit the following error when executing the command sh samples/bevformer/plugin/tiny/onnx2trt.sh -d 0:

...(Omitted)
[06/11/2023-18:30:12] [TRT] [V] node_of_onnx::ScatterND_1203 [Reshape] inputs: [onnx::Reshape_1135 -> (4, 1, 6, 2500)[FLOAT]], [onnx::Reshape_1202 -> (5)[INT32]], 
[06/11/2023-18:30:12] [TRT] [V] Registering layer: node_of_onnx::ScatterND_1203 for ONNX node: node_of_onnx::ScatterND_1203
[06/11/2023-18:30:12] [TRT] [E] [graphShapeAnalyzer.cpp::analyzeShapes::1872] Error Code 4: Miscellaneous (IShuffleLayer node_of_onnx::ScatterND_1203: reshape changes volume to multiple of original volume. Reshaping [4,1,6,2500] to [1,1,6,1,1].)
ERROR: Failed to parse the ONNX file.
In node 502 (parseGraph): INVALID_NODE: Invalid Node - node_of_onnx::ScatterND_1203
[graphShapeAnalyzer.cpp::analyzeShapes::1872] Error Code 4: Miscellaneous (IShuffleLayer node_of_onnx::ScatterND_1203: reshape changes volume to multiple of original volume. Reshaping [4,1,6,2500] to [1,1,6,1,1].)

It seems that the ScatterND node triggers an invalid reshape somewhere...

My environment differs slightly from the author's recommendation: I am using TensorRT 8.5.2 instead of 8.5.1, since there is no 8.5.1 build for the Jetson Orin I am on. My PyTorch version is 1.12 and my CUDA version is 11.4.

Do you think this error is related to the TensorRT version? I found the following bug fix in the TensorRT 8.5.2 release notes: "The ONNX parser recognizes the allowzero attribute on Reshape operations for opset 5 and higher, even though the ONNX spec requires it only for opset 14 and higher. Setting this attribute to 1 can correct networks that are incorrect for empty tensors, and let TensorRT analyze the memory requirements for dynamic shapes more accurately." Could this bug fix be related to my issue?

Any help is much appreciated!
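If you want to test that theory, the allowzero attribute can be set on the exported model directly; a minimal sketch with the stock onnx package (file path hypothetical, and whether this fixes the ScatterND/Reshape failure is unverified):

```python
# Sketch: force allowzero=1 on every Reshape node, which the TensorRT 8.5.2
# release note says the ONNX parser now honors.
import onnx
from onnx import helper

model = onnx.load("bevformer_tiny_epoch_24_cp.onnx")  # hypothetical path
for node in model.graph.node:
    if node.op_type == "Reshape":
        kept = [a for a in node.attribute if a.name != "allowzero"]
        del node.attribute[:]
        node.attribute.extend(kept)
        node.attribute.append(helper.make_attribute("allowzero", 1))
onnx.save(model, "bevformer_tiny_epoch_24_cp_allowzero.onnx")
```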

torch.onnx.export of BEVFormer to ONNX fails with an unsupported aten::mul error

When I train a model from https://github.com/fundamentalvision/BEVFormer and export it to ONNX, this error occurs:
  File "/home/xjw/Desktop/cyp/bxie/BevFormer/BEVFormer/tools/onnx_utils.py", line 74, in get_onnx_model
    torch.onnx.export(
  File "/root/miniconda3/lib/python3.8/site-packages/torch/onnx/__init__.py", line 350, in export
    return utils.export(
  File "/root/miniconda3/lib/python3.8/site-packages/torch/onnx/utils.py", line 163, in export
    _export(
  File "/root/miniconda3/lib/python3.8/site-packages/torch/onnx/utils.py", line 1074, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/root/miniconda3/lib/python3.8/site-packages/torch/onnx/utils.py", line 731, in _model_to_graph
    graph = _optimize_graph(
  File "/root/miniconda3/lib/python3.8/site-packages/torch/onnx/utils.py", line 249, in _optimize_graph
    _C._jit_pass_canonicalize_graph_fuser_ops(graph)
RuntimeError: 0 INTERNAL ASSERT FAILED at "../torch/csrc/jit/ir/alias_analysis.cpp":608, please report a bug to PyTorch. We don't have an op for aten::mul but it isn't a special case. Argument types: Tensor, bool,

Converting BEVFormer to TensorRT also requires ONNX. Did you encounter this error? How did you resolve it?
Thanks!
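The assert names aten::mul with argument types Tensor, bool, i.e. somewhere the traced graph multiplies a tensor by a Python bool. A minimal illustration of the usual workaround (not taken from BEVFormer itself): make the operand an explicit number or tensor before the multiply.

```python
# Sketch: Tensor * bool is the pattern the fuser pass chokes on; an explicit
# cast keeps the graph to ordinary tensor ops during ONNX export.
import torch

x = torch.randn(2, 3)
flag = True

# y = x * flag        # can trip _jit_pass_canonicalize_graph_fuser_ops
y = x * float(flag)   # safe: Tensor * float scalar
```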

onnx2trt with nv_half and nv_half2 failed

Command:
python tools/bevformer/onnx2trt.py configs/bevformer/plugin/bevformer_tiny_trt_p.py checkpoints/onnx/bevformer_tiny_epoch_24_cp.onnx

Error (caught by faulthandler):

[02/24/2023-11:15:11] [TRT] [V] Searching for input: onnx::Expand_1007
[02/24/2023-11:15:11] [TRT] [V] Searching for input: onnx::Expand_1008
[02/24/2023-11:15:11] [TRT] [V] node_of_onnx::Expand_1009 [Expand] inputs: [onnx::Expand_1007 -> ()[INT32]], [onnx::Expand_1008 -> (0)[INT32]], 
[02/24/2023-11:15:11] [TRT] [V] Registering layer: onnx::Expand_1007 for ONNX node: onnx::Expand_1007
Fatal Python error: Segmentation fault

Current thread 0x0000ffffa6e5a9c0 (most recent call first):
  File "/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/convert/onnx2tensorrt.py", line 38 in build_engine
  File "tools/bevformer/onnx2trt.py", line 259 in main
  File "tools/bevformer/onnx2trt.py", line 271 in <module>
Segmentation fault (core dumped)

The frame File "/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/convert/onnx2tensorrt.py", line 38 in build_engine is the code that parses the ONNX model. I regenerated the ONNX model (successfully) and converted again, but it failed the same way.

Maybe the ONNX model is invalid? I checked it with onnx.checker.check_model(onnx_model), and it is indeed invalid.

But no error (only warnings) occurred during pth2onnx with nv_half or nv_half2.
The warnings below come from the command: python tools/pth2onnx.py configs/bevformer/plugin/bevformer_tiny_trt_p.py checkpoints/pytorch/bevformer_tiny_epoch_24.pth --opset_version 13 --cuda --flag cp

Loaded tensorrt plugins from /data/projects/bevformer_tensorrt/BEVFormer_tensorrt/TensorRT/lib/libtensorrt_ops.so
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./third_party/bevformer/models/detectors/mvx_two_stage.py:86: UserWarning: DeprecationWarning: pretrained is a deprecated                     key, please consider using init_cfg
  warnings.warn(
load checkpoint from local path: checkpoints/pytorch/bevformer_tiny_epoch_24.pth
/data/anaconda3/envs/bevformer_tensorrt/lib/python3.8/site-packages/torch/onnx/utils.py:294: UserWarning: `add_node_names' can be set to True only when 'operator_export_type' is `ONNX`. Since 'operator_export_type' is not set to 'ONNX', `add_node_names` argument will be ignored.
  warnings.warn("`{}' can be set to True only when 'operator_export_type' is "
/data/anaconda3/envs/bevformer_tensorrt/lib/python3.8/site-packages/torch/nn/modules/module.py:1402: UserWarning: positional arguments and argument "destination" are deprecated. nn.Module.state_dict will not accept them in the future. Refer to https://pytorch.org/docs/master/generated/torch.nn.Module.html#torch.nn.Module.state_dict for details.
  warnings.warn(
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/detector/bevformer.py:34: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  img_feats_reshaped.append(img_feat.view(B, int(BN / B), C, H, W))
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/detector/bevformer.py:40: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
  assert len(img_feats[0]) == 1
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py:15: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than tensor.new_tensor(sourceTensor).
  cx = center[0] - center[0].new_tensor(ow * 0.5)
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py:16: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than tensor.new_tensor(sourceTensor).
  cy = center[1] - center[1].new_tensor(oh * 0.5)
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/transformer.py:320: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  spatial_shapes[lvl, 0] = int(h)
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/transformer.py:321: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  spatial_shapes[lvl, 1] = int(w)
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/encoder.py:199: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  reference_points = reference_points * torch.tensor(
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/encoder.py:207: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  ).view(1, 1, 1, 3) + torch.tensor(
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/encoder.py:226: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  int(reference_points_cam.shape[3]),
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/encoder.py:600: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  spatial_shapes=torch.tensor([[bev_h, bev_w]], device=query.device),
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/encoder.py:601: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  level_start_index=torch.tensor([0], device=query.device),
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/temporal_self_attention.py:409: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert (spatial_shapes[:, 0] * spatial_shapes[:, 1]).sum() == value.shape[1]
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/temporal_self_attention.py:461: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if reference_points.shape[-1] == 2:
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/spatial_cross_attention.py:256: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  self.num_cams, -1, int(reference_points_cam.size(3)), 2
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/spatial_cross_attention.py:751: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert (spatial_shapes[:, 0] * spatial_shapes[:, 1]).sum() == value.shape[1]
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/spatial_cross_attention.py:767: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if reference_points.shape[-1] == 2:
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/transformer.py:392: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  spatial_shapes=torch.tensor([[bev_h, bev_w]], device=query.device),
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/transformer.py:393: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  level_start_index=torch.tensor([0], device=query.device),
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/decoder.py:443: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert (spatial_shapes[:, 0] * spatial_shapes[:, 1]).sum() == value.shape[1]
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/decoder.py:460: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if reference_points.shape[-1] == 2:
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/modules/decoder.py:96: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert reference_points.shape[-1] == 3
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/dense_heads/bevformer_head.py:249: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  for lvl in range(int(hs.shape[0])):
/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/models/dense_heads/bevformer_head.py:258: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert reference.shape[-1] == 3
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (repeated 13 times)
[W shape_type_inference.cpp:436] Warning: Constant folding in symbolic shape inference fails: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select) (function ComputeConstantFolding)
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (repeated 13 times)
[W shape_type_inference.cpp:436] Warning: Constant folding in symbolic shape inference fails: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select) (function ComputeConstantFolding)
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (repeated 13 times)
ONNX file has been saved in checkpoints/onnx/nv_half/bevformer_tiny_epoch_24_cp.onnx

And here is the ONNX model generated (attachment omitted).

Really hope someone could help!
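For anyone reproducing the onnx.checker step mentioned above, a minimal sketch (path assumed from the log):

```python
import onnx

model = onnx.load("checkpoints/onnx/nv_half/bevformer_tiny_epoch_24_cp.onnx")
try:
    onnx.checker.check_model(model)
    print("model is structurally valid")
except onnx.checker.ValidationError as e:
    print("model is invalid:", e)
```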

IndexError: Argument passed to at() was not in the map.

Getting the following error while converting to onnx:

File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 605, in _optimize_graph
    _C._jit_pass_peephole(graph, True)
IndexError: Argument passed to at() was not in the map.

Cannot use onnxsim

(screenshot of the onnxsim error omitted)

code:

import os
import onnx
import torch
from onnxsim import simplify

onnx_file = "checkpoints/onnx/bevformer_tiny_epoch_24_cp.onnx"
onnx_sim_path = onnx_file.replace(".onnx", "_sim.onnx")
model = onnx.load(onnx_file)
model_sim, check = simplify(model)
assert check, "Simplified ONNX model could not be validated"
onnx.save(model_sim, onnx_sim_path)

build bev_mmdet3d error: Invalid distribution name or version syntax: bev-mmdet3d--0.1

@DerryHub

Build and Install Part of Ops in MMDetection3D
cd ${PROJECT_DIR}/third_party/bev_mmdet3d
python setup.py build develop

When building bev_mmdet3d, I got an error:
Exit status: error: Invalid distribution name or version syntax: bev-mmdet3d--0.1
Then I checked setup.py and found a syntax bug on the offending line,
so I opened a pull request to solve the problem: #54
Would you help me review and merge that request?

benchmark on orin

Has anyone successfully deployed on Orin? What is the inference time like?

samples/bevformer/tiny/pth2onnx.sh error

Hi, when I run sh samples/bevformer/tiny/pth2onnx.sh -d 0, my terminal displays the following. How do I fix it? Thanks.

my env:

bev-mmdet3d 0.1
comm 0.1.4
mmcv-full 1.5.0
mmdeploy 0.14.0
mmdet 2.25.1
pytorch-quantization 2.1.2
torch 1.10.0+cu113
torchaudio 0.10.0+rocm4.1
torchvision 0.11.0+cu113
TensorRT-8.5.3.1
cuda-11.3

(bevformer) lin@PC:~/code/BEVFormer_tensorrt$ sh samples/bevformer/tiny/pth2onnx.sh -d 0
Running on the GPU: 0
WARNING: Logging before flag parsing goes to stderr.
W0908 16:44:15.548062 139772466808640 collision.py:11] No FCL -- collision checking will not work
Loaded tensorrt plugins from /home/lin/code/BEVFormer_tensorrt/TensorRT/lib/libtensorrt_ops.so
Traceback (most recent call last):
  File "tools/pth2onnx.py", line 9, in <module>
    from det2trt.convert import pytorch2onnx
  File "/home/lin/code/BEVFormer_tensorrt/./det2trt/__init__.py", line 1, in <module>
    from .models import *
  File "/home/lin/code/BEVFormer_tensorrt/./det2trt/models/__init__.py", line 1, in <module>
    from .backbones import *
  File "/home/lin/code/BEVFormer_tensorrt/./det2trt/models/backbones/__init__.py", line 1, in <module>
    from .csp_darknet import CSPDarknetQ
  File "/home/lin/code/BEVFormer_tensorrt/./det2trt/models/backbones/csp_darknet.py", line 11, in <module>
    from ..utils import CSPLayer
  File "/home/lin/code/BEVFormer_tensorrt/./det2trt/models/utils/__init__.py", line 2, in <module>
    from .onnx_ops import rotate, rotate_p, rotate_p2
  File "/home/lin/code/BEVFormer_tensorrt/./det2trt/models/utils/onnx_ops.py", line 400, in <module>
    torch.onnx.register_custom_op_symbolic("aten::grid_sampler", grid_sampler_sym, 13)
  File "/home/lin/software/miniconda3/envs/bevformer/lib/python3.8/site-packages/torch/onnx/__init__.py", line 404, in register_custom_op_symbolic
    utils.register_custom_op_symbolic(symbolic_name, symbolic_fn, opset_version)
  File "/home/lin/software/miniconda3/envs/bevformer/lib/python3.8/site-packages/torch/onnx/utils.py", line 1255, in register_custom_op_symbolic
    ns, op_name = get_ns_op_name_from_custom_op(symbolic_name)
  File "/home/lin/software/miniconda3/envs/bevformer/lib/python3.8/site-packages/torch/onnx/utils.py", line 1245, in get_ns_op_name_from_custom_op
    raise RuntimeError("Failed to register operator {}. The domain {} is already a used domain."
RuntimeError: Failed to register operator aten::grid_sampler. The domain aten is already a used domain.
(bevformer) lin@PC:~/code/BEVFormer_tensorrt$ 
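The raise comes from get_ns_op_name_from_custom_op, which in torch 1.10 (the version in the env list above) refuses to register symbolics in the aten domain; other issues in this thread report success with torch 1.12, which accepts it. A hedged sketch of the idea; grid_sampler_sym below is a hypothetical stand-in for the repo's symbolic in onnx_ops.py:

```python
# Sketch (assumption: torch >= 1.12 accepts overriding "aten::" symbolics,
# while 1.10 raises "The domain aten is already a used domain").
import torch
from packaging import version
from torch.onnx import register_custom_op_symbolic

def grid_sampler_sym(g, input, grid, mode, padding_mode, align_corners):
    # Hypothetical symbolic: the real one maps aten::grid_sampler to the
    # repo's GridSampler2DTRT plugin op with mode/padding/align attributes.
    return g.op("custom_ops::GridSampler2DTRT", input, grid)

if version.parse(torch.__version__.split("+")[0]) >= version.parse("1.12"):
    register_custom_op_symbolic("aten::grid_sampler", grid_sampler_sym, 13)
else:
    raise RuntimeError("torch >= 1.12 is needed to override aten::grid_sampler")
```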

sh samples/test_trt_ops.sh error

I followed the steps until sh samples/test_trt_ops.sh, where a lot of errors appeared (screenshots omitted).

I tried to skip this step and move on to the next one, but there it is: Tensors of type TensorImpl do not have sizes.
sh samples/bevformer/base/pth_evaluate.sh -d 0
Running on the GPU: 0
Loaded tensorrt plugins from /home/ying/data/code/BEVFormer_tensorrt/TensorRT/lib/libtensorrt_ops.so
load checkpoint from local path: checkpoints/pytorch/bevformer_r101_dcn_24ep.pth
2023-02-01 10:06:11,641 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.0.conv2 is upgraded to version 2.
WARNING: Logging before flag parsing goes to stderr.
I0201 10:06:11.641491 140030840562880 logging.py:107] ModulatedDeformConvPack img_backbone.layer3.0.conv2 is upgraded to version 2.
(Omitted: the same "ModulatedDeformConvPack ... is upgraded to version 2" message, each line also mirrored to stderr, repeats for img_backbone.layer3.1.conv2 through layer3.22.conv2 and layer4.0.conv2 through layer4.2.conv2.)
WARNING!!!!, Only can be used for obtain inference speed!!!!
[ ] 0/162, elapsed: 0s, ETA:/home/ying/anaconda3/envs/yolov5/lib/python3.8/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2894.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Traceback (most recent call last):
  File "tools/bevformer/evaluate_pth.py", line 138, in <module>
    main()
  File "tools/bevformer/evaluate_pth.py", line 103, in main
    bev_embed, outputs_classes, outputs_coords = model(
  File "/home/ying/anaconda3/envs/yolov5/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ying/data/code/BEVFormer_tensorrt/mmcv/mmcv/parallel/data_parallel.py", line 50, in forward
    return super().forward(*inputs, **kwargs)
  File "/home/ying/anaconda3/envs/yolov5/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/ying/anaconda3/envs/yolov5/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ying/data/code/BEVFormer_tensorrt/./det2trt/models/detector/bevformer.py", line 41, in forward_trt
    bev_embed, outputs_classes, outputs_coords = self.pts_bbox_head.forward_trt(
  File "/home/ying/data/code/BEVFormer_tensorrt/./det2trt/models/dense_heads/bevformer_head.py", line 134, in forward_trt
    outputs = self.transformer.forward_trt(
  File "/home/ying/data/code/BEVFormer_tensorrt/./det2trt/models/modules/transformer.py", line 196, in forward_trt
    bev_embed = self.get_bev_features_trt(
  File "/home/ying/data/code/BEVFormer_tensorrt/./det2trt/models/modules/transformer.py", line 160, in get_bev_features_trt
    bev_embed = self.encoder.forward_trt(
  File "/home/ying/data/code/BEVFormer_tensorrt/./det2trt/models/modules/encoder.py", line 136, in forward_trt
    output = layer.forward_trt(
  File "/home/ying/data/code/BEVFormer_tensorrt/./det2trt/models/modules/encoder.py", line 454, in forward_trt
    query = self.attentions[attn_index].forward_trt(
  File "/home/ying/data/code/BEVFormer_tensorrt/./det2trt/models/modules/temporal_self_attention.py", line 274, in forward_trt
    output = MultiScaleDeformableAttnFunction.apply(
  File "/home/ying/anaconda3/envs/yolov5/lib/python3.8/site-packages/torch/cuda/amp/autocast_mode.py", line 118, in decorate_fwd
    return fwd(*args, **kwargs)
  File "/home/ying/data/code/BEVFormer_tensorrt/./third_party/bevformer/models/modules/multi_scale_deformable_attn_function.py", line 142, in forward
    output = ext_module.ms_deform_attn_forward(
RuntimeError: Tensors of type TensorImpl do not have sizes

onnx2trt failed with UNSUPPORTED_NODE

Command:
python tools/bevformer/onnx2trt.py configs/bevformer/bevformer_base_trt.py checkpoints/onnx/bevformer_r101_dcn_24ep.onnx

Error:

[02/17/2023-15:44:16] [TRT] [V] Parsing node: node_of_onnx::Cast_3779 [nan_to_num]
[02/17/2023-15:44:16] [TRT] [V] Searching for input: aten::nan_to_num_3775
[02/17/2023-15:44:16] [TRT] [V] node_of_onnx::Cast_3779 [nan_to_num] inputs: [aten::nan_to_num_3775 -> (4, 1, 6, 40000, 1)[BOOL]], [optional input, not set], [optional input, not set], [optional input, not set], 
[02/17/2023-15:44:16] [TRT] [I] No importer registered for op: nan_to_num. Attempting to import as plugin.
[02/17/2023-15:44:16] [TRT] [I] Searching for plugin: nan_to_num, plugin_version: 1, plugin_namespace: 
ERROR: Failed to parse the ONNX file.
In node 2599 (importFallbackPluginImporter): UNSUPPORTED_NODE: Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"

Hi, thanks for your great job!
Unfortunately, I got the above error. Could it be due to an installation problem?
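"Plugin not found" at parse time means the nan_to_num creator was not in the plugin registry of the process doing the conversion. A quick check, a sketch assuming the plugin library path printed by the repo's scripts:

```python
# Sketch: load the custom plugin library and confirm nan_to_num is registered
# before parsing the ONNX file.
import ctypes
import tensorrt as trt

ctypes.CDLL("TensorRT/lib/libtensorrt_ops.so")  # path assumed from the logs
logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, "")

creators = [c.name for c in trt.get_plugin_registry().plugin_creator_list]
print("nan_to_num registered:", "nan_to_num" in creators)
```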

TypeError: deserialize_cuda_engine(): incompatible function arguments. The following argument types are supported: 1. (self: tensorrt.tensorrt.Runtime, serialized_engine: buffer) -> tensorrt.tensorrt.ICudaEngine

Hi Derry,

I got an error when trying to run the scripts samples/bevformer/base/onnx2trt.sh and samples/bevformer/plugin/base/onnx2trt_fp16.sh to convert the ONNX model to a TRT engine. The error message is:

File "/home/tianyu/BEVFormer_tensorrt/./det2trt/convert/onnx2tensorrt.py", line 60, in build_engine
    engine = runtime.deserialize_cuda_engine(plan)
TypeError: deserialize_cuda_engine(): incompatible function arguments. The following argument types are supported:
    1. (self: tensorrt.tensorrt.Runtime, serialized_engine: buffer) -> tensorrt.tensorrt.ICudaEngine

I would really appreciate it if you could give me some support!

Thank you

QAT model

Can you provide a QAT (quantization-aware training) model? Thanks.

No importer registered for op: nan_to_num. Attempting to import as plugin.

When I convert on an Orin, I get the following error. My torch version is 1.12.
[03/31/2023-10:54:43] [TRT] [V] Searching for input: onnx::Greater_2181
[03/31/2023-10:54:43] [TRT] [V] node_of_onnx::And_2182 [Greater] inputs: [onnx::Greater_2180 -> (4, 1, 6, 2500, 1)[FLOAT]], [onnx::Greater_2181 -> ()[FLOAT]],
[03/31/2023-10:54:43] [TRT] [V] Registering layer: onnx::Greater_2181 for ONNX node: onnx::Greater_2181
[03/31/2023-10:54:43] [TRT] [V] Registering layer: node_of_onnx::And_2182 for ONNX node: node_of_onnx::And_2182
[03/31/2023-10:54:43] [TRT] [V] Registering tensor: onnx::And_2182 for ONNX tensor: onnx::And_2182
[03/31/2023-10:54:43] [TRT] [V] node_of_onnx::And_2182 [Greater] outputs: [onnx::And_2182 -> (4, 1, 6, 2500, 1)[BOOL]],
[03/31/2023-10:54:43] [TRT] [V] Parsing node: node_of_aten::nan_to_num_2183 [And]
[03/31/2023-10:54:43] [TRT] [V] Searching for input: onnx::And_2175
[03/31/2023-10:54:43] [TRT] [V] Searching for input: onnx::And_2182
[03/31/2023-10:54:43] [TRT] [V] node_of_aten::nan_to_num_2183 [And] inputs: [onnx::And_2175 -> (4, 1, 6, 2500, 1)[BOOL]], [onnx::And_2182 -> (4, 1, 6, 2500, 1)[BOOL]],
[03/31/2023-10:54:43] [TRT] [V] Registering layer: node_of_aten::nan_to_num_2183 for ONNX node: node_of_aten::nan_to_num_2183
[03/31/2023-10:54:43] [TRT] [V] Registering tensor: aten::nan_to_num_2183 for ONNX tensor: aten::nan_to_num_2183
[03/31/2023-10:54:43] [TRT] [V] node_of_aten::nan_to_num_2183 [And] outputs: [aten::nan_to_num_2183 -> (4, 1, 6, 2500, 1)[BOOL]],
[03/31/2023-10:54:43] [TRT] [V] Parsing node: node_of_onnx::Cast_2187 [nan_to_num]
[03/31/2023-10:54:43] [TRT] [V] Searching for input: aten::nan_to_num_2183
[03/31/2023-10:54:43] [TRT] [V] node_of_onnx::Cast_2187 [nan_to_num] inputs: [aten::nan_to_num_2183 -> (4, 1, 6, 2500, 1)[BOOL]], [optional input, not set], [optional input, not set], [optional input, not set],
[03/31/2023-10:54:43] [TRT] [I] No importer registered for op: nan_to_num. Attempting to import as plugin.
[03/31/2023-10:54:43] [TRT] [I] Searching for plugin: nan_to_num, plugin_version: 1, plugin_namespace:
ERROR: Failed to parse the ONNX file.
In node 1482 (importFallbackPluginImporter): UNSUPPORTED_NODE: Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"

deserialize_cuda_engine(): incompatible function arguments

[04/18/2023-10:51:12] [TRT] [I] MatMul_2931: broadcasting input1 to make tensors conform, dims(input0)=[900,1,256][NONE] dims(input1)=[1,256,256][NONE].
[04/18/2023-10:51:12] [TRT] [I] MatMul_2935: broadcasting input1 to make tensors conform, dims(input0)=[900,1,256][NONE] dims(input1)=[1,256,10][NONE].
[04/18/2023-10:51:13] [TRT] [E] 2: [sliceNode.cpp::symbolicExecute::168] Error Code 2: Internal Error (Assertion input.rank() == 1 failed. )
[04/18/2023-10:51:13] [TRT] [E] 2: [builder.cpp::buildSerializedNetwork::636] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
Traceback (most recent call last):
  File "tools/bevformer/onnx2trt.py", line 255, in <module>
    main()
  File "tools/bevformer/onnx2trt.py", line 243, in main
    build_engine(
  File "/data/project/BEVFormer_tensorrt-main/./det2trt/convert/onnx2tensorrt.py", line 63, in build_engine
    engine = runtime.deserialize_cuda_engine(plan)
TypeError: deserialize_cuda_engine(): incompatible function arguments. The following argument types are supported:
    1. (self: tensorrt.tensorrt.Runtime, serialized_engine: buffer) -> tensorrt.tensorrt.ICudaEngine

Invoked with: <tensorrt.tensorrt.Runtime object at 0x7ff34e8d5230>, None

Dockerfile half updated?

The Dockerfile seems to install a mix of dependencies across OpenMMLab, NVIDIA, PyTorch, and TensorRT, and I'm struggling to reconcile them.

I tried mmcv==1.5.0, but it forces mmcv-full==1.7.0, which fails the unit tests. Finding the right CUDA version that also works with TensorRT is also a struggle, as there's a lot of overlap.

Is there a way to reconcile these?

AssertionError: Samples in split doesn't match samples in predictions.

Dear author,
Thanks for your great work.
Since I am restricted by memory, I have to run this project on nuScenes v1.0-mini, and then the following error occurred:
it indicates that the prediction contains 162 sample tokens, while the annotation contains only 81.
Could you please give me some suggestions to solve this problem? Thank you very much!

sh samples/bevformer/base/pth_evaluate.sh -d 0
Running on the GPU: 0
Loaded tensorrt plugins from /workspace/BEVFormer_tensorrt/TensorRT/lib/libtensorrt_ops.so
load checkpoint from local path: checkpoints/pytorch/bevformer_r101_dcn_24ep.pth

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 162/162, 2.1 task/s, elapsed: 79s, ETA: 0s
Formating bboxes of pts_bbox
Start to convert detection format...
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 162/162, 20.3 task/s, elapsed: 8s, ETA: 0s
Results writes to work_dirs/results_eval/pts_bbox/results_nusc.json
Evaluating bboxes of pts_bbox

Loading NuScenes tables for version v1.0-mini...
23 category,
8 attribute,
4 visibility,
911 instance,
12 sensor,
120 calibrated_sensor,
31206 ego_pose,
8 log,
10 scene,
404 sample,
31206 sample_data,
18538 sample_annotation,
4 map,
Done loading in 0.508 seconds.

Reverse indexing ...
Done reverse indexing in 0.1 seconds.

Initializing nuScenes detection evaluation
Loaded results from work_dirs/results_eval/pts_bbox/results_nusc.json. Found detections for 162 samples.
Loading annotations for mini_val split from nuScenes version: v1.0-mini
100%|██████████| 81/81 [00:00<00:00, 327.86it/s]
Loaded ground truth annotations for 81 samples.
Traceback (most recent call last):
File "/workspace/BEVFormer_tensorrt/tools/bevformer/evaluate_pth.py", line 138, in
main()
File "/workspace/BEVFormer_tensorrt/tools/bevformer/evaluate_pth.py", line 122, in main
metric = dataset.evaluate(results, jsonfile_prefix='work_dirs/results_eval') #modified by bocheng.hu
File "/workspace/BEVFormer_tensorrt/./third_party/bevformer/datasets/nuscenes_dataset.py", line 522, in evaluate
ret_dict = self._evaluate_single(result_files[name])
File "/workspace/BEVFormer_tensorrt/./third_party/bevformer/datasets/nuscenes_dataset.py", line 869, in _evaluate_single
self.nusc_eval = NuScenesEval_custom(
File "/workspace/BEVFormer_tensorrt/./third_party/bevformer/datasets/nuscenes_eval.py", line 672, in init
assert set(self.pred_boxes.sample_tokens) == set(
AssertionError: Samples in split doesn't match samples in predictions.
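
A likely cause: the validation info file used for inference contains sample tokens beyond the mini_val split, so the model predicts 162 samples while the evaluator loads ground truth for the 81 mini_val samples only. Regenerating the nuScenes info files for v1.0-mini is the cleaner fix; as a post-hoc workaround, here is a minimal sketch (paths and file names are assumptions) that filters the submission down to mini_val before evaluation:

import json
from nuscenes.nuscenes import NuScenes
from nuscenes.utils.splits import create_splits_scenes

# Assumed paths, matching the log above.
nusc = NuScenes(version="v1.0-mini", dataroot="data/nuscenes", verbose=False)
val_scenes = set(create_splits_scenes()["mini_val"])

with open("work_dirs/results_eval/pts_bbox/results_nusc.json") as f:
    submission = json.load(f)

def in_mini_val(sample_token):
    # Map sample token -> scene name, keep only mini_val scenes.
    scene_token = nusc.get("sample", sample_token)["scene_token"]
    return nusc.get("scene", scene_token)["name"] in val_scenes

submission["results"] = {
    token: boxes
    for token, boxes in submission["results"].items()
    if in_mini_val(token)
}

with open("work_dirs/results_eval/pts_bbox/results_nusc_mini_val.json", "w") as f:
    json.dump(submission, f)

The filtered file can then be fed to the nuScenes evaluator in place of the original results_nusc.json.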

A RuntimeError was encountered while converting the model to ONNX

Hello,
When I run sh samples/bevformer/base/pth2onnx.sh -d 0, I get a runtime error:

RuntimeError: CUDA out of memory. Tried to allocate 236.00 MiB (GPU 0; 9.78 GiB total capacity; 8.04 GiB already allocated; 82.31 MiB free; 8.44 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

It looks like there should be enough memory; I'm using an RTX 3080.
Do I need to switch to a graphics card with more memory?
Thanks.
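
For the allocator hint in the error message, PYTORCH_CUDA_ALLOC_CONF has to be set before PyTorch allocates any CUDA memory; a minimal sketch (the 128 MiB cap is an arbitrary starting value, not a repo recommendation):

import os

# Cap the caching allocator's split size to reduce fragmentation; this
# must happen before the first CUDA allocation, so set it before any
# CUDA work (ideally before importing torch at all).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # noqa: E402

The same can be done in the shell, e.g. PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 sh samples/bevformer/base/pth2onnx.sh -d 0. If the export still runs out of memory, the base config may simply need more than the 3080's 10 GB; the tiny config is a lighter alternative.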

RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

#################### Running RotateTestCase ####################
test_fp16_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase) ... /home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py:15: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than tensor.new_tensor(sourceTensor).
center[0] -= center[0].new_tensor(ow * 0.5)
/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py:16: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than tensor.new_tensor(sourceTensor).
center[1] -= center[1].new_tensor(oh * 0.5)
Warning: Unsupported operator RotateTRT. No schema registered for this operator.
Warning: Unsupported operator RotateTRT. No schema registered for this operator.
Warning: Unsupported operator RotateTRT. No schema registered for this operator.
FAIL
test_fp16_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase) ... Warning: Unsupported operator RotateTRT. No schema registered for this operator.
Warning: Unsupported operator RotateTRT. No schema registered for this operator.
Warning: Unsupported operator RotateTRT. No schema registered for this operator.
ERROR
test_fp32_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase) ... ERROR
test_fp32_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase) ... ERROR

#################### Running RotateTestCase2 ####################
test_fp16_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2) ... ERROR
test_fp16_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2) ... ERROR
test_fp32_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2) ... ERROR
test_fp32_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2) ... ERROR

======================================================================
ERROR: test_fp16_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase)

Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 57, in setUp
self.buildEngine(opset_version=13)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 87, in buildEngine
engine = pth2trt(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 85, in pth2trt
engine = build_engine(f, fp16=fp16)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 67, in build_engine
engine = runtime.deserialize_cuda_engine(plan)
TypeError: deserialize_cuda_engine(): incompatible function arguments. The following argument types are supported:
1. (self: tensorrt.tensorrt.Runtime, serialized_engine: buffer) -> tensorrt.tensorrt.ICudaEngine

Invoked with: <tensorrt.tensorrt.Runtime object at 0x7f527eff8df0>, None

======================================================================
ERROR: test_fp32_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase)

Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 57, in setUp
self.buildEngine(opset_version=13)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 95, in buildEngine
engine = pth2trt(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt
torch.onnx.export(
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/init.py", line 305, in export
return utils.export(model, args, f, export_params, verbose, training,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export
_model_to_graph(model, args, verbose, input_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 118, in wrapper
outs.append(self.inner(*trace_inputs))
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1098, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 32, in forward
output = self.module(*inputs, **self.kwargs)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py", line 116, in rotate
return _rotate(img, angle, center, _MODE[interpolation])
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py", line 66, in forward
img = torch.grid_sampler(img, grid, interpolation, 0, False)
RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.02 GiB already allocated; 1.38 GiB free; 8.02 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

======================================================================
ERROR: test_fp32_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase)

Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 57, in setUp
self.buildEngine(opset_version=13)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 95, in buildEngine
engine = pth2trt(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt
torch.onnx.export(
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/init.py", line 305, in export
return utils.export(model, args, f, export_params, verbose, training,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export
_model_to_graph(model, args, verbose, input_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 118, in wrapper
outs.append(self.inner(*trace_inputs))
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1098, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 32, in forward
output = self.module(*inputs, **self.kwargs)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py", line 116, in rotate
return _rotate(img, angle, center, _MODE[interpolation])
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py", line 66, in forward
img = torch.grid_sampler(img, grid, interpolation, 0, False)
RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.02 GiB already allocated; 1.38 GiB free; 8.02 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

======================================================================
ERROR: test_fp16_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2)

Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 142, in setUp
self.buildEngine(opset_version=13)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 87, in buildEngine
engine = pth2trt(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt
torch.onnx.export(
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/init.py", line 305, in export
return utils.export(model, args, f, export_params, verbose, training,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export
_model_to_graph(model, args, verbose, input_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in wrapper
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
RuntimeError: CUDA out of memory. Tried to allocate 978.00 MiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

======================================================================
ERROR: test_fp16_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2)

Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 129, in setUp
BaseTestCase.__init__(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 33, in __init__
self.createInputs()
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 64, in createInputs
self.inputs_pth_fp16 = {key: val.half() for key, val in inputs_pth.items()}
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 64, in
self.inputs_pth_fp16 = {key: val.half() for key, val in inputs_pth.items()}
RuntimeError: CUDA out of memory. Tried to allocate 978.00 MiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

======================================================================
ERROR: test_fp32_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2)

Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 142, in setUp
self.buildEngine(opset_version=13)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 95, in buildEngine
engine = pth2trt(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt
torch.onnx.export(
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/init.py", line 305, in export
return utils.export(model, args, f, export_params, verbose, training,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export
_model_to_graph(model, args, verbose, input_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in wrapper
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

======================================================================
ERROR: test_fp32_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2)

Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 142, in setUp
self.buildEngine(opset_version=13)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 95, in buildEngine
engine = pth2trt(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt
torch.onnx.export(
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/init.py", line 305, in export
return utils.export(model, args, f, export_params, verbose, training,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export
_model_to_graph(model, args, verbose, input_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in wrapper
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

======================================================================
FAIL: test_fp32 (det2trt.models.utils.test_trt_ops.test_modulated_deformable_conv2d.ModulatedDeformableConv2dTestCase)

Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_modulated_deformable_conv2d.py", line 83, in test_fp32
self.fp32_case()
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_modulated_deformable_conv2d.py", line 72, in fp32_case
self.assertLessEqual(cost, delta)
AssertionError: 0.0036153677 not less than or equal to 1e-05

======================================================================
FAIL: test_fp32 (det2trt.models.utils.test_trt_ops.test_modulated_deformable_conv2d.ModulatedDeformableConv2dTestCase2)

Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_modulated_deformable_conv2d.py", line 157, in test_fp32
self.fp32_case()
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_modulated_deformable_conv2d.py", line 146, in fp32_case
self.assertLessEqual(cost, delta)
AssertionError: 0.0036153677 not less than or equal to 1e-05

======================================================================
FAIL: test_fp16_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase)

Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 91, in test_fp16_bilinear
self.fp16_case(0.01)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 82, in fp16_case
self.assertLessEqual(cost, delta)
AssertionError: 0.258 not less than or equal to 0.01


Ran 136 tests in 381.644s

FAILED (failures=3, errors=7)

When running RotateTestCase, it fails with a CUDA out-of-memory error (insufficient GPU memory).

Can't find BEVFormer_ptq_max model

Hi, thanks for your great work.
Could you tell me where I can download bevformer_r101_dcn_24ep_ptq_max.pth, which is needed to convert the QDQ model to ONNX and a TensorRT engine?

No module named 'pytorch_quantization'

When I run sh samples/test_trt_ops.sh, I get the error "No module named 'pytorch_quantization'". Did I miss some steps?

Here is the log:

Running on the GPU: 0
Traceback (most recent call last):
File "tools/test_trt_ops.py", line 6, in <module>
import det2trt
File "/media/13data2/xzz/code/tensorRT/BEVFormer_tensorrt/./det2trt/__init__.py", line 1, in <module>
from .models import *
File "/media/13data2/xzz/code/tensorRT/BEVFormer_tensorrt/./det2trt/models/__init__.py", line 1, in <module>
from .backbones import *
File "/media/13data2/xzz/code/tensorRT/BEVFormer_tensorrt/./det2trt/models/backbones/__init__.py", line 1, in <module>
from .csp_darknet import CSPDarknetQ
File "/media/13data2/xzz/code/tensorRT/BEVFormer_tensorrt/./det2trt/models/backbones/csp_darknet.py", line 11, in <module>
from ..utils import CSPLayer
File "/media/13data2/xzz/code/tensorRT/BEVFormer_tensorrt/./det2trt/models/utils/__init__.py", line 1, in <module>
from .scp_layer import CSPLayer
File "/media/13data2/xzz/code/tensorRT/BEVFormer_tensorrt/./det2trt/models/utils/scp_layer.py", line 6, in <module>
from pytorch_quantization import nn as quant_nn
ModuleNotFoundError: No module named 'pytorch_quantization'
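
For context: pytorch_quantization is NVIDIA's pytorch-quantization package, which ships separately from the tensorrt package itself. Assuming a standard pip setup, it can usually be installed with pip install pytorch-quantization --extra-index-url https://pypi.ngc.nvidia.com, with the exact version depending on your CUDA and PyTorch versions.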

SpatialCrossAttentionTRT doesn't select valid queries; is there no CUDA memory problem here?

Thanks for this great repo!
I attempted to implement the original BEVFormer in TensorRT, but ran into issues with the SpatialCrossAttention module, since some of its operations are not well supported in TensorRT. To address this, I created a no-filter version of the module. However, this implementation resulted in significantly higher CUDA memory usage than the original version. I'm wondering whether using the SpatialCrossAttentionTRT module would help alleviate this issue.

Best Regards
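
For intuition on the memory question, here is a minimal, self-contained sketch (all sizes and the visibility mask are made up) contrasting the two styles: the original module gathers only the queries each camera sees, which gives data-dependent shapes TensorRT cannot express, while a no-filter variant computes every (camera, query) pair with static shapes and masks afterwards:

import torch

# Hypothetical sizes: BEV queries, cameras, embedding dim.
num_query, num_cam, dim = 2500, 6, 256
queries = torch.randn(num_query, dim)
# Which queries each camera actually sees (data-dependent in practice).
visible = torch.rand(num_cam, num_query) > 0.7

# Filtered style (original BEVFormer): gather only the visible queries
# per camera. Memory tracks the number of hits, but the output shape
# depends on the data, which a static TensorRT graph cannot represent.
per_cam = [queries[visible[c]] for c in range(num_cam)]

# No-filter style (TRT-friendly): process all queries for every camera
# and zero out the invisible ones. The shape is static, but peak memory
# is always num_cam * num_query * dim, regardless of visibility.
dense = queries.unsqueeze(0).expand(num_cam, -1, -1) * visible.unsqueeze(-1)

So a no-filter module trades the gather for static shapes rather than eliminating the cost; it helps only if that dense intermediate fits in memory.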
