Hi, I have optimized the faster_rcnn_inception_v2 model as TF-TRT INT8 with NMS enabled (the NMS ops placed on the CPU) using TF 1.5.2 and the script at https://github.com/tensorflow/tensorrt/tree/r1.14+/tftrt/examples/object_detection. I got the following performance with NMS enabled vs. NMS disabled:
TF-TRT-INT8 (NMS enabled): ~96 FPS
TF-TRT-INT8 (no NMS): ~43 FPS
The model was optimized with batch_size=8, image_shape=[600, 600], and minimum_segment_size=50. For the DS-Triton deployment, max_batch_size=8.
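For context, the Triton config.pbtxt for this deployment would look roughly like the fragment below. This is a sketch: the input tensor name image_tensor and TYPE_UINT8 are assumptions based on a standard TF Object Detection API export, and the per-sample dims follow the [8,600,1024,3] shape that appears in the log.

```
name: "faster_rcnn_inception_v2"
platform: "tensorflow_graphdef"
max_batch_size: 8
input [
  {
    name: "image_tensor"   # assumed: default input of a TF OD API export
    data_type: TYPE_UINT8
    dims: [ 600, 1024, 3 ] # per-sample shape; batch dim handled by max_batch_size
  }
]
```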
The issue: when deploying the model to DeepStream-Triton, I get the error "Input shape axis 0 must equal 8, got shape [5,600,1024,3]" (even though the model was optimized with batch_size=8). The full log:
I0112 01:06:22.313573 2643 model_repository_manager.cc:837] successfully loaded 'faster_rcnn_inception_v2' version 13
INFO: infer_trtis_backend.cpp:206 TrtISBackend id:1 initialized model: faster_rcnn_inception_v2
2021-01-12 01:06:36.202139: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:733] Building a new TensorRT engine for TRTEngineOp_0 input shapes: [[8,600,1024,3]]
2021-01-12 01:06:36.202311: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.7
2021-01-12 01:06:36.203128: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.7
2021-01-12 01:09:20.678239: W tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:37] DefaultLogger Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
2021-01-12 01:09:20.709545: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:733] Building a new TensorRT engine for TRTEngineOp_1 input shapes: [[800,14,14,576]]
2021-01-12 01:10:01.273658: W tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:37] DefaultLogger Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
Runtime commands:
h: Print this help
q: Quit
p: Pause
r: Resume
**PERF: FPS 0 (Avg)
**PERF: 0.00 (0.00)
** INFO: <bus_callback:181>: Pipeline ready
** INFO: <bus_callback:167>: Pipeline running
ERROR: infer_trtis_server.cpp:276 TRTIS: failed to get response status, trtis_err_str:INTERNAL, err_msg:2 root error(s) found.
(0) Invalid argument: Input shape axis 0 must equal 8, got shape [5,600,1024,3]
[[{{node Preprocessor/unstack}}]]
(1) Invalid argument: Input shape axis 0 must equal 8, got shape [5,600,1024,3]
[[{{node Preprocessor/unstack}}]]
[[ExpandDims_4/_199]]
0 successful operations.
0 derived errors ignored.
ERROR: infer_trtis_backend.cpp:532 TRTIS server failed to parse response with request-id:1 model:
0:03:46.539871495 2643 0x7f0cf80022a0 WARN nvinferserver gstnvinferserver.cpp:519:gst_nvinfer_server_push_buffer:<primary_gie> error: inference failed with unique-id:1
ERROR from primary_gie: inference failed with unique-id:1
Debug info: gstnvinferserver.cpp(519): gst_nvinfer_server_push_buffer (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInferServer:primary_gie
Quitting
ERROR: infer_trtis_server.cpp:276 TRTIS: failed to get response status, trtis_err_str:INTERNAL, err_msg:2 root error(s) found.
(0) Invalid argument: Input shape axis 0 must equal 8, got shape [5,600,1024,3]
[[{{node Preprocessor/unstack}}]]
(1) Invalid argument: Input shape axis 0 must equal 8, got shape [5,600,1024,3]
[[{{node Preprocessor/unstack}}]]
[[ExpandDims_4/_199]]
0 successful operations.
0 derived errors ignored.
ERROR: infer_trtis_backend.cpp:532 TRTIS server failed to parse response with request-id:2 model:
ERROR from qtdemux0: Internal data stream error.
Debug info: qtdemux.c(6073): gst_qtdemux_loop (): /GstPipeline:pipeline/GstBin:multi_src_bin/GstBin:src_sub_bin0/GstURIDecodeBin:src_elem/GstDecodeBin:decodebin0/GstQTDemux:qtdemux0:
streaming stopped, reason custom-error (-112)
I0112 01:10:01.644682 2643 model_repository_manager.cc:708] unloading: faster_rcnn_inception_v2:13
I0112 01:10:01.917792 2643 model_repository_manager.cc:816] successfully unloaded 'faster_rcnn_inception_v2' version 13
I0112 01:10:01.918447 2643 server.cc:179] Waiting for in-flight inferences to complete.
I0112 01:10:01.918460 2643 server.cc:194] Timeout 30: Found 0 live models and 0 in-flight requests
App run failed
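My reading of the error: the shape [5,600,1024,3] suggests the final batch of the video stream contains only 5 frames, while the Preprocessor/unstack node baked into the frozen graph requires a batch dimension of exactly 8. A minimal sketch of one possible client-side workaround, padding a partial batch up to the fixed size and dropping the extra results afterwards (this is my assumption about a workaround, not something DeepStream's nvinferserver is documented to do):

```python
import numpy as np

BATCH_SIZE = 8  # fixed batch dimension baked into the optimized graph


def pad_batch(frames, batch_size=BATCH_SIZE):
    """Pad a partial batch (e.g. the last 5 frames of a finite stream)
    up to the fixed batch size by repeating the final frame.

    Returns the padded batch and the count of real frames, so the
    detections produced for the padding frames can be discarded.
    """
    real = frames.shape[0]
    if real == batch_size:
        return frames, real
    pad = np.repeat(frames[-1:], batch_size - real, axis=0)
    return np.concatenate([frames, pad], axis=0), real


# A partial batch shaped like the failing request, [5, 600, 1024, 3]
partial = np.zeros((5, 600, 1024, 3), dtype=np.uint8)
padded, real = pad_batch(partial)
print(padded.shape)  # (8, 600, 1024, 3)
print(real)          # 5
```

If this is indeed the cause, the alternative would be re-optimizing the graph with a dynamic batch dimension so partial final batches are accepted as-is.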