I compared the run time of an uncompiled and a compiled TensorFlow object detection model. Unlike the example ResNet model (which showed about a 6x improvement over CPU run time), there is no improvement. I think the following compile messages probably explain why: "Number of operations placed on Neuron runtime: 0". The question is why no op is placed on Neuron. Because of that, the whole graph ran on the CPU, hence no difference.
WARNING:tensorflow:subgraph neuron_op_3830212827f60cb5, tensor pred_sbbox/range0/_0:0: invalid shape (?,)
WARNING:tensorflow:Not fusing subgraph neuron_op_3830212827f60cb5: --io-config error
WARNING:tensorflow:subgraph neuron_op_6e11ec1372c1bd87, tensor upsample1/ResizeNearestNeighbor0/_6:0: invalid shape (?, ?, ?, 128)
WARNING:tensorflow:Not fusing subgraph neuron_op_6e11ec1372c1bd87: --io-config error
WARNING:tensorflow:subgraph neuron_op_a64bf54893aa3612, tensor upsample0/ResizeNearestNeighbor0/_9:0: invalid shape (?, ?, ?, 256)
WARNING:tensorflow:Not fusing subgraph neuron_op_a64bf54893aa3612: --io-config error
WARNING:tensorflow:subgraph neuron_op_6282ae96789270ea, tensor pred_mbbox/range0/_11:0: invalid shape (?,)
WARNING:tensorflow:Not fusing subgraph neuron_op_6282ae96789270ea: --io-config error
WARNING:tensorflow:subgraph neuron_op_7a2c5493ed9a8729, tensor pred_lbbox/range0/_17:0: invalid shape (?,)
WARNING:tensorflow:Not fusing subgraph neuron_op_7a2c5493ed9a8729: --io-config error
INFO:tensorflow:fusing subgraph neuron_op_6ec953b285e9ba28 with neuron-cc
WARNING:tensorflow:Failed to fuse subgraph neuron_op_6ec953b285e9ba28 with '/home/ubuntu/anaconda3/envs/aws_neuron_tensorflow_p36/bin/neuron-cc compile /tmp/tmpqybqolf_/neuron_op_6ec953b285e9ba28/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpqybqolf_/neuron_op_6ec953b285e9ba28/graph_def.neff --io-config "{\"inputs\": {\"input/input_data0/_8:0\": [[1, 416, 416, 3], \"float32\"]}, \"outputs\": [\"darknet/residual10/add:0\", \"darknet/residual18/add:0\", \"conv_lbbox/BiasAdd:0\", \"conv57/LeakyRelu:0\", \"upsample0/ResizeNearestNeighbor/size:0\", \"pred_lbbox/strided_slice:0\", \"pred_lbbox/strided_slice_1:0\"]}"'
INFO:tensorflow:Number of operations in TensorFlow session: 3290
INFO:tensorflow:Number of operations after tf.neuron optimizations: 914
INFO:tensorflow:Number of operations placed on Neuron runtime: 0
INFO:tensorflow:Successfully converted
For comparison, the compile output for the ResNet50 model is as follows.
INFO:tensorflow:fusing subgraph neuron_op_d6f098c01c780733 with neuron-cc
INFO:tensorflow:Number of operations in TensorFlow session: 4638
INFO:tensorflow:Number of operations after tf.neuron optimizations: 556
INFO:tensorflow:Number of operations placed on Neuron runtime: 554
INFO:tensorflow:Successfully converted
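To make the two compile outputs easier to compare, here is a minimal sketch that summarizes a captured log. It assumes only the line formats quoted above; the function name `summarize_compile_log` and the dictionary keys are hypothetical and not part of tensorflow-neuron. It collects the op counts, the tensors reported with an "invalid shape" (the ones containing `?` dimensions), and the subgraphs that neuron-cc failed to fuse.

```python
import re

# Patterns are based only on the tf.neuron log lines quoted in this post.
AFTER_OPT = re.compile(r"Number of operations after tf\.neuron optimizations: (\d+)")
PLACED = re.compile(r"Number of operations placed on Neuron runtime: (\d+)")
INVALID_SHAPE = re.compile(
    r"subgraph (neuron_op_\w+), tensor (\S+): invalid shape (\(.*\))"
)
FAILED = re.compile(r"Failed to fuse subgraph (neuron_op_\w+)")


def summarize_compile_log(log_text):
    """Summarize a captured compile log (hypothetical helper, not a Neuron API)."""
    summary = {
        "after_opt": None,       # ops remaining after tf.neuron optimizations
        "placed": None,          # ops placed on the Neuron runtime
        "invalid_shapes": [],    # (subgraph, tensor, shape) from the warnings
        "failed_subgraphs": [],  # subgraphs neuron-cc failed to fuse
    }
    for line in log_text.splitlines():
        match = AFTER_OPT.search(line)
        if match:
            summary["after_opt"] = int(match.group(1))
            continue
        match = PLACED.search(line)
        if match:
            summary["placed"] = int(match.group(1))
            continue
        match = INVALID_SHAPE.search(line)
        if match:
            summary["invalid_shapes"].append(match.groups())
            continue
        match = FAILED.search(line)
        if match:
            summary["failed_subgraphs"].append(match.group(1))
    return summary
```

Running it on the object detection log above should list the five tensors with unknown dimensions and the one subgraph that failed to fuse, next to the count of 0 ops placed on Neuron. The log does not say why that last subgraph failed, so that part still needs an answer.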