
Comments (13)

MarvinTeichmann commented on June 30, 2024

As mentioned earlier, demo.py was not designed as evaluation code and is very slow. demo.py is meant as a way to understand how the code works; evaluate.py is meant to be used for evaluation.

However, people seem to love demo.py (#30, #41, #54). If you don't want to mess around with the evaluation code, modify demo.py to perform evaluation of images in a loop like this:

# Fetch the softmax tensor once; only sess.run repeats per image.
softmax = prediction['softmax']
for image in images:
    feed = {image_pl: image}
    output = sess.run([softmax], feed_dict=feed)

This avoids building the entire tensorflow graph for each image. It is still not perfect, but way faster than calling the whole demo.py script once per image.

If you would like to measure running time, keep in mind that tensorflow compiles the graph and allocates memory on the first run, so don't measure the time it takes for the first image. See the comment here.
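
For context, here is a minimal sketch of where that loop sits, assuming demo.py's variable names (image_pl, prediction, sess) and a hypothetical list of image paths; the point is that the graph is built and the weights restored once, outside the loop:

import scipy.misc

# ... build the graph and load the checkpoint once, exactly as demo.py does ...
# This yields image_pl (the input placeholder), prediction (the output dict)
# and sess (the live session).

image_files = ['data/demo.png']  # hypothetical: your own list of paths
images = [scipy.misc.imread(f) for f in image_files]

softmax = prediction['softmax']  # fetch the tensor once
for image in images:
    feed = {image_pl: image}
    output = sess.run([softmax], feed_dict=feed)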

villanuevab commented on June 30, 2024

@lukaspistelak the code that @MarvinTeichmann posted, i.e.,

# Fetch the softmax tensor once; only sess.run repeats per image.
softmax = prediction['softmax']
for image in images:
    feed = {image_pl: image}
    output = sess.run([softmax], feed_dict=feed)

is for KittiSeg's demo.py. For KittiBox, try:

# Fetch the KittiBox output tensors once; only sess.run repeats per image.
pred_boxes = prediction['pred_boxes_new']
pred_confidences = prediction['pred_confidences']
for image in images:
    feed = {image_pl: image}

    # Run the KittiBox model on the image
    (np_pred_boxes, np_pred_confidences) = sess.run(
        [pred_boxes, pred_confidences], feed_dict=feed)

Hope this helps.

MarvinTeichmann commented on June 30, 2024

No, sorry. I did not find the time to work on this. For a good start, use the loop I suggested in the comment above.

MarvinTeichmann commented on June 30, 2024

Firstly, I am using a Titan X (Pascal) to measure runtime. The K40 is rather old, so you might not get the same results. In addition, demo.py is not meant to measure inference time. The image is loaded from disk sequentially and fed to the graph using placeholders, which is slow according to the tensorflow documentation. In addition, inference is performed only once. The first time inference is run, tensorflow selects the subgraph which needs to be computed; the whole thing is much faster if the same op is computed multiple times. And lastly, demo.py plots a visualization in python. Computing a visualization is not considered part of the actual detection. (And this can be done in parallel on the CPU anyway, so there is no need to wait until this computation is finished.)
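
As a side note on that last point, here is a minimal sketch of doing the visualization on a CPU thread so the GPU loop never blocks on plotting; save_visualization is a hypothetical stand-in for demo.py's plotting code:

import threading
try:
    import queue           # Python 3
except ImportError:
    import Queue as queue  # Python 2

viz_queue = queue.Queue()

def viz_worker():
    while True:
        item = viz_queue.get()
        if item is None:  # sentinel: shut the worker down
            break
        image, boxes = item
        save_visualization(image, boxes)  # hypothetical CPU-side plotting

worker = threading.Thread(target=viz_worker)
worker.daemon = True
worker.start()

# In the inference loop, hand results off instead of plotting inline:
#     viz_queue.put((image, np_pred_boxes))
# and send the sentinel when done:
#     viz_queue.put(None)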

To achieve the throughput reported in the paper, images are loaded from disk in parallel using Tensorflow queues. It can be assumed that a real-time system does not store its input on an hdd but is able to provide the data in memory, so this is a fair comparison. In addition, the same op (with different input) is evaluated 100 times and the average runtime is reported.
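
To illustrate, a minimal sketch of such a queue-based input pipeline using the TF 1.x queue API; the file names, image size and thread count are illustrative, not the exact pipeline behind the paper numbers:

import tensorflow as tf

# Filename queue plus background reader threads, instead of feed_dict.
filename_queue = tf.train.string_input_producer(
    ['data/demo.png', 'data/demo2.png'])
reader = tf.WholeFileReader()
_, raw = reader.read(filename_queue)
image = tf.image.decode_png(raw, channels=3)
image = tf.image.resize_images(image, [384, 1248])  # illustrative input size
image_batch = tf.train.batch([image], batch_size=1,
                             num_threads=4, capacity=32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    batch = sess.run(image_batch)  # images are prefetched by background threads
    coord.request_stop()
    coord.join(threads)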

I will provide code for fast inference after the ICCV deadline. The purpose of demo.py is to provide simple code so that users not familiar with tensorvision can see how the model works. demo.py is kept simple for this purpose, and all the advanced tensorflow queuing stuff is not included.

bigsnarfdude commented on June 30, 2024

I didn't have an opinion on whether the inference time is fast or slow. I have a Titan X (Pascal) and just provided the output for reference. Thanks @MarvinTeichmann for the code. I look forward to the future releases.

MarvinTeichmann commented on June 30, 2024

Btw, the fact that both of you get an inference time of about 2s shows that the GPU is not the bottleneck in the current setup. One would expect a Titan X (Pascal) to be about 2-3 times faster. So most of the time is actually spent reading the data, loading the computational graph onto the gpu, etc.

For a quick and dirty speed benchmark you can do something like this:

from time import time

# One run to ensure that the tensorflow graph is loaded onto the GPU
sess.run([pred_boxes, pred_confidences], feed_dict=feed)

start_time = time()
for i in range(100):
    sess.run([pred_boxes, pred_confidences], feed_dict=feed)
avg_time = (time() - start_time) / 100.0  # average seconds per inference

This should give you an inference speed close to the one cited in the paper.

bigsnarfdude commented on June 30, 2024

name: TITAN X (Pascal)
major: 6 minor: 1 memoryClockRate (GHz) 1.531
pciBusID 0000:03:00.0
Total memory: 11.90GiB
Free memory: 11.75GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: TITAN X (Pascal), pci bus id: 0000:03:00.0)
2017-03-09 22:17:07,062 INFO /home/mifs/mttt2/local_disk/RUNS/TensorDetect2/paper_bench/tau5_zoom_0_kitti_2016_11_09_05.57/model.ckpt-179999
2017-03-09 22:17:10,497 INFO Weights loaded successfully.
2017-03-09 22:17:10,497 INFO Starting inference using data/demo2.png as input

2017-03-09 22:17:12,558 INFO 7 Cars detected
2017-03-09 22:17:12,558 INFO
2017-03-09 22:17:12,558 INFO Coordinates of Box 0
2017-03-09 22:17:12,558 INFO x1: 425.5
2017-03-09 22:17:12,558 INFO x2: 464.5
2017-03-09 22:17:12,558 INFO y1: 183.5
2017-03-09 22:17:12,559 INFO y2: 204.5
2017-03-09 22:17:12,559 INFO Confidence: 0.945907235146
2017-03-09 22:17:12,559 INFO

coolhebei commented on June 30, 2024

@bigsnarfdude, I think you are seeing the same issue? It needs about 2s to do the task.

bigsnarfdude commented on June 30, 2024

Tensorflow devs have documented that "feed_dict" is one of the slower methods of passing data. (My thoughts: if "feed_dict" is used for the current inference calculations, then I would imagine other methods could increase inference speed once the pipeline is optimized.)

Two different docs describe better ways of getting data to the GPU for both inference and training; a small sketch of one alternative follows the links:

  1. https://www.tensorflow.org/programmers_guide/reading_data
  2. https://www.tensorflow.org/extend/new_data_formats
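
For illustration, here is a minimal sketch of the "preloaded data" approach from the first guide: bake the data into the graph as a constant so that only a scalar index crosses feed_dict per step. This only works for datasets that fit in memory, and the names are illustrative, not from KittiBox:

import numpy as np
import tensorflow as tf

data = np.random.rand(10, 384, 1248, 3).astype(np.float32)  # stand-in images
images_const = tf.constant(data)      # the whole dataset lives in the graph
index = tf.placeholder(tf.int32, shape=[])
image = images_const[index]           # select one image inside the graph

with tf.Session() as sess:
    out = sess.run(image, feed_dict={index: 0})  # only a scalar is fed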

coolhebei commented on June 30, 2024

Thanks to @bigsnarfdude and @MarvinTeichmann.
The slower speed of my procedure is attributable to:

  1. my card is a K40 (Kepler), which is much slower than a Titan X (Pascal) (about 180ms vs 30ms for VGG16)
  2. the first inference is actually slower than the others

Finally, thanks a lot for sharing the code. Exciting work!

villanuevab commented on June 30, 2024

Hello! Do you have updates on the code for fast inference?

lukaspistelak commented on June 30, 2024

Hi, I got this error:

softmax = prediction['softmax']
KeyError: 'softmax'

when I try to use your tip:

# Fetch the softmax tensor once; only sess.run repeats per image.
softmax = prediction['softmax']
for image in images:
    feed = {image_pl: image}
    output = sess.run([softmax], feed_dict=feed)

All modules were loaded successfully.

swkonz commented on June 30, 2024

If we wanted to grab the output image after the rectangles have been drawn, we would need to include

# Apply non-maximal suppression
# and draw predictions on the image
output_image, rectangles = kittibox_utils.add_rectangles(
    hypes, [image], np_pred_confidences,
    np_pred_boxes, show_removed=False,
    use_stitching=True, rnn_len=1,
    min_conf=0.50, tau=hypes['tau'], color_acc=(0, 255, 0))

in that for loop, where 'output_image' would be the image with the rectangles drawn, correct?
What is the format of output_image here?

Thanks for your work on this!
