
Comments (13)

MarvinTeichmann commented on June 30, 2024

As mentioned earlier, demo.py was not designed as evaluation code and is very slow. demo.py is meant as a way to understand how the code works; evaluate.py is meant to be used for evaluation.

However, people seem to love demo.py (#30, #41, #54). If you don't want to mess around with the evaluation code, modify demo.py to perform evaluation of images in a loop like this:

# Fetch the softmax tensor once; only sess.run repeats per image.
softmax = prediction['softmax']
for image in images:
    feed = {image_pl: image}
    output = sess.run([softmax], feed_dict=feed)

This avoids building the entire tensorflow graph for each image. It is still not perfect, but way faster than calling the whole demo.py script once per image.

If you would like to measure running time, keep in mind that tensorflow compiles the graph and allocates memory on the first run, so don't measure the time it takes for the first image. See the comment here.
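
For context, here is a minimal sketch of where that loop sits, assuming demo.py's variable names (image_pl, prediction, sess) and a hypothetical list of image paths; the point is that the graph is built and the weights restored once, outside the loop:

import scipy.misc

# ... build the graph and load the checkpoint once, exactly as demo.py does ...
# This yields image_pl (the input placeholder), prediction (the output dict)
# and sess (the live session).

image_files = ['data/demo.png']  # hypothetical: your own list of paths
images = [scipy.misc.imread(f) for f in image_files]

softmax = prediction['softmax']  # fetch the tensor once
for image in images:
    feed = {image_pl: image}
    output = sess.run([softmax], feed_dict=feed)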

villanuevab commented on June 30, 2024

@lukaspistelak the code that @MarvinTeichmann posted, i.e.,

# Fetch the softmax tensor once; only sess.run repeats per image.
softmax = prediction['softmax']
for image in images:
    feed = {image_pl: image}
    output = sess.run([softmax], feed_dict=feed)

is for KittiSeg's demo.py. For KittiBox, try:

# Fetch the KittiBox output tensors once; only sess.run repeats per image.
pred_boxes = prediction['pred_boxes_new']
pred_confidences = prediction['pred_confidences']
for image in images:
    feed = {image_pl: image}

    # Run the KittiBox model on the image
    (np_pred_boxes, np_pred_confidences) = sess.run(
        [pred_boxes, pred_confidences], feed_dict=feed)

Hope this helps.

MarvinTeichmann commented on June 30, 2024

No, sorry. I did not find the time to work on this. For a good start, use the loop I suggested in the comment above.

MarvinTeichmann commented on June 30, 2024

Firstly, I am using a Titan X (Pascal) to measure runtime. The K40 is rather old, so you might not get the same results. In addition, demo.py is not meant to measure inference time. The image is loaded from disk sequentially and fed to the graph using placeholders, which is slow according to the tensorflow documentation. In addition, inference is performed only once. The first time inference is run, tensorflow selects the subgraph which needs to be computed; the whole thing is much faster if the same op is computed multiple times. And lastly, demo.py plots a visualization in python. Computing a visualization is not considered part of the actual detection. (And this can be done in parallel on the CPU anyway, so there is no need to wait until this computation is finished.)
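
As a side note on that last point, here is a minimal sketch of doing the visualization on a CPU thread so the GPU loop never blocks on plotting; save_visualization is a hypothetical stand-in for demo.py's plotting code:

import threading
try:
    import queue           # Python 3
except ImportError:
    import Queue as queue  # Python 2

viz_queue = queue.Queue()

def viz_worker():
    while True:
        item = viz_queue.get()
        if item is None:  # sentinel: shut the worker down
            break
        image, boxes = item
        save_visualization(image, boxes)  # hypothetical CPU-side plotting

worker = threading.Thread(target=viz_worker)
worker.daemon = True
worker.start()

# In the inference loop, hand results off instead of plotting inline:
#     viz_queue.put((image, np_pred_boxes))
# and send the sentinel when done:
#     viz_queue.put(None)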

To achieve the throughput reported in the paper, images are loaded from disk in parallel using Tensorflow queues. It can be assumed that a real-time system does not store its input on an hdd but is able to provide the data in memory, so this is a fair comparison. In addition, the same op (with different input) is evaluated 100 times and the average runtime is reported.
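
To illustrate, a minimal sketch of such a queue-based input pipeline using the TF 1.x queue API; the file names, image size and thread count are illustrative, not the exact pipeline behind the paper numbers:

import tensorflow as tf

# Filename queue plus background reader threads, instead of feed_dict.
filename_queue = tf.train.string_input_producer(
    ['data/demo.png', 'data/demo2.png'])
reader = tf.WholeFileReader()
_, raw = reader.read(filename_queue)
image = tf.image.decode_png(raw, channels=3)
image = tf.image.resize_images(image, [384, 1248])  # illustrative input size
image_batch = tf.train.batch([image], batch_size=1,
                             num_threads=4, capacity=32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    batch = sess.run(image_batch)  # images are prefetched by background threads
    coord.request_stop()
    coord.join(threads)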

I will provide code for fast inference after the ICCV deadline. The purpose of demo.py is to provide simple code so that users not familiar with tensorvision can see how the model works. demo.py is kept simple for this purpose, and all the advanced tensorflow queuing stuff is not included.

bigsnarfdude commented on June 30, 2024

I didn't have an opinion on whether the inference time is fast or slow. I have a Titan X (Pascal) and just provided the output for reference. Thanks @MarvinTeichmann for the code. I look forward to the future releases.

MarvinTeichmann commented on June 30, 2024

Btw, the fact that both of you get an inference time of about 2s shows that the GPU is not the bottleneck in the current setup. One would expect a Titan X (Pascal) to be about 2-3 times faster. So most of the time is actually spent reading the data, loading the computational graph onto the gpu, etc.

For a quick and dirty speed benchmark you can do something like this:

from time import time

# One run to ensure that the tensorflow graph is loaded onto the GPU
sess.run([pred_boxes, pred_confidences], feed_dict=feed)

start_time = time()
for i in range(100):
    sess.run([pred_boxes, pred_confidences], feed_dict=feed)
avg_time = (time() - start_time) / 100.0  # average seconds per inference

This should give you an inference speed close to the one cited in the paper.

bigsnarfdude commented on June 30, 2024

name: TITAN X (Pascal)
major: 6 minor: 1 memoryClockRate (GHz) 1.531
pciBusID 0000:03:00.0
Total memory: 11.90GiB
Free memory: 11.75GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: TITAN X (Pascal), pci bus id: 0000:03:00.0)
2017-03-09 22:17:07,062 INFO /home/mifs/mttt2/local_disk/RUNS/TensorDetect2/paper_bench/tau5_zoom_0_kitti_2016_11_09_05.57/model.ckpt-179999
2017-03-09 22:17:10,497 INFO Weights loaded successfully.
2017-03-09 22:17:10,497 INFO Starting inference using data/demo2.png as input

2017-03-09 22:17:12,558 INFO 7 Cars detected
2017-03-09 22:17:12,558 INFO
2017-03-09 22:17:12,558 INFO Coordinates of Box 0
2017-03-09 22:17:12,558 INFO x1: 425.5
2017-03-09 22:17:12,558 INFO x2: 464.5
2017-03-09 22:17:12,558 INFO y1: 183.5
2017-03-09 22:17:12,559 INFO y2: 204.5
2017-03-09 22:17:12,559 INFO Confidence: 0.945907235146
2017-03-09 22:17:12,559 INFO

coolhebei commented on June 30, 2024

@bigsnarfdude, I think you are seeing the same issue? It needs about 2s to do the task.

bigsnarfdude commented on June 30, 2024

Tensorflow devs have documented that "feed_dict" is one of the slower methods of passing data. (My thoughts: if "feed_dict" is used for the current inference calculations, then I would imagine other methods could increase inference speed once the pipeline is optimized.)

Two different docs describe better ways of getting data to the GPU for both inference and training; a small sketch of one alternative follows the links:

  1. https://www.tensorflow.org/programmers_guide/reading_data
  2. https://www.tensorflow.org/extend/new_data_formats
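
For illustration, here is a minimal sketch of the "preloaded data" approach from the first guide: bake the data into the graph as a constant so that only a scalar index crosses feed_dict per step. This only works for datasets that fit in memory, and the names are illustrative, not from KittiBox:

import numpy as np
import tensorflow as tf

data = np.random.rand(10, 384, 1248, 3).astype(np.float32)  # stand-in images
images_const = tf.constant(data)      # the whole dataset lives in the graph
index = tf.placeholder(tf.int32, shape=[])
image = images_const[index]           # select one image inside the graph

with tf.Session() as sess:
    out = sess.run(image, feed_dict={index: 0})  # only a scalar is fed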

coolhebei commented on June 30, 2024

Thanks to @bigsnarfdude and @MarvinTeichmann.
The slower speed of my procedure is attributable to:

  1. my card is a K40 (Kepler), which is much slower than a Titan X (Pascal) (about 180ms vs 30ms for VGG16)
  2. the first inference is actually slower than the others

Finally, thanks a lot for sharing the code. Exciting work!

villanuevab commented on June 30, 2024

Hello! Do you have updates on the code for fast inference?

lukaspistelak commented on June 30, 2024

Hi, I got this error:

softmax = prediction['softmax']
KeyError: 'softmax'

when I try to use your tip:

# Fetch the softmax tensor once; only sess.run repeats per image.
softmax = prediction['softmax']
for image in images:
    feed = {image_pl: image}
    output = sess.run([softmax], feed_dict=feed)

All modules were loaded successfully.

swkonz commented on June 30, 2024

If we wanted to grab the output image after the rectangles have been drawn, we would need to include

# Apply non-maximal suppression
# and draw predictions on the image
output_image, rectangles = kittibox_utils.add_rectangles(
    hypes, [image], np_pred_confidences,
    np_pred_boxes, show_removed=False,
    use_stitching=True, rnn_len=1,
    min_conf=0.50, tau=hypes['tau'], color_acc=(0, 255, 0))

in that for loop, where 'output_image' would be the image with the rectangles drawn, correct?
What is the format of output_image here?

Thanks for your work on this!
