
freedomtan / gldelegatebench


quick and dirty inference time benchmark for TFLite gles delegate

License: BSD 3-Clause "New" or "Revised" License

Java 100.00%
tflite tensorflow-lite compute-shader gpu tflite-delegate tflite-gpu-delegate benchmark

gldelegatebench's People

Contributors

freedomtan


gldelegatebench's Issues

Tested this App with Samsung Galaxy S9

Thank you very much for this work. I just tested the app with my S9 and would like to share my results with you:

model name            | CPU 1 thread (ms) | CPU 4 threads (ms) | GPU (ms)
Mobilenet             | 43                | 72                 | 41
PoseNet               | 47                | 91                 | 45
DeepLab V3            | 64                | 84                 | 155
Mobilenet SSD V2 COCO | 72                | 164                | 73
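
For reference, a minimal sketch of how the three benchmarked configurations (CPU 1 thread, CPU 4 threads, GPU delegate) are typically set up with the TFLite Java API. The class and method names below are illustrative and not taken from this repo, and model loading is omitted.

import java.nio.MappedByteBuffer;

import org.tensorflow.lite.Interpreter;
import org.tensorflow.lite.gpu.GpuDelegate;

// Illustrative only: how the three benchmarked configurations are usually
// built with the TFLite Java API. `model` is a memory-mapped .tflite file;
// loading it (e.g. from the APK's assets) is omitted here.
class InterpreterConfigs {
    static Interpreter cpu(MappedByteBuffer model, int numThreads) {
        return new Interpreter(model, new Interpreter.Options().setNumThreads(numThreads));
    }

    static Interpreter gpu(MappedByteBuffer model, GpuDelegate delegate) {
        // The caller owns `delegate` and should close() it after closing the interpreter.
        return new Interpreter(model, new Interpreter.Options().addDelegate(delegate));
    }
}

cpu(model, 1), cpu(model, 4), and gpu(model, new GpuDelegate()) would correspond to the three columns above; each Interpreter (and the GpuDelegate) should be closed after the timing loop.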

Different results on GPU and CPU

Device: Sony Xperia XZ2 (Snapdragon 845)

I added two small modifications to your app:

  1. Filling input buffers with fixed data:

private Object[] allocateInputBuffers(int[] shapes) {
    int i_size = shapes.length;
    Object[] inputs = new Object[i_size];
    for (int i = 0; i < i_size; i++) {
        ByteBuffer i_bytes = ByteBuffer.allocate(shapes[i]);
        float[] floats = new float[shapes[i] / 4];
        for (int j = 0; j < floats.length; ++j) {
            // filling uniformly -5.0 ... 5.0
            floats[j] = ((float) j * 10) / floats.length - 5;
        }
        FloatBuffer fb = i_bytes.asFloatBuffer();
        fb.put(floats);
        i_bytes.rewind();
        Log.i("here: ", Float.toString(i_bytes.getFloat()) + " " + Float.toString(i_bytes.getFloat(4)));
        i_bytes.rewind();
        inputs[i] = i_bytes;
    }
    return inputs;
}
  2. Printing the first element of each output buffer:
String debugMessage = "";

for (int i = 0; i < loops; i++) {
    Object[] inputs = allocateInputBuffers(mModel.getInputShapes());
    Map<Integer, Object> outputs = allocateOutputBuffers(mModel.getOutputShapes());

    startTime = System.currentTimeMillis();
    interpreter.runForMultipleInputsOutputs(inputs, outputs);
    stopTime = System.currentTimeMillis();
    accTime += (stopTime - startTime);

    if (i == loops - 1) {
        // on the last iteration, append the first element of each output buffer to debugMessage
        for (Map.Entry<Integer, Object> entry : outputs.entrySet()) {
            ByteBuffer bb = (ByteBuffer) entry.getValue();
            Log.i("here: ", "byteBufferLimit is: " + bb.limit());
            debugMessage += "    " + Float.toString(bb.getFloat(0)) + "\n";
        }
    }
}
Log.i("here: ", "time: " + accTime / loops);
resultMessage.setText("avg time: " + accTime / loops + " ms\n" + debugMessage);

And I get different outputs on CPU and GPU for every net (for example, MobileNet V1).

CPU 1 thread output:

avg time: 97 ms
    -2.0144956

GPU output:

avg time: 21 ms
    0.0

What am I doing wrong? Or are there bugs in the TensorFlow Lite GPU delegate?
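
For what it's worth, ByteBuffer.allocate(...) in the first snippet returns a non-direct buffer with big-endian byte order, while the TFLite Java API generally prefers direct buffers in native byte order for raw float input; whether that explains the 0.0 GPU output here is only a guess. A minimal drop-in variant of allocateInputBuffers under that assumption:

// Drop-in variant of allocateInputBuffers using direct, native-order buffers
// (needs java.nio.ByteOrder in addition to the classes already used above).
private Object[] allocateInputBuffers(int[] shapes) {
    Object[] inputs = new Object[shapes.length];
    for (int i = 0; i < shapes.length; i++) {
        ByteBuffer i_bytes = ByteBuffer.allocateDirect(shapes[i])
                                       .order(ByteOrder.nativeOrder());
        FloatBuffer fb = i_bytes.asFloatBuffer();
        int n = shapes[i] / 4;
        for (int j = 0; j < n; j++) {
            // same fixed fill as above: uniformly -5.0 ... 5.0
            fb.put(j, ((float) j * 10) / n - 5);
        }
        i_bytes.rewind();
        inputs[i] = i_bytes;
    }
    return inputs;
}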

Performance report for Nexus 5X

Thanks for the app. Not sure if anyone cares about the Nexus 5X anymore, but these are the results I get:

model name            | CPU 1 thread (ms) | CPU 4 threads (ms) | GPU (ms)
Mobilenet             | 316               | 88                 | 74
PoseNet               | 376               | 105                | 142
DeepLab V3            | 479               | 169                | 304
Mobilenet SSD V2 COCO | 533               | 180                | 150

Quite disappointing... I am specifically interested in object detection (SSD V2), and when testing Google's MediaPipe Object Detection test app it seems to run quite fast (I don't have measurements, only a "feeling" from watching the bounding boxes update). Have you played with MediaPipe? They mention their object-detection model is trained with a depth multiplier of 0.5, so I guess that's part of it...

deeplab example

Hey, nice work.

Have you tried converting the DeepLab output ByteBuffer to a Bitmap?

I tried, but failed.
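
A minimal sketch of one way to do this, assuming a float output of shape [1, h, w, numClasses] (e.g. 257x257x21 for the commonly used DeepLab V3 257 model) and a caller-supplied color per class; the shape and the classColors table are assumptions, not something taken from this repo:

import android.graphics.Bitmap;
import java.nio.ByteBuffer;

class SegmentationRenderer {
    // Convert a DeepLab float output of shape [1, h, w, numClasses] into an
    // ARGB_8888 Bitmap by taking the argmax class per pixel and looking up
    // its color in classColors (one ARGB int per class).
    static Bitmap toBitmap(ByteBuffer output, int h, int w, int numClasses, int[] classColors) {
        output.rewind();
        Bitmap bitmap = Bitmap.createBitmap(w, h, Bitmap.Config.ARGB_8888);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int best = 0;
                float bestScore = output.getFloat();   // class 0 score for this pixel
                for (int c = 1; c < numClasses; c++) {
                    float score = output.getFloat();
                    if (score > bestScore) {
                        bestScore = score;
                        best = c;
                    }
                }
                bitmap.setPixel(x, y, classColors[best]);
            }
        }
        return bitmap;
    }
}

For speed, filling an int[] of pixels and calling Bitmap.setPixels once would be preferable to per-pixel setPixel, but the loop keeps the sketch short.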
