
freedomtan / gldelegatebench


quick and dirty inference time benchmark for TFLite gles delegate

License: BSD 3-Clause "New" or "Revised" License

Java 100.00%
tflite tensorflow-lite compute-shader gpu tflite-delegate tflite-gpu-delegate benchmark

gldelegatebench's People

Contributors

freedomtan


gldelegatebench's Issues

Tested this App with Samsung Galaxy S9

Thank you very much for this work. I just tested the app with my S9 and would like to share my results with you:

model name            | CPU 1 thread (ms) | CPU 4 threads (ms) | GPU (ms)
Mobilenet             | 43                | 72                 | 41
PoseNet               | 47                | 91                 | 45
DeepLab V3            | 64                | 84                 | 155
Mobilenet SSD V2 COCO | 72                | 164                | 73
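
For reference, a minimal sketch of how the three benchmarked configurations (CPU 1 thread, CPU 4 threads, GPU delegate) are typically set up with the TFLite Java API. The class and method names below are illustrative and not taken from this repo, and model loading is omitted.

import java.nio.MappedByteBuffer;

import org.tensorflow.lite.Interpreter;
import org.tensorflow.lite.gpu.GpuDelegate;

// Illustrative only: how the three benchmarked configurations are usually
// built with the TFLite Java API. `model` is a memory-mapped .tflite file;
// loading it (e.g. from the APK's assets) is omitted here.
class InterpreterConfigs {
    static Interpreter cpu(MappedByteBuffer model, int numThreads) {
        return new Interpreter(model, new Interpreter.Options().setNumThreads(numThreads));
    }

    static Interpreter gpu(MappedByteBuffer model, GpuDelegate delegate) {
        // The caller owns `delegate` and should close() it after closing the interpreter.
        return new Interpreter(model, new Interpreter.Options().addDelegate(delegate));
    }
}

cpu(model, 1), cpu(model, 4), and gpu(model, new GpuDelegate()) would correspond to the three columns above; each Interpreter (and the GpuDelegate) should be closed after the timing loop.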

Different results on GPU and CPU

Device: Sony Xperia XZ2 (Snapdragon 845)

I added two small modifications to your app:

  1. Filling input buffers with fixed data:

private Object[] allocateInputBuffers(int[] shapes) {
    int i_size = shapes.length;
    Object[] inputs = new Object[i_size];
    for (int i = 0; i < i_size; i++) {
        ByteBuffer i_bytes = ByteBuffer.allocate(shapes[i]);
        float[] floats = new float[shapes[i] / 4];
        for (int j = 0; j < floats.length; ++j) {
            // filling uniformly -5.0 ... 5.0
            floats[j] = ((float) j * 10) / floats.length - 5;
        }
        FloatBuffer fb = i_bytes.asFloatBuffer();
        fb.put(floats);
        i_bytes.rewind();
        Log.i("here: ", Float.toString(i_bytes.getFloat()) + " " + Float.toString(i_bytes.getFloat(4)));
        i_bytes.rewind();
        inputs[i] = i_bytes;
    }
    return inputs;
}
  2. Printing the first element of each output buffer:
String debugMessage = "";

for (int i = 0; i < loops; i++) {
    Object[] inputs = allocateInputBuffers(mModel.getInputShapes());
    Map<Integer, Object> outputs = allocateOutputBuffers(mModel.getOutputShapes());

    startTime = System.currentTimeMillis();
    interpreter.runForMultipleInputsOutputs(inputs, outputs);
    stopTime = System.currentTimeMillis();
    accTime += (stopTime - startTime);

    if (i == loops - 1) {
        // on the last iteration, append the first element of each output buffer to debugMessage
        for (Map.Entry<Integer, Object> entry : outputs.entrySet()) {
            ByteBuffer bb = (ByteBuffer) entry.getValue();
            Log.i("here: ", "byteBufferLimit is: " + bb.limit());
            debugMessage += "    " + Float.toString(bb.getFloat(0)) + "\n";
        }
    }
}
Log.i("here: ", "time: " + accTime / loops);
resultMessage.setText("avg time: " + accTime / loops + " ms\n" + debugMessage);

And I get different outputs on CPU and GPU for every net (for example, MobileNet V1).

CPU 1 thread output:

avg time: 97 ms
    -2.0144956

GPU output:

avg time: 21 ms
    0.0

What am I doing wrong? Or are there bugs in the TensorFlow Lite GPU delegate?
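
For what it's worth, ByteBuffer.allocate(...) in the first snippet returns a non-direct buffer with big-endian byte order, while the TFLite Java API generally prefers direct buffers in native byte order for raw float input; whether that explains the 0.0 GPU output here is only a guess. A minimal drop-in variant of allocateInputBuffers under that assumption:

// Drop-in variant of allocateInputBuffers using direct, native-order buffers
// (needs java.nio.ByteOrder in addition to the classes already used above).
private Object[] allocateInputBuffers(int[] shapes) {
    Object[] inputs = new Object[shapes.length];
    for (int i = 0; i < shapes.length; i++) {
        ByteBuffer i_bytes = ByteBuffer.allocateDirect(shapes[i])
                                       .order(ByteOrder.nativeOrder());
        FloatBuffer fb = i_bytes.asFloatBuffer();
        int n = shapes[i] / 4;
        for (int j = 0; j < n; j++) {
            // same fixed fill as above: uniformly -5.0 ... 5.0
            fb.put(j, ((float) j * 10) / n - 5);
        }
        i_bytes.rewind();
        inputs[i] = i_bytes;
    }
    return inputs;
}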

Performance report for Nexus 5X

Thanks for the app. Not sure if anyone cares about the Nexus 5X anymore, but these are the results I get:

model name            | CPU 1 thread (ms) | CPU 4 threads (ms) | GPU (ms)
Mobilenet             | 316               | 88                 | 74
PoseNet               | 376               | 105                | 142
DeepLab V3            | 479               | 169                | 304
Mobilenet SSD V2 COCO | 533               | 180                | 150

Quite disappointing... I am specifically interested in object detection (SSD V2), and when testing Google's MediaPipe Object Detection test app it seems to run quite fast (I don't have measurements, only a "feeling" from watching the bounding boxes update). Have you played with MediaPipe? They mention their object-detection model is trained with a depth multiplier of 0.5, so I guess that's part of it...

deeplab example

Hey, nice work.

Have you tried converting the DeepLab output ByteBuffer to a Bitmap?

I tried, but failed.
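
A minimal sketch of one way to do this, assuming a float output of shape [1, h, w, numClasses] (e.g. 257x257x21 for the commonly used DeepLab V3 257 model) and a caller-supplied color per class; the shape and the classColors table are assumptions, not something taken from this repo:

import android.graphics.Bitmap;
import java.nio.ByteBuffer;

class SegmentationRenderer {
    // Convert a DeepLab float output of shape [1, h, w, numClasses] into an
    // ARGB_8888 Bitmap by taking the argmax class per pixel and looking up
    // its color in classColors (one ARGB int per class).
    static Bitmap toBitmap(ByteBuffer output, int h, int w, int numClasses, int[] classColors) {
        output.rewind();
        Bitmap bitmap = Bitmap.createBitmap(w, h, Bitmap.Config.ARGB_8888);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int best = 0;
                float bestScore = output.getFloat();   // class 0 score for this pixel
                for (int c = 1; c < numClasses; c++) {
                    float score = output.getFloat();
                    if (score > bestScore) {
                        bestScore = score;
                        best = c;
                    }
                }
                bitmap.setPixel(x, y, classColors[best]);
            }
        }
        return bitmap;
    }
}

For speed, filling an int[] of pixels and calling Bitmap.setPixels once would be preferable to per-pixel setPixel, but the loop keeps the sketch short.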
