Comments (7)
When testing large language models, you need to add these macros manually at build time:
-DMNN_LOW_MEMORY=ON
-DMNN_ARM82=ON
Certain acceleration instructions (the ARMv8.2 fp16 path) are only used when arm82 is enabled.
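For example, a configure-and-build step with these flags might look like the following on an arm64 target (a sketch; the generator, toolchain, and other options depend on your platform):

    cmake .. -DMNN_LOW_MEMORY=ON -DMNN_ARM82=ON
    make -j4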
Also, precision needs to be set to low for inference to use fp16; otherwise it runs in fp32.
Different mobile phones have different performance, right?
Yes, you are right, but I am testing with a more powerful SoC, i.e. the 8 Gen 2.
The APK release of mnn-llm can do 26 t/s decode on my 8 Gen 2, but the libs I built from source were doing 2 t/s. I found out that MNN is not built with the -DMNN_LOW_MEMORY CMake flag when using https://github.com/wangzhaode/mnn-llm/blob/master/script/android_build.sh
When I rebuilt the MNN libs with the low-memory flag I saw a huge boost in performance. It is still not as good as the release APK, though; I suspect I am still missing some of the flags and options that were used to compile the MNN libs shipped in the APK.
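For anyone else hitting this, the fix amounts to adding the flag to the Android cross-compile configure step. A sketch, assuming a standard NDK toolchain setup with $ANDROID_NDK pointing at the NDK; the exact invocation in android_build.sh may differ:

    cmake .. \
      -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
      -DANDROID_ABI=arm64-v8a \
      -DMNN_ARM82=ON \
      -DMNN_LOW_MEMORY=ON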
Thank you very much for this information.
The MNN_LOW_MEMORY flag enables int4 weight support, but activations are still in fp32. Will precision=low make the activations fp16? Also, how can I enable precision=low?
Maybe by setting it in the backend config; let me check whether mnn-llm can do that.
Update: I am using low precision in the CPU backend config.
The specific settings (precision, threadNumber, etc.) are in llm.cpp. To run an LLM with MNN directly, you can enable the -DMNN_BUILD_LLM=ON macro and then use the resulting llm_demo executable to run the model.
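For reference, this is roughly what a low-precision CPU backend config looks like in MNN's C++ API. A minimal sketch of the kind of setup llm.cpp does, not its exact code; the thread count here is an arbitrary example:

    #include <MNN/Interpreter.hpp>

    // Build a CPU schedule that requests fp16 compute and low-memory mode.
    MNN::ScheduleConfig makeCpuSchedule(MNN::BackendConfig& backendConfig) {
        backendConfig.precision = MNN::BackendConfig::Precision_Low; // fp16 on ARM82-capable CPUs
        backendConfig.memory    = MNN::BackendConfig::Memory_Low;    // pairs with the MNN_LOW_MEMORY build flag

        MNN::ScheduleConfig schedule;
        schedule.type = MNN_FORWARD_CPU;
        schedule.numThread = 4;                  // threadNumber
        schedule.backendConfig = &backendConfig; // must outlive createSession()
        return schedule;
    }

The returned config is what you would pass to Interpreter::createSession().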
Thank you very much @v0jiuqi. You have saved my day and helped me fix the issue. I can confirm I now get the same performance as the released APK.
I can't thank you enough <3.
Thanks again and have a good day.