
Comments (7)

v0jiuqi commented on June 15, 2024

When testing large models, you need to add these macros manually:
-DMNN_LOW_MEMORY=ON
-DMNN_ARM82=ON
Some of the acceleration instructions are used only when arm82 is enabled.
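
A minimal sketch of how these two flags might be passed at configure time. The paths `MNN_SRC` and `BUILD_DIR` are hypothetical placeholders, and the Android toolchain options that a cross-build also needs are left out here. The snippet only prints the command so it can be reviewed before running.

```shell
# Hypothetical paths; point these at your own MNN checkout and build directory.
MNN_SRC="${MNN_SRC:-$HOME/MNN}"
BUILD_DIR="${BUILD_DIR:-$MNN_SRC/build}"

# The two flags named in the comment above.
MNN_FLAGS="-DMNN_LOW_MEMORY=ON -DMNN_ARM82=ON"

# Assemble the configure command as a string and print it for review.
# Nothing is executed here; copy the printed line to run it.
CONFIGURE_CMD="cmake -S $MNN_SRC -B $BUILD_DIR $MNN_FLAGS"
echo "$CONFIGURE_CMD"
```

If a build script is used instead of a direct `cmake` call, the same two `-D` options would need to be appended to the `cmake` line inside that script.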

from mnn.

v0jiuqi commented on June 15, 2024

Also, you need to set precision=low for fp16 inference to be used; otherwise inference runs in fp32.


v0jiuqi commented on June 15, 2024

Different mobile phones have different performance, right?


Nick-infinity commented on June 15, 2024

Different mobile phones have different performance, right?

Yes, you are right, but I am testing on a more powerful SoC, the 8 Gen 2.

The APK released on mnn-llm decodes at 26 t/s on my 8 Gen 2, but the libs I built from source only reach 2 t/s. I found that MNN is not built with the -DMNN_LOW_MEMORY CMake flag when using https://github.com/wangzhaode/mnn-llm/blob/master/script/android_build.sh

When I built the MNN libs with the low-memory flag, I saw a huge boost in performance. It is still not as good as the release APK, though. I suspect I am still missing some of the flags and options that were used to compile the MNN libs shipped in the APK.


Nick-infinity commented on June 15, 2024

Thank you very much for this information.
The MNN_LOW_MEMORY flag enables int4 weight support, but activations are still in fp32. Will precision=low make the activations fp16? Also, how can I enable precision=low?

Maybe it is set in the backend config; let me check whether mnn-llm can do that.

Update: I am now using low precision in the CPU backend config.


v0jiuqi commented on June 15, 2024

The specific settings (precision, threadNumber, etc.) are in llm.cpp. To run an LLM directly with MNN, you can enable the macro -DMNN_BUILD_LLM=ON, which produces the llm_demo executable for running large models.
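
A rough sketch of a configure-and-build sequence that combines the three flags mentioned in this thread. The paths are hypothetical, the Android toolchain options are again omitted, and where llm_demo ends up inside the build directory is not stated in the thread. The commands are printed rather than executed.

```shell
# Hypothetical paths; adjust to your own MNN checkout and build directory.
MNN_SRC="${MNN_SRC:-$HOME/MNN}"
BUILD_DIR="${BUILD_DIR:-$MNN_SRC/build}"

# All three flags mentioned in this thread.
LLM_FLAGS="-DMNN_LOW_MEMORY=ON -DMNN_ARM82=ON -DMNN_BUILD_LLM=ON"

# Step 1: configure. Step 2: build everything in the configured tree.
LLM_CONFIGURE_CMD="cmake -S $MNN_SRC -B $BUILD_DIR $LLM_FLAGS"
LLM_BUILD_CMD="cmake --build $BUILD_DIR"

# Print both commands for review; nothing is executed here.
echo "$LLM_CONFIGURE_CMD"
echo "$LLM_BUILD_CMD"
```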


Nick-infinity commented on June 15, 2024

Thank you very much, @v0jiuqi. You have saved my day and helped me fix the issue. I can confirm that I now get the same performance as the released APK.
I can't thank you enough <3.
Thanks again, and have a good day.

