oaid / autokernel
AutoKernel is an easy-to-use, low-barrier automatic operator optimization tool that improves the deployment efficiency of deep learning algorithms.
License: Apache License 2.0
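I have deployed OpenDLA on a ZCU102. The ZCU102 has an Arm core, and I would like to use AutoKernel to write custom schedules for some operators. Can AutoKernel be used with OpenDLA?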
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s: Assembler messages:
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s:808: Error: unknown vector operation: ` {z}'
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s:823: Error: unknown vector operation: ` {z}'
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s:854: Error: unknown vector operation: ` {z}'
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s:1168: Error: unknown vector operation: ` {z}'
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s:1170: Error: unknown vector operation: ` {z}'
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s:1387: Error: unknown vector operation: ` {z}'
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s:1397: Error: unknown vector operation: ` {z}'
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s:1420: Error: unknown vector operation: ` {z}'
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s:1459: Error: unknown vector operation: ` {z}'
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s:1608: Error: unknown vector operation: ` {z}'
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s:1622: Error: unknown vector operation: ` {z}'
/workspace/AutoKernel/autokernel_plugin/src/depthwise/halide_depthwise.s:1652: Error: unknown vector operation: ` {z}'
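Notes:
An optimization request must include the following information:
1. Background and project introduction
Briefly describe the application scenario and the algorithm module (links to related open-source projects or documents may be attached).
2. Performance requirements
Current performance, target performance, target platform, and test data dimensions (shape).
3. Source code of the block to be optimized
Do some initial profiling to find the code block most worth optimizing, and provide baseline code for that module (a C/C++ or Python implementation):
void func();
int main()
{
    // Measure performance; state the dimensions of the test data
    func();
    // Print the current elapsed time
}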
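The template above can be filled in along the following lines. This is only a minimal sketch: the element-wise add inside `func`, the helper name `average_ms`, and the buffer sizes are hypothetical placeholders, not part of AutoKernel. The point is to show a warm-up run followed by an averaged wall-clock measurement with `std::chrono`.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the code block to be optimized:
// a naive element-wise add over n floats.
void func(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

// Runs func `reps` times on buffers of length n and returns the
// average wall-clock time per call in milliseconds.
double average_ms(std::size_t n, int reps) {
    assert(reps > 0);
    std::vector<float> a(n, 1.0f), b(n, 2.0f), out(n, 0.0f);

    func(a.data(), b.data(), out.data(), n);  // warm-up run, not timed

    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) {
        func(a.data(), b.data(), out.data(), n);
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / reps;
}
```

A request would then report the shape used (here the single length `n`) next to the value returned by `average_ms`, for example by printing both from `main`.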
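Notes on this round of bounty tasks:
To add operators of this kind, see the reference document "How to Quickly Add an Operator". Put test scripts under the autokernel_plugin/tests directory.
Based on the latest version, port avgpooling and verify correctness (difficulty: easy). Completed: chenjun2hao
Based on the latest version, port maxpooling and verify correctness (difficulty: easy). Completed: chenjun2hao
Based on the latest version, port the fc operator and verify correctness (difficulty: easy). Completed: QingChuanWS
Based on the latest version, port the softmax operator and verify correctness (difficulty: easy). Completed: ccw1996
Based on the latest version, port dw_conv (depthwise_convolution) and verify correctness (difficulty: easy). Completed: QingChuanWS
Based on the latest version, port the normalize operator and verify correctness (difficulty: easy). Completed: crouchggj
Other operators are to be determined; suggestions are welcome (difficulty: easy)
Submit documents under the doc directory:
doc/benchmark.md (difficulty: medium)
Before starting a task, you can get to know the AutoKernel project through the AutoKernel beginner tutorial.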
With libautokernel.so built successfully, I ran the test program from the AutoKernel/autokernel_plugin directory: ./build/tests/tm_classification
Error: ./build/tests/tm_classification: symbol lookup error: ./build/src/libautokernel.so: undefined symbol: register_builtin_node_ops
PS: The build.sh script in the src subdirectory failed when run, so I added -ldl -lpthread -lz to the g++ command in every build.sh script.
Could you tell me which paper sioutas2020 is based on?
Thanks.
I pulled the latest code and ran the test example in Docker, but it failed because a shared library dependency could not be found:
/workspace/AutoKernel/AutoSearch/toolkit/demo_gen: error while loading shared libraries: libHalide.so.10: cannot open shared object file: No such file or directory
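1. Halide.h uses C++17, so -std=c++17 has to be set in autokernel_plugin's build.sh for the build to succeed.
2. Line 175 of tools.py uses the path {HALIDE_HOME}/halide-build/include. This differs from the autokernel_plugin part, where HALIDE_ROOT is the Halide source directory and HALIDE_DIR is the Halide install location; I suggest unifying them. Because the path is given as {HALIDE_HOME}/halide-build, the build directory must be named halide-build when building from source, and people's habits differ.
3. Line 64 of tools.py uses {HALIDE_HOME}/bin, but the source tree has no such bin directory (possibly because a different Halide version was pulled). Also, the libraries in the install directory should be under {HALIDE_DIR}/lib.
4. Running python3 tools.py --gen ../generator/batch_matmul.cpp -autotune -compute_time
produces the following errors: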
c++: error: ./temp/batch_1_0/0/.registration.cpp: No such file or directory
c++: error: ./temp/batch_1_0/0/.a: No such file or directory
c++: error: ./temp/batch_1_0/1/.registration.cpp: No such file or directory
c++: error: ./temp/batch_1_0/1/.a: No such file or directory
timeout: failed to run command ‘./temp/batch_1_0/0/bench’: No such file or directory
timeout: failed to run command ‘./temp/batch_1_0/1/bench’: No such file or directory
retrain_cost_model: /home/hebingshi/Downloads/autokernel-git-pr/AutoKernel/AutoSearch/src/adams2019/retrain_cost_model.cpp:351: std::map<int, {anonymous}::PipelineSample> {anonymous}::load_samples(const {anonymous}::Flags&): Assertion `dot != string::npos && best_path.substr(dot) == ".sample"' failed.
../src/adams2019/autotune_loop.sh: line 261: 294442 Done find ${SAMPLES} -name "*.sample"
294443 Aborted (core dumped) | ${AUTOSCHED_BIN}/retrain_cost_model --epochs=${BATCH_SIZE} --rates="0.0001" --num_cores=32 --initial_weights=${WEIGHTS} --weights_out=${WEIGHTS} --best_benchmark=${SAMPLES}/best.${PIPELINE}.benchmark.txt --best_schedule=${SAMPLES}/best.${PIPELINE}.schedule.h
In the GEMM example from the documentation, if the input matrix dimensions are changed as follows:
int M = 20;
int N = 30;
int K = 40;
then every optimization step that uses tiling, from step 2 onward, hits an out-of-bounds index error. How should this be solved?
Aborted
❯ ./06_build.sh 1
step = 1
M N K = 20 30 40 err 0.00 [rep 50] autokernel | blas 0.0112 ms 0.0013 ms
❯ ./06_build.sh 2
step = 2
terminate called after throwing an instance of 'Halide::RuntimeError'
what(): Error: Input buffer b0 is accessed at 23, which is beyond the max (19) in dimension 1
Aborted
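The access at index 23 with a maximum of 19 is consistent with an extent of 20 being rounded up to 24, the next multiple of 8. That is an inference from the log, not a confirmed diagnosis, and the tile size of 8 is an assumption. The usual remedy is to handle the partial tile at the edge explicitly. The sketch below shows the idea in plain C++ rather than in Halide: each tile's end is clamped to the matrix bounds, so the loops never touch row 20 or beyond. The names `TILE`, `matmul_tiled`, and `matmul_naive` are hypothetical. In Halide itself, comparable options are passing a tail strategy such as `TailStrategy::GuardWithIf` to the split or tile call, or wrapping the inputs with a boundary condition.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile size, chosen so that 20 and 30 are not multiples of it.
constexpr int TILE = 8;

// Row-major C[M x N] = A[M x K] * B[K x N], tiled over the output.
// Tile ends are clamped with std::min so partial edge tiles stay in bounds.
void matmul_tiled(const std::vector<float>& A, const std::vector<float>& B,
                  std::vector<float>& C, int M, int N, int K) {
    assert(A.size() == static_cast<std::size_t>(M) * K);
    assert(B.size() == static_cast<std::size_t>(K) * N);
    assert(C.size() == static_cast<std::size_t>(M) * N);
    std::fill(C.begin(), C.end(), 0.0f);
    for (int i0 = 0; i0 < M; i0 += TILE) {
        const int i_end = std::min(i0 + TILE, M);  // clamp last row tile
        for (int j0 = 0; j0 < N; j0 += TILE) {
            const int j_end = std::min(j0 + TILE, N);  // clamp last column tile
            for (int i = i0; i < i_end; ++i) {
                for (int k = 0; k < K; ++k) {
                    const float a = A[i * K + k];
                    for (int j = j0; j < j_end; ++j) {
                        C[i * N + j] += a * B[k * N + j];
                    }
                }
            }
        }
    }
}

// Untiled reference used to check the tiled version.
void matmul_naive(const std::vector<float>& A, const std::vector<float>& B,
                  std::vector<float>& C, int M, int N, int K) {
    std::fill(C.begin(), C.end(), 0.0f);
    for (int i = 0; i < M; ++i)
        for (int k = 0; k < K; ++k)
            for (int j = 0; j < N; ++j)
                C[i * N + j] += A[i * K + k] * B[k * N + j];
}
```

Clamping costs a `std::min` per tile rather than per element, which is why it is usually preferred over a bounds check in the innermost loop. Padding the inputs up to a multiple of the tile size is another common option when the extra memory is acceptable.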
As the title says: what needs to be done to compile operators with multi-core support?