Comments (6)
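Hello, our bit packing packs 8 values along the input channel dimension into one byte, not along width or height, so it does not affect the convolution sliding window. Does that answer your question?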
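To make the channel-wise packing concrete, here is a minimal sketch in plain Python. It is not bolt's code: the helper name `pack_channels`, the mapping of +1 to bit 1 and -1 to bit 0, and the most-significant-bit-first ordering are all assumptions made for illustration.

```python
def pack_channels(pixel):
    """Pack the +1/-1 channel values of ONE (h, w) position, 8 channels per byte.

    Assumptions (not confirmed by bolt): +1 maps to bit 1, -1 maps to bit 0,
    and the first channel of each group of 8 lands in the most significant bit.
    """
    if len(pixel) % 8 != 0:
        raise ValueError("channel count must be a multiple of 8")
    packed = []
    for start in range(0, len(pixel), 8):
        byte = 0
        for value in pixel[start:start + 8]:
            byte = (byte << 1) | (1 if value > 0 else 0)
        packed.append(byte)
    return packed


# 16 channels at one spatial position become 2 bytes.
pixel = [1, -1, 1, 1, -1, -1, 1, -1] + [-1] * 8
packed = pack_channels(pixel)
print(packed)  # [178, 0]
```

Because only the channel axis shrinks, the number of H and W positions is unchanged, which is why the sliding window is unaffected.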
from bolt.
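Thanks for the reply! Let me summarize how bolt implements this; please correct me where I am wrong:
- activation input (NCHW, fp16) -> (N_C/8_H_W_c8, fp16) -> bit-packing using transformFromHalf to (N_C/8_H_W_c8) / 8, bint8 -> further steps in convolution_xnor_A55
- weight (NCHW, fp32) -> bit-packing via the model converter function ws_datatype_converter_bnn, (NCHW)/8, U8 -> weight transform to layout N/16_C/8_HW_n16_c8 using convolution_transform_filter_bnn; the transformed weights are then used for computation in kernels such as convolution_xnor_A55.
Is anything in my reading above wrong? Also, is the format of the convolution output tensor fp16_NCHW or N_C/8_HW_c8?
I am currently using a model trained with simulated binarization using (-1, 1) values, and its output does not yet match bolt's. I would like to figure out where the problem might be. Thanks for your help!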
from bolt.
input: (N_C/8_H_W_c8) / 8, bint8
weight: N/16_C/8_HW_n16_c8
output: N_C/8_HW_c8
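As a rough illustration of the activation layout named above, the sketch below computes the flat element offset of (n, c, h, w) in an N_C/8_H_W_c8 tensor. It follows the conventional reading of such blocked layout names (the innermost c8 holds 8 consecutive channels); whether bolt orders memory exactly this way is an assumption, and `nchwc8_offset` is a hypothetical helper, not a bolt function.

```python
def nchwc8_offset(n, c, h, w, C, H, W):
    """Flat element offset of (n, c, h, w) in an N_C/8_H_W_c8 layout.

    Assumption: dimensions are ordered outermost to innermost as
    N, C/8, H, W, c8, which is the usual reading of the layout name.
    """
    if C % 8 != 0:
        raise ValueError("C must be a multiple of 8")
    return (((n * (C // 8) + c // 8) * H + h) * W + w) * 8 + (c % 8)


# Example with C=16, H=2, W=3.
print(nchwc8_offset(0, 1, 0, 0, 16, 2, 3))  # 1: next channel in the same group
print(nchwc8_offset(0, 0, 0, 1, 16, 2, 3))  # 8: next pixel along W
print(nchwc8_offset(0, 8, 0, 0, 16, 2, 3))  # 48: first channel of the second group
```

Under this reading, once the 8 values of each c8 group are packed into one byte, the byte offset is simply the element offset divided by 8, which matches the "(N_C/8_H_W_c8) / 8" notation.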
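Does the convolution use padding, and if so, what is the padding value? This has an effect. You could first compare the results of a convolution layer without padding.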
from bolt.
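I deliberately removed padding in my test. Do bolt's unit tests include an alignment test that uses (-1, 1) parameters and multiply-accumulate arithmetic to simulate the binarized computation? Thanks.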
from bolt.
compute/tensor/tests/test_convolution_bnn.cpp; take a look and see whether it helps.
from bolt.
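Regarding padding: from what I see in the bolt source, the default is pad 0. I have not read the complete code, but my guess is that the low-level binary convolution treats -1 as 0 and then implements the convolution with bitwise operations such as xnor and popcount. If so, padding needs special handling: you can pad with +1 or -1, otherwise the binarized output will not match. In my experiments, pad -1 works better. I suspect the BNN model the authors used for testing did not have this special handling?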
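The guess above can be checked with a small sketch in plain Python. It assumes the encoding described in this comment (+1 stored as bit 1, -1 stored as bit 0) and uses hypothetical helpers; it is not bolt's kernel. For two length-n vectors of +1/-1 values, the dot product equals 2 * popcount(xnor) - n, so a padded tap stored as bit 0 behaves like -1, not like 0.

```python
def to_bits(values):
    """Encode values as a bit mask: positive -> 1, everything else -> 0."""
    bits = 0
    for v in values:
        bits = (bits << 1) | (1 if v > 0 else 0)
    return bits


def xnor_popcount_dot(a_bits, w_bits, n):
    """Dot product of two length-n +1/-1 vectors given as bit masks."""
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ w_bits) & mask).count("1")
    return 2 * matches - n


def mac_dot(a, w):
    """Reference multiply-accumulate dot product."""
    return sum(x * y for x, y in zip(a, w))


weights = [1, -1, 1, 1]

# Window fully inside the image: both methods agree.
inside = [1, 1, -1, 1]
assert xnor_popcount_dot(to_bits(inside), to_bits(weights), 4) == mac_dot(inside, weights)

# Window whose last two taps fall into the padding region.
zero_padded = [1, 1, 0, 0]     # float reference with pad 0
minus_padded = [1, 1, -1, -1]  # float reference with pad -1
bit_result = xnor_popcount_dot(to_bits(zero_padded), to_bits(weights), 4)

print(mac_dot(zero_padded, weights))   # 0
print(mac_dot(minus_padded, weights))  # -2
print(bit_result)                      # -2
```

In this sketch the bitwise result matches the pad -1 reference and differs from the pad 0 reference, which is consistent with the observation that pad -1 aligns better.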
from bolt.