dingxiaoh / diversebranchblock
Diverse Branch Block: Building a Convolution as an Inception-like Unit
License: Apache License 2.0
Hello.
Thank you for your interesting work, and code.
I tried using your Diverseblock in ResNet18 (according to your instructions, replacing conv+bn with diverse blocks). My code is based on https://github.com/kuangliu/pytorch-cifar. The accuracy drops from 95.4% to 95.1%. Do you have any ideas for why this is?
Thank you.
Just turn all the BatchNorm layers in DBB to sync mode in PyTorch; then BNAndPadLayer will perform no padding operations.
Hi, when I used DiverseBranchBlock to replace Conv-Bn in my network, I met this error
ValueError: some parameters appear in more than one parameter group
Have you met it before?
As the title says: the convergence speed and accuracy are both a bit lower than RepVGG's.
Hi,
I verified like this:

import torch
import torch.nn as nn
from dbb_transforms import transIII_1x1_kxk

conv1 = nn.Conv2d(32, 64, 1, 1, 0, bias=True)
conv2 = nn.Conv2d(64, 128, 3, 1, 1, bias=True)
conv = nn.Conv2d(32, 128, 3, 1, 1, bias=True)
k, b = transIII_1x1_kxk(conv1.weight, conv1.bias, conv2.weight, conv2.bias, 1)
with torch.no_grad():
    conv.weight.copy_(k)
    conv.bias.copy_(b)
inten = torch.randn(2, 32, 224, 224)
out1 = conv2(conv1(inten))
out2 = conv(inten)
print((out1 - out2).abs().max())

And the output is 0.11, which is far too large. Have you noticed this?
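(Note for anyone hitting the same discrepancy: the 1x1-then-KxK merge is only exact when the KxK conv applies no zero padding to the intermediate feature map, which is why DBB pads with the bias via BNAndPadLayer. A minimal sketch with my own re-implementation of the transform, under the hypothetical name trans_1x1_kxk, showing the merge becomes exact once the second conv's padding is 0:)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def trans_1x1_kxk(k1, b1, k2, b2):
    # Merge a 1x1 conv (k1, b1) followed by a KxK conv (k2, b2)
    # into one KxK conv. Exact only when the KxK conv applies no
    # zero padding to the intermediate feature map.
    k = F.conv2d(k2, k1.permute(1, 0, 2, 3))
    b = (k2 * b1.reshape(1, -1, 1, 1)).sum((1, 2, 3)) + b2
    return k, b

torch.manual_seed(0)
conv1 = nn.Conv2d(32, 64, 1, 1, 0, bias=True)
conv2 = nn.Conv2d(64, 128, 3, 1, 0, bias=True)   # padding=0: merge is exact
conv = nn.Conv2d(32, 128, 3, 1, 0, bias=True)
k, b = trans_1x1_kxk(conv1.weight, conv1.bias, conv2.weight, conv2.bias)
with torch.no_grad():
    conv.weight.copy_(k)
    conv.bias.copy_(b)
x = torch.randn(2, 32, 32, 32)
print((conv2(conv1(x)) - conv(x)).abs().max())  # at float-rounding level
```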
If the DBB module is used in place of the RepVGG block, can it achieve stronger performance than the original RepVGG?
Hi! Thanks for your excellent work! I have a question: if I want to use DBB in mmsegmentation, what should I do? ^_^
So far I only see ResNet. When will a MobileNet version be available?
Hello! Sorry to bother you! I have recently been reading your paper and came across Trans III. I must say this transformation is really novel, but I noticed the caveat you mention: if the second KxK layer zero-pads its input, Equation 8 does not hold, and the solution is to pad with REP(b1), the bias of the conv obtained from the first equivalent transformation. I don't quite understand this point. Could you explain in detail why the equation fails and why the solution works? Thank you!
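(A small numerical sketch of this point, using my own helper code rather than the repo's: outside the image the intermediate map is not 0 but b1, because the 1x1 conv adds its bias everywhere; so padding the intermediate map with b1 per channel, instead of zeros, makes the two-stage pipeline match the merged conv exactly.)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
conv1 = nn.Conv2d(4, 8, 1, bias=True)
conv2 = nn.Conv2d(8, 16, 3, padding=1, bias=True)

x = torch.randn(1, 4, 8, 8)
z = conv1(x)

# Pad the intermediate map with the per-channel constant b1
# (instead of the zeros that conv2's own padding would use).
b1 = conv1.bias.reshape(1, -1, 1, 1)
z_pad = F.pad(z - b1, [1, 1, 1, 1]) + b1
out_fixed = F.conv2d(z_pad, conv2.weight, conv2.bias)  # no extra padding

# This matches the merged conv applied with padding=1.
k = F.conv2d(conv2.weight, conv1.weight.permute(1, 0, 2, 3))
b = (conv2.weight * b1).sum((1, 2, 3)) + conv2.bias
out_merged = F.conv2d(x, k, b, padding=1)
print((out_fixed - out_merged).abs().max())  # at float-rounding level
```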
hi @DingXiaoH, nice work!!! Borrowing from your implementation, I have realized a plug-in version of DiverseBranchBlock.
This plug-in version has the following advantages:
see rd50_dbb_cifar100_224_e100_sgd_calr.yaml
...
MODEL:
CONV:
TYPE: 'Conv2d'
ADD_BLOCKS: ('DiverseBranchBlock',)
...
build resnet50_d with DBB
import torch
from zcls.config import cfg
from zcls.model.recognizers.build import build_recognizer

cfg.merge_from_file(args.config_file)
model = build_recognizer(cfg, device=torch.device('cpu'))
see test_dbblock.py
see model_fuse.py
$ python tools/model_fuse.py --help
usage: model_fuse.py [-h] [--verbose] CONFIG_FILE OUTPUT_DIR
Fuse block for ACBlock/RepVGGBLock/DBBlock
positional arguments:
CONFIG_FILE path to config file
OUTPUT_DIR path to output
optional arguments:
-h, --help show this help message and exit
--verbose Print Model Info
Structural re-parameterization is really a nice idea!!! By using ACBlock, I improved model precision on a dataset much bigger than ImageNet; I hope DBB can bring even better precision.
Thanks again!
Hello, one branch of the DBB module is average pooling; if it downsamples, the feature map gets smaller, so how can it be added to the outputs of the 1x1 and KxK conv branches?
Hello, I saw in your paper that a 1x1 conv is cascaded with a 3x3 conv. Can a 3x3 conv cascaded with another 3x3 conv be fused in theory? And if the feature map size must be preserved, so each conv needs padding, how should that be handled?
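(In theory two cascaded 3x3 convs without padding fuse into one 5x5 conv: the composite kernel is the full 2-D convolution of the two kernels, and the bias merges the same way as in the 1x1-KxK case. A sketch of this, my own illustration rather than code from this repo; with zero padding on each conv the same bias-padding caveat as Trans III applies.)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
conv1 = nn.Conv2d(8, 16, 3, bias=True)   # 3x3, no padding
conv2 = nn.Conv2d(16, 32, 3, bias=True)  # 3x3, no padding
merged = nn.Conv2d(8, 32, 5, bias=True)  # equivalent 5x5

# Composite kernel: full 2-D convolution of the two kernels, computed
# as cross-correlation against a flipped, channel-permuted first kernel.
k1, k2 = conv1.weight, conv2.weight
k = F.conv2d(k2, k1.permute(1, 0, 2, 3).flip([2, 3]), padding=2)
b = (k2 * conv1.bias.reshape(1, -1, 1, 1)).sum((1, 2, 3)) + conv2.bias
with torch.no_grad():
    merged.weight.copy_(k)
    merged.bias.copy_(b)

x = torch.randn(1, 8, 16, 16)
print((conv2(conv1(x)) - merged(x)).abs().max())  # at float-rounding level
```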
Why doesn't DBB include an identity branch? RepVGG tried identity. Is there a particular reason?
May I ask: can the trained model be used directly for prediction and evaluation without conversion?
If I replace the convolution blocks at certain positions in my own network with DBB modules, how do I obtain the reduced-parameter model described in the paper? After changing some modules, the parameter size grew from about 240M to about 330M, which is quite large.
Hello, when downloading the models from Google Drive / Baidu Yun, I found that resnet18 is a folder with no model inside, while resnet50 has a corresponding model. But when converting with convert.py, line 27, train_model.load_state_dict(ckpt), raises an error about mismatched keys. The error message is as follows (partially omitted):
RuntimeError: Error(s) in loading state_dict for ResNet:
Missing key(s) in state_dict: "stage1.0.conv2.dbb_avg.bn.bn.weight", "stage1.0.conv2.dbb_avg.bn.bn.bias", "stage1.0.conv2.dbb_avg.bn.bn.running_mean", "stage1.0.conv2.dbb_avg.bn.bn.running_var", "
stage1.0.conv2.dbb_1x1_kxk.bn1.bn.weight", "stage1.0.conv2.dbb_1x1_kxk.bn1.bn.bias", "stage1.0.conv2.dbb_1x1_kxk.bn1.bn.running_mean", "stage1.0.conv2.dbb_1x1_kxk.bn1.bn.running_var", "stage1.1.conv2.dbb_avg.bn.bn.weight", "stage1.1.conv2.dbb_avg.bn.bn.bias", "stage1.1.conv2.dbb_avg.bn.bn.running_mean", "stage1.1.conv2.dbb_avg.bn.bn.running_var", "stage1.1.conv2.dbb_1x1_kxk.bn1.bn.weight", "stage1.1.conv2.dbb_1x1_kxk.bn1.bn.bias", "stage1.1.conv2.dbb_1x1_kxk.bn1.bn.running_mean", "stage1.1.conv2.dbb_1x1_kxk.bn1.bn.running_var", "stage1.2.conv2.dbb_avg.bn.bn.weight", "stage1.2.conv2.dbb_avg.bn.bn.bias", "stage1.2.conv2.dbb_avg.bn.bn.running_mean", "stage1.2.conv2.dbb_avg.bn.bn.running_var", "stage1.2.conv2.dbb_1x1_kxk.bn1.bn.weight", "stage1.2.conv2.dbb_1x1_kxk.bn1.bn.bias", "stage1.2.conv2.dbb_1x1_kxk.bn1.bn.running_mean", "stage1.2.conv2.dbb_1x1_kxk.bn1.bn.running_var"........
Unexpected key(s) in state_dict: "stage1.0.conv2.dbb_avg.bn.weight", "stage1.0.conv2.dbb_avg.bn.bias", "stage1.0.conv2.dbb_avg.bn.running_mean", "stage1.0.conv2.dbb_avg.bn.running_var",
"stage1.0conv2.dbb_avg.bn.num_batches_tracked", "stage1.0.conv2.dbb_1x1_kxk.bn1.weight", "stage1.0.conv2.dbb_1x1_kxk.bn1.bias", "stage1.0.conv2.dbb_1x1_kxk.bn1.running_mean", "stage1.0.conv2.dbb_1x1_kxk.bn1.running_var", "stage1.0.conv2.dbb_1x1_kxk.bn1.num_batches_tracked", "stage1.1.conv2.dbb_avg.bn.weight", "stage1.1.conv2.dbb_avg.bn.bias", "stage1.1.conv2.dbb_avg.bn.running_mean", "stage1.1.conv2.dbb_avg.bn.runing_var", "stage1.1.conv2.dbb_avg.bn.num_batches_tracked", "stage1.1.conv2.dbb_1x1_kxk.bn1.weight", "stage1.1.conv2.dbb_1x1_kxk.bn1.bias", "stage1.1.conv2.dbb_1x1_kxk.bn1.running_mean", "stage1.1.conv2.dbb_1x1_kxk.bn1.running_var", "stage1.1.conv2.dbb_1x1_kxk.bn1.num_batches_tracked", "stage1.2.conv2.dbb_avg.bn.weight", "stage1.2.conv2.dbb_avg.bn.bias", "stage1.2.conv2.dbb_avg.bn.running_mean", "stage1.conv2.dbb_avg.bn.running_var", "stage1.2.conv2.dbb_avg.bn.num_batches_tracked", "stage1.2.conv2.dbb_1x1_kxk.bn1.weight", "stage1.2.conv2.dbb_1x1_kxk.bn1.bias", "stage1.2.conv2.dbb_1x1_kxk.bn1.running_mean", "stage1.2.conv2.dbb_1x1_kxk.bn1.running_var", "stage1.2.conv2.dbb_1x1_kxk.bn1.num_batches_tracked", "stage2.0.conv2.dbb_avg.bn.weight", "stage2.0.conv2.dbb_avg.bn.bias", "stage2.0.conv2.dbb_avg.bn.runing_mean", "stage2.0.conv2.dbb_avg.bn.running_var".......
Hello! I tried to use the DBB module to replace the shortcut structure of ResNet-50, but after the replacement, loading your pretrained model from Google Drive fails. How should I solve this?
I am new to this field and would appreciate any advice.
Why does the README say that, for your own model, DBB should actually replace a regular conv layer plus its BN layer, while the paper says DBB can replace a single regular conv layer?
Hi, your ACNet series is very exciting work. Where can I find this paper? Looking forward to reading it!
Hello, do you provide pretrained models?
DiverseBranchBlock/diversebranchblock.py
Line 105 in be15be7
Do you think there is a way to fuse a k x k conv followed by another k x k conv2d?
Hello author, I replaced the RepBlock in a model with DiverseBranchBlock. The performance improved slightly, but the number of parameters increased greatly. Could you offer any suggestions?
Hi,
I just wonder whether this should be F.pad(kernel, [W_pixels_to_pad, W_pixels_to_pad, H_pixels_to_pad, H_pixels_to_pad]), since F.pad's padding argument is ordered [padding_left, padding_right, padding_top, padding_bottom].
DiverseBranchBlock/dbb_transforms.py
Line 44 in cd627d5
Best
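(For reference, F.pad does take the pads for the last dimension first, i.e. [left, right, top, bottom] for a 4-D NCHW tensor, which a one-liner confirms:)

```python
import torch
import torch.nn.functional as F

x = torch.ones(1, 1, 2, 2)
# F.pad pads the LAST dimension first:
# [left, right, top, bottom] for an NCHW tensor.
y = F.pad(x, [1, 0, 0, 0])   # pad one column on the left (width dim)
print(y.shape)  # torch.Size([1, 1, 2, 3])
```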
Is there a TensorFlow 1.x version?
Can I replace IdentityBasedConv1x1 with a plain conv1x1?
Hi, after reading your paper I tried to use the DBB-based ResNet-18 network for my own multi-class task, as follows:

import torch
import torch.nn as nn
from DiverseBranchBlock.convnet_utils import switch_deploy_flag, switch_conv_bn_impl, build_model

def Dbb_Res(num_classes, pretrained=True):
    switch_deploy_flag(False)
    switch_conv_bn_impl('DBB')
    model = build_model('ResNet-18')
    if pretrained:
        model.load_state_dict(torch.load('DiverseBranchBlock/ResNet-18_DBB_7099.pth'))
    in_features = model.linear.in_features
    model.linear = nn.Linear(in_features, num_classes)
    return model

But in practice the results are terrible: a pretrained ResNet-18 reaches 80% accuracy, while the network built as above only reaches 6%. Am I calling it incorrectly, and if so, how should I adjust it? Thanks a lot!
Hello, I have read this paper and your 2019 paper, and I visualized the base, ACB, and DBB module structures with netron. The base and ACB modules display normally, but the DBB one looks wrong. Could you share a WeChat or QQ contact to help resolve this? Thanks.
Hello, thanks for your nice work!
I have a question: in Trans III you apply the 1x1 conv first and then the 3x3, which requires padding with the current bias after the 1x1. Could we instead do the 3x3 first and then the 1x1, so that no padding trick would be needed? Did you try that at the time?
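(For what it's worth, in the KxK-then-1x1 order the fusion is indeed exact even when the first conv zero-pads, because the 1x1 conv adds no padding of its own. A sketch of that merge, my own illustration rather than code from this repo:)

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv1 = nn.Conv2d(4, 8, 3, padding=1, bias=True)   # 3x3 first
conv2 = nn.Conv2d(8, 16, 1, bias=True)             # then 1x1: pads nothing
merged = nn.Conv2d(4, 16, 3, padding=1, bias=True)

# Merged kernel: mix conv1's output channels with the 1x1 weights.
k2 = conv2.weight[:, :, 0, 0]                      # (16, 8)
with torch.no_grad():
    merged.weight.copy_(torch.einsum('oc,cihw->oihw', k2, conv1.weight))
    merged.bias.copy_(k2 @ conv1.bias + conv2.bias)

x = torch.randn(1, 4, 8, 8)
# Exact even though conv1 zero-pads, since both sides pad x identically.
print((conv2(conv1(x)) - merged(x)).abs().max())  # at float-rounding level
```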