
iamhankai / ghostnet.pytorch

521 stars · 15 watchers · 118 forks · 622 KB

[CVPR2020] GhostNet: More Features from Cheap Operations

Home Page: https://arxiv.org/abs/1911.11907

Python 100.00%
convolutional-neural-networks mobilenetv3 model-compression pytorch fbnet

ghostnet.pytorch's People

Contributors

glenn-jocher, iamhankai


ghostnet.pytorch's Issues

Custom Weight Initialization

I noticed you use code for custom weight initialization:

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

I've not seen this before. Is there a reason behind this specific strategy? Do you know the effect this has on the training, and have you compared this with the pytorch default weight initialization? Thank you!
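For reference, PyTorch's default Conv2d initialization (visible in the traceback of a later issue on this page) is Kaiming-uniform with a=sqrt(5) in fan_in mode, while the snippet above uses Kaiming-normal in fan_out mode, the scheme torchvision's ResNet also applies for ReLU networks. A minimal side-by-side sketch:

    import math
    import torch.nn as nn

    conv = nn.Conv2d(16, 32, 3)

    # PyTorch's default, as applied inside Conv2d.reset_parameters():
    nn.init.kaiming_uniform_(conv.weight, a=math.sqrt(5))

    # The scheme used in this repo: Kaiming-normal in fan_out mode,
    # matched to the ReLU nonlinearity that follows each conv.
    nn.init.kaiming_normal_(conv.weight, mode='fan_out', nonlinearity='relu')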

Training hyperparameters for ResNet-56 and Ghost-ResNet-56 in the paper

Hi, while reproducing the paper's ResNet-56 experiment I ran into the following issue: the ResNet-56 baseline I train reaches noticeably higher accuracy than reported in the paper. Could you share your ResNet-56 training hyperparameters? And does Ghost-ResNet-56 use the same training hyperparameters as the baseline?

Using GhostNet as the backbone for anchor-free object detection

Hi, I'm using GhostNet as the backbone of an anchor-free object detector with seven classes, taking layers 5, 10, and 16 as FPN inputs; the training set has about 13,000 images. After 270 epochs the loss is 0.78, whereas with an MnasNet backbone the loss drops into the 0.0x range after about 100 epochs. How should I set the hyperparameters? Currently: SGD, lr=0.0001, momentum=0.9, weight_decay=1e-4, cosine LR schedule. Any advice would be appreciated, thanks.

Loss and Epochs

Hi @iamhankai

I am attempting to re-train GhostNet (MobileNetV3) on the ImageNet dataset with some changes to the activations and fine-tuned hyperparameters. Could you please tell me how many epochs you trained the network for and what your final loss value was? I could not find these in the paper. Thanks.

About SELayer

After the excitation layer there is a clamp operation rather than a sigmoid. What is the reason for this modification?
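For reference, clamping to [0, 1] acts as a hard, cheaper stand-in for the sigmoid gate, similar in spirit to the hard sigmoid of MobileNetV3; both map the excitation output into [0, 1]. A minimal sketch, assuming y is the excitation output:

    import torch

    y = torch.randn(8, 16, 1, 1)        # hypothetical excitation output

    gate_smooth = torch.sigmoid(y)      # smooth gate in (0, 1)
    gate_hard = torch.clamp(y, 0, 1)    # hard gate: identity on [0, 1], saturated outside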

Why does weight initialization fail when I replace nn.Conv2d with GhostModule?

Following your suggestion, I replaced the nn.Conv2d layers in my network with GhostModule, with no other changes. When I test the network, it fails with the following error:
Traceback (most recent call last):
  File "C:/Users/luan/Downloads/YOLOv4-PyTorch-master/CSPDarknet53.py", line 184, in <module>
    model = CSPDarknet53()
  File "C:/Users/luan/Downloads/YOLOv4-PyTorch-master/CSPDarknet53.py", line 154, in __init__
    self.stem_conv = Conv(3, stem_channels, 3)
  File "C:/Users/luan/Downloads/YOLOv4-PyTorch-master/CSPDarknet53.py", line 67, in __init__
    MyConv2d(in_channels, out_channels, kernel_size, stride),
  File "C:/Users/luan/Downloads/YOLOv4-PyTorch-master/CSPDarknet53.py", line 37, in __init__
    nn.Conv2d(init_channels, new_channels, dw_size, 1, dw_size//2, groups=init_channels, bias=False),
  File "C:\anaconda\lib\site-packages\torch\nn\modules\conv.py", line 338, in __init__
    False, _pair(0), groups, bias, padding_mode)
  File "C:\anaconda\lib\site-packages\torch\nn\modules\conv.py", line 53, in __init__
    self.reset_parameters()
  File "C:\anaconda\lib\site-packages\torch\nn\modules\conv.py", line 56, in reset_parameters
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
  File "C:\anaconda\lib\site-packages\torch\nn\init.py", line 322, in kaiming_uniform_
    fan = _calculate_correct_fan(tensor, mode)
  File "C:\anaconda\lib\site-packages\torch\nn\init.py", line 291, in _calculate_correct_fan
    fan_in, fan_out = _calculate_fan_in_and_fan_out(tensor)
  File "C:\anaconda\lib\site-packages\torch\nn\init.py", line 223, in _calculate_fan_in_and_fan_out
    receptive_field_size = tensor[0][0].numel()
IndexError: index 0 is out of bounds for dimension 0 with size 0
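The failing call constructs a convolution with zero output channels, so its weight tensor is empty and the fan computation indexes an empty dimension. Assuming the GhostModule channel arithmetic from this repo, one way this can arise is ratio=1, which leaves the cheap branch with no channels; a minimal sketch:

    import math

    oup, ratio = 32, 1                          # ratio=1 is one way to hit the error
    init_channels = math.ceil(oup / ratio)      # 32
    new_channels = init_channels * (ratio - 1)  # 0 -> zero-channel cheap branch

    # nn.Conv2d(init_channels, new_channels, 3, 1, 1, groups=init_channels) then
    # fails inside reset_parameters(): the weight has shape [0, 1, 3, 3], and
    # _calculate_fan_in_and_fan_out() indexes weight[0][0] on an empty dimension.
    if new_channels == 0:
        print('GhostModule needs ratio >= 2 so the cheap branch has output channels')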

What are the hyperparameters for PyTorch training?

Excellent network!! Do you have ImageNet results for GhostNet without the SE blocks? How much does SE affect accuracy? And what hyperparameters were used for the PyTorch training?

Question about kernel sizes

Why is the kernel of cheap_operation in the ghost module always 3? Shouldn't this parameter be kept consistent with the kernel size in cfgs? As far as I can tell, the cfgs kernel size only takes effect when stride is 2, where it applies to the depthwise conv between the two ghost modules and to the depthwise conv in the shortcut.

How to apply it to a ResNet model?

1. For the experiments in the paper (Tables 4 and 5), which convs were replaced with ghost_conv? In the Bottleneck (1x1 at the head and tail, 3x3 in the middle), were all of them replaced? (A sketch of this substitution follows after the list.)

2. Are the current experiments all aimed at model compression? Is there an experiment where the parameters and FLOPs match the baseline (ignoring the group conv inside ghost), showing that applying ghost_conv improves accuracy?
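For concreteness, here is a hypothetical sketch of the substitution the first question describes, with all three convs of a torchvision-style Bottleneck swapped for GhostModule. This illustrates the question, not necessarily what the paper did; the import path and helper name are assumptions:

    import torch
    import torch.nn as nn
    from ghost_net import GhostModule  # assuming this repo's ghost_net.py

    def ghost_bottleneck_convs(inp, width, oup):
        # Sketch only: ignores stride and the residual/downsample path.
        return nn.Sequential(
            GhostModule(inp, width, kernel_size=1, relu=True),    # head 1x1
            GhostModule(width, width, kernel_size=3, relu=True),  # middle 3x3
            GhostModule(width, oup, kernel_size=1, relu=False),   # tail 1x1
        )

    blk = ghost_bottleneck_convs(256, 64, 256)
    print(blk(torch.randn(1, 256, 56, 56)).shape)  # torch.Size([1, 256, 56, 56])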

GhostNet for fine-grained image classification

Hello! I recently used the pretrained ghostnet1.0x for fine-grained image classification on the CUB-200 dataset, changing GhostNet only slightly: the final classification layer has num_classes=200. I ran into a problem: with SGD as the optimizer and CLR as the learning-rate policy, the model fits extremely quickly. Training accuracy reaches 99.9% within 10 epochs; test accuracy reaches 75% by epoch 23 and then stops improving. I can't work out why the training set is fit so quickly, or why the gap between training and test accuracy is so large. I'd appreciate the author's advice.

(two screenshots attached, 2020-06-14)

How to init GhostModule

I tried to re-implement the Ghost-ResNet-56 from the paper, but I can't match the author's performance (my accuracy is about 91% and mAP is 89.5%). I think the problem is the missing ghost module initialization, but I'm not sure.
I reviewed the ghost module and nn.Conv2d and found that the ghost module has no weight-initialization step, while GhostNet itself does (see below); that initialization also doesn't cover linear and ReLU layers.
I wonder how to design a weight-initialization function for a ResNet-56.

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
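One possible extension, assuming torchvision-style conventions (note that ReLU has no learnable parameters, so it needs no initialization), is to cover nn.Linear as well; a sketch:

    import torch.nn as nn

    def _initialize_weights(self):
        # Same scheme as above, extended to nn.Linear; ReLU is parameter-free.
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)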

width_multi

(figure attached)
What are the width_multi values of the six GhostNet variants in this figure?

Stride=1 cfg Kernel Sizes Never Used?

The first column of the cfgs is the kernel size k:

cfgs = [
# k, t, c, SE, s
[3, 16, 16, 0, 1],
[3, 48, 24, 0, 2],
[3, 72, 24, 0, 1],
[5, 72, 40, 1, 2],
[5, 120, 40, 1, 1],
[3, 240, 80, 0, 2],
[3, 200, 80, 0, 1],
[3, 184, 80, 0, 1],
[3, 184, 80, 0, 1],
[3, 480, 112, 1, 1],
[3, 672, 112, 1, 1],
[5, 672, 160, 1, 2],
[5, 960, 160, 0, 1],
[5, 960, 160, 1, 1],
[5, 960, 160, 0, 1],
[5, 960, 160, 1, 1]
]
return GhostNet(cfgs, **kwargs)

But unless stride=2, it appears the GhostBottleneck() module does not use the cfg kernel size. Is this correct? If so, would it be clearer to set these to 1 in the cfg?

class GhostBottleneck(nn.Module):
    def __init__(self, inp, hidden_dim, oup, kernel_size, stride, use_se):
        super(GhostBottleneck, self).__init__()
        assert stride in [1, 2]
        self.conv = nn.Sequential(
            # pw
            GhostModule(inp, hidden_dim, kernel_size=1, relu=True),
            # dw
            depthwise_conv(hidden_dim, hidden_dim, kernel_size, stride, relu=False) if stride == 2 else nn.Sequential(),
            # Squeeze-and-Excite
            SELayer(hidden_dim) if use_se else nn.Sequential(),
            # pw-linear
            GhostModule(hidden_dim, oup, kernel_size=1, relu=False),
        )
        if stride == 1 and inp == oup:
            self.shortcut = nn.Sequential()
        else:
            self.shortcut = nn.Sequential(
                depthwise_conv(inp, inp, kernel_size, stride, relu=False),
                nn.Conv2d(inp, oup, 1, 1, 0, bias=False),
                nn.BatchNorm2d(oup),
            )

    def forward(self, x):
        return self.conv(x) + self.shortcut(x)

Cannot run the sample

Hi, I used the sample code you provided, but it cannot run.

This is the error:

RuntimeError: Trying to create tensor with negative dimension -1: [80, 1, -1, -1]

pretrained pth model for pytorch

Dear author:
Thanks for your timely open-source implementation.
It would be great if you could publish pretrained .pth weights for the ImageNet dataset. Thank you.

cfgs for cifar-10

Hello, can you provide cfgs (I see the PyTorch version) for constructing GhostNet for CIFAR-10? Thanks.

Test speed, FLOPs, and parameters

Hi, I have tested the network against MobileNetV2, but it is not faster, even though GhostNet has far fewer FLOPs:

ghostnet: flops 147.505M, params 3.903M
mobilenetv2: flops 312.852M, params 2.225M

So I don't know what's wrong with it; GhostNet's parameter count is actually higher. My timing code:

    x = torch.randn(32, 3, 224, 224)
    t = time.time()
    for _ in range(30):
        with torch.no_grad():
            inputs = x.cuda()
            outputs = model(inputs)
        print(time.time() - t)
        t = time.time()

It seems that MobileNetV2 is faster than GhostNet?
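One caveat worth noting: CUDA kernels launch asynchronously, so wall-clock deltas taken without synchronization mostly measure launch overhead rather than compute; depthwise/grouped convolutions are also memory-bound on GPU, so fewer FLOPs does not always mean lower latency. A timing sketch with explicit synchronization, assuming model is already on the GPU:

    import time
    import torch

    x = torch.randn(32, 3, 224, 224).cuda()

    with torch.no_grad():
        for _ in range(10):           # warm-up so one-time allocations aren't timed
            model(x)
        torch.cuda.synchronize()      # drain queued kernels before starting the clock
        t = time.time()
        for _ in range(30):
            model(x)
        torch.cuda.synchronize()      # ensure all 30 batches actually finished
        print((time.time() - t) / 30, 's per batch')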

How to set the channels?

Hello, I want to set the output channels to 128, but after I changed the parameters, the result is worse than ResNet-34. Could you please give me some advice?

Looking forward to your reply.

Training hyperparameters for Ghost-VGG-16 and Ghost-ResNet-56

Hi, thanks for sharing the source code.
I replaced all the convs in VGG-16 with GhostModule and trained on CIFAR-10, but accuracy dropped by 2 percentage points. Could you share the training hyperparameter settings for Ghost-VGG-16 and Ghost-ResNet-56?

Robustness to linear quantization

I recently tested the Ghost Module in a lightweight U-Net and found it provides a speed gain at the cost of a 1-2% accuracy drop. After linear quantization, however, the accuracy is much lower. Has the author done any research on this?

cheap_operation

Thank you for the implementation. I have some questions; see the sketch after this list:

  1. What is the difference between cheap_operation and a depthwise separable convolution?
  2. I was expecting to see a linear transformation in cheap_operation, but it is an ordinary convolution.
  3. Where is the cheap linear transformation implemented?

Thank you again
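For what it's worth, the cheap operation in this implementation (its constructor call is visible in the traceback of an earlier issue on this page) is a grouped convolution with groups equal to its input channels, i.e. a depthwise convolution. Since convolution is a linear map, each output map is a cheap linear transformation of one intrinsic feature map; the difference from a depthwise separable convolution is that no pointwise 1x1 conv follows it. A sketch with illustrative channel counts:

    import torch.nn as nn

    init_channels, new_channels, dw_size = 8, 8, 3  # illustrative values

    # groups=init_channels makes this depthwise: each input map gets its own
    # dw_size x dw_size filters, and no pointwise conv follows (which is the
    # difference from a depthwise separable convolution).
    cheap_operation = nn.Sequential(
        nn.Conv2d(init_channels, new_channels, dw_size, 1, dw_size // 2,
                  groups=init_channels, bias=False),
        nn.BatchNorm2d(new_channels),
        nn.ReLU(inplace=True),
    )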
