
iamhankai / ghostnet.pytorch

521 stars · 15 watchers · 118 forks · 622 KB

[CVPR2020] GhostNet: More Features from Cheap Operations

Home Page: https://arxiv.org/abs/1911.11907

Python 100.00%
convolutional-neural-networks mobilenetv3 model-compression pytorch fbnet

ghostnet.pytorch's People

Contributors

glenn-jocher, iamhankai


ghostnet.pytorch's Issues

Custom Weight Initialization

I noticed you use code for custom weight initialization:

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

I've not seen this before. Is there a reason behind this specific strategy? Do you know the effect this has on the training, and have you compared this with the pytorch default weight initialization? Thank you!
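For reference, PyTorch's default Conv2d initialization (visible in the traceback of a later issue on this page) is Kaiming-uniform with a=sqrt(5) in fan_in mode, while the snippet above uses Kaiming-normal in fan_out mode, the scheme torchvision's ResNet also applies for ReLU networks. A minimal side-by-side sketch:

    import math
    import torch.nn as nn

    conv = nn.Conv2d(16, 32, 3)

    # PyTorch's default, as applied inside Conv2d.reset_parameters():
    nn.init.kaiming_uniform_(conv.weight, a=math.sqrt(5))

    # The scheme used in this repo: Kaiming-normal in fan_out mode,
    # matched to the ReLU nonlinearity that follows each conv.
    nn.init.kaiming_normal_(conv.weight, mode='fan_out', nonlinearity='relu')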

Training hyperparameters for ResNet-56 and Ghost-ResNet-56 in the paper

Hi, while reproducing the paper's ResNet-56 experiment I ran into the following issue: the ResNet-56 baseline I train reaches noticeably higher accuracy than reported in the paper. Could you share your ResNet-56 training hyperparameters? And does Ghost-ResNet-56 use the same training hyperparameters as the baseline?

Using GhostNet as the backbone for anchor-free object detection

Hi, I'm using GhostNet as the backbone of an anchor-free object detector with seven classes, taking layers 5, 10, and 16 as FPN inputs; the training set has about 13,000 images. After 270 epochs the loss is 0.78, whereas with an MnasNet backbone the loss drops into the 0.0x range after about 100 epochs. How should I set the hyperparameters? Currently: SGD, lr=0.0001, momentum=0.9, weight_decay=1e-4, cosine LR schedule. Any advice would be appreciated, thanks.

Loss and Epochs

Hi @iamhankai

I am attempting to re-train GhostNet (MobileNetV3) on the ImageNet dataset with some changes to the activations and fine-tuned hyperparameters. Could you please tell me how many epochs you trained the network for and what your final loss value was? I could not find these in the paper. Thanks.

About SELayer

After the excitation layer there is a clamp operation rather than a sigmoid. What is the reason for this modification?
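For reference, clamping to [0, 1] acts as a hard, cheaper stand-in for the sigmoid gate, similar in spirit to the hard sigmoid of MobileNetV3; both map the excitation output into [0, 1]. A minimal sketch, assuming y is the excitation output:

    import torch

    y = torch.randn(8, 16, 1, 1)        # hypothetical excitation output

    gate_smooth = torch.sigmoid(y)      # smooth gate in (0, 1)
    gate_hard = torch.clamp(y, 0, 1)    # hard gate: identity on [0, 1], saturated outside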

Why does weight initialization fail when I replace nn.Conv2d with GhostModule?

Following your suggestion, I replaced the nn.Conv2d layers in my network with GhostModule, with no other changes. When I test the network, it fails with the following error:
Traceback (most recent call last):
  File "C:/Users/luan/Downloads/YOLOv4-PyTorch-master/CSPDarknet53.py", line 184, in <module>
    model = CSPDarknet53()
  File "C:/Users/luan/Downloads/YOLOv4-PyTorch-master/CSPDarknet53.py", line 154, in __init__
    self.stem_conv = Conv(3, stem_channels, 3)
  File "C:/Users/luan/Downloads/YOLOv4-PyTorch-master/CSPDarknet53.py", line 67, in __init__
    MyConv2d(in_channels, out_channels, kernel_size, stride),
  File "C:/Users/luan/Downloads/YOLOv4-PyTorch-master/CSPDarknet53.py", line 37, in __init__
    nn.Conv2d(init_channels, new_channels, dw_size, 1, dw_size//2, groups=init_channels, bias=False),
  File "C:\anaconda\lib\site-packages\torch\nn\modules\conv.py", line 338, in __init__
    False, _pair(0), groups, bias, padding_mode)
  File "C:\anaconda\lib\site-packages\torch\nn\modules\conv.py", line 53, in __init__
    self.reset_parameters()
  File "C:\anaconda\lib\site-packages\torch\nn\modules\conv.py", line 56, in reset_parameters
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
  File "C:\anaconda\lib\site-packages\torch\nn\init.py", line 322, in kaiming_uniform_
    fan = _calculate_correct_fan(tensor, mode)
  File "C:\anaconda\lib\site-packages\torch\nn\init.py", line 291, in _calculate_correct_fan
    fan_in, fan_out = _calculate_fan_in_and_fan_out(tensor)
  File "C:\anaconda\lib\site-packages\torch\nn\init.py", line 223, in _calculate_fan_in_and_fan_out
    receptive_field_size = tensor[0][0].numel()
IndexError: index 0 is out of bounds for dimension 0 with size 0
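The failing call constructs a convolution with zero output channels, so its weight tensor is empty and the fan computation indexes an empty dimension. Assuming the GhostModule channel arithmetic from this repo, one way this can arise is ratio=1, which leaves the cheap branch with no channels; a minimal sketch:

    import math

    oup, ratio = 32, 1                          # ratio=1 is one way to hit the error
    init_channels = math.ceil(oup / ratio)      # 32
    new_channels = init_channels * (ratio - 1)  # 0 -> zero-channel cheap branch

    # nn.Conv2d(init_channels, new_channels, 3, 1, 1, groups=init_channels) then
    # fails inside reset_parameters(): the weight has shape [0, 1, 3, 3], and
    # _calculate_fan_in_and_fan_out() indexes weight[0][0] on an empty dimension.
    if new_channels == 0:
        print('GhostModule needs ratio >= 2 so the cheap branch has output channels')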

What are the hyperparameters for PyTorch training?

Excellent network!! Do you have ImageNet results for GhostNet without the SE blocks? How much does SE affect accuracy? And what hyperparameters were used for the PyTorch training?

Question about kernel sizes

Why is the kernel of cheap_operation in the ghost module always 3? Shouldn't this parameter be kept consistent with the kernel size in cfgs? As far as I can tell, the cfgs kernel size only takes effect when stride is 2, where it applies to the depthwise conv between the two ghost modules and to the depthwise conv in the shortcut.

How to apply it to a ResNet model?

1. For the experiments in the paper (Tables 4 and 5), which convs were replaced with ghost_conv? In the Bottleneck (1x1 at the head and tail, 3x3 in the middle), were all of them replaced? (A sketch of this substitution follows after the list.)

2. Are the current experiments all aimed at model compression? Is there an experiment where the parameters and FLOPs match the baseline (ignoring the group conv inside ghost), showing that applying ghost_conv improves accuracy?
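For concreteness, here is a hypothetical sketch of the substitution the first question describes, with all three convs of a torchvision-style Bottleneck swapped for GhostModule. This illustrates the question, not necessarily what the paper did; the import path and helper name are assumptions:

    import torch
    import torch.nn as nn
    from ghost_net import GhostModule  # assuming this repo's ghost_net.py

    def ghost_bottleneck_convs(inp, width, oup):
        # Sketch only: ignores stride and the residual/downsample path.
        return nn.Sequential(
            GhostModule(inp, width, kernel_size=1, relu=True),    # head 1x1
            GhostModule(width, width, kernel_size=3, relu=True),  # middle 3x3
            GhostModule(width, oup, kernel_size=1, relu=False),   # tail 1x1
        )

    blk = ghost_bottleneck_convs(256, 64, 256)
    print(blk(torch.randn(1, 256, 56, 56)).shape)  # torch.Size([1, 256, 56, 56])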

GhostNet for fine-grained image classification

Hello! I recently used the pretrained ghostnet1.0x for fine-grained image classification on the CUB-200 dataset, changing GhostNet only slightly: the final classification layer has num_classes=200. I ran into a problem: with SGD as the optimizer and CLR as the learning-rate policy, the model fits extremely quickly. Training accuracy reaches 99.9% within 10 epochs; test accuracy reaches 75% by epoch 23 and then stops improving. I can't work out why the training set is fit so quickly, or why the gap between training and test accuracy is so large. I'd appreciate the author's advice.

(two screenshots attached, 2020-06-14)

How to init GhostModule

I tried to re-implement the Ghost-ResNet-56 from the paper, but I can't match the author's performance (my accuracy is about 91% and mAP is 89.5%). I think the problem is the missing ghost module initialization, but I'm not sure.
I reviewed the ghost module and nn.Conv2d and found that the ghost module has no weight-initialization step, while GhostNet itself does (see below); that initialization also doesn't cover linear and ReLU layers.
I wonder how to design a weight-initialization function for a ResNet-56.

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
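One possible extension, assuming torchvision-style conventions (note that ReLU has no learnable parameters, so it needs no initialization), is to cover nn.Linear as well; a sketch:

    import torch.nn as nn

    def _initialize_weights(self):
        # Same scheme as above, extended to nn.Linear; ReLU is parameter-free.
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)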

width_multi

(figure attached)
What are the width_multi values of the six GhostNet variants in this figure?

Stride=1 cfg Kernel Sizes Never Used?

The first column of the cfgs is the kernel size k:

cfgs = [
# k, t, c, SE, s
[3, 16, 16, 0, 1],
[3, 48, 24, 0, 2],
[3, 72, 24, 0, 1],
[5, 72, 40, 1, 2],
[5, 120, 40, 1, 1],
[3, 240, 80, 0, 2],
[3, 200, 80, 0, 1],
[3, 184, 80, 0, 1],
[3, 184, 80, 0, 1],
[3, 480, 112, 1, 1],
[3, 672, 112, 1, 1],
[5, 672, 160, 1, 2],
[5, 960, 160, 0, 1],
[5, 960, 160, 1, 1],
[5, 960, 160, 0, 1],
[5, 960, 160, 1, 1]
]
return GhostNet(cfgs, **kwargs)

But unless stride=2, it appears the GhostBottleneck() module does not use the cfg kernel size. Is this correct? If so, would it be clearer to set these to 1 in the cfg?

class GhostBottleneck(nn.Module):
    def __init__(self, inp, hidden_dim, oup, kernel_size, stride, use_se):
        super(GhostBottleneck, self).__init__()
        assert stride in [1, 2]
        self.conv = nn.Sequential(
            # pw
            GhostModule(inp, hidden_dim, kernel_size=1, relu=True),
            # dw
            depthwise_conv(hidden_dim, hidden_dim, kernel_size, stride, relu=False) if stride == 2 else nn.Sequential(),
            # Squeeze-and-Excite
            SELayer(hidden_dim) if use_se else nn.Sequential(),
            # pw-linear
            GhostModule(hidden_dim, oup, kernel_size=1, relu=False),
        )
        if stride == 1 and inp == oup:
            self.shortcut = nn.Sequential()
        else:
            self.shortcut = nn.Sequential(
                depthwise_conv(inp, inp, kernel_size, stride, relu=False),
                nn.Conv2d(inp, oup, 1, 1, 0, bias=False),
                nn.BatchNorm2d(oup),
            )

    def forward(self, x):
        return self.conv(x) + self.shortcut(x)

Cannot run the sample

Hi, I used the sample code you provided, but it cannot run.

This is the error:

RuntimeError: Trying to create tensor with negative dimension -1: [80, 1, -1, -1]

pretrained pth model for pytorch

Dear author:
Thanks for your timely open-source implementation.
It would be great if you could publish pretrained .pth weights for the ImageNet dataset. Thank you.

cfgs for cifar-10

Hello, can you provide cfgs (I see the PyTorch version) for constructing GhostNet for CIFAR-10? Thanks.

Test speed, FLOPs, and parameters

Hi, I have tested the network against MobileNetV2, but it is not faster, even though GhostNet has far fewer FLOPs:

ghostnet: flops 147.505M, params 3.903M
mobilenetv2: flops 312.852M, params 2.225M

So I don't know what's wrong with it; GhostNet's parameter count is actually higher. My timing code:

    x = torch.randn(32, 3, 224, 224)
    t = time.time()
    for _ in range(30):
        with torch.no_grad():
            inputs = x.cuda()
            outputs = model(inputs)
        print(time.time() - t)
        t = time.time()

It seems that MobileNetV2 is faster than GhostNet?
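One caveat worth noting: CUDA kernels launch asynchronously, so wall-clock deltas taken without synchronization mostly measure launch overhead rather than compute; depthwise/grouped convolutions are also memory-bound on GPU, so fewer FLOPs does not always mean lower latency. A timing sketch with explicit synchronization, assuming model is already on the GPU:

    import time
    import torch

    x = torch.randn(32, 3, 224, 224).cuda()

    with torch.no_grad():
        for _ in range(10):           # warm-up so one-time allocations aren't timed
            model(x)
        torch.cuda.synchronize()      # drain queued kernels before starting the clock
        t = time.time()
        for _ in range(30):
            model(x)
        torch.cuda.synchronize()      # ensure all 30 batches actually finished
        print((time.time() - t) / 30, 's per batch')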

How to set the channels?

Hello, I want to set the output channels to 128, but after I changed the parameters, the result is worse than ResNet-34. Could you please give me some advice?

Looking forward to your reply.

Training hyperparameters for Ghost-VGG-16 and Ghost-ResNet-56

Hi, thanks for sharing the source code.
I replaced all the convs in VGG-16 with GhostModule and trained on CIFAR-10, but accuracy dropped by 2 percentage points. Could you share the training hyperparameter settings for Ghost-VGG-16 and Ghost-ResNet-56?

Robustness to linear quantization

I recently tested the Ghost Module in a lightweight U-Net and found it provides a speed gain at the cost of a 1-2% accuracy drop. After linear quantization, however, the accuracy is much lower. Has the author done any research on this?

cheap_operation

Thank you for the implementation. I have some questions; see the sketch after this list:

  1. What is the difference between cheap_operation and a depthwise separable convolution?
  2. I was expecting to see a linear transformation in cheap_operation, but it is an ordinary convolution.
  3. Where is the cheap linear transformation implemented?

Thank you again
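For what it's worth, the cheap operation in this implementation (its constructor call is visible in the traceback of an earlier issue on this page) is a grouped convolution with groups equal to its input channels, i.e. a depthwise convolution. Since convolution is a linear map, each output map is a cheap linear transformation of one intrinsic feature map; the difference from a depthwise separable convolution is that no pointwise 1x1 conv follows it. A sketch with illustrative channel counts:

    import torch.nn as nn

    init_channels, new_channels, dw_size = 8, 8, 3  # illustrative values

    # groups=init_channels makes this depthwise: each input map gets its own
    # dw_size x dw_size filters, and no pointwise conv follows (which is the
    # difference from a depthwise separable convolution).
    cheap_operation = nn.Sequential(
        nn.Conv2d(init_channels, new_channels, dw_size, 1, dw_size // 2,
                  groups=init_channels, bias=False),
        nn.BatchNorm2d(new_channels),
        nn.ReLU(inplace=True),
    )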
