iamhankai / ghostnet.pytorch
[CVPR 2020] GhostNet: More Features from Cheap Operations
Home Page: https://arxiv.org/abs/1911.11907
Hello,
Could you please share the code you use to obtain the parameters and FLOPs of your validated models? I'm getting conflicting numbers for ResNet-50 + Ghost using the FLOPs/params counter provided here: https://github.com/sovrasov/flops-counter.pytorch
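For reference, a minimal sketch of how that counter (the ptflops package from the repo above) is typically called; resnet50() here is only a stand-in for the Ghost variant. One common source of conflicting numbers is that the tool reports multiply-accumulates (MACs), which differ from FLOPs by a factor of two:

import torchvision.models as models
from ptflops import get_model_complexity_info  # sovrasov/flops-counter.pytorch

model = models.resnet50()  # stand-in; swap in the Ghost-ResNet-50 under test
macs, params = get_model_complexity_info(
    model, (3, 224, 224), as_strings=True, print_per_layer_stat=False)
print(macs, params)  # e.g. "4.09 GMac", "25.56 M" for plain ResNet-50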
I noticed you use code for custom weight initialization:
Lines 162 to 169 in 2c90e67
I've not seen this before. Is there a reason behind this specific strategy? Do you know what effect it has on training, and have you compared it with the PyTorch default weight initialization? Thank you!
Hi, I ran into the following issue while reproducing the ResNet-56 experiments from the paper: the ResNet-56 baseline I trained reaches noticeably higher accuracy than reported in the paper. Could you share your ResNet-56 training hyperparameters? And does Ghost-ResNet-56 use the same training hyperparameters as the baseline?
Hi, I'm using GhostNet as the backbone of an anchor-free object detector with seven classes, taking layers 5, 10, and 16 as FPN inputs, and training on 13k images. After 270 epochs the loss is 0.78, whereas with a MnasNet backbone the loss had already dropped into the 0.0x range after 100-odd epochs. How should I set the hyperparameters? Currently: SGD, lr=0.0001, momentum=0.9, weight_decay=1e-4, cosine LR. Could you give me some advice? Thanks.
Hi @iamhankai
I am attempting to re-train GhostNet (MobileNetV3) on the ImageNet dataset with some changes to the activations and fine-tuned hyperparameters. Could you please tell me how many epochs you trained the network for and what your final loss value was? I could not find these in the paper. Thanks.
After the excitation layer there is a clamp operation rather than a sigmoid. Why this modification?
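For concreteness, a minimal sketch of the two gating choices (my reading of the code, not the authors' confirmed rationale): the clamp acts as a cheap, piecewise-linear "hard" gate over [0, 1] in place of the smooth sigmoid.

import torch

y = torch.randn(4)                # excitation output, shape assumed
smooth_gate = torch.sigmoid(y)    # standard SE gate, values in (0, 1)
hard_gate = torch.clamp(y, 0, 1)  # clamp variant, values in [0, 1], cheaper
print(smooth_gate, hard_gate)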
Following your suggestion, I replaced the nn.Conv2d in my network with GhostModule, changing nothing else. Testing the network then raises the following error:
Traceback (most recent call last):
  File "C:/Users/luan/Downloads/YOLOv4-PyTorch-master/CSPDarknet53.py", line 184, in <module>
    model = CSPDarknet53()
  File "C:/Users/luan/Downloads/YOLOv4-PyTorch-master/CSPDarknet53.py", line 154, in __init__
    self.stem_conv = Conv(3, stem_channels, 3)
  File "C:/Users/luan/Downloads/YOLOv4-PyTorch-master/CSPDarknet53.py", line 67, in __init__
    MyConv2d(in_channels, out_channels, kernel_size, stride),
  File "C:/Users/luan/Downloads/YOLOv4-PyTorch-master/CSPDarknet53.py", line 37, in __init__
    nn.Conv2d(init_channels, new_channels, dw_size, 1, dw_size//2, groups=init_channels, bias=False),
  File "C:\anaconda\lib\site-packages\torch\nn\modules\conv.py", line 338, in __init__
    False, _pair(0), groups, bias, padding_mode)
  File "C:\anaconda\lib\site-packages\torch\nn\modules\conv.py", line 53, in __init__
    self.reset_parameters()
  File "C:\anaconda\lib\site-packages\torch\nn\modules\conv.py", line 56, in reset_parameters
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
  File "C:\anaconda\lib\site-packages\torch\nn\init.py", line 322, in kaiming_uniform_
    fan = _calculate_correct_fan(tensor, mode)
  File "C:\anaconda\lib\site-packages\torch\nn\init.py", line 291, in _calculate_correct_fan
    fan_in, fan_out = _calculate_fan_in_and_fan_out(tensor)
  File "C:\anaconda\lib\site-packages\torch\nn\init.py", line 223, in _calculate_fan_in_and_fan_out
    receptive_field_size = tensor[0][0].numel()
IndexError: index 0 is out of bounds for dimension 0 with size 0
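A hedged reading of this traceback, not a confirmed diagnosis: the constructor on line 37 computes the cheap conv's output channels as init_channels * (ratio - 1), so a ratio of 1 produces a zero-channel conv whose empty weight tensor is exactly what the Kaiming init trips over. A minimal reproduction, with the channel math assumed from the GhostModule in this repo:

import math
import torch.nn as nn

oup, ratio, dw_size = 32, 1, 3              # ratio=1 is the suspect setting
init_channels = math.ceil(oup / ratio)      # 32
new_channels = init_channels * (ratio - 1)  # 0 -> empty weight tensor below
nn.Conv2d(init_channels, new_channels, dw_size, 1, dw_size // 2,
          groups=init_channels, bias=False)  # IndexError during Kaiming init
                                             # (in the torch version shown above)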
Line 101 in 8c601df
cheap_operation is linear in the paper, so why ReLU here?
Great network! Do you have ImageNet results for GhostNet without the SE blocks? How much does SE affect accuracy? And what are the hyperparameters for the PyTorch training?
Why is the kernel size of cheap_operation in the Ghost module always 3? Shouldn't this parameter match the kernel_size in cfgs? As far as I can tell, the kernel_size in cfgs only takes effect when stride equals 2, where it applies to the depthwise conv between the two Ghost modules and the depthwise conv in the shortcut.
1. For the experiments in the paper (Tables 4/5), which convs were replaced with ghost_conv?
In the bottleneck (1x1 at head and tail, 3x3 in the middle), were all of them replaced?
2. Are the current experiments all aimed at compressing the model? Is there an experiment that keeps parameters and FLOPs the same as the baseline (ignoring the group conv inside Ghost) and concludes that applying ghost_conv improves accuracy?
In your TensorFlow version, the activation is BNNoReLU:

res = DepthConv(end_point, net, conv_def.kernel, stride=layer_stride,
                data_format='NHWC', activation=BNNoReLU)

but in the PyTorch version, relu=True:

self.shortcut = nn.Sequential(
    depthwise_conv(inp, inp, 3, stride, relu=True),
    nn.Conv2d(inp, oup, 1, 1, 0, bias=False),
    nn.BatchNorm2d(oup),
)
I tried to re-implement the Ghost-ResNet-56 from the paper, but I can't match the authors' performance (my accuracy is about 91% and mAP is 89.5%). I think the problem is the missing Ghost module initialization, but I'm not sure.
I compared the Ghost module with nn.Conv2d and found that the Ghost module has no weight-initialization step, while GhostNet does; that initialization also does not cover linear and ReLU layers.
I wonder how to design a weight-initialization function for a ResNet-56. My current attempt:
def _initialize_weights(self):
    for m in self.modules():
        if isinstance(m, nn.Conv2d):
            # He initialization, matched to the ReLU nonlinearity
            nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
        elif isinstance(m, nn.BatchNorm2d):
            # start BN as an identity affine transform
            m.weight.data.fill_(1)
            m.bias.data.zero_()
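For what it's worth, an equivalent standalone form (a sketch; the function name is my own) that can be applied to any model via Module.apply:

import torch.nn as nn

def init_weights(m):
    # same scheme as above: He init for convs, identity affine for BN
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.ones_(m.weight)
        nn.init.zeros_(m.bias)

model.apply(init_weights)  # model: the ResNet-56 instance, assumed defined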
Thanks for your awesome work. I have provided the PyTorch pre-trained GhostNet 1.0x with comparable accuracy in https://github.com/d-li14/ghostnet.pytorch. Furthermore, I conducted ablation experiments about different training settings, which may echo some questions in #3, #8.
Hi, I used the sample code you provided, but it cannot run.
This is the error:
RuntimeError: Trying to create tensor with negative dimension -1: [80, 1, -1, -1]
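A hedged guess based only on the shape: [80, 1, -1, -1] matches a depthwise conv weight [out_channels, in_channels // groups, k, k] with kernel size k = -1, e.g. when a kernel-size argument resolves to a negative value somewhere in the config. A hypothetical reproduction:

import torch.nn as nn

# a kernel size that came out as -1 reproduces the same message
nn.Conv2d(80, 80, kernel_size=-1, groups=80)
# RuntimeError: Trying to create tensor with negative dimension -1: [80, 1, -1, -1]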
Dear author:
Thanks for your timely open source implementation.
I would be grateful if you could publish some pretrained .pth files for the ImageNet dataset. Thank you.
Hello, could you provide cfgs (I see the PyTorch version) for constructing GhostNet for CIFAR-10? Thanks.
Hi, I have compared the network with MobileNetV2, but the speed is not faster:
# ghostnet: FLOPs 147.505M, params 3.903M
# mobilenetv2: FLOPs 312.852M, params 2.225M
GhostNet has fewer FLOPs but more params, so I don't know what's wrong. The running-time test code:
import time
import torch

model = model.cuda().eval()   # GhostNet or MobileNetV2, defined earlier
x = torch.randn(32, 3, 224, 224)
inputs = x.cuda()
torch.cuda.synchronize()      # finish pending GPU work before timing
t = time.time()
for _ in range(30):
    with torch.no_grad():
        outputs = model(inputs)
    torch.cuda.synchronize()  # wait for the forward pass to complete
    print(time.time() - t)
    t = time.time()
It seems that MobileNetV2 is faster than GhostNet. Why?
MobileNetV3 applies a k*k depthwise conv in each bottleneck, but GhostNet does not. Does the d*d cheap op work like the k*k depthwise conv for extending the receptive field?
Hello, I want to set the output channels to 128, but after changing the params, the result is worse than ResNet-34. Could you please give me some advice?
Looking forward to your reply.
Hi, thanks for sharing the source code.
I replaced all the convs in VGG-16 with GhostModule and trained on CIFAR-10, but accuracy dropped by 2 percentage points. Could you share the training settings for Ghost-VGG-16 and Ghost-ResNet-56?
My network slowed down after I only replaced its Conv2d layers with GhostModule.
Did you get the same result?
How did you train Ghost-ResNet-50 (s=2) on ImageNet to achieve 75.0% top-1 accuracy?
Did you use a cosine learning rate and label smoothing?
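In case it helps others asking the same thing, a minimal sketch of those two tricks in PyTorch (assumed values, not the authors' confirmed recipe; resnet50() stands in for Ghost-ResNet-50):

import torch
import torch.nn as nn
import torchvision.models as models

epochs = 120                # assumed schedule length, not from the paper
model = models.resnet50()   # stand-in for Ghost-ResNet-50 (s=2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)  # needs PyTorch >= 1.10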
I've tested the Ghost module recently on a lightweight U-Net model and found it provides a speed gain with a 1-2% accuracy drop. But after linear quantization, the accuracy is much lower. Has the author done any research on this?
Thank you for the implementation. I have some questions
Thank you again