d-li14 / octconv.pytorch
PyTorch implementation of Octave Convolution with pre-trained Oct-ResNet and Oct-MobileNet models
Home Page: https://arxiv.org/abs/1904.05049
License: Apache License 2.0
resnet18 is quite different from resnet50.
Thanks for your implementation of the Octave Conv paper.
I have a remark/question about the `Conv_BN_ACT` module.
Because applying BN after the activation sometimes makes more sense, I followed the PyTorch example and built small OctaveCNN variants (each using 6 convs in total) for the CIFAR-10 dataset, using PReLU activations, cross-entropy loss, the AMSGrad optimizer, and `alpha=0.25`.
After some experimentation with using BatchNorm before or after the activation I found the following results:
| Network description | # epochs | Accuracy (%) | Training loss | Test loss |
|---|---|---|---|---|
| Conv_BN_ACT | 15 | 78.46 | 0.7093 | 0.6362 |
| Conv_BN_ACT | 30 | 82.20 | 0.4613 | 0.5456 |
| Conv_ACT_BN | 15 | 82.84 | 0.3917 | 0.5260 |
| Conv_ACT_BN | 30 | 84.18 | 0.1614 | 0.6036 |
I observe that `Conv_ACT_BN` has a stronger tendency to overfit: at 30 epochs its training loss (0.1614) is far below its test loss (0.6036), whereas the two losses stay much closer for `Conv_BN_ACT`. However, `Conv_ACT_BN` does reach a higher accuracy (84.18% vs. 82.20% at 30 epochs).
Have you looked at this before? Is this the reason why you chose to include `Conv_BN_ACT` and not `Conv_ACT_BN`?
Thank you very much for your work.
When will the pre-trained model of the OctResNet-101 be released?
First of all, thank you for your code, which has helped me a lot. However, the same error always occurs when I load the pre-trained model. The error message is as follows:
Error(s) in loading state_dict for OctResNet: Missing key(s) in state_dict:
Can you help me with this, or can you provide me with a full version of the code? I would be grateful! My email is [email protected]
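When a "Missing key(s)" error like the one above appears, a first step is usually to compare the parameter names the model expects with the names stored in the checkpoint. The following is a minimal sketch of that comparison, not a diagnosis of this particular report. It assumes the checkpoint is a plain mapping from parameter names to values. The helper names `diff_keys` and `strip_prefix` are hypothetical, and the `module.` prefix is only one possible cause (it is the prefix that `torch.nn.DataParallel` adds to parameter names when a wrapped model is saved).

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) parameter names, each sorted.

    missing:    names the model expects but the checkpoint lacks
    unexpected: names the checkpoint holds but the model does not know
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from matching keys.

    Assumption: the mismatch comes from a wrapper prefix such as the one
    added by torch.nn.DataParallel. Other causes need other fixes.
    """
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Hypothetical names for illustration only, not taken from the repository.
expected = ["conv1.weight", "fc.bias"]
checkpoint = {"module.conv1.weight": 0, "module.fc.bias": 1}

missing, unexpected = diff_keys(expected, checkpoint)
fixed_missing, fixed_unexpected = diff_keys(expected, strip_prefix(checkpoint))
```

With a real model, the expected names would come from `model.state_dict().keys()` and the checkpoint from `torch.load(...)`. If the two lists differ only by a shared prefix, renaming the keys before calling `load_state_dict` is typically enough.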
Thanks for your great work! Where can I get the models pre-trained on ImageNet or other datasets?
Are you planning on releasing the training code?
Also, did you try to implement the ResNet BasicBlock with OctConv?
I'm trying to do it in my own implementation, but it is tricky due to the lack of downsampling in the first layer.
Thanks for your implementation of OctConv!
I see that the OctConv paper reports reduced GFLOPs as well as reduced computation time per image. However, when I run your implementation on the ImageNet dataset, the computation time per image actually increases by a factor of 3! This happens during both training and validation. I am comparing PyTorch's ResNet50 with your OctResNet-50 using the default alpha parameters.
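For context on the GFLOPs side of this comparison, the theoretical cost of a single octave convolution relative to a vanilla convolution can be worked out by summing its four paths. The sketch below assumes the same `alpha` for input and output channels, the same kernel size on every path, and low-frequency maps at half the spatial resolution (a quarter of the positions). The function name is hypothetical. The count ignores the pooling, upsampling, and addition steps, so it says nothing about measured wall-clock time, which may well behave differently.

```python
def octconv_flops_ratio(alpha: float) -> float:
    """Theoretical FLOPs of one OctConv layer relative to a vanilla conv.

    alpha is the fraction of channels assigned to the low-frequency branch.
    Each term is (input channel share) * (output channel share) *
    (relative number of spatial positions the convolution runs over).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    high_to_high = (1 - alpha) * (1 - alpha)   # full resolution
    high_to_low = (1 - alpha) * alpha / 4      # computed at half resolution
    low_to_high = alpha * (1 - alpha) / 4      # computed at half resolution
    low_to_low = alpha * alpha / 4             # half resolution
    return high_to_high + high_to_low + low_to_high + low_to_low


ratio_half = octconv_flops_ratio(0.5)      # 0.4375
ratio_quarter = octconv_flops_ratio(0.25)  # 0.671875
```

The sum simplifies to `1 - 0.75 * alpha * (2 - alpha)`. A lower ratio here only bounds the arithmetic; one octave layer still issues four smaller convolutions plus resampling operations, so a speed-up in practice is not guaranteed by this number alone.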
I went through the code and couldn't find any snippet for initializing the layers. Do the layers need initialization?
Also, when the "pretrain" parameter is set to True, nothing happens. Do we need to download the weights from somewhere?
I don't see a main script for running training and evaluation for benchmarking. It seems you have created an octave convolution module to plug into a custom-written main script for benchmarking.
If the network downsamples n times, the input size must be 2^a (a >= n); other sizes, such as 600x1000, are not supported.
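One possible workaround for inputs such as 600x1000 is to pad them before feeding them to the network. The sketch below only computes the target size and the padding amounts. It rests on an assumption the report does not confirm: that it is sufficient for each spatial dimension to be divisible by 2^n, rather than being an exact power of two. The function name and the value of `num_downsamples` are hypothetical.

```python
def padded_size(height: int, width: int, num_downsamples: int):
    """Round height and width up to the next multiple of 2**num_downsamples.

    Returns (new_height, new_width, pad_height, pad_width), where the pad
    values are the number of extra rows and columns to add.
    """
    multiple = 2 ** num_downsamples
    # -(-x // m) is integer ceiling division.
    new_height = -(-height // multiple) * multiple
    new_width = -(-width // multiple) * multiple
    return new_height, new_width, new_height - height, new_width - width


# Assumed example: 5 halvings, so sizes are rounded up to multiples of 32.
result = padded_size(600, 1000, 5)  # (608, 1024, 8, 24)
```

With PyTorch, the returned pad amounts could be passed to `torch.nn.functional.pad` before the forward pass, and the output cropped or pooled as usual afterwards.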
Hi,
Is this the redundant line?
Thanks for your implementation of this work. I'm using your code to reproduce the results you reported, and I'm wondering if I could refer to your training settings. Could you please provide your training config files if possible?
Thanks to the author for open-sourcing this. As a newbie, I would like to ask how you drew the four curves in one figure. Thanks!
CIFAR-100: original_resnet26 77%, oct_resnet26 65%.
My training script is like this:
https://github.com/uoguelph-mlrg/Cutout/blob/master/train.py
Could you share your training script with us?
Thank you.
    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x_h, x_l = self.layer1(x)
        x_h, x_l = self.layer2((x_h, x_l))
        x_h, x_l = self.layer3((x_h, x_l))
        x_h, x_l = self.layer4((x_h, x_l))
        x = self.avgpool(x_h)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x
First of all, thanks for your implementation.
I want to train OctConv from scratch.
Could you share the training code?
Why is the original ResNet-50 faster than the octave ResNet-50?
In inference on a GTX 1080, the original ResNet-50 runs at about 70 fps and the octave ResNet-50 at around 42 fps.
Hi,
Is the BasicBlock here the same as the BasicBlock in PyTorch's ResNet, or something else?