
d-li14 / octconv.pytorch

289 stars · 9 watchers · 42 forks · 138 KB

PyTorch implementation of Octave Convolution with pre-trained Oct-ResNet and Oct-MobileNet models

Home Page: https://arxiv.org/abs/1904.05049

License: Apache License 2.0

Python 100.00%
octconv pytorch-implementation multi-scale deep-neural-networks imagenet resnet iccv2019 mobilenet


octconv.pytorch's People

Contributors: d-li14

octconv.pytorch's Issues

BatchNorm before activation vs BatchNorm after activation

Thanks for your implementation of the Octave Conv paper.

I have a remark/question about the Conv_BN_ACT module.
Since BN after ACT sometimes makes more sense, I built a small Octave CNN (each variant using 6 convolutions in total) for the CIFAR-10 dataset, following the PyTorch example, with PReLU activations, cross-entropy loss, the AMSGrad optimizer, and alpha = 0.25.
After some experimentation with using BatchNorm before or after the activation I found the following results:

Network        Epochs   Accuracy (%)   Training loss   Test loss
Conv_BN_ACT      15        78.46          0.7093         0.6362
Conv_BN_ACT      30        82.20          0.4613         0.5456
Conv_ACT_BN      15        82.84          0.3917         0.5260
Conv_ACT_BN      30        84.18          0.1614         0.6036

I observe that Conv_ACT_BN tends to overfit more: its training loss is noticeably lower relative to its test loss than is the case for Conv_BN_ACT. However, Conv_ACT_BN also reaches a much higher accuracy.

Have you looked into this before? Is this why you chose to include Conv_BN_ACT and not Conv_ACT_BN?
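For concreteness, the two orderings compared above can be sketched as standalone modules; the names mirror the repo's Conv_BN_ACT, but these use a plain nn.Conv2d rather than an octave convolution:

```python
import torch
import torch.nn as nn

def conv_bn_act(in_ch, out_ch):
    # Conv -> BatchNorm -> activation (the ordering used in the repo)
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.PReLU(out_ch),
    )

def conv_act_bn(in_ch, out_ch):
    # Conv -> activation -> BatchNorm (the variant tested in this issue)
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
        nn.PReLU(out_ch),
        nn.BatchNorm2d(out_ch),
    )

x = torch.randn(2, 3, 32, 32)  # CIFAR-10-sized input
print(conv_bn_act(3, 16)(x).shape)  # torch.Size([2, 16, 32, 32])
print(conv_act_bn(3, 16)(x).shape)  # torch.Size([2, 16, 32, 32])
```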

About loading pre-training errors

First of all, thank you for your code; it has benefited me a lot. However, the same error always occurs when loading the pre-trained model:

Error(s) in loading state_dict for OctResNet: Missing key(s) in state_dict:

Can you help me with it, or provide a full version of the code? I would be grateful! My email is [email protected]
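A small, generic sketch of how a "Missing key(s)" error can be diagnosed (the toy model and the deliberately incomplete state dict below are illustrations, not the repo's code):

```python
import torch
from torch import nn

# Toy model whose state_dict we will only partially populate
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))
state = {"0.weight": torch.zeros(8, 3, 3, 3)}  # deliberately incomplete

# strict=False skips missing/unexpected keys and reports them instead of
# raising, which makes it easy to see exactly which keys disagree
result = model.load_state_dict(state, strict=False)
print(result.missing_keys)  # keys the model expects but the dict lacks
```

Comparing `result.missing_keys` against your checkpoint's keys usually reveals a prefix mismatch (e.g. a `module.` prefix from `DataParallel`) or an architecture mismatch between the model being built and the model that was saved.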

Difference between your implementation and paper.

Hi,
when applying depthwise separable convolution, you directly drop the low-to-high and high-to-low branches. But in the paper, the authors compress the high-to-high branch and preserve the communication between the high- and low-frequency branches.

[image attached to the issue]
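For reference, a depthwise convolution in PyTorch is a Conv2d with groups equal to the channel count. A minimal sketch (not the repo's implementation) of applying one depthwise conv per frequency branch, which is what dropping the cross-frequency paths amounts to:

```python
import torch
import torch.nn as nn

# Hypothetical high/low channel split for 64 channels at alpha = 0.25
ch_h, ch_l = 48, 16

# groups == in_channels makes each channel be filtered on its own;
# one such conv per branch means no low-to-high / high-to-low exchange,
# which is the simplification this issue asks about.
dw_high = nn.Conv2d(ch_h, ch_h, 3, padding=1, groups=ch_h, bias=False)
dw_low = nn.Conv2d(ch_l, ch_l, 3, padding=1, groups=ch_l, bias=False)

x_h = torch.randn(1, ch_h, 56, 56)  # high-frequency feature map
x_l = torch.randn(1, ch_l, 28, 28)  # low-frequency map at half resolution
y_h, y_l = dw_high(x_h), dw_low(x_l)  # branches processed independently
print(y_h.shape, y_l.shape)
```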

Training code

Are you planning on releasing the training code?

Also, did you try to implement the ResNet BasicBlock with OctConv?

I'm trying to do it in my own implementation, but it is tricky due to the lack of downsampling in the first layer.
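As a starting point, here is a plain-convolution sketch of the BasicBlock wiring (not the repo's code). In an octave version, the two 3x3 convs would become octave convs and both the residual and the shortcut would carry (x_h, x_l) pairs, which is where the stride/downsampling bookkeeping gets tricky:

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 shortcut when the shape changes; in the octave case both
        # frequency branches must be downsampled consistently here
        self.shortcut = nn.Sequential()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.shortcut(x))

y = BasicBlock(64, 128, stride=2)(torch.randn(1, 64, 56, 56))
print(y.shape)  # torch.Size([1, 128, 28, 28])
```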

Computation time around 3 times longer with OctConv

Thanks for your implementation of OctConv!

I see that the OctConv paper reports reduced GFLOPs as well as reduced computation time per image. However, running your implementation on the ImageNet dataset, the computation time per image actually increases by a factor of about 3! This happens both during training and during validation. I am comparing PyTorch's ResNet-50 against your OctResNet-50 with the default alpha parameters.
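One likely explanation is that FLOP counts ignore memory traffic and kernel-launch overhead, so a model split into many small convolutions can be slower in wall-clock time despite fewer GFLOPs. A generic timing sketch (not the repo's benchmark; on GPU you would also call torch.cuda.synchronize() around the timers) that compares one large conv against many small ones with the same multiply-add budget:

```python
import time
import torch
import torch.nn as nn

def time_model(model, x, warmup=3, iters=10):
    # average wall-clock time per forward pass, excluding warm-up runs
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):
            model(x)
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        return (time.perf_counter() - start) / iters

# one large conv vs. 16 small sequential convs: 64*64 == 16*16*16 output
# filter products, so the multiply-add counts match, but the launch and
# memory overhead of the small kernels typically does not
big = nn.Conv2d(64, 64, 3, padding=1)
many = nn.Sequential(*[nn.Conv2d(16, 16, 3, padding=1) for _ in range(16)])
x64 = torch.randn(1, 64, 56, 56)
x16 = torch.randn(1, 16, 56, 56)
print(f"one large conv: {time_model(big, x64) * 1e3:.2f} ms")
print(f"16 small convs: {time_model(many, x16) * 1e3:.2f} ms")
```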

Initialisation of layers and weights not available

I ran through the code and I couldn't find any snippet for the initialisation of layers. Do the layers need initialisation?

And when the "pretrain" parameter is set to True, nothing happens. Do we need to download the weights from somewhere?
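For reference, a sketch of the standard ResNet-style initialisation pass (whether this repo relies on such a pass or on PyTorch's defaults is exactly what this issue asks):

```python
import torch
import torch.nn as nn

def init_weights(model):
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            # Kaiming/He init, as in the original ResNet reference code
            nn.init.kaiming_normal_(m.weight, mode="fan_out",
                                    nonlinearity="relu")
        elif isinstance(m, nn.BatchNorm2d):
            nn.init.constant_(m.weight, 1)  # BN gamma = 1
            nn.init.constant_(m.bias, 0)    # BN beta = 0

net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))
init_weights(net)
print(net[1].weight.sum().item())  # 8.0 -> all 8 BN gammas set to 1
```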

There is no main script to run the network ?

I don't see a main script to run training and evaluation for benchmarking. It seems you have created an octave-convolution module meant to be plugged into a custom-written main script for benchmarking.
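A minimal skeleton of the kind of driver the issue asks for (not provided by the repo; the stand-in model and the dummy data loader below are illustrations, and in practice you would import the repo's OctResNet constructor instead):

```python
import torch
import torch.nn as nn

# Stand-in model: replace with the repo's OctResNet in real use
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                      nn.Linear(8, 10))

def evaluate(model, loader):
    # standard top-1 accuracy loop over (images, labels) batches
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total

# dummy "loader": two batches of random images with random labels
loader = [(torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,)))
          for _ in range(2)]
print(f"accuracy: {evaluate(model, loader):.2f}")
```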

Could you please provide the training recipe file?

Thanks for your implementation of this work. I'm using your code to reproduce the results you reported, and I'm wondering if I could refer to your training settings. Could you kindly provide your training config files if possible?
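Until the author publishes the real config, a sketch of the canonical ImageNet/ResNet recipe (common practice only; every value below is an assumption, not confirmed as this repo's settings):

```python
# Canonical ImageNet/ResNet hyperparameters -- an assumption, NOT the
# repo's confirmed training configuration
recipe = {
    "optimizer": "SGD",
    "momentum": 0.9,
    "weight_decay": 1e-4,
    "batch_size": 256,
    "base_lr": 0.1,
    "lr_schedule": "divide by 10 every 30 epochs",
    "epochs": 90,
    "augmentation": ["RandomResizedCrop(224)", "RandomHorizontalFlip"],
}
for key, value in recipe.items():
    print(f"{key}: {value}")
```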

about plot

Thanks to the author for open-sourcing this. As a newbie, I would like to ask how your four curves are drawn on one figure. Thanks!
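The repo's actual plotting script is not published, but four curves on one figure is one matplotlib `plot()` call per curve on the same axes; a generic sketch with made-up numbers:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so no display is needed
import matplotlib.pyplot as plt

# Illustrative data only: train/val accuracy of two hypothetical models
epochs = list(range(1, 6))
curves = {
    "ResNet-50 train": [40, 55, 63, 68, 71],
    "ResNet-50 val": [38, 52, 60, 65, 68],
    "Oct-ResNet-50 train": [42, 57, 65, 70, 73],
    "Oct-ResNet-50 val": [40, 54, 62, 67, 70],
}
for label, acc in curves.items():
    plt.plot(epochs, acc, label=label)  # one plot() call per curve
plt.xlabel("epoch")
plt.ylabel("top-1 accuracy (%)")
plt.legend()
plt.savefig("curves.png")
```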

Why only return high frequency features (x_h)

def forward(self, x):
    # stem: plain conv, BN, ReLU, max-pool before the octave stages
    x = self.conv1(x)
    x = self.bn1(x)
    x = self.relu(x)
    x = self.maxpool(x)

    # each octave stage passes a (high-frequency, low-frequency) pair
    x_h, x_l = self.layer1(x)
    x_h, x_l = self.layer2((x_h, x_l))
    x_h, x_l = self.layer3((x_h, x_l))
    x_h, x_l = self.layer4((x_h, x_l))

    # only the high-frequency path reaches the classifier head
    x = self.avgpool(x_h)
    x = x.view(x.size(0), -1)
    x = self.fc(x)

    return x
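A likely answer, worth confirming against the repo: the final octave stage is typically built with alpha_out = 0, so all channels are folded back into the high-frequency path and x_l carries nothing by the end of layer4. If the last stage did keep a low-frequency branch, both maps could be pooled and concatenated instead of discarding x_l; a generic sketch (not the repo's code, shapes are hypothetical):

```python
import torch
import torch.nn.functional as F

# Hypothetical final-stage outputs if both frequencies were kept
x_h = torch.randn(1, 1536, 7, 7)  # high-frequency output
x_l = torch.randn(1, 512, 4, 4)   # low-frequency output, half resolution

# pool each branch to 1x1, flatten, and fuse by concatenation
pooled_h = F.adaptive_avg_pool2d(x_h, 1).flatten(1)
pooled_l = F.adaptive_avg_pool2d(x_l, 1).flatten(1)
features = torch.cat([pooled_h, pooled_l], dim=1)
print(features.shape)  # torch.Size([1, 2048])
```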

Training code

First of all, thanks for your implementation.
I want to train OctConv from scratch, so could you share the training code?
