
nmaac / acon

Official Repository for "Activate or Not: Learning Customized Activation." [CVPR 2021]

Home Page: https://arxiv.org/abs/2009.04759

License: MIT License

Python 100.00%

acon's Introduction

CVPR 2021 | Activate or Not: Learning Customized Activation.

This repository contains the official PyTorch implementation of the paper Activate or Not: Learning Customized Activation, CVPR 2021.

ACON

We propose a novel activation function, termed ACON, that explicitly learns whether to activate the neurons or not. Below we show the ACON activation function and its first derivative. β controls how fast the first derivative asymptotes to the upper/lower bounds, which are determined by p1 and p2.
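
Concretely, ACON-C is defined as f(x) = (p1 - p2) * x * σ(β(p1 - p2)x) + p2 * x, where σ is the sigmoid function and p1, p2, β are learnable per-channel parameters; as β → +∞ this approaches max(p1·x, p2·x), and as β → 0 it approaches the linear mean (p1 + p2)/2 · x, which is the "activate or not" switch. A minimal PyTorch sketch illustrating the formula (not necessarily identical to the repository's acon.py):

import torch
import torch.nn as nn

class AconC(nn.Module):
    # ACON-C: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x,
    # with learnable per-channel parameters p1, p2 and switching factor beta.
    def __init__(self, width):
        super().__init__()
        self.p1 = nn.Parameter(torch.randn(1, width, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, width, 1, 1))
        self.beta = nn.Parameter(torch.ones(1, width, 1, 1))

    def forward(self, x):
        dpx = (self.p1 - self.p2) * x
        return dpx * torch.sigmoid(self.beta * dpx) + self.p2 * x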

Training curves

We show the training curves of different activations here.

TFNet

To show the effectiveness of the proposed ACON family, we also provide an extremely simple toy funnel network (TFNet) built only from pointwise convolutions and ACON-FReLU operators; a hypothetical sketch of such an operator follows.
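
Assuming ACON-FReLU instantiates the general ACON form — a smooth maximum of two branches η_a(x) and η_b(x) — with η_a(x) = x and η_b(x) given by FReLU's 3x3 depthwise-convolution funnel condition, one possible sketch looks as follows. This structure is an assumption for illustration, not the repository's exact module:

import torch
import torch.nn as nn

class AconFReLU(nn.Module):
    # Hypothetical ACON-FReLU: smooth, learnable maximum of the identity
    # branch eta_a(x) = x and the funnel condition eta_b(x) = BN(DWConv3x3(x)).
    def __init__(self, width):
        super().__init__()
        self.dwconv = nn.Conv2d(width, width, kernel_size=3, padding=1,
                                groups=width, bias=False)
        self.bn = nn.BatchNorm2d(width)
        self.beta = nn.Parameter(torch.ones(1, width, 1, 1))

    def forward(self, x):
        t = self.bn(self.dwconv(x))  # funnel condition eta_b(x)
        d = x - t                    # eta_a(x) - eta_b(x)
        # smooth maximum: (ea - eb) * sigmoid(beta * (ea - eb)) + eb
        return d * torch.sigmoid(self.beta * d) + t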

Main results

The following results are the ImageNet top-1 accuracy improvements relative to the ReLU baselines. The relative improvements of meta-ACON are about twice those of SENet.

The comparison between ReLU, Swish and ACON-C shows improvements without any additional FLOPs or parameters:

Model              FLOPs  #Params.  top-1 err. (ReLU)  top-1 err. (Swish)  top-1 err. (ACON)
ShuffleNetV2 0.5x  41M    1.4M      39.4               38.3 (+1.1)         37.0 (+2.4)
ShuffleNetV2 1.5x  299M   3.5M      27.4               26.8 (+0.6)         26.5 (+0.9)
ResNet 50          3.9G   25.5M     24.0               23.5 (+0.5)         23.2 (+0.8)
ResNet 101         7.6G   44.4M     22.8               22.7 (+0.1)         21.8 (+1.0)
ResNet 152         11.3G  60.0M     22.3               22.2 (+0.1)         21.2 (+1.1)

Next, at the cost of a negligible number of additional FLOPs and parameters, meta-ACON shows significant improvements (a sketch of the module follows the table):

Model                          FLOPs  #Params.  top-1 err.
ShuffleNetV2 0.5x (meta-acon)  41M    1.7M      34.8 (+4.6)
ShuffleNetV2 1.5x (meta-acon)  299M   3.9M      24.7 (+2.7)
ResNet 50 (meta-acon)          3.9G   25.7M     22.0 (+2.0)
ResNet 101 (meta-acon)         7.6G   44.8M     21.0 (+1.8)
ResNet 152 (meta-acon)         11.3G  60.5M     20.5 (+1.8)
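
Meta-ACON learns the switching factor β explicitly: a small module G(x) computes a per-channel β from globally pooled features and feeds it into the ACON-C formula. Below is a sketch consistent with the MetaAconC snippet quoted in the issues further down; treat it as an illustration of the design rather than a verbatim copy of the repository code:

import torch
import torch.nn as nn

class MetaAconC(nn.Module):
    # beta = sigmoid(G(x)) is generated per channel from globally pooled
    # features, then plugged into the ACON-C formula.
    def __init__(self, width, r=16):
        super().__init__()
        hidden = max(r, width // r)
        self.fc1 = nn.Conv2d(width, hidden, kernel_size=1, stride=1, bias=True)
        self.bn1 = nn.BatchNorm2d(hidden)
        self.fc2 = nn.Conv2d(hidden, width, kernel_size=1, stride=1, bias=True)
        self.bn2 = nn.BatchNorm2d(width)
        self.p1 = nn.Parameter(torch.randn(1, width, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, width, 1, 1))

    def forward(self, x):
        pooled = x.mean(dim=(2, 3), keepdim=True)  # global average pooling
        beta = torch.sigmoid(self.bn2(self.fc2(self.bn1(self.fc1(pooled)))))
        dpx = (self.p1 - self.p2) * x
        return dpx * torch.sigmoid(beta * dpx) + self.p2 * x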

Without SE modules, the simple TFNet can outperform the state-of-the-art lightweight networks that likewise omit SE modules.

Model              FLOPs  #Params.  top-1 err.
MobileNetV2 0.17   42M    1.4M      52.6
ShuffleNetV2 0.5x  41M    1.4M      39.4
TFNet 0.5          43M    1.3M      36.6 (+2.8)
MobileNetV2 0.6    141M   2.2M      33.3
ShuffleNetV2 1.0x  146M   2.3M      30.6
TFNet 1.0          135M   1.9M      29.7 (+0.9)
MobileNetV2 1.0    300M   3.4M      28.0
ShuffleNetV2 1.5x  299M   3.5M      27.4
TFNet 1.5          279M   2.7M      26.0 (+1.4)
MobileNetV2 1.4    585M   5.5M      25.3
ShuffleNetV2 2.0x  591M   7.4M      25.0
TFNet 2.0          474M   3.8M      24.3 (+0.7)

Trained Models

  • OneDrive download: Link
  • BaiduYun download: Link (extract code: 13fu)

Usage

Requirements

Download the ImageNet dataset and move validation images to labeled subfolders. To do this, you can use the following script: https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh

Train:

python train.py --train-dir YOUR_TRAINDATASET_PATH --val-dir YOUR_VALDATASET_PATH

Eval:

python train.py --eval --eval-resume YOUR_WEIGHT_PATH --train-dir YOUR_TRAINDATASET_PATH --val-dir YOUR_VALDATASET_PATH

Citation

If you use these models in your research, please cite:

@inproceedings{ma2021activate,
  title={Activate or Not: Learning Customized Activation},
  author={Ma, Ningning and Zhang, Xiangyu and Liu, Ming and Sun, Jian},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2021}
}

acon's People

Contributors

glenn-jocher, nmaac

acon's Issues

Using ACON in a pre-trained model

Hi, thanks for your amazing work!
How can I use ACON with a pre-trained model? Can I directly replace all activation functions in an ImageNet pre-trained network with ACON and then fine-tune it on the downstream task?
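
One way to try the direct-replacement approach from the question, sketched under the assumption that the repository's AconC module takes the channel width as its only constructor argument; the helper below is hypothetical, not an official recipe:

import torch.nn as nn
import torchvision
from acon import AconC  # the repository's module; assumed signature AconC(width)

def swap_relu_for_acon(model):
    # Heuristic: when a ReLU follows a BatchNorm2d among the same parent's
    # children, that BN's num_features gives the channel width for AconC.
    for parent in model.modules():
        width = None
        for name, child in parent.named_children():
            if isinstance(child, nn.BatchNorm2d):
                width = child.num_features
            elif isinstance(child, nn.ReLU) and width is not None:
                setattr(parent, name, AconC(width))

model = torchvision.models.resnet18(pretrained=True)
swap_relu_for_acon(model)
# ...then fine-tune on the downstream task. Caveat: blocks that reuse one ReLU
# instance at several widths (e.g. torchvision's Bottleneck) need per-site handling.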

meta-acon

How do I apply meta-ACON to the network structure of other tasks, for example super-resolution? Could you give me some advice?

A question about using the pre-trained weights in CenterNet

Hello! For the past two days I have been working with a CenterNet network with a ResNet-50 backbone. Following the MetaACON code, I added self.acon = MetaAconC(planes) after bn2 in resnet50, and in forward I changed out = self.relu(out) to out = self.acon(out). I trained with batch_size 16, NMS, a learning rate of 0.001 and a weight decay of 0.0005, loading the pre-trained weights res50.acon.pth and res50.metaacon.pth respectively, and trained and validated on the VOC dataset. The resulting mAP is extremely low, far lower than with the original CenterNet.
I cannot tell where the problem is, or whether the pre-trained weight files you provide simply cannot be used directly this way. I have been at it for more than a day without success, so I am turning to you for help. Thank you.

A question about Fig. 4 in the paper

Thank you for your research work. Is Fig. 4 in the paper obtained by plugging 1.2x and -0.8x into the ACON-C function as p1·x and p2·x? I ran into problems when computing and plotting it.
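
For anyone reproducing such a plot, here is a minimal sketch that evaluates ACON-C with p1 = 1.2 and p2 = -0.8 (the values from the question, assumed rather than confirmed by the authors) for a few values of β:

import numpy as np
import matplotlib.pyplot as plt

def acon_c(x, p1=1.2, p2=-0.8, beta=1.0):
    # ACON-C: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x
    dpx = (p1 - p2) * x
    return dpx / (1.0 + np.exp(-beta * dpx)) + p2 * x

x = np.linspace(-6, 6, 500)
for beta in (0.1, 1.0, 10.0):
    plt.plot(x, acon_c(x, beta=beta), label=f"beta={beta}")
plt.legend()
plt.xlabel("x")
plt.ylabel("ACON-C(x)")
plt.show()

As β grows the curve approaches max(1.2x, -0.8x); as β → 0 it flattens toward the linear mean 0.2x.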

Hello, I have some questions

Hello, I am a master's student at Northeastern University working on salient object detection.
I read your CVPR 2021 paper on the ACON activation function and would like to try ACON for salient object detection, but only pre-trained ResNet models are provided. Would it be possible to also provide a pre-trained VGG model?
Thank you very much.

Experimental results on CIFAR-100

Hi, thanks for your nice work.

From the paper, the improvements on ImageNet are substantial, which is attractive to me.
Recently I conducted experiments on CIFAR-100 with ResNet-18 and meta-ACON; however, the performance is not satisfactory.
Could you provide some results on CIFAR-100 for reference?

Many thanks.

Parameters of nn.Conv2d in MetaAconC

MetaAconC:

    self.fc1 = nn.Conv2d(width, max(r, width // r), kernel_size=1, stride=1, bias=True)
    self.bn1 = nn.BatchNorm2d(max(r, width // r))
    self.fc2 = nn.Conv2d(max(r, width // r), width, kernel_size=1, stride=1, bias=True)
    self.bn2 = nn.BatchNorm2d(width)

Shouldn't this be nn.Conv2d(width, max(r, width // r), kernel_size=1, stride=1, bias=False)? Since each convolution is immediately followed by a BatchNorm layer, the bias is redundant.

Speed

After replacing all ReLUs in my network with AconC, I found it runs about twice as slow. Is this because the implementation is not optimized, or because the amount of computation genuinely increases?

A question

Is there a TensorFlow version of the AconC activation function?
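
The repository is PyTorch-only, but the formula translates directly. A minimal TensorFlow sketch for channels-last inputs (an illustration, not an official port):

import tensorflow as tf

class AconC(tf.keras.layers.Layer):
    # ACON-C: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x,
    # with learnable per-channel parameters p1, p2 and switching factor beta.
    def build(self, input_shape):
        c = int(input_shape[-1])
        self.p1 = self.add_weight(name="p1", shape=(1, 1, 1, c), initializer="random_normal")
        self.p2 = self.add_weight(name="p2", shape=(1, 1, 1, c), initializer="random_normal")
        self.beta = self.add_weight(name="beta", shape=(1, 1, 1, c), initializer="ones")

    def call(self, x):
        dpx = (self.p1 - self.p2) * x
        return dpx * tf.sigmoid(self.beta * dpx) + self.p2 * x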
