
nmaac / acon

Official Repository for "Activate or Not: Learning Customized Activation." [CVPR 2021]

Home Page: https://arxiv.org/abs/2009.04759

License: MIT License

Python 100.00%

acon's Introduction

CVPR 2021 | Activate or Not: Learning Customized Activation.

This repository contains the official PyTorch implementation of the paper Activate or Not: Learning Customized Activation, CVPR 2021.

ACON

We propose a novel activation function, termed ACON, that explicitly learns whether to activate the neurons or not. Below we show the ACON activation function and its first derivative. β controls how fast the first derivative asymptotes to the upper/lower bounds, which are determined by p1 and p2.
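
Concretely, ACON-C is defined as f(x) = (p1 - p2) * x * σ(β(p1 - p2)x) + p2 * x, where σ is the sigmoid function and p1, p2, β are learnable per-channel parameters; as β → +∞ this approaches max(p1·x, p2·x), and as β → 0 it approaches the linear mean (p1 + p2)/2 · x, which is the "activate or not" switch. A minimal PyTorch sketch illustrating the formula (not necessarily identical to the repository's acon.py):

import torch
import torch.nn as nn

class AconC(nn.Module):
    # ACON-C: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x,
    # with learnable per-channel parameters p1, p2 and switching factor beta.
    def __init__(self, width):
        super().__init__()
        self.p1 = nn.Parameter(torch.randn(1, width, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, width, 1, 1))
        self.beta = nn.Parameter(torch.ones(1, width, 1, 1))

    def forward(self, x):
        dpx = (self.p1 - self.p2) * x
        return dpx * torch.sigmoid(self.beta * dpx) + self.p2 * x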

Training curves

We show the training curves of different activations here.

TFNet

To show the effectiveness of the proposed ACON family, we also provide an extremely simple toy funnel network (TFNet) built only from pointwise convolutions and ACON-FReLU operators; a hypothetical sketch of such an operator follows.
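
Assuming ACON-FReLU instantiates the general ACON form — a smooth maximum of two branches η_a(x) and η_b(x) — with η_a(x) = x and η_b(x) given by FReLU's 3x3 depthwise-convolution funnel condition, one possible sketch looks as follows. This structure is an assumption for illustration, not the repository's exact module:

import torch
import torch.nn as nn

class AconFReLU(nn.Module):
    # Hypothetical ACON-FReLU: smooth, learnable maximum of the identity
    # branch eta_a(x) = x and the funnel condition eta_b(x) = BN(DWConv3x3(x)).
    def __init__(self, width):
        super().__init__()
        self.dwconv = nn.Conv2d(width, width, kernel_size=3, padding=1,
                                groups=width, bias=False)
        self.bn = nn.BatchNorm2d(width)
        self.beta = nn.Parameter(torch.ones(1, width, 1, 1))

    def forward(self, x):
        t = self.bn(self.dwconv(x))  # funnel condition eta_b(x)
        d = x - t                    # eta_a(x) - eta_b(x)
        # smooth maximum: (ea - eb) * sigmoid(beta * (ea - eb)) + eb
        return d * torch.sigmoid(self.beta * d) + t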

Main results

The following results are the ImageNet top-1 accuracy improvements relative to the ReLU baselines. The relative improvements of meta-ACON are about twice those of SENet.

The comparison between ReLU, Swish and ACON-C shows improvements without any additional FLOPs or parameters:

Model              FLOPs  #Params.  top-1 err. (ReLU)  top-1 err. (Swish)  top-1 err. (ACON)
ShuffleNetV2 0.5x  41M    1.4M      39.4               38.3 (+1.1)         37.0 (+2.4)
ShuffleNetV2 1.5x  299M   3.5M      27.4               26.8 (+0.6)         26.5 (+0.9)
ResNet 50          3.9G   25.5M     24.0               23.5 (+0.5)         23.2 (+0.8)
ResNet 101         7.6G   44.4M     22.8               22.7 (+0.1)         21.8 (+1.0)
ResNet 152         11.3G  60.0M     22.3               22.2 (+0.1)         21.2 (+1.1)

Next, at the cost of a negligible number of additional FLOPs and parameters, meta-ACON shows significant improvements (a sketch of the module follows the table):

Model                          FLOPs  #Params.  top-1 err.
ShuffleNetV2 0.5x (meta-acon)  41M    1.7M      34.8 (+4.6)
ShuffleNetV2 1.5x (meta-acon)  299M   3.9M      24.7 (+2.7)
ResNet 50 (meta-acon)          3.9G   25.7M     22.0 (+2.0)
ResNet 101 (meta-acon)         7.6G   44.8M     21.0 (+1.8)
ResNet 152 (meta-acon)         11.3G  60.5M     20.5 (+1.8)
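
Meta-ACON learns the switching factor β explicitly: a small module G(x) computes a per-channel β from globally pooled features and feeds it into the ACON-C formula. Below is a sketch consistent with the MetaAconC snippet quoted in the issues further down; treat it as an illustration of the design rather than a verbatim copy of the repository code:

import torch
import torch.nn as nn

class MetaAconC(nn.Module):
    # beta = sigmoid(G(x)) is generated per channel from globally pooled
    # features, then plugged into the ACON-C formula.
    def __init__(self, width, r=16):
        super().__init__()
        hidden = max(r, width // r)
        self.fc1 = nn.Conv2d(width, hidden, kernel_size=1, stride=1, bias=True)
        self.bn1 = nn.BatchNorm2d(hidden)
        self.fc2 = nn.Conv2d(hidden, width, kernel_size=1, stride=1, bias=True)
        self.bn2 = nn.BatchNorm2d(width)
        self.p1 = nn.Parameter(torch.randn(1, width, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, width, 1, 1))

    def forward(self, x):
        pooled = x.mean(dim=(2, 3), keepdim=True)  # global average pooling
        beta = torch.sigmoid(self.bn2(self.fc2(self.bn1(self.fc1(pooled)))))
        dpx = (self.p1 - self.p2) * x
        return dpx * torch.sigmoid(beta * dpx) + self.p2 * x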

Without SE modules, the simple TFNet can outperform the state-of-the-art lightweight networks that likewise omit SE modules.

Model              FLOPs  #Params.  top-1 err.
MobileNetV2 0.17   42M    1.4M      52.6
ShuffleNetV2 0.5x  41M    1.4M      39.4
TFNet 0.5          43M    1.3M      36.6 (+2.8)
MobileNetV2 0.6    141M   2.2M      33.3
ShuffleNetV2 1.0x  146M   2.3M      30.6
TFNet 1.0          135M   1.9M      29.7 (+0.9)
MobileNetV2 1.0    300M   3.4M      28.0
ShuffleNetV2 1.5x  299M   3.5M      27.4
TFNet 1.5          279M   2.7M      26.0 (+1.4)
MobileNetV2 1.4    585M   5.5M      25.3
ShuffleNetV2 2.0x  591M   7.4M      25.0
TFNet 2.0          474M   3.8M      24.3 (+0.7)

Trained Models

  • OneDrive download: Link
  • BaiduYun download: Link (extract code: 13fu)

Usage

Requirements

Download the ImageNet dataset and move validation images to labeled subfolders. To do this, you can use the following script: https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh

Train:

python train.py --train-dir YOUR_TRAINDATASET_PATH --val-dir YOUR_VALDATASET_PATH

Eval:

python train.py --eval --eval-resume YOUR_WEIGHT_PATH --train-dir YOUR_TRAINDATASET_PATH --val-dir YOUR_VALDATASET_PATH

Citation

If you use these models in your research, please cite:

@inproceedings{ma2021activate,
  title={Activate or Not: Learning Customized Activation},
  author={Ma, Ningning and Zhang, Xiangyu and Liu, Ming and Sun, Jian},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2021}
}

acon's People

Contributors

glenn-jocher, nmaac

acon's Issues

Using ACON in a pre-trained model

Hi, thanks for your amazing work!
How can I use ACON with a pre-trained model? Can I directly replace all activation functions in an ImageNet pre-trained network with ACON and then fine-tune it on the downstream task?
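
One way to try the direct-replacement approach from the question, sketched under the assumption that the repository's AconC module takes the channel width as its only constructor argument; the helper below is hypothetical, not an official recipe:

import torch.nn as nn
import torchvision
from acon import AconC  # the repository's module; assumed signature AconC(width)

def swap_relu_for_acon(model):
    # Heuristic: when a ReLU follows a BatchNorm2d among the same parent's
    # children, that BN's num_features gives the channel width for AconC.
    for parent in model.modules():
        width = None
        for name, child in parent.named_children():
            if isinstance(child, nn.BatchNorm2d):
                width = child.num_features
            elif isinstance(child, nn.ReLU) and width is not None:
                setattr(parent, name, AconC(width))

model = torchvision.models.resnet18(pretrained=True)
swap_relu_for_acon(model)
# ...then fine-tune on the downstream task. Caveat: blocks that reuse one ReLU
# instance at several widths (e.g. torchvision's Bottleneck) need per-site handling.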

meta-acon

How do I apply meta-ACON to the network structure of other tasks, for example super-resolution? Could you give me some advice?

A question about using the pre-trained weights in CenterNet

Hello! For the past two days I have been working with a CenterNet network with a ResNet-50 backbone. Following the MetaACON code, I added self.acon = MetaAconC(planes) after bn2 in resnet50, and in forward I changed out = self.relu(out) to out = self.acon(out). I trained with batch_size 16, NMS, a learning rate of 0.001 and a weight decay of 0.0005, loading the pre-trained weights res50.acon.pth and res50.metaacon.pth respectively, and trained and validated on the VOC dataset. The resulting mAP is extremely low, far lower than with the original CenterNet.
I cannot tell where the problem is, or whether the pre-trained weight files you provide simply cannot be used directly this way. I have been at it for more than a day without success, so I am turning to you for help. Thank you.

A question about Fig. 4 in the paper

Thank you for your research work. Is Fig. 4 in the paper obtained by plugging 1.2x and -0.8x into the ACON-C function as p1·x and p2·x? I ran into problems when computing and plotting it.
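
For anyone reproducing such a plot, here is a minimal sketch that evaluates ACON-C with p1 = 1.2 and p2 = -0.8 (the values from the question, assumed rather than confirmed by the authors) for a few values of β:

import numpy as np
import matplotlib.pyplot as plt

def acon_c(x, p1=1.2, p2=-0.8, beta=1.0):
    # ACON-C: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x
    dpx = (p1 - p2) * x
    return dpx / (1.0 + np.exp(-beta * dpx)) + p2 * x

x = np.linspace(-6, 6, 500)
for beta in (0.1, 1.0, 10.0):
    plt.plot(x, acon_c(x, beta=beta), label=f"beta={beta}")
plt.legend()
plt.xlabel("x")
plt.ylabel("ACON-C(x)")
plt.show()

As β grows the curve approaches max(1.2x, -0.8x); as β → 0 it flattens toward the linear mean 0.2x.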

Hello, I have some questions

Hello, I am a master's student at Northeastern University working on salient object detection.
I read your CVPR 2021 paper on the ACON activation function and would like to try ACON for salient object detection, but only pre-trained ResNet models are provided. Would it be possible to also provide a pre-trained VGG model?
Thank you very much.

Experimental results on CIFAR-100

Hi, thanks for your nice work.

From the paper, the improvements on ImageNet are substantial, which is attractive to me.
Recently I conducted experiments on CIFAR-100 with ResNet-18 and meta-ACON; however, the performance is not satisfactory.
Could you provide some results on CIFAR-100 for reference?

Many thanks.

Parameters of nn.Conv2d in MetaAconC

MetaAconC:

    self.fc1 = nn.Conv2d(width, max(r, width // r), kernel_size=1, stride=1, bias=True)
    self.bn1 = nn.BatchNorm2d(max(r, width // r))
    self.fc2 = nn.Conv2d(max(r, width // r), width, kernel_size=1, stride=1, bias=True)
    self.bn2 = nn.BatchNorm2d(width)

Shouldn't this be nn.Conv2d(width, max(r, width // r), kernel_size=1, stride=1, bias=False)? Since each convolution is immediately followed by a BatchNorm layer, the bias is redundant.

Speed

After replacing all ReLUs in my network with AconC, I found it runs about twice as slow. Is this because the implementation is not optimized, or because the amount of computation genuinely increases?

A question

Is there a TensorFlow version of the AconC activation function?
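
The repository is PyTorch-only, but the formula translates directly. A minimal TensorFlow sketch for channels-last inputs (an illustration, not an official port):

import tensorflow as tf

class AconC(tf.keras.layers.Layer):
    # ACON-C: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x,
    # with learnable per-channel parameters p1, p2 and switching factor beta.
    def build(self, input_shape):
        c = int(input_shape[-1])
        self.p1 = self.add_weight(name="p1", shape=(1, 1, 1, c), initializer="random_normal")
        self.p2 = self.add_weight(name="p2", shape=(1, 1, 1, c), initializer="random_normal")
        self.beta = self.add_weight(name="beta", shape=(1, 1, 1, c), initializer="ones")

    def call(self, x):
        dpx = (self.p1 - self.p2) * x
        return dpx * tf.sigmoid(self.beta * dpx) + self.p2 * x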
