
Comments (5)

jianghaojun commented on September 28, 2024

I encountered the same problem as @dreamcontinue. I trained SKNet50 twice following the args below; the best top-1 accuracy is 78.84, which is 0.37% lower than the paper's result (79.21).

args.epochs = 100
args.batch_size = 256
### data transform
args.autoaugment = False
args.colorjitter = False
args.change_light = False
### optimizer
args.optimizer = 'SGD'
args.lr = 0.1
args.momentum = 0.9
args.weight_decay = 1e-4
args.nesterov = True
### criterion
args.labelsmooth = 0.1
### lr scheduler
args.scheduler = 'multistep'
args.lr_decay_rate = 0.1
args.lr_decay_step = 30
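
For reference, these settings map onto PyTorch objects roughly as follows. This is only a sketch of my configuration, not the repo's training script; the label_smoothing argument of CrossEntropyLoss requires torch >= 1.10, and the milestones assume decay at epochs 30/60/90 for a 100-epoch run:

import torch
import torch.nn as nn

model = nn.Linear(10, 10)  # stand-in module; replace with the actual SKNet50

# SGD with nesterov momentum (args.lr / args.momentum / args.weight_decay / args.nesterov)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                            weight_decay=1e-4, nesterov=True)

# 'multistep' schedule: multiply lr by 0.1 every 30 epochs over 100 epochs
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[30, 60, 90],
                                                 gamma=0.1)

# label smoothing of 0.1 (torch >= 1.10)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)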

The label-smoothing trick is already included. The data augmentation is just RandomResizedCrop / HorizontalFlip / Normalize:

Compose(
    RandomResizedCrop(size=(224, 224), scale=(0.08, 1.0), ratio=(0.75, 1.3333), interpolation=PIL.Image.BILINEAR)
    RandomHorizontalFlip(p=0.5)
    ToTensor()
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)
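
In code, this printed pipeline corresponds to roughly the following torchvision construction (BILINEAR is already the RandomResizedCrop default, so it is omitted here):

from torchvision import transforms

train_transform = transforms.Compose([
    # random area 8%-100%, aspect ratio 3/4-4/3, resized to 224x224
    transforms.RandomResizedCrop(224, scale=(0.08, 1.0), ratio=(3/4, 4/3)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])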

I read sknet50.prototxt carefully; the only difference at the model level is the BN momentum, which defaults to 0.1 in PyTorch while the authors use 0.05 (i.e. a 0.95 moving-average factor). I am not sure how big an effect this parameter has on the results.
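
If anyone wants to test this, the PyTorch default can be overridden after building the model; a small sketch (note that PyTorch's momentum is the weight given to the new batch statistics, so Caffe's 0.95 moving-average factor corresponds to momentum = 0.05):

import torch.nn as nn

def set_bn_momentum(model, momentum=0.05):
    """Set the momentum of every BatchNorm2d layer in the model."""
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.momentum = momentum

# usage: set_bn_momentum(sknet50_model, momentum=0.05)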


implus commented on September 28, 2024

Hi, all:

The performance of SKNet-50 in the original paper was obtained under the Caffe framework, using exactly the original Caffe repo of SENet (we directly used the SENet authors' environment, as we were teammates at the time). I remember there may be more differences compared with the PyTorch repo, mainly framework differences and more aggressive data augmentation.


jianghaojun commented on September 28, 2024

Hi, all:

The performance of SKNet-50 in the original paper was obtained under the Caffe framework, using exactly the original Caffe repo of SENet (we directly used the SENet authors' environment, as we were teammates at the time). I remember there may be more differences compared with the PyTorch repo, mainly framework differences and more aggressive data augmentation.

Thanks for the reply!

But what do you mean by more aggressive data augmentation? The published paper says: "For data augmentation, we follow the standard practice and perform random size cropping to 224 x 224 and random horizontal flipping. The practical mean channel subtraction is adopted to normalize the input images for both training and testing." To the best of my knowledge, this data augmentation is implemented as the following code in PyTorch:

Compose(
    RandomResizedCrop(size=(224, 224), scale=(0.08, 1.0), ratio=(0.75, 1.3333), interpolation=PIL.Image.BILINEAR)
    RandomHorizontalFlip(p=0.5)
    ToTensor()
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)

Can you give more details about the data augmentation? Also, I will read the SENet Caffe code and try to figure out the differences.


I just read the SENet repo. The data augmentation for SENet is:

Method            Settings
Random Mirror     True
Random Crop       8% ~ 100%
Aspect Ratio      3/4 ~ 4/3
Random Rotation   -10° ~ 10°
Pixel Jitter      -20 ~ 20

Besides RandomResizedCrop (Random Crop / Aspect Ratio) and HorizontalFlip (Random Mirror), more aggressive augmentations such as RandomRotation and PixelJitter are also used.
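
If I reproduce this in torchvision it would look roughly like the sketch below; the pixel jitter (±20 on the 0-255 scale, applied here after ToTensor as ±20/255) is my own approximation and may not match the Caffe implementation exactly:

import torch
from torchvision import transforms

def pixel_jitter(img, magnitude=20.0 / 255.0):
    """Add uniform per-pixel noise of +/-20 (0-255 scale) to a [0, 1] tensor image."""
    noise = torch.empty_like(img).uniform_(-magnitude, magnitude)
    return (img + noise).clamp(0.0, 1.0)

senet_like_transform = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.08, 1.0), ratio=(3/4, 4/3)),  # Random Crop / Aspect Ratio
    transforms.RandomHorizontalFlip(p=0.5),   # Random Mirror
    transforms.RandomRotation(degrees=10),    # Random Rotation -10 ~ 10 degrees
    transforms.ToTensor(),
    transforms.Lambda(pixel_jitter),          # Pixel Jitter -20 ~ 20
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])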


implus commented on September 28, 2024

All right~ You can also try SENet under the same PyTorch repo (and the same data augmentation), and you will find that its result is also lower than in the original paper.


jianghaojun commented on September 28, 2024

That's interesting. I am curious about the framework difference and its effect on model performance. I will try to train SKNet50 with the SENet data augmentation in PyTorch.

