Comments (5)
I encountered the same problem as @dreamcontinue. I trained SKNet-50 twice with the args below; the best top-1 accuracy is 78.84%, which is 0.37% lower than the paper (79.21%).
```python
args.epochs = 100
args.batch_size = 256
### data transform
args.autoaugment = False
args.colorjitter = False
args.change_light = False
### optimizer
args.optimizer = 'SGD'
args.lr = 0.1
args.momentum = 0.9
args.weight_decay = 1e-4
args.nesterov = True
### criterion
args.labelsmooth = 0.1
### lr scheduler
args.scheduler = 'multistep'
args.lr_decay_rate = 0.1
args.lr_decay_step = 30
```
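Reading `args.scheduler = 'multistep'` together with `lr_decay_rate` and `lr_decay_step`, the schedule appears to decay the learning rate by 0.1 at every multiple of 30 epochs (an assumption on my part; the repo may use an explicit milestone list instead). A framework-agnostic sketch of that schedule:

```python
def multistep_lr(epoch, base_lr=0.1, decay_rate=0.1, decay_step=30):
    """Learning rate at a given epoch: multiply base_lr by decay_rate
    once for every full decay_step epochs elapsed."""
    return base_lr * decay_rate ** (epoch // decay_step)

# lr stays at 0.1 for epochs 0-29, drops to 0.01 at epoch 30,
# 0.001 at epoch 60, and 1e-4 at epoch 90.
print([multistep_lr(e) for e in (0, 29, 30, 60, 90)])
```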
The label-smoothing trick is already included. The data augmentation is just RandomResizedCrop / RandomHorizontalFlip / Normalize:
```
Compose(
    RandomResizedCrop(size=(224, 224), scale=(0.08, 1.0), ratio=(0.75, 1.3333), interpolation=PIL.Image.BILINEAR)
    RandomHorizontalFlip(p=0.5)
    ToTensor()
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)
```
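For reference, label smoothing with `args.labelsmooth = 0.1` mixes the one-hot target with a uniform distribution over classes. A minimal NumPy sketch of the loss (not the repo's actual implementation, just the standard formulation):

```python
import numpy as np

def smoothed_cross_entropy(logits, label, eps=0.1):
    """Cross-entropy against a target that is (1 - eps) one-hot
    plus eps spread uniformly over all classes."""
    logits = np.asarray(logits, dtype=np.float64)
    n_classes = logits.shape[-1]
    # Numerically stable log-softmax.
    log_probs = logits - logits.max() - np.log(np.sum(np.exp(logits - logits.max())))
    target = np.full(n_classes, eps / n_classes)
    target[label] += 1.0 - eps
    return -np.sum(target * log_probs)
```

With `eps = 0` this reduces to the ordinary cross-entropy; with `eps = 0.1` the loss is slightly larger whenever the true class already has the largest logit, which is the intended regularization effect.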
I read sknet50.prototxt carefully; the only difference at the model level is BN's momentum, which is 0.1 (the default) in PyTorch, while the authors use 0.05 (1 - 0.95). I am not sure how big an effect this parameter has on the results.
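The two conventions are easy to confuse: PyTorch's BatchNorm updates its running statistics as `running = (1 - momentum) * running + momentum * batch_stat`, while Caffe parameterizes roughly the same update by `moving_average_fraction`, so a prototxt value of 0.95 corresponds to a PyTorch momentum of 0.05, i.e. slower-moving running statistics than PyTorch's default 0.1. A tiny sketch of the update rule:

```python
def bn_running_update(running, batch_stat, momentum=0.1):
    """PyTorch-style running-statistic update for BatchNorm.

    Caffe's moving_average_fraction f corresponds (approximately) to
    momentum = 1 - f, so the prototxt's f = 0.95 means momentum = 0.05 here.
    """
    return (1.0 - momentum) * running + momentum * batch_stat

# One step from running=1.0 toward batch_stat=0.0:
# momentum=0.10 (PyTorch default)  -> 0.90
# momentum=0.05 (the prototxt)     -> 0.95
```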
from sknet.
Hi, all:
The performance of SKNet-50 in the original paper was obtained under the Caffe framework, specifically the original Caffe repo of SENet (we directly used the environment of the SENet authors, as we were teammates at the time). I recall there may be more differences from the PyTorch repo, mainly framework differences and more aggressive data augmentation.
Thanks for the reply!
But what do you mean by more aggressive data augmentation? The published paper says: "For data augmentation, we follow the standard practice and perform the random size cropping to 224 × 224 and random horizontal flipping. The practical mean channel subtraction is adopted to normalize the input images for both training and testing." To the best of my knowledge, this data augmentation is implemented by the following PyTorch code:
```
Compose(
    RandomResizedCrop(size=(224, 224), scale=(0.08, 1.0), ratio=(0.75, 1.3333), interpolation=PIL.Image.BILINEAR)
    RandomHorizontalFlip(p=0.5)
    ToTensor()
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)
```
Can you give more details about the data augmentation? I will also read the SENet Caffe code and try to figure out the differences.
I just read the SENet repo. The data augmentation for SENet is:

Method | Settings
---|---
Random Mirror | True
Random Crop | 8% ~ 100%
Aspect Ratio | 3/4 ~ 4/3
Random Rotation | -10° ~ 10°
Pixel Jitter | -20 ~ 20
Besides RandomResizedCrop (Random Crop / Aspect Ratio) and RandomHorizontalFlip (Random Mirror), more aggressive augmentations such as random rotation and pixel jitter are also used.
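The rotation maps directly to torchvision's `RandomRotation(10)`. Pixel jitter has no direct torchvision equivalent; assuming it means adding uniform noise in [-20, 20] per pixel on the 0-255 scale (an assumption on my part, since the SENet repo does not spell out the exact scheme), a NumPy sketch would be:

```python
import numpy as np

def pixel_jitter(img, magnitude=20, rng=None):
    """Add uniform integer noise in [-magnitude, magnitude] to each pixel
    of an HxWxC uint8 image, then clip back to [0, 255].

    The per-pixel form is an assumption; the jitter could also be applied
    per image or per channel."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.integers(-magnitude, magnitude + 1, size=img.shape)
    return np.clip(img.astype(np.int16) + noise, 0, 255).astype(np.uint8)
```

Wrapped in a small transform class, this could be dropped into the existing Compose pipeline before ToTensor().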
All right~ You can also try SENet under the same PyTorch repo (and the same data augmentation), and you will find the result is also lower than in its original paper.
That's interesting. I am curious about the framework difference and its effect on model performance. I will try to train SKNet-50 under the SENet data augmentation in PyTorch.