facebookresearch / resnext
Implementation of a classification framework from the paper Aggregated Residual Transformations for Deep Neural Networks
License: Other
I trained ResNeXt-64x8 and ResNeXt-64x16 on CIFAR-10, but the results are slightly worse than those reported in the paper. I trained these models on three GPUs:
th main.lua -dataset cifar10 -bottleneckType resnext_C -depth 29 -baseWidth 64 -cardinality 16/8 -weightDecay 5e-4 -batchSize 128 -nGPU 3 -nThreads 8 -shareGradInput true
The best results I got in each run were:
ResNeXt-64x8: 3.68%, 3.80%, 3.86%, 3.90%, 3.76%, 3.84%
ResNeXt-64x16: 3.84%, 3.60%
The average results in the paper are 3.65% and 3.58% for these two networks.
Has anyone run into the same issue? Any suggestions?
Thanks a lot.
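For reference, a quick Torch snippet that puts a mean and spread on the runs listed above (the numbers are simply copied from this post; nothing here comes from the repo):

require 'torch'
-- Mean and sample std of the six ResNeXt-64x8 runs reported above
local errs = torch.Tensor({3.68, 3.80, 3.86, 3.90, 3.76, 3.84})
print(string.format('%.2f +/- %.2f', errs:mean(), errs:std()))
-- prints 3.81 +/- 0.08, against the paper's reported average of 3.65

So the gap to the paper's average is roughly two standard deviations of these runs.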
Hi, can you provide the pretrained ResNeXt models? The model link provided seems to be broken!
Recently I have been trying to reproduce your result in Torch. My command is
th main.lua -dataset cifar10 -bottleneckType resnext_C -depth 29 -baseWidth 64 -cardinality 16 -weightDecay 5e-6 -batchSize 32 -nGPU 2 -LR 0.025 -nThreads 8 -shareGradInput true | tee -a ./cifar10_2gpu_torch.log
I copied the command from README.md (CIFAR-10, 2 GPUs).
The problem starts at epoch 71.
Here is my log file (for readability I picked out part of it):
* Finished epoch # 60 top1: 6.610 top5: 0.140
* Finished epoch # 61 top1: 7.050 top5: 0.150
* Finished epoch # 62 top1: 7.680 top5: 0.240
* Finished epoch # 63 top1: 7.180 top5: 0.230
* Finished epoch # 64 top1: 7.100 top5: 0.220
* Finished epoch # 65 top1: 6.980 top5: 0.160
* Finished epoch # 66 top1: 6.850 top5: 0.170
* Finished epoch # 67 top1: 6.870 top5: 0.180
* Finished epoch # 68 top1: 7.010 top5: 0.270
* Finished epoch # 69 top1: 6.910 top5: 0.220
* Finished epoch # 70 top1: 6.290 top5: 0.130
* Finished epoch # 71 top1: 85.740 top5: 34.780
* Finished epoch # 72 top1: 81.790 top5: 33.700
* Finished epoch # 73 top1: 80.220 top5: 28.920
* Finished epoch # 74 top1: 79.200 top5: 31.640
* Finished epoch # 75 top1: 78.980 top5: 27.150
* Finished epoch # 76 top1: 79.540 top5: 30.260
* Finished epoch # 77 top1: 81.540 top5: 29.620
And the per-batch output during epoch 78:
| Epoch: [78][1158/1563] Time 1.024 Data 0.000 Err 1528913280.0000 top1 81.250 top5 28.125
| Epoch: [78][1159/1563] Time 0.881 Data 0.000 Err 1559899264.0000 top1 81.250 top5 15.625
| Epoch: [78][1160/1563] Time 0.975 Data 0.000 Err 8231911424.0000 top1 87.500 top5 40.625
| Epoch: [78][1161/1563] Time 0.928 Data 0.000 Err 554394944.0000 top1 78.125 top5 28.125
| Epoch: [78][1162/1563] Time 1.012 Data 0.000 Err 4567331328.0000 top1 93.750 top5 40.625
| Epoch: [78][1163/1563] Time 1.146 Data 0.000 Err 2310403584.0000 top1 78.125 top5 34.375
| Epoch: [78][1164/1563] Time 0.947 Data 0.000 Err 2803231744.0000 top1 81.250 top5 25.000
| Epoch: [78][1165/1563] Time 0.956 Data 0.000 Err 2265360896.0000 top1 87.500 top5 50.000
| Epoch: [78][1166/1563] Time 0.867 Data 0.000 Err 1953190016.0000 top1 84.375 top5 21.875
| Epoch: [78][1167/1563] Time 1.014 Data 0.000 Err 2912053760.0000 top1 93.750 top5 28.125
| Epoch: [78][1168/1563] Time 1.007 Data 0.000 Err 4222694656.0000 top1 84.375 top5 31.250
| Epoch: [78][1169/1563] Time 0.895 Data 0.000 Err 5509958144.0000 top1 81.250 top5 37.500
| Epoch: [78][1170/1563] Time 0.979 Data 0.000 Err 5301891584.0000 top1 84.375 top5 34.375
| Epoch: [78][1171/1563] Time 0.920 Data 0.000 Err 3593149184.0000 top1 87.500 top5 28.125
| Epoch: [78][1172/1563] Time 1.020 Data 0.000 Err 7279746560.0000 top1 90.625 top5 31.250
| Epoch: [78][1173/1563] Time 1.002 Data 0.000 Err 10108009472.0000 top1 87.500 top5 31.250
| Epoch: [78][1174/1563] Time 0.861 Data 0.001 Err 2861270528.0000 top1 87.500 top5 28.125
| Epoch: [78][1175/1563] Time 0.862 Data 0.000 Err 4651573760.0000 top1 87.500 top5 31.250
| Epoch: [78][1176/1563] Time 1.051 Data 0.000 Err 92108896.0000 top1 75.000 top5 31.250
| Epoch: [78][1177/1563] Time 1.024 Data 0.000 Err 2649925888.0000 top1 87.500 top5 43.750
| Epoch: [78][1178/1563] Time 0.967 Data 0.000 Err 2876758784.0000 top1 71.875 top5 18.750
| Epoch: [78][1179/1563] Time 0.942 Data 0.000 Err 2976156928.0000 top1 71.875 top5 15.625
| Epoch: [78][1180/1563] Time 0.882 Data 0.000 Err 838116416.0000 top1 78.125 top5 43.750
| Epoch: [78][1181/1563] Time 1.028 Data 0.000 Err 6477106688.0000 top1 78.125 top5 37.500
| Epoch: [78][1182/1563] Time 1.004 Data 0.000 Err 5051654144.0000 top1 84.375 top5 31.250
| Epoch: [78][1183/1563] Time 0.859 Data 0.000 Err 5013932544.0000 top1 87.500 top5 34.375
| Epoch: [78][1184/1563] Time 0.848 Data 0.001 Err 2034009088.0000 top1 93.750 top5 25.000
| Epoch: [78][1185/1563] Time 1.060 Data 0.000 Err 3669680640.0000 top1 78.125 top5 25.000
| Epoch: [78][1186/1563] Time 1.028 Data 0.000 Err 4146675200.0000 top1 93.750 top5 28.125
| Epoch: [78][1187/1563] Time 0.966 Data 0.000 Err 2259935488.0000 top1 84.375 top5 34.375
| Epoch: [78][1188/1563] Time 0.956 Data 0.000 Err 1698448512.0000 top1 75.000 top5 25.000
| Epoch: [78][1189/1563] Time 0.864 Data 0.000 Err 4151320064.0000 top1 90.625 top5 56.250
| Epoch: [78][1190/1563] Time 1.035 Data 0.000 Err 1942320000.0000 top1 87.500 top5 31.250
| Epoch: [78][1191/1563] Time 1.026 Data 0.000 Err 1455451520.0000 top1 81.250 top5 31.250
| Epoch: [78][1192/1563] Time 0.867 Data 0.000 Err 2734585856.0000 top1 90.625 top5 40.625
| Epoch: [78][1193/1563] Time 0.965 Data 0.000 Err 36324916.0000 top1 81.250 top5 18.750
| Epoch: [78][1194/1563] Time 0.913 Data 0.000 Err 6873055744.0000 top1 90.625 top5 50.000
| Epoch: [78][1195/1563] Time 1.004 Data 0.000 Err 1242362112.0000 top1 84.375 top5 31.250
So does anyone have the same problem, or know how to solve it? I have been running this for nearly two days, and it is really disappointing.
Thanks a lot.
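Not a fix from this repo, but when the loss suddenly blows up to ~1e9 like this, two things worth trying are (a) restarting from the last good checkpoint with a lower learning rate, and (b) clipping gradients so a single bad batch cannot destroy the weights. Below is a minimal, self-contained Torch sketch of (b); the toy model and the clamp threshold of 5 are illustrative assumptions, not values from the repo's train.lua:

require 'nn'
require 'optim'

local model = nn.Linear(10, 2)            -- stand-in for the real network
local criterion = nn.MSECriterion()
local params, gradParams = model:getParameters()
local optimState = {learningRate = 0.025, momentum = 0.9, weightDecay = 5e-4}

local function step(input, target)
   local function feval()
      gradParams:zero()
      local output = model:forward(input)
      local loss = criterion:forward(output, target)
      model:backward(input, criterion:backward(output, target))
      gradParams:clamp(-5, 5)              -- clip before the SGD step
      return loss, gradParams
   end
   local _, fs = optim.sgd(feval, params, optimState)
   return fs[1]
end

print(step(torch.randn(4, 10), torch.randn(4, 2)))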
May I ask why ResNeXt doesn't use pre-activation as described in Identity Mappings in Deep Residual Networks? I didn't see the reason given in Aggregated Residual Transformations for Deep Neural Networks.
Hi,
In appendix "A. Implementation Details: CIFAR" of the paper "Aggregated Residual Transformations for Deep Neural Networks", it is written: "We adopt the pre-activation style block as in [14]", where [14] is the paper "Identity Mappings in Deep Residual Networks". However, when I look at resnext_bottleneck_B, I see it has the original-style block, with BN before the sum and ReLU after. Did you get your results with the original-style block or with the pre-activation-style block?
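For concreteness, here is a minimal Torch sketch of the two orderings being discussed, using plain 3x3 convolutions instead of the aggregated bottleneck, just to show where BN and ReLU sit relative to the sum (an illustration, not the repo's resnext.lua):

require 'nn'

-- Original (post-activation) block: BN before the sum, ReLU after it
local function originalBlock(n)
   local branch = nn.Sequential()
      :add(nn.SpatialConvolution(n, n, 3, 3, 1, 1, 1, 1))
      :add(nn.SpatialBatchNormalization(n))
      :add(nn.ReLU(true))
      :add(nn.SpatialConvolution(n, n, 3, 3, 1, 1, 1, 1))
      :add(nn.SpatialBatchNormalization(n))
   return nn.Sequential()
      :add(nn.ConcatTable():add(branch):add(nn.Identity()))
      :add(nn.CAddTable())
      :add(nn.ReLU(true))
end

-- Pre-activation block: BN-ReLU before each conv, nothing after the sum
local function preactBlock(n)
   local branch = nn.Sequential()
      :add(nn.SpatialBatchNormalization(n))
      :add(nn.ReLU(true))
      :add(nn.SpatialConvolution(n, n, 3, 3, 1, 1, 1, 1))
      :add(nn.SpatialBatchNormalization(n))
      :add(nn.ReLU(true))
      :add(nn.SpatialConvolution(n, n, 3, 3, 1, 1, 1, 1))
   return nn.Sequential()
      :add(nn.ConcatTable():add(branch):add(nn.Identity()))
      :add(nn.CAddTable())
end

print(originalBlock(16):forward(torch.randn(1, 16, 8, 8)):size())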
Hi, I have a question about the license of the ImageNet pre-trained models.
I want to fine-tune an ImageNet pre-trained model on my own dataset.
In this case, which license should I follow? Is it non-commercial?
It seems the Caffe third-party repository no longer exists. Is there an alternative? I found https://github.com/firekong0909/ResNeXt, but the download links appear to be broken.
Hi,
I'm trying to compute GFLOPs using this code: https://github.com/apaszke/torch-opCounter
The input is [1, 3, 224, 224], and the computed count is about 22.84 GFLOPs for ResNeXt-50 (32x4d), which differs from the reported 4.1 GFLOPs.
Is there anything wrong with my approach? How do you compute GFLOPs?
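One thing to check first: if I remember correctly, the paper counts FLOPs as multiply-adds (the ResNet paper's convention), so a counter that counts multiplies and adds separately will already report about 2x the paper's number; that alone does not explain 22.84 vs 4.1, so the input resolution and which layers get counted are also worth double-checking against torch-opCounter's assumptions. As a hand sanity check, the multiply-adds of a single convolution are easy to compute; convMacs below is a hypothetical helper, not part of torch-opCounter:

-- Multiply-adds of one convolution:
--   kW * kH * (Cin / groups) * Cout * Hout * Wout
local function convMacs(kW, kH, cin, cout, hout, wout, groups)
   groups = groups or 1
   return kW * kH * (cin / groups) * cout * hout * wout
end

-- First 7x7/2 conv of ResNeXt-50 on a 3x224x224 input (output 64x112x112):
print(convMacs(7, 7, 3, 64, 112, 112))   -- about 1.18e8 multiply-adds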
Hi, where can we find the training scripts for SSD-ResNet-50 and SSD-ResNet-152 in Caffe? I need them.
Kindly reply to this issue.
Thank you.
Dear authors:
Great work for deep learning. I want to know how I can cite your work in BibTeX; the given citation seems incomplete. Could you please update it?
Thanks a lot.
@article{Xie2016,
  title   = {Aggregated Residual Transformations for Deep Neural Networks},
  author  = {Saining Xie and Ross Girshick and Piotr Dollár and Zhuowen Tu and Kaiming He},
  journal = {arXiv preprint arXiv:1611.05431},
  year    = {2016}
}
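For what it's worth, the paper was later published at CVPR 2017, so if you want the conference version the entry would be something like:

@inproceedings{Xie2017,
  title     = {Aggregated Residual Transformations for Deep Neural Networks},
  author    = {Saining Xie and Ross Girshick and Piotr Dollár and Zhuowen Tu and Kaiming He},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2017}
}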
Is there any chance we can get your 5k-way pretrained models? I suspect these will work much better for pretraining.
What LR should I choose for ImageNet training if I use 4 GPUs instead of 8, with a batch size of 32 per GPU?
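A common heuristic is the linear scaling rule from Goyal et al., "Accurate, Large Minibatch SGD": scale the learning rate in proportion to the total batch size. A minimal sketch, assuming a baseline of 8 GPUs x 32 images = 256 per batch at LR 0.1 (those baseline numbers are my assumption, not confirmed from the README):

-- Linear scaling rule: lr_new = lr_base * (batch_new / batch_base)
local baseLR, baseBatch = 0.1, 8 * 32    -- assumed baseline: 8 GPUs, 32 each
local newBatch = 4 * 32                  -- 4 GPUs, 32 images each
print(baseLR * newBatch / baseBatch)     -- 0.05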