facebookresearch / resnext

Implementation of a classification framework from the paper Aggregated Residual Transformations for Deep Neural Networks

License: Other

Lua 100.00%

resnext's Issues

The results on CIFAR-10

I trained ResNeXt-64x8 and ResNeXt-64x16 on CIFAR-10, but the results are slightly worse than those reported in the paper. I trained these models on three GPUs:
th main.lua -dataset cifar10 -bottleneckType resnext_C -depth 29 -baseWidth 64 -cardinality 16/8 -weightDecay 5e-4 -batchSize 128 -nGPU 3 -nThreads 8 -shareGradInput true
The best results I got for each run are:
ResNeXt-64x8: 3.68%, 3.80%, 3.86%, 3.90%, 3.76%, 3.84%
ResNeXt-64x16: 3.84%, 3.60%
The average results in the paper are 3.65% and 3.58% for these two networks.
Has anyone seen the same issue? Any suggestions?
Thanks a lot.

Caffe pretrained model

Hi, can you provide the pretrained Caffe model of ResNeXt? The model link provided seems to be broken.

The error value suddenly jumps to a huge number

Recently I have been trying to reproduce your results in Torch, and my command is:

th main.lua -dataset cifar10 -bottleneckType resnext_C -depth 29 -baseWidth 64 -cardinality 16 -weightDecay 5e-6 -batchSize 32 -nGPU 2 -LR 0.025 -nThreads 8 -shareGradInput true | tee -a ./cifar10_2gpu_torch.log

I copied the command from README.md (CIFAR-10, 2 GPUs).

The problem starts at epoch 71.
Here is my log file (for readability I have picked out part of it):

 * Finished epoch # 60     top1:   6.610  top5:   0.140
 * Finished epoch # 61     top1:   7.050  top5:   0.150
 * Finished epoch # 62     top1:   7.680  top5:   0.240
 * Finished epoch # 63     top1:   7.180  top5:   0.230
 * Finished epoch # 64     top1:   7.100  top5:   0.220
 * Finished epoch # 65     top1:   6.980  top5:   0.160
 * Finished epoch # 66     top1:   6.850  top5:   0.170
 * Finished epoch # 67     top1:   6.870  top5:   0.180
 * Finished epoch # 68     top1:   7.010  top5:   0.270
 * Finished epoch # 69     top1:   6.910  top5:   0.220
 * Finished epoch # 70     top1:   6.290  top5:   0.130
 * Finished epoch # 71     top1:  85.740  top5:  34.780
 * Finished epoch # 72     top1:  81.790  top5:  33.700
 * Finished epoch # 73     top1:  80.220  top5:  28.920
 * Finished epoch # 74     top1:  79.200  top5:  31.640
 * Finished epoch # 75     top1:  78.980  top5:  27.150
 * Finished epoch # 76     top1:  79.540  top5:  30.260
 * Finished epoch # 77     top1:  81.540  top5:  29.620

And the per-batch output within an epoch:

 | Epoch: [78][1158/1563]    Time 1.024  Data 0.000  Err 1528913280.0000  top1  81.250  top5  28.125
 | Epoch: [78][1159/1563]    Time 0.881  Data 0.000  Err 1559899264.0000  top1  81.250  top5  15.625
 | Epoch: [78][1160/1563]    Time 0.975  Data 0.000  Err 8231911424.0000  top1  87.500  top5  40.625
 | Epoch: [78][1161/1563]    Time 0.928  Data 0.000  Err 554394944.0000  top1  78.125  top5  28.125
 | Epoch: [78][1162/1563]    Time 1.012  Data 0.000  Err 4567331328.0000  top1  93.750  top5  40.625
 | Epoch: [78][1163/1563]    Time 1.146  Data 0.000  Err 2310403584.0000  top1  78.125  top5  34.375
 | Epoch: [78][1164/1563]    Time 0.947  Data 0.000  Err 2803231744.0000  top1  81.250  top5  25.000
 | Epoch: [78][1165/1563]    Time 0.956  Data 0.000  Err 2265360896.0000  top1  87.500  top5  50.000
 | Epoch: [78][1166/1563]    Time 0.867  Data 0.000  Err 1953190016.0000  top1  84.375  top5  21.875
 | Epoch: [78][1167/1563]    Time 1.014  Data 0.000  Err 2912053760.0000  top1  93.750  top5  28.125
 | Epoch: [78][1168/1563]    Time 1.007  Data 0.000  Err 4222694656.0000  top1  84.375  top5  31.250
 | Epoch: [78][1169/1563]    Time 0.895  Data 0.000  Err 5509958144.0000  top1  81.250  top5  37.500
 | Epoch: [78][1170/1563]    Time 0.979  Data 0.000  Err 5301891584.0000  top1  84.375  top5  34.375
 | Epoch: [78][1171/1563]    Time 0.920  Data 0.000  Err 3593149184.0000  top1  87.500  top5  28.125
 | Epoch: [78][1172/1563]    Time 1.020  Data 0.000  Err 7279746560.0000  top1  90.625  top5  31.250
 | Epoch: [78][1173/1563]    Time 1.002  Data 0.000  Err 10108009472.0000  top1  87.500  top5  31.250
 | Epoch: [78][1174/1563]    Time 0.861  Data 0.001  Err 2861270528.0000  top1  87.500  top5  28.125
 | Epoch: [78][1175/1563]    Time 0.862  Data 0.000  Err 4651573760.0000  top1  87.500  top5  31.250
 | Epoch: [78][1176/1563]    Time 1.051  Data 0.000  Err 92108896.0000  top1  75.000  top5  31.250
 | Epoch: [78][1177/1563]    Time 1.024  Data 0.000  Err 2649925888.0000  top1  87.500  top5  43.750
 | Epoch: [78][1178/1563]    Time 0.967  Data 0.000  Err 2876758784.0000  top1  71.875  top5  18.750
 | Epoch: [78][1179/1563]    Time 0.942  Data 0.000  Err 2976156928.0000  top1  71.875  top5  15.625
 | Epoch: [78][1180/1563]    Time 0.882  Data 0.000  Err 838116416.0000  top1  78.125  top5  43.750
 | Epoch: [78][1181/1563]    Time 1.028  Data 0.000  Err 6477106688.0000  top1  78.125  top5  37.500
 | Epoch: [78][1182/1563]    Time 1.004  Data 0.000  Err 5051654144.0000  top1  84.375  top5  31.250
 | Epoch: [78][1183/1563]    Time 0.859  Data 0.000  Err 5013932544.0000  top1  87.500  top5  34.375
 | Epoch: [78][1184/1563]    Time 0.848  Data 0.001  Err 2034009088.0000  top1  93.750  top5  25.000
 | Epoch: [78][1185/1563]    Time 1.060  Data 0.000  Err 3669680640.0000  top1  78.125  top5  25.000
 | Epoch: [78][1186/1563]    Time 1.028  Data 0.000  Err 4146675200.0000  top1  93.750  top5  28.125
 | Epoch: [78][1187/1563]    Time 0.966  Data 0.000  Err 2259935488.0000  top1  84.375  top5  34.375
 | Epoch: [78][1188/1563]    Time 0.956  Data 0.000  Err 1698448512.0000  top1  75.000  top5  25.000
 | Epoch: [78][1189/1563]    Time 0.864  Data 0.000  Err 4151320064.0000  top1  90.625  top5  56.250
 | Epoch: [78][1190/1563]    Time 1.035  Data 0.000  Err 1942320000.0000  top1  87.500  top5  31.250
 | Epoch: [78][1191/1563]    Time 1.026  Data 0.000  Err 1455451520.0000  top1  81.250  top5  31.250
 | Epoch: [78][1192/1563]    Time 0.867  Data 0.000  Err 2734585856.0000  top1  90.625  top5  40.625
 | Epoch: [78][1193/1563]    Time 0.965  Data 0.000  Err 36324916.0000  top1  81.250  top5  18.750
 | Epoch: [78][1194/1563]    Time 0.913  Data 0.000  Err 6873055744.0000  top1  90.625  top5  50.000
 | Epoch: [78][1195/1563]    Time 1.004  Data 0.000  Err 1242362112.0000  top1  84.375  top5  31.250

Does anyone have the same problem, or know how to solve it? I have been running this for nearly two days, and the result is really disappointing.
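For what it's worth, one generic mitigation for this kind of loss explosion is to clip the gradient norm between the backward pass and the optimizer step. The sketch below is not part of this repository's training script; the toy model, the maxGradNorm threshold, and the clipGradients helper are all placeholders:

require 'nn'

-- toy model only so the sketch runs on its own; in practice use the real network
local model = nn.Sequential():add(nn.Linear(10, 10))
-- flattened parameter and gradient tensors shared by the whole model
local params, gradParams = model:getParameters()
local maxGradNorm = 10   -- placeholder threshold, to be tuned

-- hypothetical helper: rescale gradients whenever their norm exceeds maxGradNorm
local function clipGradients()
   local norm = gradParams:norm()
   if norm > maxGradNorm then
      gradParams:mul(maxGradNorm / norm)
   end
end

In a training loop this would be called right after model:backward(...) and before the optim.sgd step.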

Thanks a lot.

Why doesn't ResNeXt use pre-activation?

May I ask why ResNeXt doesn't use pre-activation as described in Identity Mappings in Deep Residual Networks? I didn't find the reason in Aggregated Residual Transformations for Deep Neural Networks.

Question about style block

Hi,

In appendix "A. Implementation Details: CIFAR" of the paper "Aggregated Residual Transformations for Deep Neural Networks", it is written that "We adopt the pre-activation style block as in [14]", where [14] is the paper "Identity mappings in deep residual networks". However, when I look at resnext_bottleneck_B, I see it has the original style block, with BN before the sum and ReLU after. Did you get your results with the original style block or with the pre-activation style block?
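For illustration, the difference between the two orderings can be sketched in Torch nn as below. This is a simplified sketch rather than code from this repository: it uses a plain two-conv residual branch with a placeholder width n instead of the grouped bottleneck, and only the position of BN/ReLU relative to the sum is the point.

require 'nn'

-- original-style block: BN ends the branch, ReLU comes after the sum
local function originalBlock(n)
   local branch = nn.Sequential()
      :add(nn.SpatialConvolution(n, n, 3, 3, 1, 1, 1, 1))
      :add(nn.SpatialBatchNormalization(n))
      :add(nn.ReLU(true))
      :add(nn.SpatialConvolution(n, n, 3, 3, 1, 1, 1, 1))
      :add(nn.SpatialBatchNormalization(n))
   return nn.Sequential()
      :add(nn.ConcatTable():add(branch):add(nn.Identity()))
      :add(nn.CAddTable())
      :add(nn.ReLU(true))
end

-- pre-activation-style block: BN and ReLU precede each conv, nothing after the sum
local function preActBlock(n)
   local branch = nn.Sequential()
      :add(nn.SpatialBatchNormalization(n))
      :add(nn.ReLU(true))
      :add(nn.SpatialConvolution(n, n, 3, 3, 1, 1, 1, 1))
      :add(nn.SpatialBatchNormalization(n))
      :add(nn.ReLU(true))
      :add(nn.SpatialConvolution(n, n, 3, 3, 1, 1, 1, 1))
   return nn.Sequential()
      :add(nn.ConcatTable():add(branch):add(nn.Identity()))
      :add(nn.CAddTable())
end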

About license

Hi, I have a question about the license of the ImageNet pre-trained models.
I want to fine-tune an ImageNet pre-trained model on my own dataset.
In this case, which license should I follow? Is it non-commercial?

How do you compute GFLOPs?

Hi,
I'm trying to compute GFLOPs by using this code: https://github.com/apaszke/torch-opCounter
The input data is [1, 3, 224, 224], and the computed result is about 22.84 GFLOPs for ResNeXt-50 (32x4d), which differs from the reported 4.1 GFLOPs.

So is there anything wrong with my approach? How do you compute GFLOPs?
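For comparison, a back-of-the-envelope per-layer count (a sketch, not the authors' script) that treats one multiply-add as one FLOP for convolutions could look like the snippet below; convMultAdds and its arguments are hypothetical:

-- rough multiply-add count for a single convolution layer
local function convMultAdds(outH, outW, nIn, nOut, kH, kW, groups)
   groups = groups or 1
   return outH * outW * nOut * (nIn / groups) * kH * kW
end

-- example: a 7x7, stride-2, 3->64 conv on a 224x224 input gives a 112x112 output
print(convMultAdds(112, 112, 3, 64, 7, 7))   -- 118013952, i.e. about 0.12 GFLOPs

Summing such terms over every convolution (plus the final fully connected layer) gives the per-image figure; whether a counter tallies multiplies and adds separately (a factor of two) and whether it accounts for grouped convolutions are two things worth checking when the numbers disagree.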

How to cite your ResNeXt work?

Dear authors,
This is great work for deep learning. I want to know how I can cite your work in BibTeX. The given citation seems incomplete; could you please update it?

Thanks a lot.

@article{Xie2016,
  title={Aggregated Residual Transformations for Deep Neural Networks},
  author={Saining Xie and Ross Girshick and Piotr Dollár and Zhuowen Tu and Kaiming He},
  journal={arXiv preprint arXiv:1611.05431},
  year={2016}
}

5k pretrained models

Is there any chance we can get your 5k-way pretrained models? I suspect these will work much better for pre-training.
