facebookresearch / resnext
Implementation of a classification framework from the paper Aggregated Residual Transformations for Deep Neural Networks
License: Other
I trained ResNeXt-64x8 and ResNeXt-64x16 on CIFAR-10, but the results are slightly worse than those reported in the paper. I trained these models on three GPUs:
th main.lua -dataset cifar10 -bottleneckType resnext_C -depth 29 -baseWidth 64 -cardinality 16/8 -weightDecay 5e-4 -batchSize 128 -nGPU 3 -nThreads 8 -shareGradInput true
The best results I got in each run were:
ResNeXt-64x8: 3.68%, 3.80%, 3.86%, 3.90%, 3.76%, 3.84%
ResNeXt-64x16: 3.84%, 3.60%
The average results in the paper are 3.65% and 3.58% for these two networks.
Has anyone run into the same issue? Any suggestions?
Thanks a lot.
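For reference, a quick Torch snippet that puts a mean and spread on the runs listed above (the numbers are simply copied from this post; nothing here comes from the repo):

require 'torch'
-- Mean and sample std of the six ResNeXt-64x8 runs reported above
local errs = torch.Tensor({3.68, 3.80, 3.86, 3.90, 3.76, 3.84})
print(string.format('%.2f +/- %.2f', errs:mean(), errs:std()))
-- prints 3.81 +/- 0.08, against the paper's reported average of 3.65

So the gap to the paper's average is roughly two standard deviations of these runs.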
Hi, can you provide the pretrained ResNeXt models? The model link provided seems to be broken!
Recently I have been trying to reproduce your result in Torch. My command is
th main.lua -dataset cifar10 -bottleneckType resnext_C -depth 29 -baseWidth 64 -cardinality 16 -weightDecay 5e-6 -batchSize 32 -nGPU 2 -LR 0.025 -nThreads 8 -shareGradInput true | tee -a ./cifar10_2gpu_torch.log
I copied the command from README.md (CIFAR-10, 2 GPUs).
The problem starts at epoch 71.
Here is my log file (for readability I picked out part of it):
* Finished epoch # 60 top1: 6.610 top5: 0.140
* Finished epoch # 61 top1: 7.050 top5: 0.150
* Finished epoch # 62 top1: 7.680 top5: 0.240
* Finished epoch # 63 top1: 7.180 top5: 0.230
* Finished epoch # 64 top1: 7.100 top5: 0.220
* Finished epoch # 65 top1: 6.980 top5: 0.160
* Finished epoch # 66 top1: 6.850 top5: 0.170
* Finished epoch # 67 top1: 6.870 top5: 0.180
* Finished epoch # 68 top1: 7.010 top5: 0.270
* Finished epoch # 69 top1: 6.910 top5: 0.220
* Finished epoch # 70 top1: 6.290 top5: 0.130
* Finished epoch # 71 top1: 85.740 top5: 34.780
* Finished epoch # 72 top1: 81.790 top5: 33.700
* Finished epoch # 73 top1: 80.220 top5: 28.920
* Finished epoch # 74 top1: 79.200 top5: 31.640
* Finished epoch # 75 top1: 78.980 top5: 27.150
* Finished epoch # 76 top1: 79.540 top5: 30.260
* Finished epoch # 77 top1: 81.540 top5: 29.620
And the per-batch output during epoch 78:
| Epoch: [78][1158/1563] Time 1.024 Data 0.000 Err 1528913280.0000 top1 81.250 top5 28.125
| Epoch: [78][1159/1563] Time 0.881 Data 0.000 Err 1559899264.0000 top1 81.250 top5 15.625
| Epoch: [78][1160/1563] Time 0.975 Data 0.000 Err 8231911424.0000 top1 87.500 top5 40.625
| Epoch: [78][1161/1563] Time 0.928 Data 0.000 Err 554394944.0000 top1 78.125 top5 28.125
| Epoch: [78][1162/1563] Time 1.012 Data 0.000 Err 4567331328.0000 top1 93.750 top5 40.625
| Epoch: [78][1163/1563] Time 1.146 Data 0.000 Err 2310403584.0000 top1 78.125 top5 34.375
| Epoch: [78][1164/1563] Time 0.947 Data 0.000 Err 2803231744.0000 top1 81.250 top5 25.000
| Epoch: [78][1165/1563] Time 0.956 Data 0.000 Err 2265360896.0000 top1 87.500 top5 50.000
| Epoch: [78][1166/1563] Time 0.867 Data 0.000 Err 1953190016.0000 top1 84.375 top5 21.875
| Epoch: [78][1167/1563] Time 1.014 Data 0.000 Err 2912053760.0000 top1 93.750 top5 28.125
| Epoch: [78][1168/1563] Time 1.007 Data 0.000 Err 4222694656.0000 top1 84.375 top5 31.250
| Epoch: [78][1169/1563] Time 0.895 Data 0.000 Err 5509958144.0000 top1 81.250 top5 37.500
| Epoch: [78][1170/1563] Time 0.979 Data 0.000 Err 5301891584.0000 top1 84.375 top5 34.375
| Epoch: [78][1171/1563] Time 0.920 Data 0.000 Err 3593149184.0000 top1 87.500 top5 28.125
| Epoch: [78][1172/1563] Time 1.020 Data 0.000 Err 7279746560.0000 top1 90.625 top5 31.250
| Epoch: [78][1173/1563] Time 1.002 Data 0.000 Err 10108009472.0000 top1 87.500 top5 31.250
| Epoch: [78][1174/1563] Time 0.861 Data 0.001 Err 2861270528.0000 top1 87.500 top5 28.125
| Epoch: [78][1175/1563] Time 0.862 Data 0.000 Err 4651573760.0000 top1 87.500 top5 31.250
| Epoch: [78][1176/1563] Time 1.051 Data 0.000 Err 92108896.0000 top1 75.000 top5 31.250
| Epoch: [78][1177/1563] Time 1.024 Data 0.000 Err 2649925888.0000 top1 87.500 top5 43.750
| Epoch: [78][1178/1563] Time 0.967 Data 0.000 Err 2876758784.0000 top1 71.875 top5 18.750
| Epoch: [78][1179/1563] Time 0.942 Data 0.000 Err 2976156928.0000 top1 71.875 top5 15.625
| Epoch: [78][1180/1563] Time 0.882 Data 0.000 Err 838116416.0000 top1 78.125 top5 43.750
| Epoch: [78][1181/1563] Time 1.028 Data 0.000 Err 6477106688.0000 top1 78.125 top5 37.500
| Epoch: [78][1182/1563] Time 1.004 Data 0.000 Err 5051654144.0000 top1 84.375 top5 31.250
| Epoch: [78][1183/1563] Time 0.859 Data 0.000 Err 5013932544.0000 top1 87.500 top5 34.375
| Epoch: [78][1184/1563] Time 0.848 Data 0.001 Err 2034009088.0000 top1 93.750 top5 25.000
| Epoch: [78][1185/1563] Time 1.060 Data 0.000 Err 3669680640.0000 top1 78.125 top5 25.000
| Epoch: [78][1186/1563] Time 1.028 Data 0.000 Err 4146675200.0000 top1 93.750 top5 28.125
| Epoch: [78][1187/1563] Time 0.966 Data 0.000 Err 2259935488.0000 top1 84.375 top5 34.375
| Epoch: [78][1188/1563] Time 0.956 Data 0.000 Err 1698448512.0000 top1 75.000 top5 25.000
| Epoch: [78][1189/1563] Time 0.864 Data 0.000 Err 4151320064.0000 top1 90.625 top5 56.250
| Epoch: [78][1190/1563] Time 1.035 Data 0.000 Err 1942320000.0000 top1 87.500 top5 31.250
| Epoch: [78][1191/1563] Time 1.026 Data 0.000 Err 1455451520.0000 top1 81.250 top5 31.250
| Epoch: [78][1192/1563] Time 0.867 Data 0.000 Err 2734585856.0000 top1 90.625 top5 40.625
| Epoch: [78][1193/1563] Time 0.965 Data 0.000 Err 36324916.0000 top1 81.250 top5 18.750
| Epoch: [78][1194/1563] Time 0.913 Data 0.000 Err 6873055744.0000 top1 90.625 top5 50.000
| Epoch: [78][1195/1563] Time 1.004 Data 0.000 Err 1242362112.0000 top1 84.375 top5 31.250
So does anyone have the same problem, or know how to solve it? I have been running this for nearly two days, and it is really disappointing.
Thanks a lot.
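Not a fix from this repo, but when the loss suddenly blows up to ~1e9 like this, two things worth trying are (a) restarting from the last good checkpoint with a lower learning rate, and (b) clipping gradients so a single bad batch cannot destroy the weights. Below is a minimal, self-contained Torch sketch of (b); the toy model and the clamp threshold of 5 are illustrative assumptions, not values from the repo's train.lua:

require 'nn'
require 'optim'

local model = nn.Linear(10, 2)            -- stand-in for the real network
local criterion = nn.MSECriterion()
local params, gradParams = model:getParameters()
local optimState = {learningRate = 0.025, momentum = 0.9, weightDecay = 5e-4}

local function step(input, target)
   local function feval()
      gradParams:zero()
      local output = model:forward(input)
      local loss = criterion:forward(output, target)
      model:backward(input, criterion:backward(output, target))
      gradParams:clamp(-5, 5)              -- clip before the SGD step
      return loss, gradParams
   end
   local _, fs = optim.sgd(feval, params, optimState)
   return fs[1]
end

print(step(torch.randn(4, 10), torch.randn(4, 2)))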
May I ask why ResNeXt doesn't use pre-activation as described in Identity Mappings in Deep Residual Networks? I didn't see the reason given in Aggregated Residual Transformations for Deep Neural Networks.
Hi,
In appendix "A. Implementation Details: CIFAR" of the paper "Aggregated Residual Transformations for Deep Neural Networks", it is written: "We adopt the pre-activation style block as in [14]", where [14] is the paper "Identity Mappings in Deep Residual Networks". However, when I look at resnext_bottleneck_B, I see it has the original-style block, with BN before the sum and ReLU after. Did you get your results with the original-style block or with the pre-activation-style block?
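For concreteness, here is a minimal Torch sketch of the two orderings being discussed, using plain 3x3 convolutions instead of the aggregated bottleneck, just to show where BN and ReLU sit relative to the sum (an illustration, not the repo's resnext.lua):

require 'nn'

-- Original (post-activation) block: BN before the sum, ReLU after it
local function originalBlock(n)
   local branch = nn.Sequential()
      :add(nn.SpatialConvolution(n, n, 3, 3, 1, 1, 1, 1))
      :add(nn.SpatialBatchNormalization(n))
      :add(nn.ReLU(true))
      :add(nn.SpatialConvolution(n, n, 3, 3, 1, 1, 1, 1))
      :add(nn.SpatialBatchNormalization(n))
   return nn.Sequential()
      :add(nn.ConcatTable():add(branch):add(nn.Identity()))
      :add(nn.CAddTable())
      :add(nn.ReLU(true))
end

-- Pre-activation block: BN-ReLU before each conv, nothing after the sum
local function preactBlock(n)
   local branch = nn.Sequential()
      :add(nn.SpatialBatchNormalization(n))
      :add(nn.ReLU(true))
      :add(nn.SpatialConvolution(n, n, 3, 3, 1, 1, 1, 1))
      :add(nn.SpatialBatchNormalization(n))
      :add(nn.ReLU(true))
      :add(nn.SpatialConvolution(n, n, 3, 3, 1, 1, 1, 1))
   return nn.Sequential()
      :add(nn.ConcatTable():add(branch):add(nn.Identity()))
      :add(nn.CAddTable())
end

print(originalBlock(16):forward(torch.randn(1, 16, 8, 8)):size())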
Hi, I have a question about the license of the ImageNet pre-trained models.
I want to fine-tune an ImageNet pre-trained model on my own dataset.
In this case, which license should I follow? Is it non-commercial?
It seems the Caffe third-party repository no longer exists. Is there an alternative? I found https://github.com/firekong0909/ResNeXt, but the download links appear to be broken.
Hi,
I'm trying to compute GFLOPs using this code: https://github.com/apaszke/torch-opCounter
The input is [1, 3, 224, 224], and the computed count is about 22.84 GFLOPs for ResNeXt-50 (32x4d), which differs from the reported 4.1 GFLOPs.
Is there anything wrong with my approach? How do you compute GFLOPs?
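One thing to check first: if I remember correctly, the paper counts FLOPs as multiply-adds (the ResNet paper's convention), so a counter that counts multiplies and adds separately will already report about 2x the paper's number; that alone does not explain 22.84 vs 4.1, so the input resolution and which layers get counted are also worth double-checking against torch-opCounter's assumptions. As a hand sanity check, the multiply-adds of a single convolution are easy to compute; convMacs below is a hypothetical helper, not part of torch-opCounter:

-- Multiply-adds of one convolution:
--   kW * kH * (Cin / groups) * Cout * Hout * Wout
local function convMacs(kW, kH, cin, cout, hout, wout, groups)
   groups = groups or 1
   return kW * kH * (cin / groups) * cout * hout * wout
end

-- First 7x7/2 conv of ResNeXt-50 on a 3x224x224 input (output 64x112x112):
print(convMacs(7, 7, 3, 64, 112, 112))   -- about 1.18e8 multiply-adds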
Hi, where can we find the training scripts for SSD-ResNet-50 and SSD-ResNet-152 in Caffe? I need them.
Kindly reply to this issue.
Thank you.
Dear authors:
Great work for deep learning. I want to know how I can cite your work in BibTeX; the given citation seems incomplete. Could you please update it?
Thanks a lot.
@article{Xie2016,
  title   = {Aggregated Residual Transformations for Deep Neural Networks},
  author  = {Saining Xie and Ross Girshick and Piotr Dollár and Zhuowen Tu and Kaiming He},
  journal = {arXiv preprint arXiv:1611.05431},
  year    = {2016}
}
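For what it's worth, the paper was later published at CVPR 2017, so if you want the conference version the entry would be something like:

@inproceedings{Xie2017,
  title     = {Aggregated Residual Transformations for Deep Neural Networks},
  author    = {Saining Xie and Ross Girshick and Piotr Dollár and Zhuowen Tu and Kaiming He},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2017}
}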
Is there any chance we can get your 5k-way pretrained models? I suspect these will work much better for pretraining.
What LR should I choose for ImageNet training if I use 4 GPUs instead of 8, with a batch size of 32 per GPU?
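A common heuristic is the linear scaling rule from Goyal et al., "Accurate, Large Minibatch SGD": scale the learning rate in proportion to the total batch size. A minimal sketch, assuming a baseline of 8 GPUs x 32 images = 256 per batch at LR 0.1 (those baseline numbers are my assumption, not confirmed from the README):

-- Linear scaling rule: lr_new = lr_base * (batch_new / batch_base)
local baseLR, baseBatch = 0.1, 8 * 32    -- assumed baseline: 8 GPUs, 32 each
local newBatch = 4 * 32                  -- 4 GPUs, 32 images each
print(baseLR * newBatch / baseBatch)     -- 0.05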