fxmeng / filter-grafting
Filter Grafting for Deep Neural Networks (CVPR 2020)
Home Page: https://arxiv.org/abs/2001.05868
In the code:
bn.append(m1.weight.data.abs() / (m1.weight.data.abs() + m2.weight.data.abs()))
I don't understand why the BN-weight criterion is better than entropy?
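For comparison, the two criteria can be put side by side in a toy sketch. The histogram-based `weight_entropy` below is only my assumption of how an `entropy()` function might work, not the repo's exact implementation; the gamma values are made up:

```python
import torch

def weight_entropy(w, bins=30):
    # Histogram-based entropy of a layer's weights; a sketch of the idea,
    # since the repo's entropy() may bin and normalize differently.
    hist = torch.histc(w.flatten().float(), bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * p.log()).sum())

# The BN-scale criterion from the snippet above: network 1's grafting
# weight per channel is |gamma1| / (|gamma1| + |gamma2|) (toy gammas).
g1 = torch.tensor([0.9, 0.01, 0.5])
g2 = torch.tensor([0.1, 0.80, 0.5])
w_bn = g1.abs() / (g1.abs() + g2.abs())
print(w_bn)
print(weight_entropy(torch.randn(64, 3, 3, 3)))
```

The BN-scale version gives a per-channel coefficient directly, while the entropy version scores a whole layer at once; that difference, rather than one being strictly "better", may be the motivation.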
Hi, thanks for your work! I have a question about grafting. I run grafting.sh, the accuracy is 93.910. And the result is 93.430, when train without grafting. It seems normal. but I calculate the number of invalid filters. The invalid filters ratio is 0.0366041362285614 without grafting, the result is 0.0404568612575531 with grafting.
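For reference, one simple way to measure an invalid-filter ratio is an l1-norm cutoff. The relative-norm criterion and the 0.01 threshold below are assumptions, not necessarily what produced the ratios quoted above:

```python
import torch

torch.manual_seed(0)

def invalid_filter_ratio(conv_weight, threshold=0.01):
    # Count a filter as "invalid" when its l1 norm, relative to the
    # largest filter's norm, falls below a small cutoff; both the norm
    # and the 0.01 threshold are assumptions, not the repo's criterion.
    norms = conv_weight.abs().sum(dim=(1, 2, 3))
    norms = norms / norms.max()
    return float((norms < threshold).float().mean())

w = torch.randn(64, 16, 3, 3)
w[:4] = 0.0  # zero out 4 of the 64 filters
print(invalid_filter_ratio(w))  # → 0.0625
```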
The indentation I see in the code is:

def grafting(net, epoch):
    while True:
        try:
            checkpoint = torch.load('%s/ckpt%d_%d.t7' % (args.s, args.i - 1, epoch))['net']
            break
        except:
            time.sleep(10)
    model = collections.OrderedDict()
    for i, (key, u) in enumerate(net.state_dict().items()):
        if 'conv' in key:
            w = round(args.a / np.pi * np.arctan(args.c * (entropy(u) - entropy(checkpoint[key]))) + 0.5, 2)
        model[key] = u * w + checkpoint[key] * (1 - w)
    net.load_state_dict(model)

Here w is the grafting coefficient α.
So all layers participate in grafting?
But for the layers that are not convolutional, is their grafting coefficient computed from the convolutional layer above them?
Thanks!
for i, (key, u) in enumerate(net.state_dict().items()):
    if 'conv' in key:
        w = round(0.4 * (np.arctan(500 * (float(entropy(u).cpu()) - float(entropy(checkpoint[key]).cpu())))) / np.pi + 1/2, 2)
    model[key] = u * w + checkpoint[key] * (1 - w)

For layers that are not "conv", such as BN, what should the value of w be? It seems they are determined by the previous layer, typically the preceding "conv"?
The definition of α in your paper differs from how it is computed in the code.
In the code, α is computed as w = round(args.a / np.pi * np.arctan(args.c * (entropy(u) - entropy(checkpoint[key]))) + 0.5, 2),
whereas the paper does not divide by π.
What is the purpose of this extra step in the code?
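The division by π appears to be what bounds the coefficient: np.arctan returns values in (-π/2, π/2), so A/π · arctan(·) lies in (-A/2, A/2) and α stays within (0.5 - A/2, 0.5 + A/2). A minimal check with the code's defaults (a sketch, not the repo's function):

```python
import numpy as np

A, c = 0.4, 500.0  # defaults from the repo's argparse (args.a, args.c)

def alpha(delta_entropy):
    # np.arctan is bounded in (-pi/2, pi/2); dividing by pi maps
    # A/pi * arctan(.) into (-A/2, A/2), so alpha stays within
    # (0.5 - A/2, 0.5 + A/2) = (0.3, 0.7) for A = 0.4.
    return round(A / np.pi * np.arctan(c * delta_entropy) + 0.5, 2)

print(alpha(-1.0), alpha(0.0), alpha(1.0))  # → 0.3 0.5 0.7
```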
It seems there is no special limitation for this work.
Are there any bugs in your baseline.py? I ran resnet32 on cifar100, but the accuracy is only 4%.
Running grafting.py directly cannot keep training.
Training only reaches epoch 0 and then pauses, apparently stuck in the time.sleep(10) statement.
May I ask the author: what is the normal training procedure, and where is my problem? Thank you.
The VGG code in models builds the convolutional layers with _make_layers, so running grafting.py raises UnboundLocalError: local variable 'w' referenced before assignment at
w = round(0.4 * (np.arctan(500 * (float(entropy(u).cpu()) - float(entropy(checkpoint[key]).cpu())))) / np.pi + 1/2, 2)
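The error arises because layers built by _make_layers are named like 'features.N.weight' with no 'conv' in the key, so w is never assigned before its first use. One possible workaround, sketched below under assumptions (recognizing a conv layer by its 4-D weight, defaulting w to 0.5 before the first conv, and taking entropy() as an injected function), not the repo's code:

```python
import collections

import numpy as np
import torch
import torch.nn as nn

def graft_state_dicts(sd1, sd2, layer_entropy, A=0.4, c=500.0):
    # Blend two state dicts: conv layers (matched by name or by 4-D
    # weight shape) get their own coefficient, and every other entry
    # (BN params, biases, running stats) reuses the last conv's w.
    # The 0.5 fallback before any conv is seen is an assumption.
    w = 0.5
    out = collections.OrderedDict()
    for key, u in sd1.items():
        if 'conv' in key or u.dim() == 4:
            w = round(A / np.pi * np.arctan(
                c * (layer_entropy(u) - layer_entropy(sd2[key]))) + 0.5, 2)
        out[key] = u * w + sd2[key] * (1 - w)
    return out
```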
Hi author, as shown in the code, the default values of A and c in the Adaptive Weighting formula are 0.4 and 500. If so, the computed w is basically either 0.3 or 0.7, not the fairly adaptive (smooth) range I expected. Is there a problem here?
Besides, A should control how much of the model's own filter weights are kept when the entropy is small; with A = 0.4, the model's own minimum weight is 0.3. Does the value of A have a large effect on the results? Have you tried other values?
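The saturation the question describes follows from the scale c = 500: the smooth transition band of arctan(c·δ) is only about 1/c wide in entropy difference, so almost any |δ| pushes α near the extremes 0.3 or 0.7. A quick illustration (the smaller c value is hypothetical, not a repo default):

```python
import numpy as np

def alpha(delta, A=0.4, c=500.0):
    # Adaptive weighting from the snippet above (rounding omitted).
    return A / np.pi * np.arctan(c * delta) + 0.5

# With c = 500 even |delta| = 0.01 is already nearly saturated;
# a smaller c (hypothetical) keeps alpha closer to 0.5.
print(alpha(0.01), alpha(-0.01))  # close to 0.7 / 0.3
print(alpha(0.01, c=10.0))        # noticeably closer to 0.5
```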
As the title says, isn't this somewhat unfair?
Hi, thank you very much for your work, but when testing your method I could never reproduce your results. I checked the scores of mbv2 on cifar-10 under different learning rates, under the cosine schedule, and with the graft method, and the graft algorithm seems to have almost no effect (baseline 92.1, graft 92.28). Do you know the reason? All settings follow the defaults.

| Schedule      | LR        | Accuracy     | Folder |
| ------------- | --------- | ------------ | ------ |
| lr            | 0.1       | 92.10        | 2      |
| lr            | 0.1 (2)   | 94.06        | 5      |
| lr            | 0.1 (10)  | 90.10        | 4      |
| lr            | 0.1 (100) | 40.90        | 3      |
| cos           | 0.1       | 92.75 (92.6) | 6      |
| grafting (lr) | 0.1       | 92.28        | 2      |
The paper uses KL divergence as an evaluation criterion, but KL divergence is mutual information. Is mutual information appropriate as a criterion for a single variable? Or is my understanding off?
Hi author, great work! One question: in Table 3 and Table 5, why are the results of graft-resnet32/56/110 on CIFAR10 different? Thanks!
Hello:
I noticed that the accuracy would drop after the decay epoch. I trained your baseline model and grafting model with the same cosine learning rate. The accuracy for the baseline (mobilenetv2) is 72.83, and that for the grafting model (2 models) is 72.80. The accuracy decreased! I noticed that you never tried the cosine lr. I wonder whether any hyperparameters are wrong (I changed nothing in your grafting.py)?
![image](https://user-images.githubusercontent.com/26025961/77132458-8a898e80-6a9a-11ea-9782-4398f851b752.png)
I trained resnet32 on cifar100 with the 2-model grafting setting. I set seed 1 for the first model and seed 2 for the second, but I found the accuracy even decreased.
Network:2 epoch:0 accuracy:14.700 best:14.700
Network:2 epoch:1 accuracy:16.540 best:16.540
Network:2 epoch:2 accuracy:24.700 best:24.700
Network:2 epoch:3 accuracy:23.520 best:24.700
Network:2 epoch:4 accuracy:32.250 best:32.250
Network:2 epoch:5 accuracy:34.070 best:34.070
Network:2 epoch:6 accuracy:37.280 best:37.280
Network:2 epoch:7 accuracy:37.320 best:37.320
Network:2 epoch:8 accuracy:36.540 best:37.320
Network:2 epoch:9 accuracy:40.050 best:40.050
Network:2 epoch:10 accuracy:44.270 best:44.270
Network:2 epoch:11 accuracy:39.200 best:44.270
Network:2 epoch:12 accuracy:38.870 best:44.270
Network:2 epoch:13 accuracy:44.300 best:44.300
Network:2 epoch:14 accuracy:43.690 best:44.300
Network:2 epoch:15 accuracy:38.960 best:44.300
Network:2 epoch:16 accuracy:47.530 best:47.530
Network:2 epoch:17 accuracy:40.490 best:47.530
Network:2 epoch:18 accuracy:46.240 best:47.530
Network:2 epoch:19 accuracy:40.100 best:47.530
Network:2 epoch:20 accuracy:41.650 best:47.530
Network:2 epoch:21 accuracy:44.300 best:47.530
Network:2 epoch:22 accuracy:44.440 best:47.530
Network:2 epoch:23 accuracy:44.870 best:47.530
Network:2 epoch:24 accuracy:43.860 best:47.530
Network:2 epoch:25 accuracy:46.760 best:47.530
Network:2 epoch:26 accuracy:32.770 best:47.530
Network:2 epoch:27 accuracy:42.220 best:47.530
Network:2 epoch:28 accuracy:48.920 best:48.920
Network:2 epoch:29 accuracy:44.960 best:48.920
Network:2 epoch:30 accuracy:45.670 best:48.920
Network:2 epoch:31 accuracy:45.630 best:48.920
Network:2 epoch:32 accuracy:45.840 best:48.920
Network:2 epoch:33 accuracy:46.910 best:48.920
Network:2 epoch:34 accuracy:51.240 best:51.240
Network:2 epoch:35 accuracy:48.490 best:51.240
Network:2 epoch:36 accuracy:49.460 best:51.240
Network:2 epoch:37 accuracy:45.080 best:51.240
Network:2 epoch:38 accuracy:49.390 best:51.240
Network:2 epoch:39 accuracy:45.370 best:51.240
Network:2 epoch:40 accuracy:40.510 best:51.240
Network:2 epoch:41 accuracy:39.560 best:51.240
Network:2 epoch:42 accuracy:46.540 best:51.240
Network:2 epoch:43 accuracy:48.780 best:51.240
Network:2 epoch:44 accuracy:49.220 best:51.240
Network:2 epoch:45 accuracy:46.590 best:51.240
Network:2 epoch:46 accuracy:40.120 best:51.240
Network:2 epoch:47 accuracy:44.470 best:51.240
Network:2 epoch:48 accuracy:42.030 best:51.240
Network:2 epoch:49 accuracy:47.310 best:51.240
Network:2 epoch:50 accuracy:46.580 best:51.240
Network:2 epoch:51 accuracy:45.010 best:51.240
Network:2 epoch:52 accuracy:46.270 best:51.240
Network:2 epoch:53 accuracy:47.070 best:51.240
Network:2 epoch:54 accuracy:46.270 best:51.240
Network:2 epoch:55 accuracy:49.480 best:51.240
Network:2 epoch:56 accuracy:45.360 best:51.240
Network:2 epoch:57 accuracy:46.950 best:51.240
Network:2 epoch:58 accuracy:47.840 best:51.240
Network:2 epoch:59 accuracy:52.580 best:52.580
Network:2 epoch:60 accuracy:43.420 best:52.580
Network:2 epoch:61 accuracy:68.060 best:68.060
Network:2 epoch:62 accuracy:68.230 best:68.230
Network:2 epoch:63 accuracy:68.550 best:68.550
Network:2 epoch:64 accuracy:68.750 best:68.750
Network:2 epoch:65 accuracy:68.490 best:68.750
Network:2 epoch:66 accuracy:68.290 best:68.750
Network:2 epoch:67 accuracy:68.320 best:68.750
Network:2 epoch:68 accuracy:67.870 best:68.750
Network:2 epoch:69 accuracy:68.240 best:68.750
Network:2 epoch:70 accuracy:67.790 best:68.750
Network:2 epoch:71 accuracy:67.170 best:68.750
Network:2 epoch:72 accuracy:68.130 best:68.750
Network:2 epoch:73 accuracy:68.610 best:68.750
Network:2 epoch:74 accuracy:66.910 best:68.750
Network:2 epoch:75 accuracy:66.640 best:68.750
Network:2 epoch:76 accuracy:66.710 best:68.750
Network:2 epoch:77 accuracy:66.220 best:68.750
Network:2 epoch:78 accuracy:65.440 best:68.750
Network:2 epoch:79 accuracy:66.520 best:68.750
Network:2 epoch:80 accuracy:66.810 best:68.750
Network:2 epoch:81 accuracy:66.030 best:68.750
Network:2 epoch:82 accuracy:65.430 best:68.750
Network:2 epoch:83 accuracy:66.470 best:68.750
Network:2 epoch:84 accuracy:66.250 best:68.750
Network:2 epoch:85 accuracy:65.690 best:68.750
Network:2 epoch:86 accuracy:65.500 best:68.750
Network:2 epoch:87 accuracy:66.020 best:68.750
Network:2 epoch:88 accuracy:65.160 best:68.750
Network:2 epoch:89 accuracy:63.700 best:68.750
Network:2 epoch:90 accuracy:65.590 best:68.750
Network:2 epoch:91 accuracy:65.310 best:68.750
Network:2 epoch:92 accuracy:63.440 best:68.750
Network:2 epoch:93 accuracy:64.340 best:68.750
Network:2 epoch:94 accuracy:64.090 best:68.750
Network:2 epoch:95 accuracy:64.020 best:68.750
Network:2 epoch:96 accuracy:63.130 best:68.750
Network:2 epoch:97 accuracy:62.210 best:68.750
Network:2 epoch:98 accuracy:63.610 best:68.750
Network:2 epoch:99 accuracy:63.960 best:68.750
Network:2 epoch:100 accuracy:64.730 best:68.750
Network:2 epoch:101 accuracy:65.030 best:68.750
Network:2 epoch:102 accuracy:64.990 best:68.750
Network:2 epoch:103 accuracy:64.130 best:68.750
Network:2 epoch:104 accuracy:63.280 best:68.750
Network:2 epoch:105 accuracy:63.800 best:68.750
Network:2 epoch:106 accuracy:64.050 best:68.750
Network:2 epoch:107 accuracy:63.460 best:68.750
Network:2 epoch:108 accuracy:64.790 best:68.750
Network:2 epoch:109 accuracy:64.470 best:68.750
Network:2 epoch:110 accuracy:65.210 best:68.750
Network:2 epoch:111 accuracy:64.350 best:68.750
Network:2 epoch:112 accuracy:62.980 best:68.750
Network:2 epoch:113 accuracy:63.390 best:68.750
Network:2 epoch:114 accuracy:63.860 best:68.750
Network:2 epoch:115 accuracy:64.430 best:68.750
Network:2 epoch:116 accuracy:62.950 best:68.750
Network:2 epoch:117 accuracy:63.950 best:68.750
Network:2 epoch:118 accuracy:64.000 best:68.750
Network:2 epoch:119 accuracy:63.570 best:68.750
Network:2 epoch:120 accuracy:62.570 best:68.750
Network:2 epoch:121 accuracy:70.290 best:70.290
Network:2 epoch:122 accuracy:69.990 best:70.290
Network:2 epoch:123 accuracy:70.000 best:70.290
Network:2 epoch:124 accuracy:70.210 best:70.290
Network:2 epoch:125 accuracy:69.750 best:70.290
Network:2 epoch:126 accuracy:69.850 best:70.290
Network:2 epoch:127 accuracy:70.200 best:70.290
Network:2 epoch:128 accuracy:69.730 best:70.290
Network:2 epoch:129 accuracy:69.830 best:70.290
Network:2 epoch:130 accuracy:69.780 best:70.290
Network:2 epoch:131 accuracy:69.520 best:70.290
Network:2 epoch:132 accuracy:69.560 best:70.290
Network:2 epoch:133 accuracy:69.630 best:70.290
Network:2 epoch:134 accuracy:69.770 best:70.290
Network:2 epoch:135 accuracy:69.750 best:70.290
Network:2 epoch:136 accuracy:69.390 best:70.290
Network:2 epoch:137 accuracy:69.630 best:70.290
Network:2 epoch:138 accuracy:69.250 best:70.290
Network:2 epoch:139 accuracy:69.460 best:70.290
Network:2 epoch:140 accuracy:69.420 best:70.290
Network:2 epoch:141 accuracy:69.230 best:70.290
Network:2 epoch:142 accuracy:69.490 best:70.290
Network:2 epoch:143 accuracy:69.430 best:70.290
Network:2 epoch:144 accuracy:69.220 best:70.290
Network:2 epoch:145 accuracy:69.660 best:70.290
Network:2 epoch:146 accuracy:69.330 best:70.290
Network:2 epoch:147 accuracy:69.070 best:70.290
Network:2 epoch:148 accuracy:69.260 best:70.290
Network:2 epoch:149 accuracy:69.350 best:70.290
Network:2 epoch:150 accuracy:69.130 best:70.290
Network:2 epoch:151 accuracy:69.270 best:70.290
Network:2 epoch:152 accuracy:68.890 best:70.290
Network:2 epoch:153 accuracy:69.220 best:70.290
Network:2 epoch:154 accuracy:68.980 best:70.290
Network:2 epoch:155 accuracy:68.850 best:70.290
Network:2 epoch:156 accuracy:68.970 best:70.290
Network:2 epoch:157 accuracy:69.260 best:70.290
Network:2 epoch:158 accuracy:69.140 best:70.290
Network:2 epoch:159 accuracy:69.100 best:70.290
Network:2 epoch:160 accuracy:68.860 best:70.290
Network:2 epoch:161 accuracy:68.990 best:70.290
Network:2 epoch:162 accuracy:69.120 best:70.290
Network:2 epoch:163 accuracy:68.780 best:70.290
Network:2 epoch:164 accuracy:69.190 best:70.290
Network:2 epoch:165 accuracy:68.560 best:70.290
Network:2 epoch:166 accuracy:68.860 best:70.290
Network:2 epoch:167 accuracy:68.860 best:70.290
Network:2 epoch:168 accuracy:68.620 best:70.290
Network:2 epoch:169 accuracy:69.010 best:70.290
Network:2 epoch:170 accuracy:68.760 best:70.290
Network:2 epoch:171 accuracy:68.680 best:70.290
Network:2 epoch:172 accuracy:68.950 best:70.290
Network:2 epoch:173 accuracy:68.830 best:70.290
Network:2 epoch:174 accuracy:68.740 best:70.290
Network:2 epoch:175 accuracy:68.780 best:70.290
Network:2 epoch:176 accuracy:68.620 best:70.290
Network:2 epoch:177 accuracy:68.410 best:70.290
Network:2 epoch:178 accuracy:68.540 best:70.290
Network:2 epoch:179 accuracy:68.660 best:70.290
Network:2 epoch:180 accuracy:68.610 best:70.290
Network:2 epoch:181 accuracy:68.750 best:70.290
Network:2 epoch:182 accuracy:68.690 best:70.290
Network:2 epoch:183 accuracy:68.730 best:70.290
Network:2 epoch:184 accuracy:68.710 best:70.290
Network:2 epoch:185 accuracy:68.750 best:70.290
Network:2 epoch:186 accuracy:68.590 best:70.290
Network:2 epoch:187 accuracy:68.710 best:70.290
Network:2 epoch:188 accuracy:68.740 best:70.290
Network:2 epoch:189 accuracy:68.690 best:70.290
Network:2 epoch:190 accuracy:68.820 best:70.290
Network:2 epoch:191 accuracy:68.820 best:70.290
Network:2 epoch:192 accuracy:68.480 best:70.290
Network:2 epoch:193 accuracy:68.810 best:70.290
Network:2 epoch:194 accuracy:68.760 best:70.290
Network:2 epoch:195 accuracy:68.790 best:70.290
Network:2 epoch:196 accuracy:68.770 best:70.290
Network:2 epoch:197 accuracy:68.590 best:70.290
Network:2 epoch:198 accuracy:68.730 best:70.290
Network:2 epoch:199 accuracy:68.840 best:70.290
The final accuracy is 68.84. The corresponding baseline in your paper is 69.82.
Line 176: torch.save(state, '%s/ckpt%d_%d.t7' % (args.s, args.i % args.num, epoch)) saves the model weights after an epoch of training.
Line 106: checkpoint = torch.load('%s/ckpt%d_%d.t7' % (args.s, args.i - 1, epoch))['net'] loads the existing weights, but no weight file with that name exists, so the program loops forever in time.sleep(10). I hope you can answer this, thanks.
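The file names do line up, but only across processes; a toy check of the indexing (num = 2 as in the default two-model setting):

```python
# With args.num = 2, process i saves ckpt{i % num} and polls for
# ckpt{i - 1}: each process reads exactly what its peer writes. Running
# a single grafting.py process therefore waits forever in time.sleep(10),
# because nobody creates the file it loads; grafting.sh starts all num
# processes in parallel, which resolves the wait.
num = 2
saves = {i: i % num for i in range(1, num + 1)}
loads = {i: i - 1 for i in range(1, num + 1)}
for i in range(1, num + 1):
    print(f"model {i}: writes ckpt{saves[i]}, reads ckpt{loads[i]}")
```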
I would like to know: do you use BN in grafting training?
If so, how do you graft the BN parameters, the same way as the weights?
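One plausible treatment, sketched as an assumption rather than confirmed repo behavior, is to blend the BN layer with the coefficient w computed from the conv layer that precedes it:

```python
import torch
import torch.nn as nn

def graft_bn(bn1, bn2, w):
    # Blend gamma/beta and the running statistics of bn1 toward bn2 using
    # the preceding conv layer's coefficient w. Whether running stats
    # should be blended at all is a judgment call, not the repo's code.
    with torch.no_grad():
        for p1, p2 in zip(bn1.parameters(), bn2.parameters()):
            p1.mul_(w).add_((1 - w) * p2)
        bn1.running_mean.mul_(w).add_((1 - w) * bn2.running_mean)
        bn1.running_var.mul_(w).add_((1 - w) * bn2.running_var)

b1, b2 = nn.BatchNorm2d(4), nn.BatchNorm2d(4)
graft_bn(b1, b2, w=0.7)
```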