tingxueronghua / pytorch-classification-advprop
License: MIT License
I have read AdvResNet in your net.py, but I cannot find anything about the auxiliary BN, which is an essential part of the paper you reproduced.
I would like to ask whether the PGD attacker (when computing gradients) updates the parameters (mean, var) of aux_bn in the ResNet. Thank you.
Thanks for sharing your code! I found a key part confusing. Please correct me if I am wrong.
I think the key part (copied below) is not correct.
During training, if we set mixbn = False and the attacker is a PGD attacker,
then we first generate adversarial images with the aux BN (this part is correct).
However, following the original paper, clean images should go through the clean BN and adversarial images should go through the adv BN.
In your code, the clean and adversarial images are concatenated first, and both go through the clean BN only.
if training:
    self.eval()
    self.apply(to_adv_status)
    if isinstance(self.attacker, NoOpAttacker):
        images = x
        targets = labels
    else:
        aux_images, _ = self.attacker.attack(x, labels, self._forward_impl)
        images = torch.cat([x, aux_images], dim=0)
        targets = torch.cat([labels, labels], dim=0)
    self.train()
    if self.mixbn:
        # DataParallel usually concatenates outputs along the first dimension,
        # so if we don't change the dimensions, the outputs will look like
        # [clean_batches_gpu1, adv_batches_gpu1, clean_batches_gpu2, adv_batches_gpu2, ...]
        # making it hard to distinguish clean batches from adversarial batches.
        self.apply(to_mix_status)
        return self._forward_impl(images).view(2, input_len, -1).transpose(1, 0), targets.view(2, input_len).transpose(1, 0)
    else:
        self.apply(to_clean_status)
        return self._forward_impl(images), targets
else:
    images = x
    targets = labels
    return self._forward_impl(images), targets
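The routing the paper describes (clean images through the main BN, adversarial images through the auxiliary BN) can be sketched with a hypothetical mixed-BN module. The class name `MixBatchNorm2d`, the `batch_type` flag, and the half-clean/half-adversarial batch layout are assumptions for illustration, not necessarily the repository's actual code:

```python
import torch
import torch.nn as nn

class MixBatchNorm2d(nn.BatchNorm2d):
    """Sketch: a BN layer carrying an auxiliary BN, routed by a status flag."""
    def __init__(self, num_features):
        super().__init__(num_features)
        self.aux_bn = nn.BatchNorm2d(num_features)
        self.batch_type = 'clean'  # one of 'clean', 'adv', 'mix'

    def forward(self, x):
        if self.batch_type == 'adv':
            return self.aux_bn(x)            # adversarial images only
        if self.batch_type == 'clean':
            return super().forward(x)        # clean images only
        # 'mix': assume the first half of the batch is clean, second half adversarial
        half = x.shape[0] // 2
        out_clean = super().forward(x[:half])
        out_adv = self.aux_bn(x[half:])
        return torch.cat([out_clean, out_adv], dim=0)
```

With this kind of module, `self.apply(to_mix_status)` would flip every BN's `batch_type` to `'mix'` before the concatenated forward pass, so each half of the batch sees its own statistics.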
Hi, Yucheng,
Thanks for your nice implementation of advprop. I have a question about performing adversarial training with advprop. In standard training, the clean data pass through the main BNs and the generated adversarial data pass through the auxiliary BNs. However, if we perform adversarial training, only the adversarial data will be used. So how can we perform adversarial training with advprop?
Thanks for your re-implementation!
I'm confused by one implementation detail in your code. Could you please help me?
In the code, you assume the input to the model consists of two parts, a main part and an aux part, and you use two different BN layers to process them. However, this may be wrong when combined with DataParallel.
For example, suppose the whole batch is [main_1, main_2, main_3, main_4, aux_1, aux_2, aux_3, aux_4] and you use two GPUs to train the model. During training, [main_1, main_2, main_3, main_4] will be assigned to the first GPU, while [aux_1, aux_2, aux_3, aux_4] will be assigned to the second GPU. The code assumes the first half of each mini-batch is the main part and the second half is the aux part, i.e., [main_1, main_2] would use the main normalization layer and [main_3, main_4] would use the aux normalization layer, which may be wrong.
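The splitting behavior described above, and one possible remedy, can be demonstrated on a toy tensor. The interleaving shown here is an illustrative assumption in the spirit of the repository's `view(2, ...).transpose(...)` comment, not a claim about its exact code:

```python
import torch

# Batch layout: indices 0-3 are "main" samples, 4-7 are "aux" samples.
batch = torch.arange(8)

# DataParallel scatters along dim 0, so with 2 GPUs each replica gets one chunk:
chunks = batch.chunk(2, dim=0)
# chunks[0] holds all main samples, chunks[1] holds all aux samples,
# yet each replica's code assumes its first half is main and second half is aux.

# One hypothetical fix: interleave main/aux so every chunk contains both kinds.
interleaved = batch.view(2, 4).transpose(0, 1).reshape(-1)
# Now each scattered chunk holds alternating main/aux samples.
```

The repository's comment about reshaping outputs with `view(2, input_len, -1).transpose(1, 0)` addresses the mirror-image problem on the gather side: without it, per-GPU outputs come back as [clean_gpu1, adv_gpu1, clean_gpu2, adv_gpu2, ...].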
Dear author,
Thanks for kindly implementing advprop. I am wondering whether you could provide the training time of AdvProp with the ResNet-50 architecture.
Best regards
Hi, I did not understand the following part of the code in imagenet.py:
if mixbn:
    with torch.no_grad():
        batch_size = outputs.size(0)
        loss_main = criterion(outputs[:batch_size // 2], targets[:batch_size // 2]).mean()
        loss_aux = criterion(outputs[batch_size // 2:], targets[batch_size // 2:]).mean()
        prec1_main = accuracy(outputs.data[:batch_size // 2],
                              targets.data[:batch_size // 2], topk=(1,))[0]
        prec1_aux = accuracy(outputs.data[batch_size // 2:],
                             targets.data[batch_size // 2:], topk=(1,))[0]
        losses_main.update(loss_main.item(), batch_size // 2)
        losses_aux.update(loss_aux.item(), batch_size // 2)
        top1_main.update(prec1_main.item(), batch_size // 2)
        top1_aux.update(prec1_aux.item(), batch_size // 2)
If we are not using loss_main and loss_aux for the update at all, why do we compute them separately? Also, why are there two ifs in the train function: first `if args.mixup` and then this one?
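One likely reading of that snippet, since it sits under torch.no_grad() and only feeds `.update(...)` calls: the per-branch losses exist purely for monitoring, while the optimizer uses the combined loss. The sketch below illustrates that relationship with made-up tensors; the variable names mirror the snippet but the setup is hypothetical:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss(reduction='none')
outputs = torch.randn(8, 10)             # concatenated: first half clean, second half adversarial
targets = torch.randint(0, 10, (8,))

# The combined loss is what would drive backprop.
loss = criterion(outputs, targets).mean()

# The per-branch losses are computed under no_grad purely for logging:
# they feed running-average meters, never the optimizer.
with torch.no_grad():
    half = outputs.size(0) // 2
    loss_main = criterion(outputs[:half], targets[:half]).mean()
    loss_aux = criterion(outputs[half:], targets[half:]).mean()

# Sanity check: with equal-sized halves, the combined loss is their average.
assert torch.allclose(loss, (loss_main + loss_aux) / 2)
```

Under this reading, dropping the block would not change training at all, only the printed main/aux accuracy and loss statistics.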
Hi,
Thank you so much for your implementation!
It's exciting to see ResNet-50 trained via AdvProp reaches a similar accuracy to the vanilla ResNet-50! :)
Thank you so much!
Anh