tingxueronghua / pytorch-classification-advprop
License: MIT License
I have read AdvResNet in your net.py, but I cannot find anything about the auxiliary BN, which is an essential part of the paper you reproduced.
I would like to ask whether the PGD attacker (when computing gradients) updates the parameters (mean, var) of aux_bn in the ResNet. Thank you.
Thanks for sharing your code! I found a key part confusing. Please correct me if I am wrong.
I think the key part (copied below) is not correct.
During training, if we set mixbn = False and the attacker is a PGD attacker,
then we first generate adversarial images with the aux BN (this part is correct).
However, following the original paper, clean images should go through the clean BN and adversarial images should go through the adv BN.
In your code, the clean and adversarial images are concatenated first, and both go through the clean BN only.
if training:
    self.eval()
    self.apply(to_adv_status)
    if isinstance(self.attacker, NoOpAttacker):
        images = x
        targets = labels
    else:
        aux_images, _ = self.attacker.attack(x, labels, self._forward_impl)
        images = torch.cat([x, aux_images], dim=0)
        targets = torch.cat([labels, labels], dim=0)
    self.train()
    if self.mixbn:
        # DataParallel usually concatenates outputs along the first dimension,
        # so if we don't change the dimensions, the outputs will look like
        # [clean_batches_gpu1, adv_batches_gpu1, clean_batches_gpu2, adv_batches_gpu2, ...]
        # making it hard to distinguish clean batches from adversarial batches.
        self.apply(to_mix_status)
        return self._forward_impl(images).view(2, input_len, -1).transpose(1, 0), targets.view(2, input_len).transpose(1, 0)
    else:
        self.apply(to_clean_status)
        return self._forward_impl(images), targets
else:
    images = x
    targets = labels
    return self._forward_impl(images), targets
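The routing the paper describes (clean images through the main BN, adversarial images through the auxiliary BN) can be sketched with a hypothetical mixed-BN module. The class name `MixBatchNorm2d`, the `batch_type` flag, and the half-clean/half-adversarial batch layout are assumptions for illustration, not necessarily the repository's actual code:

```python
import torch
import torch.nn as nn

class MixBatchNorm2d(nn.BatchNorm2d):
    """Sketch: a BN layer carrying an auxiliary BN, routed by a status flag."""
    def __init__(self, num_features):
        super().__init__(num_features)
        self.aux_bn = nn.BatchNorm2d(num_features)
        self.batch_type = 'clean'  # one of 'clean', 'adv', 'mix'

    def forward(self, x):
        if self.batch_type == 'adv':
            return self.aux_bn(x)            # adversarial images only
        if self.batch_type == 'clean':
            return super().forward(x)        # clean images only
        # 'mix': assume the first half of the batch is clean, second half adversarial
        half = x.shape[0] // 2
        out_clean = super().forward(x[:half])
        out_adv = self.aux_bn(x[half:])
        return torch.cat([out_clean, out_adv], dim=0)
```

With this kind of module, `self.apply(to_mix_status)` would flip every BN's `batch_type` to `'mix'` before the concatenated forward pass, so each half of the batch sees its own statistics.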
Hi, Yucheng,
Thanks for your nice implementation of advprop. I have a question about performing adversarial training with advprop. In standard training, the clean data pass through the main BNs and the generated adversarial data pass through the auxiliary BNs. However, if we perform adversarial training, only the adversarial data will be used. So how can we perform adversarial training with advprop?
Thanks for your re-implementation!
I'm confused by one implementation detail in your code. Could you please help me?
In the code, you assume the input to the model consists of two parts, a main part and an aux part, and you use two different BN layers to process them. However, this may be wrong when combined with DataParallel.
For example, suppose the whole batch is [main_1, main_2, main_3, main_4, aux_1, aux_2, aux_3, aux_4] and you use two GPUs to train the model. During training, [main_1, main_2, main_3, main_4] will be assigned to the first GPU, while [aux_1, aux_2, aux_3, aux_4] will be assigned to the second GPU. The code assumes the first half of each mini-batch is the main part and the second half is the aux part, i.e., [main_1, main_2] would use the main normalization layer and [main_3, main_4] would use the aux normalization layer, which may be wrong.
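The splitting behavior described above, and one possible remedy, can be demonstrated on a toy tensor. The interleaving shown here is an illustrative assumption in the spirit of the repository's `view(2, ...).transpose(...)` comment, not a claim about its exact code:

```python
import torch

# Batch layout: indices 0-3 are "main" samples, 4-7 are "aux" samples.
batch = torch.arange(8)

# DataParallel scatters along dim 0, so with 2 GPUs each replica gets one chunk:
chunks = batch.chunk(2, dim=0)
# chunks[0] holds all main samples, chunks[1] holds all aux samples,
# yet each replica's code assumes its first half is main and second half is aux.

# One hypothetical fix: interleave main/aux so every chunk contains both kinds.
interleaved = batch.view(2, 4).transpose(0, 1).reshape(-1)
# Now each scattered chunk holds alternating main/aux samples.
```

The repository's comment about reshaping outputs with `view(2, input_len, -1).transpose(1, 0)` addresses the mirror-image problem on the gather side: without it, per-GPU outputs come back as [clean_gpu1, adv_gpu1, clean_gpu2, adv_gpu2, ...].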
Dear author,
Thanks for kindly implementing advprop. I am wondering whether you could provide the training time of AdvProp with the ResNet-50 architecture.
Best regards
Hi, I did not understand the following part of the code in imagenet.py:
if mixbn:
    with torch.no_grad():
        batch_size = outputs.size(0)
        loss_main = criterion(outputs[:batch_size // 2], targets[:batch_size // 2]).mean()
        loss_aux = criterion(outputs[batch_size // 2:], targets[batch_size // 2:]).mean()
        prec1_main = accuracy(outputs.data[:batch_size // 2],
                              targets.data[:batch_size // 2], topk=(1,))[0]
        prec1_aux = accuracy(outputs.data[batch_size // 2:],
                             targets.data[batch_size // 2:], topk=(1,))[0]
        losses_main.update(loss_main.item(), batch_size // 2)
        losses_aux.update(loss_aux.item(), batch_size // 2)
        top1_main.update(prec1_main.item(), batch_size // 2)
        top1_aux.update(prec1_aux.item(), batch_size // 2)
If we are not using loss_main and loss_aux for the update at all, why do we compute them separately? Also, why are there two ifs in the train function: first `if args.mixup` and then this one?
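One likely reading of that snippet, since it sits under torch.no_grad() and only feeds `.update(...)` calls: the per-branch losses exist purely for monitoring, while the optimizer uses the combined loss. The sketch below illustrates that relationship with made-up tensors; the variable names mirror the snippet but the setup is hypothetical:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss(reduction='none')
outputs = torch.randn(8, 10)             # concatenated: first half clean, second half adversarial
targets = torch.randint(0, 10, (8,))

# The combined loss is what would drive backprop.
loss = criterion(outputs, targets).mean()

# The per-branch losses are computed under no_grad purely for logging:
# they feed running-average meters, never the optimizer.
with torch.no_grad():
    half = outputs.size(0) // 2
    loss_main = criterion(outputs[:half], targets[:half]).mean()
    loss_aux = criterion(outputs[half:], targets[half:]).mean()

# Sanity check: with equal-sized halves, the combined loss is their average.
assert torch.allclose(loss, (loss_main + loss_aux) / 2)
```

Under this reading, dropping the block would not change training at all, only the printed main/aux accuracy and loss statistics.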
Hi,
Thank you so much for your implementation!
It's exciting to see ResNet-50 trained via AdvProp reaches a similar accuracy to the vanilla ResNet-50! :)
Thank you so much!
Anh