carlini / nn_robust_attacks Goto Github PK



Robust evasion attacks against neural network to find adversarial examples

License: BSD 2-Clause "Simplified" License

Python 100.00%


nn_robust_attacks's Issues

Traceback (most recent call last):
File "/content/drive/MyDrive/Codes/carlini_nn_robust_attacks/nn_robust_attacks/train_models.py", line 113, in <module>
num_epochs=1, train_temp=100)
File "/content/drive/MyDrive/Codes/carlini_nn_robust_attacks/nn_robust_attacks/train_models.py", line 88, in train_distillation
init=file_name+"_init")
File "/content/drive/MyDrive/Codes/carlini_nn_robust_attacks/nn_robust_attacks/train_models.py", line 50, in train
model.load_weights(init)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 2227, in load_weights
with h5py.File(filepath, 'r') as f:
File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 408, in __init__
swmr=swmr)
File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 173, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (file read failed: time = Sun Jan 3 14:21:59 2021
, filename = '/content/models/mnist-distilled-100_init_init', file descriptor = 4, errno = 21, error message = 'Is a directory', buf = 0x7fffd5aa6370, total read size = 8, bytes this sub-read = 8, bytes actually read = 18446744073709551615, offset = 0)

Please help me resolve this error. Thank you.

no boxmin and boxmax in L_0 and L_inf

Hi author,

I'm developing a robust model.

For mathematical reasons, the input images should be defined in the range [0, 1].

However, the attacks are implemented with a tanh mapping whose range is [-0.5, 0.5].

The L_0 and L_inf attacks do not provide boxmin and boxmax parameters to shift the range to [boxmin, boxmax].

Someone has implemented a configurable range in the L_2 attack.

commit b5925dd
| Author: w
| Date: Wed Oct 18 09:05:25 2017 +0800
|
| Make range of box constraints configurable
|

Can we safely modify L_0 and L_inf attacks based on this commit?
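
For reference, the box constraint that this commit makes configurable can be sketched in a few lines. This is a minimal pure-Python illustration, not code from the repository: the helper name to_box is hypothetical, and only the boxmul/boxplus arithmetic follows the identifiers that appear in the L_2 attack.

```python
import math

def to_box(w, boxmin=-0.5, boxmax=0.5):
    """Map an unconstrained value w into [boxmin, boxmax] via tanh."""
    boxmul = (boxmax - boxmin) / 2.0   # half-width of the box
    boxplus = (boxmin + boxmax) / 2.0  # centre of the box
    return math.tanh(w) * boxmul + boxplus

# Same unconstrained values, mapped into the default box and into [0, 1].
default_range = [to_box(w) for w in (-10.0, 0.0, 10.0)]
unit_range = [to_box(w, 0.0, 1.0) for w in (-10.0, 0.0, 10.0)]
```

Under this reading, porting the change to the L_0 and L_inf attacks would mean replacing their fixed scaling with the same two quantities, though whether that is safe depends on every place those attacks assume [-0.5, 0.5].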

l0 not implemented correctly

The function compare(x,y) is not called after the following code.

nn_robust_attacks/l0_attack.py

Line 161 in d2067d5

if works < .0001 and self.ABORT_EARLY:

So there is no check of whether the attack succeeded.

By default, self.independent_channels is False, so the following code runs:

nn_robust_attacks/l0_attack.py

Line 228 in d2067d5

valid = valid.reshape((self.model.image_size**2,self.model.num_channels))

nn_robust_attacks/l0_attack.py

Line 229 in d2067d5

totalchange = abs(np.sum(nimg[0]-img,axis=2))*np.sum(np.abs(gradientnorm[0]),axis=2)

So valid has shape (pixels, channels), while totalchange has shape (pixels). Consider a color image (3 channels): the shapes of valid and totalchange do not match.

In the following code,

nn_robust_attacks/l0_attack.py

Line 237 in d2067d5

valid[e] = 0

You basically change initial channel value (0,0,0) to 0, which is not correct.

Parameter setting for L_0 and L_inf attack

I cannot reproduce the L0 and Li attack results from your paper.

Based on your L2 example:

attack = CarliniL2(sess, model, batch_size=9, max_iterations=1000, confidence=0)

Because the L0 and Li attacks don't have batch size and confidence parameters, I used these settings:

attack = CarliniL0(sess, model, max_iterations=1000)

and

attack = CarliniLi(sess, model, max_iterations=1000)

I checked, and there appear to be some additional parameters for them:

CarliniL0(sess, model, targeted = TARGETED, learning_rate = LEARNING_RATE, max_iterations = MAX_ITERATIONS, abort_early = ABORT_EARLY, initial_const = INITIAL_CONST, largest_const = LARGEST_CONST, reduce_const = REDUCE_CONST, const_factor = CONST_FACTOR, independent_channels = False)

CarliniL2(sess, model, batch_size=1, confidence = CONFIDENCE, targeted = TARGETED, learning_rate = LEARNING_RATE, binary_search_steps = BINARY_SEARCH_STEPS, max_iterations = MAX_ITERATIONS, abort_early = ABORT_EARLY, initial_const = INITIAL_CONST)

CarliniLi(sess, model, targeted = TARGETED, learning_rate = LEARNING_RATE, max_iterations = MAX_ITERATIONS, abort_early = ABORT_EARLY, initial_const = INITIAL_CONST, largest_const = LARGEST_CONST, reduce_const = REDUCE_CONST, decrease_factor = DECREASE_FACTOR, const_factor = CONST_FACTOR)

So, can you describe your attack parameter settings to get the result in your paper?

Can I run this on a Mac OS?

Can I run this on a Mac OS? I've run into problems loading tensorflow-gpu libraries.

Why is the label of the attack images the same as that of the original images?

The label of the attack images is the same as that of the original images when I use the run_inference_on_image function in the setup_inception file.
Does this mean that the attack has essentially failed?

About the hyper parameters for cifar and mnist

I want to reproduce the experimental results on cifar and mnist. Can you share your hyperparameters for the l0, l2, and li attacks? Or did you just use the default parameters for both the mnist and cifar datasets?
I tried the default hyperparameters for cifar with the l0 attack, but the noise in the adversarial examples was not sparse; that is, the L0 value of the noise was similar to the L0 value for the l2 attack.
Can you give me some help? Thanks!

The performance of l2_attack on Pytorch

Hi! I am trying to reproduce your l2_attack on Pytorch to test the robustness of my network (I trained the model with Pytorch).

But now, in every 100 samples, there are around 5 for which no adversarial counterpart is found (their l2 distance is 1e10, which is the default initial value set in the code).

The only two differences between my code (PyTorch version) and your code (TensorFlow version) are:
(1) The initialization of modifier. Since I couldn't find which distribution "tf.variables_initializer" uses, I tried several initializers with different distributions in PyTorch, but still couldn't guarantee that all adversarial samples are found.

(2) The optimizer. In my previous experience, there is a subtle difference between the Adam optimizer in TensorFlow and the one in PyTorch.

(Btw, I am sure I am using pre-softmax outputs instead of post-softmax ones.)

I have tried adjusting the learning rate and initial constant, but that didn't make much difference. Due to GPU limitations, it's not practical for me to set a large max_iteration and binary_search_step with a correspondingly small learning rate.

So I wonder if you have any experience with this issue. I appreciate any insights or suggestions. Thanks!

Why are the adversarial perturbations damaged when the adversarial samples are saved with scipy.misc?

Hello, @carlini. Sorry to bother you again. I'm trying to defend against your attack, but something strange happens: the adversarial perturbations are damaged when the adversarial samples are saved with scipy.misc.
The image is preprocessed with ' image = image / 255.0 - 0.5 ' before being fed to the pre-trained model,
and we find that the final output differs from the one at the start.
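
One plausible cause, offered as an assumption rather than something confirmed in this thread, is 8-bit quantization: writing a float image to a standard image file rounds each pixel to one of 256 levels, so perturbations smaller than about half a level are lost. The sketch below uses hypothetical helper names and plain Python to mimic the image / 255.0 - 0.5 preprocessing quoted above.

```python
def to_uint8(x):
    """Convert a pixel from [-0.5, 0.5] to an 8-bit level, as an image file would store it."""
    return max(0, min(255, int(round((x + 0.5) * 255.0))))

def from_uint8(level):
    """Undo the conversion with the preprocessing quoted in the issue."""
    return level / 255.0 - 0.5

original = 0.1
adversarial = original + 0.001  # perturbation smaller than half a level (1/510)
restored = from_uint8(to_uint8(adversarial))

# After the round trip, the adversarial pixel lands on the same level as the original.
perturbation_lost = to_uint8(adversarial) == to_uint8(original)
```

If this is what is happening, keeping the adversarial array in a lossless float format (for example with numpy.save) instead of an 8-bit image would avoid the rounding.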

L2 regularization term is squared. Why here specifically? Which impact?

Hello @carlini,

Reading through your paper and your code, I noticed that for the $L^2$ attack, you use a regularization term $\Vert\delta\Vert_2^2$. But in all places except one in your paper, you write $\Vert\delta\Vert_p$ (no square), for example in section A or for the $L^{\infty}$ attack. Furthermore, Szegedy et al. also used it without the square.

Questions:

Is this done purposefully?
Is it discussed anywhere?
Are you sure about the impact (or absence thereof) of the exponent on the results?

Thanks and congrats for achieving your goal:

We hope our attacks will be used as a benchmark in future defense attempts to create neural networks that resist adversarial examples.

Élie
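
As general background to the third question (a standard observation, not a statement of the authors' reasoning), the squared norm is differentiable everywhere, while the plain norm has no gradient at $\delta = 0$, which is where an optimizer initialised with a zero perturbation starts:

$$\nabla_\delta \Vert\delta\Vert_2^2 = 2\delta, \qquad \nabla_\delta \Vert\delta\Vert_2 = \frac{\delta}{\Vert\delta\Vert_2} \quad (\delta \neq 0).$$

Both terms have the same minimizer, but the weight they place on large versus small perturbations differs, so the constant that balances them against the classification loss is not directly comparable between the two forms.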

About the settings for imagenet

I want to know whether the parameter settings for imagenet are the same as for cifar. I load Inception V3 and search for c with 9 binary search steps, which takes a long time. Can you share some ideas on how to accelerate this process in your method?

When I run train_models.py I get this error. I am a beginner and would be grateful for any help.

/usr/bin/python3.5 /home/veena/Desktop/nn_robust_attacks-master/train_models.py
/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
Using Theano backend.
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
(45000, 32, 32, 3)
WARNING:tensorflow:From /home/veena/Desktop/nn_robust_attacks-master/train_models.py:54: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See @{tf.nn.softmax_cross_entropy_with_logits_v2}.

Traceback (most recent call last):
File "/home/veena/.local/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 521, in make_tensor_proto
str_values = [compat.as_bytes(x) for x in proto_values]
File "/home/veena/.local/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 521, in <listcomp>
str_values = [compat.as_bytes(x) for x in proto_values]
File "/home/veena/.local/lib/python3.5/site-packages/tensorflow/python/util/compat.py", line 61, in as_bytes
(bytes_or_text,))
TypeError: Expected binary or unicode string, got /dense_3_target

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/veena/Desktop/nn_robust_attacks-master/train_models.py", line 109, in <module>
train(CIFAR(), "models/cifar", [64, 64, 128, 128, 256, 256], num_epochs=50)
File "/home/veena/Desktop/nn_robust_attacks-master/train_models.py", line 60, in train
metrics=['accuracy'])
File "/home/veena/.local/lib/python3.5/site-packages/keras/engine/training.py", line 333, in compile
sample_weight, mask)
File "/home/veena/.local/lib/python3.5/site-packages/keras/engine/training_utils.py", line 403, in weighted
score_array = fn(y_true, y_pred)
File "/home/veena/Desktop/nn_robust_attacks-master/train_models.py", line 54, in fn
logits=predicted/train_temp)
File "/home/veena/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 250, in new_func
return func(*args, **kwargs)
File "/home/veena/.local/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 1965, in softmax_cross_entropy_with_logits
labels = array_ops.stop_gradient(labels, name="labels_stop_gradient")
File "/home/veena/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 8030, in stop_gradient
"StopGradient", input=input, name=name)
File "/home/veena/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 513, in _apply_op_helper
raise err
File "/home/veena/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 510, in _apply_op_helper
preferred_dtype=default_dtype)
File "/home/veena/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1107, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/veena/.local/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 217, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/home/veena/.local/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 196, in constant
value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/home/veena/.local/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 525, in make_tensor_proto
"supported type." % (type(values), values))
TypeError: Failed to convert object of type <class 'theano.tensor.var.TensorVariable'> to Tensor. Contents: /dense_3_target. Consider casting elements to a supported type.

Process finished with exit code 1

question for l2 distance in l2_attack

Hey, I found that when calculating the l2 distance in the loss function of the l2 attack, the following code is used:

self.l2dist= tf.reduce_sum(tf.square(self.newimg-(tf.tanh(self.timg) * self.boxmul + self.boxplus)),[1,2,3])

Since the input data has already been normalized to [-0.5,0.5], why apply the tanh function to the input image again here? Why not just use

self.newimg - self.timg
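
A likely explanation, sketched here under the assumption that the attack converts the input to arctanh space before assigning it to self.timg (as the "convert to tanh-space" step quoted elsewhere on this page suggests), is that tanh(self.timg) * boxmul + boxplus simply recovers the original image. The helper names below are hypothetical.

```python
import math

BOXMUL, BOXPLUS = 0.5, 0.0  # values for images in [-0.5, 0.5]

def to_tanh_space(x):
    """Pre-image of x under the tanh box mapping; 0.999999 keeps atanh finite at the edges."""
    return math.atanh((x - BOXPLUS) / BOXMUL * 0.999999)

def from_tanh_space(t):
    """Map a tanh-space value back into the pixel box."""
    return math.tanh(t) * BOXMUL + BOXPLUS

pixels = [-0.5, -0.2, 0.0, 0.3, 0.5]
recovered = [from_tanh_space(to_tanh_space(p)) for p in pixels]

# The round trip reproduces the pixels up to the 0.999999 safety factor.
max_error = max(abs(p - r) for p, r in zip(pixels, recovered))
```

Under that assumption, self.newimg - self.timg would subtract a tanh-space value from a pixel-space value, so the two expressions are not interchangeable.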

student.predict instead of teacher.predict in the func train_distillation

Shouldn't it be student.predict(data.train_data) instead of teacher.predict(data.train_data) after the student is trained in the function train_distillation?

Corrected code:

    # train the student model at temperature t
    student = train(data, file_name, params, num_epochs, batch_size, train_temp,
                    init=file_name+"_init")

    # and finally we predict at temperature 1
    predicted = student.predict(data.train_data)

    print(predicted)

I want to attack my own model trained with TensorFlow 2.0.

I want to attack my own model, which was trained with TensorFlow 2.0. How should I modify the code?

Possibility of "fixing" the number of pixels to be modified - L0 attack for MNIST

I was able to modify your algorithm (l0_attack) to produce an adversarial example (AX) that modifies only a specific number of pixels (say, a budget setting). The results were:

For an arbitrary (input, target) pair the attack was not always successful: as intuition suggests, the lower the number of pixels (budget setting), the lower the success rate.
Again, the fewer pixels modified, the higher the per-pixel intensity change.

My questions are:

Is it possible to obtain a successful AX for any arbitrary (input, target) pair under any budget setting (say, a one-pixel attack) using your L0 algorithm?
If so, how can we make the attack (AX) as strong as possible under that particular budget setting?
E.g., if the budget setting were 784 pixels for MNIST, I assume we could make the AX stronger by increasing the confidence value.

I'm not sure whether the formulation above makes complete sense (under the constraints of your algorithm).
Could you give me any pointers on this, or tell me whether it is even possible?
Thanks for your time.

Graphdef cannot be larger than 2Gb in tensorflow when using the InceptionModel

When using the InceptionModel to generate more attack examples, this error occurs.
That is because "you're importing the graph every time you call predict(), and so you're accumulating a very large default graphdef. You should change your code so that you only load the graph once outside of your predict function. This should also speed up your code considerably."

Unable to reproduce L0 attack on MNIST

The generated adversarial examples end up being in the range [-0.5,0.5] instead of [0,1].
I tried making some modifications to change the range, but was unsuccessful.
Could you give me some pointers to get the adv. images in the range [0,1]?

I am using the model mentioned under setup_mnist.py
Thanks

Why are the examples no longer adversarial when I only change the weights of the attacked model?

Hi, when I trained the default model for the first time, I could generate adversarial examples against those weights with a 100% attack success rate. But when I trained the same model once more, saved the weights, and tried to attack this second model with the examples generated against the first one, it did not work: the test accuracy is over 65%. Does this make sense? I thought adversarial examples should transfer between instances of the same model with different weights.
Thank you for your time!!

Why 10000 in your code? What does it mean? Thanks!

In the l2 attack code, you use a line like the one below. What does it mean, and why do you use 10000? Could you give me some more detail?

other = tf.reduce_max((1-self.tlab)*self.output - (self.tlab*10000),1)
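
The line reads most naturally as a masked maximum: multiplying the one-hot target label by a large constant pushes the target logit far below the others, so reduce_max returns the largest non-target logit. The following pure-Python sketch illustrates that reading; it is an interpretation with made-up logits and a hypothetical helper name, and any sufficiently large constant would serve in place of 10000.

```python
def largest_other_logit(tlab, output, big=10000):
    """Max over (1 - tlab) * output - tlab * big, i.e. the best non-target logit."""
    return max((1 - t) * o - t * big for t, o in zip(tlab, output))

tlab = [0, 0, 1, 0]              # one-hot target label (class 2)
output = [1.5, -0.3, 4.0, 2.25]  # made-up pre-softmax logits

real = sum(t * o for t, o in zip(tlab, output))  # logit of the target class
other = largest_other_logit(tlab, output)        # largest logit among the rest
```

Without the subtraction, the target class itself (4.0 here) would win the maximum, and the comparison between the target logit and its strongest competitor would be meaningless.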

How can I load the pretrained pb file of tensorflow format into the project?

I want to use a pretrained TensorFlow model in this project for evaluation, but I can't load the pb file into the keras model. How can I solve this problem? Thanks!

Which keras and tensorflow are imported in the code?

L_infinite implementation problem

I am trying to port the L_infinite attack to cleverhans, but there is a problem I don't understand.
The adversarial images I get always contain some nan values. Looking into the code, I found the following:
In the L_infinite implementation, there is a function doit, which converts the original image to tanh-space.

def doit(oimgs, labs, starts, tt, CONST):
    # convert to tanh-space
    imgs = np.arctanh(np.array(oimgs)*1.999999)
    starts = np.arctanh(np.array(starts)*1.999999)

Here I don't understand why it multiplies by 1.999999 instead of 0.999999.
If the range of the original images is [0,1], it becomes [0, 1.999999], and for the part in [1, 1.999999] np.arctanh() returns inf or nan.

By the way, I read the original paper on the L2, L0, and L infinite attacks. The L infinite results in the paper are remarkable, but when I run the code in this repository, the maximum value of the adversarial image is 0.5 and the minimum is -0.5. This seems odd. Is it a parameter issue, or is something going wrong?
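
The 1.999999 factor is consistent with inputs in [-0.5, 0.5], which it stretches to just inside (-1, 1), the domain of arctanh. The sketch below, which uses Python's math.atanh rather than np.arctanh and a hypothetical helper name, shows that a pixel from a [0, 1] image falls outside that domain. math.atanh raises ValueError there, whereas np.arctanh returns nan.

```python
import math

def to_tanh_space(pixel):
    """Scale as in the quoted doit() snippet, then apply arctanh."""
    return math.atanh(pixel * 1.999999)

inside = to_tanh_space(0.5)  # 0.9999995 is inside (-1, 1), so this is finite

try:
    to_tanh_space(1.0)       # 1.999999 is outside (-1, 1)
    out_of_domain = False
except ValueError:
    out_of_domain = True
```

This would also explain the observed output range: if the attack assumes [-0.5, 0.5] throughout, a [0, 1] image would need to be shifted by -0.5 before being passed in.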

question for self.newimg in l2_attack

Hello, after reading your paper, I have a question about the meaning of this line:
self.newimg = tf.tanh(modifier + self.timg) * self.boxmul + self.boxplus
The input data is in [-0.5,0.5], so why is the line not the following?
self.newimg = tf.tanh(modifier) * self.boxmul + self.boxplus + (tf.tanh(self.timg) * self.boxmul + self.boxplus)
I don't understand tf.tanh(modifier + self.timg).

Unable to generate l2 attack examples

I have a pre-trained keras MNIST model and I want to use my model to generate adversarial examples.

Here is the code I changed in class MNISTModel() in setup_mnist.py:

    self.num_channels = 1
    self.image_size = 28
    self.num_labels = 10
    K.set_session(session)
    model = load_model('MNIST_model1.h5')
    self.model = model

But when I ran the code, some of the examples I got were all zeros.

Should I load the model like this?
Thank you

Unsuccessful TensorSliceReader constructor


Traceback (most recent call last):
  File "test_attack.py", line 69, in <module>
    data, model =  MNIST(), MNISTModel("models/mnist", sess)
  File "/Users/datle/Documents/Secure_machine_learning/SVM/Carlini/nn_robust_attacks/setup_mnist.py", line 89, in __init__
    model.load_weights(restore)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/keras/engine/training_v1.py", line 236, in load_weights
    return super(Model, self).load_weights(filepath, by_name, skip_mismatch)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 2199, in load_weights
    py_checkpoint_reader.NewCheckpointReader(filepath)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/training/py_checkpoint_reader.py", line 99, in NewCheckpointReader
    error_translator(e)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/training/py_checkpoint_reader.py", line 35, in error_translator
    raise errors_impl.NotFoundError(None, None, error_message)
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for models/mnist

I am using the latest python3, TensorFlow, and Keras.
I changed tf.Session() --> tf.compat.v1.Session() at line 68, test_attack.py
I changed tf.app.flags --> tf.compat.v1.flags at line 52,60, 65 and 67 in file setup_inception.py
Please help !

modifier always equals zero

I tried to use CarliniL2 to test my model. However, for most of the samples, the attack results are all zeros. I checked the code and found that after running
_, l, l2s, scores, nimg = self.sess.run([self.train, self.loss, self.l2dist, self.output, self.newimg])
the value of the variable 'modifier' in __init__ is always zero. In other words, the gradients passed to the optimizer are always zero, so the image is never updated, and I get all-zero results.
I have made sure that my model does not include the softmax layer. Any hints?

l0 attack can't work correctly

In the function attack_single, after running several steps,

        while True:
            # try to solve given this valid map
            res = self.grad([np.copy(img)], [target], np.copy(prev), 
                       valid, const)
            if res == None:
                # the attack failed, we return this as our final answer
                print("Final answer",equal_count)
                return last_solution

I encounter a problem: equal_count is not defined and last_solution is None, so r.extend(None) also raises an exception.

GZip error

I am getting the following error on running the train.py file:

File "train_models.py", line 110, in <module>
    train(MNIST(), "models/mnist", [32, 32, 64, 64, 200, 200], num_epochs=50)
  File "/users/btech/ananyag/Desktop/ugp/nn_robust_attacks/setup_mnist.py", line 49, in __init__
    train_data = extract_data("data/train-images-idx3-ubyte.gz", 60000)
  File "/users/btech/ananyag/Desktop/ugp/nn_robust_attacks/setup_mnist.py", line 23, in extract_data
    bytestream.read(16)
  File "/usr/lib/python3.8/gzip.py", line 292, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.8/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/lib/python3.8/gzip.py", line 479, in read
    if not self._read_gzip_header():
  File "/usr/lib/python3.8/gzip.py", line 427, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b'<!')

Can anyone please suggest the solution?
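
The magic bytes b'<!' in the last line suggest that the downloaded file is an HTML page rather than a gzip archive (a gzip stream starts with the bytes 0x1f 0x8b). A small check like the one below, with a hypothetical helper name, can confirm this before re-downloading the data.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"

def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC

# Demonstration on two temporary files: a real gzip file and an HTML page.
tmpdir = tempfile.mkdtemp()
good_path = os.path.join(tmpdir, "good.gz")
bad_path = os.path.join(tmpdir, "bad.gz")

with gzip.open(good_path, "wb") as f:
    f.write(b"payload")
with open(bad_path, "wb") as f:
    f.write(b"<!DOCTYPE html><html></html>")

good_ok = looks_like_gzip(good_path)
bad_ok = looks_like_gzip(bad_path)
```

If looks_like_gzip("data/train-images-idx3-ubyte.gz") returns False, the file on disk is probably an error page saved under the dataset's name, and it would need to be deleted and fetched again from a working source.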

Unable to run train_models.py

nn_robust_attacks datle$ python3 train_models.py
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1317, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1229, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1275, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1224, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1016, in _send_output
self.send(msg)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 956, in send
self.connect()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1392, in connect
server_hostname=server_hostname)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 412, in wrap_socket
session=session
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 853, in _create
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 1117, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train_models.py", line 109, in <module>
train(CIFAR(), "models/cifar", [64, 64, 128, 128, 256, 256], num_epochs=50)
File "/Users/datle/Documents/Secure_machine_learning/SVM/Carlini/nn_robust_attacks/setup_cifar.py", line 68, in __init__
"cifar-data.tar.gz")
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 247, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 525, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 543, in _open
'_open', req)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 503, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1360, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1319, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)>

li attack clarification

Would you like to explain the meaning of the following code?

nn_robust_attacks/li_attack.py

Line 135 in 87f8c65

if works < .0001*CONST and (self.ABORT_EARLY or step == CONST-1):

To me, it does not make sense:

step == CONST-1: step is an integer tied to the iteration, whereas CONST is a floating-point value tied to the loss.
works is the loss value of the instance. Why set the threshold on the loss to 0.0001*CONST? I think your intent is to push loss2 to 0 and loss1 below 0.0001, but I am not sure this explanation makes sense.

Any adversarial attack that sustains after resize attack

Hi Sir,

This is Bala. I have a query regarding adversarial attack.

Is there any adversarial attack whose added noise survives a resize attack? (adversarial image -> conversion to a high/low resolution image -> resize back to the original adversarial image size)

Thanks,
Bala

What version of tensorflow + keras?

Trying to follow along with your attack, I run into issues with newer TF versions. I get errors for nb_epoch (newer versions use just epochs).

However, when I set my TF version to 1.15 (the last 1.x version) and run python3 train_models.py, I get this error:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/keras/__init__.py", line 3, in <module>
from tensorflow.keras.layers.experimental.preprocessing import RandomRotation
ModuleNotFoundError: No module named 'tensorflow.keras.layers.experimental.preprocessing'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train_models.py", line 10, in <module>
from keras.models import Sequential
File "/usr/local/lib/python3.7/dist-packages/keras/__init__.py", line 6, in <module>
'Keras requires TensorFlow 2.2 or higher. '
ImportError: Keras requires TensorFlow 2.2 or higher. Install TensorFlow via `pip install tensorflow`

Which versions of tf and keras were used for this project?

Thanks and best

L2 untargeted attack not working?!

@carlini I am trying to run your source code. When I use the cifar10 untargeted setting, the loss is always zero, so the predicted class stays the same as the original prediction and the attack fails. I have tried confidence values of 0, 10, 20, and 40, and it failed in all cases. For confidence > 0 the losses are not zero, but the attack still fails because the class does not change.
The targeted attack, in contrast, works fine in all cases.
Can you give me a hint about what the problem might be? I am using only the pre-softmax layer for prediction, as in the code.

Also, for training the cifar10 model used in your paper, you mention a decay of 0.5 in momentum. How is that done? I applied a step decay of 0.5 to the learning rate every 10 epochs and got 79.5% test accuracy.

Misleading printing?

I am using l0_attack.py.
The default printing shows:

            equal_count = self.model.image_size**2-np.sum(np.all(np.abs(img-nimg[0])<.0001,axis=2))
            print("Forced equal:",np.sum(1-valid),
                  "Equal count:",equal_count)

"Equal count" may be misleading as this number is the number of pixels that are different from (not equal to) the original image at the current iteration. Should it be

            print("Forced equal:",np.sum(1-valid),
                  "Different count:",equal_count)

            print("Forced equal:",np.sum(1-valid),
                  "L0:",equal_count)

nn_robust_attacks/l0_attack.py

Line 227 in 610c43f

 equal_count = self.model.image_size**2-np.sum(np.all(np.abs(img-nimg[0])<.0001,axis=2)) 

Setting confidence=0 produces different adversarial accuracy between different runs

Hello, I have run the l2_attack for mnist on GPUs.
Sometimes it produces adversarial accuracy around 1%.
Sometimes it produces adversarial accuracy around 20%~30%.
But setting confidence=0.01 (a small value) resolves the issue.

Do you have any idea why this happens?
Thanks!

How to control the number of perturbed pixels?

I want to generate adversarial examples with a fixed fraction of perturbed pixels, i.e. set the L0 norm between the adversarial example and the original image to about image.size*c%. For example, I want the L0 norm on cifar to be about 32x32x3x20%. (An approximate value in this range is enough; no exact value is needed.) Can you give me some help?

Contribute implementation to Foolbox

I am one of the authors of Foolbox, a recently released Python package that aims to provide reference implementations of many adversarial attacks with a uniform API for many different deep learning frameworks. So far we implemented 15+ attacks and provide consistent interfaces to PyTorch, Theano, Tensorflow, Keras, Lasagne and MXNet. The paper on Foolbox is

https://arxiv.org/abs/1707.04131

I'd hereby like to invite you to contribute an implementation of your method. The backend of Foolbox likely needs a couple of small modifications to support your attack but I'd be very happy to assist you. Please let me know if you are interested.

Low validation accuracy of CIFAR

I ran the script to train the model on CIFAR10 and also the L0 attack on the trained model.

However, the validation accuracy achieved by the script is very low. It is not reasonable to perform adversarial attacks on such low accuracy model.

l0 attack: Potential Bug

Apologies if I'm missing something obvious here, but in the l0 attack, shouldn't the valid[e] = 0 be after the breaks?

We set a pixel to "don't change" if (1) totalchange < threshold and (2) we haven't changed too many pixels. Wouldn't setting valid[e] = 0 before the breaks invalidate the pixel regardless?

        did = 0
        for e in np.argsort(totalchange):
            if np.all(valid[e]):
                did += 1
                valid[e] = 0

                if totalchange[e] > .01:
                    # if this pixel changed a lot, skip
                    break
                if did >= .3*equal_count**.5:
                    # if we changed too many pixels, skip
                    break

Also, you haven't implemented the random starts in the l2 implementation in this repo, correct? The paper says:

We randomly sample points uniformly from the ball of radius
r, where r is the closest adversarial example found so far.

Does this r depend on the source/target label for a given image? That is, do we choose r based on the closest adversarial example for the target class under consideration (r would vary significantly depending on the target class, as adversarial examples for some classes are harder than others)? What initial value did you pick?

L_inf always fails if abort_early is False

In li_attack.py, the function doit never returns anything if abort_early=False. This means that doit will always return None, which will be interpreted by attack_single as a failure.
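
The control flow being described can be sketched without TensorFlow. In this simplified, hypothetical loop (run_step, the iteration count, the threshold, and the return values are stand-ins, not the repository's code), the first function only returns from inside the early-abort branch, so it falls through to an implicit None when abort_early is False; the second also accepts a result on the last step.

```python
def run_step(step):
    """Stand-in for one optimizer step; returns a loss that shrinks over time."""
    return 1.0 / (step + 1)

def doit_buggy(max_iterations, abort_early, threshold=0.05):
    for step in range(max_iterations):
        works = run_step(step)
        if works < threshold and abort_early:
            return step, works
    # no return here: the function yields None when abort_early is False

def doit_fixed(max_iterations, abort_early, threshold=0.05):
    for step in range(max_iterations):
        works = run_step(step)
        last_step = step == max_iterations - 1
        if works < threshold and (abort_early or last_step):
            return step, works
    return None  # only reached if the loss never drops below the threshold

buggy_result = doit_buggy(100, abort_early=False)
fixed_result = doit_fixed(100, abort_early=False)
```

Comparing step against the last iteration index, rather than against a loss-related constant, is an assumption about what the original condition was meant to express.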
