albermax / innvestigate
A toolbox to iNNvestigate neural networks' predictions!
License: Other
Review the package structure and decide where methods best belong, especially in the utils module.
Is there any convenient way to include numerical checks in the code, similar to the numpy.testing.assert_* functions? I cannot seem to find anything.
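One option (a sketch, not an established convention in this repo): evaluate the Keras tensors to numpy arrays and reuse numpy.testing directly. K.eval only covers tensors that can be computed without feeding placeholders; anything input-dependent would need a K.function first.

    import numpy as np
    import keras.backend as K

    def assert_allclose_tensor(actual, desired, rtol=1e-5, atol=1e-8):
        # Evaluate the symbolic tensor to a numpy array, then let
        # numpy.testing do the comparison and error reporting.
        np.testing.assert_allclose(K.eval(actual), desired, rtol=rtol, atol=atol)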
Check the license, especially whether each file needs a header.
Fix plotting. In particular, the font sizes should be correct for mnist and imagenet. See the examples.
I have generated some pre-defined conditional rules for LRP, such as LRPCompositeA and LRPCompositeAFlat in relevance_based.py:1091+.
As soon as the parameter input_layer_rule is set for the constructor, it either crashes (if it is a string) or kcheck.is_input_layer returns True for all layers.
LRPCompositeAFlat is defined in relevance_based.py:1170 and runs, yet applies the flat rule to all layers. This analyzer has been activated in imagenet_lrp.py:130 in branch feature/input_layer_rule. Check out that branch, run imagenet_lrp.py, and observe the debug prints in the terminal output, which show the mappings from layer object to rule object as returned by select_rule in relevance_based.py.
LRPCompositeBFlat is defined in relevance_based.py:1180 and passes the input_layer_rule parameter as a string, which causes a crash. To reproduce that problem, activate the analyzer in imagenet_lrp.py:131.
Stack trace regarding that crash:
Traceback (most recent call last):
File "/home/lapuschkin/Desktop/innvestigate-lrp-wip/innvestigate/examples/imagenet_lrp.py", line 165, in <module>
a = analyzer.analyze(image if is_input_analyzer else x)
File "/home/lapuschkin/Desktop/innvestigate-lrp-wip/innvestigate/innvestigate/analyzer/base.py", line 373, in analyze
self.compile_analyzer()
File "/home/lapuschkin/Desktop/innvestigate-lrp-wip/innvestigate/innvestigate/analyzer/base.py", line 338, in compile_analyzer
tmp = self._create_analysis(model)
File "/home/lapuschkin/Desktop/innvestigate-lrp-wip/innvestigate/innvestigate/analyzer/base.py", line 497, in _create_analysis
return_all_reversed_tensors=return_all_reversed_tensors)
File "/home/lapuschkin/Desktop/innvestigate-lrp-wip/innvestigate/innvestigate/utils/keras/graph.py", line 594, in reverse_model
"layer": layer,
File "/home/lapuschkin/Desktop/innvestigate-lrp-wip/innvestigate/innvestigate/analyzer/relevance_based.py", line 844, in __init__
rule_class = LRP_RULES[rule]
TypeError: unhashable type: 'list'
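For reference, the TypeError arises because the whole rule argument (here a list) is used as a key into the LRP_RULES dict. A minimal sketch of a guard; the recursion into lists is an assumption about the intended semantics, not existing code:

    def lookup_rule(rule):
        # Only strings are valid keys into the LRP_RULES registry; a
        # list of rules has to be resolved per element, otherwise the
        # dict lookup raises "unhashable type: 'list'".
        if isinstance(rule, list):
            return [lookup_rule(r) for r in rule]
        if isinstance(rule, str):
            return LRP_RULES[rule]
        return rule  # already a rule class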
I will not touch the code until Monday, so have at it.
Have a nice weekend!
Create a README. It should contain:
Add a pixel-flipping evaluation in the tools submodule; a sketch of the idea follows below.
Constructor inputs:
Multi-GPU?
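For reference, a minimal sketch of what such an evaluation could compute; the function name, parameters, and flipping strategy here are assumptions, not a final API:

    import numpy as np

    def pixel_flipping(model, analyzer, x, steps=100, flip_value=0.0):
        # Score of the originally predicted class before any flipping.
        probs = model.predict_on_batch(x)[0]
        target = probs.argmax()
        scores = [probs[target]]
        # Rank input positions by relevance, most relevant first.
        relevance = analyzer.analyze(x)
        order = np.argsort(-np.abs(relevance).ravel())
        x = x.copy()
        flat = x.ravel()  # a view into the copy
        for idx in order[:steps]:
            # "Flip" the next most relevant input and record how the
            # prediction for the original class degrades.
            flat[idx] = flip_value
            scores.append(model.predict_on_batch(x)[0][target])
        return np.asarray(scores)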
Use keras.applications vgg16 and then add the other keras.applications networks. Problems:
For large networks, initializing the revert functions is pretty slow (ca. 1 s for dense layers); when using multiple methods and networks with many layers, this makes everything a bit slow. Let's have a look.
Could it be that you committed this folder to your feature branch by mistake, Sebastian?
How come the instantiation times for the analyzer objects are so high compared to the execution times for given images, once the setup is done?
Example for running the first three images in mnist_lrp.py:
Image 0: Input (0.0230s) Gradient*Input (0.1160s) LRP-Z (0.2516s) LRP-Z-IB (0.1634s) LRP-Epsilon (0.1742s) LRP-Epsilon-IB (0.1694s) LRP-W-Square (0.1779s) LRP-Flat (0.1840s) LRP-A2B1 (0.6371s) LRP-A2B1-IB (0.5809s) LRP-A1B0 (0.5755s) LRP-A1B0-IB (0.5152s) LRP-ZPlus (0.5752s) LRP-ZPlusFast (0.3181s)
Image 1: Input (0.0005s) Gradient*Input (0.0016s) LRP-Z (0.0014s) LRP-Z-IB (0.0018s) LRP-Epsilon (0.0018s) LRP-Epsilon-IB (0.0016s) LRP-W-Square (0.0017s) LRP-Flat (0.0016s) LRP-A2B1 (0.0027s) LRP-A2B1-IB (0.0024s) LRP-A1B0 (0.0024s) LRP-A1B0-IB (0.0019s) LRP-ZPlus (0.0021s) LRP-ZPlusFast (0.0018s)
Image 2: Input (0.0005s) Gradient*Input (0.0014s) LRP-Z (0.0018s) LRP-Z-IB (0.0015s) LRP-Epsilon (0.0017s) LRP-Epsilon-IB (0.0016s) LRP-W-Square (0.0013s) LRP-Flat (0.0014s) LRP-A2B1 (0.0026s) LRP-A2B1-IB (0.0027s) LRP-A1B0 (0.0024s) LRP-A1B0-IB (0.0020s) LRP-ZPlus (0.0024s) LRP-ZPlusFast (0.0019s)
[...]
The same for imagenet_lrp.py; the model is VGG16:
Image 0: Input (0.0193s) Gradient*Input (0.3270s) LRP-Z (2.0532s) LRP-Z-IB (1.9581s) LRP-Epsilon (2.9357s) LRP-Epsilon-IB (3.3425s) LRP-W-Square (3.3343s) LRP-Flat (3.5949s) LRP-A2B1 (13.3973s) LRP-A2B1-IB (15.5360s) LRP-A1B0 (21.1701s) LRP-A1B0-IB (21.2032s) LRP-ZPlus (23.4144s) LRP-ZPlusFast (12.3629s)
Image 1: Input (0.0007s) Gradient*Input (0.0152s) LRP-Z (0.0228s) LRP-Z-IB (0.0224s) LRP-Epsilon (0.0236s) LRP-Epsilon-IB (0.0229s) LRP-W-Square (0.0230s) LRP-Flat (0.0230s) LRP-A2B1 (0.0651s) LRP-A2B1-IB (0.0637s) LRP-A1B0 (0.0375s) LRP-A1B0-IB (0.0360s) LRP-ZPlus (0.0358s) LRP-ZPlusFast (0.0214s)
Image 2: Input (0.0010s) Gradient*Input (0.0147s) LRP-Z (0.0216s) LRP-Z-IB (0.0215s) LRP-Epsilon (0.0220s) LRP-Epsilon-IB (0.0217s) LRP-W-Square (0.0209s) LRP-Flat (0.0207s) LRP-A2B1 (0.0605s) LRP-A2B1-IB (0.0603s) LRP-A1B0 (0.0361s) LRP-A1B0-IB (0.0350s) LRP-ZPlus (0.0348s) LRP-ZPlusFast (0.0213s)
[...]
Once the analyzer (or rather its computational graph?) has been built, execution times are low. What is causing this? Is it the copying of the layer parameters? Can we minimize that time?
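Until this is resolved, the practical pattern is to pay the construction cost once per analyzer and reuse it across images (a sketch; model and images are assumed to be defined, and "lrp.z" is one of the registered method names):

    import innvestigate

    # Build each analyzer once; the expensive graph reversal happens
    # at setup, while analyze() only executes the finished graph.
    analyzer = innvestigate.create_analyzer("lrp.z", model)
    for x in images:
        a = analyzer.analyze(x)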
A markdown file that describes the current analyzers and functions and their parameters.
It is unclear to me how to call this script.
Please add a more transparent example call.
Being mostly familiar with py2, Java, C#, and ..., I am used to making super-calls in constructor definitions before doing anything else; in some languages you are even forced to do so.
In this software I frequently see a lot of class setup code which is then finalized by a superclass constructor call. Is there a particular, practical reason to do so, or is it just personal choice?
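For illustration, a contrived sketch of the practical reason this ordering can be necessary (Base, Analyzer, and _model_checks are invented names, not the actual classes): the base constructor runs template-method hooks that consume state the subclass must prepare first.

    class Base(object):
        def __init__(self, model):
            self._model = model
            # Template-method style: the base constructor already runs
            # hooks that read state prepared by the subclass.
            self._run_model_checks()

        def _run_model_checks(self):
            for check in getattr(self, "_model_checks", []):
                check(self._model)

    class Analyzer(Base):
        def __init__(self, model):
            # Setup has to happen *before* the super call, because
            # Base.__init__ consumes _model_checks right away.
            self._model_checks = [lambda m: None]
            super(Analyzer, self).__init__(model)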
Add an interface to the analyzers that works with a generator; see the sketch below.
Needed for pixel-flipping.
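A minimal sketch of such an interface; the helper name and signature are assumptions:

    def analyze_generator(analyzer, batches):
        # Lazily run an already-built analyzer over batches coming
        # from a generator, e.g. perturbed inputs for pixel-flipping.
        for x in batches:
            yield analyzer.analyze(x)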
Add the necessary code documentation, especially for the graph reversal functionality.
This happens, e.g., when executing mnist_all_methods.py.
The Matplotlib backend should be switchable.
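A sketch of the standard switch; how to expose it (option or environment variable) is left open:

    import matplotlib
    # Must run before pyplot is imported; "Agg" renders headlessly,
    # which also helps when running the examples over ssh.
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt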
We use (IPy)notebooks as the main way to introduce people to the API. "Learning by doing."
The following notebooks and contents should be part of the first release:
A) [core reversal code] Walk through the idea of inverting the graph, show how to implement the gradient by inverting each layer, then a slightly more advanced example showing how to create deconvnet and guided backprop.
B) [mnist complete workflow] The same as the mnist example right now, just with more comments on how things work. The code should also be structured more nicely, i.e., not using those functions, but proceeding in a step-by-step fashion.
C) [imagenet application] A step-by-step example with imagenet, again similar to all_methods.py. The scope here is to reuse the patterns and let people know how they can use our applications submodule.
D) [imagenet network comparison] Same as C, only the outcome should be a grid with n networks as rows and n methods as columns. The code can and will be a bit messier, as different networks need different preprocessing functions and image sizes. Overall it should be doable, as the key information is already present in the dictionary returned by innvestigate.applications.
E) [imagenet pattern training] Similar to the current train_***.py. Focus on how to train patterns for a large network. In the best case this example shows how to train with several GPUs (i.e., setting the parameter gpus=X).
F) [perturbation mnist] An example notebook on how to use perturbation analysis with mnist that everybody can run.
G) [perturbation imagenet] An example notebook on how to use perturbation analysis with imagenet.
H) [LRP/DT intro] Sebastian might want to add an LRP/DT notebook.
I think if a working Python script exists first, it is easy to create the notebook. The advantage would be that one can run the examples easily via ssh. The drawback is that the code is duplicated, and when it changes one needs to change two places.
Add missing backends to /innvestigate/utils/keras/backend.py. Test them.
In examples/all_methods.py, some methods use the postprocessing option graymap and others heatmap.
Would it not be fairer, and nicer to look at, to use the same color mapping scheme for all those methods?
Add the needed packages to the setup script. Test by installing the package into an empty venv.
Known needed modules: pillow, numpy, scipy, keras.
Note: it should be up to the user which Keras backend they install.
https://github.com/albermax/innvestigate_paper is public. Is this intended?
I have found a bug in the code for the AlphaBeta rules and also the ZPlus rule.
The current code creates copies of the layer's weights and thresholds them at zero into positive and negative weight arrays.
However, AlphaBeta and also ZPlus require the forward transformations to be thresholded at zero, which, with the current implementation, is only equivalent if the layer inputs x are greater than or equal to zero.
For a correct implementation of both rules (ZPlus can inherit from AlphaBeta and realize alpha=1, beta=0), it would suffice to copy the layer's weights once, but then compute two forward passes whose contributions are thresholded. Does this interfere with the computation of the backward gradient?
Also, to correctly consider the positive and negative forward contributions originating from the bias, the bias needs to be thresholded as well.
Please have a look at [this](https://github.com/sebastian-lapuschkin/lrp_toolbox/blob/python-wip/python/modules/linear.py), lines 219+, which implements the rules using numpy.
Can this (conveniently) be fixed without negatively interfering with the gradient computations?
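For a dense layer, a minimal numpy sketch of the corrected computation, following the lrp_toolbox reference linked above and the alpha - beta = 1 convention (e.g. LRP-A2B1); the stabilizer eps is an assumption:

    import numpy as np

    def alphabeta_dense(X, W, b, R, alpha=2.0, beta=1.0, eps=1e-12):
        # Threshold the forward *contributions* x_i * w_ij at zero,
        # not just the weights; this stays correct for negative inputs.
        Z = X[:, None] * W                          # shape (n_in, n_out)
        Zp, Zn = np.maximum(Z, 0.0), np.minimum(Z, 0.0)
        # The bias must be split by sign as well, since it enters the
        # positive/negative pre-activations in the denominators.
        Np = Zp.sum(axis=0) + np.maximum(b, 0.0) + eps
        Nn = Zn.sum(axis=0) + np.minimum(b, 0.0) - eps
        Rp = np.dot(Zp / Np, R)     # redistribution via positive parts
        Rn = np.dot(Zn / Nn, R)     # redistribution via negative parts
        # ZPlus is the special case alpha=1, beta=0.
        return alpha * Rp - beta * Rn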
Create a module with the same/subset of the networks in keras.applications.
In some cases, saliency maps can become very sparse, with only a few very local peaks dominating the normalization of the saliency map into RGB space via a color map.
What do you think about an option to apply gamma scaling, to make weak responses more visible?
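A small sketch of the idea, applied before the colormap; gamma < 1 lifts weak responses, and the default of 0.5 is just an example:

    import numpy as np

    def gamma_scale(a, gamma=0.5):
        # Sign-symmetric range compression: weak relevances survive
        # the colormap normalization next to a few dominant peaks.
        a = a / np.max(np.abs(a))       # normalize to [-1, 1]
        return np.sign(a) * np.abs(a) ** gamma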
Implement them in Keras. Useful, e.g., when used with pixel-flipping. Let's go all Keras.
For some models, loading (partially) fails. See below:
vgg16: ok
resnet50: ok
inception_v3: weights ok. "relu" type pattern missing (no url).
inception_resnet_v2: weights ok. "relu" type pattern missing (no url).
densenet121: weights ok. "relu" type pattern missing (no url).
densenet169: ok
densenet201: weights ok. "relu" type pattern missing (no url).
nasnet_large: failed. see Traceback below*
nasnet_mobile: weights ok. "relu" type pattern missing (no url).
*)
[...]
Using TensorFlow backend.
2018-03-21 13:32:40.338924: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.8/NASNet-large.h5
359751680/359746192 [==============================] - 46s 0us/step
Image 0: Traceback (most recent call last):
File "all_methods.py", line 145, in
prob = modelp.predict_on_batch(x)[0]
File "/home/lapuschkin/.local/lib/python3.5/site-packages/Keras-2.1.5-py3.5.egg/keras/engine/training.py", line 1939, in predict_on_batch
self._feed_input_shapes)
File "/home/lapuschkin/.local/lib/python3.5/site-packages/Keras-2.1.5-py3.5.egg/keras/engine/training.py", line 123, in _standardize_input_data
str(data_shape))
ValueError: Error when checking : expected input_1 to have shape (None, 331, 331, 3) but got array with shape (1, 224, 224, 3)
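The NASNet failure is simply a hard-coded 224x224 input; NASNet-large expects 331x331. A sketch of loading images at each model's own input size (assuming channels_last; load_input is a hypothetical helper):

    from keras.preprocessing import image

    def load_input(model, path):
        # model.input_shape is (None, height, width, channels) for
        # channels_last; nasnet_large reports (None, 331, 331, 3).
        _, height, width, _ = model.input_shape
        img = image.load_img(path, target_size=(height, width))
        return image.img_to_array(img)[None]    # add the batch axis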
Write down some style guidelines for this module.
is broken and relies on code from utils.keras.graph.
Its call in LRP.__init__ therefore does not work; it still points to kgraph.is_input_layer.
There is a bug when using the BaseWrapper together with skip connections; see the test trivia.skip_connection.
This needs to be fixed for ResNets.
There is an issue when combining Inception with wrapper-based methods, e.g., SmoothGrad/Integrated Gradients.
Please post information here.
Todo:
Provide a function that exposes the reversal ordering.
I would like to introduce the epsilon parameter for the EpsilonRule classes as a passable parameter, instead of reading it from the Keras backend, e.g. for tuning layer-specific epsilons w.r.t. the number of neurons, or something like that.
Calling K.set_epsilon(eps) sets epsilon globally, i.e. it replaces the current epsilon in the Keras backend with eps.
I suppose how the reverted graph is then executed depends on the backend; e.g. tensorflow will first build the full graph according to the instructions of the Analyzer classes, then run it.
Would this mean that only the last call to K.set_epsilon(eps) would be valid and overwrite the previous ones for the graph execution, or does Keras handle the backend "just in time", so that the epsilon can be set for each layer safely?
I am asking because Keras tends to complain (when using tensorflow) when stuff from the backend is multiplied by non-backend floats or ints. It is not fully transparent to me, I must say.
Is there an argument against introducing epsilon as a regular parameter of the class constructors?
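To the backend question: as far as I can tell, K.epsilon() is read while the ops are being built, so whatever value is current at construction time is baked into the graph as a constant; later K.set_epsilon() calls do not rewrite already-built ops. Per-layer epsilons via set_epsilon would therefore only work if set before each layer's part of the graph is constructed, which is fragile. A constructor parameter avoids that entirely; a minimal sketch (the attribute name is an assumption):

    class EpsilonRule(object):
        def __init__(self, epsilon=1e-7):
            # Captured per instance at graph-build time, so each layer
            # can use its own stabilizer; no dependence on the global,
            # mutable K.set_epsilon() state.
            self.epsilon = epsilon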
Currently there are base test cases. Those base test cases iterate over a set of networks.
Issues:
Todo:
Let's find a set of cooler pictures for the imagenet example notebooks, and some where the networks typically predict the wrong label.
Test the module with Python 2.
I have added some mnist models we have used in our PLOS and JMLR papers. Currently, those models are placed on my Dropbox account.
Since keras.utils.data_utils.get_file seems to corrupt the files during download, there is a hack based on os.system("wget {}".format(urlname)) in innvestigate.utils.tests.networks.base.py:325, which is not OS-agnostic and not nice.
Moving the models to a single location would make sense, as would adding hash values for download verification. The current URLs can be found in innvestigate.utils.tests.networks.base.py:297+.
Feel free to adapt the download locations, etc., to your liking.
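For the verification part, get_file already supports hashes, which could replace the wget hack once the download corruption itself is understood; the URL and hash below are placeholders:

    from keras.utils.data_utils import get_file

    # file_hash lets get_file verify the cached file and re-download
    # on mismatch; hash_algorithm="auto" detects md5 vs sha256 from
    # the length of the given hash.
    path = get_file(
        "mnist_model.npz",
        origin="https://example.org/models/mnist_model.npz",
        file_hash="<sha256-of-the-hosted-file>",
    )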
The file https://www.dropbox.com/s/v7e0px44jqwef5k/imagenet_224_vgg_16.patterns.A_only.npz?dl=1 does not load with numpy, but old downloads work.
There is most likely a bug in Keras: run_internal_graph of the Model class does not handle masks correctly for layers with multiple outputs and no mask support. Open an issue and report it / make a pull request.
At the moment we have a workaround, but it would be more principled to fix Keras.
Many of the example scripts share the same or highly similar utility functions for pre- and postprocessing, etc.
I suggest creating an examples/utils.py that collects those functions, making the actual example scripts leaner.
Currently, as I understand it, we can pass some strings and rule classes as LRP rules.
Since a class needs to be passed (and not an object), users of the software are forced to derive their own classes for custom parameterizations, which I expect might be an unfamiliar concept for potential users without a background in software engineering.
Do you think the current pattern of passing strings or classes can be relaxed to also allow pre-parameterized rule object instances alongside strings and rule classes? See the sketch below.
Also, a question regarding testing: has the software been tested with a list of rules and use_conditions=False?
I am asking since the parameter input_layer_rule adds a BoundedRule object to the beginning of the list.
What happens if len(rules) != len(layers of the model) in this case?
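A sketch of the relaxed dispatch (pure illustration; LRP_RULES as in relevance_based.py):

    def resolve_rule(rule):
        # Accept a registry string, a rule class, or an already
        # parameterized rule instance.
        if isinstance(rule, str):
            rule = LRP_RULES[rule]    # registry lookup, as now
        if isinstance(rule, type):
            rule = rule()             # default-construct a rule class
        return rule                   # instances pass through unchanged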
I am currently looking at WSquareRule.apply and see:

    def apply(self, Xs, Ys, Rs, reverse_state):
        grad = ilayers.GradientWRT(len(Xs))
        # Create dummy forward path to take the derivative below.
        Ys = kutils.apply(self._layer_wo_act_b, Xs)  #?
        # Compute the sum of the squared weights.
        ones = ilayers.OnesLike()(Xs)
        Zs = iutils.to_list(self._layer_wo_act_b(ones))
        # Weight the incoming relevance.
        tmp = [ilayers.SafeDivide()([a, b]) for a, b in zip(Rs, Zs)]
        # Redistribute the relevances along the gradient.
        tmp = iutils.to_list(grad(Xs+Ys+Rs))
        return tmp
Line 4, marked with #?: Is this correct? Is this forward pass not redundant, and could it be removed by refactoring the function to:
    def apply(self, Xs, Ys, Rs, reverse_state):
        grad = ilayers.GradientWRT(len(Xs))
        ones = ilayers.OnesLike()(Xs)
        Zs = kutils.apply(self._layer_wo_act_b, ones)  #?
        # Weight the incoming relevance.
        tmp = [ilayers.SafeDivide()([a, b]) for a, b in zip(Rs, Zs)]
        # Redistribute the relevances along the gradient.
        tmp = iutils.to_list(grad(Xs+Zs+Rs))  #?
        return tmp
Cheers,