albermax / innvestigate
A toolbox to iNNvestigate neural networks' predictions!
License: Other
Review the package structure and decide where methods best belong, especially in the utils module.
Is there any convenient way to include numerical checks in the code, similar to the numpy.testing.assert_* functions? I cannot seem to find anything.
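One option (a sketch, not an established convention in this repo): evaluate the Keras tensors to numpy arrays and reuse numpy.testing directly. K.eval only covers tensors that can be computed without feeding placeholders; anything input-dependent would need a K.function first.

    import numpy as np
    import keras.backend as K

    def assert_allclose_tensor(actual, desired, rtol=1e-5, atol=1e-8):
        # Evaluate the symbolic tensor to a numpy array, then let
        # numpy.testing do the comparison and error reporting.
        np.testing.assert_allclose(K.eval(actual), desired, rtol=rtol, atol=atol)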
Check the license, especially whether each file needs a header.
Fix plotting. In particular, the font sizes should be correct for mnist and imagenet. See the examples.
I have generated some pre-defined conditional rules for LRP, such as LRPCompositeA and LRPCompositeAFlat in relevance_based.py:1091+.
As soon as the parameter input_layer_rule is set for the constructor, it either crashes (if it is a string) or kcheck.is_input_layer returns True for all layers.
LRPCompositeAFlat is defined in relevance_based.py:1170 and runs, yet applies the flat rule to all layers. This analyzer has been activated in imagenet_lrp.py:130 in branch feature/input_layer_rule. Check out that branch, run imagenet_lrp.py, and observe the debug prints in the terminal output, which show the mappings from layer object to rule object as returned by select_rule in relevance_based.py.
LRPCompositeBFlat is defined in relevance_based.py:1180 and passes the input_layer_rule parameter as a string, which causes a crash. To reproduce that problem, activate the analyzer in imagenet_lrp.py:131.
Stack trace regarding that crash:
Traceback (most recent call last):
File "/home/lapuschkin/Desktop/innvestigate-lrp-wip/innvestigate/examples/imagenet_lrp.py", line 165, in <module>
a = analyzer.analyze(image if is_input_analyzer else x)
File "/home/lapuschkin/Desktop/innvestigate-lrp-wip/innvestigate/innvestigate/analyzer/base.py", line 373, in analyze
self.compile_analyzer()
File "/home/lapuschkin/Desktop/innvestigate-lrp-wip/innvestigate/innvestigate/analyzer/base.py", line 338, in compile_analyzer
tmp = self._create_analysis(model)
File "/home/lapuschkin/Desktop/innvestigate-lrp-wip/innvestigate/innvestigate/analyzer/base.py", line 497, in _create_analysis
return_all_reversed_tensors=return_all_reversed_tensors)
File "/home/lapuschkin/Desktop/innvestigate-lrp-wip/innvestigate/innvestigate/utils/keras/graph.py", line 594, in reverse_model
"layer": layer,
File "/home/lapuschkin/Desktop/innvestigate-lrp-wip/innvestigate/innvestigate/analyzer/relevance_based.py", line 844, in __init__
rule_class = LRP_RULES[rule]
TypeError: unhashable type: 'list'
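For reference, the TypeError arises because the whole rule argument (here a list) is used as a key into the LRP_RULES dict. A minimal sketch of a guard; the recursion into lists is an assumption about the intended semantics, not existing code:

    def lookup_rule(rule):
        # Only strings are valid keys into the LRP_RULES registry; a
        # list of rules has to be resolved per element, otherwise the
        # dict lookup raises "unhashable type: 'list'".
        if isinstance(rule, list):
            return [lookup_rule(r) for r in rule]
        if isinstance(rule, str):
            return LRP_RULES[rule]
        return rule  # already a rule class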
I will not touch the code until Monday, so have at it.
Have a nice weekend!
Create a README. It should contain:
Add a pixel-flipping evaluation in the tools submodule; a sketch of the idea follows below.
Constructor inputs:
Multi-GPU?
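For reference, a minimal sketch of what such an evaluation could compute; the function name, parameters, and flipping strategy here are assumptions, not a final API:

    import numpy as np

    def pixel_flipping(model, analyzer, x, steps=100, flip_value=0.0):
        # Score of the originally predicted class before any flipping.
        probs = model.predict_on_batch(x)[0]
        target = probs.argmax()
        scores = [probs[target]]
        # Rank input positions by relevance, most relevant first.
        relevance = analyzer.analyze(x)
        order = np.argsort(-np.abs(relevance).ravel())
        x = x.copy()
        flat = x.ravel()  # a view into the copy
        for idx in order[:steps]:
            # "Flip" the next most relevant input and record how the
            # prediction for the original class degrades.
            flat[idx] = flip_value
            scores.append(model.predict_on_batch(x)[0][target])
        return np.asarray(scores)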
Use keras.applications vgg16 and then add the other keras.applications networks. Problems:
For large networks, initializing the revert functions is pretty slow (ca. 1 s for dense layers); when using multiple methods and networks with many layers, this makes everything a bit slow. Let's have a look.
Could it be that you committed this folder to your feature branch by mistake, Sebastian?
How come the instantiation times for the analyzer objects are so high compared to the execution times for given images, once the setup is done?
Example for running the first three images in mnist_lrp.py:
Image 0: Input (0.0230s) Gradient*Input (0.1160s) LRP-Z (0.2516s) LRP-Z-IB (0.1634s) LRP-Epsilon (0.1742s) LRP-Epsilon-IB (0.1694s) LRP-W-Square (0.1779s) LRP-Flat (0.1840s) LRP-A2B1 (0.6371s) LRP-A2B1-IB (0.5809s) LRP-A1B0 (0.5755s) LRP-A1B0-IB (0.5152s) LRP-ZPlus (0.5752s) LRP-ZPlusFast (0.3181s)
Image 1: Input (0.0005s) Gradient*Input (0.0016s) LRP-Z (0.0014s) LRP-Z-IB (0.0018s) LRP-Epsilon (0.0018s) LRP-Epsilon-IB (0.0016s) LRP-W-Square (0.0017s) LRP-Flat (0.0016s) LRP-A2B1 (0.0027s) LRP-A2B1-IB (0.0024s) LRP-A1B0 (0.0024s) LRP-A1B0-IB (0.0019s) LRP-ZPlus (0.0021s) LRP-ZPlusFast (0.0018s)
Image 2: Input (0.0005s) Gradient*Input (0.0014s) LRP-Z (0.0018s) LRP-Z-IB (0.0015s) LRP-Epsilon (0.0017s) LRP-Epsilon-IB (0.0016s) LRP-W-Square (0.0013s) LRP-Flat (0.0014s) LRP-A2B1 (0.0026s) LRP-A2B1-IB (0.0027s) LRP-A1B0 (0.0024s) LRP-A1B0-IB (0.0020s) LRP-ZPlus (0.0024s) LRP-ZPlusFast (0.0019s)
[...]
The same for imagenet_lrp.py; the model is VGG16:
Image 0: Input (0.0193s) Gradient*Input (0.3270s) LRP-Z (2.0532s) LRP-Z-IB (1.9581s) LRP-Epsilon (2.9357s) LRP-Epsilon-IB (3.3425s) LRP-W-Square (3.3343s) LRP-Flat (3.5949s) LRP-A2B1 (13.3973s) LRP-A2B1-IB (15.5360s) LRP-A1B0 (21.1701s) LRP-A1B0-IB (21.2032s) LRP-ZPlus (23.4144s) LRP-ZPlusFast (12.3629s)
Image 1: Input (0.0007s) Gradient*Input (0.0152s) LRP-Z (0.0228s) LRP-Z-IB (0.0224s) LRP-Epsilon (0.0236s) LRP-Epsilon-IB (0.0229s) LRP-W-Square (0.0230s) LRP-Flat (0.0230s) LRP-A2B1 (0.0651s) LRP-A2B1-IB (0.0637s) LRP-A1B0 (0.0375s) LRP-A1B0-IB (0.0360s) LRP-ZPlus (0.0358s) LRP-ZPlusFast (0.0214s)
Image 2: Input (0.0010s) Gradient*Input (0.0147s) LRP-Z (0.0216s) LRP-Z-IB (0.0215s) LRP-Epsilon (0.0220s) LRP-Epsilon-IB (0.0217s) LRP-W-Square (0.0209s) LRP-Flat (0.0207s) LRP-A2B1 (0.0605s) LRP-A2B1-IB (0.0603s) LRP-A1B0 (0.0361s) LRP-A1B0-IB (0.0350s) LRP-ZPlus (0.0348s) LRP-ZPlusFast (0.0213s)
[...]
Once the analyzer (or rather its computational graph?) has been built, execution times are low. What is causing this? Is it the copying of the layer parameters? Can we minimize that time?
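Until this is resolved, the practical pattern is to pay the construction cost once per analyzer and reuse it across images (a sketch; model and images are assumed to be defined, and "lrp.z" is one of the registered method names):

    import innvestigate

    # Build each analyzer once; the expensive graph reversal happens
    # at setup, while analyze() only executes the finished graph.
    analyzer = innvestigate.create_analyzer("lrp.z", model)
    for x in images:
        a = analyzer.analyze(x)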
A markdown file that describes the current analyzers and functions and their parameters.
It is unclear to me how to call this script.
Please add a more transparent example call.
Being mostly familiar with py2, Java, C#, and ..., I am used to making super-calls in constructor definitions before doing anything else; in some languages you are even forced to do so.
In this software I frequently see a lot of class setup code which is then finalized by a superclass constructor call. Is there a particular, practical reason to do so, or is it just personal choice?
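For illustration, a contrived sketch of the practical reason this ordering can be necessary (Base, Analyzer, and _model_checks are invented names, not the actual classes): the base constructor runs template-method hooks that consume state the subclass must prepare first.

    class Base(object):
        def __init__(self, model):
            self._model = model
            # Template-method style: the base constructor already runs
            # hooks that read state prepared by the subclass.
            self._run_model_checks()

        def _run_model_checks(self):
            for check in getattr(self, "_model_checks", []):
                check(self._model)

    class Analyzer(Base):
        def __init__(self, model):
            # Setup has to happen *before* the super call, because
            # Base.__init__ consumes _model_checks right away.
            self._model_checks = [lambda m: None]
            super(Analyzer, self).__init__(model)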
Add an interface to the analyzers that works with a generator; see the sketch below.
Needed for pixel-flipping.
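A minimal sketch of such an interface; the helper name and signature are assumptions:

    def analyze_generator(analyzer, batches):
        # Lazily run an already-built analyzer over batches coming
        # from a generator, e.g. perturbed inputs for pixel-flipping.
        for x in batches:
            yield analyzer.analyze(x)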
Add the necessary code documentation, especially for the graph reversal functionality.
This happens, e.g., when executing mnist_all_methods.py.
The Matplotlib backend should be switchable.
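A sketch of the standard switch; how to expose it (option or environment variable) is left open:

    import matplotlib
    # Must run before pyplot is imported; "Agg" renders headlessly,
    # which also helps when running the examples over ssh.
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt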
We use (IPy)notebooks as the main way to introduce people to the API. "Learning by doing."
The following notebooks and contents should be part of the first release:
A) [core reversal code] Walk through the idea of inverting the graph, show how to implement the gradient by inverting each layer, then a slightly more advanced example showing how to create deconvnet and guided backprop.
B) [mnist complete workflow] The same as the mnist example right now, just with more comments on how things work. The code should also be structured more nicely, i.e., not using those functions, but proceeding in a step-by-step fashion.
C) [imagenet application] A step-by-step example with imagenet, again similar to all_methods.py. The scope here is to reuse the patterns and let people know how they can use our applications submodule.
D) [imagenet network comparison] Same as C, only the outcome should be a grid with n networks as rows and n methods as columns. The code can and will be a bit messier, as different networks need different preprocessing functions and image sizes. Overall it should be doable, as the key information is already present in the dictionary returned by innvestigate.applications.
E) [imagenet pattern training] Similar to the current train_***.py. Focus on how to train patterns for a large network. In the best case this example shows how to train with several GPUs (i.e., setting the parameter gpus=X).
F) [perturbation mnist] An example notebook on how to use perturbation analysis with mnist that everybody can run.
G) [perturbation imagenet] An example notebook on how to use perturbation analysis with imagenet.
H) [LRP/DT intro] Sebastian might want to add an LRP/DT notebook.
I think if a working Python script exists first, it is easy to create the notebook. The advantage would be that one can run the examples easily via ssh. The drawback is that the code is duplicated, and when it changes one needs to change two places.
Add missing backends to /innvestigate/utils/keras/backend.py. Test them.
In examples/all_methods.py, some methods use the postprocessing option graymap and others heatmap.
Would it not be fairer, and nicer to look at, to use the same color mapping scheme for all those methods?
Add the needed packages to the setup script. Test by installing the package into an empty venv.
Known needed modules: pillow, numpy, scipy, keras.
Note: it should be up to the user which Keras backend they install.
https://github.com/albermax/innvestigate_paper is public. Is this intended?
I have found a bug in the code for the AlphaBeta rules and also the ZPlus rule.
The current code creates copies of the layer's weights and thresholds them at zero into positive and negative weight arrays.
However, AlphaBeta and also ZPlus require the forward transformations to be thresholded at zero, which, with the current implementation, is only equivalent if the layer inputs x are greater than or equal to zero.
For a correct implementation of both rules (ZPlus can inherit from AlphaBeta and realize alpha=1, beta=0), it would suffice to copy the layer's weights once, but then compute two forward passes whose contributions are thresholded. Does this interfere with the computation of the backward gradient?
Also, to correctly consider the positive and negative forward contributions originating from the bias, the bias needs to be thresholded as well.
Please have a look at [this](https://github.com/sebastian-lapuschkin/lrp_toolbox/blob/python-wip/python/modules/linear.py), lines 219+, which implements the rules using numpy.
Can this (conveniently) be fixed without negatively interfering with the gradient computations?
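For a dense layer, a minimal numpy sketch of the corrected computation, following the lrp_toolbox reference linked above and the alpha - beta = 1 convention (e.g. LRP-A2B1); the stabilizer eps is an assumption:

    import numpy as np

    def alphabeta_dense(X, W, b, R, alpha=2.0, beta=1.0, eps=1e-12):
        # Threshold the forward *contributions* x_i * w_ij at zero,
        # not just the weights; this stays correct for negative inputs.
        Z = X[:, None] * W                          # shape (n_in, n_out)
        Zp, Zn = np.maximum(Z, 0.0), np.minimum(Z, 0.0)
        # The bias must be split by sign as well, since it enters the
        # positive/negative pre-activations in the denominators.
        Np = Zp.sum(axis=0) + np.maximum(b, 0.0) + eps
        Nn = Zn.sum(axis=0) + np.minimum(b, 0.0) - eps
        Rp = np.dot(Zp / Np, R)     # redistribution via positive parts
        Rn = np.dot(Zn / Nn, R)     # redistribution via negative parts
        # ZPlus is the special case alpha=1, beta=0.
        return alpha * Rp - beta * Rn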
Create a module with the same/subset of the networks in keras.applications.
In some cases, saliency maps can become very sparse, with only a few very local peaks dominating the normalization of the saliency map into RGB space via a color map.
What do you think about an option to apply gamma scaling, to make weak responses more visible?
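A small sketch of the idea, applied before the colormap; gamma < 1 lifts weak responses, and the default of 0.5 is just an example:

    import numpy as np

    def gamma_scale(a, gamma=0.5):
        # Sign-symmetric range compression: weak relevances survive
        # the colormap normalization next to a few dominant peaks.
        a = a / np.max(np.abs(a))       # normalize to [-1, 1]
        return np.sign(a) * np.abs(a) ** gamma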
Implement them in Keras. Useful, e.g., when used with pixel-flipping. Let's go all Keras.
For some models, loading (partially) fails. See below:
vgg16: ok
resnet50: ok
inception_v3: weights ok. "relu" type pattern missing (no url).
inception_resnet_v2: weights ok. "relu" type pattern missing (no url).
densenet121: weights ok. "relu" type pattern missing (no url).
densenet169: ok
densenet201: weights ok. "relu" type pattern missing (no url).
nasnet_large: failed. see Traceback below*
nasnet_mobile: weights ok. "relu" type pattern missing (no url).
*)
[...]
Using TensorFlow backend.
2018-03-21 13:32:40.338924: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.8/NASNet-large.h5
359751680/359746192 [==============================] - 46s 0us/step
Image 0: Traceback (most recent call last):
File "all_methods.py", line 145, in
prob = modelp.predict_on_batch(x)[0]
File "/home/lapuschkin/.local/lib/python3.5/site-packages/Keras-2.1.5-py3.5.egg/keras/engine/training.py", line 1939, in predict_on_batch
self._feed_input_shapes)
File "/home/lapuschkin/.local/lib/python3.5/site-packages/Keras-2.1.5-py3.5.egg/keras/engine/training.py", line 123, in _standardize_input_data
str(data_shape))
ValueError: Error when checking : expected input_1 to have shape (None, 331, 331, 3) but got array with shape (1, 224, 224, 3)
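The NASNet failure is simply a hard-coded 224x224 input; NASNet-large expects 331x331. A sketch of loading images at each model's own input size (assuming channels_last; load_input is a hypothetical helper):

    from keras.preprocessing import image

    def load_input(model, path):
        # model.input_shape is (None, height, width, channels) for
        # channels_last; nasnet_large reports (None, 331, 331, 3).
        _, height, width, _ = model.input_shape
        img = image.load_img(path, target_size=(height, width))
        return image.img_to_array(img)[None]    # add the batch axis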
Write down some style guidelines for this module.
is broken and relies on code from utils.keras.graph.
Its call in LRP.__init__ therefore does not work; it still points to kgraph.is_input_layer.
There is a bug when using the BaseWrapper together with skip connections; see the test trivia.skip_connection.
This needs to be fixed for ResNets.
There is an issue when combining Inception with wrapper-based methods, e.g., SmoothGrad/Integrated Gradients.
Please post information here.
Todo:
Provide a function that exposes the reversal ordering.
I would like to introduce the epsilon parameter for the EpsilonRule classes as a passable parameter, instead of reading it from the Keras backend, e.g. for tuning layer-specific epsilons w.r.t. the number of neurons, or something like that.
Calling K.set_epsilon(eps) sets epsilon globally, i.e. it replaces the current epsilon in the Keras backend with eps.
I suppose how the reverted graph is then executed depends on the backend; e.g. tensorflow will first build the full graph according to the instructions of the Analyzer classes, then run it.
Would this mean that only the last call to K.set_epsilon(eps) would be valid and overwrite the previous ones for the graph execution, or does Keras handle the backend "just in time", so that the epsilon can be set for each layer safely?
I am asking because Keras tends to complain (when using tensorflow) when stuff from the backend is multiplied by non-backend floats or ints. It is not fully transparent to me, I must say.
Is there an argument against introducing epsilon as a regular parameter of the class constructors?
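To the backend question: as far as I can tell, K.epsilon() is read while the ops are being built, so whatever value is current at construction time is baked into the graph as a constant; later K.set_epsilon() calls do not rewrite already-built ops. Per-layer epsilons via set_epsilon would therefore only work if set before each layer's part of the graph is constructed, which is fragile. A constructor parameter avoids that entirely; a minimal sketch (the attribute name is an assumption):

    class EpsilonRule(object):
        def __init__(self, epsilon=1e-7):
            # Captured per instance at graph-build time, so each layer
            # can use its own stabilizer; no dependence on the global,
            # mutable K.set_epsilon() state.
            self.epsilon = epsilon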
Currently there are base test cases. Those base test cases iterate over a set of networks.
Issues:
Todo:
Let's find a set of cooler pictures for the imagenet example notebooks, and some where the networks typically predict the wrong label.
Test the module with Python 2.
I have added some mnist models we have used in our PLOS and JMLR papers. Currently, those models are placed on my Dropbox account.
Since keras.utils.data_utils.get_file seems to corrupt the files during download, there is a hack based on os.system("wget {}".format(urlname)) in innvestigate.utils.tests.networks.base.py:325, which is not OS-agnostic and not nice.
Moving the models to a single location would make sense, as would adding hash values for download verification. The current URLs can be found in innvestigate.utils.tests.networks.base.py:297+.
Feel free to adapt the download locations, etc., to your liking.
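For the verification part, get_file already supports hashes, which could replace the wget hack once the download corruption itself is understood; the URL and hash below are placeholders:

    from keras.utils.data_utils import get_file

    # file_hash lets get_file verify the cached file and re-download
    # on mismatch; hash_algorithm="auto" detects md5 vs sha256 from
    # the length of the given hash.
    path = get_file(
        "mnist_model.npz",
        origin="https://example.org/models/mnist_model.npz",
        file_hash="<sha256-of-the-hosted-file>",
    )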
The file https://www.dropbox.com/s/v7e0px44jqwef5k/imagenet_224_vgg_16.patterns.A_only.npz?dl=1 does not load with numpy, but old downloads work.
There is most likely a bug in Keras: run_internal_graph of the Model class does not handle masks correctly for layers with multiple outputs and no mask support. Open an issue and report it / make a pull request.
At the moment we have a workaround, but it would be more principled to fix Keras.
Many of the example scripts share the same or highly similar utility functions for pre- and postprocessing, etc.
I suggest creating an examples/utils.py that collects those functions, making the actual example scripts leaner.
Currently, as I understand it, we can pass some strings and rule classes as LRP rules.
Since a class needs to be passed (and not an object), users of the software are forced to derive their own classes for custom parameterizations, which I expect might be an unfamiliar concept for potential users without a background in software engineering.
Do you think the current pattern of passing strings or classes can be relaxed to also allow pre-parameterized rule object instances alongside strings and rule classes? See the sketch below.
Also, a question regarding testing: has the software been tested with a list of rules and use_conditions=False?
I am asking since the parameter input_layer_rule adds a BoundedRule object to the beginning of the list.
What happens if len(rules) != len(layers of the model) in this case?
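A sketch of the relaxed dispatch (pure illustration; LRP_RULES as in relevance_based.py):

    def resolve_rule(rule):
        # Accept a registry string, a rule class, or an already
        # parameterized rule instance.
        if isinstance(rule, str):
            rule = LRP_RULES[rule]    # registry lookup, as now
        if isinstance(rule, type):
            rule = rule()             # default-construct a rule class
        return rule                   # instances pass through unchanged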
I am currently looking at WSquareRule.apply and see:

    def apply(self, Xs, Ys, Rs, reverse_state):
        grad = ilayers.GradientWRT(len(Xs))
        # Create dummy forward path to take the derivative below.
        Ys = kutils.apply(self._layer_wo_act_b, Xs)  #?
        # Compute the sum of the squared weights.
        ones = ilayers.OnesLike()(Xs)
        Zs = iutils.to_list(self._layer_wo_act_b(ones))
        # Weight the incoming relevance.
        tmp = [ilayers.SafeDivide()([a, b]) for a, b in zip(Rs, Zs)]
        # Redistribute the relevances along the gradient.
        tmp = iutils.to_list(grad(Xs+Ys+Rs))
        return tmp
Line 4, marked with #?: Is this correct? Is this forward pass not redundant, and could it be removed by refactoring the function to:
    def apply(self, Xs, Ys, Rs, reverse_state):
        grad = ilayers.GradientWRT(len(Xs))
        ones = ilayers.OnesLike()(Xs)
        Zs = kutils.apply(self._layer_wo_act_b, ones)  #?
        # Weight the incoming relevance.
        tmp = [ilayers.SafeDivide()([a, b]) for a, b in zip(Rs, Zs)]
        # Redistribute the relevances along the gradient.
        tmp = iutils.to_list(grad(Xs+Zs+Rs))  #?
        return tmp
Cheers,