distillpub / post--attribution-baselines

The repository for the submission "Visualizing the Impact of Feature Attribution Baselines"

License: Other

HTML 0.72% JavaScript 1.38% TeX 0.10% Python 0.50% Jupyter Notebook 97.25% Shell 0.05% CSS 0.01%
machine-learning interpretable-machine-learning data-visualization d3 distillpub

post--attribution-baselines's Introduction

Post -- Exploring Bayesian Optimization

Breaking Bayesian Optimization into small, sizable chunks.

To view the rendered version of the post, visit: https://distill.pub/2020/bayesian-optimization/

Authors

Apoorv Agnihotri and Nipun Batra (both IIT Gandhinagar)

Offline viewing

Open public/index.html in your browser.

NB - the citations may not appear correctly in the offline render

post--attribution-baselines's People

Contributors

bentyeh, colah, ncammarata, psturmfels, yyyliu


post--attribution-baselines's Issues

Confusion about the Riemann Integral of expected gradients.

Hello,
In the section "Averaging Over Multiple Baselines", expected gradients is first defined as a double integral: the outer integral is an expectation over D, and the inner integral runs over the path between x' and x. Then, because expected gradients belongs to the "path attribution methods", it is computed as a one-dimensional Riemann integral.

In the original paper, path methods are defined as integrating the gradients along a smooth path from x' to x. Expected gradients is an average over multiple path methods, but those paths have different starting points x'. Why, then, can it be described as "integrating gradients over one or more paths between two valid inputs" when more than two inputs are involved? And why can it be computed as a one-dimensional Riemann integral over a single path?

My second confusion stems from figure (4) in "Using the Training Distribution": what does the blue line's value represent in that figure? In "A Better Understanding of Integrated Gradients", the blue line in figure (4) represents |f(x) - f(x')|, but in the training-distribution formula for expected gradients, x' is not a constant, so why does the blue line sit at a constant value? If it is the average of f(x'), it would be reasonable for the blue line's value to change as k increases.

Moreover, the gradient function is clearly not linear. In the Riemann integral, the differential part converges to x - E(x'), but the function part never converges to the gradient evaluated at x - E(x'), so I am not sure whether expected gradients really converges to a constant value or to a curve.
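
For concreteness, here is my understanding of the two quantities being compared (my notation, which may differ slightly from the article's):

$$
\text{IntegratedGrads}_i(x, x') = (x_i - x'_i) \int_0^1 \frac{\partial f(x' + \alpha(x - x'))}{\partial x_i}\, d\alpha
$$

$$
\text{ExpectedGrads}_i(x)
= \mathbb{E}_{x' \sim D}\big[\text{IntegratedGrads}_i(x, x')\big]
= \mathbb{E}_{x' \sim D,\ \alpha \sim U(0,1)}\left[(x_i - x'_i)\, \frac{\partial f(x' + \alpha(x - x'))}{\partial x_i}\right]
$$

The expectation over D is what makes this a double integral, while each term inside it is still a one-dimensional integral along a single straight-line path from one particular x' to x, which is the reduction I am asking about.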

Docs -> Public

Hi,

I'm on the Distill team and I'm looking to put your article on Distill's staging server so it's ready to go! Unfortunately, our setup currently uses public for rendering the static assets instead of docs, which gh-pages uses. We're looking at allowing configurable folders in the future, but since we don't have that yet, would you mind changing the folder?

Thanks!
Nick

Review #1

The following peer review was solicited as part of the Distill review process.

The reviewer chose to waive anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.

Distill is grateful to Ruth Fong for taking the time to review this article.


General Comments

  • more examples beyond the 4 that are used throughout would be appreciated (i.e., in the last figure, consider adding more examples; why does the owl example look "better" for integrated gradients than for expected gradients?)
  • how does expected gradients stand up to other desiderata for interpretability (i.e., Sanity Checks [Adebayo et al., NeurIPS 2018]; Lipton, ICML Workshop 2016)?
  • provide more explanation for the sum of cumulative gradients for expected gradients (i.e., why is it desirable that the red line is close to the blue line? what does that mean?)

Small suggestions to improve readability:

  1. increase size of labels on diagrams (i.e., slider ticklabels for alpha are unreadable, y-axis ticklabels on the eq 4 graph are unreadable)
  2. add a bit more explanation around the figure of the first 4 equations (in particular, an explanation of the red line [eq 4] -- is this a mean or a sum? highlight in the caption more clearly that the red line is the sum of the cumulative gradient at the current alpha over all pixels)
  3. provide a brief explanation (i.e., a footnote) for how a scalar is extracted per feature [i.e., pixel] given the 3D RGB vector per feature (e.g., is max(abs(dy/dx)) taken across color channels, as is done in Simonyan et al., 2014?)
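
For example, the kind of reduction I have in mind (from Simonyan et al., 2014) looks roughly like the following sketch; the array name and shape are placeholders for illustration:

import numpy as np

# dy_dx: gradient of the class score w.r.t. the input image, shape (H, W, 3)
dy_dx = np.random.randn(224, 224, 3)     # placeholder values for illustration

# One scalar per pixel: the maximum absolute gradient across color channels
saliency = np.abs(dy_dx).max(axis=-1)    # shape (H, W)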

Main weakness regarding "Scientific Correctness & Integrity" is a lacking discussion of related work and limitations:

  • missing discussion with other highly related literature: SmoothGrad [Smilkov et al., arXiv 2017] and RISE [Petsiuk et al., BMVC 2018]
  • should briefly discuss that the inputs being presented (interpolations between two images) are outside the training domain
  • generally missing citations and mention of other kinds of attribution methods besides path ones
  • room to improve discussion on single input choice (what about other typical choices for the baseline value besides a constant color, such as random noise or blurred input [Fong and Vedaldi, 2017])
  • to improve reproducibility, having a "repro in colab notebook" button for at least one of the figures would be nice to have

Distill employs a reviewer worksheet as a help for reviewers.

The first three parts of this worksheet ask reviewers to rate a submission along certain dimensions on a scale from 1 to 5. While the scale meaning is consistently "higher is better", please read the explanations for our expectations for each score—we do not expect even exceptionally good papers to receive a perfect score in every category, and expect most papers to be around a 3 in most categories.

Any concerns or conflicts of interest that you are aware of?: No known conflicts of interest
What type of contributions does this article make?: Both explanation of existing methods (i.e., integrated gradients) and presentation of novel method (i.e., expected gradients)

Advancing the Dialogue Score
How significant are these contributions? 3/5
Outstanding Communication Score
Article Structure 3/5
Writing Style 4/5
Diagram & Interface Style 3/5
Impact of diagrams / interfaces / tools for thought? 3/5
Readability 4/5
Scientific Correctness & Integrity Score
Are claims in the article well supported? 3/5
Does the article critically evaluate its limitations? How easily would a lay person understand them? 1/5
How easy would it be to replicate (or falsify) the results? 3/5
Does the article cite relevant work? 2/5
Does the article exhibit strong intellectual honesty and scientific hygiene? 2/5

Incorrect derivative of function w.r.t. parameterized argument alpha

This footnote is not quite right.

<d-footnote>That is, if we integrate over the
straight-line between \(x'\) and \(x\), which
we can represent as \(\gamma(\alpha) =
x' + \alpha(x - x')\), then:
$$
\frac{\delta f(\gamma(\alpha))}{\delta \alpha} =
\frac{\delta f(\gamma(\alpha))}{\delta \gamma(\alpha)} \times
\frac{\delta \gamma(\alpha)}{\delta \alpha} =
\frac{\delta f(x' + \alpha' (x - x'))}{\delta x_i} \times (x_i - x'_i)
$$
The difference from baseline term is the derivative of the
path function \(\gamma\) with respect to \(\alpha\).

\(\gamma(\alpha)\) is a vector, so by the multivariable chain rule,

$$
\frac{\delta f(\gamma(\alpha))}{\delta \alpha}
= \sum_i \frac{\delta f(\gamma(\alpha))}{\delta \gamma_i} \times  
  \frac{\delta \gamma_i(\alpha)}{\delta \alpha}
= \sum_i \frac{\delta f(x' + \alpha' (x - x'))}{\delta x_i} \times (x_i - x'_i)
$$
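
A minimal numerical check of the corrected identity (a toy example I put together; the function f and the dimensions are arbitrary, and only NumPy finite differences are used):

import numpy as np

rng = np.random.default_rng(0)
d = 5
x_prime = rng.normal(size=d)                 # baseline x'
x = rng.normal(size=d)                       # input x
w = rng.normal(size=d)                       # weights of a toy nonlinear f

def f(z):
    return np.tanh(w @ z)                    # scalar-valued f: R^d -> R

def gamma(alpha):
    return x_prime + alpha * (x - x_prime)   # straight-line path from x' to x

eps, alpha = 1e-6, 0.3

# Left-hand side: d f(gamma(alpha)) / d alpha, by central differences
lhs = (f(gamma(alpha + eps)) - f(gamma(alpha - eps))) / (2 * eps)

# Right-hand side: sum over i of [df/dx_i at gamma(alpha)] * (x_i - x'_i)
grad = np.array([(f(gamma(alpha) + eps * e) - f(gamma(alpha) - eps * e)) / (2 * eps)
                 for e in np.eye(d)])
rhs = np.sum(grad * (x - x_prime))

print(lhs, rhs)   # the two values agree up to finite-difference error

Without the sum over i, the right-hand side would be a vector rather than the scalar derivative on the left.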

Review #2

The following peer review was solicited as part of the Distill review process.

The reviewer chose to keep anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.


General Comments

Paper summary:

Several feature attribution methods rely on an additional input (besides the one being explained) called the “baseline”. The paper discusses how the choice of baseline impacts the attributions for an input, and proposes the idea of averaging over several baselines when good individual choices do not exist. It does this in the context of the specific attribution method called “Integrated Gradients” and the specific task of object recognition on the ImageNet dataset.

Pros:

  • The paper is very well-written and easy to follow. It offers a very nice exposition of the Integrated Gradients method. The interactive visualizations immensely help with understanding the various ideas.
  • The paper tackles the important and thorny issue of picking baselines in feature attribution methods. The visualization that allows choosing different segments of the input image as a baseline is very clever. It makes the sensitivity of the attributions to the choice of baselines very apparent.

Cons:

  • The paper views the baseline as a mere implementation detail of Integrated Gradients (and other feature attribution methods). This is a bit misleading. The Integrated Gradients paper considers the baseline to be a part of the attribution problem statement. The various axioms are also defined for the pair of input and baseline. In that sense, Integrated Gradients posits that one must commit to a baseline while formulating the attribution problem.
  • It would help to have more discussion of the properties of Expected Gradients (and, more generally, of the idea of “averaging over baselines”). It is also not clear whether one must simply average the attributions across different baselines. Instead, one may study the distribution over attributions to identify different patterns, say via clustering. (See the next section for more suggestions.)

Suggestions:

Below are some suggestions on improving / extending this paper:

  • The idea of averaging over several baselines seems quite general, and so the paper could be greatly strengthened by including an additional example (preferably for a task on text or tabular inputs)
  • It would help to discuss which axioms Expected Gradients satisfies. Is there a new completeness axiom to tell us that we have taken enough background samples?
  • Computing Expected Gradients involves computing the average attribution relative to a random sample of baseline points. The sampling brings uncertainty, and I wonder if the authors considered quantifying that uncertainty with confidence intervals? (A rough sketch of what I have in mind follows after this list.)
  • An attractive property of the black baseline is that it is encoded as zero, and therefore it is clear how to interpret the sign of the attribution — positive attribution means that the model prefers the pixel to be brighter. If the baseline is non-zero then the sign of the attribution is harder to interpret. A positive attribution would mean that the model prefers for the pixel to move away from the baseline. This may mean making the pixel brighter or darker depending on which side of the baseline the pixel lies. The problem is exacerbated when several different baselines are considered. It would help if the authors comment on interpreting the sign of the attributions.
  • While the formalism discussed in the paper assumes a certain input distribution D, in practice, we only have a certain sample of the distribution. Often the sample may not be representative. In such cases, I worry that artifacts of the sample may creep into the Expected Gradients. It would help if the authors comment on this.
  • When considering multiple baselines, it could be that the attribution to a pixel is positive for some baselines and negative for others, and the average attribution ends up being near zero. In such cases, I wonder whether the expectation is the right summarization of the distribution of attributions across different baselines. Instead, one could consider clustering the attributions (from different baselines) to separate the different patterns at play.
  • The idea of averaging gradients across a sample of points is also used by SmoothGrad (https://arxiv.org/abs/1706.03825). Is there a formal connection between Expected Gradients and SmoothGrad?
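
To make the uncertainty suggestion above concrete, here is a rough sketch of the kind of estimator I have in mind. This is not the authors' implementation; the model f, its TensorFlow interface, and all names here are assumptions, and f is taken to map a batch of inputs to the scalar score of the class being explained.

import numpy as np
import tensorflow as tf

def expected_gradients_samples(f, x, baselines, n_samples=200, seed=0):
    # Returns per-sample single-baseline attributions; their mean is the
    # expected-gradients estimate, and their spread gives a confidence interval.
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n_samples):
        x_prime = baselines[rng.integers(len(baselines))]    # random baseline from the data
        alpha = rng.uniform()                                # random point on the path
        point = tf.convert_to_tensor(x_prime + alpha * (x - x_prime), dtype=tf.float32)
        with tf.GradientTape() as tape:
            tape.watch(point)
            y = f(point[tf.newaxis, ...])[0]                 # scalar score of the explained class
        grad = tape.gradient(y, point).numpy()
        samples.append((x - x_prime) * grad)                 # single-sample estimate
    samples = np.stack(samples)
    mean = samples.mean(axis=0)
    stderr = samples.std(axis=0, ddof=1) / np.sqrt(n_samples)
    return samples, mean, (mean - 1.96 * stderr, mean + 1.96 * stderr)

The same per-sample array could also be clustered per pixel rather than only averaged, which relates to the point above about positive and negative attributions cancelling.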

Minor:

  • In the second to last figure, what is the value of alpha used for parts (2) and (4)?

Distill employs a reviewer worksheet as a help for reviewers.

The first three parts of this worksheet ask reviewers to rate a submission along certain dimensions on a scale from 1 to 5. While the scale meaning is consistently "higher is better", please read the explanations for our expectations for each score—we do not expect even exceptionally good papers to receive a perfect score in every category, and expect most papers to be around a 3 in most categories.

Any concerns or conflicts of interest that you are aware of?: No known conflicts of interest
What type of contributions does this article make?: Explanation of existing results

Advancing the Dialogue Score
How significant are these contributions? 4/5
Outstanding Communication Score
Article Structure 4/5
Writing Style 4/5
Diagram & Interface Style 4/5
Impact of diagrams / interfaces / tools for thought? 4/5
Readability 4/5
Scientific Correctness & Integrity Score
Are claims in the article well supported? 3/5
Does the article critically evaluate its limitations? How easily would a lay person understand them? 2/5
How easy would it be to replicate (or falsify) the results? 4/5
Does the article cite relevant work? 3/5
Does the article exhibit strong intellectual honesty and scientific hygiene? 3/5

Review #3

The following peer review was solicited as part of the Distill review process.

The reviewer chose to waive anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.

Distill is grateful to Sara Hooker for taking the time to review this article.


General Comments

The question proposed by the authors is very interesting, namely that the choice of baseline is a crucial hyperparameter that determines all subsequent attribution. Given the wide use of integrated gradients in various sensitive domains, this is a valuable contribution. I enjoyed the visuals, and I think these charts provide valuable insight for a newcomer to the field.
The key methodological limitation of the draft (as is) would appear to be the lack of a formal framework to articulate the differences in properties between the baselines introduced. While expected gradients may avoid the issue of ignoring certain pixel values, the same could be said of using a baseline of random noise. How do we compare the relative merit of these baselines?

Comments on writing exposition

This version of the article is a promising draft on an interesting question -- "What is a valid reference point for an attribution method such as integrated gradients?" "What are the implications of this choice?"

Below are my comments related to the writing and exposition:

  1. Some sections of the text could benefit from repositioning -- for example, "This question may be critical when models are making high stakes decisions about..." speaks to the motivation of the work but is buried in the second section.

  2. Certain terms are introduced far too late in the draft, such as "path methods", which would have been good to introduce at the very beginning to clarify the scope of the contribution. Instead, phrases such as "Most require choosing a hyper-parameter known as the baseline input..." suggest that most saliency methods are path methods, when this is not the case -- many estimate the contribution of pixels to a given prediction using different methodologies (raw gradients, perturbation-based methods such as Fong et al., 2017, Ribeiro et al., 2016).

  3. The section "Game Theory and Missingness" actually introduces the topic of interest -- why the choice of baseline matters and why the reader should care. Having this section so far into the article is disruptive for the reader; a reshuffling of sections could improve the flow.

  4. Diagrams -- high latency for certain diagrams. For figures 1 and 2, the true and predicted labels appear identical for all chosen images, which makes it less interesting to have the same images repeated.

Comments on methodology and exposition of contributions

The authors have made replication of the results very easy by releasing code and putting together experiments that use a standard CV architecture (InceptionV4), an open-source dataset, and only limited compute.

Below are additional comments related to the methodology and the exposition of contributions (as well as suggested relevant work):

  1. This draft omits that Sundararajan et al. themselves discuss the difficulty of the choice of baseline, and propose that for computer vision tasks one possible choice is a black image (note that this is not always a grid of zeros, if the network normalizes the images as a pre-processing step). Note that in their work, Sundararajan et al. also mention that there may be multiple possible baselines, including one of random noise.
    The current draft does not appear to take into account that this was already surfaced as an acknowledged limitation by the original authors, which feels like a large and unnecessary oversight. It is also important to note that Sundararajan et al. are careful to convey that a black image is not the only possible choice, but that the intent of the baseline is "to convey a complete absence of signal, so that the features that are apparent from the attributions are properties only of the input, and not of the baseline."

  2. In our recent work on evaluating saliency methods, we also dedicate discussion to the implications of the choice of baseline for integrated gradients (Kindermans et al., The (Un)reliability of Saliency Methods, section 3.2). We formally evaluate how the reliability of the explanation depends on the choice of reference point, and we do in fact show that a black image reference point is reliable but a zero grid is unreliable. This set of experiments may be useful as context for this work; in particular, it seems that part of what is missing right now is a formal measure to compare how choosing different baselines impacts the end properties of the explanation. Additional work in this direction includes Adebayo et al., 2018 (Sanity Checks for Saliency Maps) and Hooker et al., 2019 (Evaluating Feature Importance Estimates): given the new baseline, is the method a more reliable feature importance estimator?

  3. Visualizations of saliency maps with post-processing steps such as taking the absolute value or capping the feature attributions at the 99th percentile are problematic. These post-processing steps are not well justified theoretically and appear to mainly be used to improve the perceptual quality of the saliency map visualization (a sketch of the kind of recipe I mean follows after this list).

  4. How much more computational cost is incurred by expected gradients over integrated gradients? The number of baseline images to interpolate over would appear to be another hyperparameter that this method introduces.
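
To illustrate point 3, this is the kind of post-processing recipe I am referring to (a common convention in the saliency-map literature, written here as a sketch, not necessarily exactly what the article does):

import numpy as np

def visualize_attributions(attributions, percentile=99):
    # attributions: per-pixel, per-channel values, shape (H, W, 3)
    a = np.abs(attributions).sum(axis=-1)        # absolute value, summed over channels
    cap = np.percentile(a, percentile)           # cap at the chosen percentile
    return np.clip(a, 0, cap) / (cap + 1e-12)    # rescale to [0, 1] for display

Both the absolute value and the percentile cap discard information (the sign and the extreme magnitudes) purely for display purposes.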


Distill employs a reviewer worksheet as a help for reviewers.

The first three parts of this worksheet ask reviewers to rate a submission along certain dimensions on a scale from 1 to 5. While the scale meaning is consistently "higher is better", please read the explanations for our expectations for each score—we do not expect even exceptionally good papers to receive a perfect score in every category, and expect most papers to be around a 3 in most categories.

Any concerns or conflicts of interest that you are aware of?: No known conflicts of interest
What type of contributions does this article make?: Explanation of existing results

Advancing the Dialogue Score
How significant are these contributions? 3/5
Outstanding Communication Score
Article Structure 3/5
Writing Style 3/5
Diagram & Interface Style 3/5
Impact of diagrams / interfaces / tools for thought? 5/5
Readability 3/5
Scientific Correctness & Integrity Score
Are claims in the article well supported? 3/5
Does the article critically evaluate its limitations? How easily would a lay person understand them? 1/5
How easy would it be to replicate (or falsify) the results? 5/5
Does the article cite relevant work? 4/5
Does the article exhibit strong intellectual honesty and scientific hygiene? 5/5
