suinleelab / path_explain
A repository for explaining feature attributions and feature interactions in deep neural networks.
License: MIT License
Thanks for your great work! Can I ask a quick, general question: is it reasonable to compute explanations on training instances?
To give an example, suppose model f is trained on dataset X and tested on dataset Y. When debugging f, can I apply the explanation method to the training data to see which features the model focuses on?
(Usually, explanation methods are applied to test/validation data. In your paper, the explanation method is also applied to the validation set.)
Thank you!
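For what it's worth, path-based attribution methods make no distinction between training and test inputs: the formula only needs the model, an input, and a baseline. A minimal numpy sketch (not the path_explain API; a linear model is assumed so the integral has a closed form) illustrates applying attributions to training data:

```python
import numpy as np

# Minimal sketch (not the path_explain API): for a linear model
# f(x) = w . x, the integrated-gradients attribution with baseline b
# is exactly w * (x - b), regardless of whether x comes from the
# training or the test split.
w = np.array([0.5, -1.0, 2.0])
baseline = np.zeros(3)

def attributions(x):
    # The gradient of f is constant (= w), so IG reduces to w * (x - baseline).
    return w * (x - baseline)

x_train = np.array([[1.0, 2.0, 3.0], [0.0, 1.0, -1.0]])
phi = attributions(x_train)

# Completeness: attributions sum to f(x) - f(baseline) per instance.
assert np.allclose(phi.sum(axis=1), x_train @ w - baseline @ w)
```

The caveat is interpretive rather than mechanical: on training data the explanations may reflect memorized patterns, whereas on held-out data they reflect what generalizes.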
I followed the steps shown in the path_explain README but used a fine-tuned Longformer sequence-classification model. It did not work, since Longformer has no model.pre_classifier() method. Do you have any suggestions on how I could make this work?
Hello!
In interpret_stsb.ipynb, there is a module called bert_explainer (from bert_explainer import BertExplainerTF), but there seems to be no bert_explainer in the repo. It would be helpful if you could share it. Thank you!
Hi there,
I was trying to reproduce the examples and found that the current ones seem to be out of date, with some incorrect paths that need further modification.
I've revised the current version and reproduced the TF example, but I'm having a hard time reproducing the PyTorch version.
Could you help by providing a PyTorch example?
Sincerely
Hi!
I've been working with your IH/IG implementation lately and running some experiments with it in an NLP context. What I have noticed is that increasing the length of my input has an adverse effect on the convergence of the IH interactions with respect to the attributions I'm getting with IG.
IG itself converges nicely with respect to the completeness axiom and the model output, but the interaction completeness axiom of Section 2.2.1 of your paper does not seem to hold at all in these cases.
In this plot you can see that as the input length is increased, the Mean Squared Error between the interactions (summed over the last dimension) and the attributions no longer converges to a reasonable margin of error, with the number of interpolation points for IH on the x-axis (note the log scale on the y):
I tested this on a 1-layer LSTM (very tiny, only 16 hidden units), using the Tensorflow implementation of IH+IG, with a fixed zero-valued baseline (so not using the expectation).
What I was wondering is whether you encountered similar issues when testing your approach on larger models. I see that Theorem 1 of the paper touches on related issues, but it only seems to cover the simple feedforward case, not more complex models like LSTMs.
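For concreteness, the convergence check described above can be written in a few lines of numpy. The arrays here are synthetic stand-ins for real path_explain outputs; they are constructed so that interaction completeness (interactions summed over the last axis recover the attributions) holds exactly, which makes the error metric zero:

```python
import numpy as np

# Sketch of the convergence check: interaction completeness says the
# interactions summed over the last dimension should equal the
# attributions. Synthetic arrays stand in for real explainer outputs.
rng = np.random.default_rng(0)
attributions = rng.normal(size=(8, 16))            # (batch, features)

# Construct interactions satisfying completeness exactly: split each
# attribution across the last axis with weights that sum to one.
weights = rng.random(size=(8, 16, 16))
weights /= weights.sum(axis=-1, keepdims=True)
interactions = attributions[:, :, None] * weights  # (batch, features, features)

# The error metric from the plot: MSE between summed interactions and
# the attributions. Zero by construction here; in practice it grows
# with input length unless the number of interpolation points keeps up.
mse = np.mean((interactions.sum(axis=-1) - attributions) ** 2)
assert mse < 1e-12
```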
The pip package contains the EmbeddingExplainerTF class but not its torch counterpart, even though it is implemented. Is there any known issue with the Torch version of the EmbeddingExplainer class?
Seems like PathExplainerTorch.interactions only supports 2D tensors, unlike the TensorFlow version.
What are the bottlenecks to supporting arbitrarily-sized tensors, and how difficult would the change be? If it's not too bad, I would be interested in making a PR myself.
Would love any input @jjanizek, thanks!
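In the meantime, one possible workaround is to flatten the trailing dimensions into a single feature axis, run the 2D-only routine, and reshape the result back. This is a hypothetical sketch: `interactions_2d` below is a dummy stand-in for `PathExplainerTorch.interactions`, not the real method.

```python
import numpy as np

# Hypothetical workaround while only 2D inputs are supported: flatten
# the trailing dimensions, run a 2D-only interactions routine, then
# reshape the (batch, D, D) result back to the original axes.
def interactions_2d(x2d):
    # Dummy stand-in returning one (D, D) matrix per instance.
    return np.einsum('bi,bj->bij', x2d, x2d)

def interactions_nd(x):
    batch, *shape = x.shape
    d = int(np.prod(shape))
    out = interactions_2d(x.reshape(batch, d))  # (batch, d, d)
    return out.reshape(batch, *shape, *shape)   # e.g. (batch, seq, emb, seq, emb)

x = np.ones((4, 5, 3))                          # (batch, seq_len, embedding)
assert interactions_nd(x).shape == (4, 5, 3, 5, 3)
```

This only works if the underlying routine treats features independently of their layout, which is worth verifying before relying on it.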
I'm currently testing the Integrated Gradients implementation on torch and tensorflow versions of a Huggingface model, and noticed that the attributions I got with the same configuration are slightly different. I tested this with use_expectations=False, so for Integrated Gradients rather than Expected Gradients.

It turns out that the num_samples argument behaves slightly differently: for the torch model, _get_samples_input returns an interpolated baseline of size num_samples + 1, which happens at this point due to the +1:

path_explain/path_explain/explainers/path_explainer_torch.py, lines 102 to 103 in b567945

The _sample_alphas method of PathExplainerTF, on the other hand, returns an interpolation tensor of length num_samples, which causes the mismatch.

When passing a num_samples to the torch implementation that is 1 lower, the attributions of both methods are exactly the same.

I'm not sure which of the two is more "correct", but I think it would be good if num_samples behaved the same in both implementations. One could even argue that both should return num_samples + 2 points, given that the interpolation also includes the input and the baseline themselves; in that case the number of intermediate points from which the attributions are computed would actually equal num_samples.
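The off-by-one can be illustrated in plain numpy (this is a sketch of the behavior described above, not the library's actual sampling code):

```python
import numpy as np

# Illustration of the mismatch: both schemes interpolate between
# baseline and input, but differ in how many alpha coefficients a
# given num_samples produces.
num_samples = 4

# TF-style: exactly num_samples interpolation coefficients.
alphas_tf = np.linspace(0.0, 1.0, num_samples)

# Torch-style: an extra +1 yields num_samples + 1 coefficients.
alphas_torch = np.linspace(0.0, 1.0, num_samples + 1)

assert len(alphas_tf) == num_samples
assert len(alphas_torch) == num_samples + 1

# Passing num_samples - 1 to the torch-style scheme restores parity:
assert len(np.linspace(0.0, 1.0, (num_samples - 1) + 1)) == num_samples
```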
When I do the following, I get an error that TensorFlow objects are not pickleable:

import multiprocessing as mp

explainer = PathExplainerTF(model)
with mp.Pool() as p:
    attributions = p.starmap(explainer.attributions, [(i, ...) for i in x_test])
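A common workaround for unpicklable objects is to build them inside each worker via `Pool`'s `initializer`, so nothing TensorFlow-related ever crosses the process boundary. The sketch below uses a plain-function placeholder in place of the real model and explainer (constructing `PathExplainerTF` in `_init_worker` is the assumption being illustrated):

```python
import multiprocessing as mp
import numpy as np

# The explainer is created per worker, never pickled. In the real case
# _init_worker would load the TF model and build PathExplainerTF here.
_explainer = None

def _init_worker():
    global _explainer
    weights = np.array([0.5, -1.0, 2.0])
    _explainer = lambda x: weights * x  # placeholder "attributions"

def _attribute(x):
    # Top-level function: picklable, unlike a bound method of the explainer.
    return _explainer(np.asarray(x))

def parallel_attributions(batch):
    with mp.Pool(2, initializer=_init_worker) as pool:
        return pool.map(_attribute, batch)

if __name__ == "__main__":
    print(parallel_attributions([[1.0, 1.0, 1.0], [2.0, 0.0, 1.0]]))
```

Note that each worker then holds its own copy of the model, so memory use scales with the pool size; for GPU models, batching inside a single process is usually the better option.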
Hi! I'm trying to run your interaction setup on a torch model (from Huggingface's library), but I run into trouble because output_indices doesn't seem to be passed through to the attributions method:

path_explain/path_explain/explainers/path_explainer_torch.py, lines 245 to 246 in 673ee95

This causes an error in _get_grad when output_indices is unsqueezed, because it was never passed and is still None:

path_explain/path_explain/explainers/path_explainer_torch.py, lines 129 to 131 in 673ee95
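Until that is fixed, one possible workaround is to wrap the model so it only exposes the output of interest; then no output_indices argument is needed at all. This is a hypothetical sketch with a plain numpy stand-in for the torch model:

```python
import numpy as np

# Hypothetical workaround: wrap the multi-output model so it returns
# only the class column of interest, sidestepping output_indices.
# A numpy function stands in for the real torch model here.
def model(x):
    # Dummy classifier returning (batch, num_classes) scores.
    return x @ np.ones((x.shape[1], 3))

def select_output(model, index):
    # Wrapper keeping only column `index`, shape (batch, 1), so the
    # explainer sees a single-output model.
    def wrapped(x):
        return model(x)[:, index:index + 1]
    return wrapped

wrapped = select_output(model, 2)
x = np.ones((4, 5))
assert model(x).shape == (4, 3)
assert wrapped(x).shape == (4, 1)
```

The same pattern carries over to torch by wrapping `forward` (or using a small `nn.Module` that slices the parent model's output).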
Hi, I installed path-explain via pip and somehow cannot import the EmbeddingExplainerTF class, although I have no trouble importing PathExplainerTF:

from path_explain import EmbeddingExplainerTF
attributions = np.load('attributions.npy')
interactions = np.load('interactions.npy')

Hi, thanks for your great work!
I want to reproduce the examples, but these files seem to be missing. Can you tell me where to get them?