from shap.
Hey!
It is important to note that while KernelExplainer assumes feature independence when estimating the conditional expectations, it still captures the importance of n-grams if the model depends on them. Assuming feature independence means you will toggle the words independently. I would make the reference value for a word be that the word is not there. That might require a wrapper function around the model that takes a binary vector and maps it to a token sequence with specific words missing.
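The wrapper described here might look roughly like this (a minimal sketch; `model_predict` and `tokens` are hypothetical stand-ins for the real model and the example's token list):

```python
import numpy as np

def make_masked_predict(model_predict, tokens):
    """Return f(Z) where Z is an (n_samples, n_tokens) binary matrix.

    Each row of Z is a mask KernelExplainer perturbs; a 0 bit means the
    corresponding word is removed from the sequence before predicting.
    """
    tokens = np.asarray(tokens, dtype=object)

    def f(Z):
        outputs = []
        for z in np.asarray(Z):
            # keep only the words whose mask bit is on
            kept = [t for t, keep in zip(tokens, z) if keep]
            outputs.append(model_predict(kept))
        return np.array(outputs)

    return f

# usage sketch: shap.KernelExplainer(f, np.zeros((1, len(tokens))))
# the all-zeros background row encodes "every word absent" as the reference
```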
It might be worth putting together an example notebook for a text processing RNN at some point.
Another option is the integrated gradients method, which is faster but is restricted to a comparison with a reference value (which is not a big deal here where we are using a single reference value anyway).
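For reference, integrated gradients averages the model's gradients along the straight path from the reference value to the input. A rough numpy sketch (in practice `grad_fn` would come from an autodiff framework; the midpoint rule here is one of several valid quadratures):

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate IG attributions: (x - baseline) times the path-averaged gradient."""
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    # midpoint-rule sample points along the straight path from baseline to x
    alphas = (np.arange(steps) + 0.5) / steps
    grads = [grad_fn(baseline + a * (x - baseline)) for a in alphas]
    return (x - baseline) * np.mean(grads, axis=0)
```

For a linear model the attributions recover (x - baseline) * w exactly; in general they approximately sum to f(x) - f(baseline) (the completeness property), with the approximation improving as `steps` grows.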
Hi Scott,
Thank you for publishing the code along with the paper on model interpretability.
I found the notebooks especially helpful while working on my own problem: explaining a black-box model that takes word vectors as its input.
The embedding is part of my pipeline, and the actual input to the model as a whole is a list of word tokens.
Let me start by describing what I was able to achieve, and then ask for a nudge ;)
In the current setting, I've enriched each token list with its position within the sequence. Additionally, I've assumed the reference values need to be position-dependent in the text; only then could I reliably steer (indirectly) the reference-value replacement toggled in KernelExplainer->explain(), based on the position in the text being explained.
The steering is done below with a vectorized version of replace_index_with_word. The index is a token sequence generated with Keras by Tokenizer.fit_on_texts().
class SpecialToken(Enum):
    EMPTY = 0

# Both methods below live on a tokenizer wrapper: index_word comes from the
# Keras Tokenizer, pipe is the (unpublished) prediction pipeline.
def replace_index_with_word(self, _elem):
    if _elem == SpecialToken.EMPTY.value:  # order matters: 0 is the reference/padding value
        return SpecialToken.EMPTY
    if isinstance(_elem, (int, np.integer)):  # order matters; also catches numpy ints
        return self.index_word[_elem]
    if isinstance(_elem, (float, np.floating)):  # handle both python and numpy floats
        return self.index_word[int(_elem)]
    return None

def f(self, X: np.ndarray):
    # otypes=[object] stops np.vectorize from casting SpecialToken values to str
    vreplace_index_with_word = np.vectorize(self.replace_index_with_word, otypes=[object])
    return pipe.predict_proba(vreplace_index_with_word(X))
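A quick standalone check of this index-to-word mapping, using a stub in place of the Keras Tokenizer (only its `index_word` dict matters here; the names and vocabulary are illustrative):

```python
import numpy as np
from enum import Enum

class SpecialToken(Enum):
    EMPTY = 0

class StubTokenizer:
    # stands in for the Keras Tokenizer; only index_word is used
    index_word = {1: 'taki', 2: 'bzdura'}

    def replace_index_with_word(self, _elem):
        if _elem == SpecialToken.EMPTY.value:  # 0 is the reference/padding value
            return SpecialToken.EMPTY
        if isinstance(_elem, (int, np.integer)):
            return self.index_word[_elem]
        if isinstance(_elem, (float, np.floating)):
            return self.index_word[int(_elem)]
        return None

tok = StubTokenizer()
# KernelExplainer hands back float matrices, so the float branch is the one hit;
# otypes=[object] keeps the SpecialToken member from being stringified
words = np.vectorize(tok.replace_index_with_word, otypes=[object])(np.array([[1.0, 0.0, 2.0]]))
```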
Token list to data frame for SHAP:

def list_to_named_columns(tokens):
    return {'pos_{:d}'.format(i + 1): [x] for i, x in enumerate(tokens)}

def tokens_to_data_frame(tokens):
    tokens_with_seq: dict = list_to_named_columns(tokens)
    return pd.DataFrame.from_dict(tokens_with_seq, orient='columns')
For multiple examples, I'm simply concatenating data frames as below:

def examples_to_data_frame(examples):
    frames = [tokens_to_data_frame(token_list) for token_list in examples]
    return pd.concat(frames).reset_index(drop=True).fillna(SpecialToken.EMPTY)
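A quick standalone check of the frame-building helpers (restated so the snippet runs on its own): examples of different lengths share the `pos_*` feature columns, and shorter examples get padded with SpecialToken.EMPTY.

```python
import pandas as pd
from enum import Enum

class SpecialToken(Enum):
    EMPTY = 0

def list_to_named_columns(tokens):
    # one single-row column per token position: pos_1, pos_2, ...
    return {'pos_{:d}'.format(i + 1): [x] for i, x in enumerate(tokens)}

def tokens_to_data_frame(tokens):
    return pd.DataFrame.from_dict(list_to_named_columns(tokens), orient='columns')

def examples_to_data_frame(examples):
    frames = [tokens_to_data_frame(token_list) for token_list in examples]
    # outer concat leaves NaN where an example is too short; fill with EMPTY
    return pd.concat(frames).reset_index(drop=True).fillna(SpecialToken.EMPTY)

df = examples_to_data_frame([['taki', 'bzdura'], ['musieć']])
```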
SpecialToken is a type I created to skip word-vector embedding later on (part of the pipeline, not published here).
The output for a single-example explanation shows the attributions below.
More debug information:
Explaining ['taki' 'bzdura' 'musieć' 'napisać' 'jakiś' 'abderyta' '.'] with reference being [<SpecialToken.EMPTY: 0>, <SpecialToken.EMPTY: 0>, <SpecialToken.EMPTY: 0>, <SpecialToken.EMPTY: 0>, <SpecialToken.EMPTY: 0>, <SpecialToken.EMPTY: 0>, <SpecialToken.EMPTY: 0>]
It's clear which words contributed the most and which ones lower the overall score, but for multiple examples the token position (which is the group/feature name) doesn't focus attention on the word itself at all.
This leads me to the main idea: since for me the word itself carries more information, I was thinking about replacing the token sequences with a one-hot encoding, while maintaining underneath the ability to express proper synthetic data (i.e. a bidirectional mapping based on masking).
Do you think this would undermine the logic behind your library, or could you suggest a different route if anything comes to mind?
Any comment would be appreciated.
Just saw this. That's an interesting question. I can see that by treating the inputs by position you lose the ability to see the importance of a single word. It would not break SHAP to use a one-hot encoding for features, and then just mask those words in the sequence that are not "on" before sending it through the model. Hope that helps.
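That suggestion could be sketched like this (illustrative only; `model_predict` stands in for the real pipeline): features are indicator bits over the example's vocabulary, so SHAP values attach to words rather than positions.

```python
import numpy as np

def make_vocab_masked_predict(model_predict, tokens):
    """Feature i answers: is vocab[i] kept everywhere it occurs in the sequence?"""
    vocab = sorted(set(tokens))

    def f(Z):
        outputs = []
        for z in np.asarray(Z):
            kept = {w for w, keep in zip(vocab, z) if keep}
            # drop every occurrence of a masked-off word before predicting
            outputs.append(model_predict([t for t in tokens if t in kept]))
        return np.array(outputs)

    return f, vocab
```

Note the design trade-off: a repeated word now gets a single shared attribution, which is exactly what makes the explanation word-centric instead of position-centric.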
I should also mention that we are working on a deep-learning-specific DeepExplainer that will make Keras models much faster to explain. I'll post here once the first version of that is finished.