Comments (10)
Oh, this is interesting - you have both text AND continuous features.
If all you care about is explaining the text parts while keeping the continuous values fixed, it should be easy, as the only difference is that you have two sentences instead of one. What you would need to do in this case is put both sentences in a single string, separated by a character that is not caught by the split_expression parameter (see the initializer of LimeTextExplainer). You could then define a classifier_fn that takes whatever text LIME gives it, splits it into two sentences, computes the sentence vectors, and so on. Your code would look something like this:
def get_classifier_fn(continuous_features):
    def classifier_fn(text_list):
        ret = []
        for text in text_list:
            sentence1, sentence2 = text.split(SPLIT_CHARACTER)
            # assuming this returns a number between 0 and 1
            ret.append(model.predict_similarity(get_embedding(sentence1),
                                                get_embedding(sentence2),
                                                continuous_features))
        return ret
    return classifier_fn
Then, if you wanted to explain a particular instance, you would have to do something like this:
def explain_instance(instance):
    text = '%s %s %s' % (instance.sentence1, SPLIT_CHARACTER, instance.sentence2)
    fn = get_classifier_fn(instance.continuous_features)
    explainer = LimeTextExplainer(split_expression=SPLIT_EXPRESSION, bow=BOW)
    return explainer.explain_instance(text, fn, labels=(0,))
You may have to be a bit creative if you want to use the visualizations we have, since the text displayed will contain both sentences. Also, you should think about whether it makes sense to use bow=True or bow=False in this case. If you're doing anything with sequences, I would say False makes more sense.
If you want to explain the impact of the words and the numerical features, you'll have to write a function that perturbs both at the same time. If this is the case, you would want to do a mix of __data_labels_distances in lime_text.py and __data_inverse in lime_tabular.py.
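To make the second idea more concrete, here is a minimal sketch of joint perturbation. Everything here is illustrative: the function name, the word-masking scheme, and the Gaussian noise scale are assumptions roughly in the spirit of what __data_labels_distances (word masking) and __data_inverse (numeric perturbation) do, not LIME's actual internals:

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb_jointly(tokens, continuous, n_samples=5, noise_scale=0.1):
    """Toy joint perturbation: randomly mask words (as lime_text does)
    and jitter the continuous features (roughly what lime_tabular does)."""
    texts, feats, masks = [], [], []
    for _ in range(n_samples):
        mask = rng.integers(0, 2, size=len(tokens))  # 1 = keep the word
        texts.append(' '.join(t for t, m in zip(tokens, mask) if m))
        feats.append(continuous + rng.normal(0, noise_scale, len(continuous)))
        masks.append(mask)
    return texts, np.array(feats), np.array(masks)

texts, feats, masks = perturb_jointly(['good', 'movie'], np.array([1.0, 2.0]))
```

You would then feed each perturbed (text, features) pair to the model and fit the local linear model on the concatenation of the word masks and the perturbed numeric features.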
Both are viable ideas; the second one is definitely more work.
I would be very interested to see what you end up with, I've never thought about this particular use case. If you are comfortable sharing what the application is, please tell me over email : ).
Let me know if you have any more questions too.
Best,
from lime.
Thank you so so much! Your explanations are super helpful. For now, I'm particularly interested in the text components so will focus on that. I will definitely drop you a mail :)
Hi, in the call "explainer.explain_instance(text, fn, labels=(0,))", the default value is "labels=(1,)", and the docstring only says "labels: iterable with labels to be explained." What exactly does this parameter stand for? Thanks.
This parameter is there for multi-class classification problems, and it defines the classes (or labels) for which we want to generate explanations. The default is 1 because the usual case is two-class classification, where we only care about explaining label 1 (since the explanation for label 0 is just the opposite).
The code assumes that the classifier function returns an array with one column per label (the probability of each), while in this example we are only outputting one number. That is why I set labels=(0,).
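A toy illustration of that shape assumption (the 0.8 score and the input texts below are placeholders standing in for a real model, not anything from LIME itself):

```python
import numpy as np

def classifier_fn(texts):
    # Hypothetical one-output model: a single similarity score per text.
    scores = [0.8 for _ in texts]           # stand-in for model.predict_similarity(...)
    return np.array(scores).reshape(-1, 1)  # shape (n, 1): one column -> label index 0

probs = classifier_fn(['a ||| b', 'c ||| d'])
print(probs.shape)  # (2, 1)
# With a single column, the only explainable label index is 0,
# hence explain_instance(text, classifier_fn, labels=(0,)).
```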
Thanks so much for your quick response. By the way, is there a way to explain only a few particular features? Which parameter should I set?
Do you mean restricting the explanation to a particular set of features? If so, there is no parameter for that.
Hi! I am working on the same task as you, but I find that the final score of every token is similar, which seems incorrect. Could you please tell me how you completed it?
Much appreciated!
LimeTextExplainer: can someone give an example SPLIT_EXPRESSION? 🙏
@marcotcr can you give an example for SPLIT_EXPRESSION? 🙏 I have a very similar use case but would not know what to pass as SPLIT_EXPRESSION. Many thanks! 🙏 The documentation here is not helping:
split_expression – Regex string or callable. If regex string, will be used with re.split. If callable, the function should return a list of tokens.
Apparently, without setting SPLIT_EXPRESSION, the split character is treated as part of the input and gets perturbed along with the rest of the text.
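This interaction can be checked directly with re.split, which is what LIME applies when split_expression is a regex string. The '|||' sentinel and the whitespace pattern below are illustrative choices, not values prescribed by LIME:

```python
import re

SPLIT_EXPRESSION = r'\s+'  # candidate for LimeTextExplainer(split_expression=...)
SPLIT_CHARACTER = '|||'    # sentinel joining the two sentences

text = 'the cat sat ||| a feline was seated'

# With a whitespace split, the sentinel survives as its own token:
tokens = re.split(SPLIT_EXPRESSION, text)
print(tokens)  # ['the', 'cat', 'sat', '|||', 'a', 'feline', 'was', 'seated']

# The default split_expression, r'\W+', swallows the sentinel entirely:
print(re.split(r'\W+', text))  # ['the', 'cat', 'sat', 'a', 'feline', 'was', 'seated']
```

Note that even with a surviving sentinel, LIME may mask it out during perturbation, so the classifier_fn should handle strings where the sentinel is missing (for example, by treating the whole string as one sentence or returning a default score).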