
fastshap's People

Contributors

iancovert, ms2666, szvsw


fastshap's Issues

variability in Fastshap values

Hello @iancovert
I have been reading all of your contributions on Shapley values and model explainability, and I am grateful for your contributions to the field.

Now I have tried to test FastSHAP on a different binary classification dataset and found that the FastSHAP values vary and differ a lot from the converged Shapley values, often in the opposite direction.
See one of the plots included below:
[image: FastSHAP values vs. converged Shapley values for one example]

Any thoughts or suggestions on how I could try to improve their consistency?
The notebook code is here.
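
For reference, this is roughly how I'm quantifying the disagreement. It's a small hypothetical helper (not part of the package), where fastshap_values and exact_values are just stand-ins for the two sets of attributions, each of shape (num_examples, num_features):

import numpy as np

def attribution_agreement(fastshap_values, exact_values):
    # Mean squared difference between the two attribution sets.
    diff = fastshap_values - exact_values
    mse = np.mean(diff ** 2)
    # Fraction of attributions whose sign flips relative to the converged values,
    # since the "opposite direction" cases are the part that worries me most.
    sign_flip_rate = np.mean(np.sign(fastshap_values) != np.sign(exact_values))
    return mse, sign_flip_rate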

How to decide loss_fn when we build regression models?

I have observed that the loss function KLDivLoss() is used when training the surrogate model, whereas in the TensorFlow version the loss function is categorical_crossentropy. I am now wondering which loss function to use when applying FastSHAP to regression models. Additionally, are there any other significant changes we should consider?
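
My current working guess, written as a minimal generic PyTorch sketch rather than the package's actual API (model, surrogate, and the masking step below are placeholders): for a regression model there is no class distribution to match, so the surrogate would presumably be trained to match the model's scalar output directly, e.g. with MSE, instead of a KL divergence between predicted class distributions.

import torch
import torch.nn as nn

# Placeholders: `model` is the original regression model, `surrogate` is the
# masked surrogate being trained; both map inputs to a scalar prediction.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
surrogate = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 1))

x = torch.randn(16, 10)                              # a batch of inputs
mask = torch.bernoulli(torch.full((16, 10), 0.5))    # random feature coalitions
masked_input = torch.cat([x * mask, mask], dim=1)    # features + mask, as in the tutorial setup

with torch.no_grad():
    target = model(x)                                # the model's real-valued prediction

# Classification surrogates match the class distribution with KLDivLoss /
# categorical cross-entropy; my assumption for regression is a plain MSE
# between the surrogate's output and the model's output.
loss = nn.functional.mse_loss(surrogate(masked_input), target)
loss.backward()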

Multiclass classification with FastSHAP

Hello,
thanks for sharing the runnable code from your paper, looking forward to including fastshap among my favorite explainers!

I would like to know if there is a straightforward adaptation of FastSHAP to multiclass classification (i.e., more than 2 classes).
I have tried directly changing the number of input and output neurons of the surrogate model. From the tutorial, the surrogate model has 2 * num_features inputs and 2 outputs; I changed it to handle n_classes * num_features. Similarly, in the explainer model, I changed it to have n_classes * num_features outputs rather than just 2 * num_features.

Maybe I am oversimplifying, but it nonetheless seems like a simple change, yet the execution fails.
Am I overlooking a solution?
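
For concreteness, here is a minimal sketch (plain PyTorch, dimensions only, not the actual fastshap classes) of the shapes I'm assuming should generalize from the tutorial; maybe the mismatch is in how I sized the surrogate inputs:

import torch
import torch.nn as nn

num_features = 12
n_classes = 3   # multiclass instead of the tutorial's 2

# Surrogate: same input layout as the binary tutorial (features concatenated
# with the mask, so 2 * num_features), but with n_classes outputs instead of 2.
surrogate = nn.Sequential(
    nn.Linear(2 * num_features, 128), nn.ReLU(),
    nn.Linear(128, n_classes))

# Explainer: one SHAP value per (feature, class) pair, so
# num_features * n_classes outputs reshaped to (num_features, n_classes).
explainer = nn.Sequential(
    nn.Linear(num_features, 128), nn.ReLU(),
    nn.Linear(128, num_features * n_classes))

x = torch.randn(4, num_features)
phi = explainer(x).reshape(-1, num_features, n_classes)  # (batch, features, classes)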

Fastshap handling nominal variables

How does fastshap handle nominal variables?

library(tidymodels)
library(tidyverse)
library(mlbench)
library(xgboost)
library(lightgbm)
library(treesnip)
data(Glass)

head(Glass)
Glass$Type

rec <- recipe(RI ~ ., data = Glass) %>% step_scale(all_numeric())

prep_rec <- prep(rec, retain = TRUE)

split <- initial_split(Glass)

train_data <- training(split)

test_data <- testing(split)

model <- parsnip::boost_tree(mode = "regression") %>%
  set_engine("lightgbm", verbose = 0)

wf_glass <- workflow() %>%
  add_recipe(rec) %>%
  add_model(model)

fit <- wf_glass %>% parsnip::fit(data = train_data)

library(fastshap)
explain(
  object = fit %>% extract_fit_parsnip(),
  newdata = test_data %>% select(-RI) %>% as.matrix(),
  X = train_data %>% select(-RI) %>% as.matrix(),
  pred_wrapper = predict
)

This leads to the following error: Error in genFrankensteinMatrices(X, W, O, feature = column) : Not compatible with requested type: [type=character; target=double].

My guess:
This might be due to the fact that transforming a data.frame with different vector types using as.matrix() produces a character matrix. This matrix can't be converted to a matrix of type double without losing the values of the factor column (here, Type). On the other hand, as the error indicates, we can't use a numeric target for the regression task if all the other variables are of class character.

Am I missing something, or is this a type-conversion problem?

Feature: SHAP Interaction FX

In Consistent Individualized Feature Attribution for Tree Ensembles, Lundberg et al. define SHAP interaction effects:

Given a set $N$ of $M$ features, for $i\neq j$,

$$\phi_{i,j} = \sum_{S\subseteq N\setminus \{i,j\}} c(S)\,\nabla_{i,j}(S)$$

where the contribution term is given by

$$\nabla_{i,j}(S) = f_x(S\cup \{i,j\}) - f_x(S\cup\{i\}) - f_x(S\cup\{j\}) + f_x(S)$$

and the weighting term is given by

$$c(S)=\frac{|S|!\,(M-|S|-2)!}{2(M-1)!}$$

For the diagonal $i=j$, set $\phi_{i,i}=\phi_i - \sum_{j\neq i}\phi_{i,j}$, where $\phi_i$ is the usual SHAP value.
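
(If I'm reading the paper correctly, these definitions also imply that each row of the interaction matrix sums to the corresponding SHAP value, i.e. $\sum_{j}\phi_{i,j} = \phi_i$, which is the property the row/column constraints below rely on.)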

I'm interested in adapting the FastSHAP loop to generate interaction effects, but I'm unsure how involved the process would be. There are a few approaches I could imagine. In either approach, the new explainer could internally just generate the upper triangular portion of the matrix and then automatically build the fully symmetric matrix.

  1. Use an existing FastSHAP model as the basis:
     1. Ensure that the new model's prediction rows/columns sum to the correct SHAP values (using the existing FastSHAP model).
     2. Create a new loss term which enforces the $\nabla_{i,j}$ term for the off-diagonals by sampling random pairs with $i\neq j$. These can be sampled uniformly, I think.
  2. Train the explainer to predict both SHAP values and SHAP interactions simultaneously.

The tricky part is obviously thinking about how to construct the interaction loss term(s). My intuition is that it is something in the ballpark of the following pseudocode:

# For simplicity of indexing, I'm writing this as if x is a single sample rather than a batch,
# and pred is a plain SHAP interaction matrix rather than a tensor.

pred = interaction_explainer(x)
shap_vals = fastshap(x)

# the pred matrix must sum to the same total as the true SHAP values
shap_sum_loss = mse_loss(shap_vals.sum(), pred.sum())

# the columns and rows of the matrix must sum to the same totals as the SHAP values
shap_col_loss = mse_loss(shap_vals, pred.sum(dim=0))
shap_row_loss = mse_loss(shap_vals, pred.sum(dim=1))

# symmetry loss
shap_sym_loss = mse_loss(pred, pred.T)

# interaction loss
a, b = sample_ij()                    # sample a pair of features with a != b
S = sample_excluding_ij(a, b)         # assume S is just a single coalition for now, not a batch
null = impute(x, set())               # prediction with no features
baseline = impute(x, S)
baseline_with_i = impute(x, S | {a})
baseline_with_j = impute(x, S | {b})
complete = impute(x, S | {a, b})
ifx_S = S @ pred                      # something like this to pick out all cells where both i, j are in S
ifx_S_i = (S | {a}) @ pred            # ... where both i, j are in S union {a}
ifx_S_j = (S | {b}) @ pred            # ... where both i, j are in S union {b}
ifx_S_ij = (S | {a, b}) @ pred        # ... where both i, j are in S union {a, b}
loss_S = mse_loss(null + ifx_S.sum(), baseline)
loss_S_i = mse_loss(null + ifx_S_i.sum(), baseline_with_i)
loss_S_j = mse_loss(null + ifx_S_j.sum(), baseline_with_j)
loss_S_ij = mse_loss(null + ifx_S_ij.sum(), complete)

There might of course be some complexities that I'm totally missing in terms of how i and j are sampled, how S is sampled, etc.

Just wanted to get your thoughts on how complex implementing this would be from a math perspective... i.e. does it need a full paper's worth of work to prove out the right sampling schemes, loss terms, etc., or is it something that should be trivial to figure out?
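
To make the hand-wavy "S @ pred" step above a bit more concrete, this is the kind of masked selection I have in mind (plain PyTorch; coalition_interaction_sum is a hypothetical helper, not anything in the package):

import torch

def coalition_interaction_sum(pred, s):
    # pred: (num_features, num_features) candidate interaction matrix
    # s:    (num_features,) binary mask, s[k] = 1 if feature k is in the coalition
    cell_mask = torch.outer(s, s)      # 1 exactly where both the row and column feature are in S
    return (cell_mask * pred).sum()

# Toy usage with random stand-ins for the explainer output and a coalition.
num_features = 5
pred = torch.randn(num_features, num_features)
s = torch.tensor([1., 0., 1., 0., 0.])   # coalition S = {0, 2}
s_with_a = s.clone()
s_with_a[1] = 1.                         # S union {a} with a = 1
ifx_S = coalition_interaction_sum(pred, s)
ifx_S_a = coalition_interaction_sum(pred, s_with_a)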

Notebook issue: the kernel appears to have died. It will restart automatically.

Hi!

I got this notification while running this line in cifar_single_model.ipynb:

fastshap.train(
    fastshap_train,
    fastshap_val,
    batch_size=128,
    num_samples=2,
    max_epochs=200,
    eff_lambda=1e-2,
    validation_samples=1,
    lookback=10,
    bar=True,
    verbose=True)

I reinstalled fastshap, but the issue remains.

Does anyone know what I should do?

Thanks!
