fursovia / tcav_nlp
"Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)" paper implementation
Home Page: https://arxiv.org/abs/1711.11279
"Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)" paper implementation
Home Page: https://arxiv.org/abs/1711.11279
In calculate_tcav.py, lines 17-18, you load a pickle file called labs_mapping.pkl. I do not see this file being created anywhere in the code. Could you help me understand where it comes from?
Hi Ivan, first of all thank you for sharing your code.
I am starting to run a few experiments and wanted to integrate pre-trained embeddings.
So far I was thinking about adding it to the model_fn and using tf.nn.embedding_lookup.
Alternatively, one could integrate it directly into the vectorize function in input_fn.py.
The following is the snippet I am using to load the embeddings:

import numpy as np

def loadPretrainedModel(file):
    # Loads whitespace-separated embeddings (GloVe-style text format)
    # into a dict mapping word -> numpy vector.
    print("Loading Pretrained Model")
    model = {}
    with open(file, 'r') as f:
        for line in f:
            splitLine = line.split()
            word = splitLine[0]
            embedding = np.array([float(val) for val in splitLine[1:]])
            model[word] = embedding
    print("Done.", len(model), "words loaded!")
    return model
I was considering something like the following:

word_model = loadPretrainedModel(file)
variable = tf.Variable(word_model, dtype=tf.float32, trainable=False)
embeddings = tf.nn.embedding_lookup(variable, features['x'], name='emb_matrix_lookup')
Somehow this is not really working though.
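My current guess is that tf.Variable receives the dict itself rather than a 2-D matrix whose rows line up with the integer ids in features['x']. Here is a minimal sketch of what I think is needed; word2idx is my own placeholder name for the word-to-id vocabulary used by input_fn.py, and emb_dim should match the pretrained vectors:

import numpy as np
import tensorflow as tf

def build_embedding_matrix(word_model, word2idx, emb_dim):
    # Row i must correspond to token id i in features['x'];
    # words without a pretrained vector keep a small random row.
    matrix = np.random.uniform(-0.05, 0.05, (len(word2idx), emb_dim)).astype(np.float32)
    for word, idx in word2idx.items():
        if word in word_model:
            matrix[idx] = word_model[word]
    return matrix

# then, inside model_fn:
emb_matrix = build_embedding_matrix(word_model, word2idx, emb_dim=300)
variable = tf.Variable(emb_matrix, dtype=tf.float32, trainable=False)
embeddings = tf.nn.embedding_lookup(variable, features['x'], name='emb_matrix_lookup')

With trainable=False the pretrained vectors stay fixed during training, which matches what I was trying above.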
Alternatively, one could integrate the gensim library, which might also speed things up.
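For the gensim route, something along these lines should work; the file name is a placeholder, and I am assuming the embeddings are in word2vec text format (GloVe files would need converting with gensim's glove2word2vec script first):

from gensim.models import KeyedVectors

# load pretrained vectors from a word2vec-format text file
word_model = KeyedVectors.load_word2vec_format('embeddings.txt', binary=False)
vector = word_model['word']  # numpy vector for a given word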
I would really appreciate any thoughts you have on this.
––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
In addition to this, I wanted to ask whether you think it is possible to reuse the input_fn.py mapping for computing the gradients inside collect_concepts.py as well.
Currently the code uses:

grads = mw.calculate_grad(sess, labels, train['text'].tolist())

However, the issue is that this doesn't work with input sequences longer than 10, since it doesn't automatically split them into fixed-length sequences.
One way around this is to transform the whole dataset into sequences that respect the sequence length, but ideally one could reuse the same mapping from input_fn that was used for training the model. If you have any ideas on how this could work with TensorFlow's Dataset mapping, that would be great; see the sketch below.
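Roughly what I have in mind, as a sketch only: I am assuming vectorize from input_fn.py maps one raw string to a fixed-length sequence of token ids.

import tensorflow as tf

# reuse the training-time mapping for gradient collection;
# tf.py_func wraps the plain-Python vectorize function
texts = train['text'].tolist()
dataset = tf.data.Dataset.from_tensor_slices(texts)
dataset = dataset.map(lambda text: tf.py_func(vectorize, [text], tf.int64))
dataset = dataset.batch(32)
iterator = dataset.make_one_shot_iterator()
next_batch = iterator.get_next()  # id sequences built with the same mapping as training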
In any case: thanks again for sharing, and I would appreciate any help you could provide.