Comments (14)
Good idea, and thanks for pointing me to that code. I've been experimenting with gradient-based sensitivity analysis on the attention weights (rather than the inputs), but the results weren't particularly interpretable. But I may revisit this with some of the newer fine-tuned models. I agree that this would be a useful analysis for the inputs, though it may overload the visualization. I'll think more about this. Thanks.
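For anyone who wants to experiment with the input-side version, here is a minimal sketch of gradient-based saliency over the input embeddings, assuming a Hugging Face `BertForSequenceClassification` model (the checkpoint name and sentence are placeholders; in practice you would load your fine-tuned model):

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Placeholder checkpoint; swap in your fine-tuned classification model.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer("The girl wears a red hat.", return_tensors="pt")

# Embed the tokens ourselves so we can take gradients w.r.t. the embeddings.
embeddings = model.bert.embeddings.word_embeddings(inputs["input_ids"]).detach()
embeddings.requires_grad_(True)

outputs = model(inputs_embeds=embeddings, attention_mask=inputs["attention_mask"])
pred = outputs.logits.argmax(dim=-1).item()
outputs.logits[0, pred].backward()

# L2 norm of the gradient per token as a crude importance score.
saliency = embeddings.grad.norm(dim=-1).squeeze(0)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
print(list(zip(tokens, saliency.tolist())))
```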
Hi, I implemented some gradient-based algorithms; you can check them here: https://github.com/koren-v/Interpret
(Integrated Gradients works pretty well compared to SmoothGrad.)
I can definitely recommend captum as well, they have an example using BERT: https://captum.ai/tutorials/Bert_SQUAD_Interpret
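For reference, here is a rough sketch of what that kind of attribution looks like with captum's `LayerIntegratedGradients` on a sequence-classification model (not the SQuAD setup from the linked tutorial; the checkpoint name and sentence are placeholders, and you would want a fine-tuned classifier for meaningful attributions):

```python
import torch
from captum.attr import LayerIntegratedGradients
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder checkpoint; use a fine-tuned classifier in practice.
name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

def forward_fn(input_ids, attention_mask):
    return model(input_ids=input_ids, attention_mask=attention_mask).logits

enc = tokenizer("He went to play basketball after class.", return_tensors="pt")
baseline_ids = torch.full_like(enc["input_ids"], tokenizer.pad_token_id)
target = int(forward_fn(enc["input_ids"], enc["attention_mask"]).argmax(dim=-1))

# Integrate gradients over the embedding layer, from an all-[PAD] baseline to the input.
lig = LayerIntegratedGradients(forward_fn, model.bert.embeddings)
attributions = lig.attribute(
    inputs=enc["input_ids"],
    baselines=baseline_ids,
    additional_forward_args=(enc["attention_mask"],),
    target=target,
)

# Sum over the embedding dimension to get one score per token.
scores = attributions.sum(dim=-1).squeeze(0)
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
print(list(zip(tokens, scores.tolist())))
```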
I've been wondering how to do this as well, since I want to try to visualize which words were most important to the classification. One idea I've had (if I have 12 encoder layers in BERT and am only fine-tuning the 12th) is to take the output of the 11th layer and the Wq, Wk, and Wv weights of the fine-tuned 12th layer and calculate the attention scores manually. Would that be the correct way to think about it? I would essentially get the value score for each token in the input.
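Roughly speaking, that matches what the self-attention module itself computes. Here is a minimal sketch, assuming a Hugging Face `BertModel`, using the 11th layer's hidden states as the input to the 12th layer's query/key projections (the checkpoint and sentence are placeholders):

```python
import math
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

inputs = tokenizer("The girl wears a red hat.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states[11] is the output of layer 11, i.e. the input to the 12th (last) layer.
x = outputs.hidden_states[11]                       # (1, seq_len, 768)
attn = model.encoder.layer[11].attention.self       # the last layer's self-attention

q, k = attn.query(x), attn.key(x)                   # (1, seq_len, 768) each
n_heads, head_dim = attn.num_attention_heads, attn.attention_head_size

def split_heads(t):
    # (1, seq_len, 768) -> (1, n_heads, seq_len, head_dim)
    return t.view(1, -1, n_heads, head_dim).transpose(1, 2)

q, k = split_heads(q), split_heads(k)
scores = torch.softmax(q @ k.transpose(-1, -2) / math.sqrt(head_dim), dim=-1)
print(scores[0, 0])  # head 0: how much each token attends to every other token
```

These softmax scores are the attention weights; multiplying them by the value projections (and the output projection) reproduces the layer's actual output.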
Hi @jessevig, thanks for your great work. As @lspataro and @Azharo mentioned, I want to ask about some more details. Let's look at the examples below:
As we know, BERT/GPT-2 accept data in the form of "sentence, label" or "prompt, inference, label", corresponding to a text classification task and an NLI task respectively, as in these examples:
Text classification:
s1: The girl wears a red hat and dresses up like a princess. ---label "clothing"
s2: He went to play basketball after class. It seems not many people today. ---label "sport"
As expected, the words "hat" and "dress" in s1 and "basketball" in s2 should receive more attention, i.e. a greater weight.
NLI:
prompt: Some people think that strict punishments for driving offenses are the key to reducing traffic accidents.
inference: Government should invest more in non-profit advertisements, so that they would improve people’s safety awareness potentially
label: 1
As we expected, the words "driving offenses","traffic accidents" in prompt and "government","safety awareness" in inference maybe get more attentions or a greater weight, other words maybe get smaller weights.
We want to get results like these, maybe not a relatively complex visualization results as @jessevig showed, I think this work may need to be done in the last layer of Bert. Furthermore, I would like to see a process.
For example, when we focus on the word hat in s1 above, according to the result clothing, I would like to see how Bert focuses higher weight on hat step by step, which should be the result of a series of time series locking hat, but I am not clear about what it should be, hoping to get your answer. Thank you!
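For the simple per-word weights described above, one rough starting point is to inspect the last layer's attention from the [CLS] token, averaged over heads (keeping in mind that raw attention weights are only a loose proxy for importance). A minimal sketch, assuming a Hugging Face sequence-classification model; the checkpoint name is a placeholder and should be a fine-tuned model in practice:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder checkpoint; in practice load the fine-tuned classifier.
name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

text = "The girl wears a red hat and dresses up like a princess."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

last_layer = outputs.attentions[-1]                   # (batch, heads, seq, seq)
cls_to_tokens = last_layer[0, :, 0, :].mean(dim=0)    # [CLS] row, averaged over heads
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for tok, w in sorted(zip(tokens, cls_to_tokens.tolist()), key=lambda p: -p[1]):
    print(f"{tok:>15s}  {w:.3f}")
```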
Hi there, anyone working on this topic?
I am looking for a way to identify the most important words in a sentence classification task as well.
+1
+1
same question here :)
+1
This package does word importance directly with huggingface transformers, using captum
https://github.com/cdpierse/transformers-interpret
Transformers Interpret is a model explainability tool designed to work exclusively with the 🤗 transformers package.
In line with the philosophy of the transformers package, Transformers Interpret allows any transformers model to be explained in just two lines. It even supports visualizations both in notebooks and as saveable HTML files.
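The two-line usage reads roughly as below (sketched from memory of the project's README, so treat the exact class name, call signature, and checkpoint as assumptions and check the repo for the current API):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers_interpret import SequenceClassificationExplainer

name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

# The two lines: build the explainer, then call it on the text to explain.
cls_explainer = SequenceClassificationExplainer(model, tokenizer)
word_attributions = cls_explainer("I love this movie, I like it a lot")

print(word_attributions)              # list of (token, attribution) pairs
cls_explainer.visualize("viz.html")   # render in a notebook and/or save to HTML
```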
Closing this issue as I feel it's out of scope for BertViz given the other available libraries.
Yeah, same question.
I tried to use the code from https://github.com/cdpierse/transformers-interpret for a multi-class classification task with my pretrained BERT model. I trained my model and saved it using model.save(). Then, when I pass the model path to the explainer, I get this error:
```
OSError: Can't load config for '/content/drive/Shared with me/Colab Notebooks/bert_visualization2.h5py'. Make sure that:

- '/content/drive/Shared with me/Colab Notebooks/bert_visualization2.h5py' is a correct model identifier listed on 'https://huggingface.co/models'

- or '/content/drive/Shared with me/Colab Notebooks/bert_visualization2.h5py' is the correct path to a directory containing a config.json file
```
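One possible cause (an assumption, not verified against this setup): the library loads models through transformers' from_pretrained(), which expects a directory containing config.json and the weights rather than a single Keras .h5py file. A hypothetical fix, assuming the fine-tuned transformers model object is accessible; the path below is a placeholder:

```python
# Save in the Hugging Face format so from_pretrained() can find config.json.
model.save_pretrained("/content/drive/MyDrive/bert_visualization2")
tokenizer.save_pretrained("/content/drive/MyDrive/bert_visualization2")

# Reload the directory (not the .h5py file) before building the explainer.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model = AutoModelForSequenceClassification.from_pretrained("/content/drive/MyDrive/bert_visualization2")
tokenizer = AutoTokenizer.from_pretrained("/content/drive/MyDrive/bert_visualization2")
```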
Related Issues (20)
- Cannot import bertviz HOT 1
- Is there a simple way to use this code to visualize ViT, DeiT architecture? HOT 1
- How to visualize attention if the sizes of input and output sequence are different? ValueError. HOT 1
- How to use language models such as LLaMA / Alpaca / Vicuna with BertViz? HOT 1
- Will bertviz work for vision transformer?
- cannot import name 'Mapping' from 'collections' HOT 6
- Text truncation
- Takes too much time to run the model_view() visualization
- library installed but not found
- Using [MASK] in a sentence.
- neuron view errors
- Selecting multiple tokens at once.
- Visualize EncoderDecoderModel with tied encoder and decoder
- Any plan on upadating the code for LLaMA models? HOT 11
- Is there any way to "pin" the attention view for a single token? HOT 2
- Request for adding the transformers_neuron_view for LLAMA series models HOT 1
- How to visualize the generated tokens?
- Neuron view
- Bug when visualizing T5 models with generate HOT 1
- Issue visualizing layer attention