distillpub / post--building-blocks Goto Github PK

View Code? Open in Web Editor NEW

85.0 85.0 26.0 573.46 MB

The Building Blocks of Interpretability

Home Page: https://distill.pub/2018/building-blocks

License: Creative Commons Attribution 4.0 International

HTML 7.71% JavaScript 90.08% TeX 0.84% EJS 1.37%

post--building-blocks's Introduction

Post -- Exploring Bayesian Optimization

Breaking Bayesian Optimization into small, sizable chunks.

To view the rendered version of the post, visit: https://distill.pub/2020/bayesian-optimization/

Authors

Apoorv Agnihotri and Nipun Batra (both IIT Gandhinagar)

Offline viewing

Open public/index.html in your browser.

NB - the citations may not appear correctly in the offline render

post--building-blocks's People

Contributors

Stargazers

Watchers

post--building-blocks's Issues

events.js:141 throw er; // Unhandled 'error' event

I get the following error when running npm run dev


> saliency@ dev /home/user/distill
> cross-env NODE_ENV=development webpack-dev-server --hot

events.js:141
      throw er; // Unhandled 'error' event
      ^

Error: spawn webpack-dev-server ENOENT
    at exports._errnoException (util.js:870:11)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:178:32)
    at onErrorNT (internal/child_process.js:344:16)
    at nextTickCallbackWith2Args (node.js:441:9)
    at process._tickCallback (node.js:355:17)
    at Function.Module.runMain (module.js:444:11)
    at startup (node.js:136:18)
    at node.js:966:3

npm ERR! Linux 4.13.0-38-generic
npm ERR! argv "/usr/bin/nodejs" "/usr/bin/npm" "run" "dev"
npm ERR! node v4.2.6
npm ERR! npm  v3.5.2
npm ERR! code ELIFECYCLE
npm ERR! saliency@ dev: `cross-env NODE_ENV=development webpack-dev-server --hot`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the saliency@ dev script 'cross-env NODE_ENV=development webpack-dev-server --hot'.
npm ERR! Make sure you have the latest version of node.js and npm installed.
npm ERR! If you do, this is most likely a problem with the saliency package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR!     cross-env NODE_ENV=development webpack-dev-server --hot
npm ERR! You can get information on how to open an issue for this project with:
npm ERR!     npm bugs saliency
npm ERR! Or if that isn't available, you can get their info via:
npm ERR!     npm owner ls saliency
npm ERR! There is likely additional logging output above.

npm ERR! Please include the following file with any support request:
npm ERR!     /home/user/distill/npm-debug.log

with the following content in the log file


0 info it worked if it ends with ok
1 verbose cli [ '/usr/bin/nodejs', '/usr/bin/npm', 'run', 'dev' ]
2 info using [email protected]
3 info using [email protected]
4 verbose run-script [ 'predev', 'dev', 'postdev' ]
5 info lifecycle saliency@~predev: saliency@
6 silly lifecycle saliency@~predev: no script for predev, continuing
7 info lifecycle saliency@~dev: saliency@
8 verbose lifecycle saliency@~dev: unsafe-perm in lifecycle true
9 verbose lifecycle saliency@~dev: PATH: /usr/share/npm/bin/node-gyp-bin:/home/user/distill/node_modules/.bin:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/texlive/2016/bin/x86_64-linux:/home/user/bin:/home/user/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/lib/jvm/java-9-oracle/bin:/usr/lib/jvm/java-9-oracle/db/bin
10 verbose lifecycle saliency@~dev: CWD: /home/user/distill
11 silly lifecycle saliency@~dev: Args: [ '-c',
11 silly lifecycle   'cross-env NODE_ENV=development webpack-dev-server --hot' ]
12 silly lifecycle saliency@~dev: Returned: code: 1  signal: null
13 info lifecycle saliency@~dev: Failed to exec dev script
14 verbose stack Error: saliency@ dev: `cross-env NODE_ENV=development webpack-dev-server --hot`
14 verbose stack Exit status 1
14 verbose stack     at EventEmitter.<anonymous> (/usr/share/npm/lib/utils/lifecycle.js:232:16)
14 verbose stack     at emitTwo (events.js:87:13)
14 verbose stack     at EventEmitter.emit (events.js:172:7)
14 verbose stack     at ChildProcess.<anonymous> (/usr/share/npm/lib/utils/spawn.js:24:14)
14 verbose stack     at emitTwo (events.js:87:13)
14 verbose stack     at ChildProcess.emit (events.js:172:7)
14 verbose stack     at maybeClose (internal/child_process.js:821:16)
14 verbose stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:211:5)
15 verbose pkgid saliency@
16 verbose cwd /home/user/distill
17 error Linux 4.13.0-38-generic
18 error argv "/usr/bin/nodejs" "/usr/bin/npm" "run" "dev"
19 error node v4.2.6
20 error npm  v3.5.2
21 error code ELIFECYCLE
22 error saliency@ dev: `cross-env NODE_ENV=development webpack-dev-server --hot`
22 error Exit status 1
23 error Failed at the saliency@ dev script 'cross-env NODE_ENV=development webpack-dev-server --hot'.
23 error Make sure you have the latest version of node.js and npm installed.
23 error If you do, this is most likely a problem with the saliency package,
23 error not with npm itself.
23 error Tell the author that this fails on your system:
23 error     cross-env NODE_ENV=development webpack-dev-server --hot
23 error You can get information on how to open an issue for this project with:
23 error     npm bugs saliency
23 error Or if that isn't available, you can get their info via:
23 error     npm owner ls saliency
23 error There is likely additional logging output above.
24 verbose exit [ 1, true ]

Spritemap notebooks

In the sprite map section of the ipython notebook linked in Teaser.html it is mentioned "Check out other notebooks on how to make your own neuron visualizations." do you have any specific recommendations other than say the tensorflow deepdream notebook?

In teaser image, hovering over spatial locations doesn't change the feature visualization

The interface design figure says: "For instance, let us consider our teaser ﬁgure again. (...) hovering over spatial locations gives us neuron-speciﬁc attribution." Perhaps I misunderstood the caption, but the figure shows a bidirectional arrow between image and feature visualization, so I thought hovering over the image would change the feature visualization, but it doesn't seem to change anything. Here's what I see:

Using Chrome on macOS. No obvious console errors.

Review #3

The following peer review was solicited as part of the Distill review process.

The reviewer chose to keep keep anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service them offer to the community.

Distill is grateful to the reviewer for taking the time to review this article.

Conflicts of Interest: Reviewer disclosed a minor potential conflict of interest that the editors did not view as a substantive concern.

Section Summaries and Comments

This submission uses the core interpretability technique proposed in Feature Distillation (Olah et al) to address core concerns in interpretability (visualization, attribution, summarizing to human absorbable amounts of information).

The article first motivates the reason to understand the network through canonical examples (in this case visualizations of concepts maximally activating neurons).

One comment on this: while using visual examples definitely gives a finer degree of granularity than lumping several abstractions together under one description -- e.g. “floppy ear detectors” -- it seems likely that any visual concept that the network reliably learns that is unfamiliar to humans is likely to still be overlooked, even using canonical examples. (This would be an example of the human scale issue). One way to study this further (probably in future work) might be to extract semantic concepts that are repeatedly learned, and thus identify these important concepts that aren’t immediately human recognizable.

The next section introduces a way to perform feature visualization at every spatial location across all channels. The main contribution of this section is definitely the layer by layer visualization of different concept detectors (and with the nice follow on image of concepts scaled by activation magnitudes). It might help slightly to have a sentence introducing this in the start of the section. The visualizations are fantastic, and really give a sense of how the model comes to its decision. I’m very excited to see such visualizations applied to other image datasets (particularly those where we might use ML to make scientific discoveries), as well as other domains (e.g. language).

Perhaps this is a personal preference, but when the authors describe the optimization procedure, I would have really liked to see a simple equation or two (maybe even as an aside), so that the mathematically minded readers can have a (simplified) mathematical description. Several parts were unclear to me: e.g. do you initialize with the image and then optimize to maximize the sum of the activations over all channels? Or something else?

Some mathematical description is also something I would have liked to see in the Human Scale section when describing the matrix factorization problem. (There are links to colab notebooks which is an excellent resource, but it would be nice to have an equation or two instead of having to look through the code to work this out.)

The next section presents what is arguably the most important visualization of the article, of seeing how different concepts at different layers influence and are influenced by concepts in other layers. This is the first visualization of its kind that I’ve seen, and seems like it could be extremely valuable in helping determine failure cases of neural networks, or assess how robust they are. It would be fascinating to see this diagram for adversarial images for example. One follow up question to this visualization might be: there has been work that suggests that saliency maps are not entirely reliable (and this is indeed discussed later in the article). Is this kind of influence visualization robust to other attribution methods? (Perhaps other attribution methods could even be built into the model, e.g. attention.)

The authors then overview an attempt to categorize and summarize the interpretability conclusions being presented using matrix factorization. As mentioned before, it would have been nice to see some mathematical descriptions of the setup. It is good to see the human scale problem being addressed but the results here appear to be preliminary, and there is likely much more work to be done in the future. One related direction is the following: in this article, it appears that the “semantic dictionary” is built up from the output classes and humans looking at some of the feature detectors and labelling them, e.g. “floppy ears”. Have the authors considered methods to automatically cluster together similar concepts? (E.g. first collecting a set of canonical examples corresponding to a certain class, and then applying clustering to see what natural groups they fall into?)

Final Takeaway

This is an excellent submission studying how to develop tools to better interpret networks. To me, the two most exciting contributions were the layer by layer canonical example visualizations and the influence diagrams showing how different features influenced each other across layers. I’m very excited to see how these can be applied across different domains to interpret and learn from the neural networks we train.

Minor Comments

Without having read other distill articles, it’s not immediately obvious that the little orange box selecting a certain part of the image can actually be moved around, and it might be worthwhile noting this somewhere. (I missed this entirely in the very first image at first, and only discovered it by accident.)
The wording ‘this explodes the ﬁrst problem combinatorially’ seems a little awkward to me. Maybe something of form ‘this greatly exacerbates the first problem’ might be better instead?
At the start of the How are concepts assembled section, there is a note on feature visualization answering the “what” question, but not the “how”. I would argue that to answer the “how” we would need to study the neural network through training and see how it became sensitive to different features.

Error while using a different model than inception_v1 in your notebook, unable to evaluate gradients (inception_v4)

Hi,

I'm trying to generalize your code in the Channel Attribution - Building Blocks of Interpretability notebook,
in order for it to work on all state-of-the-art CNN models.

I've used @ludwigschubert 's guide suggested here tensorflow/lucid#34
for converting prepared models to the modelzoo format, and successfully converted inception_v4 & nasnet_large for now.
I get prepared models here: https://github.com/tensorflow/models/tree/master/research/slim

I've used modelzoo files in order to create visualizations of channels in each network, and created spritemaps for each layer:
https://github.com/osoffer/Diamond-Cutter/tree/master/models/inception_v4/spritemaps

When I'm trying to load the modelzoo file for inception_v4 and use it in Channel Attribution - Building Blocks, an error occurs in the channel_attr_simple method.
It's not clear what layer I need to choose to replace "softmax2_pre_activation", set in your example for inception_v1
(line: logit = T("softmax2_pre_activation")[0])

I've tried all inception_v4 layers coming after the last convolutional layer, non of them worked.
When trying to calculate gradients, I get an error.
(I can add a link for a google colab notebook that reproduces my error )

for choosing layer InceptionV4/Logits/AvgPool_1a/AvgPool, the error is:
InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Computed output size would be negative: -2 [input_size: 5, effective_filter_size: 8, stride: 1]
[[node import/InceptionV4/Logits/AvgPool_1a/AvgPool (defined at /tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py:1748) ]]
[[gradients/AddN_5/_3]]
(1) Invalid argument: Computed output size would be negative: -2 [input_size: 5, effective_filter_size: 8, stride: 1]
[[node import/InceptionV4/Logits/AvgPool_1a/AvgPool (defined at /tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.

for choosing layer InceptionV4/Logits/Predictions, the error is:
(0) Invalid argument: Computed output size would be negative: -2 [input_size: 5, effective_filter_size: 8, stride: 1]
[[node import/InceptionV4/Logits/AvgPool_1a/AvgPool (defined at /tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py:1748) ]]
[[gradients/AddN_5/_3]]
(1) Invalid argument: Computed output size would be negative: -2 [input_size: 5, effective_filter_size: 8, stride: 1]
[[node import/InceptionV4/Logits/AvgPool_1a/AvgPool (defined at /tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.

the error occurs at line:
grad = t_grad.eval()

I use this modelzoo file, for inception_v4: https://drive.google.com/uc?id=15CmJ4UbUm8MXp8h0uHwbEe0n8rPrxARp

full error trace:

InvalidArgumentErrorTraceback (most recent call last)
in ()
3 model_layer = {"inception_v4" : "InceptionV4/InceptionV4/Mixed_7b/concat"}
4 class1 = "Labrador retriever"
----> 5 channel_attr_wrapper(img_s, model_layer, class1, n_show=10)

8 frames
in channel_attr_wrapper(img_s, model_layer, class1, n_show)
7 last_layer = models[model_name]["last_layer"]
8 channel_attr_simple(selected_model, model_name, model_layer[model_name],
----> 9 last_layer, img_s, n_show, class1)

in channel_attr_simple(selected_model, model_name, layer_name, last_layer, img_s, n_show, class1)
1 def channel_attr_simple(selected_model, model_name, layer_name, last_layer, img_s, n_show, class1):
2 # calc model activations
----> 3 channel_attr = channel_attr_simple_org_core(img_s, layer_name, last_layer, class1, selected_model)
4 channel_attr = channel_attr / len(img_s)
5

in channel_attr_simple_org_core(img_s, layer, last_layer, class1, selected_model)
45 # print(type(t_grad))
46 #print("t_grad shape " + str(t_grad.shape))
---> 47 grad = t_grad.eval()
48 print("grad")
49 print(grad)

/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.pyc in eval(self, feed_dict, session)
796
797 """
--> 798 return _eval_using_default_session(self, feed_dict, self.graph, session)
799
800 def experimental_ref(self):

/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.pyc in _eval_using_default_session(tensors, feed_dict, graph, session)
5405 "the tensor's graph is different from the session's "
5406 "graph.")
-> 5407 return session.run(tensors, feed_dict)
5408
5409

/tensorflow-1.15.2/python2.7/tensorflow_core/python/client/session.pyc in run(self, fetches, feed_dict, options, run_metadata)
954 try:
955 result = self._run(None, fetches, feed_dict, options_ptr,
--> 956 run_metadata_ptr)
957 if run_metadata:
958 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/tensorflow-1.15.2/python2.7/tensorflow_core/python/client/session.pyc in _run(self, handle, fetches, feed_dict, options, run_metadata)
1178 if final_fetches or final_targets or (handle and feed_dict_tensor):
1179 results = self._do_run(handle, final_targets, final_fetches,
-> 1180 feed_dict_tensor, options, run_metadata)
1181 else:
1182 results = []

/tensorflow-1.15.2/python2.7/tensorflow_core/python/client/session.pyc in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1357 if handle is None:
1358 return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1359 run_metadata)
1360 else:
1361 return self._do_call(_prun_fn, handle, feeds, fetches)

/tensorflow-1.15.2/python2.7/tensorflow_core/python/client/session.pyc in _do_call(self, fn, *args)
1382 '\nsession_config.graph_options.rewrite_options.'
1383 'disable_meta_optimizer = True')
-> 1384 raise type(e)(node_def, op, message)
1385
1386 def _extend_graph(self):

InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Computed output size would be negative: -2 [input_size: 5, effective_filter_size: 8, stride: 1]
[[node import/InceptionV4/Logits/AvgPool_1a/AvgPool (defined at /tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py:1748) ]]
[[gradients/AddN_5/_3]]
(1) Invalid argument: Computed output size would be negative: -2 [input_size: 5, effective_filter_size: 8, stride: 1]
[[node import/InceptionV4/Logits/AvgPool_1a/AvgPool (defined at /tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.

Original stack trace for u'import/InceptionV4/Logits/AvgPool_1a/AvgPool':
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"main", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/usr/local/lib/python2.7/dist-packages/ipykernel_launcher.py", line 16, in
app.launch_new_instance()
File "/usr/local/lib/python2.7/dist-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelapp.py", line 499, in start
self.io_loop.start()
File "/usr/local/lib/python2.7/dist-packages/tornado/ioloop.py", line 888, in start
handler_func(fd_obj, events)
File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 456, in _handle_events
self._handle_recv()
File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 486, in _handle_recv
self._run_callback(callback, msg)
File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 438, in _run_callback
callback(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelbase.py", line 233, in dispatch_shell
handler(stream, idents, msg)
File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
user_expressions, allow_stdin)
File "/usr/local/lib/python2.7/dist-packages/ipykernel/ipkernel.py", line 208, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/usr/local/lib/python2.7/dist-packages/ipykernel/zmqshell.py", line 537, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2828, in run_ast_nodes
if self.run_code(code, result):
File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 5, in
channel_attr_wrapper(img_s, model_layer, class1, n_show=10)
File "", line 9, in channel_attr_wrapper
last_layer, img_s, n_show, class1)
File "", line 3, in channel_attr_simple
channel_attr = channel_attr_simple_org_core(img_s, layer_name, last_layer, class1, selected_model)
File "", line 7, in channel_attr_simple_org_core
T = render.import_model(selected_model, t_input, t_input)
File "/usr/local/lib/python2.7/dist-packages/lucid/optvis/render.py", line 234, in import_model
model.import_graph(t_image, scope="import", forget_xy_shape=True)
File "/usr/local/lib/python2.7/dist-packages/lucid/modelzoo/vision_base.py", line 62, in import_graph
self.graph_def, {self.input_name: t_prep_input}, name=scope)
File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/importer.py", line 405, in import_graph_def
producer_op_list=producer_op_list)
File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/importer.py", line 517, in _import_graph_def_internal
_ProcessNewOps(graph)
File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/importer.py", line 243, in _ProcessNewOps
for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py", line 3561, in _add_new_tf_operations
for c_op in c_api_util.new_tf_operations(self)
File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py", line 3451, in _create_op_from_tf_operation
ret = Operation(c_op, self)
File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py", line 1748, in init
self._traceback = tf_stack.extract_stack()

My goal is to have all state-of-the-art models for CNN available for use in your Channel Attribution tool, and adding more features for the task of choosing a model for transfer learning.

I would really appreciate your help.
Thanks!

Module build failed: TypeError: Cannot read property 'watchRun' of undefined in AllActivationGrids.html

I'm trying to run a dev repo.

My environment:

$ node --version
v9.3.0
$ npm --version
5.6.0

I see this in the browser. Note the page is working behind the error message.

Here is what I see in the terminal:

$ npm run dev

> saliency@ dev /Users/admin/temp/distil/post--building-blocks
> cross-env NODE_ENV=development webpack-dev-server --hot

Project is running at http://localhost:8080/
webpack output is served from /
Content not from webpack is served from /Users/admin/temp/distil/post--building-blocks/public
404s will fallback to /index.html
   574 modules

ERROR in ./src/diagrams/AllActivationGrids.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/Loading.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/ExamplePicker.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/Atoms.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/SemanticDict.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/StickyPicker.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/ActivationVecVis.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/ActivationGridSingle.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/Teaser.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/AttributionSpatial.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/AttributionChannel.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/ActivationGroups.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/Grammar.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/AttributionGroups.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/CubeGroups.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)

ERROR in ./src/diagrams/CubeNatural.html
Module build failed: TypeError: Cannot read property 'watchRun' of undefined
    at new VirtualModulesPlugin (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/lib/virtual.js:35:17)
    at Object.module.exports (/Users/admin/temp/distil/post--building-blocks/node_modules/svelte-loader/index.js:83:46)
webpack: Failed to compile.

spelling mistake

On the top of the page when selecting the Hog image, the caption says "dalmation" instead of "dalmatian".

this caption: Pointy ears seem to be used to classify a "hog". A dot detector is contributing highly to a "dalmation" classification.

Colab notebook does not work as is

Currently I get the following error:

SO suggest that changing TF version fixes this: https://stackoverflow.com/questions/61218853/problem-with-tensorflow-package-when-its-used-by-lucid-package

Some errors in "Spatial Attribution with Saliency Maps" visualization

Hi, great article! I found some small errors that some people maybe stumple upon when going in detail through the "Spatial Attribution with Saliency Maps" visualization:

When seleting the image of the hog and clicking on "center of the snout" in the description a piece in the center of the hog is selected and not the center of the snout.

You probably mean this center of the snout location

Same thing when clicking on "right ear" in the description: It shows a location which is centered on the right eye and not the ear.

You probably mean a position focused on the right ear like this

When selecting the image of the Brambling and clicking on "chain" in the description something weird happens. The position in the picture stays the same as the previously selected position (here on the bird head), the output attribution is a mixture of chain and brambling (which doesnt exist in the image) and the shown piece in "mixed4d" is the patch in the top left corner of the picture.

I really appreciate the article, very good work. :)

Triage

High Priority

Semantic dict with different bases/neuron groups (@arvind).
Finalize running examples (@arvind, @colah).

Lower Priority

Finalize example picker UI -- sticky/hovering/embedded? (@shancarter).
Better colab stickers/captions (@shancarter).

Done

Some <d-figure> tags failing to render with Firefox

On Firefox 69.0.1 (running on an Ubuntu 18.04.3 LTS) some <d-figure> tags are failing to render. Noticeable in the header teaser image:

As well as in the images with id="ActivationGroups" and id="AttributionGroups":

These render correctly on Chrome 76.0.3809.100 (Official Build) (64-bit).

Thanks for creating such a rich platform !

Design of controls for spatial attribution

The present "controls" feel a bit messy to me:

A couple possible redesigns:

Review #1

The following peer review was solicited as part of the Distill review process. The review was formatted by the editor to help with readability.

The reviewer chose to waive anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service them offer to the community.

Distill is grateful to the reviewer, Qiqi Yan, for taking the time to write such a thorough review.

Summary:

There have been tons of techniques in the literature developed for probing into image networks to try to interpret how they operate. This article nicely categorizes those techniques into (1) feature visualization (2) feature attribution (3) feature grouping, and shows that these can be integrated into an interactive interface on top to let users get a sense of the internals of an image network.

Overall I definitely recommend accepting the article. The community would love this article, partially because it’s a nice integration of techniques, partially because it is visually appealing (I do wonder if some warnings should be given in the article on that visualization can give a false sense of understanding).

One challenging topic that I wish gets more discussion is, for these efforts of probing / visualizing internals of image networks:
Is it really necessary to probe into the internals? The network is trained to have some reasonable human-understandable i/o behavior. Maybe doing analyses also on i/o level is human-friendly, usually good enough, and seemingly safer as internals are often too complex / misleading.
Is it actually useful? If yes, any concrete or anecdotal example of how the interfaces to internals have informed people to take some actions, like fixing a type of misclassification?
Is it consistent or stable? For two functionally equivalent networks with different architectures, would the interfaces give similar view? For two identical networks, with different parameter initialization, would the interfaces give similar view?
Addressing these questions may not be easy or necessary for this article. But currently the article seems to at least lack discussions on motivations and utilities. The main sentence on motivation I can find is: by reference 1 (another long article), “we need to both construct deep abstractions and reify (or instantiate) them in rich interfaces”. Honestly I’m not sure how to interpret this deep sounding but actually super vague sentence.

Some comments:

“Interpretability” to me means the degree to which a model is interpretable. It should be an attribute of a model. (common use: linear models are more interpretable than deep ones) But the article is mostly about for given a model, how to probe into its internals.
The article only deals with image models. Should mention this somewhere. I actually think there is the potential of doing the whole thing for text models too.
When you say you use Googlenet, I prefer it immediately mentioned that the techniques don’t work well for e.g. ResNet (yet), as part of main text, not just a footnote. Otherwise I worry that an unfamiliar reader can walk away with the wrong impression that this works well for all popular image networks.
There are many types of feature visualizations, using image patches that maximize activations, deepdream style optimization, deconvolution, etc. I think there are pros and cons for each of them. Somewhere the article switches to using “feature visualization” to refer to only the deep dream style visualization, without explanations. There are people (like me) who have doubts about how good deepdream is as a feature visualization method, as optimization procedure could introduce artifacts.
Should briefly describe what mixed3a is.
For the visualization with varying box sizes, describe which correspond to activation level? The length of edge, or area of the box?
What does “linearly approximating” mean? Either give a reference, or describe it more precisely.
Footnote 6 seems to be in some functional PL syntax? Doesn’t look like it compiles. (I read Haskell only.) Seems an overly syntactic way of describing the space.
Overall on the writing style, it’s a long article, I’d wish there were more highlighted keywords / bulleted sentences so that the experts can skim through more easily.

Tough to reproduce flickering bug

When hovering over the activation grids, scrolling sideways so the pointer lands on a magnified activation visualization leads to flickering. See demo here.

Maybe it just needs a relaxed scope for the hover state?

Switch off drill-down view when changing example

Right now I can switch examples while in drill-down mode, leading to weird interface states such as this one:

While at it, it feels weird to me that the selected example is highlighted but still transparent—consider making it fully opaque when selected? (Maybe a matter of taste, though.)

Misspelling in notebook

https://colab.research.google.com/github/tensorflow/lucid/blob/master/notebooks/building-blocks/AttrChannel.ipynb

"...of them, and dependancies such as..."

Should be "dependencies".

Design Space Diagram

spelling: trusthworthy -> trustworthy

Great publication! I just led a well-received reading group discussion on it.

Minor typo in the Conclusions section: "trusthworthy" should be "trustworthy".

Typo

In the last paragraph of the section "How Trustworthy Are These Interfaces?":

post--building-blocks/public/index.html

Line 877 in 778a8de

 Trusting our interfaces is essentially for many of the ways we want to use interpretability. 

Trusting our interfaces is essentially for many of the ways we want to use interpretability.

Should it be "essential"?

typo

For example, should the interface emphasize what the network recognizes, prioritize how its understanding develops, or focus on making thing human-scale. --> For example, should the interface emphasize what the network recognizes, prioritize how its understanding develops, or focus on making things human-scale.

Review #2

The following peer review was solicited as part of the Distill review process. The review was formatted by the editor to help with readability.

Distill is grateful to the reviewer, Guillaume Alain, for taking the time to write such a thorough review.

The paper titled “The Building Blocks of Interpretability” demonstrates a number of ways in which we can visualize the role of elements of an image, and then suggests a unifying view to make life easier for the researcher who has to interpret all of this.

For the first half of the paper, we are given a lot of numbers and pictures. While these are interesting by themselves (the pretty colours sure help in that regards), it’s hard to know what to actually do with those elements. I’m a bit torn because of the fact that I expect Distill papers to be short and on point (which isn’t the case here), but on the flip side everything else is definitely right for a Distill paper (lots of visualization, good exposition with minimal math). The authors go on to say

We will present interfaces that show what the network detects and explain how it develops its understanding, while keeping the amount of information human-scale.

which leads to the question : Was the goal of the first half of the paper just to show how hard is it to use those visualizations? (establishing the problem to propose the solution afterwards)

I’m particularly fond of the second picture in “What Does the Network See”. By the way, is there a reason why the figures are not numbered in any way? It makes them harder to reference.

The concept of “neuron groups” is particularly nice. There is value is recognizing that this whole visualization business is not about single pixels or connected regions. When I read the paper, though, I thought that the authors were going to combine features from different layers. They leave that as a possible, but maybe it’s better to avoid it in order to simplify the visualization.

The authors suggest a certain formal grammar to help us find our way through all those visualization techniques. This is a good idea. I’m sure not that they’ve put their finger on the correct form, though, but don’t have a concrete suggestion to improve it.

They uses nodes and arrows in a table. One would imagine a priori that arrows would be operators and nodes would be objects. But when it comes to “attribution” the nodes are labeled “T”, which is basically just a dummy letter to provide a dummy node on which they can attach the attribution arrow. What they really wanted to link were the boxes in the table, but that would be clumsy. Then we have dotted arrows users for filters, which is nice. But we also have nodes that are stuck on other nodes (like the blue “I” on the green “T” from the “Filter by output attribution”). I’m guessing that being stuck on a node basically corresponds to a “full arrow” that is not being filtered. My point is that this feels more like an informal sketch than a formal way of describing things (which might end up being too limited because new ideas are hard to fit within a constrained framework). It’s worth including in the paper, but I’m not sure it’s as mature as the other ideas in there.

Throughout the reading of this paper, I wondered about how adversarial examples played into all this. I imagine that adversarial examples are not sufficiently well-understood and they would be more of a distraction than anything here. The authors reference the importance of good interfaces when faced with adversarial examples, but they don't really voice an opinion about whether they are an efficient first line of defence against them, or if they are easily fooled.

typo:
adverserial -> adversarial

Small rendering issue on certain screen width

MobileNet Visualization

I have been playing around trying to visualize MobileNets using lucid and a couple of other tools as well. No matter the tools I'm using I am having difficulty getting meaningful results, thus I must be missing something or doing something wrong. Most of the images that come out of these techniques have results that are either vanishingly small values that go to zero when converting to an integer for visualization or are seemingly random (i.e. no discernible structure). However when I use the same tools with other networks the results are what one would expect. This is with the image value range adjusted to something appropriate for each model.

Is there something about the MobileNet architecture that causes this to be the case? Is the image value range the only values I should have to tune here or are there others as well? Have you had any success visualizing hidden layers in MobileNet? I spent sometime looking into this with some rudimentary searches and can't seem to find any posts about the issue I am seeing. The models that I have been using are VGG16, InceptionV2, and MobileNetV1 all trained on ImageNet.

When example changes, default to interesting position

The only challenge here is a lack of a unified code path?

Question regarding "we use dimensionality reduction to produce a multi-directional saliency map"

Can you please elaborate a bit on this process (perhaps via a short reply to this issue)?

I'm guessing you reduce the thousand-dimension vector to some point on a line thru color space between blue and orange in the example given, where blue and orange correspond to the two top competing types of classes (e.g. orange : dog and blue : cat)