distillpub / post--feature-visualization

Feature Visualization

Home Page: https://distill.pub/2017/feature-visualization/

License: Creative Commons Attribution 4.0 International

HTML 97.13% JavaScript 2.74% TeX 0.11% CSS 0.02%

article

post--feature-visualization's Introduction

Post -- Exploring Bayesian Optimization

Breaking Bayesian Optimization into small, sizable chunks.

To view the rendered version of the post, visit: https://distill.pub/2020/bayesian-optimization/

Authors

Apoorv Agnihotri and Nipun Batra (both IIT Gandhinagar)

Offline viewing

Open public/index.html in your browser.

NB - the citations may not appear correctly in the offline render

post--feature-visualization's Issues

ux: freeze header and scroll rows in regularization options table

This would make the post much more legible on phones and tablets.

Change “car bodies” to “car faces”

This is a small point, so feel free to dismiss it. The article says: “Below, a neuron responds to two types of animal faces, and also to car bodies.” It might be more accurate to say ‘car faces’ or ‘car fronts’ instead of ‘car bodies’.

The primary evidence for this is that the images in the dataset that the neuron responds to appear to predominantly be car faces rather than sides or rears:

While the following isn’t evidence, note that this also matches our human intuition: there is something in common between the ‘face’ of a car and the ‘face’ of an animal.

A more definitive test would be to see the level of activation from car sides or rears. I wasn’t able to easily determine this using this neuron’s page in Microscope, but I could see extending Microscope to make this type of investigation easier.

Thanks for your excellent work folks 🙏

Typo in 10th citation

The 10th citation of the paper, "Plug & play generative networks: Conditional iterative generation of images in latent space", contains an unnecessary backslash (\).

Please remove it.

s/neuron to responds/neuron responds

s/neuron to responds/neuron responds

Citations in RegReview

Pre-render RegReview table so that citations show up correctly.

Table #feature-vis-history wraps poorly on iPad

https://distill.pub/2017/feature-visualization/#feature-vis-history

Above browser width 1051px it displays correctly.

From 1024px to 1050px it wraps the last column. From 1000px to 1023px it's using smaller text (css breakpoint) but it still wraps the last column. From 768px to 999px it wraps two columns. In this entire range from 768px to 1050px there's unused space in the margins — the table could be wider and not wrap.

Under 768px there's no longer any space in the margins, so it necessarily wraps.

I don't know what to actually do about it though. Although the margin has plenty of space, it'd be inconsistent with the rest of the page CSS to use the margin in this way. Maybe shrink the font size further from 768px to 1050px? Or maybe shrink the spacers. The left column might also shrink to make it not wrap.

I only noticed this because I was reading on the iPad (10.5") and it wrapped both in landscape and portrait mode.

Hovering on Sprites Sometimes Breaks

Sliders won't work anymore

Unfortunately, the sliders in the article do not work anymore and there are many reference errors when loading the page (ReferenceError: map is not defined). Since the article loads the general template.v2.js file, I guess that there is some version mismatch of d3.js between the recent version from the template file and the version used to write the article.

I don't have a solution but as a hacky workaround, it is possible to generate new sliders with the following code snippet (open the developer tools and paste it into the console window):

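// Workaround: add a native range input next to each broken <d-slider>.
// Paste into the browser console on the article page.
document.querySelectorAll('d-slider').forEach(function(slider) {
    // Mirror the original slider's range, step, and current value.
    var input = document.createElement('input');
    input.type = 'range';
    input.min = slider.min;
    input.max = slider.max;
    input.step = slider.step;
    input.value = slider.value;

    input.addEventListener('input', function() {
      // Relies on Svelte internals attached to the element, so it may
      // stop working if the page's compiled components change.
      var component = slider._svelte.component;
      var each_block_value_1 = slider._svelte.each_block_value_1,
              config_var_index = slider._svelte.config_var_index,
              config_var = each_block_value_1[config_var_index];
      // The subtraction coerces the string value to a number and applies an offset of one.
      component.setDeep('var_values.' + config_var.name, this.value - 1);
    });

    // Append the replacement input after the slider's siblings.
    slider.parentNode.appendChild(input);
});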

unable to load simple.html

From the readme: "Visit localhost:8080/index.html for a compilation of all components, localhost:8080/simple.html for an isolated example."

However, npm run dev followed by opening http://localhost:8080/simple.html results in: Cannot GET /simple.html

I'd find it pretty helpful to see how the developers iterated on the components - I'm assuming that something like simple.ejs is in the git history somewhere.

If it's helpful, I'd be happy to spin up a PR either reviving simple.ejs, or updating the README to reflect the current state.

Just some small typos

Visualizations of all channel are available in the appendix. (channels)

This article focusses on feature visualization. (focuses)

In the following sections we’ll examine techniques to get diverse visualizations, understand how neurons interact, and avoid high frequency artefacts. (artifacts)

How can we chose a preconditioner that will give us these beneﬁts? (choose)

used in adverserial work (adversarial)

As long as a correlation is consistent across spatial positions — such as the correlation between a pixel and its left neighbor being the same across all positions of an image — the Fourier coefﬁcients will be independant variables. (independent)

expand on basis directions comment

it seems that others have also found this passage harder than normal to parse. https://stackoverflow.com/questions/47268920/how-to-understand-individual-neurons-are-the-basis-directions-of-activation-spa

Anonymous Review 1

The following peer review was solicited as part of the Distill review process. Some points in this review were clarified by an editor after consulting the reviewer.

The reviewer chose to keep anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.

Distill is grateful to the reviewer, for taking the time to review this article.

Conflicts of Interest: Reviewer disclosed no conflicts of interest.

The images shown in this paper are truly fascinating, and provide an interesting and useful way to visualize the behavior of neurons in a neural network. Overall, this article is well written and contains many useful results, but in many areas it omits background material, which makes it difficult to follow exactly what is occurring. Once clarified, this article would be highly useful to those not already familiar with the area.

There seems to be a short background paragraph that's missing from the introduction. This article immediately jumps into talking about feature visualization through optimization, but skips some important questions. What network is being used? On what dataset? Similarly, I presume that "conv2d0", "mixed3a", etc are layers of a network. If I look it up, I can see that it's Inception, but it would be good to state this explicitly. Similarly, what does "mixed4b,c" mean? Similarly, some figures are not clearly explained. In the figure talking about different objectives, what do x, y, z, and n represent? Is the layer_n the same n as the softmax[n]? (I assume not.)

I was caught off guard that, after spending the majority of the paper describing how optimization can be used to produce these figures, the authors then say that directly optimizing for the objectives doesn't work. It might have been nice to mention this fact earlier, and just forward reference it -- if someone were to stop reading halfway through they would just think that by performing direct optimization they'd be golden. The next section does survey regularization techniques (but even then, none of the regularized figures look nearly as nice as the ones prior). This seems to be the most important part of the paper, but I feel like I get the fewest details about how this is done. It also leaves me wondering which regularization methods were used to make the earlier figures.

When discussing preconditioning, I get the feeling that this is an important aspect of generating high-quality images, but I don't actually know what is happening. How is something spatially decorrelated? What is done to minimize over this space? Similarly, what does "Let’s compare the direction of steepest descent in a decorelated parameterization to two other directions of steepest descent" mean? I would expect there is only one steepest direction. How do you pick two other steepest ones that aren't the same? What does "compare" mean, and how do you compare to two others? This sentence seems important, but I don't understand what it is trying to say. (CSS issue: footnotes 7 and 8 do not display in Chrome.) On the whole, this section could be better explained.

Minor comments to author:

At various points, the authors make statements saying "it would be impossible to list all the things people have tried." or "The truth is that we have almost no clue how to select meaningful directions" or "and we don’t have much understanding of their beneﬁts yet". These statements are definitely true -- but they seem out of place and unnecessarily negative.
I'm not sure what "As is often the case in visualizing neural networks, this problem was initially recognized and addressed by Nguyen, Yosinski, and collaborators." is supposed to mean. I take it to mean that Nguyen, Yosinski, and collaborators often do the first work in visualization areas, is this right?
I didn't quite understand the purpose of the italicized text under the headers.
The phrase is "adversarial examples" not "adversarial counterexamples".
"And if we want to create examples of output classes from a classiﬁer, we have two options:" but nothing follows the colon, there's an image (with 5 figures, not 2). Was the sentence cut off?

Receptive Fields

First of all, congratulations on this post, or even on the whole of distill.pub. I really think it excels at explaining a lot of things precisely, beautifully, and thoroughly!

In terms of CNN, I like looking at the receptive field a lot. In this post, you mention that the first layers do have a smaller RF. I would love to be able to visualize that, either by cropping the visualizations, or by overlaying a small ROI box. (Personally, I also like to shade everything except the RF of a neuron by blending it to black 80% or so.)

I do understand the limitations of this approach, and the difference between the theoretical and the effective receptive field (which would also be nice to be visualized), but I also do think that a simple RF visualization would be a good addition to this post.
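
As a rough illustration of this suggestion, the JavaScript sketch below computes the theoretical receptive field size of a stack of convolution or pooling layers with the standard recurrence (each layer grows the field by its kernel size minus one, times the accumulated stride), and dims everything outside a square region by blending toward black. The function names, the example layer list, and the 80% default are assumptions made for illustration. They are not taken from the article's code or from the network it visualizes.

```javascript
// Theoretical receptive field of stacked layers, each given as { kernel, stride }.
// Padding and dilation are ignored; this is the theoretical field, not the effective one.
function receptiveFieldSize(layers) {
  let size = 1; // before any layer, a unit covers exactly one input pixel
  let jump = 1; // distance in input pixels between neighbouring units
  for (const { kernel, stride } of layers) {
    size += (kernel - 1) * jump;
    jump *= stride;
  }
  return size;
}

// Dim a grayscale image (2D array of numbers) outside the square region
// [x0, x0 + size) x [y0, y0 + size) by blending toward black.
function shadeOutside(image, x0, y0, size, amount = 0.8) {
  return image.map((row, y) =>
    row.map((value, x) => {
      const inside = x >= x0 && x < x0 + size && y >= y0 && y < y0 + size;
      return inside ? value : value * (1 - amount);
    })
  );
}

// Hypothetical two-layer stack, chosen only to exercise the recurrence.
const exampleLayers = [{ kernel: 7, stride: 2 }, { kernel: 3, stride: 2 }];
console.log(receptiveFieldSize(exampleLayers));
```

The same shading could be applied per colour channel to an RGB visualization. Showing the effective receptive field would need a different approach, for example one based on gradients.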

Anonymous review 3

The following peer review was solicited as part of the Distill review process. Some points in this review were clarified by an editor after consulting the reviewer.

Distill is grateful to the reviewer, for taking the time to review this article.

Conflicts of Interest: Reviewer disclosed no conflicts of interest.

Review Result for Feature Visualization in distill.pub

Feature visualization is one of the important techniques for understanding what neural networks have learned from image datasets. This paper focuses on optimization methods, discusses major issues, and explores common approaches to solving them.

This paper is well organized, and the logic is clear and easy to follow. It starts with why one would use optimization to visualize features in neural networks rather than finding examples from the dataset. It then discusses how to achieve diversity with optimization, which overcomes the diversity problem to some extent. It further discusses the interaction between neurons, exploring how combinations of neurons work together to represent images in neural networks. Finally, it discusses how to improve the optimization process by adding different regularizations. I appreciate the authors' effort in giving readers a comprehensive overview of feature visualization, mainly focusing on optimization methods. It would be good to add more description of the technical parts, such as the preconditioning and parameterization part. Also, I feel the title is a little broad, since there are many other feature visualization methods, such as input modification methods and deconvolutional methods.

I agree that optimization can isolate the causes of behavior from mere correlations and is more flexible than finding real examples in the dataset. However, apart from the diversity issue mentioned in the paper, real examples from the dataset seem to be more interpretable than examples generated by optimization. Examples generated by optimization can sometimes be very hard for people to interpret directly. I think the authors have noticed this, so they put dataset examples as the last column in the table in the spectrum of regularization part. I suggest the authors add more arguments for the advantages of using optimization methods to visualize features learned by neural networks.

Some other comments:

This paper misses some experiment setting details, for example, what are the model and dataset used in this paper? How do you generate the images in this paper? Also, what do you actually mean by "diversity term"? These questions need to be further answered in the paper.
This paper focuses on optimization methods for feature visualization. There are other methods for feature visualization; see the paper “A Taxonomy and Library for Visualizing Learned Features in Convolutional Neural Networks”. For example, the deconvolution method is used in the paper "Visualizing and Understanding Convolutional Networks", and the method of training a CNN to reconstruct images from the activations is used in the paper “Inverting visual representations with convolutional networks”. It would be better to discuss these methods and compare against them.
The paper mentions that "we’ll discuss it in more depth later" and "we’ll talk about them in much more depth later", but I do not see any explanation later in this paper.
Some sentences are not formal, such as "What do we want examples of?" and "If neurons are not the right way to understand neural nets, what is?"
Some typos, such as: "because it separates the things causing behavior from from things that merely correlate with the causes." ==> remove one "from"
"some approach to regularization will be one of their main points." ==> "approaches"

I personally feel that the writing of this paper is not so formal as an academic paper, and it looks more like a blog. Overall, it conducts a comprehensive survey on optimization methods for feature visualization, but does not propose new methods for feature visualization.

For an academic paper, I suggest that the authors systematically summarize their approaches and further improve this draft.

According to the criteria of distill.pub, and compared to other papers published in distill.pub, I think it can be accepted with some revisions.

Broken link in reference [17]

The following entry in the bibliography contains a broken link:

[17] DeepDreaming with TensorFlow [link]
Mordvintsev, A., 2016.

match the text in joint optimization diagram default view

It would reduce cognitive load if the default selection for the joint optimization diagram matched the selection described in the text.

The text says black and white + mosaic. The diagram could show this black and white + mosaic view by default.

[Screenshots omitted: suggested default view; current default view]

Pre-rendered KaTeX breaks

[placeholder for not forgetting about this and reminder to talk to @shancarter about why this happens]

`Independent`

https://www.google.com/search?q=independant

Availability of "Present Code Base"?

Just to clarify, under "Infrastructure Contributions" in "Author Contributions", the article mentions "Alex, Chris and Ludwig all contributed signiﬁcantly to reﬁning this into the present code base."

I know there is a link to this code as having been previously published, but I am wondering if the "present code base" refers to the code used to generate all of the images and carry out the investigation of GoogLeNet. If so, is that code also available?

Thank you for another excellent article.

Anonymous review 2

This is an anonymous review that I am sharing from a peer reviewer. They sent it to me as an e-mail with formatted text. Since I don't know of a way to copy-paste a formatted e-mail into Markdown, I'm just sharing it as an RTF document:

https://drive.google.com/a/google.com/file/d/0Bz8CQw2wxLVwUEF5TTY3SjRoUkU/view?usp=sharing

s/these/this

also potentially

s/And in/In

Appendix "Layer" images not loading in Firefox

When I visit e.g., the appendix page for Layer 3a using Firefox 60.0.1 on either my macOS or Windows systems, none of the images load.

(They do load if I use Chrome.)

Feature Visualization Article not correctly displayed

The newest article "Feature Visualization" link is not correctly displayed.
I'm unsure what the bug is, but figures are displayed way too big and the overall alignment of divs seems to be off. I tried IE 11.1770.14393.0 and Firefox 45.3.0. At first sight, other articles are displayed correctly in both browsers.

Gram matrix notation unclear - multiplication seems element-wise, but is actually a dot product

In footnote 3, I think that the notation is confusing. For specific i and j, the element G_{i,j} is a number, equal to the dot product between the (flattened) response of filter i and filter j. But the formula makes it seem as if we do an element-wise multiplication (without summation), between \text{layer}_n\text{[:, :, i]} and \text{layer}_n\text{[:, :, j]} . Such an element-wise multiplication would result in a matrix, and not in a scalar. I know that there is a "dot" in there, but I would also add a summation sign, or at least mention that the operator is a dot product which results in a scalar.
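
To make the distinction concrete, here is a minimal JavaScript sketch (not the article's code) that computes one Gram entry from a tiny activation tensor stored as `layer[x][y][channel]`. The helper names `elementwiseProduct` and `gramEntry` and the sample values are hypothetical, chosen only for illustration. The element-wise product of two channels is still a matrix. Only summing those products over all spatial positions gives the scalar G_{i,j}.

```javascript
// Hypothetical 2x2 spatial activations with 2 channels, indexed as layer[x][y][channel].
const layer = [
  [[1, 2], [3, 4]],
  [[5, 6], [7, 8]],
];

// Element-wise product of channels i and j: the result is still a 2D matrix.
function elementwiseProduct(layer, i, j) {
  return layer.map(row => row.map(px => px[i] * px[j]));
}

// Gram entry G[i][j]: the same products, summed over every spatial position.
// This equals the dot product of the two flattened channel responses.
function gramEntry(layer, i, j) {
  let sum = 0;
  for (const row of layer) {
    for (const px of row) {
      sum += px[i] * px[j];
    }
  }
  return sum;
}

console.log(elementwiseProduct(layer, 0, 1)); // a matrix
console.log(gramEntry(layer, 0, 1));          // a single number
```

Writing the formula with an explicit sum over the two spatial indices would express the same thing as `gramEntry` and remove the ambiguity.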