distillpub / post--grand-tour

License: Creative Commons Attribution 4.0 International

Python 0.27% HTML 3.89% TeX 0.60% CSS 0.34% JavaScript 94.69% GLSL 0.21%

post--grand-tour's Introduction

Post -- Exploring Bayesian Optimization

Breaking Bayesian Optimization into small, sizable chunks.

To view the rendered version of the post, visit: https://distill.pub/2020/bayesian-optimization/

Authors

Apoorv Agnihotri and Nipun Batra (both IIT Gandhinagar)

Offline viewing

Open public/index.html in your browser.

NB - the citations may not appear correctly in the offline render

post--grand-tour's Issues

Review 3 - Anonymous

The following peer review was solicited as part of the Distill review process. The review was formatted by the editor to help with readability.

The reviewer chose to remain anonymous. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.

Distill is grateful to the reviewer for taking the time to write such a thorough review.

General Comments

Most of my concerns have been posted in the "Communication" feedback section under general comments. Please see there for a longer review. I encourage the authors to continue this work and polish the current draft.

Distill employs a reviewer worksheet as a help for reviewers.

The first three parts of this worksheet ask reviewers to rate a submission along certain dimensions on a scale from 1 to 5. While the scale meaning is consistently "higher is better", please read the explanations for our expectations for each score—we do not expect even exceptionally good papers to receive a perfect score in every category, and expect most papers to be around a 3 in most categories.

Any concerns or conflicts of interest that you are aware of?: n/a
What type of contributions does this article make?: Novel results

Advancing the Dialogue	Score
How significant are these contributions?	3/5

Outstanding Communication	Score
Article Structure	3/5
Writing Style	1/5
Diagram & Interface Style	3/5
Impact of diagrams / interfaces / tools for thought?	4/5
Readability	2/5

Remarks on communication

Below I first list some pros of the communication of this submission, followed by a list of concerns and a general review.

Some pros

The first interactive graphic and text in the Background is well motivated and interesting to read
The description and motivation of the article using the Data-visual correspondence is a great framing of the visualization techniques used here, especially when applied to machine learning
After the Grand Tour technique is introduced and explored, the adversarial example exploration was compelling to see other uses of the developed visualization design

Major: Submission readability

While the submission is readable and the main ideas are accurately presented, it suffers from pervasive grammatical errors and uncommon word choices that negatively impact the communication of the research. This is my primary concern with the submission. The research here is interesting and I am excited to see it eventually published, but as is, it needs multiple rounds of prose iteration and improvement to meet academic publishing standards (Distill, or any other computer science conference).

Major: Grand Tour technique introduction

Given the emphasis on the Grand Tour technique, which is an interesting perspective on visualizing neural network activations, there is likely a better way to introduce and frame the technique with respect to modern techniques. Specifically, the phrase “somewhat-forgotten” feels self-defeating and could be improved to excite the reader about this technique. Furthermore, for readers that may not know of it, was it somewhat-forgotten for good reason? Or did it simply fall out of research popularity? Some discussion around this may be helpful to properly situate the technique in history.

Major: Better presentation of technical details

The technical details section would be better placed in the acknowledgements of the article. However, since it is quite long (somewhere between 1/4 and 1/3 of the article), another idea is that it could be highlighted with a banner description and hidden by default with an indicator to reveal upon a reader interaction (e.g., show/hide button).

Minor: Hero / banner interactive graphic

While the top-level graphic is eye-catching and hints at the techniques used in the paper, perhaps a caption and light annotation could help improve its message to readers. This would help push it towards a substantial preview of the article, rather than only a nice-looking animation.

Minor: Broken graphic

“The Grand Tour of the softmax layer” figure was broken: the image/points drop-down menu did not update the visualization correctly, nor did the data instance positions update.

Minor: Miscellaneous

Capitalize headings?
“The math behind the two is simple.” Be careful with phrases such as these, since readers will come from many diverse backgrounds with different levels of mathematical fluency.
Adding a zoom slider to all Grand Tour visualizations (not just the later ones) could be useful
“vis” instead of “visualization,” although explicitly indicated in the submission, reads a bit informal

Scientific Correctness & Integrity	Score
Are claims in the article well supported?	3/5
Does the article critically evaluate its limitations? How easily would a lay person understand them?	3/5
How easy would it be to replicate (or falsify) the results?	4/5
Does the article cite relevant work?	3/5
Does the article exhibit strong intellectual honesty and scientific hygiene?	3/5

Comments on Scientific Correctness and Integrity

While there could be other citations to include to support the first paragraph of the introduction (e.g., work related to visualization and visual analytics in deep learning, neural network interpretability), the current text passes.

In the Discussion, I expected to see citations or relevant links/materials to corroborate “The trade-offs between small multiples and animations is an ongoing discussion in the vis community.”

emdash in page

Thanks for the great paper! The 'The State-of-the-art Dimensionality Reduction is Non-linear' section currently renders with 'emdash'es directly visible. I've tested in both Chrome and Safari so I'm fairly sure this isn't a browser issue.

It would be helpful for the reader to have a more complete list of tour references and currently available software.

This article https://www.jstatsoft.org/article/view/v040i02 provides a contemporary starting point.

Review 2 - Anonymous

The following peer review was solicited as part of the Distill review process. The review was formatted by the editor to help with readability.

Distill is grateful to the reviewer for taking the time to write such a thorough review.

General Comments

The idea of using the Grand Tour approach to high-dimensional visualization for working with neural network representations is very good. The mathematical underpinnings are clean and clever. Most of the illustrative interactive figures are very helpful.

There are a few issues with the paper, mostly due to the writing. For example, while the writeup of the math is overall convincing there are typos that confuse, e.g. including more terms in (dx, dy, ...) than are actually there. I also think that even in the technical explanation part, more care to use citations with linear algebra and neural networks literature would be helpful. Much of the linear algebra is core material, so instead of just citing a textbook, it would be helpful to include a few sentences of explanation rather than stating something as straightforward fact. Even more so with neural networks material throughout the paper - the audience for this work could be wide, so stating things about neural network internals without citation or explanation is damaging. One example is the idea that linear methods for projection are most appropriate because of the linear nature of the networks. This is not really straightforward and I'm not convinced but it is left unsupported. One more example is that a sentence about how network components become vector inputs to this system may widen the audience. Relatedly, the enthusiasm is a bit too high for academic material. Using words like amazing and including an exclamation point are not appropriate. Perhaps the most disconcerting was an assertion at the beginning that neural networks are now the default classifier. This is far from true for many reasons. They are slow and expensive to train and require lots of data, plus they are difficult to interpret, making them a poor choice for many machine learning applications. Not only is that assertion inaccurate, it will be galling to many potential readers.

The main confusion is actually about how the user is meant to take advantage of this technique and when. This seems to be about confusion over two separate things. First, the concept of moving axes vs moving selected groups of points. These distinctions are neatly identified in the math, but I do not see how this works from the user perspective. The case has been made for moving axes to get a good view, but when would the points be more appropriate? Would I have to know which ones to look at from some other source? There is some coverage of the different uses, but it needs to be organized better to lay it out from the user perspective. The other issue is that the idea of a continuum of projections is lovely in the mathematics, but isn't really practical for understanding how a user would take advantage of it. In all cases it seems like someone really needs to know exactly what they're looking for already. I think breaking this down by use cases, again by user tasks, would really clarify what this system can do. Right now, I'm left with an interesting concept, and an implementation I believe in, but no clear vision for how this helps.

Some other comments -

'distribution of data from softmax layers is spherical' - without citation? In this case it's just about orthogonality and makes sense, but that strong unclear statement weakens the paragraph.
typos, especially of the misconjugation variety, are very common
watch GN^{new} vs GT^'. They seem to have been used interchangeably.

Distill employs a reviewer worksheet as a help for reviewers.

Any concerns or conflicts of interest that you are aware of?: n/a
What type of contributions does this article make?: Novel results

Advancing the Dialogue	Score
How significant are these contributions?	4/5

Outstanding Communication	Score
Article Structure	2/5
Writing Style	3/5
Diagram & Interface Style	4/5
Impact of diagrams / interfaces / tools for thought?	4/5
Readability	4/5

Scientific Correctness & Integrity	Score
Are claims in the article well supported?	3/5
Does the article critically evaluate its limitations? How easily would a lay person understand them?	3/5
How easy would it be to replicate (or falsify) the results?	3/5
Does the article cite relevant work?	3/5
Does the article exhibit strong intellectual honesty and scientific hygiene?	3/5

Review 1 - Anonymous

The following peer review was solicited as part of the Distill review process. The review was formatted by the editor to help with readability.

The reviewer chose to remain anonymous. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.

Distill is grateful to the reviewer for taking the time to write such a thorough review.

General Comments

This article presents a novel approach to visualizing the behavior of neural networks using the Grand Tour, a classic linear dimensionality reduction method. The authors argue that the Grand Tour provides a more intuitive way of interpreting the models, compared with widely-used non-linear embedding methods such as t-SNE, because it follows the principle of data-visual correspondence. To explain how the approach works, the article presents multiple animated, interactive visualizations using motivating examples on CNN models for three popular image datasets including MNIST.

I particularly like that the article uses the same intuitive examples (e.g., digit 1 recognized at epoch 14) throughout the sections. Presenting a series of interactive visualizations that use the same examples clearly helps readers easily follow the authors' explanations and understand why the Grand Tour method can be powerful compared with other methods.

The figures are well-designed with interactive visualizations. Many of them show animations of visual representations, which works well in describing why the proposed approach could be better. They are easy to interpret and easy to interact with.

However, I wish the article provided more detailed descriptions of when to use the Grand Tour method, and when not to use it, among several dimensionality reduction techniques. As the Grand Tour has not been used in the context of interpreting neural nets and the authors do not seem to claim that non-linear methods should be replaced with linear ones for every case, providing some practical guidance or limitations would help researchers and practitioners benefit from the article in their work. Discussing its limitations can also promote further research in this area.

Overall, the article is well-structured with a good balance between intuitive explanations using visualizations and theory behind them. But I think the presentation of the beginning part could be improved, especially by restructuring the subsections, or even simply changing the title of the Background section and/or subsections. The Background section is not just designed to provide the "background" knowledge or related work, but to introduce an important motivating example used throughout the article and also present a crucial argument for the article.

There are many grammatical errors, a few of which I list below. I suggest the authors go over the article to fix them.

how neurons activates -> how neurons activate
We have to emphasis one -> We have to emphasize one
The In the figure below -> In the figure below
Examples fall between sandal and sneaker classes indicates -> Examples that fall between sandal and sneaker classes indicate
training and testing dataset -> training and testing datasets
should be consider as -> should be considered as

Distill employs a reviewer worksheet as a help for reviewers.

Any concerns or conflicts of interest that you are aware of?: n/a
What type of contributions does this article make?: Novel results

Advancing the Dialogue	Score
How significant are these contributions?	4/5

Outstanding Communication	Score
Article Structure	4/5
Writing Style	3/5
Diagram & Interface Style	5/5
Impact of diagrams / interfaces / tools for thought?	3/5
Readability	4/5

Scientific Correctness & Integrity	Score
Are claims in the article well supported?	4/5
Does the article critically evaluate its limitations? How easily would a lay person understand them?	2/5
How easy would it be to replicate (or falsify) the results?	3/5
Does the article cite relevant work?	3/5
Does the article exhibit strong intellectual honesty and scientific hygiene?	3/5
