
reiinakano / neural-painters

261.0 261.0 60.0 18.44 MB

License: Creative Commons Attribution 4.0 International

Languages: Jupyter Notebook 97.59%, HTML 1.96%, TeX 0.25%, JavaScript 0.15%, CSS 0.04%

neural-painters's People

Contributors: reiinakano

neural-painters's Issues

Review #1

The following peer review was solicited as part of the Distill review process.

The reviewer chose to keep anonymity.

Distill is grateful to the reviewer for taking the time to review this article.


I liked the diagram style a lot. The writing felt hurried; I'd recommend the authors take additional time to polish it. There aren't many spelling mistakes.

The writing seems solid but whimsical? The editors can decide whether that's OK for Distill, but I'd recommend more concrete, ML-specific language where possible. ("it can kill training due to bad gradients": why are the gradients bad? Are they zero? etc.)
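For context on that remark: a hard discretization has zero derivative almost everywhere, so no learning signal flows through it. A minimal sketch with toy functions (my illustration, not the article's code):

```python
def hard_round(x):
    # Non-differentiable discretization: snaps a continuous value to {0, 1}.
    return float(x >= 0.5)

def finite_diff(f, x, eps=1e-4):
    # Central-difference estimate of the derivative of f at x.
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# Away from the 0.5 threshold the estimated gradient is exactly zero,
# so gradient descent gets no signal through the discretization step.
print(finite_diff(hard_round, 0.3))  # 0.0
```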

As there aren't that many claims beyond what's shown in the paper, the claims are well supported. The linked "Colaboratory" notebooks seem very helpful. (I have not evaluated them beyond trying them out.)

The article does not explicitly look into limitations of neural painters. While the authors cite many recent ML papers, I'd recommend they also look into the broader NPAR (Non-Photorealistic Animation and Rendering) literature and classic work in this area of computer graphics (think SIGGRAPH in the early '90s), such as (not an exhaustive list):

  • Cockshott T, England D, "Wet and Sticky: Supporting Interaction with wet paint", Proceedings of BCS HCI '91, Cambridge University Press, August 1991.
  • Aaron Hertzmann. "Painterly Rendering with Curved Brush Strokes of Multiple Sizes." Proceedings of SIGGRAPH'98

I also recommend "Non-Photorealistic Computer Graphics: Modeling, Rendering, and Animation", by Thomas Strothotte, Stefan Schlechtweg (https://isgwww.cs.uni-magdeburg.de/pub/books/npr/index.htm)


Distill employs a reviewer worksheet as a help for reviewers.

The first three parts of this worksheet ask reviewers to rate a submission along certain dimensions on a scale from 1 to 5. While the scale meaning is consistently "higher is better", please read the explanations for our expectations for each score—we do not expect even exceptionally good papers to receive a perfect score in every category, and expect most papers to be around a 3 in most categories.

Any concerns or conflicts of interest that you are aware of?: No known conflicts of interest
What type of contributions does this article make?: Novel results

Advancing the Dialogue Score
How significant are these contributions? 2/5
Outstanding Communication Score
Article Structure 2/5
Writing Style 2/5
Diagram & Interface Style 4/5
Impact of diagrams / interfaces / tools for thought? 2/5
Readability 3/5
Scientific Correctness & Integrity Score
Are claims in the article well supported? 3/5
Does the article critically evaluate its limitations? How easily would a lay person understand them? 3/5
How easy would it be to replicate (or falsify) the results? 4/5
Does the article cite relevant work? 2/5
Does the article exhibit strong intellectual honesty and scientific hygiene? 2/5

AttributeError: 'module' object has no attribute 'io'

Thanks for sharing this awesome project reiinakano - I've been working on running your notebooks locally on my own machine inside a docker container.

When running the notebook learning_human_strokes.ipynb I get the following error when running the code mnist_dispenser = MNISTDispenser():

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-cf17af3d2e33> in <module>()
----> 1 mnist_dispenser = MNISTDispenser()

<ipython-input-5-da69b5df8a4d> in __init__(self, data_dir, screen_size)
      4         self.data_dir = data_dir
      5         self.height, self.width = screen_size, screen_size
----> 6         self.prepare_mnist()
      7 
      8     def get_random_target(self, num=1, squeeze=False):

<ipython-input-5-da69b5df8a4d> in prepare_mnist(self)
     14 
     15     def prepare_mnist(self):
---> 16         ut.io.makedirs(self.data_dir)
     17 
     18         # ground truth MNIST data

AttributeError: 'module' object has no attribute 'io'

Looking at the code, it looks like the line import utils as ut imports utils.py from the SPIRAL-tensorflow repo. utils.py doesn't define an io attribute, which is why the code breaks.

I checked the git history of the file in case it used to have an io method but no longer does, and didn't manage to find anything.

Is this correct? I'm not sure what the ut.io module is supposed to do, so I'm nervous about patching a workaround over it.
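For anyone hitting the same error: assuming ut.io.makedirs was only ever a directory-creation helper (an assumption; I haven't found its original definition either), one workaround is a small shim that prepare_mnist can call instead:

```python
import os

# Hypothetical stand-in for the missing `ut.io.makedirs` -- assumes the
# helper only created the data directory if it didn't already exist.
def makedirs(path):
    if not os.path.isdir(path):
        os.makedirs(path)

makedirs("/tmp/neural_painters_data")  # safe to call repeatedly
```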

Review #2

The following peer review was solicited as part of the Distill review process.

The reviewer chose to waive anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.

Distill is grateful to Ludwig Schubert for taking the time to review this article.


Structure

This submission introduces the concept of a "neural painter", defined to be a differentiable painting "simulator". It explores:

  • How to directly optimize such differentiable architectures without RL
  • A comparison to existing methods and results
  • Applying neural painters to class visualization, using their regularizing effect
  • Applying neural painters to style transfer, again capitalizing on their regularizing effect

The submission divides cleanly into these three parts (the last two bullet points share one section), and introduces each section with a brief introduction of motivation and any potentially unusual technical details. For example, when introducing the action space of the model, the author details the artistic consequences of the space.

The current article structure is clear, but reads to me like this:

  • problem setup
  • solution via neural painters
  • comparison to an RL solution
    • interlude about discrete action spaces
  • adding preconditioning
  • using neural painters for other things

I wonder if the submission could be tighter if it took the discrete action space problem as its central theme, at least initially. The story would flow like:

  • problem setup
  • present current approach: RL
  • explain problem of discrete action spaces
  • show neural painters work well without it
  • and can be extended using preconditioning and used generally as a differentiable parameterization.

No need to take this as a literal suggestion, but I recommend the authors think explicitly about the flow of ideas the submission presents, and whether that flow feels consistent and logical over the entire paper.

General Feedback

The writing feels very approachable, which I enjoyed a lot! It uses compelling visualizations that are well designed and feel inviting to the reader. I wonder if some of the interactions could be simplified; for example, in Figure 1 I couldn't figure out whether the color chooser is relevant to the point of the diagram. What if the stroke changed control points already on hover? I believe that might help discoverability of the interaction even more.
In addition to the interactive diagrams, the author uses static, structural diagrams that help visualize their optimization setups at a glance.

I enjoyed the article as an exploration of possibilities, and as a bringing-together of various recent techniques. While the submission doesn't present fully novel or unexpected results, I found it an enjoyable read. Some sections end a little unmotivated, leaving me wanting to know more about the precise ideas they hint at as future work. (e.g. "Many interesting environments […] will have unavoidable discrete actions": so how could neural painters overcome this issue? Are there other approaches than just optimizing in a continuous relaxation? How might they fare compared to the current approach?)
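To illustrate the continuous-relaxation alternative mentioned above (my sketch, not the article's method): a hard one-hot action choice can be relaxed into a temperature-scaled softmax, which is differentiable for any positive temperature and approaches a hard choice as the temperature shrinks:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of logits.
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
soft = softmax(logits, temperature=1.0)        # smooth, spread-out weights
near_hard = softmax(logits, temperature=0.05)  # nearly one-hot
```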

The submission could increase the significance of its contributions, for example by adding analyses of what the differences between neural painters and existing approaches lead to, or maybe by going into more depth in the explanations of the various setups to be more educational?
Generally, I'd recommend the authors give more context or discussion to each individual result. The current writeup does feel a little rushed. Not that brevity isn't a virtue, but at the moment it often left me curious for more details. The content feels exploratory rather than explanatory to me. That makes for swift reading, yet I wish the submission fit more content into the same amount of text it currently has.

I believe the submission to be in scope for the Distill journal, and with some amount of polish can become a great contribution to Distill!

Minor Comments

  • Figure 2's outline style is inconsistent with others
  • Some figures, e.g. Figure 6, have very small text that might be easier to read as part of a sidenote or the main text
  • Most figures might be made to feel visually lighter by removing their black outline
  • Spelling or formatting: "L2 distance"
  • In all three-column figures (e.g. Figure 11) it would help to inline the legend from the caption into the figure (e.g. target image, neural painter, MyPaint)

Section 1

This section introduces the idea of neural painters, as well as two ways of differentiably training them.

"The reasoning behind this will be explained in more detail in a later section." -> I wonder if you can explain this inline, or in another way make this feel less mysterious?

"serve as a good starting point" -> This is a nitpick, but I'd suggest rewording to a more precise adjective than good. Maybe the notebooks are helpful, convenient, or simply "a starting point"?

Section 2

This section reproduces results from an RL-based approach with neural painters.

"We can simply train the agent using regular adversarial methods." -> I was confused at first about what this was clarifying; I imagine "gradient-based" might be a more broadly used term to contrast this setup with an RL one?

"You can explore the results for our agent in the diagram below:" -> I wish the results were presented in a way that makes them easier to compare at a glance (maybe a static diagram in addition to the live one?), maybe pointing out differences or the lack thereof. The reduction in compute would be nice to make a little more quantitative, though of course training setups may vary widely.

"The effect of discrete actions" -> The first paragraph here is hard for me to understand: "Why? It is not impossible to train a neural painter on a discrete action space." Since this sentence comes some time after the first mention of this, it may help to mention why the discrete setup is not differentiable. The last sentence also feels more like an assertion than an argument. I wish it first introduced me to the argument a little more ("steelmanning") and then explained why this is not a problem for the suggested setup.

"Somehow, the network has decided that", "it is free to do it however it wants", "At worst, it can kill training due to bad gradients." -> I'm not generally against anthropomorphizing language, but these instances seem like they could be replaced by more precise phrases.

Section 3

This section adds preconditioning in order to generate more human-like strokes. It is really well motivated, and the approach is clearly laid out. What if it was shown in a static diagram in addition to the bullet point list?

"we need to condition the agent in a way that it distinguishes different modes of the class" -> This sounds really interesting; I wish the section hadn't stopped there! No need to solve this issue in this paper, but I'd love to hear potential approaches. Listing those, even if they have clear failure modes (as long as those are disclosed), can still be an effective tool to get readers to wonder about solutions themselves, potentially leading to more engagement with the topic!

Section 4

How cool to see how cleanly the idea of a differentiable image parameterization maps to neural painters! The examples demonstrate the potential of neural painters, though I wish they were put more into the context of existing approaches: showcasing them in isolation makes it hard to evaluate their suitability beyond qualitative judgements of visual appeal ("how good does it look").

"the optimal panda" -> It's not the panda that's optimal, right? :D Maybe a phrasing such as "optimize an image that causes the network to assign it the label 'panda' with maximum probability"?
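Incidentally, "assign the label with maximum probability" is just gradient ascent on a class score with respect to the input. A one-dimensional toy sketch (hypothetical score function, not a real network):

```python
def score(x):
    # Toy differentiable "class score", maximized at x = 3.
    return -(x - 3.0) ** 2

def grad(x):
    # Analytic derivative of the toy score.
    return -2.0 * (x - 3.0)

# Gradient ascent: repeatedly nudge the input uphill on the score.
x = 0.0
for _ in range(200):
    x += 0.1 * grad(x)
```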



Any concerns or conflicts of interest that you are aware of?: No known conflicts of interest
What type of contributions does this article make?: Novel results

Advancing the Dialogue Score
How significant are these contributions? 2/5
Outstanding Communication Score
Article Structure 4/5
Writing Style 3/5
Diagram & Interface Style 4/5
Impact of diagrams / interfaces / tools for thought? 4/5
Readability 3/5
Scientific Correctness & Integrity Score
Are claims in the article well supported? 4/5
Does the article critically evaluate its limitations? How easily would a lay person understand them? 3/5
How easy would it be to replicate (or falsify) the results? 5/5
Does the article cite relevant work? 3/5
Does the article exhibit strong intellectual honesty and scientific hygiene? 3/5
