
distillpub / post--growing-ca

82 stars · 7 watchers · 24 forks · 19.02 MB

Growing Neural Cellular Automata

Home Page: https://distill.pub/2020/growing-ca/

License: Creative Commons Attribution 4.0 International

HTML 73.16% JavaScript 11.41% TeX 14.16% CSS 0.42% Shell 0.23% Python 0.61%

post--growing-ca's Introduction

Growing Neural Cellular Automata

post--growing-ca's People

Contributors

arvind · colah · eyvindn · fredhohman · girving · ludwigschubert · ncammarata · oteret · znah


post--growing-ca's Issues

Hero diagram feedback

(Originally discussed by email, but capturing on GitHub to allow future reference.)

Here's a first mock-up, inspired by Ian Johnson and Shan Carter's work on the t-SNE article (https://distill.pub/2016/misread-tsne/). You could probably copy some of the necessary CSS from there.

image

A bit more design feedback:

image

Training algorithm illustration

Old diagram:

image

Proposed new version:

image

I think this would be a little clearer. In the old version:

  • I was a little confused about the placement of "backprop through time".
  • Why don't the inputs/outputs feed into the update process? (I could imagine thinking it was a separate diagram side by side.)
  • What are the light gray arrows? Why are they different?

I'm also always a fan of more annotation. :)

CA not working on MacBook

My MacBook (in both Chrome and Safari) shows the simulation as the following slowly growing lump. It works fine on my PC notebook.

Screen Shot 2020-02-19 at 3 45 42 PM

Screen Shot 2020-02-19 at 3 46 05 PM

Misc design feedback

(1) Align title with hero diagram, so instead of:

image

you have:

image

(2) I love your table of contents in the margin! <3
image

(3) I also love that you moved the code blocks into the margin. Two things to consider: (a) I think this section doesn't need the bullet points; bold paragraph headers would suffice. (b) Could we tighten the code for this first one slightly so there isn't the awkward white space? Or split the paragraph into two?

image

(4) Another awkward whitespace spot. Could we move this block below the figure so that on desktop instead of:

image

we get:

image

(5) Whitespace on top of code blocks causes them not to align with the paragraph when in the margin. Let's fix this:

image

Just add something like:

@media (min-width: 1700px) {
  d-code {
    ...
    margin-top: -10px;
  }
}

Living cell masking

The description and the 'A single update step of the model' illustration don't match. The description says:

alive = max_pool(state_grid[:, :, 3], (3,3)) > 0.1
state_grid = state_grid * cast(alive, float32)

But the Colab notebook (and the illustration) do:

pre_life_mask = get_living_mask(x)

.. update state ...

post_life_mask = get_living_mask(x)
life_mask = pre_life_mask & post_life_mask
return x * tf.cast(life_mask, tf.float32)

In the Colab notebook, cells are only living if they were living before and are still living after the update?

What is the reason behind this modified life_mask?
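For reference, here is a minimal NumPy sketch of what `get_living_mask` computes (assuming, as in the article, that channel 3 is the alpha channel and the aliveness threshold is 0.1; the NumPy max-pool is just for illustration, the notebook uses `tf.nn.max_pool2d`):

```python
import numpy as np

# Illustrative NumPy version of get_living_mask: a cell counts as "alive"
# if any cell in its 3x3 neighborhood has alpha > 0.1 (alpha = channel 3).
def get_living_mask(state):
    alpha = np.pad(state[:, :, 3], 1)          # zero-pad the border
    h, w = state.shape[:2]
    pooled = np.max([alpha[i:i + h, j:j + w]   # stride-1 3x3 max-pool
                     for i in range(3) for j in range(3)], axis=0)
    return pooled > 0.1

# The notebook then combines the mask before and after the update:
#   life_mask = pre_life_mask & post_life_mask
# so a cell survives only if it is alive both before and after the step.
```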

Separating evolution from representation

Hello, first, thank you for this amazing post.
I tried to modify the model to separate the evolution from the representation,
meaning that I have one function that evolves the state and, at the end, another
function that takes the evolved state and computes a representation that is
compared to the target image. (I also moved the living channel to the first
channel, and this worked fine.)
However, it seems that the gradient is not propagated to the weights of the
evolution layer.
Do you know why?

class CAModel(tf.keras.Model):

  def __init__(self, channel_n=CHANNEL_N, fire_rate=CELL_FIRE_RATE):
    super().__init__()
    self.channel_n = channel_n
    self.fire_rate = fire_rate

    input_with_gradient = tf.keras.Input(shape=(None,None,self.channel_n*3),
                                         name="gradient")
    current_state = tf.keras.Input(shape=(None,None,self.channel_n),
                                         name="current")
    
    evolution =  layers.Conv2D(self.channel_n, 1, activation=tf.nn.relu,
                               name="evolution")(input_with_gradient)

    representation = layers.Conv2D(3, 1, 
                                   activation=tf.nn.relu,
                                   name="representation")(current_state)
    self.model = tf.keras.Model(inputs=[current_state,
                                 input_with_gradient],
                                outputs=[evolution,representation], name="global")

    self(tf.zeros([1, 3, 3, channel_n]))  # dummy call to build the model

  @tf.function
  def perceive(self, x, angle=0.0):
    identify = np.float32([0, 1, 0])
    identify = np.outer(identify, identify)
    dx = np.outer([1, 2, 1], [-1, 0, 1]) / 8.0  # Sobel filter
    dy = dx.T
    c, s = tf.cos(angle), tf.sin(angle)
    kernel = tf.stack([identify, c*dx-s*dy, s*dx+c*dy], -1)[:, :, None, :]
    kernel = tf.repeat(kernel, self.channel_n, 2)
    y = tf.nn.depthwise_conv2d(x, kernel, [1, 1, 1, 1], 'SAME')
    return y

  @tf.function
  def call(self, x, fire_rate=None, angle=0.0, step_size=1.0):
    pre_life_mask = get_living_mask(x)

    y = self.perceive(x, angle)
    dx, representation = self.model([x, y])
    dx = dx * step_size
    if fire_rate is None:
      fire_rate = self.fire_rate
    update_mask = tf.random.uniform(tf.shape(x[:, :, :, :1])) <= fire_rate
    x += dx * tf.cast(update_mask, tf.float32)

    post_life_mask = get_living_mask(x)
    life_mask = pre_life_mask & post_life_mask
    casted_life_mask = tf.cast(life_mask, tf.float32)

    # the representation is the alpha channel followed by the RGB channels

    return x * casted_life_mask, tf.concat([casted_life_mask,
                                            representation * casted_life_mask],
                                           axis=-1)

and for the evolution of the state:

for i in tf.range(iter_n):
  x, representation = ca(x)
loss = tf.reduce_mean(loss_f(representation, img))

Outdated Tensorflow version in Google Colab

Hi @znah, @oteret, others,

First of all, my compliments on the amazing article!

I was having a look at the Google Colab notebook provided with the original Growing NCA Distill article.

When running the notebook, I encountered an error that might originate from an incompatibility with newer versions of TensorFlow (and tf.keras). The two things I encountered:

  • when the training loop tries to save the model using export_model, I get the following error: "ValueError: The filename must end in .weights.h5. Received: filepath=train_log/0000". This can be solved by changing ca.save_weights(base_fn) to ca.save_weights(base_fn + '.weights.h5') in export_model

  • when making this adjustment in export_model, I am no longer able to run the interactive demo cell "TensorFlow.js Demo" at the end of the notebook. The demo works perfectly for the pre-computed models, but not for the CHECKPOINT setting, which should use the model that was just trained. When inspecting the HTML, it appears to fail when parsing the content of the model.json file in /train_log.
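A standalone sketch of the save_weights fix from the first point above (the helper name `checkpoint_path` is hypothetical; in the notebook the change lives inside `export_model`):

```python
def checkpoint_path(base_fn):
    """Append the suffix newer Keras versions require for save_weights."""
    suffix = '.weights.h5'
    return base_fn if base_fn.endswith(suffix) else base_fn + suffix

# In export_model, replace ca.save_weights(base_fn) with:
#   ca.save_weights(checkpoint_path(base_fn))
```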

I have the feeling it is because of the updated tf or tf.keras version. The default notebook runs with tf = 2.17.0 and tf.keras = 3.4.1, which are likely newer versions than the ones available when the notebook was originally published. Do you maybe know what the problem could be? And would you have a solution to get the interactive demo working on newly trained models?

Thanks a lot for the help! 🙌

Faster and more stable training

Hi, first thank you for your insightful work.

During my experiments, I found that the training results vary, and they seem to depend heavily on the weight initialization.

For example, if you use kaiming_uniform for the update convolutional layer, the picture the CA generates soon "blows up": within a few iterations every pixel becomes NaN.

I suspect this is because, in the train_step function, the number of iterations is sampled from range(64, 96), so gradient descent cannot correct the weights in time; with an unlucky initialization you never get a satisfactory result.

To avoid this, I suggest doing some "warm-up" first: set the iteration count to a very small range (e.g. (1, 9)) at first, and gradually raise it as you train the model.

In my tests, with this warm-up you get good results regardless of the weight-initialization method, and it also speeds up training.

It is as if you first teach the model some simple characteristics of the picture (like color and position), and later give it more room for harder tasks (like the details of the picture).
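A sketch of the suggested warm-up schedule (the function name and parameters are mine; it assumes a training loop that, like the notebook's train_step, samples the CA rollout length from a (lo, hi) range each step):

```python
def rollout_range(step, warmup_steps=2000, start=(1, 9), final=(64, 96)):
    """Linearly grow the (lo, hi) range of CA iterations per training step."""
    t = min(step / warmup_steps, 1.0)       # 0 -> 1 over the warm-up period
    lo = start[0] + int(t * (final[0] - start[0]))
    hi = start[1] + int(t * (final[1] - start[1]))
    return lo, hi

# e.g. inside the training loop:
#   iter_n = np.random.randint(*rollout_range(step))
```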
