allendowney / thinkcomplexity2

Book and code for Think Complexity, 2nd edition

Home Page: https://allendowney.github.io/ThinkComplexity2/

Makefile 0.07% TeX 5.59% HTML 0.10% Python 1.08% Jupyter Notebook 93.16%

thinkcomplexity2's People

Contributors

allendowney, apan64, branchwelder, buttegab, cjwoodard, daniel6, ericmjl, gwtaylor, jaredbriskman, joeylmaalouf, odewahn, phuston, seanccarter, willythor

thinkcomplexity2's Issues

Cell2DViewer is missing

Where is the class Cell2DViewer? It's missing from Cell2D.py, but Life.py in the code directory uses Cell2DViewer.

Running Notebooks in VSCode

As a developer, I wanted to run the notebooks in VSCode. I struggled a bit to get this working (the imports cell threw ModuleNotFoundError), so I am sharing the steps here for future readers.

Install Anaconda

https://www.anaconda.com/products/distribution

Clone the repo

git clone https://github.com/AllenDowney/ThinkComplexity2.git
cd ThinkComplexity2

Create and activate the environment

conda env create -f environment.yml
conda activate ThinkComplexity2

Select the environment in VSCode

From the Command Palette choose Python: Select Interpreter and select the workspace environment (should be the Recommended one).

VSCode docs.

Open a notebook and select the Conda Kernel

Use the kernel picker in the top right of the notebook editor. In my setup it showed base (Python 3.9.12).

VSCode docs.

Run the import cell to check it works
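
To confirm that the notebook is running in the intended environment, something like the following can be run in a cell. This is only a sketch: the helper name missing_modules is made up for illustration, and the list of packages is an assumption about what the notebooks import, not taken from environment.yml.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names in `names` that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Path of the interpreter behind the current kernel; it should point
# inside the Conda environment if the right kernel is selected.
print(sys.executable)

# Hypothetical package list; adjust it to match the notebook's import cell.
print(missing_modules(["numpy", "networkx", "matplotlib"]))
```

An empty list from the second print suggests the ModuleNotFoundError should not come back for those packages.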

PDF (continuous) plot being used for PMFs

I wasn't sure if it was intentional that you're using thinkplot.Pdf instead of thinkplot.Pmf for the analysis in Chapter 4. This is in section 4.3, e.g.

thinkplot.Pdf(pmf_fb, label='Facebook')
thinkplot.Pdf(pmf_ws, label='WS graph')

It makes more sense to me to use thinkplot.Pmf since we are dealing with discrete quantities.

Using itertools for `make_all_agents` in chapter 11

Currently the make_all_agents function in chapter 11 makes use of binary arithmetic:

def make_all_agents(fit_land, agent_maker):
    """Make an array of Agents.
    
    fit_land: FitnessLandscape
    agent_maker: class used to make Agent
    
    returns: array of Agents
    """
    N = fit_land.N
    xs = np.arange(2**N)
    ys = 2**np.arange(N)[::-1]    
    locs = np.bitwise_and.outer(xs, ys) > 0
    agents = [agent_maker(loc, fit_land) for loc in locs]
    return np.array(agents)

make_all_agents uses the outer product of bitwise_and, which is not the most obvious operation.

Would using itertools.product (which returns the product of any set of spaces) make this more readable/obvious?

def make_all_agents(fit_land, agent_maker):
    """Make an array of Agents.
    
    fit_land: FitnessLandscape
    agent_maker: class used to make Agent
    
    returns: array of Agents
    """
    locations =  itertools.product([0, 1], repeat=fit_land.N)
    agents = [agent_maker(loc, fit_land) for loc in locations]
    return np.array(agents)

(itertools.product(S, repeat=N) creates a generator that gives the elements of $S^N$)

Currently make_all_agents is not included in the text of the book. This is understandable given the slightly complex use of binary arithmetic. Perhaps using itertools would allow it to be included.

Just a suggestion :)

(Note that this particular implementation creates tuples of 0/1 rather than arrays of True/False. This could be refactored.)
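
To check that the two approaches enumerate the same locations in the same order, here is a small sketch. It compares only the location arrays and does not touch FitnessLandscape or the Agent classes. The value N = 3 is chosen just for illustration.

```python
import itertools

import numpy as np

N = 3  # small example size, chosen only for illustration

# Bitwise version, as in the current make_all_agents
xs = np.arange(2**N)
ys = 2**np.arange(N)[::-1]
locs_bitwise = np.bitwise_and.outer(xs, ys) > 0

# itertools version, converted from 0/1 tuples to a boolean array
locs_product = np.array(list(itertools.product([0, 1], repeat=N)), dtype=bool)

print(locs_bitwise.shape)                          # (8, 3)
print(np.array_equal(locs_bitwise, locs_product))  # True
```

Converting with dtype=bool is one way to do the refactoring mentioned above, so that agents receive True/False locations as before.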

(A very minor further suggestion would be to use the variable name agent_class instead of agent_maker. I spent a little while looking around for some sort of factory function, but that could well be a very personal misreading.)

Chapter 5 of ThinkComplexity2e syntax error on upload

I was able to upload the Jupyter notebooks for chapters 1-4 of ThinkComplexity2e but attempting Chapter 5 (chap05.ipynb) led to the following error:

error was: SyntaxError: JSON Parse error: Unrecognized token '<'
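
A notebook file is JSON, so one way to narrow this down is to check whether the local copy of chap05.ipynb parses at all. The sketch below does that. The helper name check_notebook is hypothetical. If the file turns out to start with '<', that would suggest an HTML page was saved instead of the raw notebook, although this is only a guess about what happened here.

```python
import json


def check_notebook(path):
    """Return (True, None) if the file parses as JSON,
    otherwise (False, the first few characters of the file)."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False, text.lstrip()[:20]
    return True, None


# Example call, assuming the notebook is in the current directory:
# print(check_notebook("chap05.ipynb"))
```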

PROD: equations don't look great

Rendering of equations doesn't look very good. Page 30 is the first example to look at. Another is the exponents on page 116. And on page 118, the number 1.5 is spaced strangely.

Is there a different DocBook I could generate to make these look better? Or can we tweak them by hand?

Dead links in Appendix Exercise A.6

The Appendix Exercise A.6 provides some dead links:

You can download my map implementations from \url{thinkcomplex.com/Map.py}, and the code I used in this section from \url{thinkcomplex.com/listsum.py}.

Simple is_connected() function

Hi Dr Allen, could your is_connected() function be replaced by a simple mathematical check? I mean, instead of:

def is_connected(G):
    start = next(iter(G))
    reachable = reachable_nodes(G, start)
    return len(reachable) == len(G)

it could be:

def is_connected(g: nx.Graph):
    maxEdges = (g.number_of_nodes() * (g.number_of_nodes() - 1)) // 2

    if g.number_of_edges() == maxEdges:
        return True
    return False

this simple logic will replace the reachable_nodes() function.
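
As a quick check of this proposal, the sketch below compares the two tests on a small path graph. To keep it self-contained, plain adjacency dicts stand in for nx.Graph, and reachable_nodes is a stand-in written for this example rather than the book's code. The names has_max_edges and path are likewise made up.

```python
def reachable_nodes(adj, start):
    """Return the set of nodes reachable from start (depth-first search)."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(adj[node])
    return seen


def is_connected(adj):
    """Reachability-based test, mirroring the original function."""
    start = next(iter(adj))
    return len(reachable_nodes(adj, start)) == len(adj)


def has_max_edges(adj):
    """Edge-count test from the proposal: true when there are n*(n-1)/2 edges."""
    n = len(adj)
    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return num_edges == n * (n - 1) // 2


# Path graph 0-1-2-3: every node is reachable, but only 3 of 6 possible edges exist.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

print(is_connected(path))   # True
print(has_max_edges(path))  # False
```

On this example the edge-count test returns True only for a complete graph, so the two functions do not seem to be interchangeable.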

Turing diffusion model

The beginning of section 7.1 discusses Turing's diffusion model:

In 1952 Alan Turing published a paper called “The chemical basis of morphogenesis”, which describes the behavior of systems involving two chemicals that diffuse in space and react with each other.

It then branches off to talk about a CA-based version:

Turing’s model is based on differential equations, but it can also be implemented using a cellular automaton.
But before we get to Turing’s model, we’ll start with something simpler: a diffusion system with just one chemical.

The way it's written implies returning to Turing's DE-based model, but the chapter never returns to discuss this.

PROD: captions with code in them are broken

It looks like when I have a code snippet in a caption, the typesetting of the caption gets messed up. There's an example on page 119.

Same issue comes up in the footnote on page 124.

Is the DocBook I am generating correct, and getting rendered wrong, or is there something wrong with it?

Chapter 6 notebook: Straightforward translation of GoL rules using loops doesn't handle boundaries correctly

Note that the output for the straightforward loop-based implementation is different from the correlation-based versions. In the static version of the notebook, the input is:

[[1 1 1 0 1 1 1 1 0 1]
 [0 1 0 1 0 1 0 0 1 1]
 [0 0 1 0 0 1 1 0 0 1]
 [0 0 0 0 0 1 1 0 0 1]
 [0 0 0 1 0 1 1 1 1 1]
 [0 0 1 0 1 0 1 0 0 0]
 [0 0 0 1 0 0 0 0 0 0]
 [0 1 0 0 0 1 0 1 1 0]
 [1 1 0 1 0 0 1 1 1 1]
 [0 0 1 0 0 0 0 0 0 0]]

The output for the loop-based implementation is:

[[0 0 0 0 0 0 0 0 0 0]
 [0 0 0 1 0 0 0 0 0 0]
 [0 0 1 0 0 0 0 1 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 1 0 0 0 0 1 0]
 [0 0 1 0 1 0 1 0 1 0]
 [0 0 1 1 1 1 1 1 0 0]
 [0 1 0 0 1 0 0 0 0 0]
 [0 1 0 0 0 0 1 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]]

The output for the cross correlation version is:

[[1 1 1 1 1 1 1 1 0 1]
 [1 0 0 1 0 0 0 0 0 1]
 [0 0 1 0 0 0 0 1 0 1]
 [0 0 0 0 0 0 0 0 0 1]
 [0 0 0 1 0 0 0 0 1 1]
 [0 0 1 0 1 0 1 0 1 0]
 [0 0 1 1 1 1 1 1 0 0]
 [1 1 0 0 1 0 0 0 0 1]
 [1 1 0 0 0 0 1 0 0 1]
 [0 1 1 0 0 0 0 1 1 0]]

Note that the difference is on the boundaries, and that's because the following line doesn't work for the boundaries:

neighbors = a[i-1:i+2, j-1:j+2]
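
To see why this slice misbehaves at the edge, assuming the loop reaches index 0, here is a short sketch. A negative start index counts from the end of the array, so the slice comes back empty instead of being clipped. The array a below is made-up data, used only to show the shapes.

```python
import numpy as np

a = np.arange(100).reshape(10, 10)  # made-up 10x10 array

# Top-left corner: i-1 and j-1 are both -1, which means "last index"
i, j = 0, 0
print(a[i-1:i+2, j-1:j+2].shape)                  # (0, 0)

# Clamping the start index to 0 gives the intended 2x2 neighborhood
print(a[max(i-1, 0):i+2, max(j-1, 0):j+2].shape)  # (2, 2)

# Bottom-right corner: a stop index past the end is clipped, so this already works
i, j = 9, 9
print(a[i-1:i+2, j-1:j+2].shape)                  # (2, 2)
```

Only the low end of each axis needs special handling, because a stop index past the end of the array is clipped automatically.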

A student in my class made this observation, and also supplied an alternative version that explicitly checks for boundaries. I have confirmed that this version produces the same output as the cross correlation version.

b = np.zeros_like(a)
rows, cols = a.shape
for i in range(0, rows):
    for j in range(0, cols):
        state = a[i, j]
        
        if i == 0 and j == 0:
            neighbors = a[i:i+2, j:j+2]
        elif j == 0:
            neighbors = a[i-1:i+2, j:j+2]
        elif i == 0:
            neighbors = a[i:i+2, j-1:j+2]
        else:
            neighbors = a[i-1:i+2, j-1:j+2]
            
        k = np.sum(neighbors) - state
        if state:
            if k==2 or k==3:
                b[i, j] = 1
        else:
            if k == 3:
                b[i, j] = 1

print(b)
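
Another way to handle the boundaries, offered as a sketch rather than as the notebook's code, is to pad the array with a border of zeros so that every cell has a full 3x3 neighborhood. The function name life_step_padded is made up for this example.

```python
import numpy as np


def life_step_padded(a):
    """One Game of Life step, treating cells outside the grid as dead."""
    padded = np.pad(a, 1, mode="constant")  # border of zeros on every side
    b = np.zeros_like(a)
    rows, cols = a.shape
    for i in range(rows):
        for j in range(cols):
            state = a[i, j]
            # In padded coordinates cell (i, j) sits at (i+1, j+1),
            # so its 3x3 neighborhood is padded[i:i+3, j:j+3].
            k = np.sum(padded[i:i+3, j:j+3]) - state
            if k == 3 or (state and k == 2):
                b[i, j] = 1
    return b


# A 3x3 grid of live cells: only the corners (3 live neighbors each) survive.
print(life_step_padded(np.ones((3, 3), dtype=int)))
```

Padding removes the need for the if/elif chain, at the cost of one extra copy of the array.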

Typo in Chapter 6

Chapter 6, page 98 says "if the center cell is 1" instead of "if the center cell is 10".

targets initialized as a list instead of a set in barabasi_albert_graph

I noticed targets was initialized as a list, but then replaced by a set (_random_subset returns a set) in section 4.6, barabasi_albert_graph:

def barabasi_albert_graph(n, k):

    G = nx.empty_graph(k)
    targets = list(range(k))
    repeated_nodes = []

    for source in range(k, n):
        G.add_edges_from(zip([source]*k, targets))

        repeated_nodes.extend(targets)
        repeated_nodes.extend([source] * k)

        targets = _random_subset(repeated_nodes, k)
    return G

Would it make more sense to just initialize targets as a set?

Maybe a "correction" for Chapter 2 Graphs

I don't know what your preferred way is to receive this kind of correction or typo report, so I posted it here.

In Chapter 2 Graphs you say

We can use reachable_nodes to write is_connected:

def is_connected(G):
    start = next(G.nodes_iter())
    reachable = reachable_nodes(G, start)
    return len(reachable) == len(G)

is_connected chooses a starting node by calling nodes_iter, which returns an iterator object, and passing the result to next, which returns the first node.

seen gets the set of nodes that can be reached from start. If the size of this set is the same as the size of the graph, that means we can reach all nodes, which means the graph is connected.

You say seen where I think you should say reachable. seen is the variable returned inside the reachable_nodes function, and its value is assigned to reachable. At this point, though, you are explaining the function is_connected, not the function reachable_nodes.

[BUG] Chapter 12 notebook treats probability of survival as probability of death

In chapter 11's default implementation of choose_dead we see that a random 10% of agents die in every round. So 0.1 is the probability of death:

def choose_dead(self, ps):
    """Choose which agents die in the next timestep.
    ps: probability of survival for each agent
    returns: indices of the chosen ones
    """
    n = len(self.agents)
    is_dead = np.random.random(n) < 0.1
    index_dead = np.nonzero(is_dead)[0]
    return index_dead

When this is overridden to use differential survival, we flip < to > in the is_dead line to interpret ps as probability of survival:

class SimWithDiffSurvival(Simulation):
    def choose_dead(self, ps):
        """Choose which agents die in the next timestep.
        ps: probability of survival for each agent
        returns: indices of the chosen ones
        """
        n = len(self.agents)
        is_dead = np.random.random(n) > ps
        index_dead = np.nonzero(is_dead)[0]
        return index_dead

However, this doesn't happen in chapter 12 and I believe it's a bug that affects the conclusions made in the chapter.

Note that chapter 12 uses the same default implementation of choose_dead as chapter 11. It interprets 0.1 as probability of death. But when introducing differential survival, choose_dead is overridden as this:

# class PDSimulation(Simulation):

    def choose_dead(self, fits):
        """Choose which agents die in the next timestep.
        fits: fitness of each agent
        returns: indices of the chosen ones
        """
        ps = prob_survive(fits)
        n = len(self.agents)
        is_dead = np.random.random(n) < ps
        index_dead = np.nonzero(is_dead)[0]
        return index_dead

Note that < isn't flipped this time. So imagine that all the agents had infinite fitness and therefore their probability of survival was 1.0 . The line is_dead = np.random.random(n) < ps would return all True and kill them all off. This doesn't make sense.
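
The extreme case can be checked directly. The sketch below uses only the two comparisons from the code above, with a made-up population of 1000 agents whose survival probability is 1.0. The variable names are chosen for this example.

```python
import numpy as np

n = 1000                     # made-up population size
ps = np.ones(n)              # every agent has survival probability 1.0
draws = np.random.random(n)  # uniform values in [0, 1)

dead_unflipped = draws < ps  # comparison used in the chapter 12 override
dead_flipped = draws > ps    # comparison used in SimWithDiffSurvival

print(dead_unflipped.sum(), dead_flipped.sum())  # 1000 0
```

Because np.random.random never returns 1.0, the unflipped comparison marks every agent as dead, while the flipped one marks none.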

Since choose_dead takes a parameter ps (stands for probability of survival), I think that we should flip < to > and use 0.9 in the default implementation so that the semantics are consistent. Then we never need to do a sign flip and there won't be confusion down the road.

However, this still leaves open the experiments & conclusions made in chapter 12, which, I believe, come with this bug in place.

Make the stochastic experiments reproducible with a seed

There are numerous stochastic experiments in the book/notebooks. Would it be worthwhile setting a seed for them?

I believe all stochastic operations are done in numpy so the following would suffice for each experiment:

np.random.seed(0)

For repeated experiments I often use:

for seed in range(number_repetitions):
    np.random.seed(seed)
    ...

This ensures reproducibility of results.

  • For the reader it would ensure they get the exact same images etc as in the book;
  • It's a good habit for potential research/grad students using the book.
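
A minimal sketch of the effect, assuming the experiments draw from NumPy's global random state: resetting the seed to the same value replays the same sequence of draws.

```python
import numpy as np

np.random.seed(0)
first = np.random.random(5)

np.random.seed(0)  # reset to the same seed
second = np.random.random(5)

print(np.array_equal(first, second))  # True
```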

Just a suggestion :)
