GithubHelp home page GithubHelp logo

silky / nodebook Goto Github PK

View Code? Open in Web Editor NEW

This project forked from stitchfix/nodebook

0.0 1.0 0.0 42 KB

Repeatable analysis plugin for Jupyter notebook

License: Apache License 2.0

Python 43.01% JavaScript 9.20% Jupyter Notebook 47.78%

nodebook's Introduction

Nodebook

CircleCI

Nodebook is a plugin for Jupyter Notebook designed to enforce an ordered flow of cell execution. Conceptually, Nodebook notebooks operate like a script where each cell depends on the cells above it. This prevents messy and difficult to maintain out-of-order execution which frequently occurs in vanilla Jupyter notebooks where each cell modifies the global state. For more information, see this post.

Installation

Nodebook is available on pypi and can be installed with pip. Additionally, the jupyter extension must be registered:

pip install nodebook
jupyter nbextension install --py nodebook

Usage

To use Nodebook, add the following lines to a cell in your Jupyter notebook:

#pragma nodebook off
%load_ext nodebook.ipython
%nodebook {mode} {name}

Where {mode} is one of memory or disk, and {name} is a unique identifier for your notebook.

Mode determines whether variables are stored in memory or on disk.

For additional example usage, see nodebook_demo.ipynb. Also see below for a quick demo showing the basic difference in behavior between Nodebook and standard Jupyter:

demo

FAQ

Q: Should I use Python 2 or Python 3 with Nodebook?

There has been an increasing consensus toward sunsetting support for Python 2, including in Project Jupyter. Nodebook currently supports both Python 2 and Python 3, but Python 3 is preferred.

Q: Why am I seeing "ERROR:root:Cell magic %%execute_cell not found."?

Nodebook loads a javascript plugin to modify Jupyter's behavior. If you encounter this error, it means that the javascript plugin is loaded but the ipython plugin is not. This can happen when the javascript is already loaded but you have restarted the kernel and haven't run %nodebook {mode} {name}. The solution is either to run the %nodebook magic (if you want to run nodebook), or delete the cell with the %nodebook magic and refresh your browser to unload the javascript (if you want to turn nodebook off).

Q: What are the tradeoffs between memory and disk mode?

Nodebook serializes all cell outputs to maintain consistent state between cells. In memory mode, objects are serialized to an in-memory dictionary, in disk mode objects are serialized to a directory within your notebook's working directory. Speed can be a factor when choosing between them, but on a modern SSD, serialization time generally dominates and memory and disk mode have similar performance. The main consideration is that disk mode has the advantage of persisting your environment when the python kernel is restarted, but the disadvantage of leaving behind a directory on your local filesystem that you may want to manually clean up later (this can add up especially if you are working with large objects in your notebook).

Q: What are the limitations of Nodebook?

While Nodebook supports most Python operations, it has a few limitations related to the use of serialization. First, not all objects are currently serializable, most noteably generators. Second, serialization adds some extra time. This is imperceptible for small objects, but is noticable for objects larger than a few hundred MB. Instead of working directly with very large objects in Nodebook, I recommend using it to prototype your analysis on a subset of data.

nodebook's People

Contributors

jni avatar kzielnicki avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.