thomasballinger / papermill

This project forked from nteract/papermill

📚 Parameterize, execute, and analyze notebooks

Home Page: http://papermill.readthedocs.io/en/latest/

License: BSD 3-Clause "New" or "Revised" License

Python 89.87% Jupyter Notebook 9.58% Shell 0.54%

Papermill

Papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks.

Papermill lets you:

  • parameterize notebooks
  • execute and collect metrics across the notebooks
  • summarize collections of notebooks

This opens up new opportunities for how notebooks can be used. For example:

  • Perhaps you have a financial report that you wish to run with different values on the first or last day of a month, or at the beginning or end of the year. Using parameters makes this task easier.
  • Do you want to run a notebook and, depending on its results, choose a particular notebook to run next? You can now execute a workflow programmatically without having to copy and paste from notebook to notebook manually.
  • Do you have plots and visualizations spread across 10 or more notebooks? Now you can choose which plots to display programmatically as a summary collection in a notebook to share with others.

Installation

From the command line:

pip install papermill

Installing In-Notebook bindings

  • Python (included in this repo)
  • R (available in the papermillr project)

Usage

Parameterizing a Notebook

To parameterize your notebook, designate a cell with the tag parameters. Papermill looks for the parameters cell and treats its values as defaults for the parameters passed in at execution time. It achieves this by inserting a new cell, containing the passed-in values, after the tagged cell. If no cell is tagged with parameters, the new cell is inserted at the front of the notebook.

docs/img/parameters.png
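
The insertion step described above can be sketched with plain dictionaries standing in for notebook JSON. This is only an illustration of the described behavior under stated assumptions, not papermill's actual implementation: the helper name `inject_parameters` and the `injected-parameters` tag are hypothetical.

```python
import copy


def inject_parameters(notebook, parameters):
    """Return a copy of `notebook` with a cell of overrides inserted.

    `notebook` is a dict shaped like notebook JSON: {"cells": [...]}.
    Hypothetical helper, for illustration only.
    """
    nb = copy.deepcopy(notebook)

    # Build the source of the new cell: one assignment per parameter.
    source = "\n".join(
        f"{name} = {value!r}" for name, value in parameters.items()
    )
    new_cell = {
        "cell_type": "code",
        "metadata": {"tags": ["injected-parameters"]},  # hypothetical tag name
        "source": source,
    }

    # Find the first cell tagged "parameters", if there is one.
    index = next(
        (
            i
            for i, cell in enumerate(nb["cells"])
            if "parameters" in cell.get("metadata", {}).get("tags", [])
        ),
        None,
    )

    # Insert after the tagged cell, or at the front if no cell is tagged.
    insert_at = 0 if index is None else index + 1
    nb["cells"].insert(insert_at, new_cell)
    return nb


notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {}, "source": "import math"},
        {
            "cell_type": "code",
            "metadata": {"tags": ["parameters"]},
            "source": "alpha = 0.1\nratio = 0.5",
        },
        {"cell_type": "code", "metadata": {}, "source": "print(alpha * ratio)"},
    ]
}
result = inject_parameters(notebook, {"alpha": 0.6, "ratio": 0.1})
```

Because the new cell runs after the tagged cell, its assignments override the defaults while the defaults remain visible in the notebook.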

Executing a Notebook

The two ways to execute the notebook with parameters are: (1) through the Python API and (2) through the command line interface.

Execute via the Python API

import papermill as pm

pm.execute_notebook(
    'path/to/input.ipynb',
    'path/to/output.ipynb',
    parameters=dict(alpha=0.6, ratio=0.1)
)
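
For cases such as the financial report mentioned earlier, the same notebook can be executed once per parameter set. The following is a minimal sketch, not part of papermill: the helper `run_with_parameter_sets`, its `output_template` convention, and the optional `execute` argument are assumptions introduced here. By default it calls `pm.execute_notebook` exactly as shown above.

```python
def run_with_parameter_sets(input_path, output_template, parameter_sets, execute=None):
    """Execute one notebook once per parameter set and return the output paths.

    `output_template` is a str.format template that may reference `index`
    and any parameter name. Hypothetical helper, for illustration only.
    """
    if execute is None:
        # Imported lazily so a stub can be passed in when papermill is absent.
        import papermill as pm

        execute = pm.execute_notebook

    outputs = []
    for index, parameters in enumerate(parameter_sets):
        # Derive a distinct output path for each run.
        output_path = output_template.format(index=index, **parameters)
        execute(input_path, output_path, parameters=parameters)
        outputs.append(output_path)
    return outputs
```

A call might look like `run_with_parameter_sets('path/to/input.ipynb', 'path/to/output_{index}.ipynb', [dict(alpha=0.6, ratio=0.1), dict(alpha=0.8, ratio=0.1)])`, which writes one output notebook per parameter set.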

Execute via CLI

Here's an example of a local notebook being executed and its output written to an Amazon S3 bucket:

$ papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
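
When the command is launched from a script, the argument list can be assembled from a dictionary of parameters. This is a small sketch that uses only the `-p NAME VALUE` form shown above; the helper name `build_papermill_command` is hypothetical and not part of papermill.

```python
import shlex


def build_papermill_command(input_path, output_path, parameters):
    """Return the argument list for a papermill CLI call (hypothetical helper)."""
    command = ["papermill", input_path, output_path]
    for name, value in parameters.items():
        # Each parameter becomes one `-p NAME VALUE` triple.
        command += ["-p", name, str(value)]
    return command


command = build_papermill_command(
    "local/input.ipynb",
    "s3://bkt/output.ipynb",
    {"alpha": 0.6, "l1_ratio": 0.1},
)
# Print the shell-quoted form of the command for inspection.
print(shlex.join(command))
# To run it: subprocess.run(command, check=True)
```

Passing the list to `subprocess.run` rather than a single string avoids shell quoting problems with parameter values.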

Python In-notebook Bindings

Recording Values to the Notebook

Users can save values to the notebook document to be consumed by other notebooks.

Record the values to be saved with the notebook:

"""notebook.ipynb"""
import papermill as pm

pm.record("hello", "world")
pm.record("number", 123)
pm.record("some_list", [1, 3, 5])
pm.record("some_dict", {"a": 1, "b": 2})

Users can recover those values as a pandas DataFrame via the read_notebook function.

"""summary.ipynb"""
import papermill as pm

nb = pm.read_notebook('notebook.ipynb')
nb.dataframe

docs/img/nb_dataframe.png

Displaying Plots and Images Saved by Other Notebooks

Display a matplotlib histogram and save it under the key name matplotlib_hist.

"""notebook.ipynb"""
import papermill as pm
from ggplot import mpg
import matplotlib.pyplot as plt

# turn off interactive plotting to avoid double plotting
plt.ioff()

f = plt.figure()
plt.hist('cty', bins=12, data=mpg)
pm.display('matplotlib_hist', f)

docs/img/matplotlib_hist.png

Read in the notebook above and display the plot saved under the key matplotlib_hist.

"""summary.ipynb"""
import papermill as pm

nb = pm.read_notebook('notebook.ipynb')
nb.display_output('matplotlib_hist')

docs/img/matplotlib_hist.png

Analyzing a Collection of Notebooks

Papermill can read in a directory of notebooks and provides the NotebookCollection interface for operating on them.

"""summary.ipynb"""
import papermill as pm

nbs = pm.read_notebooks('/path/to/results/')

# Show the named plot from 'train_1.ipynb'.
# Accepts a key or a list of keys to plot in order.
nbs.display_output('train_1.ipynb', 'matplotlib_hist')

docs/img/matplotlib_hist.png

# Dataframe for all notebooks in collection
nbs.dataframe.head(10)

docs/img/nbs_dataframe.png

Documentation

We host the papermill documentation on ReadTheDocs.
