fastai / execnb

103.0 2.0 9.0 741 KB

Execute a jupyter notebook, fast, without needing jupyter

Home Page: https://fastai.github.io/execnb/

License: Apache License 2.0

Python 27.86% Jupyter Notebook 70.83% Shell 0.62% CSS 0.69%
fastai jupyter nbdev notebook

execnb's Introduction

Welcome to fastai

Installing

You can use fastai without any installation by using Google Colab. In fact, every page of this documentation is also available as an interactive notebook - click “Open in colab” at the top of any page to open it (be sure to change the Colab runtime to “GPU” to have it run fast!). See the fast.ai documentation on Using Colab for more information.

You can install fastai on your own machines with conda (highly recommended), as long as you’re running Linux or Windows (NB: Mac is not supported). For Windows, please see the “Running on Windows” section for important notes.

We recommend using miniconda (or miniforge). First install PyTorch using the conda line shown here, and then run:

conda install -c fastai fastai

To install with pip, use: pip install fastai.

If you plan to develop fastai yourself, or want to be on the cutting edge, you can use an editable install (if you do this, you should also use an editable install of fastcore to go with it.) First install PyTorch, and then:

git clone https://github.com/fastai/fastai
pip install -e "fastai[dev]"

Learning fastai

The best way to get started with fastai (and deep learning) is to read the book, and complete the free course.

To see what’s possible with fastai, take a look at the Quick Start, which shows how to use around 5 lines of code to build an image classifier, an image segmentation model, a text sentiment model, a recommendation system, and a tabular model. For each of the applications, the code is much the same.

Read through the Tutorials to learn how to train your own models on your own datasets. Use the navigation sidebar to look through the fastai documentation. Every class, function, and method is documented here.

To learn about the design and motivation of the library, read the peer reviewed paper.

About fastai

fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches. It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying patterns of many deep learning and data processing techniques in terms of decoupled abstractions. These abstractions can be expressed concisely and clearly by leveraging the dynamism of the underlying Python language and the flexibility of the PyTorch library. fastai includes:

  • A new type dispatch system for Python along with a semantic type hierarchy for tensors
  • A GPU-optimized computer vision library which can be extended in pure Python
  • An optimizer which refactors out the common functionality of modern optimizers into two basic pieces, allowing optimization algorithms to be implemented in 4–5 lines of code
  • A novel 2-way callback system that can access any part of the data, model, or optimizer and change it at any point during training
  • A new data block API
  • And much more…
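The optimizer design listed above can be illustrated with a toy sketch (these are made-up names for illustration, not fastai's actual classes): the optimizer reduces to a loop over parameters that applies pluggable "stepper" functions, so new algorithms only need to supply new steppers.

```python
def sgd_step(p, lr, **kwargs):
    # One "stepper": plain gradient descent on a toy parameter dict.
    p['value'] -= lr * p['grad']

class Optimizer:
    # The whole optimizer is just a parameter loop that applies steppers;
    # a new optimization algorithm plugs in as an extra stepper function.
    def __init__(self, params, steppers, **defaults):
        self.params, self.steppers, self.defaults = list(params), steppers, defaults

    def step(self):
        for p in self.params:
            for stepper in self.steppers:
                stepper(p, **self.defaults)
```

With this factoring, momentum, weight decay, and similar pieces each become a few-line stepper composed with the others, which is the sense in which an optimizer fits in 4–5 lines of code.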

fastai is organized around two main design goals: to be approachable and rapidly productive, while also being deeply hackable and configurable. It is built on top of a hierarchy of lower-level APIs which provide composable building blocks. This way, a user wanting to rewrite part of the high-level API or add particular behavior to suit their needs does not have to learn how to use the lowest level.

Layered API

Migrating from other libraries

It’s very easy to migrate from plain PyTorch, Ignite, or any other PyTorch-based library, or even to use fastai in conjunction with other libraries. Generally, you’ll be able to use all your existing data processing code, but will be able to reduce the amount of code you require for training, and more easily take advantage of modern best practices. Here are migration guides from some popular libraries to help you on your way:

Windows Support

Due to Python multiprocessing issues with Jupyter on Windows, num_workers of DataLoader is automatically reset to 0 to avoid Jupyter hanging. This makes tasks such as computer vision in Jupyter on Windows many times slower than on Linux. This limitation doesn’t exist if you use fastai from a script.

See this example to fully leverage the fastai API on Windows.

We recommend using Windows Subsystem for Linux (WSL) instead – if you do that, you can use the regular Linux installation approach, and you won’t have any issues with num_workers.

Tests

To run the tests in parallel, launch:

nbdev_test

For all the tests to pass, you’ll need to install the dependencies specified as part of dev_requirements in settings.ini:

pip install -e .[dev]

Tests are written using nbdev, for example see the documentation for test_eq.

Contributing

After you clone this repository, make sure you have run nbdev_install_hooks in your terminal. This installs Jupyter and git hooks to automatically clean, trust, and fix merge conflicts in notebooks.

After making changes in the repo, you should run nbdev_prepare and make any additional changes necessary to pass all the tests.

Docker Containers

Official Docker containers for this project can be found here.

execnb's People

Contributors

deven367 · dleen · hamelsmu · jph00 · ralfg · seem

execnb's Issues

What is equivalent to `InteractiveShell.kernel.comm_manager.register_target('file_request', target_func)`?

Hi,
I encountered this error while using an interactive tool through CaptureShell:

---> 61 get_ipython().kernel.comm_manager.register_target('file_request', target_func)
AttributeError: 'CaptureShell' object has no attribute 'kernel'

Is there any equivalent to InteractiveShell's InteractiveShell.kernel.comm_manager.register_target('file_request', target_func) or colab notebook's google.colab.output.register_callback('ReadFile', callback) attributes that I could use to resolve the error?
Thanks.

UTF-8 encoding is not specified; breaking `write_nb` on Windows for certain notebooks

In the execnb.nbio.write_nb function, the file encoding is not specified as UTF-8 to Path().read_text(), while it is specified elsewhere in execnb.nbio. With certain notebooks this can result in the following error when the default system encoding is not UTF-8 (e.g. Windows):

Traceback (most recent call last):
[...]
  File "C:\Users\ralfg\miniconda3\lib\site-packages\execnb\nbio.py", line 94, in write_nb
    old = Path(path).read_text() if path.exists() else None
  File "C:\Users\ralfg\miniconda3\lib\pathlib.py", line 1267, in read_text
    return f.read()
  File "C:\Users\ralfg\miniconda3\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 48369: character maps to <undefined>

On the following line, the encoding should be specified as UTF-8:

old = Path(path).read_text() if path.exists() else None
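A minimal sketch of the proposed fix (assuming write_nb keeps its current shape; `read_existing` is a hypothetical helper name used here for illustration):

```python
from pathlib import Path

def read_existing(path):
    # Decode explicitly as UTF-8 so the read does not depend on the OS
    # default codec (cp1252 on many Windows installs), matching how
    # notebooks are written out.
    path = Path(path)
    return path.read_text(encoding='utf-8') if path.exists() else None
```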

displaying plots using matplotlib in a notebook after `CaptureShell()`

Once I instantiate a CaptureShell in a notebook, I can no longer display plots.

There seems to be quite a bit of functionality in code relating to this, but I am not sure how I am supposed to use it. Could I please ask if there is anything I can call/pass somewhere in order to be able to plot from the host notebook?

Thank you!

should I close `CaptureShell`?

When I initialize CaptureShell(), I have an interactive shell running. Should I close it somehow after the computation?

I was experimenting with CaptureShell by creating hundreds of shells, and RAM usage didn't increase much. I was expecting each shell to need ~200MB of RAM. Why was RAM usage so low?

I've started working on V2 of the Mercury framework (converting notebooks to apps), using execnb as the execution engine.

How fast is it?

Is there any speed comparison with nbconvert?

Is it possible to use execnb with an already running kernel?

I'm working on a tool for converting Jupyter Notebooks to interactive web apps, to make notebooks accessible to non-technical users. The tool is called Mercury. You can add widgets to a notebook simply by adding a YAML header in its first cell. The framework is based on nbconvert. I would love to use a faster tool for executing notebooks.

[Feature request] Running multiple notebooks from the command line

Wondering if it's possible to accept multiple notebook sources as arguments to the command line tool.
e.g. exec_nb nb1.ipynb nb2.ipynb --exc_stop

The --exc_stop option would still be needed to stop if any notebook raises an exception in any cell.

This enables the use of notebooks as testable documentation that can be integrated with other tools like CMake. Currently, the way to do that is to write a shell script that iterates over the notebooks and calls exec_nb on each one.
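The shell-script workaround can be sketched as a small function (assuming exec_nb is on the PATH; the function name run_nbs is made up for illustration):

```shell
# Run exec_nb on each notebook given as an argument, stopping at the
# first failure. This approximates --exc_stop behaviour across files.
run_nbs() {
    for nb in "$@"; do
        exec_nb "$nb" --exc_stop || return 1
    done
}
```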

nbdev_test fails due to unescaped backslash in windows path

I'm just getting started, and was following the tutorial.
When I got to the nbdev_prepare step, I saw two errors of the following form:

WARNING:root:SyntaxError in C:\Users\john\projects\figure-game-solver\index.ipynb:
===========================================================================

While Executing:
  File "<ipython-input-1-22bbf77d2f75>", line 1
    import sys; sys.path.insert(0, 'C:\Users\john\projects\figure-game-solver')
                                                                              ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

nbdev Tests Failed On The Following Notebooks:
==================================================
        00_core.ipynb
        index.ipynb

It seems that the raw (unescaped) path to my directory has been included.

I wondered whether this may have been related to the pip install -e .[dev] step.
pip list includes the following:

fastparquet                      0.8.1
figure-game-solver               0.0.1        c:\users\john\projects\figure-game-solver
fsspec                           2022.5.0

I used pip uninstall figure-game-solver to remove this from the pip list successfully. But it didn't change the results of nbdev_test.

Have I done something wrong, or is there perhaps a workaround for Windows?

Notes:

  • I am using python 3.9.5.
  • I am using VSCode rather than the jupyter notebook & browser.
  • The cells in the 00_core.ipynb and index.ipynb seem to execute correctly from within VSCode.
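For reference, the SyntaxError itself comes from Python's string-literal escaping rules and can be reproduced independently of nbdev:

```python
# In a normal string literal, '\U' begins an 8-digit unicode escape, so a
# bare Windows path like 'C:\Users\john' fails to parse. Doubling the
# backslashes, using a raw string, or using forward slashes all avoid it.
doubled = 'C:\\Users\\john'
raw     = r'C:\Users\john'
forward = 'C:/Users/john'   # Windows APIs accept forward slashes too
```

So the generated `sys.path.insert(0, 'C:\Users\...')` line would need one of these forms to be valid Python.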

matplotlib images are not captured when outer shell uses inline backend

I think the cause is that matplotlib_inline.backend_inline.select_figure_formats is skipped in this case:

https://github.com/ipython/matplotlib-inline/blob/f764b4354b2358aa001471215494d7dfdc505593/matplotlib_inline/backend_inline.py#L202

I've implemented a fix in a fork by overriding enable_matplotlib to explicitly call select_figure_formats, which I'll submit as a separate PR:

https://github.com/seeM/execnb/blob/da0764acdfd392064f5b0a243ccbd14e111b3602/execnb/shell.py#L79-L84

Bug or feature? CaptureShell captures all console printouts across threads

I'm using CaptureShell in a thread. While CaptureShell is executing a long computation, printouts from other cells are also captured by it.

Code to reproduce issue

# cell 1
import threading
from execnb.shell import CaptureShell

# cell 2
def worker():
    s = CaptureShell()
    response = s.run("""
import time

for i in range(10):
    print(i)
    time.sleep(1)
""")
    print(response)

# cell 3
threading.Thread(target=worker, daemon=True).start()


# cell 4
print("where am I?")

I got output:

[{'name': 'stdout', 'output_type': 'stream', 'text': ['0\n', 'where am I?\n', '1\n', '2\n', '3\n', '4\n', '5\n', '6\n', '7\n', '8\n', '9\n']}]

Why is the `where am I?` string in the CaptureShell execution output? Why are the streams mixed?

@jph00 is it a bug or a feature?
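One plausible explanation (an assumption, not confirmed from execnb's source): if output is captured by swapping out sys.stdout while a cell runs, the capture affects the whole process, because sys.stdout is a single global object shared by all threads. A minimal demonstration without execnb:

```python
import io
import sys
import threading

buf = io.StringIO()
old, sys.stdout = sys.stdout, buf   # capture, as a capturing shell might

def other_thread():
    # This print runs in a different thread but still lands in `buf`,
    # because sys.stdout is process-wide, not per-thread.
    print("where am I?")

t = threading.Thread(target=other_thread)
t.start()
t.join()
sys.stdout = old                    # restore the original stream
```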

Daemon mode to speed up startup

There've been ongoing discussions in various channels about adding a daemon mode to execnb. Although this is probably a longer-term thing, I thought an issue could be a good place to store useful notes and important decisions made along the way.

What is a daemon mode?

@jph00's thread sums it up well:

Python scripts can be slow to start because the imports can take a long time. I have lots of scripts that read/write stdin/out. I'd like to replace them with something with identical behavior, but which behind the scenes auto-launches a little server.
[...]
The idea is that the overhead of starting the script only happens once. This is useful for stuff like git hook scripts.

References

cc @hamelsmu

Outputs are left as an array instead of a string

Compared to nbformat, the outputs field in execnb remains an array of strings (for compatibility with JSON) instead of being squashed to a single string.

This prevents execnb from being a drop-in replacement for nbformat, at least when used with the nbdime library.

For example see:

https://github.com/dleen/notebook-examples/blob/main/00_core.ipynb

The logic in nbformat for splitting/joining: https://github.com/jupyter/nbformat/blob/640a0c6830bb6dc0ef963a8caab377d89ed24c6a/nbformat/v4/rwbase.py#L26
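The nbformat convention linked above can be sketched as follows (an illustration of the joining half of that logic, not execnb or nbformat code):

```python
def join_text_fields(outputs):
    # On disk, nbformat stores multi-line text as a list of lines (which is
    # JSON-friendly and diffs well); in memory it joins them into a single
    # string. This mimics that joining step for stream outputs.
    for out in outputs:
        if isinstance(out.get('text'), list):
            out['text'] = ''.join(out['text'])
    return outputs
```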

Trailing `;`s should silence the display hook

Current:

In [1]: from execnb.shell import CaptureShell
   ...: CaptureShell().run('0;')
Out[1]:
[{'data': {'text/plain': ['0']},
  'metadata': {},
  'output_type': 'execute_result',
  'execution_count': 1}]

Expected: should return [].
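IPython suppresses the display hook when a cell's final statement ends in `;`. A rough sketch of such a check (simplified for illustration; real IPython tokenizes the source, and this naive comment-stripping would misfire on `#` inside strings):

```python
def ends_with_semicolon(src):
    # Look at the last non-blank line, drop any trailing comment, and
    # check whether the remaining code ends with ';'.
    for line in reversed(src.splitlines()):
        code = line.split('#', 1)[0].rstrip()
        if code:
            return code.endswith(';')
    return False
```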

Seaborn is not compatible with execnb

Hi Jeremy / nbdev team,

Thank you for the update to nbdev! Unfortunately, I've run into a few issues with the latest update that prevents me from migrating to v2.

Problem: Seaborn breaks nbdev_test and subsequently nbdev_docs, etc due to some issue with ipywidgets.

Error

(nbdevtest) ➜  nbdev-test git:(main) ✗ nbdev_test

WARNING:root:AttributeError in /Users/jvivian/Desktop/nbdev-test/00_core.ipynb:
===========================================================================

While Executing Cell fastai/nbdev#5:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-1-1680d1956dc6> in <module>
----> 1 import seaborn as sns
      2
      3 sns.lineplot(x=[1,2,3], y=[1,2,3])

~/miniconda3/envs/nbdevtest/lib/python3.7/site-packages/seaborn/__init__.py in <module>
     10 from .miscplot import *  # noqa: F401,F403
     11 from .axisgrid import *  # noqa: F401,F403
---> 12 from .widgets import *  # noqa: F401,F403
     13 from .colors import xkcd_rgb, crayons  # noqa: F401
     14 from . import cm  # noqa: F401

~/miniconda3/envs/nbdevtest/lib/python3.7/site-packages/seaborn/widgets.py in <module>
      5 # Lots of different places that widgets could come from...
      6 try:
----> 7     from ipywidgets import interact, FloatSlider, IntSlider
      8 except ImportError:
      9     import warnings

~/miniconda3/envs/nbdevtest/lib/python3.7/site-packages/ipywidgets/__init__.py in <module>
     51     load_ipython_extension(ip)
     52
---> 53 _handle_ipython()

~/miniconda3/envs/nbdevtest/lib/python3.7/site-packages/ipywidgets/__init__.py in _handle_ipython()
     49     if ip is None:
     50         return
---> 51     load_ipython_extension(ip)
     52
     53 _handle_ipython()

~/miniconda3/envs/nbdevtest/lib/python3.7/site-packages/ipywidgets/__init__.py in load_ipython_extension(ip)
     31     if not hasattr(ip, 'kernel'):
     32         return
---> 33     register_comm_target(ip.kernel)
     34
     35

~/miniconda3/envs/nbdevtest/lib/python3.7/site-packages/ipywidgets/__init__.py in register_comm_target(kernel)
     38     if kernel is None:
     39         kernel = get_ipython().kernel
---> 40     kernel.comm_manager.register_target('jupyter.widget', Widget.handle_comm_opened)
     41     kernel.comm_manager.register_target('jupyter.widget.control', Widget.handle_control_comm_opened)
     42

AttributeError: 'NoneType' object has no attribute 'comm_manager'


nbdev Tests Failed On The Following Notebooks:

Steps to reproduce

  • Hardware: Mac M1

  • Environment: Conda using rosetta layer for compatibility

  • pip install -U nbdev seaborn jupyterlab

  • gh repo create

  • nbdev_new

    • I had to delete from nbdev_test.core import * as I got an error saying no module named nbdev_test exists.
  • jupyter lab

  • Add a new cell, import seaborn as sns, make a simple plot, save.

  • nbdev_test

Here's the nbdev_test error for completeness:

(nbdevtest) ➜  nbdev-test git:(main) ✗ nbdev_test
WARNING:root:ModuleNotFoundError in /Users/jvivian/Desktop/nbdev-test/index.ipynb:
===========================================================================

While Executing Cell fastai/nbdev#1:
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-9efe2d387d46> in <module>
----> 1 from nbdev_test.core import *

ModuleNotFoundError: No module named 'nbdev_test'


nbdev Tests Failed On The Following Notebooks:
==================================================

I'm a big fan of nbdev, so please let me know if there are other ways I can assist.

Cheers,
John

Is it possible to stream output?

Right now execnb waits for the final response from code execution. Is it possible to stream execution results without waiting for the final response?

I'm using execnb for executing notebooks in the Mercury framework, and got such question from user mljar/mercury#301
