
Add a small CLI? about nbclient HOT 38 CLOSED

jupyter avatar jupyter commented on July 26, 2024
Add a small CLI?

from nbclient.

Comments (38)

choldgraf avatar choldgraf commented on July 26, 2024 3

@MSeal yeah, I agree with that. I think if we add a CLI to nbclient, it should be explicitly restricted to the behavior that we expose in execute. I'd see this as "just" a command-line interface for the execute function. People will ask for extra parameters etc, and in those cases we suggest they use papermill and request them there.

Maybe it makes sense to use papermill for this, but I'm a bit worried about, e.g., a total newcomer who would be scared off by all of papermill's extra functionality (in the same way that they are scared off by nbconvert's extra functionality). It's why I think there's value in having something extremely simple as an option.

from nbclient.

palewire avatar palewire commented on July 26, 2024 3

I think the super simple proposals here sound great.

My take is that the term "nbclient" is a little opaque. I think something that hooks into the jupyter namespace and uses a more direct verb like execute reads more clearly and will be easier for noobs to grok.

Also, I think the starting expectation that this command would output anything is a hangover effect from a conversion tool, nbconvert, being the prior way to achieve this goal. In my view, the execution use case and the conversion use case are separate. They should have separate CLIs with different options.

Following on that, one questionable jupyterism I think a more narrowly tailored execution command might be able to solve is the practice of hiding any error tracebacks in the output file. In my view, a CLI user expects and deserves the traceback to surface into STDOUT and STDERR in the term.

So my first thought is that something like

jupyter execute [path]

is the best starting point, with no output option required to run it
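The traceback behavior described above can be sketched in a few lines of Python. This is only an illustration, not how any existing Jupyter command is implemented: `run_and_report` is a hypothetical helper, and `run` stands for whatever callable actually executes the notebook.

```python
import sys
import traceback
from typing import Callable, TextIO


def run_and_report(
    run: Callable[[str], None], path: str, stream: TextIO = sys.stderr
) -> int:
    """Call run(path); on failure print the traceback and return a non-zero code."""
    try:
        run(path)
    except Exception:
        # Surface the full traceback in the terminal instead of leaving it
        # only inside the output notebook.
        print(f"Error while executing {path}:", file=stream)
        traceback.print_exc(file=stream)
        return 1
    return 0
```

A command-line entry point could pass the return value to `sys.exit`, so that cron jobs and CI workflows see a failing exit status. With nbclient, `run` could wrap `NotebookClient(nb).execute()`, which raises `CellExecutionError` when a cell fails and errors are not explicitly allowed.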

from nbclient.

drscotthawley avatar drscotthawley commented on July 26, 2024 2

Found this issue from this forum post. This is not directly related to the CLI but to the question of "how do I run this notebook from the command line".

The solutions in the forum thread included

$ jupyter nbconvert --to notebook --execute mynotebook.ipynb

as a solution, but that doesn't print out any of the print statements that are in my notebook. I tried Papermill but it just generates a bunch of JSON without any of my print statements....so again if this is "executing" it's not obvious to this user.

So I'm just leaving this solution that serves my simple needs for now in case others come across this thread:

I defined a bash alias/function like so:

nbrun() { jupyter nbconvert --to script "$1"; grep -v get_ipython "${1%.*}.py" > run_this.py; python3 run_this.py; }

and then I just run

$ nbrun mynotebook.ipynb

The grep -v get_ipython is to strip out calls to !pip... Certainly one could do a better job and actually call pip as part of the bash script.. but for now this is all I need. Presumably a proper CLI would have more careful things involved, so like I say, this is just for other lost web-searchers, to tide us over. :-) Thanks for your hard work!
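The same idea can be sketched in pure Python, without the intermediate nbconvert step. This is an illustration under stated assumptions: the helper names `notebook_to_source` and `run_notebook_source` are hypothetical, the notebook is assumed to follow the nbformat 4 JSON layout (a top-level `cells` list whose code cells carry a `source` string or list of strings), and lines starting with `!` or `%` are simply dropped rather than run.

```python
import json


def notebook_to_source(nb: dict) -> str:
    """Join the code cells of a notebook dict, skipping shell/magic lines."""
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores source as either a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        kept = [
            line
            for line in source.splitlines()
            if not line.lstrip().startswith(("!", "%"))
        ]
        chunks.append("\n".join(kept))
    return "\n\n".join(chunks)


def run_notebook_source(path: str) -> None:
    """Execute a notebook's code cells in this interpreter so print() output shows."""
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)
    exec(compile(notebook_to_source(nb), path, "exec"), {"__name__": "__main__"})
```

Like the grep approach, dropping a line that is the only statement inside an indented block would leave invalid Python, so this is a stopgap rather than a substitute for a real kernel-based executor.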

from nbclient.

fperez avatar fperez commented on July 26, 2024 2

BTW, sorry I missed this earlier! First, awesome job @palewire and team, thx Chris for opening up the initial issue!

Before this gains a lot of traction, I would suggest naming it, or at least aliasing it to, run instead of execute. The reason is that IPython has had run since basically forever (it was pretty much the raison d'être for IPython, absent notebooks), and unbeknownst to many, %run even recognizes notebooks:

image

I think it will be valuable for users to simply remember that 'jupyter run' runs notebooks, and in IPython the same command works the same.

Sorry for not having pitched earlier during the discussion, but I think it's worth considering.

from nbclient.

cdeil avatar cdeil commented on July 26, 2024 1

I would like to execute a bunch of notebooks just to time their execution, without having to write output files and clean them up after. Don't care really where this functionality is offered, any of the suggestions mentioned above seems fine. This could be another simple option to implement?

jupyter nbconvert --to none
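A rough sketch of timing notebooks without writing output files, assuming nbclient and nbformat are installed. `time_notebooks` and `run_in_memory` are hypothetical names; only `nbformat.read` and `NotebookClient(...).execute()` are real library calls.

```python
import time
from typing import Callable, Dict, Iterable


def time_notebooks(
    paths: Iterable[str], run: Callable[[str], None]
) -> Dict[str, float]:
    """Return wall-clock seconds taken by run(path) for each notebook path."""
    timings = {}
    for path in paths:
        start = time.perf_counter()
        run(path)
        timings[path] = time.perf_counter() - start
    return timings


def run_in_memory(path: str) -> None:
    """Execute a notebook without saving it (needs nbclient and nbformat)."""
    import nbformat
    from nbclient import NotebookClient

    nb = nbformat.read(path, as_version=4)
    NotebookClient(nb).execute()  # results stay in `nb`; nothing is written
```

Calling `time_notebooks(paths, run_in_memory)` would then give per-notebook timings with nothing to clean up afterwards. The measured time includes kernel startup.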

from nbclient.

davidbrochart avatar davidbrochart commented on July 26, 2024 1

You're right, no documentation yet, it's very new! I will work on that soon; in the meantime there is the README 😄

from nbclient.

davidbrochart avatar davidbrochart commented on July 26, 2024 1

Thanks for your work @palewire !

from nbclient.

palewire avatar palewire commented on July 26, 2024 1

I can make a patch renaming the subcommands to run, tho I'm tempted to leave an execute command in there with a Python deprecation warning since we've just blasted it a bit. But if you want to just pull the bandaid we can do that too.
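For illustration, a deprecated alias of this kind could look roughly like the sketch below. It is not the code from the actual patch: `run` here is a stand-in function, and `deprecated_alias` is a hypothetical helper built on the standard `warnings` module.

```python
import warnings
from typing import Callable


def run(path: str) -> str:
    """Stand-in for the real command body; returns a message for illustration."""
    return f"ran {path}"


def deprecated_alias(new: Callable[[str], str], old_name: str) -> Callable[[str], str]:
    """Wrap `new` so that calling it under `old_name` emits a DeprecationWarning."""

    def wrapper(path: str) -> str:
        warnings.warn(
            f"'{old_name}' is deprecated; use '{new.__name__}' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return new(path)

    return wrapper


# The old name keeps working but warns on every call.
execute = deprecated_alias(run, "execute")
```

One caveat: Python hides `DeprecationWarning` by default unless it is triggered from `__main__`, so a command-line tool may prefer `FutureWarning` or a plain message on stderr if end users are meant to see it.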

from nbclient.

palewire avatar palewire commented on July 26, 2024 1

It was out on PyPI, but only for a few days. My patch PR is in now for your review.

#173

from nbclient.

mgeier avatar mgeier commented on July 26, 2024

What about:

python3 -m nbclient mynotebook.ipynb --inplace

from nbclient.

choldgraf avatar choldgraf commented on July 26, 2024

that could work as long as nbclient only did one thing, which was executing notebooks 👍

from nbclient.

mgeier avatar mgeier commented on July 26, 2024

That's what I was wondering all along: what else does nbclient do?

from nbclient.

mgeier avatar mgeier commented on July 26, 2024

I would be genuinely interested in what else nbclient is supposed to do.

If it does more than one thing, the natural way to call it on the command line would be this:

python3 -m nbclient.execute mynotebook.ipynb --inplace

This is just like how, e.g., Python's built-in HTTP server is started:

python3 -m http.server

from nbclient.

MSeal avatar MSeal commented on July 26, 2024

For now the intention is to have a very narrowly scoped notebook executor that has few dependencies, like jupyter_client does for interacting with kernels. This library doesn't handle output IO and does one thing well. It'll likely get an async execution pattern, but beyond that I don't see major responsibilities being added to the core of the library for now.

Coupling execution with transformation made improving or fixing execution patterns very slow to update in nbconvert. Thus nbclient is meant to just execute notebooks in memory without execution opinions or notebook manipulations. nbconvert will keep its ExecutePreprocessor, which will wrap this library so we don't drop support for nbconvert's piping pattern. And similarly papermill will keep its opinionated execution pattern with parameterization and input / output isolation but without needing the rest of nbconvert's dependencies. Both can lean on the in-memory execution model here and implement just the tool-specific concerns. Given that, I'm not sure if we need a CLI for this library with the two downstream opinionated patterns. But it wouldn't be hard to add one if it feels useful to folks.

from nbclient.

choldgraf avatar choldgraf commented on July 26, 2024

Makes sense - I don't think a CLI is anything urgent...maybe something to stew on and see if others jump in and make the same request over time

from nbclient.

meeseeksmachine avatar meeseeksmachine commented on July 26, 2024

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/how-to-run-a-notebook-using-command-line/3475/6

from nbclient.

meeseeksmachine avatar meeseeksmachine commented on July 26, 2024

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/how-to-run-a-notebook-using-command-line/3475/10

from nbclient.

choldgraf avatar choldgraf commented on July 26, 2024

@MSeal recommended using papermill and this pattern for execution w/o outputs:

papermill {test_file_name.ipynb} /dev/null {optional args...} 

Perhaps we should document use-cases like these in the FAQ of nbclient. I think many people will see this repository "expecting" it to handle things like command-line execution, so we could point them towards papermill or other relevant repositories in those cases

from nbclient.

davidbrochart avatar davidbrochart commented on July 26, 2024

I guess a small CLI wouldn't hurt? We could start with:

jupyter nbclient --execute my_notebook.ipynb

from nbclient.

MSeal avatar MSeal commented on July 26, 2024

It kinda hurts, because we already have nbconvert and papermill CLIs for executing. If we did add one I'd want to mark the one in nbconvert as deprecated and warn users that CLI option is going away. I think this would be better overall for simplicity of tools, but I find removing things from nbconvert can be a hard sell for users. Thoughts if we went that direction?

from nbclient.

choldgraf avatar choldgraf commented on July 26, 2024

I'd imagine a super simple execution CLI like

nbclient mynotebook.ipynb [-o output_ntbk.ipynb]

and that's it. For anything fancier, we would recommend that folks use papermill, and for execution in the context of conversion we either recommend nbconvert, or we find a way to chain execution with nbconvert so that nbconvert isn't responsible for that at all.

Whatever package does it, I think there should be a way to run <verb> <notebook> and have it execute the notebook. The mental model I assume many people have is python myfile.py...we should have something that is just as simple, and doesn't require people to know developer-specific terminology like /dev/null or lots of extra parameterization.

That said, in the meantime, I think we can still improve this by just adding documentation to answer the question "how do I quickly execute a notebook from the command line".
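As a rough illustration of how small such a CLI could be, here is a sketch built on `argparse`. It is not the implementation that was eventually merged: the `nbclient` program name and the `-o` option simply mirror the command line proposed above, and `build_parser` and `main` are hypothetical names. `nbformat.read`, `nbformat.write` and `NotebookClient(...).execute()` are the real library calls.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for: nbclient mynotebook.ipynb [-o output_ntbk.ipynb]"""
    parser = argparse.ArgumentParser(
        prog="nbclient", description="Execute a notebook."
    )
    parser.add_argument("notebook", help="path to the .ipynb file to execute")
    parser.add_argument(
        "-o",
        "--output",
        default=None,
        help="where to save the executed notebook (default: do not save)",
    )
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    # Imported lazily so the parser can be exercised without these packages.
    import nbformat
    from nbclient import NotebookClient

    nb = nbformat.read(args.notebook, as_version=4)
    NotebookClient(nb).execute()
    if args.output:
        nbformat.write(nb, args.output)
    return 0
```

Keeping the option list to a single optional flag is the point of the design: anything beyond that would be a signal to reach for papermill or nbconvert instead.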

from nbclient.

MSeal avatar MSeal commented on July 26, 2024

The thing I worry about is the 100 issues that arise with "can option A be added to nbclient CLI?" It makes it hard to keep to KISS when you add another non-DRY path in the code.

Let's do

> That said, in the meantime, I think we can still improve this by just adding documentation to answer the question "how do I quickly execute a notebook from the command line".

for sure.

from nbclient.

davidbrochart avatar davidbrochart commented on July 26, 2024

Just pointing to nbterm, which can execute notebooks from the command line.

from nbclient.

choldgraf avatar choldgraf commented on July 26, 2024

@davidbrochart that looks really cool! Though I don't see any documentation. Where should folks go to learn how to use nbterm to execute notebooks?

from nbclient.

palewire avatar palewire commented on July 26, 2024

Here's a start at it for discussion. What do you think @choldgraf and others?

#165

from nbclient.

drscotthawley avatar drscotthawley commented on July 26, 2024

Note also that Jeremy Howard's new effort, nbprocess, may be of interest; however, it is still very much under construction. https://nbprocess.fast.ai/

from nbclient.

palewire avatar palewire commented on July 26, 2024

Thanks for the tip @drscotthawley. I don't know how the rest of the crowd here would poll, but I'm still a strong believer that nbclient and the notebook world are in desperate need of a simple CLI for running notebooks. This seems like the right place to provide it, IMHO.

from nbclient.

choldgraf avatar choldgraf commented on July 26, 2024

Some quick thoughts now that I've looked at discussion in #165 as well:

I think @MSeal is saying that this package is meant to be a "back-end" package that is consumed by other tools (like nbconvert, papermill, etc). The design, dependencies, etc of this package are optimized for developer consumers rather than end-user consumers. So the challenge is that if we make this package more user-facing, then it will create a tension in design, maintenance, scope, etc. For example, as described here the title nbclient is not a very end-user-friendly name...most users don't know what "client" means, but that's OK because this package is meant for developers to consume.

I'm also thinking of comments like #165 (comment) - it feels like we are making a suboptimal choice there about CLI design just because nbclient uses traitlets.

In addition, if this library adds user-facing dependencies like rich or click, are tools that consume this library (like nbconvert) happy with that dependency? I think we'd need to be very disciplined to not scope-creep on features like this (whether in this repo or another), because making nbclient more user-facing will also attract different kinds of requests for improvements.

So, I wonder if a solution would be to create a lightweight CLI package that is primarily designed for end-users. It'd be called something like nbexecute and it could lean a bit more heavily into the user-facing dependencies and design (e.g., depend on things like click, rich, etc). Its job would be to use nbclient under the hood, and provide user-friendly APIs, CLIs, and documentation to make it really easy to use.

The thing that I worry about is that I suspect a natural evolution of such a CLI will become more and more like papermill over time. Right now we just want a simple CLI, but what happens when somebody opens a feature request to add a variable to the notebook?

I'm curious what @palewire considers to be the "overkill" aspects of papermill. Is it that the CLI feels too complex? Or that it also feels designed for conversion rather than execution? Or the fact that it's not a core Jupyter tool? Or that the dependencies are too beefy? Or that the documentation doesn't make this "simple" use-case obvious enough?

To be clear, I think that something like jupyter execute mynotebook.ipynb would be a super useful CLI to expose to users, my questions here are more around "where is the right place for the tool that does this job?"

from nbclient.

palewire avatar palewire commented on July 26, 2024

Thanks for the thoughtful consideration, @choldgraf. Here's where I come from on papermill.

Distance from the core jupyter package is tough on newbies and novices. It's got a different brand name. It's not integrated with the core jupyter command. It requires users asking "How do I cron this thing?" to stumble around the web in confusion.

Papermill is presented and configured as a tool for super users. Below is a screenshot of the first use cases presented in the papermill README.

Screenshot from 2021-10-02 08-42-12

It then goes on to immediately describe how to output results to Amazon S3. Those cases are leagues beyond the use cases I have in mind, which are:

  • I have a web scraper I want to schedule to run once a day in my crontab
  • I have a GitHub Action or other CI workflow that needs to pull some data, process it with a notebook and commit every so often
  • I have a notebook to gather, transform and output data into a structure I can work with in a data visualization or publishing tool. I'd like to run it ad hoc from my shell or via a bash script or Makefile

In these cases, which I would wager are common, all the user needs to do is run a single notebook every so often. No parameters. No pipeline control maneuvers. No fancy outputting techniques. Just run a notebook. And if it crashes I get the errors spit back in my face right away. I think the goal of a tool like this should be to surface those simple options first.

One thing I like about nbclient and find admirable is that it takes a similar approach to the "back-end" of Python modules. So to me it seems like the natural place to bring it to the "front-end" of the command line as well.

This is beyond what we've discussed thus far, I think, but my view is that such a tool should also be packaged with the core packages installed by beginner users like jupyterlab so that with the single, initial install a user can run jupyter execute. If those master packages can run notebooks in your browser, I think the user's expectation is that they can run them in the shell too. I know it was mine. And in the little world where I work I've encountered at least a half dozen users who've shared the same expectation. Ultimately, I think that's the best outcome for the software. Where the CLI lives is less important to me.

from nbclient.

davidbrochart avatar davidbrochart commented on July 26, 2024

> So, I wonder if a solution would be to create a lightweight CLI package that is primarily designed for end-users. It'd be called something like nbexecute and it could lean a bit more heavily into the user-facing dependencies and design (e.g., depend on things like click, rich, etc). Its job would be to use nbclient under the hood, and provide user-friendly APIs, CLIs, and documentation to make it really easy to use.

Actually I was planning on doing almost that, by extracting the execution logic of nbterm into its own library, but without relying on jupyter_client (as is already the case in nbterm).

from nbclient.

choldgraf avatar choldgraf commented on July 26, 2024

Just a quick thought that all of @palewire's comments above make sense to me; I think those are reasonable concerns with the current state of tooling. There is not a "dead simple" way to execute Jupyter Notebooks from the CLI right now. I quite like the vision of running pip install jupyter and immediately having access to verbs like jupyter execute mynotebook.ipynb. I'll think on it more but wanna leave space for the ideas of others as well!

from nbclient.

choldgraf avatar choldgraf commented on July 26, 2024

It sounds like there aren't really strong opinions that we shouldn't do this, so what do folks think about:

  • Getting #165 ready to merge
  • In the documentation, mention that the CLI functionality here is meant to be relatively simple, and for more advanced use-cases, recommend tools like Papermill

from nbclient.

palewire avatar palewire commented on July 26, 2024

Sounds good to me. Happy to take on whatever else you'd like to see.

from nbclient.

palewire avatar palewire commented on July 26, 2024

I think this ticket can be closed thanks to #165 being merged. One final thing I'm trying to push through on it: Adding nbclient docs to the main jupyter package's docs.

from nbclient.

fperez avatar fperez commented on July 26, 2024

The above reminds me I need to add that to the %run docs, that functionality is buried in the code but not obvious from the docs.

from nbclient.

choldgraf avatar choldgraf commented on July 26, 2024

I didn't know about that pattern in IPython - but now that I know about it, I agree that we should follow precedent and name it run, since it is basically the same functionality.

This makes me wonder what the IPython run verb does under the hood - does it have a notebook execution implementation inside of IPython?

from nbclient.

fperez avatar fperez commented on July 26, 2024

Ask run??, dear @choldgraf and ye shall receive

image

We've had safe_execfile_ipy in there forever, and it knows how to run notebooks. Again, ?? is your friend:

image

Which now makes me realize, in addition to documenting ipynb support in %run, we should add the --allow-errors flag there too for consistency.

from nbclient.

fperez avatar fperez commented on July 26, 2024

@palewire had it been pushed out to PyPI yet? If it was only the Twitter blast, then I think it's fine to rename now - anyone running prod from git sources knows the game they're playing 🔥 :)

from nbclient.
