jupyter-server / jupyter-scheduler
Run Jupyter notebooks as jobs
Home Page: https://jupyter-scheduler.readthedocs.io
License: BSD 3-Clause "New" or "Revised" License
Users want to see whether a job definition is active or paused. Users want to pause or unpause a job definition.
See #81.
We are currently in the process of adding the ability for users to run jobs automatically on a schedule. This issue is to describe and track work needed to enable scheduling on the job detail view.
We should expand the current job detail view to be able to render both jobs and job details. Some key points of this work:
Add ability to specify email notifications when submitting jobs.
In a recent PR, I added a basic Button component. This issue is to track work to enable this button to be rendered as a pure icon button or icon + text.
Provide a UI extension point for output file links in the job list, so that a developer can override the appearance and behavior of output file links.
Offer a UI extension point and a component that reflects the default behavior.
See #24 for another example of an extension point.
We plan on having a job details view that contains more detailed information about a job. That view should have a breadcrumb of the form Jobs / [Job Name] or Job Definitions / [Job Def Name] that enables a user to navigate back to the list of jobs or job definitions. This is an issue to track our work to create those components.
We are currently in the process of adding the ability for users to run jobs automatically on a schedule. This issue is to describe and track work needed to enable scheduling on the create job form. The core user story here is this:
"As a user, I would like to create a job to run automatically on a schedule"
The idempotency token is not part of the schema for a job definition, but the "Idempotency token" input continues to appear in the create form after the user switches modes to "Run on a schedule".
After the user switches to "Run on a schedule," no longer show the idempotency token input. When the user switches back to "Run now," show the idempotency token input.
Users want to view job run errors for every job they run.
Make job run errors available in the API and make errors visible in the job details view.
I believe the create job form is missing some fields in the create REST API. Let's review that and make sure the UI is complete.
Implement these features for create_job API
Currently, the job details appear below the selected item in the jobs list. This multi-column layout doesn't leave enough room to display a lot of information. In addition, the column headers do not line up with the details in the inline details view. Users can expand multiple jobs' details at once, cluttering the list view.
Add a details view that takes up the entire size of the list view's container (i.e., a main area widget). When the user clicks on a job name, they are brought to the details view. From the details view, a user can click a breadcrumb link to return to the list view.
Remove the details that appear below the row in the list.
We can also move the action buttons out of the list view and into the details view.
We can provide for an extension point, allowing developers to customize the details displayed in the details view.
Only one job's details can be viewed at a time.
The create job API takes an output_prefix argument that determines in which directory the output files will be put when the job completes. Currently the TextField placeholder for this UI element reads "Output prefix". We should improve that to give the user more useful information, with an example. This information should include:
While editing the name or value of a parameter, the cursor jumps to the end of the input box after every character entry. This is present in the code in #32.
Click in the middle of the name or value of a parameter. Type a character. The character is inserted at that position and the cursor moves to the end of the input.
The cursor appears immediately following the inserted character.
This does not occur with other text inputs in this form.
This appears to be common behavior with React when the state update is not fast enough. I attempted to remediate this using local state, as mentioned below, but I didn't have the same success with the parameter inputs as I did with the standalone inputs.
In our initial work on adding job scheduling, we probably won't get to making Job Definitions editable. This issue is to track that follow-on work.
The list and detail view for a Job Definition should have an Edit action that allows the user to reconfigure elements of the scheduled job. See the REST API for which fields are editable:
In rerunJob in job-detail.tsx, the output formats are passed as name: format, label: format where format is 'ipynb' or 'HTML':
jupyter-scheduler/src/mainviews/job-detail.tsx
Lines 64 to 78 in c776e2d
Because these output formats do not match those in the environment's output formats list, they will never be selected on the create job form.
The output formats should be in the same format as is used in the environments list's environment definition: for example, { name: 'ipynb', label: 'Notebook' }.
The utility function outputFormatsForEnvironment is used to get an environment's properly formed output formats.
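The mismatch described above can be illustrated with a minimal sketch. The issue names outputFormatsForEnvironment but does not show its signature, so the code below uses a hypothetical helper, matchOutputFormats, and an assumed list of environment formats. It only shows the idea of resolving format names against the environment's own { name, label } entries instead of building name: format, label: format pairs.

```typescript
// Shape used by the environment definition, per the issue text.
interface OutputFormat {
  name: string;
  label: string;
}

// Hypothetical helper (not the project's API): keep only the environment's
// own format entries whose names were requested, so labels always match
// what the create job form offers.
function matchOutputFormats(
  requestedNames: string[],
  environmentFormats: OutputFormat[]
): OutputFormat[] {
  return environmentFormats.filter(
    format => requestedNames.indexOf(format.name) !== -1
  );
}

// Assumed example data; a real environment may list other formats.
const exampleEnvironmentFormats: OutputFormat[] = [
  { name: 'ipynb', label: 'Notebook' },
  { name: 'html', label: 'HTML' }
];

// 'pdf' is not offered by this environment, so it is dropped.
const matchedFormats = matchOutputFormats(
  ['ipynb', 'pdf'],
  exampleEnvironmentFormats
);
```

Filtering the environment's list, rather than constructing new objects, means a format can only be pre-selected if the environment actually offers it.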
We want to show the most recently created jobs at the top of the jobs list. Jobs have a start date and we sort the jobs list by default in descending order by start date.
The problem is that newly-created jobs are in state "STOPPED" with no start date, so they appear at the bottom (not at the top) of the jobs list by default.
Add two additional states for jobs to be in: "CREATED" and "QUEUED", both of which indicate that a job has not yet started.
Add a creation date/time for each job.
Sort the jobs list, by default, in descending order by creation date/time.
Should we add the creation date/time to the jobs list so that users can sort by it?
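A rough sketch of the proposed default ordering follows. The record shape and field names (jobId, createTime, startTime) are assumptions for illustration and not the project's schema; only the state names "CREATED", "QUEUED", and "STOPPED" come from the issue.

```typescript
// Illustrative record shape; field names are assumptions.
interface JobRecord {
  jobId: string;
  status: string; // e.g. 'CREATED', 'QUEUED', 'STOPPED'
  createTime: number; // milliseconds since epoch, set at creation
  startTime?: number; // absent until the job actually starts
}

// Sort by creation time, newest first. Because every job has a creation
// time, jobs that have not started yet no longer sink to the bottom.
function sortNewestFirst(jobs: JobRecord[]): JobRecord[] {
  // Copy before sorting so the caller's array is left untouched.
  return jobs.slice().sort((a, b) => b.createTime - a.createTime);
}

const exampleJobs: JobRecord[] = [
  { jobId: 'a', status: 'STOPPED', createTime: 1000, startTime: 1500 },
  { jobId: 'b', status: 'QUEUED', createTime: 3000 },
  { jobId: 'c', status: 'CREATED', createTime: 2000 }
];

const sortedJobIds = sortNewestFirst(exampleJobs).map(job => job.jobId);
```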
The value of runtime_environment_parameters is not passed to the API endpoint on form submission.
Create a job using the advanced options extension point that modifies runtime_environment_parameters in the model.
runtime_environment_parameters is present in the API call.
We are currently in the process of adding the ability for users to run jobs automatically on a schedule. This issue is to describe and track work needed to enable scheduling on the job list view. User stories include:
"As a user, I would like to view and manage both jobs and job definitions"
listJobsView model property here (https://github.com/jupyter-server/jupyter-scheduler/blob/main/src/model.ts#L93)
See issue #80 for parallel work on create view.
There is some cleanup that we need to do to how we are handling the overall state of the extension:
ReactWidget or VDomRenderer to automatically re-render the React component tree based on that state.
Information that will be needed in the model:
Our current errors object used for the create-job form uses a mapping from input name to error (as a string). For parameters, this is cumbersome, because parameters are ordered but the errors object is not.
Revise the errors object to be keyed on the possible members of the model, including an array for parameters.
Revise the removeParameter function passed to the parameters picker to modify the errors object's parameters.
This is a follow-up to #65.
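One possible shape for the revised errors object is sketched below. All type and field names here (CreateJobErrors, jobName, inputFile, removeParameterAt) are hypothetical; the point is only that keeping parameter errors in an array parallel to the parameters lets a removal update both by the same index.

```typescript
// Hypothetical shapes; the real model's members may differ.
interface JobParameter {
  name: string;
  value: string;
}

interface ParameterErrors {
  name?: string;
  value?: string;
}

interface CreateJobErrors {
  jobName?: string;
  inputFile?: string;
  // One entry per parameter, in the same order as the parameters.
  parameters: ParameterErrors[];
}

// Remove the parameter at `index` and its errors together, returning new
// objects so the result can be handed to a state setter.
function removeParameterAt(
  parameters: JobParameter[],
  errors: CreateJobErrors,
  index: number
): { parameters: JobParameter[]; errors: CreateJobErrors } {
  const keep = (_: unknown, i: number) => i !== index;
  return {
    parameters: parameters.filter(keep),
    errors: { ...errors, parameters: errors.parameters.filter(keep) }
  };
}

const afterRemoval = removeParameterAt(
  [
    { name: 'alpha', value: '1' },
    { name: '', value: '2' }
  ],
  { parameters: [{}, { name: 'Name is required' }] },
  0
);
```

After the first parameter is removed, the error that belonged to the second parameter sits at index 0, still aligned with the parameter it describes.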
What's confusing is that model.jobsView = 'ListJobs' will render either the job list or the job definition list, dispatching based on model.listJobsModel.listJobsView.
@andrii-i For now, you'll want to somehow set the listJobsView to 'JobDefinitionDetail' immediately after line 29 above.
@jweill-aws We need to have a more thorough discussion on the purpose of these models. To my recollection, the whole point of models as a global state store was to allow persistence (serialization to JSON in local storage) and rehydration upon page reload. However, the list jobs view really doesn't have any state that needs to be persisted. Whether the jobs list or the job definitions list is being rendered should be stored in model.jobsView rather than inside model.listJobsModel.listJobsView, which means model.listJobsModel can be deleted entirely.
Another refactoring suggestion: rename JobsModel => SchedulerModel and model.jobsView => model.view.
Originally posted by @dlqqq in #92 (comment)
The job panel has a model that is used to render the state of the view:
https://github.com/jupyter-server/jupyter-scheduler/blob/main/src/model.ts#L142
We should serialize this model to JSON by implementing a toJSON method and wire up that state to the JupyterLab layout restoration extension. This will improve usability by fully restoring the state of this panel when JupyterLab is reloaded.
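A minimal sketch of the serialization half of this idea is shown below. The class PanelModel and its fields are hypothetical stand-ins for the real model, and the wiring to JupyterLab's layout restoration is deliberately left out. The sketch relies only on the standard behavior that JSON.stringify calls an object's toJSON method.

```typescript
// Hypothetical persisted state; the real model holds more than this.
interface PanelState {
  jobsView: string;
  selectedId: string | null;
}

class PanelModel {
  jobsView: string = 'ListJobs';
  selectedId: string | null = null;

  // JSON.stringify picks this up automatically.
  toJSON(): PanelState {
    return { jobsView: this.jobsView, selectedId: this.selectedId };
  }

  // Rebuild a model from untrusted stored data, falling back to the
  // defaults for anything missing or of the wrong type.
  static fromJSON(raw: unknown): PanelModel {
    const model = new PanelModel();
    if (typeof raw === 'object' && raw !== null) {
      const state = raw as Partial<PanelState>;
      if (typeof state.jobsView === 'string') {
        model.jobsView = state.jobsView;
      }
      if (typeof state.selectedId === 'string') {
        model.selectedId = state.selectedId;
      }
    }
    return model;
  }
}

const panelModel = new PanelModel();
panelModel.jobsView = 'JobDefinitionDetail';
panelModel.selectedId = 'def';

const serializedPanel = JSON.stringify(panelModel);
const restoredPanel = PanelModel.fromJSON(JSON.parse(serializedPanel));
```

Validating each field in fromJSON keeps a stale or corrupted stored value from breaking the panel on reload.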
When the user clicks "Delete Job" or "Delete Job Definition" from the details view, the job or job definition is deleted immediately, with no confirmation step. This is irreversible.
The user should see a confirmation dialog (e.g., a modal) when they click a delete button so that they can confirm their decision to delete.
In certain cases, the environments dropdown in the create form will have a single value. In that case, the environment idea is likely to only confuse the user. We should simplify the UX in that case by auto-selecting the single value and hiding the dropdown and its label.
Wanted to raise as a point of discussion whether our REST APIs that accept parameters should take those as URL query string parameters or a JSON body. I am probably leaning towards a JSON body. Which are we doing currently?
Developers would like to call a different API when "Run Now" is clicked.
Let developers override the "Run Now" action to call a different API.
Right now, we allow the user to edit the input file path in the create form. This introduces new error possibilities (user can enter a file that doesn't exist) and seems redundant at that point in the flow (they just picked a file).
I propose that we make the input file read only in the create form.
Users may not know that they can create a job or job definition from the current notebook.
In the current notebook, add a toolbar button to open the create-job form to create a job or job definition from the current notebook. As is the case when the user right-clicks on a file in the file browser, the form should be pre-populated based on the current notebook.
This is an issue to track work to create a reusable Checkbox component for this extension that has the default JupyterLab styling.
Some job platforms have logs that the user may want to view when the job is completed. Users will want a way to view the log files.
For now, we are going to keep it simple and have backends that support log files treat them as an output format. Those log files should automatically be copied into place with the rest of the outputs selected by the user.
In addition to the model and modelChanged props, add two more:
errors: a mapping from input names to error strings.
setErrors: a function, in the useState style, to update the list of errors.
If any input has a "truthy" error value, display the MUI error treatment on the control and do not permit the user to submit the form.
This is a follow-up to #24.
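The "truthy" rule above could be checked with something like the following sketch. The names FormErrors and hasErrors are hypothetical, and the example input names are made up for illustration.

```typescript
// Mapping from input names to error strings, as described above.
type FormErrors = { [inputName: string]: string };

// True if any input currently has a non-empty (truthy) error string.
// An empty string means the input has been validated and is fine.
function hasErrors(errors: FormErrors): boolean {
  return Object.keys(errors).some(inputName => Boolean(errors[inputName]));
}

// Assumed example state: one cleared error and one active error.
const cleanForm: FormErrors = { jobName: '' };
const invalidForm: FormErrors = { jobName: '', inputFile: 'Required' };

// A submit button could be disabled whenever hasErrors(...) is true.
const canSubmitClean = !hasErrors(cleanForm);
const canSubmitInvalid = !hasErrors(invalidForm);
```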
From looking at the code, here's how I'm reading the idempotency token (IT) logic:
abc. Job is created with ID def.
abc. Job is not created and ID def is returned.
This is confusing because, without any warning or error message, the user is brought to the job list page where the IT is not displayed prominently.
Provide a warning message that is displayed on non-creation, as above, due to IT conflict.
In the UI, interpret this warning message and provide a link to view the job details for the job with ID=def.
This is a follow up to #33.
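The behavior described above, plus the proposed warning, can be sketched with an in-memory store. Everything here (JobStore, CreateJobResult, the warning text, the generated IDs) is an assumption for illustration and not the project's backend.

```typescript
// Hypothetical result shape carrying the proposed warning.
interface CreateJobResult {
  jobId: string;
  created: boolean;
  warning?: string;
}

// In-memory stand-in for the backend, for illustration only.
class JobStore {
  private jobIdsByToken: { [token: string]: string } = {};
  private nextId = 1;

  createJob(idempotencyToken?: string): CreateJobResult {
    if (
      idempotencyToken !== undefined &&
      Object.prototype.hasOwnProperty.call(this.jobIdsByToken, idempotencyToken)
    ) {
      // Token already used: return the existing job and say so, instead
      // of silently returning its ID.
      const existingId = this.jobIdsByToken[idempotencyToken];
      return {
        jobId: existingId,
        created: false,
        warning: `No job created: idempotency token "${idempotencyToken}" already belongs to job ${existingId}.`
      };
    }
    const jobId = `job-${this.nextId++}`;
    if (idempotencyToken !== undefined) {
      this.jobIdsByToken[idempotencyToken] = jobId;
    }
    return { jobId, created: true };
  }
}

const store = new JobStore();
const firstAttempt = store.createJob('abc');
const secondAttempt = store.createJob('abc');
const withoutToken = store.createJob();
```

The UI could use the created flag and the returned jobId to show the warning and link to the existing job's details.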
Users cannot run multiple notebooks at once.
In the file browser, let users select multiple notebooks (multiple files or directories), right-click, and choose a context menu option to run all notebooks as jobs.
Modify the create job form to accept multiple notebooks as inputs and to create multiple jobs on "Run Now".
Should users be able to create multiple job definitions and schedules as a result?
JobDetail and buildTableRow have duplicate OutputFile elements.
We should be able to reuse this code instead of duplicating it.
Currently we are showing an error if the user does not enter text in the Input File text field. This is a good first step, but more validation might be needed to prevent the user from entering malformed input.
We can check for the existence of the file rather than "emptiness" for a better form of validation.
While our underlying data model for a schedule is a CRON expression, we should include a human-readable form of the CRON expression in the list and detail views. Here is one such library that generates a human-readable form of a CRON expression:
I would like to suggest extending the support of jupyter-scheduler to JupyterHub.
I feel this could be beneficial for several use cases. For example, in our case we use JupyterHub to interact with smart manufacturing systems and prototype various codes in Jupyter Notebooks. Running some of them on a regular basis (e.g., to check the availability of running services) would definitely be a use case for us.
This is an issue to track the work to create a form label component. The create job form has a number of form components (text, checkbox, dropdown) that have a label component. This would be a simple component to help with reusability and consistency:
label element.
If the advanced options panel is collapsed, and at least one input in it has an error, the user might not know why they cannot submit the form.
When the expandable section is collapsed and at least one error is contained in it, display an indicator on the section.
Alternatively, keep the section expanded and do not let the user collapse the section if at least one error is present.
This is an issue to track work to create a dropdown component for this extension. This component is used in the create form. For now, we can create a fairly simple version of a CSS-styled dropdown that uses JupyterLab's theme variables.
Looks like output_formats needs to be added to the StaticEnvironmentManager
After installing via pip, running jupyter server extension list shows a complaint about psutils not found.
Installing psutils manually resolves the issue.
The job definition schedule is stored in cron format (e.g., */5 * * * *) which most users will not recognize in text.
In the job definition list and details view, display the job definition schedule in a human-readable way (e.g., */5 * * * * as "every 5 minutes").
All supported JupyterLab languages should be supported; the user's chosen locale should be used for display.
cron explanation libraries exist for multiple languages.
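To make the idea concrete, here is a deliberately small, English-only sketch that handles just two common patterns and falls back to the raw expression for everything else. The function name describeCron is hypothetical, and this is not a replacement for a full cron explanation library or for the localization requirement above.

```typescript
// Left-pad a numeric cron field to two digits, e.g. '9' -> '09'.
function padTwo(field: string): string {
  return ('0' + field).slice(-2);
}

// Describe a five-field cron expression in plain English for two simple
// cases; anything else is returned unchanged.
function describeCron(expression: string): string {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) {
    return expression;
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = fields;
  const everyDay = dayOfMonth === '*' && month === '*' && dayOfWeek === '*';

  // Case 1: "*/N * * * *" means every N minutes.
  const step = /^\*\/(\d+)$/.exec(minute);
  if (step && hour === '*' && everyDay) {
    const n = Number(step[1]);
    return n === 1 ? 'every minute' : `every ${n} minutes`;
  }

  // Case 2: "M H * * *" means once a day at H:M.
  if (/^\d+$/.test(minute) && /^\d+$/.test(hour) && everyDay) {
    return `every day at ${padTwo(hour)}:${padTwo(minute)}`;
  }

  // Unrecognized pattern: show the raw expression rather than guess.
  return expression;
}

const everyFive = describeCron('*/5 * * * *');
const dailyMorning = describeCron('30 9 * * *');
const unsupported = describeCron('0 0 1 * *');
```

Falling back to the raw expression keeps the display correct, if not friendly, for schedules the sketch does not understand.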
Users don't have a way to pick an existing job definition and submit a notebook job that would run immediately.
Add a REST API that accepts a job definition and runtime parameters as inputs and starts a "run now" job, overriding the job definition's attributes with those passed in the request.
If the input file for a notebook job has a relative path including a subdirectory, the "input" column shows only the subdirectory. The filename does not appear.
foo/bar.ipynb.
foo, but not the filename bar.ipynb, appears in the "Input file" column.
foo/bar.ipynb or bar.ipynb appears.
Seen in a 2022-09-30 demo.
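One of the two acceptable outcomes above (showing the filename alone) could be produced by a helper along these lines. The name inputFileLabel is hypothetical, and the sketch assumes forward-slash separated relative paths as in the example.

```typescript
// Return the last non-empty path segment, i.e. the filename, so that
// 'foo/bar.ipynb' is shown as 'bar.ipynb' rather than 'foo'.
function inputFileLabel(path: string): string {
  const segments = path.split('/').filter(segment => segment.length > 0);
  return segments.length > 0 ? segments[segments.length - 1] : path;
}

const nestedLabel = inputFileLabel('foo/bar.ipynb');
const flatLabel = inputFileLabel('bar.ipynb');
```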
This is an issue to track work to create a TextField component for form input. We use this component in the create job form and need to encapsulate the visual styling and interaction behavior. Some rough guidelines:
ch units.