Comments (20)

jhamman commented on August 28, 2024

@alando46 and I have been discussing this idea a bit in pangeo-data/pangeo#544. Something users could do today is include a test suite in their repo and test a built image like this:

docker run -i -t ${IMAGE_NAME} "py.test"

I suppose a generic verify interface would be useful as well if you wanted to add this as an option to the repo2docker CLI or BinderHub in some way.
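
A rough sketch of that workflow, assuming the repo keeps its tests in a tests/ directory and that the image name is one you pick yourself:

# build the image without starting a container, then run the test suite inside it
repo2docker --no-run --image-name my-binder-image .
docker run -i -t my-binder-image py.test tests/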

lheagy commented on August 28, 2024

It seems like there is potentially some overlap with the conversation happening on Discourse about testing: https://discourse.jupyter.org/t/testing-notebooks/701/2. Although the Discourse post is mostly focused on notebooks, testing the environment seems like an appropriate first step.

betatim commented on August 28, 2024

I think we could/should establish a convention for the name of a script that a robot/human can run to convince themselves that the repo was properly set up and its contents "work".

For that I think we should promote verify as a friend of postBuild and start. It should be an "executable" that is run inside a built image and returns 0 if all is good, or anything but zero to signal "stuff is broken". What exactly the script does should be left to the user, with some docs and examples for "run all notebooks in my repo" and similar tasks.

I don't think we should try to offer the convenience of running all notebooks in the repo, because my feeling is that there are enough cases where running all notebooks would be super expensive or impossible because something is missing. Basically, too many edge cases.
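
A minimal sketch of what such a verify script could contain; the specific checks are purely illustrative, not an agreed convention:

#!/bin/bash
# verify: exit 0 if the built image "works", non-zero otherwise
set -euo pipefail

# example check: key packages of the environment import cleanly
python -c "import numpy"

# example check: run whatever test suite the repo ships
py.test tests/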

jhamman commented on August 28, 2024

FWIW, the verify script is what we started using in pangeo-stacks.

yuvipanda commented on August 28, 2024

It could be something like 'verify' that runs like postBuild, and should exit cleanly for success or non-cleanly for failure.

I don't know how useful it will be, however, or what specific need it solves.

choldgraf commented on August 28, 2024

The way I think of it is that a lot of people create repositories that are just for showing off a binder, and aren't really part of a development package. In that case, you probably aren't using proper testing or CI, so you may be able to build an image with repo2docker just fine, but there may be some underlying flaw that you don't know about.

E.g., in the case of the "cost estimator" notebook, bqplot had been updated in a way that introduced an error in the notebook code. However, you'd only know that if you actually opened the notebook and ran the cells. If there were something like a verify script, then I could have written that check into the repo and caught it.
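
A verify script along those lines could, for example, execute the notebook headlessly and fail if any cell errors; this is just one possible approach, and the notebook name here is made up:

# execute the notebook non-interactively; nbconvert exits non-zero if a cell raises
jupyter nbconvert --to notebook --execute --output-dir /tmp cost_estimator.ipynb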

yuvipanda commented on August 28, 2024

choldgraf commented on August 28, 2024

Yeah, I agree with you - I'm not sure of the best way to set limitations there... but I imagine something like this will need to exist long-term for reproducibility anyway (e.g. ensuring that some conditions always hold, such as the result of an analysis being the same number).

choldgraf commented on August 28, 2024

We discussed this a bit today - @Carreau suggested a script whose contents would be run after the Docker image was built.

jzf2101 commented on August 28, 2024

FYI there is interest in this issue from the ML community.

betatim commented on August 28, 2024

Maybe the way forward is to create some examples (like @jhamman showed) and see what people do and use.

I don't think we should execute any verify script during the build process. For me it is more about agreeing on a "spec" for where to find and how to run whatever scripts/tests/machinery the repo author envisioned being used to determine whether "this container still works".

choldgraf commented on August 28, 2024

I think "verify" is basically just some version of "run some tests" and "emit a verify-specific message depending on how it goes".

Eventually (e.g. if you wanted a BinderHub specifically for publication), you'd want to distinguish between "this repo didn't build because it had an error in repo2docker" and "this repo didn't build because the verify step failed".

However, in the meantime (and to get inspiration for what this would look like in the future), it'd be great to start collecting some "best practices" patterns for reproducibility with Binder.

choldgraf commented on August 28, 2024

Now that we've got Zenodo integration with BinderHub, I think this question is more relevant than ever. A couple main points of discussion:

  • Should this be a specification for testing, or just a file that gets run?
  • Should these tests be run at build time, or should they be executed through some other mechanism (e.g. optionally after a container is launched from the built image)?

Answers to these will affect which repositories need changes. Maybe this is another benefit of extending the REE specification outside of just repo2docker. Then we could add a specification for "how to define validation tests in a repository" that repo2docker wouldn't necessarily touch, but that BinderHub could leverage to occasionally test certain repositories (we probably wouldn't do this for mybinder.org, but if you were, e.g., a journal running a BinderHub, then you'd want something like this).

choldgraf commented on August 28, 2024

@betatim I agree about using a postBuild-like file that is just a way to signal to either a reader or a builder "this should be run to determine if the repository does what it says it does". Also +1 on others deciding what goes into that file.

It sounds like you're also leaning towards treating this as a pattern, but not automatically running it within the build process?

betatim commented on August 28, 2024

Nods, I would not run this as part of building the repo as it could/should be something that takes minutes or hours to run.

Following the mantra that repo2docker doesn't introduce new conventions but copies existing ones, I'd say let's start using verify in our own repos with a CI setup like repo2docker . ./verify, which should build the image and run the verify step as part of the CI. Maybe, together with some docs, this pattern will start spreading, and then we can have a CLI flag for repo2docker like --verify that builds and runs the verify script?!
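
A rough sketch of what such a CI job might run, assuming the verify script lives at the repo root and is executable:

pip install jupyter-repo2docker
# build the repo in the current directory and run its verify script inside the built image
repo2docker . ./verify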

choldgraf commented on August 28, 2024

@betatim that sounds good to me - I think we could codify this in the REES page and then just mention it in the docs to see where it goes from there.

choldgraf commented on August 28, 2024

@jhamman perfect - do you think this sounds like a reasonable pattern to codify in the repo2docker spec?

manics commented on August 28, 2024

Ping! It sounds like there was a good discussion here - does anyone want to follow up?

betatim commented on August 28, 2024

I'd close it or convert it into an issue about creating documentation that suggests to people that a pattern we encourage is to create a verify script that exits with status 0 or 1 depending on whether the image is "good". However, I wouldn't add much tooling in repo2docker itself for this (for example, automatically running the script after building the image).

consideRatio commented on August 28, 2024

Do we have agreement that the verify script is a reasonable pattern worth suggesting, but we won't develop a repo2docker feature to use it?

I'll assume so, and update this issue to be about documentation.
