regro / cf-scripts
Flagship repo for cf-regro-autotick-bot
License: Other
@CJ-Wright commented on Tue Feb 27 2018
Human-readable graphs (e.g. in yaml) take a bunch of time to write. Additionally, we have some scripts that we want to run but don't want to take up much-needed time from the actual work of the bot (top 100 lists, checking for cyclic deps, etc.). I think we can tack some of these onto 01 since that only uses about 15 mins of time total.
@CJ-Wright commented on Fri Feb 23 2018
We may need to spoof different OSs for reading the meta.yaml (for packages which have different urls/sha for different OSs)
e.g.
https://github.com/conda-forge/git-lfs-feedstock/pull/6/files
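A minimal sketch of what the OS spoofing could look like, assuming we only need to evaluate line selectors like `# [win]`. The `apply_selectors` helper and its tiny namespace are illustrative only; conda-build's real selector namespace is much richer (py, np, arch, etc.).

```python
import re

# Trailing selectors look like:  url: ...  # [win]
SELECTOR_RE = re.compile(r"#\s*\[([^\]]+)\]\s*$")

def apply_selectors(meta_text, platform):
    """Keep only lines whose selector matches the spoofed platform.

    platform is one of "linux", "osx", "win"; the namespace here is a
    simplified stand-in for conda-build's selector variables.
    """
    ns = {"linux": platform == "linux",
          "osx": platform == "osx",
          "win": platform == "win",
          "unix": platform in ("linux", "osx")}
    out = []
    for line in meta_text.splitlines():
        m = SELECTOR_RE.search(line)
        if m is None:
            out.append(line)  # no selector: always keep
        elif eval(m.group(1), {}, ns):  # selector is a tiny boolean expr
            out.append(SELECTOR_RE.sub("", line).rstrip())
    return "\n".join(out)
```

Rendering the same meta.yaml once per spoofed platform would then surface the per-OS urls/shas.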
@CJ-Wright commented on Thu Mar 01 2018
Also conda-forge/git-feedstock#33
@jakirkham commented on Thu Mar 01 2018
IDK if you are using conda-build 2 or 3. If 2, I can point you to some code in conda-smithy/conda-build-all for this. If 3, there is a feature in conda-build that you can use.
@CJ-Wright commented on Thu Mar 01 2018
Currently we are using 2, but I don't think there is any reason we can't use 3, so whichever will be easier to implement.
@jakirkham commented on Thu Mar 01 2018
Probably both are easy. Will leave it up to you.
We use a function called fudge_subdir to do this in conda-smithy (though it will be gone in conda-smithy 3). In conda-build 3, there is a function called render.
We may need a way for feedstocks to opt out via a blacklist. We'll still keep track of the feedstocks in the graph, but we can remove them from the PR process.
It's certainly reasonable to start out with a cron job for these sorts of things. Also, as we resolve some technical debt, the cron job is very helpful. That said, we have generally found in conda-forge that cron jobs inevitably struggle to scale.
To solve this problem, we have ultimately moved all of them to web services that use webhooks. This allows them to deal with notifications as they come in and respond by doing some task. This approach seems well suited for updates. However, it will require some thought into how we can get notifications from package indexes, GitHub, etc. I expect this will iron out any issues related to load.
@CJ-Wright commented on Thu Mar 01 2018
conda-forge/plumpy-feedstock#2
https://github.com/conda-forge/plumpy-feedstock/blob/master/recipe/meta.yaml#L2
@CJ-Wright commented on Thu Mar 01 2018
At this point we should have a matrix of find-and-replaces which also includes some good regex matches (arbitrary number of spaces) and a correcting system (normalize the output).
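A small sketch of that idea: match a jinja2 version line with arbitrary spacing and always rewrite it in one normalized form. The `bump_version` helper is illustrative, not the bot's actual replacement code.

```python
import re

# Tolerates {%set version="1.0" %}, {% set  version = "1.0"%}, etc.
VERSION_RE = re.compile(r'\{%\s*set\s+version\s*=\s*"([^"]+)"\s*%\}')

def bump_version(meta_text, new_version):
    # Normalize the output regardless of the input's spacing.
    return VERSION_RE.sub(
        '{{% set version = "{}" %}}'.format(new_version), meta_text)
```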
@jakirkham commented on Fri Mar 02 2018
This becomes more important when we start to consider downstream packages that have tight version constraints on upstream packages. Admittedly that can get into a whole other can of worms finding the dependencies (w/version constraints) and using them to solve how best to update the graph.
@CJ-Wright commented on Fri Mar 02 2018
I meant just normalizing the jinja2 variables. Yes, optimal bumping on the graph would be awesome. It may even be possible as soon as we get rid of the cycles in our otherwise perfect DAG (although maybe we can excise the cycles from the DAG and handle them separately?)
In the PR body we should include a statement which encourages maintainers to push to bot branches as needed. The bot itself doesn't push to these branches (unless we specify a redo in the graph, which is rare) and the bot devs aren't really equipped to troubleshoot each feedstock's issues.
With conda-build 3 there is the capacity for multiple package/source recipes. This may create a whole bunch of issues for the bot.
@tkelman commented on Fri Feb 23 2018
see conda-forge/awscli-feedstock#29
@CJ-Wright commented on Fri Feb 23 2018
Thank you for reporting!
@CJ-Wright commented on Fri Feb 23 2018
This came about because you were able to update the version before the bot realized there was a new version. Normally that PR would have included a version bump.
@scopatz thoughts?
This is being overwritten because in the replacements (a) there is no {{ build ... }} in our replacement setup, and (b) there is also a replacement which overwrites the build setting. I don't suppose that there is a meaningful way to do this in the regex?
To be fair, having the build number as a jinja variable is a bit uncommon, especially as it is usually only used once, hence why we didn't include it in the initial setup.
@scopatz commented on Mon Feb 26 2018
Well, you could check and make sure that the build: NUM really is just a simple number before overwriting it. If it isn't, you can punt.
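That check could be sketched like this: only reset the build number when it is a plain integer literal, and punt (return None) on anything templated. `safe_reset_build_number` is a hypothetical helper, not existing bot code.

```python
import re

# Matches a "number:" line; group 2 is the value to inspect.
BUILD_RE = re.compile(r'^(\s*number:\s*)(\S+)[ \t]*$', re.MULTILINE)

def safe_reset_build_number(meta_text):
    m = BUILD_RE.search(meta_text)
    if m is None or not m.group(2).isdigit():
        return None  # not a simple number (e.g. jinja): punt
    return BUILD_RE.sub(r'\g<1>0', meta_text)
```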
@scopatz commented on Mon Feb 26 2018
Basically, this comes down to the fact that the ticker assumes a specific form of meta.yaml; if it isn't in that form, it will fail.
@CJ-Wright commented on Fri Mar 02 2018
@CJ-Wright commented on Wed Feb 14 2018
(there are some common funcs and such)
Sometimes during version updates, URLs and summaries change. It would be nice if the bot could add these changes as well. This should be possible in Python by running python setup.py egg_info inside the sdist and parsing *.egg-info/PKG-INFO. There may also be a way to get this info from the PyPI API. Might be possible to pick up on Python version support as well (e.g. is Python 2 supported or now dropped?).
This overlaps to some extent with issue ( #22 ).
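The PyPI API route could look like the sketch below: fetch `https://pypi.org/pypi/<package>/json` and read a few fields from the `info` block. The field names follow the public PyPI schema, but treat the exact keys as an assumption; only the parsing step is shown so it stays network-free.

```python
def extract_pypi_metadata(pypi_json):
    """Pull recipe-relevant fields from a parsed PyPI JSON response."""
    info = pypi_json["info"]
    return {
        "summary": info.get("summary"),
        "home_page": info.get("home_page"),
        # requires_python hints at py2/py3 support being added or dropped
        "requires_python": info.get("requires_python"),
    }
```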
@CJ-Wright commented on Sat Feb 24 2018
Should we clear out patches on version bump?
@justcalamari
@scopatz
@isuruf
@scopatz commented on Mon Feb 26 2018
What are you asking, precisely?
@CJ-Wright commented on Mon Feb 26 2018
If we version bump a package, most likely the patch won't work. Should we delete the patch or should we just leave it there for the maintainers to deal with?
noarch all (most of) the things!
In the process of bumping version numbers on many feedstocks I found that a lot of them may be good candidates for noarch. It would be good to have something that goes around and tries to build noarch versions of feedstocks which match some criteria which make a project likely to be noarch friendly.
I expect that some of the criteria would be:
- toolchain dep

I don't know if that is a comprehensive list, but it may be better to just put out the PRs and see if they work.
@CJ-Wright commented on Thu Mar 01 2018
It would be good to pull the current package version from anaconda.org. See conda-forge/github3.py-feedstock#11 (comment)
Is there an API for anaconda.org?
Attn:
@scopatz
@scopatz commented on Thu Mar 01 2018
I think if you use conda search with '*' and the --channel option (and probably the --json option) you will get a list of all the packages.
@CJ-Wright commented on Thu Mar 01 2018
That is very clever!
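That suggestion could be consumed like this: `conda search '*' --channel conda-forge --json` emits a dict mapping package name to a list of records, each with a "version" key. The sketch below picks the newest version per package using a loose numeric sort, not conda's real version ordering, so treat it as an approximation.

```python
def latest_versions(search_json):
    """Map package name -> newest version from parsed conda search JSON."""
    def key(v):
        # Crude ordering: numeric dot-parts compare as ints, others as -1.
        return tuple(int(p) if p.isdigit() else -1 for p in v.split("."))
    return {name: max((r["version"] for r in records), key=key)
            for name, records in search_json.items()}
```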
@CJ-Wright commented on Sun Feb 25 2018
And count them so we can target upstreams by the number of recipes which require them.
In aggregate this question is about how many PRs we should expect the bot to make in one day.
@CJ-Wright commented on Sat Feb 24 2018
It would be good to track the reason why we thought something was bad. This would require more elegant parsing of the bad file.
@CJ-Wright commented on Sat Feb 24 2018
This would also require a complete emptying of the bad file.
@CJ-Wright commented on Thu Feb 15 2018
We can't tick the versions, but we could track the package name/version via calls to their gh page, I think.
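One way to track those packages is GitHub's releases endpoint (`GET /repos/{owner}/{repo}/releases/latest`), whose response carries a `tag_name`. The helper below is illustrative; it only normalizes the common leading "v" on tags, and the parsing is shown network-free.

```python
def version_from_release(release_json):
    """Extract a version string from a parsed GitHub release response."""
    tag = release_json.get("tag_name", "")
    # Tags are commonly "v2.1.0"; strip the conventional prefix.
    return tag[1:] if tag.startswith("v") else tag
```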
@CJ-Wright commented on Wed Feb 14 2018
@CJ-Wright commented on Sat Feb 24 2018
Currently we have split up the scripts over different repos; however, that may not be the most efficient use of Travis or our API calls. We could consider backfilling our current scripts with 03 to fill up the API calls and other status scripts (cycle checking, current PR/out-of-date, etc.) to fill up the time.
Current usage status:
Script | API calls | time
---|---|---
00 | few | 4:12
01 | 3756 (many) | 21:09
02 | few | 7:10
03 | few | 43:43

For example, we could tack a run of 03 onto 00 to almost double our time creating PRs.
Note that the main changes required are:
@CJ-Wright commented on Tue Feb 13 2018
See conda-forge/conda-forge.github.io#428
@jakirkham commented on Fri Mar 02 2018
Expect cb3 will help as we will have a list of pinned dependencies in one file.
As we make enhancements to the code we should consider cleaning out bad and bad upstream. That way, things that we couldn't process can get another shot.
...and issue the webservices command to update it.
There are two PRs in this feedstock, one created manually by me, and the other was created by the autotick-bot:
conda-forge/menuinst-feedstock#1
and
conda-forge/menuinst-feedstock#2
The changeset is exactly the same, and the CI succeeds in the manual PR, but fails in the bot PR with this error:
INFO:conda.gateways.disk.delete:rm_rf failed for c:\users\appveyor\appdata\local\temp\1\tmpl4pwk8
Traceback (most recent call last):
File "C:\Miniconda-x64\Scripts\conda-build-script.py", line 5, in <module>
sys.exit(conda_build.cli.main_build.main())
File "C:\Miniconda-x64\lib\site-packages\conda_build\cli\main_build.py", line 342, in main
execute(sys.argv[1:])
File "C:\Miniconda-x64\lib\site-packages\conda_build\cli\main_build.py", line 333, in execute
noverify=args.no_verify)
File "C:\Miniconda-x64\lib\site-packages\conda_build\api.py", line 97, in build
need_source_download=need_source_download, config=config)
File "C:\Miniconda-x64\lib\site-packages\conda_build\build.py", line 1524, in build_tree
config=config)
File "C:\Miniconda-x64\lib\site-packages\conda_build\build.py", line 1159, in build
built_package = bundlers[output_dict.get('type', 'conda')](output_dict, m, config, env)
File "C:\Miniconda-x64\lib\site-packages\conda_build\build.py", line 939, in bundle_conda
path_to_package=tmp_path)
File "C:\Miniconda-x64\lib\site-packages\conda_verify\verify.py", line 30, in verify_package
getattr(package_check, method)() is not None]
File "C:\Miniconda-x64\lib\site-packages\conda_verify\checks.py", line 309, in check_windows_arch
file_object_type = get_object_type(file_header)
File "C:\Miniconda-x64\lib\site-packages\conda_verify\utilities.py", line 117, in get_object_type
return "DLL " + DLL_TYPES.get(i)
TypeError: cannot concatenate 'str' and 'NoneType' objects
Command exited with code 1
Why is this happening? I imagine that build config is different for bot and manual, but probably there is something to be fixed here.
We now have quite the treasure trove of data on how packages depend on one another. It would be awesome to have some cool visualization of this data.
@CJ-Wright commented on Wed Feb 14 2018
Some recipes make jinja2 calls to os.environ. I'm not certain how to deal with that.
@CJ-Wright commented on Thu Feb 15 2018
It would be nice to have a bit of documentation (potentially in the code) on what is going on.
@CJ-Wright commented on Wed Feb 14 2018
As @isuruf mentioned we may be able to update the CF version in the graph by pulling commit messages from feedstocks.
@CJ-Wright commented on Sat Feb 24 2018
This might be nice for understanding the highest build and run deps separately.
@jakirkham commented on Fri Mar 02 2018
What sorts of things are you thinking about here? Are you contemplating what rebuilds of packages might be needed based on an upstream version change?
@CJ-Wright commented on Fri Mar 02 2018
Yes, also separating the most build depended on packages from the most run depended on packages (maybe for stress testing).
Admittedly this may not always make sense. However, in cases where it does, this would be very useful. Namely, it would be good to trigger rebuilds of downstream dependencies when an upstream dependency is rebuilt. As a simple example, oniguruma and jq.
It would be good to track the re-render version in the meta.yaml.
We have some cases like this one where the version in Jinja ends up being a function of other Jinja variables. This is typically motivated by two things AFAIAA.
These might be rare enough that the answer is we adjust the recipes so the bot can update them more easily. Figured I'd raise the issue anyways though to see if anyone had other ideas.
@CJ-Wright commented on Sat Feb 24 2018
The bot needs a logo
@pkgw commented on Sat Feb 24 2018
I just got this PR from the bot. I was able to figure out the intent, but the initial message that it posted was not super clear to me, due to the empty table of "pending dependencies" (I'm not sure what "pending" means here). If I'm understanding the purpose of the bot properly, I think it would be helpful to have some text in the PR message along these lines:
This PR updates $PACKAGE to the latest version on PyPI, $VERSION, from $OLD_VERSION. It also updates the following dependencies in the meta.yaml file: $BLAH ...
@CJ-Wright commented on Sat Feb 24 2018
Thank you for reporting!
The purpose of the bot is to tick versions; we currently don't have a way to update which dependencies are in the recipe. The pending dependencies are stated dependencies which also need to be version bumped (since one may want to wait for the deps to be updated before updating the downstream packages).
The pending dependencies table is being removed in #25 (if there are no pending deps).
We can add something to the effect of your statement, although we support more than PyPI, e.g.:
This PR updates $PACKAGE to $VERSION from $OLD_VERSION.
Preferably in its own directory with the node name, exception, and traceback.
@bsipocz commented on Fri Mar 02 2018
First of all, thank you for this bot; I'm sure many maintainers will agree that this is a great feature to have in conda-forge.
I have one feature request though. We usually include the conda-forge update in our release procedure, so it has already happened a few times that there is a version update PR (usually waiting for CIs to pass) when the bot opens one too, clogging the CI services even more. I think it would be rather awesome if it would check not only the main repo, but the content of already-opened PRs, too.
@CJ-Wright commented on Fri Mar 02 2018
Thank you for reporting!
@CJ-Wright commented on Fri Mar 02 2018
This may need its own script and/or worker since it may be GH API heavy.
My understanding of what needs to happen:
feedstocks = get_feedstocks()
for feedstock in feedstocks:
    for PR in feedstock.PRs():
        get_yaml()
        update_node_attributes_with_new_info()
@isuruf commented on Fri Mar 02 2018
Or the linter can write to a file/db on each PR and the bot could check this file/db
@CJ-Wright commented on Fri Mar 02 2018
This may need atomic-like operations on the graph.
I'll open another issue to discuss that.
See: https://github.com/regro/cf-graph/issues/52
@jakirkham commented on Fri Mar 02 2018
Definitely agree with this issue. Though I wonder to what extent this is a consequence of the bot recently coming online vs. a recurrent problem we will face well into the future (if not otherwise addressed).
@bsipocz commented on Fri Mar 02 2018
@jakirkham - you're probably right; if this bot becomes the default behaviour I suspect most maintainers will stop opening those update PRs in the first place. However, in that case having a way to opt out may be useful.
@CJ-Wright commented on Fri Mar 02 2018
@bsipocz although the bot is currently pushing the CIs rather hard, I think it will get easier once we enter steady state (and finish running through all the packages). At that point I think it would be OK to just close the bot's PRs. My assumption (which may not be true) is that the rate of version bumps will be slow enough that the bot opening an erroneous PR would not be too burdensome.
So it seems that some of the feedstocks we fail on are due to 404 errors.
Some of these are from the discrepancy between pypi.io and pypi.python.org.
Others are 404 errors from GitHub where there is a mismatch between what 02 found and what exists.
@jakirkham commented on Thu Mar 01 2018
Not sure how you are checking for updates currently, but you may find issue ( pypi/warehouse#1683 ) interesting. Basically asking PyPI to include some sort of feed for subscribing to specific packages for updates.
@CJ-Wright commented on Fri Mar 02 2018
That would be very cool.
Currently we trawl through all of the packages looking at their upstream URLs, but having all the PyPI packages managed through their own stream would be great!
@jakirkham commented on Fri Mar 02 2018
By upstream URLs do you mean home, dev_url, or something else?
@CJ-Wright commented on Fri Mar 02 2018
Wherever the meta.yaml describes url to be.
We should track the entire meta.yaml. There are just too many interesting introspections that need more than what we are currently tracking.
@CJ-Wright commented on Mon Feb 19 2018
I'm not certain if/how possible this is, but it might be nice to capture the conda build version being used and the binary compatibility information.
@msarahan commented on Mon Feb 19 2018
conda-build version used is easy. It's encoded in a package's about.json, so you don't need to record it at build time.
Binary compatibility is a lot harder. It's different on every platform, and I think you'd need to start up some kind of database to match up symbols provided with symbols used. Conda-build 3's run_exports is a decent approximation, but not truly tracking binary compatibility directly.
@CJ-Wright commented on Wed Feb 28 2018
It would be good to have a way to bump dependencies along with the versions.
See: https://github.com/conda/conda-build/blob/master/conda_build/skeletons/pypi.py#L869
@isuruf
@sodre
@CJ-Wright commented on Wed Feb 21 2018
It would be nice to have the webservices write to the graph as it would eliminate some of the workers.
Track the whole meta.yaml info
This would eliminate scripts 00 and 01.
From @isuruf
@CJ-Wright commented on Sun Feb 25 2018
This would most accurately prevent us from bumping things that have already been bumped.
@justcalamari commented on Fri Mar 02 2018
We can run into problems when multiple jobs update the graph at the same time. We do not pull before pushing with doctr, so if the repo has been updated the push will fail. If we do pull, we can also have merge conflicts when the graph is updated by multiple people/bots, so we need a way to either prevent such merge conflicts or resolve them correctly.
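One way to resolve such conflicts deterministically would be a merge that prefers the newer node record. The sketch below assumes each node carries an "updated" timestamp field, which is an assumption, not the current graph schema.

```python
def merge_nodes(ours, theirs):
    """Merge two graph snapshots; newer 'updated' stamp wins per node."""
    merged = dict(ours)
    for name, attrs in theirs.items():
        if (name not in merged
                or attrs.get("updated", 0) > merged[name].get("updated", 0)):
            merged[name] = attrs
    return merged
```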
@ocefpaf commented on Thu Mar 01 2018
See conda-forge/dropbox-feedstock#4 (comment)
IMO we should just drop the jinja variable for the build number. It is silly to have a variable that's used in one place. If that action is taken there is nothing to do in cf-graph. If not, we need to fix the bot.
@dougalsutherland commented on Thu Mar 01 2018
The reason I've often used it is just that it makes it harder to forget to reset the build number to 0 when you're bumping the version yourself, if the build number is specified right next to the version instead of potentially 15 lines away. It makes just as much sense as an sha256 variable that's also only used once (as you noted there)...
If both sha256 and build variables are going away in general to make life easier for bots, that's fine, in that the build will definitely break if you forget to update the checksum. :) (Alternatively, maybe the linter could check that the build number is 0 for package versions that don't have a build yet?)
@ocefpaf commented on Thu Mar 01 2018
The reason I've often used it is just because it makes it harder to forget to reset the build number to 0 when you're bumping the version yourself if the build number is specified right next to the version, instead of potentially 15 lines away. It makes just as much sense as an sha256 variable that's also only used once (as you noted there)....
I disagree, but I am more used to the conda recipe format than most people. Anyway, I don't oppose using the jinja, but I would love to get rid of the excess, like the file extension for example.
If both sha256 and build variables are going away in general to make life easier for bots, that's fine in that the build will definitely break if you forget to update the checksum. :)
Yeah, in light of the age of the bots all this may change.
@CJ-Wright commented on Thu Mar 01 2018
I think we can implement it either way. I like the idea of using the bot to PR in basic maintenance to recipes (removing excess variables, fixing jinja variables; there was at least one instance of {%set... rather than {% set...).
At the end of the day though the bots serve the humans, so unless a change/feature is especially painful for the bot, I'd rather have things be human friendly than tailored to the bot.
@CJ-Wright commented on Fri Mar 02 2018
Attn: @jakirkham
@CJ-Wright commented on Fri Mar 02 2018
@justcalamari would you mind taking a look at this?
@CJ-Wright commented on Thu Mar 01 2018
It seems that some pre-releases are getting through.
I think we've blacklisted rc in new version numbers. Maybe we need to add dev. It might be nice if we had a format for pre-release tags.
@CJ-Wright commented on Thu Mar 01 2018
Attn: @ocefpaf
@ocefpaf commented on Thu Mar 01 2018
If you want to filter them out you can use PEP 440 to get a list of valid versions.
@CJ-Wright commented on Thu Mar 01 2018
See:
@jakirkham commented on Fri Mar 02 2018
PEP 440 will work with Python packages. However, not all packages are Python (or follow Python versioning rules). Might be good to reach out to conda and conda-build devs about what is a good indicator of an rc, dev, ... version.
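In the meantime, the blacklist could be widened with a heuristic like the one below: flag versions carrying common pre-release markers. This is a rough filter, not a PEP 440 parser, and will misfire on names that merely end in these letters, so treat it as a sketch.

```python
import re

# Flags rc/dev/alpha/beta and bare a/b suffixes, e.g. 1.0rc1, 2.1.dev0.
PRERELEASE_RE = re.compile(r'(rc|dev|alpha|beta|a|b)[._-]?\d*$', re.IGNORECASE)

def is_prerelease(version):
    return PRERELEASE_RE.search(version) is not None
```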