Comments (9)
if I understood it correctly where as bors-ng is stateless and homu stateful
Unfortunately, no. There are three borses, not two.
- bors — https://github.com/graydon/bors — is stateless
- homu — https://github.com/rust-lang/homu — is stateful. It exists because of GitHub's rate limits.
- bors-ng — https://github.com/bors-ng/bors-ng — is also stateful. It exists because self-hosting homu is too complicated. Since literally anybody can sign up their repository at any time, bors-ng also has to worry about rate limits.
from homu.
Related Zulip:
https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/Bors.20races.20.2F.20out.20of.20sync/near/288119101
So, if I understood it correctly, homu is the result of GitHub REST API rate limits: whereas bors-ng is stateless, homu is stateful and relies entirely on incoming webhooks for state synchronisation, which seems to have failed a few times or more.
I am unsure whether GitHub retries incoming webhook calls and whether homu honors such retries, or whether this is something that could be configured on the GH side or that homu could be made to support. That could be an easy, low-hanging fix to increase stability, since webhooks are supposed to report a status back until they are delivered.
EDIT: Yes, there is a new API for exactly that ^^^
1 - Figure out the impact first
I am going to see if I can use one or more GraphQL queries to scrape all the PR r+ comments and commits, and correlate them to figure out how many times, and exactly where, homu has picked up the wrong commit compared with the latest commit of a given PR.
That analysis will let us see which and how many PRs are affected, and then judge how much of an issue this is.
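A minimal sketch of what such a query could look like, written as a Python helper that only builds the request body. The field names (`pullRequests`, `headRefOid`, `comments`, `author`, `body`) come from GitHub's public GraphQL schema; the helper name `build_payload`, the page sizes, and the choice of fields are assumptions for illustration, not what the analysis actually used.

```python
import json

# Assumed query shape: the most recent PRs with their head commit and comments.
PR_QUERY = """
query($owner: String!, $name: String!, $count: Int!) {
  repository(owner: $owner, name: $name) {
    pullRequests(last: $count) {
      nodes {
        number
        title
        headRefOid
        comments(last: 50) {
          nodes { author { login } body }
        }
      }
    }
  }
}
"""


def build_payload(owner, name, count=20):
    """Return the JSON body to POST to https://api.github.com/graphql.

    Hypothetical helper: it performs no network call and no authentication.
    """
    variables = {"owner": owner, "name": name, "count": count}
    return json.dumps({"query": PR_QUERY, "variables": variables})
```

The returned string would be sent with an `Authorization: bearer <token>` header; pagination is left out to keep the sketch short.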
2 - Potentially fix something
To then potentially fix this in homu, depending on the reliability and real impact found above, we could either:
- use GraphQL queries to keep the state synchronised without hurting rate limits, if the impact is high
- or, as low-hanging fruit if the impact is low, figure out why webhook calls are being lost and maybe put some more reliability TLC around those
Re: GitHub delivery tracking API
It seems GitHub has introduced an API for exactly what we might need re 2.2):
https://github.blog/changelog/2021-06-30-webhook-deliveries-api/
https://docs.github.com/en/rest/reference/repos#webhooks
Can someone check what the situation looks like for the last 30 days?
It may also be that a simple "healthcheck" polling monitor in homu, which triggers retries via this new API, would greatly improve webhook deliverability if required.
And if the "healthcheck" monitor simply reports to homu that there are undelivered webhooks, homu could hold off before acting on things.
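The "healthcheck" idea could be sketched roughly as below. This is not homu code: the function names are hypothetical, and the sketch works on a list of delivery records as returned by GitHub's `GET /repos/{owner}/{repo}/hooks/{hook_id}/deliveries` endpoint rather than calling the API itself. It assumes each record carries `id`, `guid`, and `status_code`, and that a redelivery shares the `guid` of the original attempt.

```python
def undelivered_guids(deliveries):
    """Return GUIDs of events for which no delivery attempt got a 2xx response.

    `deliveries` is a list of dicts shaped like GitHub's delivery records.
    Order of first appearance is preserved.
    """
    delivered = set()
    seen = []
    for delivery in deliveries:
        guid = delivery["guid"]
        if guid not in seen:
            seen.append(guid)
        status = delivery.get("status_code") or 0
        if 200 <= status < 300:
            delivered.add(guid)
    return [guid for guid in seen if guid not in delivered]


def redelivery_path(owner, repo, hook_id, delivery_id):
    """Path to POST to in order to ask GitHub to redeliver one delivery."""
    return f"/repos/{owner}/{repo}/hooks/{hook_id}/deliveries/{delivery_id}/attempts"


def safe_to_act(deliveries):
    """Hypothetical gate: hold off acting while any event is undelivered."""
    return not undelivered_guids(deliveries)
```

A monitor would poll the deliveries list periodically, request redelivery for each failed record, and keep the queue paused until `safe_to_act` returns true.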
I think rather than focusing on approval time, we should probably have a check at (bors) merge commit generation time that the approved source commit is equivalent to the PR head commit at that time. That also helps us avoid issues with further pushes being missed.
I don't think focusing on missed webhooks is going to work well; GitHub has just not delivered things in the past for some time, we shouldn't depend on that for reliability here.
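As a rough illustration of the proposed check, assuming "equivalent" is taken to mean an exact SHA match and that the current PR head SHA has just been fetched from GitHub: the function name and the message wording below are hypothetical, not homu's.

```python
def check_head_before_merge(approved_sha, current_head_sha):
    """Return None if the merge may proceed, else an error message to post.

    Assumption: the approved commit must be exactly the current PR head.
    """
    if approved_sha == current_head_sha:
        return None
    return (
        f":warning: Approved commit {approved_sha} is not the current "
        f"PR head {current_head_sha}; please re-approve."
    )
```

Running the check at merge commit generation time means one extra lookup per merge attempt, and it catches missed pushes regardless of why the internal state drifted.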
That would require querying GH once per merge commit generation, but I guess there aren't so many of those that it would break the rate limit, which was the reason for homu to exist.
Also, that would only provide a failure and feedback mechanism; it would not really fix the sync issue that happens at approval time.
Are we happy with that?
Or do we mean that homu will pick the nearest commit before approval, regardless of which commit was supposedly approved (out of sync), without providing feedback about the internal sync failure, i.e. just deal with it?
We would post an error message in that case. There's no real issue with rate limits -- we're only generating new merge commits probably roughly <15 times/day, so checking at that time is very cheap.
I think there's more work that can be done to improve homu's synchronization with real state, but that's a separate issue from this one and doesn't need to be coupled.
I'll also try to finish the impact analysis during the weekend out of curiosity.
It just needs a regular expression over the bors messages and a comparison against the commit history.
I just got a quick GraphQL query running against the GH API to analyse what has been going on:
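A minimal sketch of that comparison, assuming the machine-readable `<!-- homu: {...} -->` comments that bors appends to its messages and a commit list ordered oldest to newest (possibly with abbreviated SHAs). The function names are hypothetical and this is not the code from the linked repository.

```python
import json
import re

# Matches the JSON payload inside a single-line homu HTML comment.
HOMU_COMMENT = re.compile(r"<!-- homu: (\{.*?\}) -->")


def approved_shas(comment_bodies):
    """Return the SHAs from all "Approved" homu comments, in order."""
    shas = []
    for body in comment_bodies:
        for match in HOMU_COMMENT.finditer(body):
            data = json.loads(match.group(1))
            if data.get("type") == "Approved":
                shas.append(data["sha"])
    return shas


def mismatched_approval(comment_bodies, commit_shas):
    """Return the last approved SHA if it is not the last commit, else None.

    Assumption: `commit_shas` ends with the PR head; short SHAs are
    compared by prefix.
    """
    approved = approved_shas(comment_bodies)
    if not approved or not commit_shas:
        return None
    last_approved, head = approved[-1], commit_shas[-1]
    if last_approved.startswith(head) or head.startswith(last_approved):
        return None
    return last_approved
```

Only the last approval is compared, so a PR that was approved, pushed to, and approved again at its new head is not reported.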
https://github.com/pinkforest/rust-gh-homu-off-syncs
---- PR(true): Add a `--build-dir` flag to rustbuild
79f8dc0 - commit
bors - :pushpin: Commit 79f8dc0b898b0a387df684a539cd97446a0f964f has been approved by `jyn514`
<!-- @bors r=jyn514 79f8dc0b898b0a387df684a539cd97446a0f964f -->
<!-- homu: {"type":"Approved","sha":"79f8dc0b898b0a387df684a539cd97446a0f964f","approver":"jyn514"} -->
---- PR(true): Add macro_rules! rustdoc change to 1.62 relnotes
4ea18cc - commit
bors - :v: @ can now approve this pull request
<!-- homu: {"type":"Delegated","delegator":"jyn514","delegate":""} -->
bors - :v: @CAD97 can now approve this pull request
<!-- homu: {"type":"Delegated","delegator":"jyn514","delegate":"CAD97"} -->
bors - :pushpin: Commit ab437ad5e46db30f1dc08d21cd73f9ef9ffa13b5 has been approved by `Mark-Simulacrum`
<!-- @bors r=Mark-Simulacrum ab437ad5e46db30f1dc08d21cd73f9ef9ffa13b5 -->
<!-- homu: {"type":"Approved","sha":"ab437ad5e46db30f1dc08d21cd73f9ef9ffa13b5","approver":"Mark-Simulacrum"} -->
bors - :pushpin: Commit 4ea18ccf7e4e4afe213c2b3a74c558135c423fde has been approved by `jyn514`
<!-- @bors r=jyn514 4ea18ccf7e4e4afe213c2b3a74c558135c423fde -->
<!-- homu: {"type":"Approved","sha":"4ea18ccf7e4e4afe213c2b3a74c558135c423fde","approver":"jyn514"} -->
---- PR(true): Rollup of 9 pull requests
1d845bd - commit
c6f362a - commit
ddb6313 - commit
debee1e - commit
8931fbd - commit
e043821 - commit
c29e584 - commit
cd7bd8b - commit
3fcf84a - commit
d791310 - commit
79f8dc0 - commit
4ea18cc - commit
0d5636c - commit
41e7991 - commit
0420231 - commit
e59693a - commit
734f21c - commit
80dd48b - commit
c4acd06 - commit
335e7d3 - commit
18d4228 - commit
bors - :pushpin: Commit 18d4228456a98fd6d8950f74fd117aba7fb45757 has been approved by `matthiaskrgr`
<!-- @bors r=matthiaskrgr 18d4228456a98fd6d8950f74fd117aba7fb45757 -->
<!-- homu: {"type":"Approved","sha":"18d4228456a98fd6d8950f74fd117aba7fb45757","approver":"matthiaskrgr"} -->
bors - :hourglass: Testing commit 18d4228456a98fd6d8950f74fd117aba7fb45757 with merge 7e2733bb1dd9afe5fd20370ca4d539d42ac50419...
<!-- homu: {"type":"BuildStarted","head_sha":"18d4228456a98fd6d8950f74fd117aba7fb45757","merge_sha":"7e2733bb1dd9afe5fd20370ca4d539d42ac50419"} -->
bors - :sunny: Test successful - [checks-actions](https://github.com/rust-lang-ci/rust/runs/7148849106?check_suite_focus=true)
Approved by: matthiaskrgr
Pushing 7e2733bb1dd9afe5fd20370ca4d539d42ac50419 to master...
<!-- homu: {"type":"BuildCompleted","approved_by":"matthiaskrgr","base_ref":"master","builders":{"checks-actions":"https://github.com/rust-lang-ci/rust/runs/7148849106?check_suite_focus=true"},"merge_sha":"7e2733bb1dd9afe5fd20370ca4d539d42ac50419"} -->
After that I can push a PR for homu to add that sync check at merge commit generation time.
Raised #178 for the suspected general webhook delivery instability, which addressing the commit mix-ups alone (this ticket) would not fix.