GithubHelp home page GithubHelp logo

Sometimes approves the wrong commit about homu HOT 9 OPEN

ehuss avatar ehuss commented on August 24, 2024 3
Sometimes approves the wrong commit

from homu.

Comments (9)

notriddle avatar notriddle commented on August 24, 2024 1

if I understood it correctly where as bors-ng is stateless and homu stateful

Unfortunately, no. There are three borses, not two.

from homu.

pinkforest avatar pinkforest commented on August 24, 2024

Related Zulip:
https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/Bors.20races.20.2F.20out.20of.20sync/near/288119101

So homu is the result of GitHub REST API rate limitation if I understood it correctly where as bors-ng is stateless and homu stateful relying entirely on the ingress webhooks for state synchronisation - which seems to have failed few or more times.

I am unsure if there is some type of retry mechanism in the ingress GH webhook calls and whether homu honors retries or whether this is something that could be configured on GH side or make Homu support - which could provide easy low hanging fruit fix to increase stability - webhooks are supposed to get status back until they are delivered
EDIT: Yes there is new API for exactly that ^^^ 🥳

1 - Figure out the impact first

I am going to see if I can do a GraphQL query/ies to scrape off all the PR r+ comments/commits and correlate to figure out how many times - and exactly where - homu has picked up the wrong commit vs what the latest commit was given PR

That analysis would / will allow us to see which and how many PRs are affected and then see how much of an issue this is.

2 - Potentially fix something

Then to potentially fix this in Homu it would mean either - depending on reliability / real impact factors as above

  1. Using GraphQL queries to ensure the state is synchronsised that doesn't hurt rate limits - if impact was high
  • or alternatively low hanging fruit if impact was low -
  1. Figure out why webhook calls are being lost and maybe put some more reliability tlc around those

from homu.

pinkforest avatar pinkforest commented on August 24, 2024

Re: GitHub delivery tracking API

Seems GitHub has introduced API exactly for what we might need re 2.2) ..

https://github.blog/changelog/2021-06-30-webhook-deliveries-api/

https://docs.github.com/en/rest/reference/repos#webhooks

Can someone go see what the situation looks for 30 last days?

Also it may as well be that simple "healthcheck" polling monitor that can trigger the retries via this new API in homu might greatly improve webhook deliverability if required ?

And if the "healthcheck" monitor simply reports to Homu that there are undelivered webhooks it would hang on before acting on things..

from homu.

Mark-Simulacrum avatar Mark-Simulacrum commented on August 24, 2024

I think rather than focusing on approval time, we should probably have a check at (bors) merge commit generation time that the approved source commit is equivalent to the PR head commit at that time. That also helps us avoid issues with further pushes being missed.

I don't think focusing on missed webhooks is going to work well; GitHub has just not delivered things in the past for some time, we shouldn't depend on that for reliability here.

from homu.

pinkforest avatar pinkforest commented on August 24, 2024

That would require querying GH one per commit generation for those but I guess there isn't too many of those that would break the rate limit as was the reason for Homu to exist
Also that would be just to provide a failure and feedback mechanism but not really fixing the sync issue that happens at approval time
Are we happy with that?
Or do we mean that homu will pick nearest-before-approval commit regardless what was supposed approved commit (out of sync) without providing feedback about some internal sync failure e.g. just deal with it?

from homu.

Mark-Simulacrum avatar Mark-Simulacrum commented on August 24, 2024

We would post an error message in that case. There's no real issue with rate limits -- we're only generating new merge commits probably roughly <15 times/day, so checking at that time is very cheap.

I think there's more work that can be done to improve homu's synchronization with real state, but that's a separate issue from this one and doesn't need to be coupled.

from homu.

pinkforest avatar pinkforest commented on August 24, 2024

I'll also try to finish the impact analysis during the weekend off curiosity -

just needs some regular expression on bors messages and comparing them to commit history

I just got some quick GraphQL running on GH API for doing analysis what has been going on:
https://github.com/pinkforest/rust-gh-homu-off-syncs

---- PR(true): Add a `--build-dir` flag to rustbuild
 79f8dc0 - commit
bors - :pushpin: Commit 79f8dc0b898b0a387df684a539cd97446a0f964f has been approved by `jyn514`

<!-- @bors r=jyn514 79f8dc0b898b0a387df684a539cd97446a0f964f -->
<!-- homu: {"type":"Approved","sha":"79f8dc0b898b0a387df684a539cd97446a0f964f","approver":"jyn514"} -->
---- PR(true): Add macro_rules! rustdoc change to 1.62 relnotes
 4ea18cc - commit
bors - :v: @ can now approve this pull request
<!-- homu: {"type":"Delegated","delegator":"jyn514","delegate":""} -->
bors - :v: @CAD97 can now approve this pull request
<!-- homu: {"type":"Delegated","delegator":"jyn514","delegate":"CAD97"} -->
bors - :pushpin: Commit ab437ad5e46db30f1dc08d21cd73f9ef9ffa13b5 has been approved by `Mark-Simulacrum`

<!-- @bors r=Mark-Simulacrum ab437ad5e46db30f1dc08d21cd73f9ef9ffa13b5 -->
<!-- homu: {"type":"Approved","sha":"ab437ad5e46db30f1dc08d21cd73f9ef9ffa13b5","approver":"Mark-Simulacrum"} -->
bors - :pushpin: Commit 4ea18ccf7e4e4afe213c2b3a74c558135c423fde has been approved by `jyn514`

<!-- @bors r=jyn514 4ea18ccf7e4e4afe213c2b3a74c558135c423fde -->
<!-- homu: {"type":"Approved","sha":"4ea18ccf7e4e4afe213c2b3a74c558135c423fde","approver":"jyn514"} -->
---- PR(true): Rollup of 9 pull requests
 1d845bd - commit
 c6f362a - commit
 ddb6313 - commit
 debee1e - commit
 8931fbd - commit
 e043821 - commit
 c29e584 - commit
 cd7bd8b - commit
 3fcf84a - commit
 d791310 - commit
 79f8dc0 - commit
 4ea18cc - commit
 0d5636c - commit
 41e7991 - commit
 0420231 - commit
 e59693a - commit
 734f21c - commit
 80dd48b - commit
 c4acd06 - commit
 335e7d3 - commit
 18d4228 - commit
bors - :pushpin: Commit 18d4228456a98fd6d8950f74fd117aba7fb45757 has been approved by `matthiaskrgr`

<!-- @bors r=matthiaskrgr 18d4228456a98fd6d8950f74fd117aba7fb45757 -->
<!-- homu: {"type":"Approved","sha":"18d4228456a98fd6d8950f74fd117aba7fb45757","approver":"matthiaskrgr"} -->
bors - :hourglass: Testing commit 18d4228456a98fd6d8950f74fd117aba7fb45757 with merge 7e2733bb1dd9afe5fd20370ca4d539d42ac50419...
<!-- homu: {"type":"BuildStarted","head_sha":"18d4228456a98fd6d8950f74fd117aba7fb45757","merge_sha":"7e2733bb1dd9afe5fd20370ca4d539d42ac50419"} -->
bors - :sunny: Test successful - [checks-actions](https://github.com/rust-lang-ci/rust/runs/7148849106?check_suite_focus=true)
Approved by: matthiaskrgr
Pushing 7e2733bb1dd9afe5fd20370ca4d539d42ac50419 to master...
<!-- homu: {"type":"BuildCompleted","approved_by":"matthiaskrgr","base_ref":"master","builders":{"checks-actions":"https://github.com/rust-lang-ci/rust/runs/7148849106?check_suite_focus=true"},"merge_sha":"7e2733bb1dd9afe5fd20370ca4d539d42ac50419"} -->

After that I can push a PR for Homu to do that merge commit commit sync check

from homu.

pinkforest avatar pinkforest commented on August 24, 2024

Raised #178 for suspected general WebHook delivery instability which addressing commit mixups alone would not fix (this ticket)

from homu.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.