GithubHelp home page GithubHelp logo

Comments (5)

saurabhnanda avatar saurabhnanda commented on September 22, 2024 1

This is a bug. If you plan to raise a PR for this, it might be a good idea to first start with a test-case that fails.

from odd-jobs.

saurabhnanda avatar saurabhnanda commented on September 22, 2024

Because job runners only check the number of attempts on a job after the job fails i.e. throws an exception, long-running jobs which time out will be repeatedly picked up by job runners with an ever-increasing attempt count. My expectation was that jobs which have timed out and have too many attempts should be ignored by the polling logic.

I probably misread this earlier. I thought that you're trying to point out that job.attempts doesn't increase for timed-out jobs. But I just checked the code and that is not the case. job.attempts is increased in the case of any type of failure (including timeout).

Could you try explaining the problem that you're facing in the current job-polling implementation again, please?

PS: Also check if #73 is what you're looking for?

from odd-jobs.

ivanbakel avatar ivanbakel commented on September 22, 2024

Could you try explaining the problem that you're facing in the current job-polling implementation again, please?

When a job is picked up, either because it is queued or because it is locked and timed out, the attempt count is not checked, but is incremented. The attempt count check only happens when the job fails (throws an exception). If the job instead times out, it will just be picked up again - but the attempt count will not be checked against the configured max attempts to see if the job should run again.

This means that jobs that keep timing out will have their attempt count increment with each timeout, but the attempt count will never be checked to prevent them from running and mark them as failed.

from odd-jobs.

saurabhnanda avatar saurabhnanda commented on September 22, 2024

Ah - I understand now. And thanks for following this through. It's a subtle catch.

The problem is with job timeouts that are not caught by runJobWithTimeout (that function throws a TimeoutException which is cause and goes through the retryOrFail check). The problem is with jobs that remain in a locked state and are forcibly picked up the jobPollingSql.

Are you seeing a lot of these kinds of timeouts? Is runJobWithTimeout not handling this for you?

from odd-jobs.

ivanbakel avatar ivanbakel commented on September 22, 2024

Are you seeing a lot of these kinds of timeouts? Is runJobWithTimeout not handling this for you?

It seems that this was partly due to a problem with some dependency that we were using, since recent changes to the code have meant that it no longer occurs.

I believe the problem was that some code in the job logic itself was masking async exceptions, so runJobWithTimeout would throw the timeout exception to the job thread, and it would then block forever.

That in itself doesn't seem like a bug that Odd Jobs can do anything about - but I do think the polling logic should be checking the attempts counter before trying to queue up jobs that are locked and timed out.

from odd-jobs.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.