soostone / retry
Retry combinators for monadic actions that may fail
License: BSD 3-Clause "New" or "Revised" License
Likewise, another small doc tweak would be to mention that iteration numbers start at 0, not 1. (I had to look at the source to figure that out.)
As mentioned in the comment, I needed to retry the initialization portion of a function following the bracket pattern. So, once initialization is successful, I need to reset the counter. Since there's no way for the user of recovering to modify the counter value, I'm using this:
{-# LANGUAGE ViewPatterns #-}

import Control.Concurrent
import Control.Monad.Catch
import Control.Monad.IO.Class
import Control.Retry
import Data.Function (fix)
import Data.IORef
import Prelude hiding (catch)

-- | Run an action and recover from a raised exception by potentially
-- retrying the action a number of times. This behaves the same as
-- 'recovering', except it also provides the action the ability to
-- reset the retry counter. This is useful when recovering from
-- exceptions that occur during the initialization of with-* style
-- functions which follow the bracket pattern.
recoveringWithReset
    :: (MonadIO m, MonadMask m)
    => RetryPolicy
    -- ^ Just use 'def' for default settings
    -> [(Int -> Handler m Bool)]
    -- ^ Should a given exception be retried? Action will be
    -- retried if this returns True.
    -> (IO () -> m a)
    -- ^ Action to perform. The @IO ()@ action resets the retry
    -- counter.
    -> m a
recoveringWithReset (RetryPolicy policy) hs f = mask $ \restore -> do
    counter <- liftIO $ newIORef 0
    fix $ \loop -> do
        r <- try $ restore (f (writeIORef counter 0))
        case r of
          Right x -> return x
          Left e -> do
            n <- liftIO $ readIORef counter
            let recover [] = throwM e
                recover ((($ n) -> Handler h) : hs')
                  | Just e' <- fromException e = do
                      chk <- h e'
                      if chk
                        then case policy n of
                          Just delay -> do
                            liftIO $ threadDelay delay
                            liftIO $ writeIORef counter $! n + 1
                            loop
                          Nothing -> throwM e'
                        else throwM e'
                  | otherwise = recover hs'
            recover hs
One thing to point out about this implementation is that the reset action must be called during the execution of f. Results are undefined if called concurrently.
I have no opinion about the API / naming here. Another possibility would be to provide direct access to the counter.
Hi,
I'm not raising this as a pull request because I'm unsure you would be ok with the changes I made:
recoveringWatchdog function that makes a distinction between an application that fails very quickly each time (and thus the retry count applies) and an application that only fails occasionally. In the latter case, I want to retry it but reset the retry counter.
Number 2), unfortunately, adds a lot of dependencies (which you might not be comfortable with).
https://github.com/jcristovao/retry/commits/master
If there is interest, I might split this into (mergeable) tests applicable to the retry lib (as it is), and the recoveringWatchdog I would split into a different lib.
Please let me know what you prefer.
Cheers
I have a use case (asynchronous job runner) where it would be useful to be able to specify the initial RetryStatus in a retrying/recovering operation. Here is a demo commit adding just resumable recovering variants: jship@75ae34e
For more context on my use case, a job consists of some number of tasks, and the result of executing each task is persisted to the database. If the process dies and a task didn't complete, the job runner will pick up on that task again on process restart. The execution of each task is wrapped up in a recovering block. The tasks in a job may potentially be long-running, so it would be useful if Control.Retry exposed a means for "resuming" a retrying operation. In a simple case, if I set up a RetryPolicy on a long-running task via limitRetries 1, and the task threw an exception after some amount of time, then began its retry run, and then the process died, on restart of the process the task would restart and so start progressing through its RetryPolicy from scratch.
I'm curious if there is interest in resumable retrying as a whole, and if so, am happy to PR support for it.
The two tests fail on Windows 7 x64. They work just fine on Ubuntu 14.04. Here's the output of cabal test:
Running 1 test suites...
Test suite test: RUNNING...
QuadraticDelayRetry
quadratic delay
- recovering test with quadratic retry delay FAILED [1]
Retry
retry
- recovering test without quadratic retry delay FAILED [2]
1) QuadraticDelayRetry, quadratic delay, recovering test with quadratic retry delay
Assertion failed (after 3 tests):
2
1
2) Retry.retry recovering test without quadratic retry delay
Assertion failed (after 1 test):
11
Positive {getPositive = Small {getSmall = 2}}
Randomized with seed 1774098379
Finished in 0.0100 seconds
2 examples, 2 failures
Test suite test: FAIL
Test suite logged to: dist\test\retry-0.5-test.log
0 of 1 test suites (0 of 1 test cases) passed.
Any ideas? Thanks.
Hi,
Sometimes it may be useful to log a retry. I hadn't seen the changes you made to the library in the meantime, and I came to suggest adding an optional IO hook to RetrySettings, for optionally doing something (log, for example) each retry.
The new monoid retry settings look great, but now I am unsure where this feature could be added... what do you think?
Edit: to be clear, I'm just talking about the retrying function, since the exception handler on recovering can be used for that purpose.
Thanks
My app consumes an API which, for certain failure responses, will populate the Retry-After header of the response with the number of seconds to wait before performing the next request.
As far as I can see, the interface to this library doesn't support adjusting the delay time based on information in the return value of (or the exception thrown by) the action that's retried. It seems that all the functions that apply a RetryPolicy statically deduce the waiting time based on information inside RetryStatus.
Would it be within the scope of this library to have e.g. Control.Retry.retrying support dynamically adjusting the delay time based on information in the return value of the action to run (the b type variable)?
Seems to me like the way to support this feature would be to change the Bool return type, which is used to indicate whether a request should be retried, to something along the lines of:
data RetryAction
  = NoRetry                  -- ^ Don't retry
  | Retry                    -- ^ Retry with a delay calculated by the retry policy
  | RetryOverridingDelay Int -- ^ Retry, overriding the delay of the retry policy with the specified number of microseconds
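To make the proposal concrete, here is a minimal sketch of how such a type could drive a retry loop. It uses only base and does not touch the library's real types: Policy, nextDelay and retryingWithAction are hypothetical names introduced for illustration, and a policy is modelled as a plain function from iteration number to an optional delay in microseconds. One assumption worth flagging: the override replaces only the delay, so the policy still decides when to give up.

```haskell
import Control.Concurrent (threadDelay)

-- | The decision type from the proposal above.
data RetryAction
  = NoRetry                  -- ^ Don't retry
  | Retry                    -- ^ Retry with the policy's delay
  | RetryOverridingDelay Int -- ^ Retry with this delay (microseconds)
  deriving (Eq, Show)

-- | Hypothetical stand-in for a retry policy: iteration number
-- (starting at 0) to delay in microseconds, or Nothing to give up.
type Policy = Int -> Maybe Int

-- | Pure core: the delay before the next attempt, if there is one.
-- The override only replaces the delay; the policy can still stop.
nextDelay :: Policy -> Int -> RetryAction -> Maybe Int
nextDelay _      _ NoRetry                  = Nothing
nextDelay policy n Retry                    = policy n
nextDelay policy n (RetryOverridingDelay d) = d <$ policy n

-- | Run the action, ask the check function what to do with the
-- result, then sleep and loop, or return the last result.
retryingWithAction
  :: Policy
  -> (Int -> b -> IO RetryAction)
  -> IO b
  -> IO b
retryingWithAction policy chk action = go 0
  where
    go n = do
      res <- action
      decision <- chk n res
      case nextDelay policy n decision of
        Nothing -> return res
        Just d  -> threadDelay d >> go (n + 1)
```

For the Retry-After case, the check function would read the header from the response and return RetryOverridingDelay (seconds * 1000000), falling back to Retry when the header is absent.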
Think it's to do with the forall fussiness, as it type errors upon:
fmap (something) genDuration
in test/Tests/Control/Retry.hs.
Workaround Nix derivation override:
{
  retry = (self.callHackage "retry" "0.8.1.2" {}).overrideDerivation (self: {
    prePatch = ''
      sed -i 's/fmap \(.*\) genDuration/fmap (\\x -> \1 x) genDuration/g' test/Tests/Control/Retry.hs
    '';
  });
}
helps by eta-expanding as recommended.
Currently not allowed by the cabal file.
The changelog link on http://hackage.haskell.org/package/retry points to http://hackage.haskell.org/package/changelog.md, which of course gives a 404.
I am trying to build retry, and I ended up with a test failure
policy transformers
always produces positive delay with positive constants (no rollover): FAIL (0.69s)
✗ always produces positive delay with positive constants (no rollover) failed at test/Tests/Control/Retry.hs:221:11
after 2 tests and 57 shrinks.
┏━━ test/Tests/Control/Retry.hs ━━━
208 ┃ policyTransformersTests :: TestTree
209 ┃ policyTransformersTests = testGroup "policy transformers"
210 ┃ [ testProperty "always produces positive delay with positive constants (no rollover)" $ property $ do
211 ┃ delay <- forAll (Gen.int (Range.linear 0 maxBound))
┃ │ 1
212 ┃ let res = runIdentity (simulatePolicy 1000 (exponentialBackoff delay))
213 ┃ delays = catMaybes (snd <$> res)
214 ┃ mnDelay = if null delays
215 ┃ then Nothing
216 ┃ else Just (minimum delays)
217 ┃ case mnDelay of
218 ┃ Nothing -> return ()
219 ┃ Just n -> do
220 ┃ footnote (show n ++ " is not >= 0")
221 ┃ HH.assert (n >= 0)
┃ ^^^^^^^^^^^^^^^^^^
222 ┃ , testProperty "positive, nonzero exponential backoff is always incrementing" $ property $ do
223 ┃ delay <- forAll (Gen.int (Range.linear 1 maxBound))
224 ┃ let res = runIdentity (simulatePolicy 1000 (limitRetriesByDelay maxBound (exponentialBackoff delay)))
225 ┃ delays = catMaybes (snd <$> res)
226 ┃ sort delays === delays
227 ┃ length (group delays) === length delays
228 ┃ ]
-9223372036854775808 is not >= 0
This failure can be reproduced by running:
> recheck (Size 1) (Seed 17647478776705613149 5015347513854265309) always produces positive delay with positive constants (no rollover)
Use '--hedgehog-replay "Size 1 Seed 17647478776705613149 5015347513854265309"' to reproduce.
Use -p '/always produces positive delay with positive constants (no rollover)/' to rerun this test only.
positive, nonzero exponential backoff is always incrementing: FAIL (0.41s)
✗ positive, nonzero exponential backoff is always incrementing failed at test/Tests/Control/Retry.hs:226:18
after 1 test.
┏━━ test/Tests/Control/Retry.hs ━━━
208 ┃ policyTransformersTests :: TestTree
209 ┃ policyTransformersTests = testGroup "policy transformers"
210 ┃ [ testProperty "always produces positive delay with positive constants (no rollover)" $ property $ do
211 ┃ delay <- forAll (Gen.int (Range.linear 0 maxBound))
212 ┃ let res = runIdentity (simulatePolicy 1000 (exponentialBackoff delay))
213 ┃ delays = catMaybes (snd <$> res)
214 ┃ mnDelay = if null delays
215 ┃ then Nothing
216 ┃ else Just (minimum delays)
217 ┃ case mnDelay of
218 ┃ Nothing -> return ()
219 ┃ Just n -> do
220 ┃ footnote (show n ++ " is not >= 0")
221 ┃ HH.assert (n >= 0)
222 ┃ , testProperty "positive, nonzero exponential backoff is always incrementing" $ property $ do
223 ┃ delay <- forAll (Gen.int (Range.linear 1 maxBound))
┃ │ 1
224 ┃ let res = runIdentity (simulatePolicy 1000 (limitRetriesByDelay maxBound (exponentialBackoff delay)))
225 ┃ delays = catMaybes (snd <$> res)
226 ┃ sort delays === delays
┃ ^^^^^^^^^^^^^^^^^^^^^^
... with a very big array
... followed by
This failure can be reproduced by running:
> recheck (Size 2) (Seed 10571801305957791816 9377426414140356069) never exceeds the given cumulative delay
Use '--hedgehog-replay "Size 2 Seed 10571801305957791816 9377426414140356069"' to reproduce.
Use -p '/never exceeds the given cumulative delay/' to rerun this test only.
We might want to relax the upper dependency bounds, if possible:
2.26s$ stack $ARGS test --no-terminal --haddock --no-haddock-deps
Selected resolver: nightly-2017-08-30
Error: While constructing the build plan, the following exceptions were encountered:
In the dependencies for retry-0.7.4.2:
HUnit-1.6.0.0 must match >=1.2.5.2 && <1.6 (latest applicable is 1.5.0.0)
QuickCheck-2.10.0.1 must match >=2.7 && <2.10 (latest applicable is 2.9.2)
needed since retry-0.7.4.2 is a build target.
Building library for retry-0.9.2.0..
[1 of 1] Compiling Control.Retry ( src/Control/Retry.hs, /home/simon/trusteeing/retry/retry-0.9.2.0/.dist-newstyle-trustee/4059df488dc2e13f6af48218de17e58723a471c63dfeda57bd604431ae257faf7b940fe494eac6779ef0aeb347e20df89d5c997ae419dac1eb2051003a473a0b/build/x86_64-linux/ghc-8.6.5/retry-0.9.2.0/noopt/build/Control/Retry.o )
src/Control/Retry.hs:853:12: error:
• Variable not in scope:
lift :: m (Maybe Int) -> StateT RetryStatus m (Maybe Int)
• Perhaps you meant one of these:
‘liftA’ (imported from Control.Applicative),
‘liftM’ (imported from Control.Monad)
|
853 | delay <- lift (f stat)
| ^^^^
As a Hackage trustee I have revised v0.9.2.0: https://hackage.haskell.org/package/retry-0.9.2.0/revisions/
I was going to re-enable running this test-suite in stackage because the hspec bound has been relaxed but it seems QuickCheck-2.8 is not supported either.
Can this be relaxed?
This would be a very useful addition:
retryOnError :: (MonadIO m, MonadError e m)
             => (RetryPolicy m)
             -> (RetryStatus -> e -> m Bool)
             -> (RetryStatus -> m a)
             -> m a
There was an earlier PR, #24. Is there a reason for not adding this?
It would be nice to have this in addition to the Monoid instance. Not sure how this plays out with older GHC compatibility, though.
Already mentioned in #41, but thought I should make it a separate issue: Can we get a new Hackage release (or just an update to the existing package metadata) that loosens the bounds on ghc-prim, so that retry builds with GHC 8.0.1 and can be included in new Stackage nightly snapshots (which are now GHC 8.0.1-based)?
I haven't yet been able to profile this and track it down precisely. But based on circumstantial evidence it looks like retry was responsible for 100% CPU usage as discussed in this commit:
Which is part of this PR:
As seen in the commit, the retry policy was:
retrying (capDelay (500*1000) (exponentialBackoff 1000))
So it should have rapidly backed off to polling every 500ms. And the "try to grab lock" action running at 2hz should have been very cheap. But what I instead saw was that CPU usage while it was in this polling state would hang out at 2-3% for a few seconds, and then spike to 100+% CPU (all of one core), and stay there indefinitely.
This could be related to some kind of GHC runtime problem. In particular, Stack uses +RTS -N (commercialhaskell/stack#680), and something may be going wrong with the extra HECs.
But in any case, switching stack to a blocking operation definitely got rid of the problem.
It doesn't look like the comment on the retrying function matches its type signature.
Comment:
-- >>> import Data.Maybe
-- >>> let f = putStrLn "Running action" >> return Nothing
-- >>> retrying def isNothing f
-- Running action
-- Running action
-- Running action
-- Running action
-- Running action
-- Running action
-- Nothing
Type signature:
retrying :: MonadIO m
         => RetryPolicy
         -> (Int -> b -> m Bool)
         -- ^ An action to check whether the result should be retried.
         -- If True, we delay and retry the operation.
         -> m b
         -- ^ Action to run
         -> m b
Should that comment actually be:
-- >>> retrying def (const $ return . isNothing) f
There seems to be a weird issue with latest retry package. I don't know the source of the problem yet, but it seems that re-uploading latest master code as a new version release could resolve the problem.
I'd like to limit the retries so that overall the retries do not last more than X seconds, counting the delay and the IO operation.
I assume this is not possible right now. The question is: is this feature a valid addition to this library? I could contribute if wanted.
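One way to picture the requested behaviour is a loop that measures wall-clock time across both the action and the delays, and refuses to start another attempt once the budget would be exceeded. The sketch below uses only base (GHC.Clock for a monotonic clock); withinBudget and retryWithin are hypothetical names, not part of Control.Retry, and the delay schedule is modelled as a plain function from iteration number to microseconds. It does not interrupt an attempt that is already running; it only stops scheduling new ones.

```haskell
import Control.Concurrent (threadDelay)
import Data.Word (Word64)
import GHC.Clock (getMonotonicTimeNSec)

-- | Pure check: may we sleep for the next delay and still stay
-- within the overall budget? All values are in microseconds.
withinBudget :: Word64 -> Word64 -> Int -> Bool
withinBudget budgetMicros elapsedMicros delayMicros =
  elapsedMicros + fromIntegral (max 0 delayMicros) <= budgetMicros

-- | Retry an action while the result asks for it and the total
-- elapsed time (action time plus delays) stays within the budget.
retryWithin
  :: Word64        -- ^ Overall budget in microseconds
  -> (Int -> Int)  -- ^ Delay in microseconds for a given iteration
  -> (a -> Bool)   -- ^ Should this result be retried?
  -> IO a
  -> IO a
retryWithin budgetMicros delayFor shouldRetry action = do
  start <- getMonotonicTimeNSec
  let go n = do
        res <- action
        now <- getMonotonicTimeNSec
        let elapsed = (now - start) `div` 1000  -- ns to microseconds
            d       = delayFor n
        if shouldRetry res && withinBudget budgetMicros elapsed d
          then threadDelay d >> go (n + 1)
          else return res
  go (0 :: Int)
```

A call such as retryWithin (5 * 1000000) (const 200000) isNothing someAction would keep retrying every 200ms for at most roughly five seconds overall (someAction is a placeholder).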
When retrying a monadic action, it would be useful to have access to the exception that triggered the retry, for debugging/logging purposes.
Seeing some behavior that suggests the mask approach in recovering is not blocking "interruptible" actions, including perhaps threadDelay, therefore allowing a repeat async exception while the retry block is waiting to go un-caught and leak outside of the recovering block.
What to do here? It is probably not prudent to use uninterruptibleMask, as explained in the Control.Exception docs.
Right now I'm stuck with the API, I would like to have a function like this:
cyclicExponentialBackoff :: Int -> Int -> RetryPolicy
cyclicExponentialBackoff base cap = ...
That uses exponentialBackoff and, once it reaches the cap, starts again from rsIterNumber = 0. Is there some way I could take advantage of the already implemented exponentialBackoff? Any pointers would help, thanks.
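As a starting point, the cycling can be expressed as a pure function of the iteration number: work out how many doublings it takes to reach the cap, then feed the iteration number modulo that period into the usual base * 2^n formula. The sketch below is self-contained and does not reuse the library's exponentialBackoff; cyclicDelay is a hypothetical helper, it assumes base >= 1, and it computes in Integer so large caps cannot wrap around.

```haskell
-- | Delay in microseconds for iteration n (starting at 0) of a
-- backoff that doubles from @base@ up to @cap@ and then restarts.
-- Assumes base >= 1.
cyclicDelay :: Int -> Int -> Int -> Int
cyclicDelay base cap n =
  fromInteger (min cap' (base' * 2 ^ (n `mod` period)))
  where
    base' = toInteger base
    cap'  = toInteger cap
    -- Steps strictly below the cap, plus one so that every cycle
    -- also includes a single capped step before restarting.
    period = length (takeWhile (< cap') (iterate (* 2) base')) + 1
```

With base 100 and cap 500 the delays run 100, 200, 400, 500 and then start over at 100. Assuming the RetryPolicyM constructor and the rsIterNumber field are in scope, this could presumably be wrapped as RetryPolicyM (\rs -> return (Just (cyclicDelay base cap (rsIterNumber rs)))).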
(see https://travis-ci.org/Soostone/retry/jobs/476494787 )
Testing against the current Stackage nightly gives this:
Error: While constructing the build plan, the following exceptions were
encountered:
In the dependencies for hedgehog-0.5.3:
containers-0.6.0.1 from stack configuration does not match >=0.4 && <0.6
(latest matching version is 0.5.11.0)
pretty-show-1.9.5 from stack configuration does not match >=1.6 && <1.8
(latest matching version is 1.7)
stm-2.5.0.0 from stack configuration does not match >=2.4 && <2.5 (latest
matching version is 2.4.5.1)
template-haskell-2.14.0.0 from stack configuration does not
match >=2.10 && <2.14 (latest matching version
is 2.13.0.0)
needed due to retry-0.7.8.0 -> hedgehog-0.5.3
I am testing a GHC-9 build of another project, and I needed to pull retry from git to do this. Is there any chance of a GHC-9-compatible release being published to Hackage?
Full jitter from the article should work like
sleep = random_between(0, min(cap, base * 2 ** attempt))
whereas
temp = min(cap, base * 2 ** attempt)
sleep = temp / 2 + random_between(0, temp / 2)
which is what the package uses, is named "equal jitter" in the article.
Also, / is not escaped in comments, so the formula gets rendered incorrectly.
I could create a PR with a fix, but that would be a breaking change as the algorithm will change.
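The difference between the two formulas is easier to see side by side. In this sketch the random draw is passed in as a Double in [0, 1] so the functions stay pure and need nothing beyond base; ceilingDelay, fullJitter and equalJitter are hypothetical names for illustration, not the package's functions.

```haskell
-- | temp = min(cap, base * 2^attempt), computed in Integer so the
-- multiplication cannot wrap around.
ceilingDelay :: Int -> Int -> Int -> Int
ceilingDelay cap base attempt =
  fromInteger (min (toInteger cap) (toInteger base * 2 ^ attempt))

-- | Full jitter: sleep = random_between(0, temp).
-- @r@ is a uniform sample in [0, 1] supplied by the caller.
fullJitter :: Int -> Int -> Int -> Double -> Int
fullJitter cap base attempt r = floor (r * fromIntegral temp)
  where
    temp = ceilingDelay cap base attempt

-- | Equal jitter: sleep = temp / 2 + random_between(0, temp / 2).
equalJitter :: Int -> Int -> Int -> Double -> Int
equalJitter cap base attempt r = half + floor (r * fromIntegral half)
  where
    half = ceilingDelay cap base attempt `div` 2
```

With cap 10000, base 100 and attempt 3 (so temp is 800), full jitter ranges over 0 to 800 while equal jitter never drops below 400.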
I ran the policy simulator on a policy with the rules:
At certain points in the retry it calculates large negative numbers, which makes me suspect it's an Int rollover. I'm not actually sure if the bug is with the simulator (in which case it's just giving misinformation to the user) or if retry itself will actually calculate these delays and retry at the wrong intervals. Example:
Prelude Control.Retry> let ms = 1000
Prelude Control.Retry> let s = ms * 1000
Prelude Control.Retry> let p = limitRetriesByDelay (60 * s) (capDelay (500 * ms) (exponentialBackoff (50 * ms)))
Prelude Control.Retry> simulatePolicyPP 100 p
0: 50.0ms
1: 100.0ms
2: 200.0ms
3: 400.0ms
4: 500.0ms
5: 500.0ms
6: 500.0ms
7: 500.0ms
8: 500.0ms
9: 500.0ms
10: 500.0ms
11: 500.0ms
12: 500.0ms
13: 500.0ms
14: 500.0ms
15: 500.0ms
16: 500.0ms
17: 500.0ms
18: 500.0ms
19: 500.0ms
20: 500.0ms
21: 500.0ms
22: 500.0ms
23: 500.0ms
24: 500.0ms
25: 500.0ms
26: 500.0ms
27: 500.0ms
28: 500.0ms
29: 500.0ms
30: 500.0ms
31: 500.0ms
32: 500.0ms
33: 500.0ms
34: 500.0ms
35: 500.0ms
36: 500.0ms
37: 500.0ms
38: 500.0ms
39: 500.0ms
40: 500.0ms
41: 500.0ms
42: 500.0ms
43: 500.0ms
44: 500.0ms
45: 500.0ms
46: 500.0ms
47: 500.0ms
48: -4372995238176751616us
49: -8745990476353503232us
50: 500.0ms
51: 500.0ms
52: 500.0ms
53: 500.0ms
54: -3170534137668829184us
55: -6341068275337658368us
56: 500.0ms
57: -6917529027641081856us
58: 500.0ms
59: -9223372036854775808us
60: 0us
61: 0us
62: 0us
63: 0us
64: 0us
65: 0us
66: 0us
67: 0us
68: 0us
69: 0us
70: 0us
71: 0us
72: 0us
73: 0us
74: 0us
75: 0us
76: 0us
77: 0us
78: 0us
79: 0us
80: 0us
81: 0us
82: 0us
83: 0us
84: 0us
85: 0us
86: 0us
87: 0us
88: 0us
89: 0us
90: 0us
91: 0us
92: 0us
93: 0us
94: 0us
95: 0us
96: 0us
97: 0us
98: 0us
99: 0us
100: 0us
Total cumulative delay would be: -1878001044587746832us
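The numbers above are consistent with 64-bit Int wraparound in base * 2^n: with a base of 50000us, 50000 * 2^48 exceeds maxBound :: Int and wraps to exactly the value shown for iteration 48, and taking the minimum with the cap then keeps the negative number. The sketch below reproduces that arithmetic and shows one possible guard, saturating at maxBound by computing in Integer. naiveBackoff and saturatingBackoff are hypothetical names for illustration, not the library's code, and the exact wrapped values assume a 64-bit Int.

```haskell
-- | Unguarded exponential backoff: wraps around once base * 2^n no
-- longer fits in an Int.
naiveBackoff :: Int -> Int -> Int
naiveBackoff base n = base * 2 ^ n

-- | Same formula computed in Integer and clamped to maxBound, so
-- the result never goes negative for a non-negative base.
saturatingBackoff :: Int -> Int -> Int
saturatingBackoff base n =
  fromInteger (min (toInteger (maxBound :: Int)) (toInteger base * 2 ^ n))
```

Capping then behaves as expected: min 500000 (saturatingBackoff 50000 48) is 500000, whereas min 500000 (naiveBackoff 50000 48) stays negative.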
Thanks for this highly useful library.
Would you consider switching from using exceptions to unliftio?
With UnliftIO.Exception, the exception handlers mask asynchronous exceptions by default, and it's less easy to catch asynchronous exceptions.
The classes MonadIO, MonadCatch, and MonadMask are all replaced by MonadUnliftIO.
I would encourage users to switch.