sunng87 / diehard

Clojure resilience library for flexible retry, circuit breaker and rate limiter

License: Eclipse Public License 2.0

Clojure 100.00%
circuit-breaker rate-limiter retry-library clojure bulkhead resilience

diehard's Introduction

diehard

A Clojure library that provides safety guards for your application. Some of the functionality is a wrapper over Failsafe.

Note that from 0.7, diehard uses Clojure 1.9 and spec.alpha for configuration validation. Clojure 1.8 users can stay on diehard 0.6.0.

Usage

A quick example for diehard usage.

Retry block

A retry block re-executes its inner forms when the retry criteria match.

(require '[diehard.core :as dh])
(dh/with-retry {:retry-on TimeoutException
                :max-retries 3}
  (fetch-data-from-the-moon))

Circuit breaker

A circuit breaker tracks executions of the inner block and skips execution once the open condition is triggered.

(require '[diehard.core :as dh])

(dh/defcircuitbreaker my-cb {:failure-threshold-ratio [8 10]
                             :delay-ms 1000})

(dh/with-circuit-breaker my-cb
  (fetch-data-from-the-moon))

Rate limiter

A rate limiter restricts how many times per second your code block may run. Depending on your configuration, it either blocks or throws an exception when the limit is exceeded. (:rate is a floating-point number and can be less than 1.0; for example, 0.5 means once every two seconds.)

(require '[diehard.core :as dh])

(dh/defratelimiter my-rl {:rate 100})

(dh/with-rate-limiter my-rl
  (send-people-to-the-moon))
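
To make a fractional :rate concrete, the sketch below converts a rate in permits per second into the average interval between permitted executions. It is plain Clojure; rate->interval-ms is a hypothetical helper for illustration and is not part of diehard.

```clojure
;; Hypothetical helper (not part of diehard): average spacing between
;; permitted executions, in milliseconds, for a rate in permits per second.
(defn rate->interval-ms
  [rate]
  (/ 1000.0 rate))

(rate->interval-ms 100) ;; => 10.0, i.e. 100 executions per second
(rate->interval-ms 0.5) ;; => 2000.0, i.e. once every two seconds
```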

Bulkhead

Bulkhead allows you to limit concurrent execution on a code block.

(require '[diehard.core :as dh])

;; at most 10 threads can run the code block concurrently
(dh/defbulkhead my-bh {:concurrency 10})

(dh/with-bulkhead my-bh
  (send-people-to-the-moon))

Timeout

Timeouts allow you to fail an execution with TimeoutExceededException if it takes too long to complete.

(require '[diehard.core :as dh])

(dh/with-timeout {:timeout-ms 5000
                  :interrupt? true}
  (fly-me-to-the-moon))

Examples

Retry block

(dh/with-retry {:retry-on          Exception
                :max-retries       3
                :on-retry          (fn [val ex] (prn "retrying..."))
                :on-failure        (fn [_ _] (prn "failed..."))
                :on-failed-attempt (fn [_ _] (prn "failed attempt"))
                :on-success        (fn [_] (prn "did it! success!"))}
               (throw (ex-info "not good" {:not "good"})))

output:

"failed attempt"
"retrying..."
"failed attempt"
"retrying..."
"failed attempt"
"retrying..."
"failed attempt"
"failed..."
Execution error (ExceptionInfo) at main.user$eval27430$reify__27441/get (form-init6791465293873302710.clj:7).
not good

Docs

More options can be found in the documentation on cljdoc.

Build

This project uses deps.edn and build.edn for dependency management. To build the project, run

clojure -T:build install

License

Copyright © 2016-2023 Ning Sun

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.

Donation

I'm now accepting donations on Liberapay, if you find my work helpful and want to keep it going.

diehard's People

Contributors

alinposho, arichiardi, belucid, buzzdan, codahale, dhruvbhatia, jainsahab, lewang, lverns, marcomorain, mjhanninen, mrichards42, neilprosser, notnoopci, realgenekim, rslabbert, sunng87, vendethiel

diehard's Issues

question on future(...) construct use

Is there an equivalent construct here that allows something like this:

(require '[diehard.core :as dh])
(dh/with-retry {:retry-on TimeoutException
                :max-retries 3}
  (fetch-data-from-the-moon-future))

where fetch-data-from-the-moon-future is a future(...) and, of course, the function will return another future?

...I mean it doesn't have to be a future; it can be a thread or anything else, as long as equivalent async constructs are available.

Arguments for `backoff-ms` in `with-retry` throw spec error

In the docs for with-retry:

:backoff-ms specify a vector [initial-delay-ms max-delay-ms multiplier]

When supplying e.g. [2000 60000 2] I get a spec error:

Invalid input
   {:clojure.spec.alpha/problems
    ({:path [:backoff-ms :single],
      :pred clojure.core/int?,
      :val [2000 60000 2],
      :via [:retry/retry-block :retry/backoff-ms],
      :in [:backoff-ms]}
     {:path [:backoff-ms :tuple],
      :pred (clojure.core/= (clojure.core/count %) 2),
      :val [2000 60000 2],
      :via [:retry/retry-block :retry/backoff-ms],
      :in [:backoff-ms]}),
    :clojure.spec.alpha/spec :retry/retry-block,
    :clojure.spec.alpha/value
    {:on-retry
     #function[.../fn--79386],
     :retry-on java.lang.Exception,
     :max-retries 10,
     :backoff-ms [2000 60000 2]}}

Apparently the spec contradicts the documentation:

(s/or :single int? :tuple (s/tuple int? int?)))
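
The mismatch can be reproduced with clojure.spec.alpha alone. The sketch below registers the spec quoted above under a local name and checks the three input shapes. The relaxed spec at the end is only an assumption about what a fix could look like, not the library's actual change; the names ::backoff-ms and ::backoff-ms-relaxed are local to this example.

```clojure
(require '[clojure.spec.alpha :as s])

;; The spec as quoted in the issue: a single int or a two-element tuple.
(s/def ::backoff-ms
  (s/or :single int? :tuple (s/tuple int? int?)))

(s/valid? ::backoff-ms 2000)           ;; => true
(s/valid? ::backoff-ms [2000 60000])   ;; => true
(s/valid? ::backoff-ms [2000 60000 2]) ;; => false, three elements are rejected

;; Hypothetical relaxed spec (an assumption, not the library's fix) that
;; would also accept the documented three-element form.
(s/def ::backoff-ms-relaxed
  (s/or :single int?
        :tuple  (s/tuple int? int?)
        :triple (s/tuple int? int? number?)))

(s/valid? ::backoff-ms-relaxed [2000 60000 2]) ;; => true
```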

Instant cannot be cast to Duration

I was trying to upgrade from 0.11.3 to 0.11.6 and one of my tests failed with this. Still investigating but possibly related to e82bf79?

#error {
 :cause "class java.time.Instant cannot be cast to class java.time.Duration (java.time.Instant and java.time.Duration are in module java.base of loader 'bootstrap')"
 :via
 [{:type java.lang.ClassCastException
   :message "class java.time.Instant cannot be cast to class java.time.Duration (java.time.Instant and java.time.Duration are in module java.base of loader 'bootstrap')"
   :at [khana.ghdeploy$eval6533$deploy_and_email__6534$fn$reify__6543 get "NO_SOURCE_FILE" 48]}]
 :trace
 [[khana.ghdeploy$eval6533$deploy_and_email__6534$fn$reify__6543 get "NO_SOURCE_FILE" 48]
  [dev.failsafe.Functions lambda$get$0 "Functions.java" 46]
  [dev.failsafe.internal.RetryPolicyExecutor lambda$apply$0 "RetryPolicyExecutor.java" 74]
  [dev.failsafe.SyncExecutionImpl executeSync "SyncExecutionImpl.java" 187]
  [dev.failsafe.FailsafeExecutor call "FailsafeExecutor.java" 376]
  [dev.failsafe.FailsafeExecutor get "FailsafeExecutor.java" 123]
  [khana.ghdeploy$eval6533$deploy_and_email__6534$fn__6535$fn__6545 invoke "NO_SOURCE_FILE" 171]
  [khana.ghdeploy$eval6533$deploy_and_email__6534$fn__6535 invoke "NO_SOURCE_FILE" 171]
  [khana.ghdeploy$eval6533$deploy_and_email__6534 invoke "NO_SOURCE_FILE" 170]
  [clojure.lang.Var invoke "Var.java" 388]
  [khana.ghdeploy_test$fn__21614 invokeStatic "ghdeploy_test.clj" 126]
  [khana.ghdeploy_test$fn__21614 invoke "ghdeploy_test.clj" 102]
  [cider.nrepl.middleware.test$test_var$fn__6252 invoke "test.clj" 242]
  [cider.nrepl.middleware.test$test_var invokeStatic "test.clj" 242]
  [cider.nrepl.middleware.test$test_var invoke "test.clj" 234]
  [cider.nrepl.middleware.test$test_vars$fn__6256$fn__6261 invoke "test.clj" 257]
  [clojure.test$default_fixture invokeStatic "test.clj" 687]
  [clojure.test$default_fixture invoke "test.clj" 683]
  [cider.nrepl.middleware.test$test_vars$fn__6256 invoke "test.clj" 257]
  [clojure.test$default_fixture invokeStatic "test.clj" 687]
  [clojure.test$default_fixture invoke "test.clj" 683]
  [cider.nrepl.middleware.test$test_vars invokeStatic "test.clj" 254]
  [cider.nrepl.middleware.test$test_vars invoke "test.clj" 248]
  [cider.nrepl.middleware.test$test_ns invokeStatic "test.clj" 270]
  [cider.nrepl.middleware.test$test_ns invoke "test.clj" 261]
  [cider.nrepl.middleware.test$test_var_query invokeStatic "test.clj" 281]
  [cider.nrepl.middleware.test$test_var_query invoke "test.clj" 274]
  [cider.nrepl.middleware.test$handle_test_var_query_op$fn__6300$fn__6301 invoke "test.clj" 319]
  [clojure.lang.AFn applyToHelper "AFn.java" 152]
  [clojure.lang.AFn applyTo "AFn.java" 144]
  [clojure.core$apply invokeStatic "core.clj" 667]
  [clojure.core$with_bindings_STAR_ invokeStatic "core.clj" 1990]
  [clojure.core$with_bindings_STAR_ doInvoke "core.clj" 1990]
  [clojure.lang.RestFn invoke "RestFn.java" 425]
  [cider.nrepl.middleware.test$handle_test_var_query_op$fn__6300 invoke "test.clj" 311]
  [clojure.lang.AFn run "AFn.java" 22]
  [nrepl.middleware.session$session_exec$main_loop__1389$fn__1393 invoke "session.clj" 218]
  [nrepl.middleware.session$session_exec$main_loop__1389 invoke "session.clj" 217]
  [clojure.lang.AFn run "AFn.java" 22]
  [java.lang.Thread run "Thread.java" 1589]]}

:backoff-ms documentation does not match failsafe docs

The current docs say

the delay for nth retry will be (max (* initial-delay-ms n) max-delay-ms)

while failsafe says on http://jodah.net/failsafe/javadoc/net/jodah/failsafe/RetryPolicy.html#withBackoff-long-long-java.time.temporal.ChronoUnit-

Sets the delay between retries, exponentially backing off to the maxDelay and multiplying successive delays by a factor of 2.

so the correct formula would be
min(max-delay-ms, (initial-delay-ms × 2^(n-1)))
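
As a worked example of the proposed formula, the sketch below computes the delay before each retry in plain Clojure. backoff-delay-ms is a hypothetical helper, and the factor of 2 and the 1-based retry index follow the Failsafe description quoted above; it does not call diehard or Failsafe.

```clojure
;; Hypothetical helper: delay before the nth retry (n starts at 1),
;; doubling on each retry and capped at max-delay-ms.
(defn backoff-delay-ms
  [initial-delay-ms max-delay-ms n]
  (min max-delay-ms
       (* initial-delay-ms (bit-shift-left 1 (dec n)))))

(map #(backoff-delay-ms 2000 60000 %) (range 1 8))
;; => (2000 4000 8000 16000 32000 60000 60000)
```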

cljdoc not accessible since 0.11.4

According to cljdoc:

Version 0.11.3 still works, but later versions from 0.11.4 to 0.11.6 fail.

The error raised is:

exception-during-import

{:cause
   "Could not find revision 0.11.4 in repo https://github.com/sunng87/diehard",
 :data {:origin "https://github.com/sunng87/diehard", :rev "0.11.4"},
 :trace [[cljdoc.git_repo$tree_for invokeStatic "git_repo.clj" 127]
         [cljdoc.git_repo$tree_for invoke "git_repo.clj" 117]
         [cljdoc.git_repo$slurp_file_at invokeStatic "git_repo.clj" 135]
         [cljdoc.git_repo$slurp_file_at invoke "git_repo.clj" 130]
         [cljdoc.git_repo$read_cljdoc_config invokeStatic "git_repo.clj" 206]
         [cljdoc.git_repo$read_cljdoc_config invoke "git_repo.clj" 203]
         [cljdoc.analysis.git$cljdoc_config invokeStatic "git.clj" 51]
         [cljdoc.analysis.git$cljdoc_config invoke "git.clj" 49]
         [cljdoc.analysis.git$analyze_git_repo$fn__39431 invoke "git.clj" 120]
         [cljdoc.analysis.git$analyze_git_repo invokeStatic "git.clj" 106]
         [cljdoc.analysis.git$analyze_git_repo invoke "git.clj" 99]
         [cljdoc.server.ingest$ingest_git_BANG_ invokeStatic "ingest.clj" 54]
         [cljdoc.server.ingest$ingest_git_BANG_ invoke "ingest.clj" 45]
         [cljdoc.server.api$kick_off_build_BANG_$fn__41381 invoke "api.clj" 72]
         [clojure.core$binding_conveyor_fn$fn__5823 invoke "core.clj" 2047]
         [clojure.lang.AFn call "AFn.java" 18]
         [java.util.concurrent.FutureTask run "FutureTask.java" 264]
         [java.util.concurrent.ThreadPoolExecutor runWorker
          "ThreadPoolExecutor.java" 1136]
         [java.util.concurrent.ThreadPoolExecutor$Worker run
          "ThreadPoolExecutor.java" 635]
         [java.lang.Thread run "Thread.java" 833]],
 :via
   [{:at [cljdoc.git_repo$tree_for invokeStatic "git_repo.clj" 127],
     :data {:origin "https://github.com/sunng87/diehard", :rev "0.11.4"},
     :message
       "Could not find revision 0.11.4 in repo https://github.com/sunng87/diehard",
     :type clojure.lang.ExceptionInfo}]}

Reflection warnings

First of all, thank you for the library!

We can still see some reflection warnings using your library:

Reflection warning, diehard/rate_limiter.clj:34:9 - call to static method sleep on java.lang.Thread can't be resolved (argument types: unknown).
Reflection warning, diehard/rate_limiter.clj:45:13 - call to static method sleep on java.lang.Thread can't be resolved (argument types: unknown).
Reflection warning, diehard/circuit_breaker.clj:78:3 - reference to field allowsExecution on dev.failsafe.CircuitBreaker can't be resolved.
Reflection warning, diehard/core.clj:62:39 - reference to field getConfig can't be resolved.
Reflection warning, diehard/core.clj:83:11 - call to method withBackoff on dev.failsafe.RetryPolicyBuilder can't be resolved (argument types: java.lang.Object, java.lang.Object, java.time.temporal.ChronoUnit).

I believe it would be a good idea to add a call to (set! *warn-on-reflection* true) in the test suite or something like that to avoid regressions.
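
For context, warnings like the Thread/sleep ones above are typically resolved with a type hint. The sketch below is a generic illustration and not the library's actual patch; sleep-unhinted and sleep-hinted are hypothetical names. With (set! *warn-on-reflection* true) enabled at the REPL, only the unhinted version would be expected to warn on JDKs where Thread/sleep has more than one single-argument overload.

```clojure
;; With an untyped argument the compiler may be unable to choose a
;; Thread/sleep overload, so the call is resolved by reflection at runtime.
(defn sleep-unhinted [ms]
  (Thread/sleep ms))

;; The ^long hint tells the compiler the argument type, so the call to
;; Thread.sleep(long) is resolved at compile time.
(defn sleep-hinted [^long ms]
  (Thread/sleep ms))

(sleep-hinted 1) ;; => nil
```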

Get rid of reflection warnings

When using diehard in a namespace where you have (set! *warn-on-reflection* true) an invocation of with-retry will give warnings:

$ clojure -Sdeps '{:deps {diehard {:mvn/version "0.9.0"}}}'
Clojure 1.10.1
user=> (require '[diehard.core :as dh])
nil
user=> (set! *warn-on-reflection* true)
true
user=> (dh/with-retry {:max-retries 2} 1)
Reflection warning, NO_SOURCE_PATH:1:1 - call to static method with on net.jodah.failsafe.Failsafe can't be resolved (argument types: unknown).
Reflection warning, NO_SOURCE_PATH:1:1 - call to method onComplete can't be resolved (target class is unknown).
1

It would be nice if they could be resolved.

Fallback value is not used

I think the value returned from the :fallback on with-retry is not used. I can reproduce this by modifying the fallback test.

(deftest fallback
  ;; This is the original test
  (testing "fallback value is same as original value"
    (is (= 10 (dh/with-retry {:fallback (fn [v e]
                                          (is (= v 10))
                                          (is (nil? e))
                                          v)
                              :retry-if (fn [v e] (< v 10))}
                dh/*executions*)))

    ;; Here I changed fallback to instead return :fail.
    (testing "fallback value is different from original value"
      (is (= :fail (dh/with-retry {:fallback (fn [v e]
                                               (is (= v 10))
                                               (is (nil? e))
                                               :fail)
                                   :retry-if (fn [v e] (< v 10))}
                     dh/*executions*))))))

1 non-passing tests:

Fail in fallback
fallback value is same as original value fallback value is different from original value

expected: :fail

  actual: 10          
    diff: - :fail          
          + 10    

Here's a simpler test that also fails:

(deftest simple-fallback
  (is (= :fallback (dh/with-retry {:retry-when false
                                   :max-retries 3
                                   :fallback :fallback}
                     false))))

1 non-passing tests:

Fail in simple-fallback

expected: :fallback

  actual: false          
    diff: - :fallback          
          + false    

I'm not sure :fallback is even running though:

(dh/with-retry {:retry-when  false
                :max-retries 3
                :on-success  (fn [_ _] (println "success"))
                :on-failure  (fn [_ _] (println "failure"))
                :on-retry    (fn [_ _] (println "retry"))
                :fallback    (fn [_ _] (println "fallback") :fallback)}
  false)

;; retry
;; retry
;; retry
;; failure
;; => false

callback on-failure

We have a use case where we need to do something when a retry failure happens.
There does not seem to be a way to register a callback 🤔
Did I miss something here?

Document types and arity of circuit breaker options

First of all, thank you for diehard 🙏

I'm having a hard time setting up some of the circuit breaker options. The following keys are documented, but not the types their values should have:

  • :delay-ms
  • :failure-threshold
  • :failure-threshold-ratio
  • :success-threshold
  • :success-threshold-ratio
  • :on-open
  • :on-close
  • :on-half-open

I assume that :on-open, :on-close, :on-half-open should be functions of arity 0.

Diehard docs: https://cljdoc.org/d/diehard/diehard/0.8.5/api/diehard.core#defcircuitbreaker

Define a circuit breaker with option.

Available options

There options are available when creating circuit breaker in
defcircuitbreaker.

Failure criteria

All the three fail options share same meaning with similar option in
retry block.

  • :fail-if
  • :fail-on
  • :fail-when
  • :timeout-ms while give all you code a timeout is best practice in
    application level, circuit breaker also provides a timeout for
    marking a long running block as failure

Delay and threshold

  • :delay-ms required. the delay for :open circuit breaker to turn
    into :half-open.
  • :failure-threshold
  • :failure-threshold-ratio
  • :success-threshold
  • :success-threshold-ratio All these four option is to determine at
    what condition the circuit breaker is open.

Listeners

  • :on-open a function to be called when state goes :open
  • :on-close a function to be called when state goes :closed
  • :on-half-open a function to be called when state goes :half-open

Failsafe docs: https://jodah.net/failsafe/circuit-breaker/#event-listeners

In addition to the standard policy listeners, a CircuitBreaker can notify you when the state of the breaker changes:

circuitBreaker
 .onOpen(() -> log.info("The circuit breaker was opened"))
 .onClose(() -> log.info("The circuit breaker was closed"))
 .onHalfOpen(() -> log.info("The circuit breaker was half-opened"));

`Fallback` doesn't support `.handleResult`

I'd like to throw an exception when retries are exceeded. Per https://failsafe.dev/faqs/#how-to-i-throw-an-exception-when-retries-are-exceeded, here's what they recommend (and what works on my machine):

// Retry on a null result
RetryPolicy<Connection> retryPolicy = RetryPolicy.<Connection>builder()
  .handleResult(null)
  .build();
  
// Fallback on a null result with a ConnectException
Fallback<Connection> fallback = Fallback.<Connection>builderOfException(e -> {
    return new ConnectException("Connection failed after retries");
  })
  .handleResult(null)
  .build();

Failsafe.with(fallback).compose(retryPolicy).get(this::getConnection);

This means diehard.core/fallback would have to be reworked slightly to support .handleResult etc.

Awesome library by the way! 🙌 Great functionality, and great abstraction over the Java lib.

Circuit breaker not opening

Hi there, I might not understand how this works. Very likely I am missing something.

I have a :fail-if that always returns true and a block of database code that throws a PSQLException.

Basically the with-circuit-breaker form always throws and never enters any of the callback functions, where I have logging statements.

I don't have any threshold set (but I tried with them) and I expected the circuit to open at the first failure.

What am I doing wrong?

Incompatible change in behaviour of fallback policy, after upgrading to failsafe 2.0.x

Commit fc49004 introduced the changes needed to implement the fallback policy as a regular policy.

Previous to this change, the fallback policy was only executed after the retry policy (assuming no circuit breaker was configured) had resulted in a failed block execution for all configured retries. With the new change, the fallback policy is executed for every failed block execution.

This is due to a bad policy composition introduced in that commit. The typical way of composing the fallback policy (and the one that mimics the previous behaviour) is specified in https://github.com/jhalterman/failsafe#policy-composition.

So the right order to implement that typical policy and maintain the previous behaviour is by applying the following change:

diff --git a/src/diehard/core.clj b/src/diehard/core.clj
index f591542..410b279 100644
--- a/src/diehard/core.clj
+++ b/src/diehard/core.clj
@@ -343,7 +343,7 @@ It will work together with retry policy as quit criteria.
            fallback# (fallback the-opt#)
            cb# (:circuit-breaker the-opt#)

-           policies# (into-array FailurePolicy (filter some? [retry-policy# fallback# cb#]))
+           policies# (into-array FailurePolicy (filter some? [fallback# retry-policy# cb#]))

            failsafe# (Failsafe/with policies#)
            failsafe# (if-let [on-complete# (:on-complete the-opt#)]

`deflistener` does not exist

Hello @sunng87, and thank you for your effort on this library!

deflistener is documented but I can't find it in diehard.core.

Steps to reproduce:

(ns user
  (:require [diehard.core :refer [deflistener]]))

;; ...
;; deflistener does not exist

I could be doing something very wrong, but I can't find reference to deflistener in the diehard.core tests either, so it's possible it got lost in the shuffle in the move to 0.8.0, based on my very brief look at the commit history.

Exceptions thrown from :on-failure or :on-retries-exceeded are swallowed

Exceptions thrown from :on-failure or :on-retries-exceeded are swallowed. As a result, the following test fails:

(deftest throw-on-timeout-test
  (is (thrown-with-msg? Exception #":on-retries-exceeded"
                        (dh/with-retry {:retry-when          false
                                        :delay-ms            100
                                        :max-retries         10
                                        :on-failed-attempt   (fn [_ _] 
                                                               (println :on-failed-attempt))
                                        :on-failure          (fn [_ _] 
                                                               (prn :on-failure) 
                                                               (throw (ex-info ":on-failure" {})))
                                        :on-retries-exceeded (fn [_ _] 
                                                               (prn :on-retries-exceeded)
                                                               (throw (ex-info ":on-retries-exceeded" {})))}
                          false))))

Cannot use listeners with predefined policy

I'm reading the cljdoc on predefined listeners, the documentation states:

(require '[diehard.core :as diehard])

;; added for clarity
(diehard/defretrypolicy policy
  {:max-retries 5})

(diehard/deflistener listener
  {:on-retry (fn [return-value exception-thrown] (println "retried"))})

(diehard/with-retry {:policy policy :listener listener}
  ;; your code here
  )

So problem number one here is that diehard/deflistener no longer exists. I see that it was introduced in version 0.3.0, but I don't know when it went away. That's fine; we can rewrite the above code as follows:

(require '[diehard.core :as diehard])

;; added for clarity
(diehard/defretrypolicy policy
  {:max-retries 5})

(diehard/with-retry {:policy policy
                     :on-retry (fn [return-value exception-thrown] (println "retried"))}
  ;; your code here
  )

The problem here is that once the policy key is provided, all the other keys are ignored and they won't affect the provided RetryPolicy object.

My goal was to create a base RetryPolicy object once, and provide the listener functions at the call site so that I can use the inputs, something like this:

(require '[diehard.core :as diehard])

;; added for clarity
(diehard/defretrypolicy policy
  {:max-retries 5})

(defn get-user
  [id]
  (diehard/with-retry {:policy policy
                       :on-retry (fn [_ _] (println "(Re)Attempting to fetch user: " id))}
    (get-user* id)))

This is a contrived example, but essentially it splits the policy from the listeners that need local information.

TL;DR: The documentation doesn't match what the code is doing. Which is correct, the code or the docs?


What I'll end up doing is defining my policy as a Clojure map and assoc'ing the listener keys at the call site, e.g.:

(require '[diehard.core :as diehard])

;; added for clarity
(def policy
  {:max-retries 5})

(defn get-user
  [id]
  (diehard/with-retry (assoc policy
                        :on-retry (fn [_ _] (println "(Re)Attempting to fetch user: " id)))
    (get-user* id)))

upgrade to 0.9.3 produces Execution error, 'Unable to resolve spec: ...'

For the 0.9.x series, things seem fine through the 0.9.2 release. When I try to use the 0.9.3 release I get the following error when compiling:

Execution error at diehard.util/verify-opt-map-keys-with-spec (util.clj:13).
Unable to resolve spec: :retry/retry-policy-new

We are backing up to 0.9.2 for the work we are doing but I thought you would want to know.
Perhaps we are doing something wrong on our end, but this is code that has been working well and has not changed in quite some time.

Thanks for your work on this library! :-)

0.10.1 regression: max-retries not respected

With the changes in #42, retry-policy-from-config with just a policy and no overrides ends up overwriting :max-retries with -1. A minimal reproduction, based on an existing test:

(defretrypolicy the-test-policy
  {:max-retries 4
   :retry-if (fn [v e] (< v 10))})

(with-retry {:policy the-test-policy}
  *executions*)
; => 10

It looks like :max-retries is the only option with a default. Removing the default would fix this issue, but I'm not sure if it's safe to remove:

(when-let [retries (:max-retries policy-map -1)]
  (.withMaxRetries policy retries))
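
The effect of the default can be seen in isolation with plain maps. In the sketch below, both function names are hypothetical and the map stands in for the overrides passed alongside :policy; it only illustrates how when-let behaves with and without the -1 default, not how diehard should resolve the issue.

```clojure
;; With a default of -1 the lookup never returns nil, so when-let always
;; runs its body, even when :max-retries is absent from the map.
(defn retries-with-default [policy-map]
  (when-let [retries (:max-retries policy-map -1)]
    retries))

;; Without the default, a missing key yields nil and the body is skipped.
(defn retries-without-default [policy-map]
  (when-let [retries (:max-retries policy-map)]
    retries))

(retries-with-default {})                   ;; => -1
(retries-without-default {})                ;; => nil
(retries-without-default {:max-retries 4})  ;; => 4
```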

NullPointerException when combining with-retry and with-timeout

I want to retry something that might time out, an HTTP request maybe. So I tried this:

(diehard/with-retry {:retry-on    TimeoutExceededException
                     :max-retries 1}
  (diehard/with-timeout {:timeout-ms 100}
    ;; Something that might time out
    (Thread/sleep 200)))

This results in a NullPointerException:

Execution error (NullPointerException) at user/eval173014 (user.clj:301).
Cannot throw exception because the return value of "java.lang.Throwable.getCause()" is null

My guess is it happens because the thrown TimeoutExceededException does not have a cause:

(try
    (diehard/with-timeout {:timeout-ms 100}
      (Thread/sleep 200))
    (catch Exception e
      (.getCause e)))
=> nil

Yet with-retry assumes that a FailsafeException always has a cause:
https://github.com/sunng87/diehard/blob/master/src/diehard/core.clj#L365

I've worked around this by throwing a new exception that has the TimeoutExceededException as cause:

(diehard/with-retry {:retry-on    Exception
                     :max-retries 1}
  (try
    (diehard/with-timeout {:timeout-ms 100}
      (Thread/sleep 200))
    (catch TimeoutExceededException e
      (throw (Exception. "Timeout" e)))))

But I'm wondering if there's a better intended way to do this, or if there maybe should be a check around the .getCause in with-retry that would throw the original exception if the cause was nil?
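
The kind of check suggested here could look like the following sketch, which uses only java.lang exceptions. cause-or-self is a hypothetical helper, not a function in diehard, and whether with-retry should adopt this behaviour is left open.

```clojure
;; Hypothetical helper: prefer the wrapped cause, but fall back to the
;; exception itself when it has no cause.
(defn cause-or-self
  [^Throwable e]
  (or (.getCause e) e))

(def no-cause   (RuntimeException. "timed out"))
(def with-cause (RuntimeException. "wrapper" (IllegalStateException. "inner")))

(identical? no-cause (cause-or-self no-cause))       ;; => true
(.getMessage ^Throwable (cause-or-self with-cause))  ;; => "inner"
```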

Async retry

Add support for Failsafe retry in async scenario. This is not needed by myself at the moment but it would be nice to have it supported from the library.

Add fallback support

We can now specify a fallback value/handler for retry block in failsafe 0.9.x

Allow retry on Throwable

Currently diehard restricts retry-on to Exception and its subclasses:

https://github.com/sunng87/diehard/blob/master/src/diehard/spec.clj#L17

This is in keeping with the recommendation from the Java docs:

The class Exception and its subclasses are a form of Throwable that indicates conditions that a reasonable application might want to catch.

https://docs.oracle.com/javase/8/docs/api/java/lang/Exception.html

But this advice is (sadly) contradicted by experience. In the wild, much code throws exceptions that are not subclasses of Exception (Clojure's pre and post assertions, for example) or catches Throwable, usually in a server context to prevent thread death.

As diehard is used in exactly these situations, I believe the spec should be relaxed to (isa? % Throwable).
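
The point about pre and post assertions can be checked in plain Clojure: a failed :pre condition throws java.lang.AssertionError, which is a Throwable but not an Exception. In the sketch below, half and thrown-class are hypothetical names used only for illustration.

```clojure
;; A function whose precondition fails throws AssertionError
;; (assuming assertions are enabled, which is the default).
(defn half [n]
  {:pre [(even? n)]}
  (quot n 2))

(defn thrown-class
  "Returns the class of whatever f throws, or nil if it returns normally."
  [f]
  (try
    (f)
    nil
    (catch Throwable t
      (class t))))

(thrown-class #(half 3))         ;; => java.lang.AssertionError
(isa? AssertionError Exception)  ;; => false
(isa? AssertionError Throwable)  ;; => true
```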
