Comments (8)
I'm happy to work on implementing this, but I suspect @carllerche will have some opinions on how we ought to do it.
from tower.
I think that this is a great idea.
My original thought was to not do backoff in the reconnect middleware, and instead keep backoff as part of `Retry` (which doesn't exist yet), so that `Retry<Reconnect<...>>` would provide the backoff reconnect behavior.
However, after thinking some more about it, I don't think that this is necessarily ideal, because `Retry` will incur some additional overhead to keep a handle to the original request and to perform the retry of the request. This overhead probably isn't necessary in the reconnect with backoff case.
Given this, implementing the logic directly in tower-reconnect is probably the right first (and maybe final) step. Once we tackle `Retry` we can then think about what, if anything, can be shared between the two.
@hawkw if you want to take a stab at this, go for it. The one nit with your original list is:
> take a stream of durations to use as backoffs
This probably won't be a stream in the futures sense. Instead, the type (`Backoff`?) will probably implement `IntoIterator` such that `Item = Duration`.
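To make the `IntoIterator` idea concrete, here is a minimal sketch of what such a type could look like. The struct name `Backoff`, its fields, and the doubling-with-cap policy are all assumptions for illustration; nothing here is an existing tower API.

```rust
use std::time::Duration;

/// Hypothetical backoff configuration. The name and fields are
/// assumptions for this sketch, not an existing tower type.
#[derive(Clone, Debug)]
pub struct Backoff {
    pub base: Duration,
    pub max: Duration,
    pub attempts: u32,
}

/// Iterator that yields one delay per reconnect attempt.
pub struct BackoffIter {
    next: Duration,
    max: Duration,
    remaining: u32,
}

impl IntoIterator for Backoff {
    type Item = Duration;
    type IntoIter = BackoffIter;

    fn into_iter(self) -> BackoffIter {
        BackoffIter {
            // Never start above the cap.
            next: self.base.min(self.max),
            max: self.max,
            remaining: self.attempts,
        }
    }
}

impl Iterator for BackoffIter {
    type Item = Duration;

    fn next(&mut self) -> Option<Duration> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        let current = self.next;
        // Double the delay for the following attempt, capped at `max`.
        self.next = current.checked_mul(2).unwrap_or(self.max).min(self.max);
        Some(current)
    }
}
```

Because the bound is only `IntoIterator<Item = Duration>`, a plain `Vec<Duration>` would also satisfy it, so a fixed schedule would not need a custom type.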
Maybe @olix0r or @danburkert have additional thoughts on the matter.
This relates a little to #14.
from tower.
> My original thought was to not do backoff in the reconnect middleware, and instead keep backoff as part of `Retry` (which doesn't exist yet), so that `Retry<Reconnect<...>>` would provide the backoff reconnect behavior.
> However, after thinking some more about it, I don't think that this is necessarily ideal, because `Retry` will incur some additional overhead to keep a handle to the original request and to perform the retry of the request. This overhead probably isn't necessary in the reconnect with backoff case.
My thought is that we provide a "unit" or "empty" backoff type, and a `Retry` with the empty/unit backoffs would just do the current `Reconnect` behaviour of immediately failing the request on connect errors, because the backoffs are always exhausted.
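A rough sketch of the "empty" backoff idea follows. `NoBackoff` and `connect_with_backoff` are hypothetical names, and the blocking `thread::sleep` only stands in for whatever asynchronous timer a real middleware would use.

```rust
use std::iter;
use std::thread;
use std::time::Duration;

/// Hypothetical "empty" backoff: it never yields a delay.
pub struct NoBackoff;

impl IntoIterator for NoBackoff {
    type Item = Duration;
    type IntoIter = iter::Empty<Duration>;

    fn into_iter(self) -> Self::IntoIter {
        iter::empty()
    }
}

/// Calls `connect` once, then once more after each delay the backoff
/// yields. Returns the last error when the backoffs are exhausted.
pub fn connect_with_backoff<T, E, B, F>(backoffs: B, mut connect: F) -> Result<T, E>
where
    B: IntoIterator<Item = Duration>,
    F: FnMut() -> Result<T, E>,
{
    let mut delays = backoffs.into_iter();
    loop {
        match connect() {
            Ok(conn) => return Ok(conn),
            Err(err) => match delays.next() {
                // Blocking sleep keeps the sketch dependency-free; real
                // middleware would wait on an asynchronous timer instead.
                Some(delay) => thread::sleep(delay),
                None => return Err(err),
            },
        }
    }
}
```

With `NoBackoff` the first connect error is returned immediately, which matches the fail-fast behaviour described above; any non-empty schedule turns the same function into reconnect with backoff.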
> Instead, the type (`Backoff`?) will probably implement `IntoIterator` such that `Item = Duration`.
Yeah, that seems right.
from tower.
A question that came up while working on this is: what timer should be used for the backoffs?
In Conduit, I've been working on an abstraction over timer implementations (linkerd/linkerd2#480) that allows a mock timer to be injected for testing, and we'd expect that if I configure Conduit to use a mock timer, backoffs will also wait based on the mock timer rather than the default timer. Furthermore, we'd ideally want users who are using `tokio-timer` to be able to use that timer for backoffs, without requiring the `tokio-timer` dependency. This implies to me that we might want to move the timer facade work I've been doing from Conduit to `tower`. Does that seem reasonable?
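For readers unfamiliar with the idea, a timer facade of this kind might look roughly like the sketch below. It is not the Conduit abstraction or the `tokio-timer` API: the `Timer` trait, `SystemTimer`, `MockTimer`, and `wait_out` are hypothetical names, and the interface is blocking only to keep the example free of dependencies.

```rust
use std::cell::RefCell;
use std::thread;
use std::time::Duration;

/// Hypothetical timer facade (an assumption for this sketch).
pub trait Timer {
    fn sleep(&self, duration: Duration);
}

/// Default timer: actually blocks the current thread.
pub struct SystemTimer;

impl Timer for SystemTimer {
    fn sleep(&self, duration: Duration) {
        thread::sleep(duration);
    }
}

/// Mock timer for tests: records requested sleeps and returns at once.
#[derive(Default)]
pub struct MockTimer {
    slept: RefCell<Vec<Duration>>,
}

impl MockTimer {
    /// Returns a copy of every duration passed to `sleep` so far.
    pub fn slept(&self) -> Vec<Duration> {
        self.slept.borrow().clone()
    }
}

impl Timer for MockTimer {
    fn sleep(&self, duration: Duration) {
        self.slept.borrow_mut().push(duration);
    }
}

/// Waits out every delay in `backoffs` on the given timer and returns
/// the total time requested.
pub fn wait_out<T, B>(timer: &T, backoffs: B) -> Duration
where
    T: Timer,
    B: IntoIterator<Item = Duration>,
{
    let mut total = Duration::from_secs(0);
    for delay in backoffs {
        timer.sleep(delay);
        total += delay;
    }
    total
}
```

Code that is generic over `Timer` waits on whichever implementation is injected, so a test can assert on the recorded delays without real time passing.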
from tower.
I would prefer to not introduce a `Timer` trait yet. Traits come with ergonomic overhead. The next iteration of tokio-timer can handle the requirement of being able to "mock" out timers.
from tower.
I do have some thoughts on retry strategies. I think the strategy @hawkw outlined in the first comment is valid, but it does have the downside that every error becomes a timeout error. This can be mitigated with some careful error management, but it's pretty tricky to do in an ergonomic way and without throwing away the intermediate errors.
The retry strategy also needs to be designed with speculative / hedged connections in mind. E.g. if the service has three replicas which you can choose from, you may not want to spend your full timeout attempting to connect to one of them. Perhaps this is better solved at a higher level, though.
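One possible way to avoid throwing away the intermediate errors is a timeout error that carries them along. The sketch below is only an illustration under that assumption; `ReconnectTimedOut` and its fields are hypothetical and do not exist in tower.

```rust
use std::error::Error;
use std::fmt;
use std::time::Duration;

/// Hypothetical error returned when reconnecting gives up. It keeps the
/// error from every failed attempt instead of only reporting a timeout.
#[derive(Debug)]
pub struct ReconnectTimedOut<E> {
    pub elapsed: Duration,
    pub attempts: Vec<E>,
}

impl<E: fmt::Display> fmt::Display for ReconnectTimedOut<E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "gave up reconnecting after {:?} and {} attempts",
            self.elapsed,
            self.attempts.len()
        )?;
        // Surface the most recent cause in the message; earlier causes
        // stay available through the `attempts` field.
        if let Some(last) = self.attempts.last() {
            write!(f, "; last error: {}", last)?;
        }
        Ok(())
    }
}

impl<E: fmt::Debug + fmt::Display> Error for ReconnectTimedOut<E> {}
```

The trade-off is an extra type parameter and a growing `Vec` on the error path, which is part of why this is hard to make ergonomic.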
from tower.
@danburkert I think that request-level retry strategies should be handled at a higher level. Reconnect should really only be addressing the layer 4 concern of establishing a transport. I'd expect this to be wrapped with a connection pool, the connection pool to be wrapped with a balancer, and then retries to wrap the balancer. I'm in the process of writing up some general plans for linkerd/linkerd2#475 -- would love your input once that's up.
from tower.
Ah ok, seems I misunderstood the requirements here. If you are reconnecting just the layer 4 transport (say, TCP), how do you handle application-level negotiations that need to take place, like TLS, SASL, and custom handshakes?
In general doing anything more complex than layer 4 means that the reconnect logic needs to be able to distinguish between retriable and non-retriable errors.
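As an illustration of that distinction, the sketch below classifies errors through a small trait. The `Retriable` trait and the `ConnectError` enum are hypothetical, and the choice of which `io::ErrorKind` values count as transient is an assumption that a real implementation would need to tune.

```rust
use std::io;

/// Hypothetical classification trait; not part of tower.
pub trait Retriable {
    fn is_retriable(&self) -> bool;
}

impl Retriable for io::Error {
    fn is_retriable(&self) -> bool {
        // Assumption: these kinds are usually transient at the
        // transport level, so a later attempt may succeed.
        match self.kind() {
            io::ErrorKind::ConnectionRefused
            | io::ErrorKind::ConnectionReset
            | io::ErrorKind::ConnectionAborted
            | io::ErrorKind::TimedOut
            | io::ErrorKind::Interrupted => true,
            _ => false,
        }
    }
}

/// Hypothetical error for a connect step that includes a handshake.
#[derive(Debug)]
pub enum ConnectError {
    /// The layer 4 transport failed.
    Transport(io::Error),
    /// The peer rejected the handshake (for example, bad credentials);
    /// retrying with the same input is not expected to help.
    HandshakeRejected(String),
}

impl Retriable for ConnectError {
    fn is_retriable(&self) -> bool {
        match self {
            ConnectError::Transport(err) => err.is_retriable(),
            ConnectError::HandshakeRejected(_) => false,
        }
    }
}
```

A reconnect loop could then consult `is_retriable()` before consuming the next backoff delay and fail immediately otherwise.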
from tower.