bufbuild / httplb
Client-side load balancing for net/http
Home Page: https://pkg.go.dev/github.com/bufbuild/httplb
License: Apache License 2.0
In dual-stack IPv4+IPv6 environments, it is not always possible to know with certainty whether or not an IPv6 address is actually routable from the current environment. Especially in containerized environments, oftentimes “dual stack” IPv4+IPv6 is actually an IPv4 stack with IPv6 loopback, or a similar configuration where Internet IPv6 addresses do not route properly. Therefore, so-called “Happy Eyeballs” algorithms, as proposed by RFC 6555, are employed in any situation where dual-stack IPv4+IPv6 networking is encountered to ensure a swift fallback when IPv6 networking is non-functional.
In Go’s net stack, this is implemented at the net.Dialer level: when resolving addresses, the net.Dialer will choose IPv6 addresses in preference as “primary” addresses, and IPv4 addresses as “fallback” addresses. The net.Dialer performs a race, first attempting to connect to the IPv6 address and, shortly after, attempting to connect to the IPv4 address concurrently. The first connection that succeeds is used, and the other is closed and discarded.
In httplb, this doesn’t work: httplb creates individual connections to hosts, resolving names before the net.Dialer is called. This means the net.Dialer cannot perform its RFC 6555 race.
This proposal asserts that the core problem with httplb today is that the Resolver returns unstructured addresses. This is appropriate for DNS, because DNS does not carry any information about hosts: it only has records that point to addresses. However, a clean IPv6-to-IPv4 fallback cannot be implemented because the Resolver does not return enough information. In an ideal world, the Resolver would return both the IPv4 and IPv6 address(es) for a host as a single unit.
The current default behavior of httplb at HEAD is to prefer IPv4 addresses when present, only using IPv6 addresses when there are no IPv4 addresses present. This is probably widely compatible with existing deployments, and it is likely better than using both the resolved IPv4 and IPv6 addresses: some of those addresses very likely point to the same hosts, so in dual-stack environments using both would likely lead to improperly balanced load.
In order to rectify this, this proposal suggests that the Resolver return a slice of Host structures, which shall have:
For more advanced resolvers, these values may be able to be filled in some logical fashion that is able to support a proper RFC 6555 fallback. DNS however does not provide enough information, so any solution to this problem will carry at least some downsides.
There is no ideal way to implement this with DNS, so a heuristic must be used. I suggest the following:
The net effect of this is that the fallback IPv4 address for a given IPv6 is effectively arbitrary; this means that if any of the hosts are unhealthy, the balancing may become uneven when IPv4 fallback addresses are used and result in multiple entries in the pool balancing to the same host. Unfortunately, there's really no way to prevent this from happening with DNS alone, but I believe this is a better overall outcome, as the result today is that not explicitly specifying IPv4 or IPv6 results in other suboptimal behavior, potentially making adoption of httplb difficult in systems that might need to tolerate a large variety of possible production environments.
Right now the way that the transport implementation handles targeting is by overriding the host in the URL, which requires some overhead:
http.Request needs to be at least shallow copied.
In order to allow for this fallback behavior, we need to move this override to a lower level. This gives us the opportunity to lower the overhead of the transport implementation in the vast majority of cases, since there are far fewer cases where the request or URL will need to be cloned, and the TLS configuration will never need to be patched.
It is possible for users to provide a custom Dial function. We could wrap this again into a custom dial function that performs the RFC 6555 Dial race using the underlying implementation. The downside here is that we need to implement this race ourselves, though it is not insurmountable.
*net.Resolver
It is challenging but possible to override the behavior of *net.Resolver. This can be done by setting the PreferGo field to true and setting the Dial function to return an in-memory net.Pipe() that speaks DNS, ideally using the x/net/dns/dnsmessage package. (I recently did this in my test implementation.)
While this looks ugly, it seems like it is actually intended by the Go developers, and despite the text on PreferGo being somewhat unclear, it will in fact work on all platforms:
```go
if runtime.GOOS == "plan9" {
	// TODO(bradfitz): for now we only permit use of the PreferGo
	// implementation when there's a non-nil Resolver with a
	// non-nil Dialer. This is a sign that the code is trying
	// to use their DNS-speaking net.Conn (such as an in-memory
	// DNS cache) and they don't want to actually hit the network.
	// Once we add support for looking the default DNS servers
	// from plan9, though, then we can relax this.
	if r == nil || r.Dial == nil {
		return false
	}
}
```
Furthermore, while this is seemingly intended to work for the foreseeable future, there also seems to be intent to implement this more properly in the future:
// TODO(bradfitz): optional interface impl override hook
By doing this, we gain the ability to tell net.Dial about the list of hosts, and it should be able to perform graceful IPv6 fallback, probably avoiding the fallback delay as necessary.
The only problem with this approach is that it relies on being able to override the Resolver field of the net.Dialer, which precludes the ability to specify a Dial function. We would need to refactor this API so that the *net.Resolver gets passed back to the user so they can use it in their Dial function implementation. (We may need to overhaul the way the override works; it should probably be a function that returns a Dial function, given a *net.Resolver.)
The advantage of this approach is that if applied properly, it should give us fallback behavior that is very close to what Go is able to offer out-of-the-box.
The override needs to move to the Dial level. There are two potential approaches:
1. Implement the RFC 6555 Dial race using the Dial function or default dialer directly. This would not require changes to the API, but it would require implementing the nuances of the dial race by hand.
2. Change the Dial function API so that we can pass a custom *net.Resolver to be used for dialing. Allow the real hostname of the target to pass down to the Dial function. Implement custom *net.Resolver behavior so that we can return only the records appropriate for the individual host entry. This would allow the default RFC 6555 dial race implemented in the Go net package to do most of the work.
When a set of servers is behind a layer-4 proxy, the default behavior of an httplb.Client will not result in meaningful load distribution. In such a case, DNS resolution typically returns a single address, a VIP for the proxy. But since it's a layer-4 proxy, it will not distribute requests to multiple backends: instead, a single connection is effectively pinned to a single backend behind the proxy.
The first thing needed to work in this scenario is a way to make the client establish multiple connections, even if to the same backend address. The layer-4 load balancer will then distribute these connections to multiple backends. That way, picking from these multiple connections (for example, with the default round-robin algorithm) results in distribution of requests to multiple backends.
The above is not by itself sufficient because it is possible that multiple connections still have low backend diversity. This could easily happen when the total number of connections in the client is low and so is the number of backends. In this case, just by chance, the proxy could route multiple (even all) connections to the same backend instance. This can also happen in some operational scenarios, such as rolling restarts, where one node is restarted and the underlying "virtual" connection reconnects. If some of the backends are unavailable (due to the restart), there is a greater chance that these reconnects find their way to the same backend.
So, to work around the potential issue of low backend diversity, we also need a way to recycle connections by periodically forcing "virtual connections" to reconnect. While we can't prevent the situation of low backend diversity, since there is no way to influence how a connection through a layer-4 proxy gets routed, periodically recycling connections allows the client to "heal" its connection pool if/when it does get into this state.
Re-connecting needs to be graceful, so that any in-progress operations on the old connection are allowed to complete, uninterrupted, but the new connection is used for all new operations.
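The graceful hand-off can be sketched with simple in-flight counting. This is an illustration under assumptions, not httplb's implementation: the vconn and slot types are hypothetical, and a real client would open and close actual connections where this sketch only flips flags.

```go
package main

import (
	"fmt"
	"sync"
)

// vconn is a hypothetical stand-in for one "virtual connection".
type vconn struct {
	id       int
	inflight int  // operations currently using this connection
	retired  bool // true once a replacement has taken over
	closed   bool
}

// slot holds the connection used for new operations. The mutex guards the
// slot and every vconn it has handed out.
type slot struct {
	mu      sync.Mutex
	current *vconn
	nextID  int
}

func newSlot() *slot {
	return &slot{current: &vconn{id: 1}, nextID: 1}
}

// acquire marks one operation as in flight on the current connection.
func (s *slot) acquire() *vconn {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.current.inflight++
	return s.current
}

// release ends an operation; a retired connection closes once it drains.
func (s *slot) release(c *vconn) {
	s.mu.Lock()
	defer s.mu.Unlock()
	c.inflight--
	if c.retired && c.inflight == 0 {
		c.closed = true
	}
}

// recycle swaps in a fresh connection. The old one stops receiving new
// operations immediately but stays open until in-flight work completes.
func (s *slot) recycle() {
	s.mu.Lock()
	defer s.mu.Unlock()
	old := s.current
	s.nextID++
	s.current = &vconn{id: s.nextID}
	old.retired = true
	if old.inflight == 0 {
		old.closed = true
	}
}

func main() {
	s := newSlot()
	inProgress := s.acquire()                      // an operation starts on connection 1
	s.recycle()                                    // connection 2 takes over for new work
	fmt.Println(inProgress.closed, s.acquire().id) // false 2
	s.release(inProgress)
	fmt.Println(inProgress.closed) // true
}
```

Calling recycle from a timer, for instance a time.Ticker, would give the periodic behavior described above.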