bufbuild / httplb

Client-side load balancing for net/http

Home Page: https://pkg.go.dev/github.com/bufbuild/httplb

License: Apache License 2.0

Languages: Go 99.11%, Makefile 0.89%

Topics: golang, http, load-balancer, connectrpc

httplb's People

Contributors

akshayjshah, chrispine, dependabot[bot], emcfarlane, jchadwick-buf, jhump

Forkers

tsingson

httplb's Issues

Proposal: Add Host concept to Resolver and implement RFC 6555

Overview

In dual-stack IPv4+IPv6 environments, it is not always possible to know with certainty whether an IPv6 address is actually routable from the current environment. Especially in containerized environments, “dual stack” IPv4+IPv6 is often actually an IPv4 stack with IPv6 loopback, or a similar configuration where Internet IPv6 addresses do not route properly. Therefore, so-called “Happy Eyeballs” algorithms, as proposed by RFC 6555, are commonly employed wherever dual-stack IPv4+IPv6 networking is encountered, to ensure a swift fallback when IPv6 networking is non-functional.

In Go’s net stack, this is implemented at the net.Dialer level: when resolving addresses, the net.Dialer treats IPv6 addresses as preferred “primary” addresses and IPv4 addresses as “fallback” addresses. It first attempts to connect to the IPv6 address and, after a short delay (the Dialer’s FallbackDelay, 300ms by default), starts a concurrent attempt to the IPv4 address. The first connection that succeeds is used; the other is closed and discarded.

In httplb, this doesn’t work: httplb creates individual connections to hosts, resolving names before the net.Dialer is called. This means the net.Dialer cannot perform its RFC 6555 race.
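As a rough illustration of the primary/fallback split described above, the following sketch partitions a resolved address list by family: the family of the first address becomes "primary" and the other family becomes "fallback". It is loosely modeled on how the dialer is described here, not copied from the Go source; the names partition and mustAddrs are made up for this example.

```go
package main

import (
	"fmt"
	"net/netip"
)

// mustAddrs is a hypothetical helper that parses literal IP addresses.
func mustAddrs(ss ...string) []netip.Addr {
	out := make([]netip.Addr, 0, len(ss))
	for _, s := range ss {
		out = append(out, netip.MustParseAddr(s))
	}
	return out
}

// partition splits addrs into primaries and fallbacks. The family of the
// first address decides which family is primary; every address of the
// other family becomes a fallback. Order within each group is preserved.
func partition(addrs []netip.Addr) (primaries, fallbacks []netip.Addr) {
	if len(addrs) == 0 {
		return nil, nil
	}
	firstIs4 := addrs[0].Unmap().Is4()
	for _, a := range addrs {
		if a.Unmap().Is4() == firstIs4 {
			primaries = append(primaries, a)
		} else {
			fallbacks = append(fallbacks, a)
		}
	}
	return primaries, fallbacks
}

func main() {
	p, f := partition(mustAddrs("2001:db8::1", "192.0.2.1", "2001:db8::2"))
	fmt.Println("primaries:", p)
	fmt.Println("fallbacks:", f)
}
```

Because httplb hands the dialer a single, already-resolved address, a split like this never has two groups to race.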

Proposal

This proposal asserts that the core problem with httplb today is that the Resolver returns unstructured addresses. This is appropriate for DNS, because DNS does not carry any information about hosts: it only has records that point to addresses. However, a clean IPv6-to-IPv4 fallback cannot be implemented because the Resolver does not return enough information. In an ideal world, the Resolver would return both the IPv4 and IPv6 address(es) for a host as a single unit.

The current default behavior of httplb at HEAD is to prefer IPv4 addresses when present, using IPv6 addresses only when no IPv4 addresses are present. This is probably widely compatible with existing deployments. It is also likely better than using both the resolved IPv4 and IPv6 addresses: in dual-stack environments some of those addresses very likely point to the same hosts, so using both would lead to improperly balanced load.

Resolver Interface

In order to rectify this, this proposal suggests that the Resolver return a slice of Host structures, which shall have:

  • Primary address. In a dual-stack system, this would probably be an IPv6 address to the host.
  • Fallback address. Optional. In a dual-stack system, this would probably be an IPv4 address to the host.
  • Host-level attributes. Addresses may still have attributes if it still seems useful.

More advanced resolvers may be able to fill in these values in a way that supports a proper RFC 6555 fallback. DNS, however, does not provide enough information, so any solution to this problem will carry at least some downsides.
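One possible shape for the Host structure described above is sketched here. The type, its field names, and the use of a plain string map for attributes are all assumptions made for illustration; the proposal does not fix any of them, and httplb's actual types may differ.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Host is a hypothetical resolver result: one logical host with a primary
// address, an optional fallback address, and host-level attributes.
type Host struct {
	Primary    netip.AddrPort
	Fallback   netip.AddrPort    // zero value means "no fallback"
	Attributes map[string]string // placeholder for host-level attributes
}

// HasFallback reports whether a fallback address was set.
func (h Host) HasFallback() bool { return h.Fallback.IsValid() }

// Addrs returns the addresses to try, primary first.
func (h Host) Addrs() []netip.AddrPort {
	if h.HasFallback() {
		return []netip.AddrPort{h.Primary, h.Fallback}
	}
	return []netip.AddrPort{h.Primary}
}

// newHost is a hypothetical constructor from "ip:port" literals; pass an
// empty fallback string for a single-address host.
func newHost(primary, fallback string) Host {
	h := Host{Primary: netip.MustParseAddrPort(primary)}
	if fallback != "" {
		h.Fallback = netip.MustParseAddrPort(fallback)
	}
	return h
}

func main() {
	dual := newHost("[2001:db8::1]:443", "192.0.2.1:443")
	single := newHost("192.0.2.1:443", "")
	fmt.Println(dual.Addrs(), dual.HasFallback())
	fmt.Println(single.Addrs(), single.HasFallback())
}
```

A Resolver under this proposal would then return a slice of such values instead of a flat slice of addresses.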

DNS Resolver implementation

There is no ideal way to implement this with DNS, so a heuristic must be used. I suggest the following:

  • If there are only IPv4 or only IPv6 addresses in the results, use each address as a unique host.
  • If there are both IPv6 and IPv4 addresses in the results, use each IPv6 address as a unique host and set the fallback address to an arbitrary IPv4 address (probably just pairing them in the order they are returned by the resolver). If there are fewer IPv4 addresses than IPv6 addresses, cycle through the IPv4 addresses in order. In the event that the resolver returns addresses in any logical order, this may result in host addresses being assigned correctly. In the event that there is a single anycast IPv4 address but multiple IPv6 addresses, it will result in the anycast IPv4 address being treated as the fallback for any of the IPv6 hosts.
  • Store the results between resolver runs. Before running the above algorithm, check each host from the last set of results; if both the primary and fallback address are still present, remove them from the list and add the old result back to the new result. (Since the fallback may appear multiple times, it should still be considered when checking the remaining previous results.) This ensures that resolution-to-resolution, mappings will be stable. If the resolver returns a different set of hosts each time, the default DNS resolver will already cause problems, so long-term stability in these mappings is not a concern.

The net effect is that the fallback IPv4 address for a given IPv6 address is effectively arbitrary. If any of the hosts are unhealthy, balancing may become uneven when IPv4 fallback addresses are used, because multiple entries in the pool may end up balancing to the same host. Unfortunately, there is really no way to prevent this with DNS alone. I still believe this is a better overall outcome: today, not explicitly specifying IPv4 or IPv6 results in other suboptimal behavior, which could make httplb difficult to adopt in systems that need to tolerate a large variety of production environments.
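The pairing heuristic described above can be sketched as follows. This is a minimal interpretation, not a definitive implementation: the host type and the pairHosts and mustAddrs names are hypothetical, and two details the proposal leaves open are filled in by assumption. Surplus IPv4 addresses are simply left unused, and when every IPv4 address has already been carried over from a previous pairing, new IPv6 addresses cycle through the full IPv4 list.

```go
package main

import (
	"fmt"
	"net/netip"
)

// host pairs a primary address with an optional fallback (zero = none).
type host struct {
	primary, fallback netip.Addr
}

// mustAddrs is a hypothetical helper that parses literal IP addresses.
func mustAddrs(ss ...string) []netip.Addr {
	out := make([]netip.Addr, 0, len(ss))
	for _, s := range ss {
		out = append(out, netip.MustParseAddr(s))
	}
	return out
}

// pairHosts turns one DNS result set into hosts. prev is the output of the
// previous run and is used to keep existing pairings stable.
func pairHosts(prev []host, addrs []netip.Addr) []host {
	var v4, v6 []netip.Addr
	present := make(map[netip.Addr]bool, len(addrs))
	for _, a := range addrs {
		present[a] = true
		if a.Unmap().Is4() {
			v4 = append(v4, a)
		} else {
			v6 = append(v6, a)
		}
	}
	// Single-family results: every address is its own host.
	if len(v4) == 0 || len(v6) == 0 {
		hosts := make([]host, 0, len(addrs))
		for _, a := range addrs {
			hosts = append(hosts, host{primary: a})
		}
		return hosts
	}
	// Carry over old pairings whose two addresses are both still present.
	// A fallback may be shared, so it stays eligible for later old hosts.
	var hosts []host
	used := make(map[netip.Addr]bool)
	for _, h := range prev {
		if h.fallback.IsValid() && present[h.primary] && present[h.fallback] && !used[h.primary] {
			hosts = append(hosts, h)
			used[h.primary], used[h.fallback] = true, true
		}
	}
	// Pair the remaining IPv6 addresses with the remaining IPv4 addresses
	// in resolver order, cycling when IPv4 addresses run out.
	var rem4 []netip.Addr
	for _, a := range v4 {
		if !used[a] {
			rem4 = append(rem4, a)
		}
	}
	if len(rem4) == 0 {
		rem4 = v4 // assumption: reuse already-paired IPv4 addresses
	}
	i := 0
	for _, a := range v6 {
		if !used[a] {
			hosts = append(hosts, host{primary: a, fallback: rem4[i%len(rem4)]})
			i++
		}
	}
	return hosts
}

func main() {
	first := pairHosts(nil, mustAddrs("2001:db8::1", "2001:db8::2", "192.0.2.1"))
	second := pairHosts(first, mustAddrs("192.0.2.2", "2001:db8::2", "192.0.2.1", "2001:db8::1"))
	fmt.Println(first)
	fmt.Println(second)
}
```

In the usage above, the second resolution returns the same addresses in a different order plus one new IPv4 address, and the existing pairings are carried over unchanged.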

Transport implementation

Right now the way that the transport implementation handles targeting is by overriding the host in the URL, which requires some overhead:

  • The URL almost always needs to be cloned, which means the http.Request needs to be at least shallow copied.
  • The TLS configuration needs to be patched.
  • The Host header needs to be fixed.

In order to allow for this fallback behavior, we need to move this override to a lower level. This gives us the opportunity to lower the overhead of the transport implementation in the vast majority of cases, since there are far fewer cases where the request or URL will need to be cloned, and the TLS configuration will never need to be patched.

Implement RFC 6555 fallback manually

It is possible for users to provide a custom Dial function. We could wrap this again into a custom dial function that performs the RFC 6555 Dial race using the underlying implementation. The downside here is that we need to implement this race ourselves, though that is not an insurmountable amount of work.
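A hand-rolled race over a user-supplied dial function might look roughly like the sketch below. It is a simplified illustration under stated assumptions, not a full RFC 6555 implementation: it handles exactly one primary and one fallback address, and the names dialFunc, raceDial, fakeDial, and winner are hypothetical. The fake dialer exists only so the example runs without a network.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"time"
)

// dialFunc matches the signature of (*net.Dialer).DialContext.
type dialFunc func(ctx context.Context, network, addr string) (net.Conn, error)

// raceDial dials primary first. If primary fails, or has not finished
// after delay, it also dials fallback. The first success wins and any
// later connection is closed.
func raceDial(ctx context.Context, dial dialFunc, network, primary, fallback string, delay time.Duration) (net.Conn, error) {
	if fallback == "" {
		return dial(ctx, network, primary)
	}
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // aborts the losing attempt once we return

	type result struct {
		conn net.Conn
		err  error
	}
	results := make(chan result, 2) // buffered: neither attempt blocks on send
	start := func(addr string) {
		go func() {
			c, err := dial(ctx, network, addr)
			results <- result{c, err}
		}()
	}
	start(primary)
	timer := time.NewTimer(delay)
	defer timer.Stop()

	pending, fallbackStarted := 1, false
	startFallback := func() {
		if !fallbackStarted {
			fallbackStarted = true
			pending++
			start(fallback)
		}
	}
	var errs []error
	for pending > 0 {
		select {
		case <-timer.C:
			startFallback()
		case r := <-results:
			pending--
			if r.err == nil {
				if pending > 0 { // close the loser if it connects later
					go func() {
						if l := <-results; l.conn != nil {
							l.conn.Close()
						}
					}()
				}
				return r.conn, nil
			}
			errs = append(errs, r.err)
			startFallback() // primary failed early: do not wait for the timer
		}
	}
	return nil, errors.Join(errs...)
}

// fakeConn and fakeDial stand in for real connections in this example.
type fakeConn struct {
	net.Conn
	addr string
}

func fakeDial(failing string) dialFunc {
	return func(ctx context.Context, network, addr string) (net.Conn, error) {
		if addr == failing {
			return nil, fmt.Errorf("dial %s: unreachable", addr)
		}
		client, server := net.Pipe()
		server.Close()
		return fakeConn{Conn: client, addr: addr}, nil
	}
}

// winner reports which address the race connected to.
func winner(failing string) (string, error) {
	conn, err := raceDial(context.Background(), fakeDial(failing), "tcp",
		"[2001:db8::1]:443", "192.0.2.1:443", 50*time.Millisecond)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	return conn.(fakeConn).addr, nil
}

func main() {
	fmt.Println(winner(""))                  // primary reachable
	fmt.Println(winner("[2001:db8::1]:443")) // primary unreachable
}
```

The real nuances (multiple addresses per family, per-attempt deadlines, error reporting that favors the primary) are what makes doing this by hand more work than delegating to net.Dialer.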

Implement a custom *net.Resolver

It is challenging but possible to override the behavior of *net.Resolver. This can be done by setting the PreferGo field to true and setting the Dial function to return an in-memory net.Pipe() that speaks DNS, ideally using the x/net/dns/dnsmessage package. (I recently did this in my test implementation.)

While this looks ugly, it seems like it is actually intended by the Go developers, and despite the text on PreferGo being somewhat unclear, it will in fact work on all platforms:

	if runtime.GOOS == "plan9" {
		// TODO(bradfitz): for now we only permit use of the PreferGo
		// implementation when there's a non-nil Resolver with a
		// non-nil Dialer. This is a sign that the code is trying
		// to use their DNS-speaking net.Conn (such as an in-memory
		// DNS cache) and they don't want to actually hit the network.
		// Once we add support for looking the default DNS servers
		// from plan9, though, then we can relax this.
		if r == nil || r.Dial == nil {
			return false
		}
	}

Furthermore, while this is seemingly intended to keep working for the foreseeable future, there also seems to be intent to eventually implement this more properly:

	// TODO(bradfitz): optional interface impl override hook

By doing this, we gain the ability to tell net.Dial about the list of hosts, and it should be able to perform graceful IPv6 fallback, probably avoiding the fallback delay as necessary.

The only problem with this approach is that it relies on being able to override the Resolver field of the net.Dialer, which precludes the ability to specify a Dial function. We would need to refactor this API so that the *net.Resolver gets passed back to the user so they can use it in their Dial function implementation. (We may need to overhaul the way the override works; it should probably be a function that returns a Dial function, given a *net.Resolver.)

The advantage of this approach is that if applied properly, it should give us fallback behavior that is very close to what Go is able to offer out-of-the-box.

Summary

  • This proposal asserts that it is a design defect to treat mixed IPv4+IPv6 records as their own hosts: in most cases, mixed results would result from the same hosts with both IPv4 and IPv6 endpoints, and thus contain duplicate hosts. Resolvers should return not just addresses, but hosts, which contain possibly multiple addresses.
  • DNS itself does not contain enough information to know which addresses belong to the same hosts, so this proposal suggests a simple algorithm that arbitrarily pairs IPv6 addresses with IPv4 addresses in the order they appear in the result set. In subsequent resolutions, the mappings will be kept consistent as long as both the IPv6 and IPv4 address are still present in the new results.
  • Since a host could have multiple addresses, the current approach to mapping to a specific target will not work. This gives us an opportunity to remove some hacks by instead implementing the override on the Dial level. There are two potential approaches:
    • Implement an RFC 6555-style Dial race using the Dial function or default dialer directly. This would not require changes to the API, but it would require implementing the nuances of the dial race by hand.
    • Refactor the custom Dial function API so that we can pass a custom *net.Resolver to be used for dialing. Allow the real hostname of the target to pass down to the Dial function. Implement custom *net.Resolver behavior so that we can return only the records appropriate for the individual host entry. This would allow the default RFC 6555 dial race implemented in the Go net package to do most of the work.

Better support for layer-4 (TCP) proxies

When a set of servers is behind a layer-4 proxy, the default behavior of an httplb.Client will not result in meaningful load distribution. In such a case, DNS resolution typically returns a single address, a VIP for the proxy. But since it's a layer-4 proxy, it will not distribute requests to multiple backends: instead, a single connection is effectively pinned to a single backend behind the proxy.

The first thing needed to work in this scenario is a way to make the client establish multiple connections, even if to the same backend address. The layer-4 load balancer will then distribute these connections to multiple backends. That way, picking from these multiple connections (for example, with the default round-robin algorithm) results in distribution of requests to multiple backends.

The above is not by itself sufficient because it is possible that multiple connections still have low backend diversity. This could easily happen when the total number of connections in the client is low and so is the number of backends. In this case, just by chance, the proxy could route multiple (even all) connections to the same backend instance. This can also happen in some operational scenarios, such as rolling restarts, where one node is restarted and the underlying "virtual" connection reconnects. If some of the backends are unavailable (due to the restart), there is a greater chance that these reconnects find their way to the same backend.

So, to work around the potential issue of low backend diversity, we also need a way to recycle connections: periodically forcing "virtual connections" to re-connect. While we can't prevent the situation of low backend diversity, since there is no way to influence how a connection through a layer-4 proxy gets routed, periodically recycling connections allows the client to "heal" its connection pool if/when it does get into this state.

Re-connecting needs to be graceful, so that any in-progress operations on the old connection are allowed to complete, uninterrupted, but the new connection is used for all new operations.
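The graceful hand-off described above can be illustrated with a small reference-counting sketch. It uses a stand-in conn type rather than real network connections, and the names conn, recycler, acquire, release, and recycle are hypothetical; in a real client, recycle would presumably be driven by a timer and would dial a new connection.

```go
package main

import (
	"fmt"
	"sync"
)

// conn stands in for a real "virtual connection".
type conn struct {
	id      int
	active  int  // in-flight operations
	retired bool // no longer handed out to new operations
	closed  bool
}

// recycler hands out the current connection and swaps it on recycle.
type recycler struct {
	mu      sync.Mutex
	current *conn
	nextID  int
}

func newRecycler() *recycler {
	return &recycler{current: &conn{id: 1}, nextID: 2}
}

// acquire returns the current connection and records a new in-flight
// operation on it.
func (r *recycler) acquire() *conn {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.current.active++
	return r.current
}

// release marks one operation as finished. A retired connection is closed
// once its last in-flight operation completes.
func (r *recycler) release(c *conn) {
	r.mu.Lock()
	defer r.mu.Unlock()
	c.active--
	if c.retired && c.active == 0 {
		c.closed = true
	}
}

// recycle replaces the current connection. New operations use the new
// connection; the old one stays open until it has drained.
func (r *recycler) recycle() {
	r.mu.Lock()
	defer r.mu.Unlock()
	old := r.current
	old.retired = true
	if old.active == 0 {
		old.closed = true
	}
	r.current = &conn{id: r.nextID}
	r.nextID++
}

func main() {
	r := newRecycler()
	old := r.acquire() // an operation starts on connection 1
	r.recycle()        // connection 2 becomes current
	fresh := r.acquire()
	fmt.Println(old.id, old.closed, fresh.id)
	r.release(old) // the in-flight operation finishes
	fmt.Println(old.closed, fresh.closed)
}
```

The key property is that recycle never interrupts an operation that has already acquired the old connection.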
