Comments (16)

rnsanchez commented on June 2, 2024

Hi Willy,

I've repeated our data collection for a random HAProxy thread (patched), and it is a totally different scenario. It is worth noting that the server was operating at only 55 Gbps during data collection, but still, pl_wait_unlock is taking so little CPU that it is barely noticeable. This looks like a solid improvement! :-)

[image: haproxy-2.9.6-patch-20240305-2215]

Here's considering only HAProxy (no kernel/libc):

image

from haproxy.

felipewd commented on June 2, 2024

Sounds great! And it sounds future-proof in relation to these bigger core machines.

We intend to experiment with 200-300 Gbps machines this year, and this approach sounds great for this kind of traffic.

felipewd commented on June 2, 2024

Oops, sorry, my bad!

Like a moron, I went straight to the patch and didn't even read the "don't use peers" part. I've commented them out and it works fine, yes!

We'll report back in a few hours, once we've collected some traffic.

wtarreau commented on June 2, 2024

Super cool, thanks for the measurements! It confirms that we absolutely must finish that one for 3.0; it looks like low-hanging fruit (unless there is trouble on the peers side of course, which I hope not). In the worst case, if it proves too difficult to adapt the peers code to it, I still have a fallback in mind: I think we could take read locks on the table everywhere we touch any shard of the table, so that it remains exclusive with peers updates. But that would still be a loss, so I'd prefer to make sure peers are properly handled.
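
For readers following along, the fallback described here can be sketched roughly as below. This is not HAProxy code: `struct table`, `struct shard`, `shard_add_entry()` and `peers_count_entries()` are hypothetical names, POSIX rwlocks stand in for HAProxy's own lock primitives, and the assumption that the peers path takes the table-wide lock exclusively is an interpretation of the comment, not something confirmed by the patch.

```c
#define _POSIX_C_SOURCE 200809L /* exposes pthread_rwlock_t under strict ISO C modes */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NB_SHARDS 4 /* hypothetical shard count, kept small for the sketch */

struct shard {
    pthread_rwlock_t lock;  /* per-shard lock */
    unsigned long entries;  /* stands in for the shard's real contents */
};

struct table {
    pthread_rwlock_t lock;  /* table-wide lock */
    struct shard shards[NB_SHARDS];
};

static void table_init(struct table *t)
{
    pthread_rwlock_init(&t->lock, NULL);
    for (size_t i = 0; i < NB_SHARDS; i++) {
        pthread_rwlock_init(&t->shards[i].lock, NULL);
        t->shards[i].entries = 0;
    }
}

/* Regular traffic: shared (read) lock on the table, exclusive lock on
 * one shard only, so threads working on different shards do not wait
 * for each other. */
static void shard_add_entry(struct table *t, size_t idx)
{
    assert(idx < NB_SHARDS);
    pthread_rwlock_rdlock(&t->lock);
    pthread_rwlock_wrlock(&t->shards[idx].lock);
    t->shards[idx].entries++;
    pthread_rwlock_unlock(&t->shards[idx].lock);
    pthread_rwlock_unlock(&t->lock);
}

/* Peers-like path (assumed): exclusive lock on the table. While it is
 * held, no thread can be inside shard_add_entry(), so the shards can be
 * walked without taking their individual locks. */
static unsigned long peers_count_entries(struct table *t)
{
    unsigned long total = 0;

    pthread_rwlock_wrlock(&t->lock);
    for (size_t i = 0; i < NB_SHARDS; i++)
        total += t->shards[i].entries;
    pthread_rwlock_unlock(&t->lock);
    return total;
}
```

Every shard access still has to touch the shared table lock in this scheme, which is presumably the "loss" mentioned above.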

rnsanchez commented on June 2, 2024

Here's perf annotate:
image

wtarreau commented on June 2, 2024

Hi Felipe, Ricardo!

Thanks for this nice feedback. Here it's definitely the read lock on the stick-table that's taking a lot of time, because most of the time another thread is already keeping the table write-locked for purging an entry or just inserting one. Given that stktable_get_entry() appears on two lock branches here, I suspect that what happens is that the vast majority of waiters are in stktable_lookup_key(), which just takes a read lock, and that a few are from __stktable_store(), which is directly inlined in stktable_get_entry() and takes a write lock. The creation of a few entries is making other threads wait.
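
To make the contention pattern concrete, here is a minimal sketch of a "look up under a read lock, store under a write lock" routine. It is not the actual stktable_get_entry() code: `struct tbl`, `struct node`, `find_locked()` and `get_entry()` are hypothetical, a linked list replaces the real tree, and POSIX rwlocks replace HAProxy's own locks.

```c
#define _POSIX_C_SOURCE 200809L /* exposes pthread_rwlock_t under strict ISO C modes */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
    struct node *next;
    int key;
};

struct tbl {
    pthread_rwlock_t lock;  /* one lock shared by every key of the table */
    struct node *head;
};

static void tbl_init(struct tbl *t)
{
    pthread_rwlock_init(&t->lock, NULL);
    t->head = NULL;
}

/* Caller must hold t->lock (read or write). */
static struct node *find_locked(struct tbl *t, int key)
{
    for (struct node *n = t->head; n; n = n->next)
        if (n->key == key)
            return n;
    return NULL;
}

/* Returns the entry for <key>, creating it if needed, or NULL on
 * allocation failure. Real code would also need reference counting so
 * that the returned entry cannot be purged once the lock is released. */
static struct node *get_entry(struct tbl *t, int key)
{
    struct node *n, *fresh;

    assert(t != NULL);

    /* Common case: a read lock is enough, but it still has to wait
     * whenever another thread holds the write lock. */
    pthread_rwlock_rdlock(&t->lock);
    n = find_locked(t, key);
    pthread_rwlock_unlock(&t->lock);
    if (n)
        return n;

    fresh = calloc(1, sizeof(*fresh));
    if (!fresh)
        return NULL;
    fresh->key = key;

    /* Rare case: the write lock stalls every reader of the table. */
    pthread_rwlock_wrlock(&t->lock);
    n = find_locked(t, key);  /* another thread may have inserted it */
    if (!n) {
        fresh->next = t->head;
        t->head = fresh;
        n = fresh;
        fresh = NULL;
    }
    pthread_rwlock_unlock(&t->lock);
    free(fresh);  /* no-op when our node was the one inserted */
    return n;
}
```

With a single lock per table, even a small share of calls reaching the write-locked branch is enough to make all readers queue, which matches the behaviour described above.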

I'm wondering if we couldn't figure out a way to batch the removal of purged entries: we would allow them to linger there for a longer time than configured, hoping they get a chance to be recycled and reused; otherwise we'd have to batch-remove them, taking a lock only once for a few to a few tens of entries. Note that I'm saying this without looking in detail, of course, but that might be possible.
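
The batching idea could look roughly like the sketch below, which unlinks up to a fixed number of expired entries under a single write lock and frees them after releasing it. Everything here is illustrative: `struct table`, `struct item`, `PURGE_BATCH` and `tbl_purge_batch()` are hypothetical names, expiration dates are plain tick counters with no wrap-around handling, and the `wr_locks` counter exists only to show how many times the write lock was taken.

```c
#define _POSIX_C_SOURCE 200809L /* exposes pthread_rwlock_t under strict ISO C modes */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define PURGE_BATCH 16 /* hypothetical batch size ("a few to a few tens") */

struct item {
    struct item *next;
    unsigned int expire;  /* expiration date, in arbitrary ticks */
};

struct table {
    pthread_rwlock_t lock;
    struct item *head;
    unsigned long wr_locks;  /* write-lock acquisitions, for illustration */
};

static void tbl_init(struct table *t)
{
    pthread_rwlock_init(&t->lock, NULL);
    t->head = NULL;
    t->wr_locks = 0;
}

/* Adds one entry; returns 0 on success, -1 on allocation failure. */
static int tbl_add(struct table *t, unsigned int expire)
{
    struct item *it = malloc(sizeof(*it));

    if (!it)
        return -1;
    it->expire = expire;
    pthread_rwlock_wrlock(&t->lock);
    t->wr_locks++;
    it->next = t->head;
    t->head = it;
    pthread_rwlock_unlock(&t->lock);
    return 0;
}

/* Unlinks up to PURGE_BATCH entries expired at <now> under ONE write
 * lock, then frees them outside of the lock. Returns the number of
 * entries removed. */
static size_t tbl_purge_batch(struct table *t, unsigned int now)
{
    struct item *dead = NULL, **prev;
    size_t n = 0;

    assert(t != NULL);
    pthread_rwlock_wrlock(&t->lock);
    t->wr_locks++;
    prev = &t->head;
    while (*prev && n < PURGE_BATCH) {
        struct item *it = *prev;

        if (it->expire <= now) {
            *prev = it->next;  /* unlink from the table */
            it->next = dead;   /* chain on the private "dead" list */
            dead = it;
            n++;
        } else {
            prev = &it->next;
        }
    }
    pthread_rwlock_unlock(&t->lock);

    while (dead) {
        struct item *next = dead->next;

        free(dead);
        dead = next;
    }
    return n;
}
```

Purging 40 expired entries this way takes three write-lock acquisitions instead of 40, at the cost of letting expired entries linger until the next batch runs.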

Another possibility could be to "shard" the keys: instead of having a single tree head in the table, we could have, say, 64, and hash the key so that there are 64 independent sub-tables. Of course this would require moving the expirations there as well, so that we can have one lock per sub-table instead of one shared lock. Maybe that approach would be more scalable by the way, and should be considered with priority as it doesn't sound super complicated.
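
A minimal sketch of the sharding idea, under stated assumptions: 64 sub-tables selected by a hash of the key, each with its own lock. The names (`struct sharded_table`, `key_to_shard()`, `table_lookup()`, `table_touch()`) are hypothetical, FNV-1a is just a convenient illustrative hash, linked lists replace the real trees, and expiration handling is left out entirely.

```c
#define _POSIX_C_SOURCE 200809L /* exposes pthread_rwlock_t under strict ISO C modes */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NB_SHARDS 64 /* the "say, 64" from the comment above */
#define KEY_MAX   32 /* hypothetical maximum key size, including the NUL */

struct entry {
    struct entry *next;
    char key[KEY_MAX];
    unsigned int hits;
};

struct shard {
    pthread_rwlock_t lock;  /* protects only this shard's list */
    struct entry *head;
};

struct sharded_table {
    struct shard shards[NB_SHARDS];
};

/* FNV-1a, used only as a simple illustrative hash. */
static uint32_t hash_key(const char *key)
{
    uint32_t h = 2166136261u;

    for (; *key; key++) {
        h ^= (unsigned char)*key;
        h *= 16777619u;
    }
    return h;
}

static void table_init(struct sharded_table *t)
{
    for (size_t i = 0; i < NB_SHARDS; i++) {
        pthread_rwlock_init(&t->shards[i].lock, NULL);
        t->shards[i].head = NULL;
    }
}

/* The same key always maps to the same shard. */
static struct shard *key_to_shard(struct sharded_table *t, const char *key)
{
    return &t->shards[hash_key(key) % NB_SHARDS];
}

/* Returns the hit count for <key>, or -1 if the key is unknown. Only
 * the key's own shard is read-locked. */
static int table_lookup(struct sharded_table *t, const char *key)
{
    struct shard *s = key_to_shard(t, key);
    int ret = -1;

    pthread_rwlock_rdlock(&s->lock);
    for (struct entry *e = s->head; e; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            ret = (int)e->hits;
            break;
        }
    }
    pthread_rwlock_unlock(&s->lock);
    return ret;
}

/* Creates <key> if needed and counts one hit. Returns 0 on success,
 * -1 on allocation failure. Only the key's own shard is write-locked,
 * so the other 63 shards remain available to other threads. */
static int table_touch(struct sharded_table *t, const char *key)
{
    struct shard *s = key_to_shard(t, key);
    struct entry *e;
    int ret = 0;

    assert(strlen(key) < KEY_MAX);
    pthread_rwlock_wrlock(&s->lock);
    for (e = s->head; e; e = e->next)
        if (strcmp(e->key, key) == 0)
            break;
    if (!e) {
        e = calloc(1, sizeof(*e));
        if (e) {
            strcpy(e->key, key);
            e->next = s->head;
            s->head = e;
        } else {
            ret = -1;
        }
    }
    if (e)
        e->hits++;
    pthread_rwlock_unlock(&s->lock);
    return ret;
}
```

Because there is no relation between keys, nothing in this sketch needs to hold two shard locks at once; anything that must see the whole table (a full dump, for instance) would have to visit every shard in turn.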

felipewd commented on June 2, 2024

Hi @wtarreau thanks for the quick reply!

The sharding idea sounds great! Call me crazy, but this "one lock per sub-table instead of 1 shared lock" just reminded me of the old BKL removal patches back in the day :-)

Anyway, this approach does sound more scalable indeed. Let us know how we can help.

wtarreau commented on June 2, 2024

We already do that at other places, but not all places are compatible with this. Stick-tables are special in that there is no relation between keys, so we can afford to store them at different places. In the worst case it will be the "show table" that will list them in whatever order. No big deal.

wtarreau commented on June 2, 2024

Do you have a test platform to test totally experimental code? I gave it a try to see if there was any showstopper, and it went relatively smoothly. The only limit is that "show table" as well as the Lua table scan currently only list the first shard. But I'd be interested in knowing if:

  • you observe some crashes, indicating that some of the locks also protect other areas and cannot be sharded like this;
  • it gets significantly better or not (at least in callgraphs)

felipewd commented on June 2, 2024

@wtarreau we can put it in a smaller production machine to give it a go, yes :-)

wtarreau commented on June 2, 2024

OK, here it comes for 3.0-master, I think it should be OK on top of 2.9 since these parts do not change often. It has received very little testing (it builds and passes the regtests that are not sensitive to show table).
0001-EXP-stktable-try-to-split-the-keys-across-multiple-s.patch.txt

felipewd commented on June 2, 2024

Hey @wtarreau, it crashed with 100% CPU after just a few seconds.

A crash.log from the core is attached.

We have the bin + core if you want.

wtarreau commented on June 2, 2024

Based on the dump, something's got corrupted, too many waiters at the same place. It's possible that I missed an unlock somewhere, I'll need to recheck in depth. Hmmm I also just found stksess_kill_if_expired() in stick-table.h that needs to be converted to use the shards as well. OK let's forget that patch for now, I'll have more work to do on it. Thanks for the test. Don't worry, I'm still having hopes, but at the moment I'm quite busy on a ton of other stuff that I absolutely need to finish.

wtarreau commented on June 2, 2024

OK better now, I fixed the other points I found. A test config on a 24-core EPYC jumped from 358k rps to 2.35M, which tells me it mostly works :-)

However, the peers code takes the table lock and will continue to access the entries unlocked. I don't know this part well enough to do a quick hack there. Thus if you want to test it, I suggest that you comment out all your peers (or point them to closed ports so that they don't learn anything).

Here's the updated patch, in case you can give it a try. Sorry, it has the same name (the updated one mentions peers at the end of the commit message).

0001-EXP-stktable-try-to-split-the-keys-across-multiple-s.patch.txt

felipewd commented on June 2, 2024

Hello @wtarreau, still no luck: it crashes right after start (with 100% CPU).

I've applied it on 2.9.6 and 3.0-dev4 and got core dumps from both of them, if you want to take a look.

Backtraces:
crash-2.9.log
crash-3.0-dev4.log

wtarreau commented on June 2, 2024

Excellent, I'm indeed impatient to know how much it improves the situation for you!
