Comments (16)
Hi Willy,
I've repeated our data collection for a random HAProxy thread (patched), and it is a totally different scenario. It is worth noting that the server was operating at only 55 Gbps during data collection, but still, pl_wait_unlock is taking so little CPU that it is barely noticeable. This looks like a solid improvement! :-)
Here's considering only HAProxy (no kernel/libc):
from haproxy.
Sounds great! And it sounds future-proof in relation to these bigger core machines.
We intend to experiment with 200-300 Gbps machines this year, and this approach sounds great for this kind of traffic.
from haproxy.
Oops, sorry, my bad!
Like a moron, I went straight to the patch and didn't even read the "don't use peers" part. I've commented them out and it works fine, yes!
We'll report back after a few hours to collect some traffic.
from haproxy.
Super cool, thanks for the measurements! It confirms that we must absolutely finish that one for 3.0; it seems like low-hanging fruit (unless there is trouble on the peers side of course, which I hope not). In the worst case, if it proves too difficult to adapt the peers code to it, I still have a fallback in mind: I think we could take read locks on the table everywhere we touch any shard of the table, so that it remains exclusive with peers updates. But that would still be a loss, so I'd prefer to make sure peers are properly handled.
from haproxy.
Hi Felipe, Ricardo!
Thanks for this nice feedback. Here it's definitely the read lock on the stick-table that's taking a lot of time, because most of the time another thread is already holding the table write-locked to purge an entry or just to insert one. Given that stktable_get_entry() appears on two lock branches here, I suspect that the vast majority of waiters are in stktable_lookup_key(), which just takes a read lock, and that a few are from __stktable_store(), which is directly inlined in stktable_get_entry() and takes a write lock. The creation of a few entries is making other threads wait.
I'm wondering if we couldn't figure out a way to batch the removal of purged entries: we would allow them to linger there for longer than configured, hoping they get a chance to be recycled and reused; otherwise we'd batch-remove them, taking the lock only once for a few to a few tens of entries. Note that I'm saying this without looking in detail, of course, but that might be possible.
Another possibility could be to "shard" the keys: instead of having a single tree head in the table, we could have, say, 64, and hash the key so that there are 64 independent sub-tables. Of course this would require moving the expirations there as well, so that we can have one lock per sub-table instead of one shared lock. Maybe that approach would be more scalable, by the way, and it should be considered with priority as it doesn't sound super complicated.
from haproxy.
Hi @wtarreau thanks for the quick reply!
The sharding idea sounds great! Call me crazy, but this "one lock per sub-table instead of 1 shared lock" just reminded me of the old BKL removal patches back in the day :-)
Anyway, this approach does sound more scalable indeed. Let us know how we can help.
from haproxy.
We already do that at other places, but not all places are compatible with this. Stick-tables are special in that there is no relation between keys, so we can afford to store them at different places. In the worst case it will be the "show table" that will list them in whatever order. No big deal.
from haproxy.
Do you have a test platform to test totally experimental code? I gave it a try to see if there was any showstopper, and it went relatively smoothly. The only limitation is that "show table" as well as the Lua table scan currently only list the first shard. But I'd be interested in knowing whether:
- you observe some crashes, indicating that some of the locks also protect other areas and cannot be sharded like this;
- it gets significantly better or not (at least in callgraphs).
from haproxy.
@wtarreau we can put it in a smaller production machine to give it a go, yes :-)
from haproxy.
OK, here it comes for 3.0-master, I think it should be OK on top of 2.9 since these parts do not change often. It has received very little testing (it builds and passes the regtests that are not sensitive to show table).
0001-EXP-stktable-try-to-split-the-keys-across-multiple-s.patch.txt
from haproxy.
hey @wtarreau it crashed with 100% cpu after just a few seconds:
a crash.log from core attached.
We have the bin + core if you want.
from haproxy.
Based on the dump, something got corrupted: too many waiters at the same place. It's possible that I missed an unlock somewhere; I'll need to recheck in depth. Hmmm, I also just found stksess_kill_if_expired() in stick-table.h, which needs to be converted to use the shards as well. OK, let's forget that patch for now, I have more work to do on it. Thanks for the test. Don't worry, I still have hopes, but at the moment I'm quite busy with a ton of other stuff that I absolutely need to finish.
from haproxy.
OK better now, I fixed the other points I found. A test config on a 24-core EPYC jumped from 358k rps to 2.35M, which tells me it mostly works :-)
However, the peers code takes the table lock and will continue to access the entries unlocked. I don't know this part well enough to do a quick hack there. Thus if you want to test it, I suggest that you comment out all your peers (or point them to closed ports so that they don't learn anything).
Here's the updated patch, in case you can give it a try. Sorry, it has the same name (the updated one mentions peers at the end of the commit message).
0001-EXP-stktable-try-to-split-the-keys-across-multiple-s.patch.txt
from haproxy.
Hello @wtarreau, still no luck: it crashes right after start (with 100% CPU).
I've applied it on 2.9.6 and 3.0-dev4 and got core dumps from both of them, if you want to take a look.
Backtraces:
crash-2.9.log
crash-3.0-dev4.log
from haproxy.
Excellent, I'm indeed impatient to know how much it improves the situation for you!
from haproxy.