Comments (12)
Remember that the 'time to x' charts are inverted; lower relative values are better.
I don't see a whole lot of benefit one way or the other here, but I think it's worth more investigation!
Looking at https://github.com/mtrudel/bandit/actions/runs/3992173488, the results are mostly neutral.
I think we should increase the number of clients. After all, the perf issue in main would only occur when there are lots of incoming messages 🤔
Your analysis is spot on! That exact thing is (mostly) implemented on https://github.com/mtrudel/thousand_island/tree/inline_accept, but doesn't seem to yield the improvements you (and I!) might think. Feel free to play around with it though, and see if there's something I'd missed (truthfully, I didn't look too closely at it).
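To make the trade-off concrete, here is a rough, illustrative-only sketch of the two accept strategies being compared (plain :gen_tcp, hypothetical module and function names; the actual inline_accept branch may be structured quite differently):

defmodule AcceptStrategies do
  # Illustrative sketch only; NOT Thousand Island's actual code.

  # Strategy A (roughly the shape of main): a dedicated acceptor process
  # accepts the socket, starts a connection process, and then transfers
  # socket ownership to it with controlling_process/2.
  def acceptor_loop(listen_socket, start_connection) do
    {:ok, socket} = :gen_tcp.accept(listen_socket)
    {:ok, pid} = start_connection.(socket)
    :ok = :gen_tcp.controlling_process(socket, pid)
    acceptor_loop(listen_socket, start_connection)
  end

  # Strategy B ("inline" accept): a pool of connection processes each block in
  # accept/1 on the shared listen socket; whichever one wins handles the
  # connection directly, so no ownership transfer is needed.
  def connection_loop(listen_socket, handle_connection) do
    {:ok, socket} = :gen_tcp.accept(listen_socket)
    handle_connection.(socket)
    connection_loop(listen_socket, handle_connection)
  end
end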
The implementation in the inline_accept branch looks good to me.
I'm wondering how you compared the performance between main and inline_accept?
As I mentioned above, this potential performance issue may only occur when:
- many connections come in at the same time
- and they all start sending a lot of messages
I cooked up a branch of bandit that referenced the inline_accept branch, and then ran a manual benchmark against it.
An even easier way to get started locally (and an even more apples-to-apples comparison) would be to run something like https://github.com/mtrudel/thousand_island/blob/main/examples/http_hello_world.ex in both versions of thousand_island and run h2load (part of nghttp2) against each of them directly. Its 'Time to Connect' and 'Time to First Byte' measurements would be the things to look at for this. The overhead of HTTP here is as minimal as you're going to get (essentially constant time), which leaves really just the differences in the Thousand Island implementations to compare.
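One way to set that comparison up (just a sketch; the surrounding test project and how you toggle versions is up to you) is to point the project's mix.exs at each thousand_island version in turn:

# Hypothetical mix.exs dependency entries for the comparison; switch between
# them (and re-run `mix deps.get`) before each benchmark run.
defp deps do
  [
    # baseline
    {:thousand_island, github: "mtrudel/thousand_island", branch: "main"}

    # candidate branch
    # {:thousand_island, github: "mtrudel/thousand_island", branch: "inline_accept"}
  ]
end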
Hi @mtrudel,
Thanks for the instructions! I ran some benchmarks with h2load against the HTTPHelloWorld handler on my local machine.
With only 1 tweak: I set num_acceptors to 1, so only one acceptor is accepting connections, simulating the most demanding scenarios.
(I need to use my fork of the inline_accept branch so it respects the num_acceptors config: ced5f3e)
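For reference, that kind of server setup looks roughly like the sketch below (assuming ThousandIsland.start_link/1 accepts the port, handler_module, and num_acceptors options, and the HTTPHelloWorld module from the examples; the port number is arbitrary):

# Sketch only: start the example handler with a single acceptor so that every
# incoming connection has to pass through one accept loop.
{:ok, _pid} =
  ThousandIsland.start_link(
    port: 6001,
    handler_module: HTTPHelloWorld,
    num_acceptors: 1
  )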
And I did find that in some extreme scenarios, inline_accept can yield better performance, with a huge improvement in time to 1st byte:
- inline_accept
h2load --h1 -n100000 -c1000 -m1 http://localhost:6001
starting benchmark...
spawning thread #0: 1000 total client(s). 100000 total requests
Application protocol: http/1.1
progress: 10% done
finished in 34.29s, 466.24 req/s, 0B/s
requests: 100000 total, 16634 started, 15988 done, 15988 succeeded, 84012 failed, 84012 errored, 0 timeout
status codes: 15988 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 0B (0) total, 484.01KB (495628) headers (space savings 0.00%), 202.97KB (207844) data
                      min         max        mean         sd      +/- sd
time for request:    50us     49.95ms      7.61ms      6.17ms      91.63%
time for connect:    51us    112.87ms     25.50ms     30.82ms      88.72%
time to 1st byte:   101us       703us       366us       171us      63.16%
req/s           :    0.00       69.46       10.63       12.41      90.20%
- main (acceptor)
h2load --h1 -n100000 -c1000 -m1 http://localhost:6000
starting benchmark...
spawning thread #0: 1000 total client(s). 100000 total requests
Application protocol: http/1.1
progress: 10% done
finished in 28.37s, 506.86 req/s, 0B/s
requests: 100000 total, 15179 started, 14380 done, 14380 succeeded, 85620 failed, 85620 errored, 0 timeout
status codes: 14380 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 0B (0) total, 435.33KB (445780) headers (space savings 0.00%), 182.56KB (186940) data
                      min         max        mean         sd      +/- sd
time for request:    61us       1.25s     18.99ms     99.27ms      99.35%
time for connect:    54us    199.61ms     24.63ms     28.52ms      90.65%
time to 1st byte:   159us       1.24s     81.44ms    296.95ms      93.89%
req/s           :    0.00       88.29       17.13       17.53      85.90%
That being said, in any scenario that's less demanding than -n100000 -c1000, these two implementations are almost the same:
- inline_accept
h2load --h1 -n10000 -c100 -m1 http://localhost:6001
starting benchmark...
spawning thread #0: 100 total client(s). 10000 total requests
Application protocol: http/1.1
progress: 10% done
progress: 20% done
progress: 30% done
progress: 40% done
progress: 50% done
progress: 60% done
progress: 70% done
progress: 80% done
progress: 90% done
progress: 100% done
finished in 786.63ms, 12712.46 req/s, 0B/s
requests: 10000 total, 10000 started, 10000 done, 10000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 10000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 0B (0) total, 302.73KB (310000) headers (space savings 0.00%), 126.95KB (130000) data
                      min         max        mean         sd      +/- sd
time for request:   635us     10.20ms      6.82ms      1.34ms      74.87%
time for connect:    99us      1.02ms       310us       248us      81.00%
time to 1st byte:   738us      7.95ms      6.36ms      1.60ms      87.00%
req/s           :  127.26      129.98      128.23        0.52      74.00%
- main
h2load --h1 -n10000 -c100 -m1 http://localhost:6000
starting benchmark...
spawning thread #0: 100 total client(s). 10000 total requests
Application protocol: http/1.1
progress: 10% done
progress: 20% done
progress: 30% done
progress: 40% done
progress: 50% done
progress: 60% done
progress: 70% done
progress: 80% done
progress: 90% done
progress: 100% done
finished in 1.42s, 7064.71 req/s, 0B/s
requests: 10000 total, 10000 started, 10000 done, 10000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 10000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 0B (0) total, 302.73KB (310000) headers (space savings 0.00%), 126.95KB (130000) data
                      min         max        mean         sd      +/- sd
time for request:  1.23ms    166.27ms     13.73ms     16.23ms      98.03%
time for connect:    92us       243us       119us        29us      82.00%
time to 1st byte:  2.50ms     12.51ms     10.14ms      3.58ms      78.00%
req/s           :   70.77       71.73       71.22        0.24      64.00%
And if we have 10 acceptors for both implementations, they can handle -n100000 -c1000 just fine.
So I'm not sure if it's worth it to switch to inline_accept. What do you think?
P.S.
You may find my setup in this repo:
https://github.com/dsdshcym/thousand_island_benchmark/
Interesting! There was some conversation recently on mtrudel/bandit#72 regarding TTFB numbers, so this is something both timely and intriguing!
I'd suggest that we:
- Review the changes on the inline_accept branch in the context of it being a 'real' PR and not just an experiment (ie: with an assumption that if things look good, it gets merged).
- Cut a test PR on bandit that points at this branch (this is just a test branch; if inline_accept gets merged then bandit will get it as part of its regular thousand_island dependency).
- Put the benchmarker to work on this branch in bandit and see how inline_accept works in the real world.
WDYT?
Specifically, if we can see improvements to TTC/TTFB with reqs/sec numbers staying unchanged, that's a clear win in my book.
Wow’zer, that's a significant improvement to “time for request”!
On second thought, the active option is false until we set it to :once in Handler.handle_continuation/2:

thousand_island/lib/thousand_island/handler.ex, lines 386 to 390 in 63878c0

So there are no messages to move in the Acceptor's mailbox when calling controlling_process/2.
I did a little test with this diff in main:
modified   lib/thousand_island/acceptor.ex
@@ -12,10 +12,20 @@ def run({server_pid, parent_pid, %ThousandIsland.ServerConfig{} = server_config}
     accept(listener_socket, connection_sup_pid, server_config)
   end
 
+  require Logger
+
   defp accept(listener_socket, connection_sup_pid, server_config) do
     case server_config.transport_module.accept(listener_socket) do
       {:ok, socket} ->
+        loop(
+          fn ->
+            Logger.debug(inspect(Process.info(self(), :messages)))
+          end,
+          10
+        )
+
         ThousandIsland.Connection.start(connection_sup_pid, socket, server_config)
+
         accept(listener_socket, connection_sup_pid, server_config)
 
       {:error, :closed} ->
@@ -25,4 +35,14 @@ defp accept(listener_socket, connection_sup_pid, server_config) do
         raise "Unexpected error in accept: #{inspect(reason)}"
     end
   end
+
+  defp loop(_fun, 0) do
+    :ok
+  end
+
+  defp loop(fun, times) do
+    fun.()
+
+    loop(fun, times - 1)
+  end
 end
And the messages are always empty...
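For what it's worth, the same OTP behaviour can be checked in isolation, outside of Thousand Island. A small sketch (plain :gen_tcp, illustrative names only) shows that with active: false nothing is delivered to the accepting process's mailbox, so controlling_process/2 has nothing to migrate:

# Illustrative-only check of the semantics discussed above.
{:ok, listen} = :gen_tcp.listen(0, [:binary, active: false])
{:ok, port} = :inet.port(listen)

# A client connects and immediately sends some data.
{:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])
{:ok, socket} = :gen_tcp.accept(listen)
:ok = :gen_tcp.send(client, "hello")
Process.sleep(50)

# With active: false the data stays in the socket buffer, so the accepting
# process's mailbox is empty...
IO.inspect(Process.info(self(), :messages))
# => {:messages, []}

# ...and handing the socket to another process is a cheap ownership change.
pid =
  spawn(fn ->
    receive do
      :done -> :ok
    end
  end)

:ok = :gen_tcp.controlling_process(socket, pid)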
You're correct! There are still a number of possible approaches here that might help improve connection startup times (though this is the hardest part of the connection lifecycle to reason about, so it's important that we work through the subtlety carefully).
I'm working up a simple benchmark stack directly on thousand island so we can test this stuff more directly. I'll report back here as I make progress.
I've been up and down this and I can't seem to get any reproducible numbers out of it. Suggest we shelve the PRs for the time being.
The problem originally raised in this issue is resolved as well, I think. Will close.