Comments (14)

Lukasa commented on May 27, 2024

Ok, for now I think we can accept that if you're exceeding SETTINGS_MAX_CONCURRENT_STREAMS by 40x then you may begin to see performance problems and want to rearchitect your application. We can revisit this decision down the road if it becomes worthwhile.

Let's call this done for now. Thanks for the report, it led to a number of useful fixes!

from swift-nio-http2.

Lukasa commented on May 27, 2024

Huh, weird. I'll investigate.

As a quick note, I don't think my recent changes will have affected that, as they imposed a linear cost on each header block encoded, and header block encoding is always going to be linear-time anyway. So we need to investigate further and find where the quadratic operation count is.

Lukasa commented on May 27, 2024

Hrm, this is a nasty one.

Regardless of whether or not the HTTP/2 stack has any quadratic runtime in it, we're not getting a chance to exercise it. This is because we're hitting a quadratic runtime problem in NIO's scheduler first. We'll need to resolve that before we go any further.

The issue is fundamentally to do with the way our Heap is implemented. Specifically, we're hitting a weird edge case which appears to cause Swift to emit unnecessary ARC traffic, and that unnecessary ARC traffic leads to an unnecessary CoW of the backing array storage. Tracking that down took quite a while, but the result is that a loop spinning on any of the SelectableEventLoop functions that modify the task queue (execute, submit, schedule) exhibits quadratic behaviour.
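The mechanism can be shown in isolation. The following is a standalone sketch, not the NIO Heap code: `baseAddress(of:)` is a helper made up for the example, and the `alias` constant merely stands in for whatever extra reference the unnecessary retain creates. While a second reference to an array's storage is alive, the next mutation has to copy the whole buffer, so each append costs O(n) instead of amortised O(1).

```swift
// Hypothetical helper for this sketch: returns the address of the array's
// element storage. The pointer is only compared, never dereferenced.
func baseAddress(of array: [Int]) -> UnsafePointer<Int>? {
    return array.withUnsafeBufferPointer { $0.baseAddress }
}

var storage: [Int] = []
storage.reserveCapacity(16)          // enough room that appends need not grow
storage.append(1)

// Uniquely referenced: the append mutates the existing buffer in place.
let before = baseAddress(of: storage)
storage.append(2)
let inPlace = (baseAddress(of: storage) == before)

// A second reference to the same buffer forces a copy on the next write.
let alias = storage
let shared = baseAddress(of: storage)
storage.append(3)
let copied = (baseAddress(of: storage) != shared)
withExtendedLifetime(alias) {}       // keep the second reference alive until here

print("in place:", inPlace, "copied:", copied)
```

If every task-queue mutation pays for such a copy, a loop of n insertions does O(n) work per insertion, which is where the quadratic total comes from.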

While we wait for the above Swift bug to be fixed I'm also going to rewrite the Heap. The weird edge case was hit because of our use of static funcs, and those were only there to allow us to feel more confident that we hadn't misimplemented the algorithms. We clearly haven't, and we also wrote a bunch of tests, so I'm going to just pull those functions down into instance methods that Swift is more capable of seeing through, which should resolve the problem nicely.
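For readers unfamiliar with the shape being described, here is a minimal sketch of a binary min-heap whose operations are mutating instance methods working directly on the stored array. It is an illustration only, not the NIO implementation; the type name `MinHeap` and its members are assumptions made for the example.

```swift
// A minimal binary min-heap sketch (hypothetical, not NIO's Heap).
struct MinHeap<Element: Comparable> {
    private var storage: [Element] = []

    var count: Int { return storage.count }

    // Mutating instance methods touch `self.storage` directly, so the array
    // is never handed to another function as a separate value.
    mutating func push(_ element: Element) {
        storage.append(element)
        siftUp(from: storage.count - 1)
    }

    mutating func pop() -> Element? {
        guard !storage.isEmpty else { return nil }
        storage.swapAt(0, storage.count - 1)
        let smallest = storage.removeLast()
        if !storage.isEmpty { siftDown(from: 0) }
        return smallest
    }

    // Move the element at `index` up until its parent is not larger.
    private mutating func siftUp(from index: Int) {
        var child = index
        while child > 0 {
            let parent = (child - 1) / 2
            guard storage[child] < storage[parent] else { return }
            storage.swapAt(child, parent)
            child = parent
        }
    }

    // Move the element at `index` down until both children are not smaller.
    private mutating func siftDown(from index: Int) {
        var parent = index
        while true {
            let left = 2 * parent + 1
            let right = left + 1
            var smallest = parent
            if left < storage.count && storage[left] < storage[smallest] { smallest = left }
            if right < storage.count && storage[right] < storage[smallest] { smallest = right }
            if smallest == parent { return }
            storage.swapAt(parent, smallest)
            parent = smallest
        }
    }
}

var heap = MinHeap<Int>()
for value in [5, 3, 8, 1, 9, 2] { heap.push(value) }
var drained: [Int] = []
while let next = heap.pop() { drained.append(next) }
print(drained)
```

Each push and pop does O(log n) swaps on the same buffer, provided nothing else holds a reference to it.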

I am not confident this is the end of the quadratic behaviour this test will see. In particular, there is a quadratic loop in the HTTP/2 code that I've already found, and I'm sure this test will hit it. With that one I'm less confident whether we should fix it or let it sit there, but that's a discussion for another day: first, let's fix the quadratic behaviour we know about.

Lukasa commented on May 27, 2024

See apple/swift-nio#960 for the fix for Heap.

MrMage commented on May 27, 2024

Wow, impressive detective work! Looking forward to seeing what the performance of this test will be once all the quadratic runtime issues are fixed.

Lukasa commented on May 27, 2024

Incidentally, there's another major performance gain you can have here that works today.

Right now the test spins in a loop on client.get, which ends up calling EventLoop.execute to hop onto the event loop thread. The effect is that you acquire and release the event loop's task queue lock many times in a tight loop. The event loop thread itself is contending for that lock at the same time, so this lock acquisition pattern ends up being quite slow.

In the pre-Heap-fix code I changed this by moving the loop execution onto the event loop thread. This eliminated the cross-thread traffic, and in one case dropped the runtime for preparing 20k requests from 13 seconds to 2 seconds.

Put another way, the cross-thread communication still utterly dominates the quadratic runtime of the heap logic if the iteration count gets large enough.
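The difference between the two patterns can be sketched with a toy model. Everything here is hypothetical: `ToyTaskQueue` is a single-threaded stand-in for a lock-protected task queue, not NIO's event loop, and it only counts lock acquisitions rather than measuring contention.

```swift
import Foundation

// Toy stand-in for an event loop's task queue; all names are hypothetical.
final class ToyTaskQueue {
    private let lock = NSLock()
    private var tasks: [() -> Void] = []
    private(set) var lockAcquisitions = 0

    // Models a hop from another thread: every enqueue takes the lock.
    func execute(_ task: @escaping () -> Void) {
        lock.lock()
        lockAcquisitions += 1
        tasks.append(task)
        lock.unlock()
    }

    // Models the loop thread: take the lock once, then run the batch unlocked.
    func runPending() {
        lock.lock()
        lockAcquisitions += 1
        let batch = tasks
        tasks.removeAll()
        lock.unlock()
        for task in batch { task() }
    }
}

func enqueueLockCounts(requestCount: Int) -> (perRequest: Int, batched: Int) {
    var prepared = 0

    // Pattern 1: hop onto the loop once per request.
    let perRequest = ToyTaskQueue()
    for _ in 0..<requestCount {
        perRequest.execute { prepared += 1 }
    }
    perRequest.runPending()

    // Pattern 2: hop once and run the whole loop on the other side.
    let batched = ToyTaskQueue()
    batched.execute {
        for _ in 0..<requestCount { prepared += 1 }
    }
    batched.runPending()

    precondition(prepared == 2 * requestCount)
    return (perRequest.lockAcquisitions, batched.lockAcquisitions)
}

let counts = enqueueLockCounts(requestCount: 20_000)
print("per-request:", counts.perRequest, "batched:", counts.batched)
```

In this model the per-request pattern takes the lock once per request plus once to drain, while the batched pattern takes it twice in total. The real cost per acquisition depends on how hard the event loop thread is contending, which the model does not capture.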

Lukasa commented on May 27, 2024

@MrMage Is this problem still outstanding for you?

MrMage commented on May 27, 2024

@Lukasa I haven't tried recently; my understanding was that there were other causes of quadratic behavior that have yet to be fixed. Is that still the case or should I re-run our tests?

Lukasa commented on May 27, 2024

So there's one possible source of quadratic behaviour in your code, which is that calls to flush from the child channels will lead to a loop over the number of streams that are being buffered to avoid violating SETTINGS_MAX_CONCURRENT_STREAMS. As you attempt to create a bunch of streams all at once, and flush each of them, that buffer will be sized O(n), and will be flushed O(n) times, leading to a quadratic loop.
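To put numbers on that, here is a small counting sketch. It assumes the worst case described above, where every stream created during the burst stays buffered, and it ignores the streams that would actually be allowed through under SETTINGS_MAX_CONCURRENT_STREAMS. The function name is made up for the example.

```swift
// Counts how many buffered-stream visits occur when `streamCount` streams are
// created and flushed one after another while all of them stay buffered.
func bufferedStreamVisits(streamCount: Int) -> Int {
    var buffered = 0
    var visits = 0
    for _ in 0..<streamCount {
        buffered += 1        // the new stream joins the buffer
        visits += buffered   // its flush walks every buffered stream
    }
    return visits
}

for n in [1_000, 2_000, 4_000] {
    print(n, bufferedStreamVisits(streamCount: n))
}
```

The total is n(n + 1)/2, so doubling the number of streams roughly quadruples the work done in the flush loop.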

However, I don't believe that quadratic behaviour is likely to be a performance problem in real programs. Your test wildly exceeds SETTINGS_MAX_CONCURRENT_STREAMS (which is usually set to 100), and so is an absolute worst-possible-case for that code. It's also profoundly difficult to fix within the semantics of NIO's flushing model: I end up needing to maintain a separate data structure that indicates which streams need to be flushed, and in the common case (where there are only a handful of streams buffered at most) that's more expensive than the current model.

So I'm interested in seeing whether you're still having trouble. If your test runs in an acceptable amount of time then I'm going to consider that quadratic behaviour low-urgency to address.

MrMage commented on May 27, 2024

@Lukasa thank you for the elaboration! I can confirm that sending requests is now very fast, even from "off" the event loop (I guess acquiring 2000 locks isn't a major bottleneck yet). However, waiting for all requests to be received takes three times longer for each doubling of the request count:

1000 requests sent so far, elapsed time: 0.085628
total time to send 2000 requests: 0.12969
total time to receive 2000 responses: 1.721965
1000 requests sent so far, elapsed time: 0.110867
2000 requests sent so far, elapsed time: 0.176551
3000 requests sent so far, elapsed time: 0.232632
total time to send 4000 requests: 0.291121
total time to receive 4000 responses: 4.914795
1000 requests sent so far, elapsed time: 0.090735
2000 requests sent so far, elapsed time: 0.151104
3000 requests sent so far, elapsed time: 0.2133
4000 requests sent so far, elapsed time: 0.269586
5000 requests sent so far, elapsed time: 0.312878
6000 requests sent so far, elapsed time: 0.355842
7000 requests sent so far, elapsed time: 0.410365
total time to send 8000 requests: 0.453256
total time to receive 8000 responses: 15.107122
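As a rough check on the growth rate, the receive times from the log above can be turned into an implied exponent: if the time grows like n^k, doubling n multiplies it by 2^k, so k is the base-2 logarithm of the ratio between consecutive runs. This is only a sketch over three data points.

```swift
import Foundation

// Receive times copied from the log above, keyed by request count.
let receiveSeconds: [(requests: Int, seconds: Double)] = [
    (2000, 1.721965), (4000, 4.914795), (8000, 15.107122),
]

var exponents: [Double] = []
for (previous, current) in zip(receiveSeconds, receiveSeconds.dropFirst()) {
    let ratio = current.seconds / previous.seconds
    let exponent = log2(ratio)   // doubling n multiplies n^k by 2^k
    exponents.append(exponent)
    print("\(previous.requests) -> \(current.requests):",
          String(format: "ratio %.2f, implied exponent %.2f", ratio, exponent))
}
```

On these numbers the implied exponent comes out at about 1.5 and 1.6, which sits between linear (1) and fully quadratic (2) growth and is rising with the request count.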

Looking at the Instruments traces, it does appear that flushing is the culprit here, and is a bottleneck, mostly on the client:

[Three Instruments trace screenshots, captured 2019-04-16 at 10:21:55, 10:22:39 and 10:24:32]

I assume this is the behavior you are describing; given that it only occurs in a fairly "extreme" test right now, there's probably no point in trying to fix this.

MrMage commented on May 27, 2024

Follow-up: I can confirm that injecting a

Thread.sleep(forTimeInterval: 0.00025)

inside the "send" loop completely eliminates the quadratic-runtime issue. However, that runtime appears to not be very relevant for 2000 requests, anyway; it only becomes apparent when sending 4000 or more requests.

glbrntt commented on May 27, 2024

> Follow-up: I can confirm that injecting a
>
> Thread.sleep(forTimeInterval: 0.00025)
>
> inside the "send" loop completely eliminates the quadratic-runtime issue. However, that runtime appears to not be very relevant for 2000 requests, anyway; it only becomes apparent when sending 4000 or more requests.

Sorry, why does this work?

MrMage commented on May 27, 2024

> Sorry, why does this work?

The test runtime explodes because each flush call needs to iterate over all open streams. Throttling the number of send operations in flight reduces the amount of streams that are still open and thus need to be iterated over.
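A toy model of that effect, under stated assumptions: each send opens one stream and flushes, a flush visits every stream that is still open, and `completedPerSend` streams finish between consecutive sends. The function and parameter names are hypothetical, and real completion rates depend on the network and the server.

```swift
// Toy model: returns the total number of open-stream visits made by flushes.
func flushVisits(sends: Int, completedPerSend: Int) -> Int {
    var open = 0
    var visits = 0
    for _ in 0..<sends {
        open += 1                                // this send opens a stream
        visits += open                           // its flush walks all open streams
        open = max(0, open - completedPerSend)   // some streams finish before the next send
    }
    return visits
}

let unthrottled = flushVisits(sends: 4_000, completedPerSend: 0)
let throttled = flushVisits(sends: 4_000, completedPerSend: 1)
print("unthrottled:", unthrottled, "throttled:", throttled)
```

When nothing completes between sends the visit count grows quadratically. Once streams complete as fast as they are opened, each flush sees only a constant number of open streams and the total becomes linear in the number of sends.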

glbrntt commented on May 27, 2024

> The test runtime explodes because each flush call needs to iterate over all open streams. Throttling the number of send operations in flight reduces the amount of streams that are still open and thus need to be iterated over.

Ahh of course, thanks!
