Comments (12)
Actually I may have misunderstood the settings code upon further review, but there still seems to be something wrong here in that the max concurrent streams is reduced below the number of open streams, yet another stream is opened immediately after acking.
from h2.
Yea, I think the logs are probably easy to confuse: it's saying that the local settings are only applied once the remote has acknowledged them. The remote settings are applied as soon as they are received.
It's certainly possible there is a bug in changing the max allowed streams. We have some tests, but there could be cases we haven't exercised. I know it's work, but it would be a great first step: could you add a unit test that triggers what you're seeing?
Hey Sean, thanks for the response. I have the time available to work on a unit test, but I'm not particularly confident that it's easily reproducible in such a setting. The context in which this comes up for us tends to be while we are rapidly pushing requests onto an individual h2 connection when the settings frame arrives. I.e., it feels a bit racy.
One theory I have is that there might just be something queued in the prioritize layer at the point the settings frame arrives, but because it's already in the send queue, there's no mechanism to stop it. Do you think it's possible to write unit tests to cover that sort of scenario without going through the client API?
So I started putting together a unit test to try to trigger what you're talking about. I didn't get it all the way there, but looking at the logs, it does look close: it seems stream 5 is being sent, even though stream 3 is still streaming, and the server received the settings ack. I might have the asserts slightly wrong...
It's currently in this branch: https://github.com/hyperium/h2/compare/i704-max-concurrent-streams-mid-connection
Oh wow, that test does seem to exhibit the behavior we're seeing. Thanks for putting that together! This looks like it's all under Connection::poll. First we see
```
│ │ │ ├─0ms TRACE h2::proto::connection recv SETTINGS, frame=Settings { flags: (0x0), max_concurrent_streams: 1 }
│ │ │ ├─┐h2::proto::connection::poll_ready
│ │ │ │ ├─┐h2::codec::framed_write::FramedWrite::buffer frame=Settings { flags: (0x1: ACK) }
```
Which is acking the new settings. Later the logs show
```
│ │ │ ├─0ms TRACE h2::proto::streams::prioritize writing, frame=Headers { stream_id: StreamId(3), flags: (0x5: END_HEADERS | END_STREAM) }
│ │ │ ├─┐h2::codec::framed_write::FramedWrite::buffer frame=Headers { stream_id: StreamId(3), flags: (0x5: END_HEADERS | END_STREAM) }
│ │ │ │ ├─0ms DEBUG h2::codec::framed_write send, frame=Headers { stream_id: StreamId(3), flags: (0x5: END_HEADERS | END_STREAM) }
```
where the new streams are sent. The TRACE log here is particularly insightful; it looks like these frames are under the `pending_send` queue in `Prioritize`, which means there are no further checks against the `max_send_streams` setting (those checks only happen when going from `pending_open` to `pending_send`, afaict).
Here's my analysis of the situation. I'm rather new to the code base so sorry if there are any big gaps or incorrect conclusions.
I'm looking at the `Prioritize` implementation, and there's kind of a lot going on here. There are two queues of interest:
- `pending_open`, which is the set of streams waiting to transition to the open state / send headers
- `pending_send`, which is supposed to hold streams with frames ready and just waiting on connection capacity
Once streams transition to `pending_send`, it is assumed that the connection can accept whatever is there, whether that's additional data frames or, in our case, a new stream to open. Because there are no checks on that queue for `max_send_streams`, new streams get opened when they shouldn't.
It seems to me we either need to add checks to `pending_send` to make sure there is still new stream capacity, or we need to be more conservative about adding new streams to `pending_send`. The former seems more complicated because you no longer have a simple FIFO, so I think a solution should focus on the latter: only pop from `pending_open` when header frames can immediately be written to the `Codec` (i.e., eliminate the chance of desync between `max_send_streams` and the number of new streams in the `pending_send` queue).
With that in mind, my attention is brought to two places.
- `Prioritize::pop_frame`, which itself does two things:
  a. Calls `schedule_pending_open`, which moves as many streams from `pending_open` to `pending_send` as there is currently capacity for.
  b. Pops frames from `pending_send` to write to the codec.
- `Send::send_headers`, which appears to add the stream to both `pending_open` and `pending_send`.
Ultimately I think there are a couple of changes needed here to enforce the invariant that `pending_send` only ever holds frames the h2 protocol allows to be written:
- `schedule_pending_open` should yield precisely one stream to try to open, and it should immediately write the headers to the codec.
- `send_headers` should only push to `pending_open`.
Do you agree with this analysis? Curious for your take.
Ah, one other note: it looks like the way these frames are making it into `pending_send` is via the `send_headers` method, which is ultimately called via `SendRequest::send_request`.
Building on your test, I was able to make a patch that seems to address the issue we are seeing. However, because the test suite in general doesn't call `client.ready().await`, several tests start failing with this patch.
```diff
diff --git a/src/proto/streams/send.rs b/src/proto/streams/send.rs
index dcb5225..1e5e4be 100644
--- a/src/proto/streams/send.rs
+++ b/src/proto/streams/send.rs
@@ -143,21 +143,17 @@ impl Send {
         // Update the state
         stream.state.send_open(end_stream)?;

-        if counts.peer().is_local_init(frame.stream_id()) {
-            // If we're waiting on a PushPromise anyway
-            // handle potentially queueing the stream at that point
-            if !stream.is_pending_push {
-                if counts.can_inc_num_send_streams() {
-                    counts.inc_num_send_streams(stream);
-                } else {
-                    self.prioritize.queue_open(stream);
-                }
-            }
-        }
+        if counts.peer().is_local_init(frame.stream_id()) && !stream.is_pending_push {
+            stream
+                .pending_send
+                .push_back(buffer, Frame::<B>::from(frame));

-        // Queue the frame for sending
-        self.prioritize
-            .queue_frame(frame.into(), buffer, stream, task);
+            self.prioritize.queue_open(stream);
+        } else {
+            // Queue the frame for sending
+            self.prioritize
+                .queue_frame(frame.into(), buffer, stream, task);
+        }

         Ok(())
     }
diff --git a/tests/h2-tests/tests/client_request.rs b/tests/h2-tests/tests/client_request.rs
index b4fb847..7939b75 100644
--- a/tests/h2-tests/tests/client_request.rs
+++ b/tests/h2-tests/tests/client_request.rs
@@ -303,7 +303,8 @@ async fn recv_decrement_max_concurrent_streams_when_requests_queued() {
         srv.ping_pong([0; 8]).await;

         // limit this server later in life
-        srv.send_frame(frames::settings().max_concurrent_streams(1)).await;
+        srv.send_frame(frames::settings().max_concurrent_streams(1))
+            .await;
         srv.recv_frame(frames::settings_ack()).await;
         srv.recv_frame(
@@ -312,6 +313,14 @@ async fn recv_decrement_max_concurrent_streams_when_requests_queued() {
                 .eos(),
         )
         .await;
+        srv.send_frame(frames::headers(3).response(200).eos()).await;
+        srv.recv_frame(
+            frames::headers(5)
+                .request("POST", "https://example.com/")
+                .eos(),
+        )
+        .await;
+        srv.send_frame(frames::headers(5).response(200).eos()).await;
         srv.ping_pong([1; 8]).await;
     };
@@ -335,6 +344,7 @@ async fn recv_decrement_max_concurrent_streams_when_requests_queued() {
             .unwrap();

         // first request is allowed
+        let mut client = h2.drive(async move { client.ready().await.unwrap() }).await;
         let (resp1, _) = client.send_request(request, true).unwrap();

         let request = Request::builder()
@@ -344,6 +354,7 @@ async fn recv_decrement_max_concurrent_streams_when_requests_queued() {
             .unwrap();

         // second request is put into pending_open
+        let mut client = h2.drive(async move { client.ready().await.unwrap() }).await;
         let (resp2, _) = client.send_request(request, true).unwrap();

         /*
@@ -363,7 +374,6 @@ async fn recv_decrement_max_concurrent_streams_when_requests_queued() {
             assert_eq!(err.to_string(), "user error: rejected");
         */
-
         h2.drive(async move {
             resp1.await.expect("req");
         })
```
The patch's intent is to make all requests go through the Prioritize layer's pending open queue so that the max concurrent streams check is done near send time.
To keep the earlier semantics, which seemed to allow internal queuing, we'd need to lift the restriction that the `SendRequest` handle only holds one pending-open stream at a time and instead let them buffer in the `Prioritize` layer up to a point. Such a change is probably going to be necessary anyway to ensure reasonable throughput with the change to `send_headers`.
Note that this patch isn't sufficient to fully address the problem. The logic in Prioritize which moves streams from pending_open to pending_send is too eager and could again result in opening new streams over max concurrent.
I seem to be grokking the test suite and the code a bit better, and I will take a shot at addressing the other piece in prioritize later.
I've applied your patch, read through the logs, and now I see what you mean. Clients are supposed to check that the previous request was actually sent before trying to send another one, but in practice since there is often space, many tests weren't actually checking.
We could fix up all the tests and say "well, it was documented," but it would likely still surprise people who did the same thing. Well, hm. (Stream of consciousness:) I suppose people would only run into it if they were hitting a max stream limit, but in most cases they should be able to work just fine, right? This patch would just be pushing backpressure to the user faster, when the peer had signaled to slow down. I'm going to try exploring a little more on the broken tests.
In my current WIP fix for the issue, I've added a `next_send_stream_will_reach_capacity()` method on `Counts`, which the `SendRequest` handle can use as a signal to hold onto the stream and wait to be notified. This leaves a backpressure mechanism in place while leaving the decision whether to actually open the streams up to `Prioritize`.
The piece I'm working on now is fixing `schedule_pending_open` to be effectively `pop_pending_open`, and then to shove the new stream at the front of the `pending_send` queue. Only one stream can be safely popped at a time, when we know the `Codec` is ready for a write and that a `Headers` frame won't sit in the `pending_send` queue.
I've somewhat given up on writing a test to specifically exercise this failure mode for the moment. I was unsure whether it's actually possible to simulate `Headers` frames sitting in the `pending_send` queue of `Prioritize` while streams are open and the server sends the settings frame reducing the max stream count below the current count. Specifically, I was unsure whether the given abstractions have too "broad" a synchronization between client processing and server processing to catch the issue.
The patch might still be rough around the edges but I wanted to share. Here's my current patch for this issue. Need to spend some time looking at tests now. The commit message explains in detail what I believe the problem to be and the approach as well as one alternative I had considered. Should I open a PR with this work for further discussion?
> I've somewhat given up on writing a test to specifically exercise this failure mode for the moment.
Yea, I never expected it would be easy to consistently do. It depends on internal implementation details... But the one I threw together was useful to trigger what it could on-demand. 🤷
> Should I open a PR with this work for further discussion?
Certainly! I'll be back to full steam next week.
Published v0.3.21 with the fix for this just now!