bearcove / fluke

HTTP 1+2 in Rust, with io_uring & ktls

Home Page: https://docs.rs/fluke

License: Apache License 2.0

Languages: Rust 99.78%, Just 0.21%, Shell 0.02%
Topics: http, io-uring, rust-lang, rust

fluke's People

Contributors

dependabot[bot], fasterthanlime, frankreh, github-actions[bot], paolobarbolini


fluke's Issues

Consider using codecov again

rc-zip is still using it, things have been quiet on that front, and they've reassured us that they will stay free for open source moving forward, so it seems safe to use it again.

Add CI for macOS & Windows

GitHub has hosted runners for that, we can keep the self-hosted ones for Linux jobs (with io_uring support).

Actually, I should check whether GitHub's hosted runners allow io_uring; it has been around for a while, after all.

Move off of codecov

Sentry now owns codecov, which means:

  • they're pretending it's open source (it's BUSL)
  • the "free for OSS" tier is gone; it's now a generic "free for 5 active developers, 250 uploads a month" tier, which makes it unsuitable for this
  • they have a "proof of concept" self-hosted repo for running it on your MacBook; that's also unsuitable, and the README says it's based on a deprecated version and that they apparently haven't had time to remove license checking

I probably know people who know people who could get free credits or carve out an exception for this repo, but I'm not interested: it seems to me codecov is going the same way Travis did, and this time I'm not sure what the replacement is, apart from just hosting the llvm-cov HTML reports somewhere and writing scripts to track coverage, I guess.

h2spec tracking (80 passed, 14 failed)

Ran h2spec today, with:

$ just h2spec-server
(cut)
Listening on [::]:8888
h2spec -p 8888 -o 1
(cut)
Finished in 49.2419 seconds
146 tests, 70 passed, 1 skipped, 75 failed

Even with a super naive implementation that fails to check for a lot of preconditions / stubs large parts of the spec, we're already off to a good start!

h2: The read task can't own all the state

When we write an "EndStream", the stream state is supposed to transition to "Closed", but we only know that from the write task.

That's nothing one more channel can't fix, I suppose.

Maybe the deframer should be separate? And it should feed (H2Frame | SomeEvent) to the H2ReadContext?

Use writev more heavily in h2 server, pool frame headers

Frame::write allocates a Vec of size 9 so we have a stable address for io_uring. We could allocate this once in the h2_write_loop and re-use it, since we only ever write one frame at a time.

Because Frame::write issues a write_all of its own, sometimes we call write_all as many as three times, when we could simply call writev_all once.

Upgrade to 2023-10-18

The clippy fix for the spurious "needless mutable ref" warning has been merged, and things are happening re: async fn in trait that we should follow closely (to eventually build on stable).

h2: Track h2spec regressions

Right now there's no code in CI that compares the number of passed/failed h2spec tests against the numbers on main.

This seems mildly annoying to do, but someone should do it.

h2: If we send a GOAWAY, we shouldn't be surprised at a connection reset

Describe the bug

Not really a bug, but see this log; it's the end of http2/5.1.2:

 WARN fluke::h2::server: connection error: max concurrent streams exceeded (more than 32) (MaxConcurrentStreamsExceeded { max_concurrent_streams: 32 }) (code ProtocolError)
DEBUG fluke::h2::server: Sending GoAway last_stream_id=65 error_code=ProtocolError
DEBUG fluke::h2::server: Handler completed successfully, gave us a responder
DEBUG fluke::h2::server: Writing ev=H2Event { stream_id: 29, payload: BodyChunk }
DEBUG fluke::h2::server: Handler completed successfully, gave us a responder
DEBUG fluke::h2::server: Writing ev=H2Event { stream_id: 63, payload: Headers }
DEBUG fluke::h2::server: Handler completed successfully, gave us a responder
DEBUG fluke::h2::server: Writing ev=H2Event { stream_id: 31, payload: BodyChunk }
DEBUG fluke::h2::server: Handler completed successfully, gave us a responder
        ✔ 1: Sends HEADERS frames that causes their advertised concurrent stream limit to be exceeded
DEBUG fluke::h2::server: Writing ev=H2Event { stream_id: 33, payload: BodyChunk }

Finished in 0.0596 seconds
1 tests, 1 passed, 0 skipped, 0 failed
DEBUG fluke::h2::server: Handler completed successfully, gave us a responder
DEBUG fluke::h2::server: Writing ev=H2Event { stream_id: 35, payload: BodyChunk }
DEBUG fluke::h2::server: Handler completed successfully, gave us a responder
DEBUG fluke::h2::server: Writing ev=H2Event { stream_id: 37, payload: BodyChunk }
DEBUG fluke::h2::server: caught error from one of the tasks: read error: read_into for read_and_parse::<fluke::h2::parse::Frame> / Read(
    Error {
        msg: "read_into for read_and_parse::<fluke::h2::parse::Frame>",
        source: Os {
            code: 104,
            kind: ConnectionReset,
            message: "Connection reset by peer",
        },
    },
)
ERROR fluke_h2spec: error serving client 127.0.0.1:60488: read error: read_into for read_and_parse::<fluke::h2::parse::Frame>
DEBUG fluke::h2::server: Handler returned an error: could not send event to h2 connection handler
DEBUG fluke::h2::server: Handler returned an error: could not send event to h2 connection handler
 WARN fluke::h2::encode: could not send event to h2 connection handler
 WARN fluke::h2::encode: could not send event to h2 connection handler

Expected behavior

If we've sent a GOAWAY, we should fully expect to 1) not be able to write further messages, 2) get a connection reset when we try to read more messages.

This can probably be done by just having a "has_sent_goaway" flag somewhere.

h2: Send GOAWAY (connection error) on HPACK errors

just h2spec hpack/6.1
(cut)
Failures: 

HPACK: Header Compression for HTTP/2
  6. Binary Format
    6.1. Indexed Header Field Representation
      × 1: Sends a indexed header field representation with index 0
        -> The endpoint MUST treat this as a decoding error.
           Expected: GOAWAY Frame (Error Code: COMPRESSION_ERROR)
                     Connection closed
             Actual: Timeout

Finished in 1.0043 seconds
1 tests, 0 passed, 0 skipped, 1 failed

Consider splitting the protocol implementations into another crate

Right now, this crate contains an implementation of HTTP 1.1. However, one of my issues with hyper is that all of the protocol code is welded to the I/O code, and it's generally very hard to separate them if you want to do something with just the protocol code. For example, if I wanted to write my own HTTP implementation using a synchronous API instead of the ring-based API, it would be preferable to reuse an HTTP protocol implementation that's already used by well-established software in the ecosystem, rather than duplicating time and effort by reimplementing my own.

See this page for more information on this idea. A success story currently in use in the Rust ecosystem is x11rb-protocol, which is currently in use by x11rb as well as my own implementation of X11, breadx.

I'd like to make sure this crate goes in that direction. It would involve taking the code currently in the h1 module and splitting the parts that deal with the protocol out into another crate. If this is desired, I can implement this.

Bring in hring-tls, fork hpack

Some of the crates in this repository are deliberately not in a cargo workspace (h2spec-server, etc.), but some probably should be, and they should be published together.

We need to fork hpack to pass all h2spec tests (sending a SizeUpdate at the end of a field block is a protocol error), and having hring-tls under fasterthanlime/ seems ill-advised, see https://github.com/fasterthanlime/hring-tls

h2: Support PING frames

./h2spec -p 8888 generic/3.7/1
Generic tests for HTTP/2 server
  3. Frame Definitions
    3.7. PING
      × 1: Sends a PING frame
        -> The endpoint MUST accept PING frame.
           Expected: PING Frame (length:8, flags:0x01, stream_id:0, opaque_data:h2spec)
             Actual: Connection closed

Failures: 

Generic tests for HTTP/2 server
  3. Frame Definitions
    3.7. PING
      × 1: Sends a PING frame
        -> The endpoint MUST accept PING frame.
           Expected: PING Frame (length:8, flags:0x01, stream_id:0, opaque_data:h2spec)
             Actual: Connection closed

Finished in 0.0013 seconds
1 tests, 0 passed, 0 skipped, 1 failed

Set `uri` in `Request` struct

h2 sends us everything we need:

2023-01-21T18:10:51.046209Z DEBUG hring::h2::server: HEADER | :method: GET
2023-01-21T18:10:51.046218Z DEBUG hring::h2::server: HEADER | :scheme: http
2023-01-21T18:10:51.046230Z DEBUG hring::h2::server: ignoring pseudo-header
2023-01-21T18:10:51.046235Z DEBUG hring::h2::server: HEADER | :path: /
2023-01-21T18:10:51.046478Z DEBUG hring::h2::server: HEADER | :authority: 127.0.0.1:8888

h1 a little less:

2023-01-21T18:11:05.748766Z DEBUG hring::h1::server: src/h1/server.rs:79: got request Request { method: POST, path: "/echo-body", version: HTTP/1.1 }

(host headers would show up here if we had any).

There's no real equivalent to scheme in HTTP/1.1, unfortunately, although we could potentially pass that knowledge in from somewhere else… I'm not sure what hyper does here.

Why does `curl_echo_body_chunked` take seconds to pass?

Full output:

just test curl_echo_body_chunk --no-capture
just build-testbed
cargo build --release --manifest-path hyper-testbed/Cargo.toml
    Finished release [optimized] target(s) in 0.02s
RUST_BACKTRACE=1 cargo nextest run curl_echo_body_chunk --no-capture
    Finished test [unoptimized + debuginfo] target(s) in 0.03s
    Starting 1 tests across 4 binaries (35 skipped)
       START             hring::integration_test curl_echo_body_chunked

running 1 test
2023-01-21T18:06:35.535493Z DEBUG integration_test::proxy: tests/proxy.rs:131: Accepted connection from [::1]:53932
2023-01-21T18:06:35.536614Z DEBUG hring::h1::server: src/h1/server.rs:79: got request Request { method: POST, path: "/echo-body", version: HTTP/1.1 }
2023-01-21T18:06:35.536637Z DEBUG integration_test::proxy: tests/proxy.rs:36: making new connection to upstream!
2023-01-21T18:06:35.536693Z DEBUG hring::util: src/util.rs:92: writing 149 bytes in 25 chunks
2023-01-21T18:06:35.536739Z DEBUG hring::util: src/util.rs:96: wrote 149/149
2023-01-21T18:06:35.537162Z DEBUG integration_test::testbed: tests/testbed.rs:33: [upstream] Handling Parts { method: POST, uri: /echo-body, version: HTTP/1.1, headers: {"host": "[::]:37543", "accept": "*/*", "transfer-encoding": "chunked", "content-type": "application/octet-stream", "expect": "100-continue"} }
2023-01-21T18:06:35.537275Z DEBUG hring::h1::client: src/h1/client.rs:97: client received response
2023-01-21T18:06:35.537291Z DEBUG hring::types: src/types/mod.rs:84: got response code=200 OK version=HTTP/1.1
2023-01-21T18:06:35.537298Z DEBUG hring::types: src/types/mod.rs:86: got header name=transfer-encoding value=Ok("chunked")
2023-01-21T18:06:35.537309Z DEBUG hring::types: src/types/mod.rs:86: got header name=date value=Ok("Sat, 21 Jan 2023 18:06:35 GMT")
2023-01-21T18:06:35.537323Z DEBUG hring::util: src/util.rs:92: writing 84 bytes in 14 chunks
2023-01-21T18:06:35.537366Z DEBUG hring::util: src/util.rs:96: wrote 84/84
2023-01-21T18:06:35.537397Z DEBUG integration_test: tests/integration_test.rs:576: curl read header: Length: 17 (0x11) bytes
0000:   48 54 54 50  2f 31 2e 31  20 32 30 30  20 4f 4b 0d   HTTP/1.1 200 OK.
0010:   0a                                                   .
2023-01-21T18:06:35.537423Z DEBUG integration_test: tests/integration_test.rs:576: curl read header: Length: 28 (0x1c) bytes
0000:   74 72 61 6e  73 66 65 72  2d 65 6e 63  6f 64 69 6e   transfer-encodin
0010:   67 3a 20 63  68 75 6e 6b  65 64 0d 0a                g: chunked..
2023-01-21T18:06:35.537439Z DEBUG integration_test: tests/integration_test.rs:576: curl read header: Length: 37 (0x25) bytes
0000:   64 61 74 65  3a 20 53 61  74 2c 20 32  31 20 4a 61   date: Sat, 21 Ja
0010:   6e 20 32 30  32 33 20 31  38 3a 30 36  3a 33 35 20   n 2023 18:06:35 
0020:   47 4d 54 0d  0a                                      GMT..
2023-01-21T18:06:35.537459Z DEBUG integration_test: tests/integration_test.rs:576: curl read header: Length: 2 (0x2) bytes
0000:   0d 0a                                                ..
2023-01-21T18:06:36.536920Z DEBUG integration_test: tests/integration_test.rs:565: sending 23 bytes
2023-01-21T18:06:36.536986Z DEBUG integration_test: tests/integration_test.rs:565: sending 0 bytes
2023-01-21T18:06:36.537081Z DEBUG hring::util: src/util.rs:92: writing 17 bytes in 3 chunks
2023-01-21T18:06:36.537149Z DEBUG hring::util: src/util.rs:96: wrote 17/17
2023-01-21T18:06:36.537186Z DEBUG hring::util: src/util.rs:92: writing 16 bytes in 3 chunks
2023-01-21T18:06:36.537250Z DEBUG hring::util: src/util.rs:96: wrote 16/16
2023-01-21T18:06:36.537284Z DEBUG hring::util: src/util.rs:92: writing 17 bytes in 3 chunks
2023-01-21T18:06:36.537332Z DEBUG hring::util: src/util.rs:96: wrote 17/17
2023-01-21T18:06:36.537344Z DEBUG integration_test: tests/integration_test.rs:571: receiving 12 bytes
2023-01-21T18:06:36.537376Z DEBUG hring::h1::body: src/h1/body.rs:305: writing h1 body end mode=Chunked
2023-01-21T18:06:36.537412Z DEBUG hring::util: src/util.rs:92: writing 5 bytes in 1 chunks
2023-01-21T18:06:36.537472Z DEBUG hring::util: src/util.rs:92: writing 16 bytes in 3 chunks
2023-01-21T18:06:36.537529Z DEBUG hring::util: src/util.rs:96: wrote 5/5
2023-01-21T18:06:36.537555Z DEBUG hring::h1::client: src/h1/client.rs:76: done writing request body
2023-01-21T18:06:36.537598Z DEBUG hring::util: src/util.rs:96: wrote 16/16
2023-01-21T18:06:36.537602Z DEBUG integration_test: tests/integration_test.rs:571: receiving 11 bytes
2023-01-21T18:06:36.537642Z DEBUG hring::h1::body: src/h1/body.rs:305: writing h1 body end mode=Chunked
2023-01-21T18:06:36.537660Z DEBUG hring::util: src/util.rs:92: writing 5 bytes in 1 chunks
2023-01-21T18:06:36.537689Z DEBUG hring::util: src/util.rs:96: wrote 5/5
2023-01-21T18:06:36.537706Z DEBUG integration_test: tests/integration_test.rs:587: Got HTTP 200, body: Length: 23 (0x17) bytes
0000:   50 6c 65 61  73 65 20 72  65 74 75 72  6e 20 74 6f   Please return to
0010:   20 73 65 6e  64 65 72                                 sender
2023-01-21T18:06:36.537768Z DEBUG integration_test::proxy: tests/proxy.rs:148: Shutting down proxy
2023-01-21T18:06:36.537781Z DEBUG integration_test::proxy: tests/proxy.rs:154: Proxy server shutting down.
2023-01-21T18:06:36.537791Z DEBUG hring::h1::server: src/h1/server.rs:65: client went away before sending request headers
2023-01-21T18:06:36.537803Z DEBUG integration_test::proxy: tests/proxy.rs:144: Done serving h1 connection
2023-01-21T18:06:36.537844Z DEBUG integration_test: tests/integration_test.rs:604: everything has been joined
test curl_echo_body_chunked ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 10 filtered out; finished in 1.00s

        PASS [   1.008s] hring::integration_test curl_echo_body_chunked
------------
     Summary [   1.008s] 1 tests run: 1 passed, 35 skipped

I'm not sure, but it feels odd. I don't see any "sleep for 1 second" code anywhere.

Allow building against non-uring tokio

This could be useful for folks who want to contribute from macOS/Windows.

The WriteOwned / ReadOwned traits are easier to implement against tokio than the other way around.

Here are the sticking points:

  • read / write in tokio land take &mut, even though that's silly for sockets, tbqh. We have an unclear story around the read/write ends of a socket in hring right now; I think we could clean it up. At least in h2, it's cleanly separated (at last).
  • We can no longer do tokio_uring::spawn in any of the core hring crates; we'll have to abstract over tokio_uring::spawn and, most probably, spawn_local (that's what tokio_uring's spawn does right now anyway). That means hring's non-uring version will only work inside a LocalSet, a limitation I'm fine with (it's all built around Rc, not Arc).

I think that's most of it. It should be a relatively small investment, and it could be interesting to compare hring with and without io_uring: number of syscalls, throughput, etc.

h2: Ignore unknown frame types

h2spec -p 8888 http2/4.1/1
Hypertext Transfer Protocol Version 2 (HTTP/2)
  4. HTTP Frames
    4.1. Frame Format
      × 1: Sends a frame with unknown type
        -> The endpoint MUST ignore and discard any frame that has a type that is unknown.
           Expected: PING Frame (length:8, flags:0x01, stream_id:0, opaque_data:)
             Actual: Connection closed

Failures: 

Hypertext Transfer Protocol Version 2 (HTTP/2)
  4. HTTP Frames
    4.1. Frame Format
      × 1: Sends a frame with unknown type
        -> The endpoint MUST ignore and discard any frame that has a type that is unknown.
           Expected: PING Frame (length:8, flags:0x01, stream_id:0, opaque_data:)
             Actual: Connection closed

Finished in 0.0015 seconds
1 tests, 0 passed, 0 skipped, 1 failed

h2: Does State even need to be an `Rc<RefCell>` ?

Afaict only h2_read_loop uses it. We could save a lot of grief by just.. not protecting it at all. A lot of the futures ugliness there comes from making sure we don't await while holding the RefCell guard.

WriteOwned and write

Hi. What's your take on whether or not writev should be interpreted as a write all?

I suspect most people would say it shouldn't be, as there is a count being returned.

So if an implementation wants to simulate a writev with a looping write, should it check the length of each write result and exit the loop early if the write wasn't complete?

And in the case of the owned API, should it have to rebuild the Vec so all the entries are there for the caller to reuse, perhaps after modifying the vec slice and one of the vec entry slices?

Since ownership of the input vec argument is passed in, it seems reasonable to replace the entries per write, but I haven't played with that yet to see how easily the ownership is kept straight without resorting to unsafe.

Consider using HeaderMap<PieceStr>?

There's a lot of smarts in the http crate's HeaderMap, and I'm guessing we'd win more performance than we'd lose (compared to just borrowing everything). It would also make porting something from hyper to hring much less effort.

h2: Clean up headers/trailers handling

The codepaths are gnarly and duplicated.

There's a code comment around end_headers that shows what my plan is. Since it needs both "do sync stuff while holding a RefCell write guard" and "do async stuff while not holding it", it probably needs to error out if the result goes unused: the closest to linear types we get 😭

Simplify buffer interface greatly

AggBuf shouldn't be that complicated. Advancing should always split the buffer and return an AggSlice (maybe those should be renamed AggMut and Agg). AggSlice should not hold an Rc to AggBuf / AggBufInner; it should have a SmallVec<[Buf; 1]> and that's it.

This should remove the need for borrow_mut in a bunch of places.

Bring hring-hpack into the year 2021

  • Make sure all tests are run in CI (there's an optional feature using rustc-serialize; what's it about?)
  • Resolve existing warnings (unused functions, etc.)
  • Bring from Rust 2015 to Rust 2021
  • Migrate from log to tracing
  • Migrate from rustc-serialize to serde_json

Drop `BufOrSlice`, just use `Piece` for everything.

That involves changing the WriteOwned trait, which is fine, whatever.

This is blocked on #150


context: writev might write an arbitrary number of bytes: you might give it ["hello", "world"] and it may write up to the second "o", so then you need your buffer list to be ["rld"]: hence the BufOrSlice enum, to be able to give io_uring the address of "rld".

but for http/2 framing reasons, I recently had to do the same thing: splitting Piece (an enum of &'static [u8], HeaderName, Vec<u8>, etc.) into PieceCore (that enum) and Piece, an enum with Full and Slice variants.

I'd be fine if writev_all only accepted PieceList actually.

h1: Get rid of heap allocation when writing uri.path()

encode_request could take a &mut RollMut and write to it with .put instead of doing a heap allocation.

(We need a stable address to write it with io_uring, but we don't need to bother the heap allocator for that).

Consider switching to sccache-action

cf. https://github.com/Mozilla-Actions/sccache-action

I'm not sure the cache action we're using for Win & macOS right now is doing its job well: on a PR, it took 2m30s just to set itself up.

…then cargo check took 2m+ and cargo nextest 1.5m, for pretty simple changes. Maybe it just doesn't work well for PRs, but then why does it report the cache it found as a "full match"? And in that case, why do we want it at all? PRs are where all the development happens.
