GithubHelp home page GithubHelp logo

Comments (2)

nyh avatar nyh commented on June 7, 2024

This problem is not specific to the HTTP server implementation. We had the same problem in ScyllaDB in other non-http servers, which like the HTTP server listen()ed and accept()ed connections in all cores in parallel. The bad port is detected as expected on shard 0, but other shards result in the assertion failure shown above, and cause the whole application to crash. The problem should probably be fixed in the context of listen() and/or accept(), not the HTTP server. I'm removing the HTTP server label from the issue.

from seastar.

xemul avatar xemul commented on June 7, 2024

The code in src/net/posix-stack.cc has a strange assert(i.second), with no explanation what it means, and this is the assert that fails.

The assertion means, that the code tried to emplace some new object to wait for connection, but emplace returned that that object was already there. See below for why the object was expected to be new and why it turned out that it wasn't.

This assert was introduced by @gleb-cloudius almost a decade ago, in commit 19addf6. The commit message explains that "only one thread opens listen socket and runs accept(). Other threads emulate listen()/accept() by waiting for connected socket from the main thread.". I'm not 100% sure that this commit message is actually up to date, because I was under the impression that today all cores accept() connections - and send them to other cores. But I'm not sure.

Currently only one core accepts sockets for real, other cores wait on sockets from that core via "ap-server" thing:

network_stack_entry register_posix_stack() {
    return network_stack_entry{
        "posix", std::make_unique<program_options::option_group>(nullptr, "Posix"),
        [](const program_options::option_group& ops) {
            return smp::main_thread() ? posix_network_stack::create(ops)
                                      : posix_ap_network_stack::create(ops);
        },
        true};
}   

the main_thread() returns true to only one shard's thread

This also raises another question: Is it important that http_server::listen() be called first on core 0, and only then on other cores? Right now I call all of them in parallel, and luckily it does run on core 0 first, but what if it runs first on core 1, before core 0?

This is how it works.

First, the httpd::listen() calls seastar::listen() and .accept() on the returned server_socket. On shard 0 this maps to posix_network_stack::listen() + posix_server_socket_impl::accept(). On non-0 shards this maps to posix_ap_network_stack::listen() + posix_ap_server_socket_impl::accept(). The ap-socket's accept code emplaces a "new object" mentioned above into sockets map to listen for connections coming from main thread. The sockets map is std::unordered_map<std::tuple<int, socket_address>, promise<accept_result>>, where int is the protocol (PF_... value). So it maps socket address to a promise which will be resolved into connected socket later.

Second, yet another httpd::listen() is called and it does the very same thing -- calls seastar::listen() and .accept() on the returned socket and both map to the same pair of calls as in first step. On non-main thread the .accept() call, which maps to posix_ap_server_socket_impl::accept(), tries to emplace one more "new object" mentioned above into sockets map to listen for connections coming from main thread, but the key it emplaces, is already there after the first https::listen()! It's already there, because socket_address part is the same for both -- first and second invocations. Thus the assertion.

from seastar.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.