
Comments (3)

greensky00 commented on June 24, 2024

Hi @Steamgjk

  1. It depends on 1) your workload and 2) how many cores your machine has. On many-core machines, more threads help achieve better CPU utilization. If your workload is enough to fully utilize 4 threads (so that CPU usage is nearly 400%) and you have more than 4 cores, then increasing the thread pool size will improve performance. Otherwise, the improvement will be marginal.

  2. Examples use raft_launcher, which internally creates asio_listener_:

    asio_listener_ = asio_svc_->create_rpc_listener(port_number, lg);

  3. There are a few different Raft settings:

    raft_params params;
    params.heart_beat_interval_ = 500;
    params.election_timeout_lower_bound_ = 1000;
    params.election_timeout_upper_bound_ = 2000;
    params.reserved_log_items_ = 10000000;
    params.snapshot_distance_ = 100000;
    params.client_req_timeout_ = 4000;
    params.return_method_ = raft_params::blocking;

    If you set the same parameters, there should be no performance difference.
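
    Putting points 2 and 3 together, a minimal launcher-based setup might look like the sketch below. This is only an illustration based on the NuRaft example programs; the header path, the raft_launcher::init() signature, and where your own state_machine / state_mgr / logger implementations come from should be checked against your NuRaft version:

    #include <libnuraft/nuraft.hxx>

    using namespace nuraft;

    ptr<raft_server> launch(ptr<state_machine> sm,
                            ptr<state_mgr> smgr,
                            ptr<logger> lg,
                            int port_number) {
        // Asio thread pool; see point 1 above about sizing.
        asio_service::options asio_opt;
        asio_opt.thread_pool_size_ = 4;

        // Raft settings from point 3.
        raft_params params;
        params.heart_beat_interval_ = 500;
        params.election_timeout_lower_bound_ = 1000;
        params.election_timeout_upper_bound_ = 2000;
        params.reserved_log_items_ = 10000000;
        params.snapshot_distance_ = 100000;
        params.client_req_timeout_ = 4000;
        params.return_method_ = raft_params::blocking;

        // raft_launcher creates the asio service and the RPC listener
        // (the asio_listener_ mentioned in point 2) internally.
        // The launcher must outlive the returned server.
        static raft_launcher launcher;
        return launcher.init(sm, smgr, lg, port_number, asio_opt, params);
    }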

from nuraft.

Steamgjk commented on June 24, 2024

Thanks for the explanation, @greensky00
I just ran a benchmark on Google Cloud, directly using the bench program in the repo. I am using 3 replica VMs, each of type n1-standard-32 (that means each VM has 32 cores), with asio_opt.thread_pool_size_ = 32 accordingly. The testing result is as follows. We can see that the max throughput is only 34.3K/second, and the p50 latency is 845 us.

[image: benchmark result table]

Then I ran a low-load test, and the result is as follows. With a 1K/second load, the p50 latency is 362 us.

[image: low-load benchmark result]

In my cluster, the ping latency is around 250~300 us, which means one RTT should be around that value. Considering that message serialization/deserialization also takes some time, I am fine with the low-load latency (362 us). [If we want to be more critical, there is still some inconsistency with your reported bench results: your RTT is 180 us and you can reach a median of 187 us, while my RTT is around 250 us but I get 362 us.] However, I am a little concerned about the throughput number: there is still a 7K/second gap between my result and your reported bench results (around 40K/second with 16 client threads). And according to your bench report, your replicas have only 8 cores, so I am using more powerful VMs but getting weaker results. Do you have any idea about that? What black magic can we use to further improve the throughput?

[image: benchmark results]


greensky00 commented on June 24, 2024

@Steamgjk
The numbers on the benchmark result page are just for reference, and of course, the performance will vary according to the environment.

And note that the workload generated by the benchmark program will not be CPU-bound unless the network is super-fast. That means having 32 cores does not help improve performance. Given that your network environment is a bit slower than that of our data center, the discrepancy of 34.3K vs. 40K seems reasonable.

Regarding p50 latency, the number on the result page was measured with the workload of a single client thread at max throughput. You can increase your target throughput to a big number (say, 1M) and re-measure it:

raft_bench 1 10.128.0.59:12345 120 1000000 1 256 10.128.0.73:12345 10.128.0.28:12345

If you use multiple client threads and higher throughput, a longer p50 latency is expected. Even though client threads call the Raft API independently in parallel, each request is assigned a unique Raft log index number, and replication must be completed in exactly that order. That means unlucky requests have to wait for the completion of the replication (including commit) of Raft logs with smaller index numbers, and this wait time is reflected in the latencies.

