Comments (16)

ThePrimeagen avatar ThePrimeagen commented on May 13, 2024

BTW, I was going to use this issue to update my progress on the aforementioned set of tasks, then close it when done.

from benchmarking.

ThePrimeagen avatar ThePrimeagen commented on May 13, 2024

Update
As I started the benchmarking I realized that testing the require module is probably a bad place to start, since it's a one-time start-up cost and probably the least concerning to a sophisticated node application. I will continue to get the tests up, simply to prove out the framing of our benchmark tests and to get the first one plotted.

gareth-ellis avatar gareth-ellis commented on May 13, 2024

I guess it depends on what we're measuring - I'd say if it's startup we're measuring, then require is a cost that almost every node application is going to pay.

Maybe something for another issue, but in your experience, how are people (how is Netflix) running node? I remember at Node.js Interactive the talk on Netflix mentioned the much faster startup time being a definite win for node. How often are node instances being restarted? Do you have any very short-lived node applications?

mhdawson avatar mhdawson commented on May 13, 2024

The performance of require was called out specifically by sam-github here: #22. Even though startup is a one-time cost, it's still seen as a key metric for runtimes.

ThePrimeagen avatar ThePrimeagen commented on May 13, 2024

@gareth-ellis @mhdawson I doubt the validity of this. But, as per the agreed-upon task set, I will start at require.

@yunong can correct me if I am wrong, but the start-up time of the website prior to node was so much longer that switching to node was a huge win, regardless of require time. I know I am getting really pedantic, but node start-up time and application start-up time can be considered separately... :) Just saying.

As mentioned above, I will start on require (will be done shortly) and then I'll move on to what I consider to be the next most-used node core lib.

And yes @gareth-ellis we have shorter-lived node apps. But short-lived is probably a relative term... @yunong can speak to the average life of our node servers.

ThePrimeagen avatar ThePrimeagen commented on May 13, 2024

require, URL, and events are "done".

jeffreytgilbert avatar jeffreytgilbert commented on May 13, 2024

Where does the performance of http, keepalive, sockets, etc. come in? There's certainly an argument to be made that if you're writing a service in node, you're going to need those basics to show perf gains over time. As important as it is to benchmark the methods you bake all your applications with, I don't know of anyone writing their own http module in javascript.

On that note, I'm curious to know whether you still see javascript as the best mechanism for testing things like concurrent connection handling, and how those tests could fall over if benched from the same machine as the one that's running the http process. Additionally, clustering and forked-process speeds have a serious lack of transparency into how fast they actually perform. I know there were scheduler changes in the latest releases vs 0.10 which were supposed to fix these, but I haven't found any benchmarks that showcase this.

jeffreytgilbert avatar jeffreytgilbert commented on May 13, 2024

To follow that last message up, here are some alternate benchmarking tools I've run in the past which offer fuel for the discussion.

https://github.com/wg/wrk
https://github.com/newsapps/beeswithmachineguns

ThePrimeagen avatar ThePrimeagen commented on May 13, 2024

@jeffreytgilbert Good morning! I love the benchmarking tools you provided. I will definitely be checking those out.

As of right now a few of the members of the WG are working on getting the Acme Air application set up to be part of the benchmarking suite. My part is to focus from the other side. Here is my rough plan of attack.

  1. Start with the smallest elements, true micro-benchmarks. These are useful metrics to have and give signals when the overall system has slowed down.
  2. Build some macro tests out of combined micro-benchmarks: some form of caching algorithm, or other fun complex pieces of work that push one aspect to the max and avoid some of the pitfalls of micro-benchmarks.
  3. If Acme Air has yet to get off the ground, I'll probably start setting up various low-level http tests. This is where the aforementioned libraries will potentially come in handy, and where I would be testing several of the spawning/process libraries (including cluster). I'll probably create some sort of aggregation service and see how many requests can be processed over some time period.
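Step 2 might look something like this toy sketch (the workload and names are invented for illustration): a Map-based LRU cache driven with a skewed access pattern, so that GC and Map internals show up in the measurement rather than a single hot opcode.

```javascript
class LruCache {
  constructor(limit) { this.limit = limit; this.map = new Map(); }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key);       // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }
  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    else if (this.map.size >= this.limit) {
      this.map.delete(this.map.keys().next().value); // evict least recently used
    }
    this.map.set(key, value);
  }
}

// Keys wrap past the cache limit, so the workload mixes hits, misses,
// and evictions instead of hammering one code path.
const cache = new LruCache(1000);
const start = process.hrtime.bigint();
for (let i = 0; i < 1e5; i++) cache.set(i % 1500, i);
console.log(`1e5 sets: ${(Number(process.hrtime.bigint() - start) / 1e6).toFixed(2)} ms`);
```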

If you have any thoughts or ideas please let me know.

jeffreytgilbert avatar jeffreytgilbert commented on May 13, 2024

Saving the test data as a png is fine. Saving it to a new repo is fine. Save the raw data as well. Use time-series storage for some of the benchmarks: OpenTSDB, or a system like InfluxDB or Prometheus, or one of those that you can tie into Grafana or another graphing suite.
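For example, a run's result could be serialized to InfluxDB's line protocol (measurement, tag set, field set, timestamp) before being pushed; the measurement and tag names below are made up for illustration:

```javascript
// Format one benchmark result as an InfluxDB line-protocol record.
// Shape: measurement,tag_set field_set timestamp
function toLineProtocol(benchmark, nodeVersion, opsPerSec, timestampNs) {
  return `node_bench,benchmark=${benchmark},node=${nodeVersion} ` +
         `ops_per_sec=${opsPerSec} ${timestampNs}`;
}

console.log(toLineProtocol('require', 'v6.2.0', 123456.7, 1465776000000000000n));
```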

mhdawson avatar mhdawson commented on May 13, 2024

In terms of the plans for how we are initially going to save the data and graph it, see https://github.com/nodejs/benchmarking/blob/master/benchmarks/README.md. I'm slowly working on this, with one of the steps being to add the micro-benchmarks contributed by @michaelbpaulson to those we generate graphs for.

jeffreytgilbert avatar jeffreytgilbert commented on May 13, 2024

Also see "Are We Fast Yet" for comparison. I believe I heard someone reference this on the videocast I listened in to, but my memory is failing me. https://arewefastyet.com/

It's relevant because it compares browsers using benchmark suites across their build types, and it plots results over time so you can easily see deltas.

jeffreytgilbert avatar jeffreytgilbert commented on May 13, 2024

http://grafana.org/
http://opentsdb.net/
https://influxdata.com/
https://prometheus.io/

mhdawson avatar mhdawson commented on May 13, 2024

Yes, the plan outlined in https://github.com/nodejs/benchmarking/blob/master/benchmarks/README.md will result in graphs along these lines: https://github.com/nodejs/benchmarking/blob/master/benchmarks/startup_footprint/footprint.png

ofrobots avatar ofrobots commented on May 13, 2024

While I do think that micro-benchmarks have their place, I wanted to offer a slightly different perspective.

My personal experience, and also the experience of the V8 team, has been that there is a lot of day-to-day churn and variability in micro-benchmarks. In a former life I personally worked on teams that wasted person-years of development time because of spurious regressions on micro-benchmarks (*cough* CaffeineMark etc. *cough*).

Micro-benchmarks try to set precise expectations about how a specific piece of code is going to be executed by the computer. That is hard to guarantee in a managed runtime, where the JIT and GC are constantly conspiring against you, in different ways each day. As a result, your micro-benchmarks may not be measuring what you expect them to be measuring. The exact timing of when the JIT decides to optimize things, or how much it wants to optimize, or the exact threshold at which V8 chooses to switch an Array to dictionary mode, can affect micro-benchmarks a lot more than real code. A micro-benchmark may be inadvertently sensitive to such incidental details, and this sensitivity may completely drown out the signal you were originally intending to capture.
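One concrete illustration of this (a toy sketch, not a recommendation): timing the same loop cold versus after a warm-up phase can report very different numbers, because the cold measurement includes compile/optimize work that has nothing to do with the code's steady-state speed.

```javascript
// A small workload with a deterministic result we can sanity-check.
function work(n) {
  let sum = 0;
  for (let i = 0; i < n; i++) sum += i % 7;
  return sum;
}

// Time a single execution in nanoseconds.
function timeOnce(n) {
  const start = process.hrtime.bigint();
  work(n);
  return Number(process.hrtime.bigint() - start);
}

const cold = timeOnce(1e6);              // likely includes JIT compile time
for (let i = 0; i < 50; i++) work(1e6);  // warm-up: let the JIT settle
const warm = timeOnce(1e6);

console.log({ coldNs: cold, warmNs: warm });
```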

Here's a video (by the awesome Vyacheslav Egorov) that talks about benchmarking JavaScript, that touches on these topics: https://www.youtube.com/watch?v=65-RbBwZQdU. See also a related discussion on the topic of micro-benchmarks: robohornet/robohornet#67

It is not always clear whether the results from a micro-benchmark will translate to real-world applications. Apart from being a bad predictor, this can also misplace the incentives for a VM implementer, steering work away from improving the performance of real-world applications.

Having said that, I do think targeted micro-benchmarks make sense for specific things. It is perfectly reasonable to have a benchmark for startup performance. It is also perfectly reasonable to have a simple benchmark that measures http server throughput for a trivial http server.

The point I am trying to convey is you can evaluate a benchmark on a case by case basis only. I would contest a blanket statement about virtues of micro-benchmarks in general, and I would be against this WG adopting such a philosophy. A well-realized performance regression suite should also have larger workloads derived from real world use-cases.

mhdawson avatar mhdawson commented on May 13, 2024

@michaelbpaulson I think we should close this for now as there has been no update for > 1 year. If you want to restart the conversation, feel free to re-open.
