
latency vs u_latency · wrk2 · CLOSED (6 comments)

ckstettler commented on August 22, 2024
latency vs u_latency


Comments (6)

giltene commented on August 22, 2024

The service time is 1 second. The response time is 1 second plus whatever amount of time your request was waiting between when it was scheduled to start (according to your 5 tps schedule) and when it actually got started. The uncorrected latency measures service time, while the corrected latency measures response time.

Response time is the time it takes for a request to get served, as seen by the client. In your setup, you made sure only one request can be served at a time (only 1 connection), and you scheduled client requests to start every 0.2 seconds (5 tps), making sure that the rate of incoming requests is higher than what the server can possibly serve. This means that the longer the test runs, the longer the list of "client waiting but we haven't even started on their request" gets, and from the client's point of view, the [response time] clock for each such client request is already ticking. The service time for each of those requests (in your setup) will be 1 second. But the response time (the time from when the request was scheduled to start according to your 5 tps test schedule until a response occurs) will grow over time: the first request will be served 1 second from its scheduled start time. The second will be served 1.8 seconds from its scheduled start time (request N is served (N * 1.0) - ((N-1) * 0.2) seconds after its scheduled start). The 31st request [scheduled 6 seconds in, and the last one to finish within your 30-second run] will be served 25 seconds from its scheduled start time. The average response time across those 31 requests is roughly (N*(N+1)/2)/N = 16 seconds (exactly 13 seconds once the staggered scheduled start times are subtracted), versus a 1-second average service time....
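For concreteness, here is a minimal toy simulation of that exact setup (single connection, 1.0-second service time, requests scheduled every 0.2 seconds starting at t = 0). This is not wrk2 code, just the arithmetic above written as a loop:

```c
#include <stdio.h>

/* Toy sketch: one connection, 1.0 s service time, requests scheduled every
 * 0.2 s (5 tps). "Uncorrected" latency is measured from when the request is
 * actually sent; "corrected" latency from when it was *scheduled* to be sent. */
int main(void) {
    const double service_time = 1.0;   /* seconds per request      */
    const double interval     = 0.2;   /* 5 tps schedule           */
    const int    served       = 31;    /* requests served in ~30 s */

    double sum_uncorrected = 0.0, sum_corrected = 0.0;
    double server_free_at  = 0.0;      /* when the single connection is idle */

    for (int n = 1; n <= served; n++) {
        double scheduled = (n - 1) * interval;      /* intended start        */
        double start     = scheduled > server_free_at ? scheduled : server_free_at;
        double finish    = start + service_time;
        server_free_at   = finish;

        sum_uncorrected += finish - start;          /* service time: 1.0 s   */
        sum_corrected   += finish - scheduled;      /* grows as queue builds */
    }
    printf("mean uncorrected (service)  latency: %.1f s\n", sum_uncorrected / served);
    printf("mean corrected   (response) latency: %.1f s\n", sum_corrected / served);
    return 0;
}
```

With these numbers it prints a mean uncorrected latency of 1.0 seconds and a mean corrected latency of 13.0 seconds, matching the arithmetic above.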

To see why this measurement is the obviously right measurement for response time (as observed by clients), imagine the following:

When you go to a coffee shop that has one (single-threaded) barista who takes 1 minute to make a cup of coffee, and 5 people show up and start waiting in line every minute, and you stop starting new coffees after 30 minutes (you shut the door after 30 minutes even though there are people waiting in line, and only serve the people whose coffee you started in those 30 minutes), the 31st person in line (the last person to be served, who came in 6 minutes after you opened) will end up waiting 25 minutes for their cup of coffee. When asked "how long did it take you to get a cup of coffee here?", that last person would say "25 minutes", while the first person would say "1 minute". On average (across all 31 people whose coffee got started in the 30-minute window), the answer would be roughly 16 minutes ((31*32/2) / 31).


ckstettler commented on August 22, 2024

Thank you for the response. I get what you are saying. It's kind of like a wind chill factor in the weather: you are reporting what it will feel like to the client, due to the queuing.

I was having trouble understanding why it reported an ever-increasing response time, but I think I get it now. In other tools I have used, the virtual users are slowed by the server's delay, so 5 virtual users attempting 1 tps each would be throttled and I would generally see something like 5 seconds per request from each, instead of 1 second. It was more of a multiplier.

In your model you compute response times as if the arrival rate never slows down. In my case, if the incoming rate is constant at 5 tps and I am servicing at 1 tps, then 5 more requests show up before I can service one, and by the time the first 5 are processed another 25+ are waiting, hence the ever-growing latency.

I still think it would be nice to have a choice of which metric to display in the reporting. You provide a means to see the details in u_latency, but you don't allow it in the summary data. I see why you have implemented it your way, but I still think about it the other way.

Again, thank you for your time and for the tool. It has been fun dabbling in C again!!


ckstettler commented on August 22, 2024

Just another note and I will let this one go. The other thing that confused me was that even though the response times were being recorded as 200 secs, there were never any timeouts, which told me that the round trip was actually pretty fast. So I assume timeouts are computed on the uncorrected data.


giltene commented on August 22, 2024

Internal timeout counts in the load generator have no relation to the timeouts that clients of the tested service would experience.

It would certainly be possible to report on the timeouts that actual clients would experience under the load scenario, assuming those clients time out when not receiving a response within X time after starting a request. But that would simply amount to reporting "what % of (or how many total) requests had a response time greater than X". And since high-fidelity percentile output already exists for the response time metric, and those percentiles effectively include all the timeout information you could possibly want (without forcing you to choose a specific timeout level for the test), I don't see much point in adding correct timeout modeling for a specific timeout level.
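To make that concrete, computing a timeout rate from the corrected response-time samples is a one-liner over data the percentiles already summarize. A hypothetical sketch (not wrk2's actual reporting code, and with made-up sample values):

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: the fraction of corrected response-time samples that
 * would have exceeded a client timeout of `timeout_s` seconds. A per-timeout
 * report would amount to exactly this, and the percentile output already
 * carries the same information for every possible threshold at once. */
static double timeout_fraction(const double *response_times, size_t n, double timeout_s) {
    size_t exceeded = 0;
    for (size_t i = 0; i < n; i++)
        if (response_times[i] > timeout_s)
            exceeded++;
    return n ? (double)exceeded / (double)n : 0.0;
}

int main(void) {
    /* Illustrative sample values only. */
    double samples[] = {0.9, 1.2, 3.5, 0.4, 7.8, 0.2, 12.0, 0.3};
    size_t n = sizeof samples / sizeof samples[0];
    printf("%% of requests that would time out at 5 s: %.0f%%\n",
           100.0 * timeout_fraction(samples, n, 5.0));
    return 0;
}
```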


giltene commented on August 22, 2024

The value of this measurement is much more apparent when you load test servers that can "on average" keep up with the incoming request rate, but have temporary periods during the run where they can't. E.g. when a service slows down temporarily for some housekeeping work like re-indexing, re-loading catalogs, pausing for garbage collection, or yielding the CPU to another service on the same machine, the rate of incoming requests can be temporarily higher than the service rate, and a queue of started (from the client's point of view) but not-yet-served requests will build up. If the server is able to (on average) keep up with a higher-than-incoming rate, this queue will be drained over time, but each request that a client started during that time will experience a longer response time.

The simplest way to demonstrate the effect is to run a different version of your test: drop your server delay from 1 second to 0.1 seconds, such that your server can handle 2x the incoming 5 tps request rate. Now run your 30-second test, but 10 seconds into the run, ^Z the server, and then let it keep running (with "fg") after 5 seconds (or use "kill -STOP ; sleep 5; kill -CONT " for reliable timing). Now compare the response time reported with the service time reported, and think about which one represents what happened from a client's point of view... And then imagine the same thing happening in the real world, but with occasional 50 msec or 150 msec blips instead of 5-second blips (such that none of your normal tools observe them happening), and ask which number will tell you how your service response time actually behaved....
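As a rough illustration of what to expect (a toy model only, not wrk2 itself; it treats the uncorrected number as the constant 0.1-second service time): with a 5-second stall 10 seconds into a 30-second, 5 tps run, every request scheduled during the stall and for a while afterwards sees a multi-second corrected latency, while the service-time numbers barely register it:

```c
#include <stdio.h>

/* Toy sketch: 0.1 s service time, 5 tps schedule, one connection, and a
 * simulated server stall from t = 10 s to t = 15 s (as produced by
 * kill -STOP / sleep 5 / kill -CONT). Only the corrected (response) time
 * sees the stall, because each request's clock starts at its *scheduled*
 * send time. */
int main(void) {
    const double service = 0.1, interval = 0.2;     /* 5 tps              */
    const double stall_start = 10.0, stall_end = 15.0;
    const int    total = 150;                       /* 30 s of schedule   */
    double server_free_at = 0.0, sum_response = 0.0, max_response = 0.0;
    int over_one_second = 0;

    for (int n = 1; n <= total; n++) {
        double scheduled = (n - 1) * interval;
        double start = scheduled > server_free_at ? scheduled : server_free_at;
        if (start >= stall_start && start < stall_end)
            start = stall_end;                      /* server is paused   */
        double finish = start + service;
        server_free_at = finish;

        double response = finish - scheduled;       /* corrected latency  */
        sum_response += response;
        if (response > max_response) max_response = response;
        if (response > 1.0) over_one_second++;
    }
    printf("service time (uncorrected) stays at %.1f s for every request\n", service);
    printf("corrected latency: mean %.2f s, max %.2f s, %d requests over 1 s\n",
           sum_response / total, max_response, over_one_second);
    return 0;
}
```

In this model the corrected latency peaks at about 5.1 seconds and stays above 1 second for roughly 40 requests, even though each request was served in 0.1 seconds once it actually started.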


bendizen commented on August 22, 2024

@ckstettler I came across the situation you described in a 1-hour test using wrk2, caused by a pause for garbage collection. Gil's explanation makes sense, yet I also agree it would be helpful to see the u_latency in the summary data. I still don't think I understand the reasoning for why no timeouts are recorded in this situation.

