Comments (9)
Actually that test is a little bit of an inside joke on my part, demonstrating how you can lie with a benchmark. All it is doing is testing how fast the sequencer can signal the consumer, but only on every 10th update. It doesn't do any actual useful work.
How does the modest-lock fare with the OnePublisherToOneProcessorUniCastThroughputTest? Also, do you have a link to the code?
If all I wanted to do was improve that test I could just have one thread polling a sequence with another thread updating it in batches of 10.
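The "simple polling" baseline described above can be sketched as two plain threads: one publishing a sequence in batches of 10, one polling it. All names here are illustrative, not Disruptor API; it only mirrors the shape of the trick being described.

```java
import java.util.concurrent.atomic.AtomicLong;

public class PollingBaseline {
    static final AtomicLong cursor = new AtomicLong(-1); // the published sequence
    static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread publisher = new Thread(() -> {
            long next = 0;
            while (running) {
                next += 10;               // claim a batch of 10
                cursor.lazySet(next - 1); // publish the whole batch with one store
            }
        });
        Thread consumer = new Thread(() -> {
            long seen = -1;
            long observed = 0;
            while (running) {
                long available = cursor.get(); // just poll the published sequence
                if (available > seen) {
                    observed += available - seen;
                    seen = available;
                }
            }
            System.out.println("observed >= 0: " + (observed >= 0));
        });
        publisher.start();
        consumer.start();
        Thread.sleep(200);                // run briefly
        running = false;
        publisher.join();
        consumer.join();
        System.out.println("done");
    }
}
```

Because the consumer does no work per event, such a loop measures little beyond cache-line ping-pong, which is exactly the point being made about the benchmark.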
Existing Disruptor:
Starting Disruptor tests
Run 0, Disruptor=1,998,001,998 ops/sec
Run 1, Disruptor=2,079,002,079 ops/sec
Run 2, Disruptor=2,157,497,303 ops/sec
Run 3, Disruptor=2,114,164,904 ops/sec
Run 4, Disruptor=2,152,852,529 ops/sec
Run 5, Disruptor=2,205,071,664 ops/sec
Run 6, Disruptor=3,577,817,531 ops/sec
Run 7, Disruptor=3,546,099,290 ops/sec
Run 8, Disruptor=3,610,108,303 ops/sec
Simple polling code:
Starting Disruptor tests
Run 0, Disruptor=6,191,950,464 ops/sec
Run 1, Disruptor=6,042,296,072 ops/sec
Run 2, Disruptor=6,369,426,751 ops/sec
Run 3, Disruptor=6,289,308,176 ops/sec
Run 4, Disruptor=6,389,776,357 ops/sec
from disruptor.
OnePublisherToOneProcessorRawBatchThroughputTest
It seems both batch by 10, yet there is only a little improvement on my machine. Strange! It should be ~10x.
Run 0, Disruptor=871,839,581 ops/sec
Run 1, Disruptor=811,030,008 ops/sec
Run 2, Disruptor=1,231,527,093 ops/sec
Run 3, Disruptor=1,320,132,013 ops/sec
Run 4, Disruptor=1,320,132,013 ops/sec
Run 5, Disruptor=1,185,536,455 ops/sec
Run 6, Disruptor=1,143,510,577 ops/sec
Run 7, Disruptor=1,067,235,859 ops/sec
Run 8, Disruptor=1,243,008,079 ops/sec
Run 9, Disruptor=1,268,230,818 ops/sec
Run 10, Disruptor=1,391,788,448 ops/sec
Run 11, Disruptor=1,334,222,815 ops/sec
Run 12, Disruptor=1,320,132,013 ops/sec
Run 13, Disruptor=1,292,824,822 ops/sec
Run 14, Disruptor=1,255,492,780 ops/sec
Run 15, Disruptor=1,267,427,122 ops/sec
Run 16, Disruptor=1,219,512,195 ops/sec
Run 17, Disruptor=1,243,008,079 ops/sec
Run 18, Disruptor=1,164,144,353 ops/sec
Run 19, Disruptor=1,085,187,194 ops/sec
OK, let's get back to the 10:1 pattern.
Indeed, the modest-lock is very simple, just like this:
if((counter&1)==1) Thread.yield();
return counter-1;
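The two lines above are the whole trick: yield on every other spin iteration so the waiter gives up the CPU "modestly". A self-contained sketch of using it inside a busy-wait loop (modestWait and the surrounding harness are hypothetical names, not Disruptor API):

```java
import java.util.concurrent.atomic.AtomicLong;

public class ModestLockSpin {
    // Yield on every other call; otherwise spin hot.
    static int modestWait(int counter) {
        if ((counter & 1) == 1) Thread.yield();
        return counter - 1;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicLong cursor = new AtomicLong(-1);
        Thread consumer = new Thread(() -> {
            int counter = 1 << 20;       // arbitrary spin budget
            while (cursor.get() < 0) {   // busy-wait for the publisher
                counter = modestWait(counter);
            }
            System.out.println("saw " + cursor.get());
        });
        consumer.start();
        Thread.sleep(50);
        cursor.set(42);                  // "publish"
        consumer.join();
    }
}
```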
These are the results after applying the modest-lock to this test, still using the write-10:read-1 pattern:
Run 0, Disruptor=1,257,071,024 ops/sec
Run 1, Disruptor=1,292,824,822 ops/sec
Run 2, Disruptor=1,579,778,830 ops/sec
Run 3, Disruptor=1,542,020,046 ops/sec
Run 4, Disruptor=1,506,024,096 ops/sec
Run 5, Disruptor=1,581,027,667 ops/sec
Run 6, Disruptor=1,523,229,246 ops/sec
Run 7, Disruptor=1,506,024,096 ops/sec
Run 8, Disruptor=1,471,670,345 ops/sec
Run 9, Disruptor=1,471,670,345 ops/sec
Run 10, Disruptor=1,506,024,096 ops/sec
Run 11, Disruptor=1,542,020,046 ops/sec
Run 12, Disruptor=1,542,020,046 ops/sec
Run 13, Disruptor=1,543,209,876 ops/sec
Run 14, Disruptor=1,506,024,096 ops/sec
Run 15, Disruptor=1,542,020,046 ops/sec
Run 16, Disruptor=1,506,024,096 ops/sec
Run 17, Disruptor=1,454,545,454 ops/sec
Run 18, Disruptor=1,543,209,876 ops/sec
Run 19, Disruptor=1,506,024,096 ops/sec
After changing SingleProducerSequencer's next() reserve operation to wait with Thread.yield() instead of parkNanos():
Run 0, Disruptor=1,257,071,024 ops/sec
Run 1, Disruptor=1,292,824,822 ops/sec
Run 2, Disruptor=1,579,778,830 ops/sec
Run 3, Disruptor=1,542,020,046 ops/sec
Run 4, Disruptor=1,506,024,096 ops/sec
Run 5, Disruptor=1,581,027,667 ops/sec
Run 6, Disruptor=1,523,229,246 ops/sec
Run 7, Disruptor=1,506,024,096 ops/sec
Run 8, Disruptor=1,471,670,345 ops/sec
Run 9, Disruptor=1,471,670,345 ops/sec
Run 10, Disruptor=1,506,024,096 ops/sec
Run 11, Disruptor=1,542,020,046 ops/sec
Run 12, Disruptor=1,542,020,046 ops/sec
Run 13, Disruptor=1,543,209,876 ops/sec
Run 14, Disruptor=1,506,024,096 ops/sec
Run 15, Disruptor=1,542,020,046 ops/sec
Run 16, Disruptor=1,506,024,096 ops/sec
Run 17, Disruptor=1,454,545,454 ops/sec
Run 18, Disruptor=1,543,209,876 ops/sec
Run 19, Disruptor=1,506,024,096 ops/sec
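The producer-side change being tested above can be sketched as follows. In Disruptor 3.x, SingleProducerSequencer.next() spins with LockSupport.parkNanos(1L) while the ring buffer is full; the variant swaps that call for Thread.yield(). Everything besides those two wait calls is an illustrative stand-in, not the real class:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

public class ProducerClaimSketch {
    final int bufferSize;
    final AtomicLong consumerSequence; // slowest consumer's progress
    long nextValue = -1;               // producer's local cursor
    final boolean useYield;

    ProducerClaimSketch(int bufferSize, AtomicLong consumerSequence, boolean useYield) {
        this.bufferSize = bufferSize;
        this.consumerSequence = consumerSequence;
        this.useYield = useYield;
    }

    long next() {
        long nextSequence = nextValue + 1;
        long wrapPoint = nextSequence - bufferSize;
        // Spin until the consumer has moved past the slot we want to reuse.
        while (wrapPoint > consumerSequence.get()) {
            if (useYield) Thread.yield();        // the variant under test
            else LockSupport.parkNanos(1L);      // what 3.x does today
        }
        nextValue = nextSequence;
        return nextSequence;
    }

    public static void main(String[] args) {
        AtomicLong consumer = new AtomicLong(-1);
        ProducerClaimSketch producer = new ProducerClaimSketch(8, consumer, true);
        for (int i = 0; i < 8; i++) {
            consumer.set(i - 1);                 // pretend the consumer keeps up
            System.out.println("claimed " + producer.next());
        }
    }
}
```

The choice between yield and parkNanos here only matters on the slow path (buffer full), which is why its effect is so sensitive to the scheduler and the hardware.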
After applying the modest-lock to the next() reserve operation:
Run 0, Disruptor=1,490,312,965 ops/sec
Run 1, Disruptor=1,489,203,276 ops/sec
Run 2, Disruptor=1,662,510,390 ops/sec
Run 3, Disruptor=1,683,501,683 ops/sec
Run 4, Disruptor=1,662,510,390 ops/sec
Run 5, Disruptor=1,683,501,683 ops/sec
Run 6, Disruptor=1,683,501,683 ops/sec
Run 7, Disruptor=1,662,510,390 ops/sec
Run 8, Disruptor=1,662,510,390 ops/sec
Run 9, Disruptor=1,662,510,390 ops/sec
Run 10, Disruptor=1,661,129,568 ops/sec
Run 11, Disruptor=1,662,510,390 ops/sec
Run 12, Disruptor=1,662,510,390 ops/sec
Run 13, Disruptor=1,662,510,390 ops/sec
Run 14, Disruptor=1,683,501,683 ops/sec
Run 15, Disruptor=1,683,501,683 ops/sec
Run 16, Disruptor=1,662,510,390 ops/sec
Run 17, Disruptor=1,684,919,966 ops/sec
Run 18, Disruptor=1,683,501,683 ops/sec
Run 19, Disruptor=1,684,919,966 ops/sec
BTW, a strange feeling: indeed, we are just guessing at the OS scheduler's behaviour.
All we need is the right scheduler, but ...
I created a gist here: https://gist.github.com/qinxian/5771879
Are you running with HyperThreading enabled?
No!
It's an AMD X3 :)
BTW, I tried the JDK 8 @Contended annotation on the field.
It seems the padded volatile long field is faster than the long[] implementation.
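The two padding styles being compared can be sketched as follows. JDK 8's @Contended lives in sun.misc and needs -XX:-RestrictContended, so the self-contained version below shows the manual-padding alternative instead: spreading filler longs around the hot field so it sits alone on its cache line (the layered-superclass layout is the style Disruptor's Sequence uses; the class names here are illustrative):

```java
// Left-hand padding, the hot field, then right-hand padding. The filler
// longs are never read; they exist only to push neighbours off the line.
class LhsPadding { protected long p1, p2, p3, p4, p5, p6, p7; }
class Value extends LhsPadding { protected volatile long value; }
class RhsPadding extends Value { protected long p9, p10, p11, p12, p13, p14, p15; }

public class PaddedSequence extends RhsPadding {
    PaddedSequence(long initial) { this.value = initial; }
    long get() { return value; }
    void set(long v) { value = v; }

    public static void main(String[] args) {
        PaddedSequence seq = new PaddedSequence(-1);
        seq.set(42);
        System.out.println("value = " + seq.get());
    }
}
```

With @Contended the JVM inserts the padding itself, which also removes the array indirection that a long[] based padding scheme pays on every access.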
The difference I see with ModestLock is not as marked as your results, and the difference on the OnePublisherToOneProcessorUniCastThroughputTest is lower than the noise. I think these small optimisations will vary between hardware platforms. One of the reasons we made the WaitStrategy pluggable is to allow these types of optimisations. If it speeds up your system end to end, then go for it, but don't base your decision on the OnePublisherToOneProcessorRawBatchThroughputTest, as it doesn't test anything useful; base it on your own macro-benchmarks.
I've also had a go with @Contended; it didn't make a massive difference, but it should be a little bit quicker as it removes one indirection. Unfortunately it will be a while before Java 8 is the standard. I might do a Java 8-specific version if there is enough interest.
I expected the P10-C10 pattern to show a 2.x improvement over the P10-C1 pattern in your results.
Indeed, I always use JDK 8 with Windows 8 on AMD HT.
In these cases, if both ends employ the modest-lock, the version runs with about a 2.x effect.
So one deduction relative to the spec seems to be: multiple publishers could profit in a similar way.
From the messages above, it seems you tested on Intel HT, so it seems it's Intel HT vs. AMD HT.
But I'm still interested in the Intel HT modest-lock test results.
Of course, the results are only a single test case. Whether that is useful or useless depends.
But it is some kind of reference, right?
BTW, as with my earlier "guess" remark, there is some sadness about the kernel. Someone like me only works at a high level and has little appetite for going lower; maybe I cannot, maybe the kernel cannot. The real world!
BTW, do you plan to refactor the WaitStrategy into something more generalized? I have done some work on that.
I'm going to close this as it's not really an issue, just a discussion, which can happen on the Google Groups page.