Comments (9)
Actually that test is a little bit of an inside joke on my part, demonstrating how you can lie with a benchmark. All it is doing is testing how fast the sequencer can signal the consumer, but only on every 10th update. It doesn't do any actual useful work.
How does the modest-lock fare with the OnePublisherToOneProcessorUniCastThroughputTest? Also, do you have a link to the code?
If all I wanted to do was improve that test I could just have one thread polling a sequence with another thread updating it in batches of 10.
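The "simple polling" baseline described above can be sketched as two plain threads: one publishing a sequence in batches of 10, one polling it. All names here are illustrative, not Disruptor API; it only mirrors the shape of the trick being described.

```java
import java.util.concurrent.atomic.AtomicLong;

public class PollingBaseline {
    static final AtomicLong cursor = new AtomicLong(-1); // the published sequence
    static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread publisher = new Thread(() -> {
            long next = 0;
            while (running) {
                next += 10;               // claim a batch of 10
                cursor.lazySet(next - 1); // publish the whole batch with one store
            }
        });
        Thread consumer = new Thread(() -> {
            long seen = -1;
            long observed = 0;
            while (running) {
                long available = cursor.get(); // just poll the published sequence
                if (available > seen) {
                    observed += available - seen;
                    seen = available;
                }
            }
            System.out.println("observed >= 0: " + (observed >= 0));
        });
        publisher.start();
        consumer.start();
        Thread.sleep(200);                // run briefly
        running = false;
        publisher.join();
        consumer.join();
        System.out.println("done");
    }
}
```

Because the consumer does no work per event, such a loop measures little beyond cache-line ping-pong, which is exactly the point being made about the benchmark.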
Existing Disruptor:
Starting Disruptor tests
Run 0, Disruptor=1,998,001,998 ops/sec
Run 1, Disruptor=2,079,002,079 ops/sec
Run 2, Disruptor=2,157,497,303 ops/sec
Run 3, Disruptor=2,114,164,904 ops/sec
Run 4, Disruptor=2,152,852,529 ops/sec
Run 5, Disruptor=2,205,071,664 ops/sec
Run 6, Disruptor=3,577,817,531 ops/sec
Run 7, Disruptor=3,546,099,290 ops/sec
Run 8, Disruptor=3,610,108,303 ops/sec
Simple polling code:
Starting Disruptor tests
Run 0, Disruptor=6,191,950,464 ops/sec
Run 1, Disruptor=6,042,296,072 ops/sec
Run 2, Disruptor=6,369,426,751 ops/sec
Run 3, Disruptor=6,289,308,176 ops/sec
Run 4, Disruptor=6,389,776,357 ops/sec
from disruptor.
OnePublisherToOneProcessorRawBatchThroughputTest
It seems both batch by 10, yet there is only a little improvement on my machine. Strange! It should be ~10x.
Run 0, Disruptor=871,839,581 ops/sec
Run 1, Disruptor=811,030,008 ops/sec
Run 2, Disruptor=1,231,527,093 ops/sec
Run 3, Disruptor=1,320,132,013 ops/sec
Run 4, Disruptor=1,320,132,013 ops/sec
Run 5, Disruptor=1,185,536,455 ops/sec
Run 6, Disruptor=1,143,510,577 ops/sec
Run 7, Disruptor=1,067,235,859 ops/sec
Run 8, Disruptor=1,243,008,079 ops/sec
Run 9, Disruptor=1,268,230,818 ops/sec
Run 10, Disruptor=1,391,788,448 ops/sec
Run 11, Disruptor=1,334,222,815 ops/sec
Run 12, Disruptor=1,320,132,013 ops/sec
Run 13, Disruptor=1,292,824,822 ops/sec
Run 14, Disruptor=1,255,492,780 ops/sec
Run 15, Disruptor=1,267,427,122 ops/sec
Run 16, Disruptor=1,219,512,195 ops/sec
Run 17, Disruptor=1,243,008,079 ops/sec
Run 18, Disruptor=1,164,144,353 ops/sec
Run 19, Disruptor=1,085,187,194 ops/sec
OK, let's get back to the 10:1 pattern.
Indeed, the modest-lock is very simple, just like this:
if((counter&1)==1) Thread.yield();
return counter-1;
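The two lines above are the whole trick: yield on every other spin iteration so the waiter gives up the CPU "modestly". A self-contained sketch of using it inside a busy-wait loop (modestWait and the surrounding harness are hypothetical names, not Disruptor API):

```java
import java.util.concurrent.atomic.AtomicLong;

public class ModestLockSpin {
    // Yield on every other call; otherwise spin hot.
    static int modestWait(int counter) {
        if ((counter & 1) == 1) Thread.yield();
        return counter - 1;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicLong cursor = new AtomicLong(-1);
        Thread consumer = new Thread(() -> {
            int counter = 1 << 20;       // arbitrary spin budget
            while (cursor.get() < 0) {   // busy-wait for the publisher
                counter = modestWait(counter);
            }
            System.out.println("saw " + cursor.get());
        });
        consumer.start();
        Thread.sleep(50);
        cursor.set(42);                  // "publish"
        consumer.join();
    }
}
```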
These are the results after applying the modest-lock to this test, still using the write-10:read-1 pattern:
Run 0, Disruptor=1,257,071,024 ops/sec
Run 1, Disruptor=1,292,824,822 ops/sec
Run 2, Disruptor=1,579,778,830 ops/sec
Run 3, Disruptor=1,542,020,046 ops/sec
Run 4, Disruptor=1,506,024,096 ops/sec
Run 5, Disruptor=1,581,027,667 ops/sec
Run 6, Disruptor=1,523,229,246 ops/sec
Run 7, Disruptor=1,506,024,096 ops/sec
Run 8, Disruptor=1,471,670,345 ops/sec
Run 9, Disruptor=1,471,670,345 ops/sec
Run 10, Disruptor=1,506,024,096 ops/sec
Run 11, Disruptor=1,542,020,046 ops/sec
Run 12, Disruptor=1,542,020,046 ops/sec
Run 13, Disruptor=1,543,209,876 ops/sec
Run 14, Disruptor=1,506,024,096 ops/sec
Run 15, Disruptor=1,542,020,046 ops/sec
Run 16, Disruptor=1,506,024,096 ops/sec
Run 17, Disruptor=1,454,545,454 ops/sec
Run 18, Disruptor=1,543,209,876 ops/sec
Run 19, Disruptor=1,506,024,096 ops/sec
After changing SingleProducerSequencer's next() reserve operation to wait with Thread.yield() instead of parkNanos():
Run 0, Disruptor=1,257,071,024 ops/sec
Run 1, Disruptor=1,292,824,822 ops/sec
Run 2, Disruptor=1,579,778,830 ops/sec
Run 3, Disruptor=1,542,020,046 ops/sec
Run 4, Disruptor=1,506,024,096 ops/sec
Run 5, Disruptor=1,581,027,667 ops/sec
Run 6, Disruptor=1,523,229,246 ops/sec
Run 7, Disruptor=1,506,024,096 ops/sec
Run 8, Disruptor=1,471,670,345 ops/sec
Run 9, Disruptor=1,471,670,345 ops/sec
Run 10, Disruptor=1,506,024,096 ops/sec
Run 11, Disruptor=1,542,020,046 ops/sec
Run 12, Disruptor=1,542,020,046 ops/sec
Run 13, Disruptor=1,543,209,876 ops/sec
Run 14, Disruptor=1,506,024,096 ops/sec
Run 15, Disruptor=1,542,020,046 ops/sec
Run 16, Disruptor=1,506,024,096 ops/sec
Run 17, Disruptor=1,454,545,454 ops/sec
Run 18, Disruptor=1,543,209,876 ops/sec
Run 19, Disruptor=1,506,024,096 ops/sec
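The producer-side change being tested above can be sketched as follows. In Disruptor 3.x, SingleProducerSequencer.next() spins with LockSupport.parkNanos(1L) while the ring buffer is full; the variant swaps that call for Thread.yield(). Everything besides those two wait calls is an illustrative stand-in, not the real class:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

public class ProducerClaimSketch {
    final int bufferSize;
    final AtomicLong consumerSequence; // slowest consumer's progress
    long nextValue = -1;               // producer's local cursor
    final boolean useYield;

    ProducerClaimSketch(int bufferSize, AtomicLong consumerSequence, boolean useYield) {
        this.bufferSize = bufferSize;
        this.consumerSequence = consumerSequence;
        this.useYield = useYield;
    }

    long next() {
        long nextSequence = nextValue + 1;
        long wrapPoint = nextSequence - bufferSize;
        // Spin until the consumer has moved past the slot we want to reuse.
        while (wrapPoint > consumerSequence.get()) {
            if (useYield) Thread.yield();        // the variant under test
            else LockSupport.parkNanos(1L);      // what 3.x does today
        }
        nextValue = nextSequence;
        return nextSequence;
    }

    public static void main(String[] args) {
        AtomicLong consumer = new AtomicLong(-1);
        ProducerClaimSketch producer = new ProducerClaimSketch(8, consumer, true);
        for (int i = 0; i < 8; i++) {
            consumer.set(i - 1);                 // pretend the consumer keeps up
            System.out.println("claimed " + producer.next());
        }
    }
}
```

The choice between yield and parkNanos here only matters on the slow path (buffer full), which is why its effect is so sensitive to the scheduler and the hardware.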
After applying the modest-lock to the next() reserve operation:
Run 0, Disruptor=1,490,312,965 ops/sec
Run 1, Disruptor=1,489,203,276 ops/sec
Run 2, Disruptor=1,662,510,390 ops/sec
Run 3, Disruptor=1,683,501,683 ops/sec
Run 4, Disruptor=1,662,510,390 ops/sec
Run 5, Disruptor=1,683,501,683 ops/sec
Run 6, Disruptor=1,683,501,683 ops/sec
Run 7, Disruptor=1,662,510,390 ops/sec
Run 8, Disruptor=1,662,510,390 ops/sec
Run 9, Disruptor=1,662,510,390 ops/sec
Run 10, Disruptor=1,661,129,568 ops/sec
Run 11, Disruptor=1,662,510,390 ops/sec
Run 12, Disruptor=1,662,510,390 ops/sec
Run 13, Disruptor=1,662,510,390 ops/sec
Run 14, Disruptor=1,683,501,683 ops/sec
Run 15, Disruptor=1,683,501,683 ops/sec
Run 16, Disruptor=1,662,510,390 ops/sec
Run 17, Disruptor=1,684,919,966 ops/sec
Run 18, Disruptor=1,683,501,683 ops/sec
Run 19, Disruptor=1,684,919,966 ops/sec
BTW, a strange feeling: indeed, we are just guessing at the OS scheduler's behaviour.
All we need is the right scheduler, but ...
I created a gist here: https://gist.github.com/qinxian/5771879
Are you running with HyperThreading enabled?
No!
It's an AMD X3 :)
BTW, I tried the JDK 8 @Contended annotation on the field.
It seems the padded volatile long field is faster than the long[] implementation.
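The two padding styles being compared can be sketched as follows. JDK 8's @Contended lives in sun.misc and needs -XX:-RestrictContended, so the self-contained version below shows the manual-padding alternative instead: spreading filler longs around the hot field so it sits alone on its cache line (the layered-superclass layout is the style Disruptor's Sequence uses; the class names here are illustrative):

```java
// Left-hand padding, the hot field, then right-hand padding. The filler
// longs are never read; they exist only to push neighbours off the line.
class LhsPadding { protected long p1, p2, p3, p4, p5, p6, p7; }
class Value extends LhsPadding { protected volatile long value; }
class RhsPadding extends Value { protected long p9, p10, p11, p12, p13, p14, p15; }

public class PaddedSequence extends RhsPadding {
    PaddedSequence(long initial) { this.value = initial; }
    long get() { return value; }
    void set(long v) { value = v; }

    public static void main(String[] args) {
        PaddedSequence seq = new PaddedSequence(-1);
        seq.set(42);
        System.out.println("value = " + seq.get());
    }
}
```

With @Contended the JVM inserts the padding itself, which also removes the array indirection that a long[] based padding scheme pays on every access.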
The difference I see with ModestLock is not as marked as your results, and the difference on the OnePublisherToOneProcessorUniCastThroughputTest is lower than the noise. I think these small optimisations will vary between hardware platforms. One of the reasons we made the WaitStrategy pluggable is to allow these types of optimisations. If it speeds up your system end to end, then go for it, but don't base your decision on the OnePublisherToOneProcessorRawBatchThroughputTest, as it doesn't test anything useful; base it on your own macro-benchmarks.
I've also had a go with @Contended; it didn't make a massive difference, but it should be a little bit quicker as it removes one indirection. Unfortunately it will be a while before Java 8 is the standard. I might do a Java 8-specific version if there is enough interest.
I expected the P10-C10 pattern to show a 2.x improvement over the P10-C1 pattern in your results.
Indeed, I always use JDK 8 with Windows 8 on AMD HT.
In these cases, if both ends employ the modest-lock, the version runs with about a 2.x effect.
So one deduction relative to the spec seems to be: multiple publishers could profit in a similar way.
From the messages above, it seems you tested on Intel HT, so it seems it's Intel HT vs. AMD HT.
But I'm still interested in the Intel HT modest-lock test results.
Of course, the results are only a single test case. Whether that is useful or useless depends.
But it is some kind of reference, right?
BTW, as with my earlier "guess" remark, there is some sadness about the kernel. Someone like me only works at a high level and has little appetite for going lower; maybe I cannot, maybe the kernel cannot. The real world!
BTW, do you plan to refactor the WaitStrategy into something more generalized? I have done some work on that.
I'm going to close this as it's not really an issue, just a discussion, which can happen on the Google Groups page.