Comments (13)
Here's the data that we have gathered so far by repeatedly running several configurations. Different variations of X/Y/Z give rise to names like HotRscTypecheck_XY_XY_Z:
HotRscTypecheck_310_310_2.run 32.955
HotRscTypecheck_310_310_2.run 33.302
HotRscTypecheck_310_310_2.run 33.318
HotRscTypecheck_310_310_2.run 33.519
HotRscTypecheck_310_310_2.run 33.640
HotRscTypecheck_310_310_2.run 33.749
HotRscTypecheck_310_310_2.run 33.753
HotRscTypecheck_310_310_2.run 33.871
HotRscTypecheck_310_310_2.run 34.083
HotRscTypecheck_310_310_2.run 34.548
HotRscTypecheck_310_310_3.run 32.956
HotRscTypecheck_310_310_3.run 33.032
HotRscTypecheck_310_310_3.run 33.319
HotRscTypecheck_310_310_3.run 33.322
HotRscTypecheck_310_310_3.run 33.517
HotRscTypecheck_310_310_3.run 33.591
HotRscTypecheck_310_310_3.run 33.695
HotRscTypecheck_310_310_3.run 33.792
HotRscTypecheck_310_310_3.run 33.801
HotRscTypecheck_310_310_3.run 34.136
HotRscTypecheck_310_310_5.run 33.252
HotRscTypecheck_310_310_5.run 33.303
HotRscTypecheck_310_310_5.run 33.387
HotRscTypecheck_310_310_5.run 33.511
HotRscTypecheck_310_310_5.run 33.546
HotRscTypecheck_310_310_5.run 33.557
HotRscTypecheck_310_310_5.run 33.573
HotRscTypecheck_310_310_5.run 33.643
HotRscTypecheck_310_310_5.run 33.865
HotRscTypecheck_310_310_5.run 34.262
HotRscTypecheck_510_510_3.run 33.205
HotRscTypecheck_510_510_3.run 33.274
HotRscTypecheck_510_510_3.run 33.282
HotRscTypecheck_510_510_3.run 33.284
HotRscTypecheck_510_510_3.run 33.328
HotRscTypecheck_510_510_3.run 33.420
HotRscTypecheck_510_510_3.run 33.440
HotRscTypecheck_510_510_3.run 33.550
HotRscTypecheck_510_510_3.run 33.720
HotRscTypecheck_510_510_3.run 34.164
Currently, I think that all those configurations are equivalent reliability-wise. There doesn't seem to be any reason to pay 300s for 5/10/3 or 3/10/5 if 120s of 3/10/2 seems to be roughly as good as far as run-to-run variance is concerned.
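As a quick sanity check on that claim, the per-configuration spread can be summarized with a short script. This is just an illustrative sketch; the sample lists are copied verbatim from the numbers posted above (two of the three configurations shown, for brevity):

```python
from statistics import mean, stdev

# Timings (seconds) copied from the runs posted above.
samples = {
    "310_310_2": [32.955, 33.302, 33.318, 33.519, 33.640,
                  33.749, 33.753, 33.871, 34.083, 34.548],
    "510_510_3": [33.205, 33.274, 33.282, 33.284, 33.328,
                  33.420, 33.440, 33.550, 33.720, 34.164],
}

# Mean, standard deviation, and min-max spread per configuration.
summary = {
    cfg: (mean(xs), stdev(xs), max(xs) - min(xs))
    for cfg, xs in samples.items()
}

for cfg, (m, s, spread) in summary.items():
    print(f"{cfg}: mean={m:.3f}s stdev={s:.3f}s spread={spread:.3f}s")
```

The means land within ~0.2s of each other, which is consistent with the "roughly as good" reading, though the cheaper configuration does show a somewhat larger min-max spread.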
This afternoon, we'll be experimenting with 510_510_5, 1010_1010_5 and 1020_1020_3.
from rsc.
Here's another batch of results that I forgot to include in the grepping session. It's not as detailed as the results above. 20/10/3 seems to be no better than 3/10/2, but the jury is still out with respect to the others.
HotRscTypecheck_1010_1010_6.run 33.170
HotRscTypecheck_1010_1010_6.run 33.313
HotRscTypecheck_1010_1010_6.run 33.338
HotRscTypecheck_1010_1010_6.run 33.469
HotRscTypecheck_1010_1010_6.run 33.517
HotRscTypecheck_1010_1010_6.run 33.562
HotRscTypecheck_1020_1020_3.run 33.249
HotRscTypecheck_1020_1020_3.run 33.477
HotRscTypecheck_1020_1020_3.run 33.527
HotRscTypecheck_1020_1020_3.run 33.612
HotRscTypecheck_1020_1020_3.run 33.621
HotRscTypecheck_1020_1020_3.run 33.780
HotRscTypecheck_1020_1020_3.run 33.714
HotRscTypecheck_1020_1020_3.run 33.565
HotRscTypecheck_1020_1020_3.run 33.767
HotRscTypecheck_1020_1020_3.run 33.357
HotRscTypecheck_2010_2010_3.run 33.247
HotRscTypecheck_2010_2010_3.run 33.357
HotRscTypecheck_2010_2010_3.run 33.380
HotRscTypecheck_2010_2010_3.run 33.593
HotRscTypecheck_2010_2010_3.run 34.113
HotRscTypecheck_2010_2010_3.run 34.123
HotRscTypecheck_2020_2020_6.run 33.572
HotRscTypecheck_2020_2020_6.run 33.605
HotRscTypecheck_2020_2020_6.run 33.776
from rsc.
/cc @adriaanm @lrytz @SethTisue @szeiger @retronym
from rsc.
Also /cc @liufengyun
from rsc.
Sub-millisecond precision (IMO) means that you start to exercise the L3 cache, which does not look predominant (so far) in such cases...
from rsc.
Some early results from the current run:
HotRscTypecheck_510_510_5.run 33.183
HotRscTypecheck_510_510_5.run 33.112
HotRscTypecheck_510_510_5.run 33.591
HotRscTypecheck_510_510_5.run 33.718
HotRscTypecheck_510_510_5.run 33.438
HotRscTypecheck_510_510_5.run 33.444
HotRscTypecheck_510_510_5.run 33.428
HotRscTypecheck_510_510_5.run 33.893
HotRscTypecheck_510_510_5.run 33.665
HotRscTypecheck_510_510_5.run 33.294
HotRscTypecheck_510_510_5.run 33.594
HotRscTypecheck_510_510_5.run 33.549
HotRscTypecheck_510_510_5.run 33.830
HotRscTypecheck_510_510_5.run 33.458
HotRscTypecheck_510_510_5.run 33.416
HotRscTypecheck_510_510_5.run 33.738
HotRscTypecheck_510_510_5.run 33.309
HotRscTypecheck_510_510_5.run 33.690
HotRscTypecheck_510_510_5.run 33.524
HotRscTypecheck_510_510_5.run 33.418
HotRscTypecheck_510_510_5.run 33.455
HotRscTypecheck_510_510_5.run 33.615
HotRscTypecheck_510_510_5.run 33.865
HotRscTypecheck_1010_1010_5.run 33.603
HotRscTypecheck_1010_1010_5.run 33.292
HotRscTypecheck_1010_1010_5.run 33.612
HotRscTypecheck_1010_1010_5.run 33.338
HotRscTypecheck_1010_1010_5.run 33.581
HotRscTypecheck_1010_1010_5.run 33.277
HotRscTypecheck_1010_1010_5.run 33.584
HotRscTypecheck_1010_1010_5.run 33.529
HotRscTypecheck_1010_1010_5.run 33.809
HotRscTypecheck_1010_1010_5.run 33.526
from rsc.
@andreaTP Can you elaborate? I'm not sure I fully understand what you mean.
from rsc.
Sure. We are anyhow working on a managed runtime, which means that you don't have full control over a number of variables. Sub-millisecond checks are useful (in my very personal experience) when you start looking at your architecture's performance (i.e. your processor) running on the JVM. I think someone more knowledgeable than me can chip in on this discussion: @mjpt777 (just trying to summon him :-))
from rsc.
I'm getting really worked up about this, because our best benchmark for Rsc currently runs in slightly less than 25ms, so >1ms of run-to-run variance is a potential error of 4%.
As it currently stands, having to make yay/nay judgements using such imprecise information makes me quite nervous. If a 4% error stacks 10 times (the rough number of "Benchmark XXX optimization" issues that we have in flight), it compounds into a ~1.5x difference in performance, and that is huge.
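The compounding arithmetic behind that estimate is a one-liner (the 4% figure comes from 1ms of variance on a ~25ms benchmark; the 10 comes from the issue count mentioned above):

```python
# A 4% error applied 10 times compounds multiplicatively, not additively.
per_run_error = 1.0 / 25.0             # 1 ms variance on a ~25 ms benchmark
compounded = (1 + per_run_error) ** 10
print(f"{compounded:.2f}x")            # about 1.48x, i.e. roughly 1.5x
```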
from rsc.
It's a hard problem...
You can get some insight into VM-based sources of jitter with:
`jmh:run Bench -f1 -wi 0 -i100 -prof hs_comp -prof hs_gc`
Watch how long it takes for `compiler.totalCompiles` to reach a steady state. You can also look for patterns in the GC stats. I find it useful to oversize the heap so that full GCs are very rare.
You could try to minimize OS jitter with our benv script.
I've found that even without super precise benchmarks, having continuous benchmark graphs around can help ensure the trend goes the right way.
from rsc.
Run-to-run variance can be much greater than 4% on a general-purpose operating system and managed platform: things like CPU clock scaling, scheduling, resource starvation, OS configuration, JVM configuration, JIT compiler races, etc. JVMs like Azul Zing can help, but you need to build a significant knowledge base to do this well. On a three-day course I just about manage to cover the major topics people need to start becoming aware of.
However progress has to start somewhere with experimenting and being curious at the core. Have fun experimenting and learning but be very careful reading too much into your discoveries as the reasons behind some results can be very surprising and often not obvious.
from rsc.
For development purposes, when there's a regression benchmark infrastructure, developers usually don't care much about < 1ms. If a performance change is invisible in the graph, developers will just ignore it, as they cannot do any optimisation based on such changes.
Instead, developers usually care about:
- big changes visible relative to previous points
- the trend of the curve
- no big intra-point variance visible in the graph: via `min` and `avg` curves
- no inter-point variance visible in the graph that cannot be explained by code changes in PRs
In Dotty, we have taken the following measures to address the 3rd and 4th concerns, in addition to stabilizing the machine:
- have a test "emptyFile", which is supposed to be always/mostly flat
- show both `min` and `avg` in the curve
- allow developers to issue a test command to get several points for the same PR

We still experience visible intra-point variance for some tests and sometimes get inexplicable inter-point changes. While we are still fighting the variance, the infrastructure has been useful in confirming performance improvements and catching big regressions.
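The min/avg idea above can be sketched as follows. This is a hypothetical illustration, not Dotty's actual infrastructure: each commit's raw samples collapse into one (min, avg) pair, giving two curves whose gap visualizes intra-point variance, while a genuine improvement moves both curves together:

```python
from statistics import mean

def curve_points(samples_per_commit):
    """Collapse raw per-commit timings into (min, avg) pairs for plotting."""
    return [(min(xs), mean(xs)) for xs in samples_per_commit]

# Toy data: three commits, a few timing samples (ms) each.
history = [
    [25.1, 25.9, 25.3],
    [25.1, 26.6, 25.4],   # noisy run: avg drifts up, min is unchanged
    [24.1, 24.5, 24.2],   # real improvement: both min and avg drop
]
for lo, avg in curve_points(history):
    print(f"min={lo:.2f}ms avg={avg:.2f}ms")
```

Plotting min alongside avg makes this distinction visible at a glance: a widening min-avg gap signals noise, a parallel downward shift signals a real win.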
from rsc.
Thank you, everyone, for the valuable comments! Judging from your feedback, I think that my paranoia about run-to-run variance went a bit over the top.
We'll be settling down on 3/10/2 for quick benches and 10/10/5 for CI, accepting the current variance for the time being and relying on continuous benchmark graphs to help us detect meaningful trends in compilation performance.
Moreover, as we'll be implementing more and more features from full Scala, I expect that we'll be taking on more involved Scala projects. This will make millisecond-precision hair splitting moot, since most benchmarks will likely no longer run in double-digit millisecond timeframes.
from rsc.