Comments (5)
Benchmarks like these have to be taken with a grain of salt.
Both e-digits and pi-digits are not representative of real world work and are instead only representative of something that one would do purely for fun. Not only does any non-specialized use of pi
not need more than around 39 digits (the number needed to compute the circumference of the observable universe to within 1 atom of precision), but there are better ways to get the first n-digits
of these irrational numbers as there are actual supercomputers that have already written highly fine tuned applications that have computed 105 trillion digits and where the code looks nothing like any of the given implementations.
Additionally, the initial code for a platform is often authored in the most naive way possible. It isn't accounting for things like whether the functionality is built-in, whether it is actually executing in the language in question, whether it may be deferring down to another language where the actual core logic is running, etc. The community is then responsible for coming in and giving alternative implementations that may do better and more optimal things, with different languages getting different levels of interest. This means that the actual implementation of for two languages could be very different and make them not exactly comparable.
The way they are measuring the code also has to be accounted for, as they are not always measuring the code in identical scenarios. Some of the languages are being compiled AOT with all possible optimizations enabled, some of these compilers have built-in optimizations to recognize some of these patterns and effectively compile the code down to a near no-op (which is only possible when you have scenarios like this where everything is constant). Others are being compiled JIT and the timings are including that overhead plus not correctly accounting for other features like Tiered Compilation.
Now, the main reason the C# implementation for these two benchmarks is much slower is effectively down to the BigInteger
implementation and how the code is then using it.
In Python, every number is a "bignum" and so they've spent much more time fine tuning this handling to ensure that the common cases remain performant. Their big integer type is designed to work with any code, to account for people writing the most naive possible algorithms, and has specialized compiler support to do optimizations and other logic to make it as optimized as possible.
In C#, we instead have both fixed-sized
types (int
, long
, Int128
, etc) and the arbitrarily sized System.Numerics.BigInteger
. We expect that the fixed-sized types are the typical ones used and that BigInteger
is only used in specialized scenarios. We likewise have reasonable assumptions in BigInteger
around the maximum size we'll see and while we do support larger sizes, they've not received as much care or optimization work to make those scenarios the most efficient possible.
Correspondingly, this means that BigInteger
has some design decisions around it like it being immutable
and therefore (a + b) / 2
will involve allocating at twice and traversing the memory of inputs 4 times (once for a
, once for b
, once to create temporary that (a + b)
produced, and then once more for that temporary as part of the tmp / 2
operation). The naive implementation is therefore not very efficient and there are often better ways to handle this.
There's also some nuance in that because we have many different integer types, BigInteger
doesn't get as much focus since its primarily intended for specialized scenarios and not for "everyday scenarios". This does mean that we have some known optimization opportunities to make it perform better on modern hardware and even to allow users to more easily write naive logic (like done in the benchmark) without seeing terrible performance. There hasn't been an opportunity to do this optimization work as of yet, namely because there is work that is higher priority and more impactful to real world workloads to be completed in other areas first.
The other factor is that the code isn't actually taking into account how C# operates, what functionality is built into the BCL, etc. It's missing opportunities to use constants like double.Tau
(while it is using these for Python
and Rust
), it's missing opportunities to do more efficient string formatting using span/slices, that doesn't force many copies and allocations, and so on.
from core.
Simple pow on bigintegers is horribly slow comparing to other languages too (even python is much faster like over 300%)...
Checked on .net6 and .net8 almost same.
using System.Numerics;
class Program
{
static void Main()
{
string baseStr = "1000";
string exponentStr = "50000";
BigInteger baseNum = BigInteger.Parse(baseStr);
BigInteger exponentNum = BigInteger.Parse(exponentStr);
BigInteger result = BigInteger.Pow(baseNum, (int)exponentNum);
Console.WriteLine($"Result: {result}");
}
}
from core.
That program is:
- Including the overhead of parsing
- Not accounting for Tiered Compilation or DPGO (and therefore likely running unoptimized debug code)
- Doing unnecessary extra work by converting the big integer exponent back to an int
As explained above, there are known cases where the .NET BigInteger code could be improved. It was also explained that there's a lot of nuance towards benchmarking, real world perf considerations, etc. It is generally not sufficient to take a single simple program and use it as some metric of proof showing that it means X
is better or faster than Y
.
from core.
I've chosen this benchmark for a reason, it's very poor.
net.9 Preview4 make it even 12% worse
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using BenchmarkDotNet.Configs;
using System;
using System.Numerics;
using BenchmarkDotNet.Jobs;
[DisassemblyDiagnoser]
[SimpleJob(RuntimeMoniker.Net80, baseline: true)]
[SimpleJob(RuntimeMoniker.Net90)]
[GroupBenchmarksBy(BenchmarkLogicalGroupRule.ByMethod)]
public class PerformanceTest
{
[Benchmark]
[Arguments(80000)]
public string Run(int n)
{
var result = new System.Text.StringBuilder();
var k = BinarySearch(n);
var (p, q) = SumTerms(0, k - 1);
p += q;
var a = BigInteger.Pow(new BigInteger(10), n - 1);
var answer = p * a / q;
var answerStr = answer.ToString();
Span<char> sb = stackalloc char[10];
for (var i = 0; i < n; i += 10)
{
var count = i + 10;
if (count > n)
{
count = n;
}
for (var j = i; j < i + 10; j++)
{
if (j < n)
{
sb[j - i] = answerStr[j];
}
else
{
sb[j - i] = ' ';
}
}
result.AppendLine($"{new string(sb)}\t:{count}");
}
return result.ToString();
}
static (BigInteger, BigInteger) SumTerms(int a, int b)
{
if (b == a + 1)
{
return (new BigInteger(1), new BigInteger(b));
}
var mid = (a + b) / 2;
var (pLeft, qLeft) = SumTerms(a, mid);
var (pRight, qRight) = SumTerms(mid, b);
return (pLeft * qRight + pRight, qLeft * qRight);
}
static int BinarySearch(int n)
{
var a = 0;
var b = 1;
while (!TestK(n, b))
{
a = b;
b *= 2;
}
while (b - a > 1)
{
var m = (a + b) / 2;
if (TestK(n, m))
{
b = m;
}
else
{
a = m;
}
}
return b;
}
static bool TestK(int n, int k)
{
if (k < 0)
{
return false;
}
var lnKFactorial = k * (Math.Log((double)k) - 1) + 0.5 * Math.Log(Math.PI * 2);
var log10KFactorial = lnKFactorial / Math.Log(10);
return log10KFactorial >= (double)(n + 50);
}
public static void Main(string[] args)
{
var summary = BenchmarkRunner.Run<PerformanceTest>();
Console.WriteLine(summary);
}
}
.NET SDK 9.0.100-preview.4.24267.66
[Host] : .NET 8.0.5 (8.0.524.21615), X64 RyuJIT AVX2
.NET 8.0 : .NET 8.0.5 (8.0.524.21615), X64 RyuJIT AVX2
.NET 9.0 : .NET 9.0.0 (9.0.24.26619), X64 RyuJIT AVX2
Method | Job | Runtime | n | Mean | Error | StdDev | Ratio | Code Size |
---|---|---|---|---|---|---|---|---|
Run | .NET 8.0 | .NET 8.0 | 80000 | 326.4 ms | 0.55 ms | 0.49 ms | 1.00 | 9,827 B |
Run | .NET 9.0 | .NET 9.0 | 80000 | 364.4 ms | 0.56 ms | 0.49 ms | 1.12 | 10,435 B |
from core.
I've chosen this benchmark for a reason, it's very poor.
There is a certain level of responsibility on the code author to write good code when they want performance.
net.9 Preview4 make it even 12% worse
This regression is from ToString
, which as detailed above has known issues. It is being worked on, but isn't a huge priority compared to all the other work that could be done and a benchmark like this isn't going to change that since it isn't representative of a real world scenario someone would be likely to use in production. The primary fix for such code would be to not write things in such a naive/unoptimized fashion if performance were a consideration.
On my box, the exact code you've given (+ [MemoryDiagnoser]
) reports:
Method | Job | Runtime | n | Mean | Error | StdDev | Median | Ratio | RatioSD | Code Size | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Run | .NET 8.0 | .NET 8.0 | 80000 | 272.8 ms | 3.10 ms | 6.53 ms | 269.7 ms | 1.00 | 0.00 | 11,489 B | 3.41 MB | 1.00 |
Run | .NET 9.0 | .NET 9.0 | 80000 | 314.9 ms | 6.12 ms | 6.28 ms | 315.9 ms | 1.14 | 0.03 | 12,257 B | 3.23 MB | 0.94 |
It's then worth noting that you're skewing the benchmark (little bit of overhead, code size, and total bytes allocated) due to using the string builder in a fairly naive fashion. Changing it like this fixes that and makes it closer to the Python implementation by utilizing string interpolation, formatting, and slicing:
public IEnumerable<object[]> RunArgs()
{
const int n = 80000;
var lineCount = (n / 10) + 1;
var output = new StringBuilder(
n // Number of digits
+ lineCount // Number of tabs
+ lineCount // Number of colons
+ (int)(lineCount * float.Log10(n)) // Number of markers (approx)
+ (lineCount * Environment.NewLine.Length) // Number of newlines
);
yield return [n, output];
}
[Benchmark]
[ArgumentsSource(nameof(RunArgs))]
public void Run(int n, StringBuilder output)
{
var k = BinarySearch(n);
var (p, q) = SumTerms(0, k - 1);
p += q;
var a = BigInteger.Pow(new BigInteger(10), n - 1);
var answer = p * a / q;
var s = answer.ToString().AsSpan();
for (var i = 0; i < n; i += 10)
{
if ((i + 10) <= n)
{
output.AppendLine($"{s[i..(i + 10)]}\t:{i + 10}");
}
else
{
output.AppendLine($"{s[i..],-10}\t:{n}");
}
}
}
Which gives us:
Method | Job | Runtime | n | output | Mean | Error | StdDev | Median | Ratio | RatioSD | Code Size | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Run | .NET 8.0 | .NET 8.0 | 80000 | 273.5 ms | 4.36 ms | 10.69 ms | 269.7 ms | 1.00 | 0.00 | 16,196 B | 2.76 MB | 1.00 | |
Run | .NET 9.0 | .NET 9.0 | 80000 | 313.6 ms | 5.20 ms | 4.61 ms | 313.6 ms | 1.10 | 0.07 | 6,419 B | 2.57 MB | 0.93 |
Now, it reduced allocations and code size for .NET 9, but it didn't really change the execution time because string builder itself is very fast. So we can experiment a bit and find out where the cost is coming from:
[ArgumentsSource(nameof(RunArgs))]
public string Run(int n, StringBuilder output)
{
var k = BinarySearch(n);
var (p, q) = SumTerms(0, k - 1);
p += q;
var a = BigInteger.Pow(new BigInteger(10), n - 1);
var answer = p * a / q;
return answer.ToString();
}
Gives us:
Method | Job | Runtime | n | output | Mean | Error | StdDev | Ratio | RatioSD | Code Size | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Run | .NET 8.0 | .NET 8.0 | 80000 | 272.4 ms | 4.73 ms | 9.46 ms | 1.00 | 0.00 | 10,090 B | 2.47 MB | 1.00 | |
Run | .NET 9.0 | .NET 9.0 | 80000 | 312.7 ms | 6.24 ms | 6.13 ms | 1.13 | 0.05 | 4,707 B | 2.28 MB | 0.92 |
While:
[ArgumentsSource(nameof(RunArgs))]
public BigInteger Run(int n, StringBuilder output)
{
var k = BinarySearch(n);
var (p, q) = SumTerms(0, k - 1);
p += q;
var a = BigInteger.Pow(new BigInteger(10), n - 1);
return p * a / q;
}
Gives us:
Method | Job | Runtime | n | output | Mean | Error | StdDev | Ratio | RatioSD | Code Size | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Run | .NET 8.0 | .NET 8.0 | 80000 | 233.2 ms | 3.85 ms | 3.60 ms | 1.00 | 0.00 | 4,627 B | 2.13 MB | 1.00 | |
Run | .NET 9.0 | .NET 9.0 | 80000 | 205.6 ms | 1.20 ms | 1.93 ms | 0.88 | 0.02 | 9,144 B | 2.13 MB | 1.00 |
We can then break apart the last few bits as well, to find out the following
BinarySearch(n)
Method | Job | Runtime | n | output | Mean | Error | StdDev | Ratio | RatioSD | Code Size | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Run | .NET 8.0 | .NET 8.0 | 80000 | 125.1 ns | 1.79 ns | 1.67 ns | 1.00 | 0.00 | 254 B | - | NA | |
Run | .NET 9.0 | .NET 9.0 | 80000 | 125.6 ns | 1.21 ns | 1.13 ns | 1.00 | 0.02 | 244 B | - | NA |
SumTerms(0, k - 1)
Method | Job | Runtime | k | output | Mean | Error | StdDev | Ratio | Code Size | Gen0 | Gen1 | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Run | .NET 8.0 | .NET 8.0 | 20631 | 14.09 ms | 0.230 ms | 0.215 ms | 1.00 | 1,010 B | 109.3750 | 15.6250 | 1.97 MB | 1.00 | |
Run | .NET 9.0 | .NET 9.0 | 20631 | 10.01 ms | 0.092 ms | 0.086 ms | 0.71 | 995 B | 109.3750 | 15.6250 | 1.97 MB | 1.00 |
p += q
Method | Job | Runtime | p | q | output | Mean | Error | StdDev | Ratio | RatioSD | Code Size | Gen0 | Gen1 | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Run | .NET 8.0 | .NET 8.0 | 260(...)701 [80052] | 151(...)000 [80052] | 5.631 us | 0.0991 us | 0.0927 us | 1.00 | 0.00 | 310 B | 1.9836 | 0.2441 | 32.49 KB | 1.00 | |
Run | .NET 9.0 | .NET 9.0 | 260(...)701 [80052] | 151(...)000 [80052] | 5.518 us | 0.0532 us | 0.0472 us | 0.98 | 0.02 | 306 B | 1.9836 | 0.2441 | 32.49 KB | 1.00 |
BigInteger.Pow(new BigInteger(10), n - 1)
Method | Job | Runtime | n | output | Mean | Error | StdDev | Ratio | Code Size | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|
Run | .NET 8.0 | .NET 8.0 | 80000 | 7.274 ms | 0.1437 ms | 0.1476 ms | 1.00 | 1,771 B | 32.47 KB | 1.00 | |
Run | .NET 9.0 | .NET 9.0 | 80000 | 2.903 ms | 0.0358 ms | 0.0335 ms | 0.40 | 2,653 B | 32.47 KB | 1.00 |
var pa = p * a
Method | Job | Runtime | p | a | output | Mean | Error | StdDev | Ratio | RatioSD | Code Size | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Run | .NET 8.0 | .NET 8.0 | 143(...)400 [80005] | 100(...)000 [80000] | 4.594 ms | 0.0898 ms | 0.0840 ms | 1.00 | 0.00 | 2,374 B | 64.91 KB | 1.00 | |
Run | .NET 9.0 | .NET 9.0 | 143(...)400 [80005] | 100(...)000 [80000] | 4.607 ms | 0.0586 ms | 0.0548 ms | 1.00 | 0.02 | 3,575 B | 64.91 KB | 1.00 |
return pa / q
Method | Job | Runtime | pa | q | output | Mean | Error | StdDev | Ratio | RatioSD | Code Size | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Run | .NET 8.0 | .NET 8.0 | 143(...)000 [160004] | 527(...)000 [80004] | 184.8 ms | 1.62 ms | 2.88 ms | 1.00 | 0.00 | 1,616 B | 32.71 KB | 1.00 | |
Run | .NET 9.0 | .NET 9.0 | 143(...)000 [160004] | 527(...)000 [80004] | 192.0 ms | 2.70 ms | 3.88 ms | 1.04 | 0.02 | 2,565 B | 32.6 KB | 1.00 |
So this gives us a breakdown where most of the time is being spent (Division
and ToString
). There's some optimization opportunities in other areas too, but they don't represent the bulk of the work.
from core.
Related Issues (20)
- .Net on embedded platform HOT 1
- > where.exe dotnet C:\Program Files (x86)\dotnet\dotnet.exe C:\Program Files\dotnet\dotnet.exe HOT 1
- Failed to publish WASM client project for Linux
- .NET April 2024 Update - .NET 8.0.4, NET 7.0.18, .NET 6.0.29 HOT 5
- Getting Failed to fetch https://packages.microsoft.com/repos/edge/dists/stable for the last few hours HOT 2
- Is there an API to get the latest preview .NET SDK version HOT 7
- Failed to fetch https://packages.microsoft.com/repos/edge/dists/stable/main/binary-amd64/Packages.gz File has unexpected size HOT 3
- Package installation instructions are stale and need to be updated HOT 4
- Fedora 38 reaches end of life on 14 May 2024
- Add support for Alpine 3.20 HOT 1
- BEARSMITH8G HOT 1
- import speech_recognition as sr # Initialize the recognizer r = sr.Recognizer() # Function to convert speech to text def speech_to_text(): with sr.Microphone() as source: print("Please start speaking:") audio = r.listen(source) try: text = r.recognize_google(audio) return text except sr.UnknownValueError: return "Sorry, I could not understand what you said" except sr.RequestError: return "Sorry, my speech recognition service is currently down" # Main function if __name__ == "__main__": text = speech_to_text() print("You said: " + text)
- AAAAAAAAAAAAAAAA HOT 2
- We are pleased to announce the release of .NET 9 Preview 3. HOT 1
- .NET May 2024 Update - .NET 8.0.5, NET 7.0.19, .NET 6.0.30 HOT 27
- https://docs.synthesized.io/sdk/latest
- Golden Skulls over a dark background. Dead by Daylight's Year 8 Anniversary livestream is live now on Twitch and Youtube.https://docs.synthesized.io/sdk/latestc43a66cff7a403d56c5c5e1ff10d3d5f95961abf80f97f0e35380594909f0e4d HOT 1
- .NET 9 Preview 4 Wpf Themed Textbox Scrollbars do not show when required
- debian11 can't install dotnet8 HOT 2
- estimating complexity error
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from core.