
Comments (4)

kidoman commented on August 15, 2024

Thanks for bringing this up. I debated this particular commit with myself yesterday. Let me try to explain the thought process:

  • The project started off as a good way to learn idiomatic Go
  • I shifted my focus to the performance aspect when I saw that there was a large delta between "unoptimized" Go and C++ (that is when I created the post in golang-nuts)
  • A lot of optimizations were done (and documented in the first blog post) as a way to see where Go lacked and what measures could be taken to close the gap as much as possible

The direction of the project "rays" has definitely shifted now. In my mind, "it's about seeing how well a program can perform in a given language/compiler/design (l/c/d) combination whilst still keeping the code as close to real world as possible."

So, there are two kinds of optimizations in my mind:

  • The first version (C++) of the code scanned through the entire ART and incurred a huge cost in computation time whilst accomplishing nothing; this looked like a broken algorithm design, hence I fixed it by creating an objects array
  • In Go, replacing math.Pow(x, 99) with a hand-optimized multiplication tree to get 5% extra perf

I still believe in retaining the first one, but just as I reverted the micro-optimization with bc029c5 yesterday, I want to bring all the implementations to a stage where we do not avoid things like math.Pow(). Instead, we give the compiler scope to do the right thing for us.

Then "rays" essentially becomes a good test bed to see how much we can extract from an l/c/d combination without benchmark-specific optimization, by letting the compiler do its thing as much as possible.

That being said, SSE in C++ is not something we need to avoid; in fact, it's a USP of the language that it allows us to go from 12.7 s to 9.4 s while still writing C++. SSE is not the same as replacing math.Pow() in my mind.

I want to know what you think about this, though.

from rays.

t-mat commented on August 15, 2024

I would like to see 2 versions of the code for every language:

  • "mainline" version
    • Standard, platform independent
    • Only algorithm/calculation level optimization is allowed
  • "hacked" version
    • Non-standard, deeply platform/language dependent
    • Any kind of optimization is allowed
    • But every single line is written in target language
      • ex. For C++, intrinsics are allowed, but inline assembly is prohibited

"mainline" shows the idiomatic way. Good for language tourists. "hacked" shows the back streets of the language. Tourists should not walk in there, but locals enjoy the secret side of the language.

Some reasons

I think there are 4 ranks of goodness

  1. Standard, platform independent, straightforward code
    • Math.Pow(), Math.rand
  2. Algorithm/calculation level optimization
    • Pseudo lazy evaluation (algorithm)
    • Replace division with reciprocal (calculation)
  3. Non-standard, platform dependent, deeply language dependent
    • p33, rnd() (non-standard)
    • SSE vector (platform/runtime environment dependent)
    • PR #13 (deeply language dependent)
    • Commonly used external library (ex. PCRE)
  4. Out of the target
    • Special purpose external library
    • Another language (inline asm)
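
As an illustration of a rank 2 "calculation level" optimization, the sketch below replaces a per-element division with one reciprocal and a multiplication per element. The function names scaleDiv and scaleRecip are made up for this example and do not come from the rays code.

```go
package main

import "fmt"

// scaleDiv is the straightforward form: one division per element.
func scaleDiv(v []float64, d float64) []float64 {
	out := make([]float64, len(v))
	for i, x := range v {
		out[i] = x / d
	}
	return out
}

// scaleRecip is the rewritten form: the reciprocal is computed once and
// every element is multiplied by it. The result can differ from scaleDiv
// in the last bits because 1/d is rounded before the multiplication.
func scaleRecip(v []float64, d float64) []float64 {
	inv := 1 / d
	out := make([]float64, len(v))
	for i, x := range v {
		out[i] = x * inv
	}
	return out
}

func main() {
	v := make([]float64, 100)
	for i := range v {
		v[i] = float64(i + 1)
	}
	a, b := scaleDiv(v, 49), scaleRecip(v, 49)
	mismatches := 0
	for i := range v {
		if a[i] != b[i] {
			mismatches++
		}
	}
	fmt.Printf("%d of %d results differ\n", mismatches, len(v))
}
```

The rewrite stays portable and within the language, which is why it sits in rank 2 rather than rank 3, even though it is not a bit-exact substitution.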

I would like to see 1. and 2. in the mainline of the code. But I also want to see an 'insanely optimized' version using 3.
The 'insane' version should not be allowed to merge into mainline, but as you have seen, these optimizations clearly show where there is room for improvement and where the weaknesses are.

More random thoughts:

  • If we had an ideal compiler, auto-vectorization (ex. SSE optimization) would be done by the compiler.
    • Also clamping, 2D-RNG
  • Usually, imperative programming languages allow side effects, so a compiler/interpreter could not (and should not) achieve lazy evaluation without special notation.
    • ex. Some kind of "pure" function attribute.
  • Process-wide GC is seriously bad.
  • The current RNG is not so good. LFSR variants are widely used for this purpose
    • eg. Xorshift, MT
    • Or use the standard library
  • Converting division into multiplication by the reciprocal should be allowed.
    • The converted form does not give identical results (ex. x87), but it is widely used.
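
To make the Xorshift suggestion concrete, here is a minimal sketch of Marsaglia's 32-bit xorshift generator with the 13/17/5 shift triple. The type and method names are made up for this example; it is not a proposal for the exact RNG the project should adopt, only an illustration of how small such a generator is.

```go
package main

import "fmt"

// xorshift32 is an illustrative type (not from the rays repository)
// holding the state of a 32-bit xorshift generator.
type xorshift32 struct {
	state uint32
}

// newXorshift32 seeds the generator. A zero state would stay zero forever,
// so a zero seed is replaced with 1.
func newXorshift32(seed uint32) *xorshift32 {
	if seed == 0 {
		seed = 1
	}
	return &xorshift32{state: seed}
}

// next advances the state with three shift-and-xor steps and returns it.
func (r *xorshift32) next() uint32 {
	x := r.state
	x ^= x << 13
	x ^= x >> 17
	x ^= x << 5
	r.state = x
	return x
}

// nextFloat maps the next 32-bit value to a float64 in [0, 1).
func (r *xorshift32) nextFloat() float64 {
	return float64(r.next()) / (1 << 32)
}

func main() {
	rng := newXorshift32(1)
	for i := 0; i < 3; i++ {
		fmt.Println(rng.nextFloat())
	}
}
```

Keeping the state in a value owned by the caller, rather than in a package-level variable, also lets each goroutine have its own generator.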


tkalbitz commented on August 15, 2024

+1 for @t-mat

There should be a clean vanilla version as the basis for a "dirty" optimized version.


kidoman commented on August 15, 2024

+1 for a clean reference version and a crazy all-out optimized version
