Comments (6)
Yes!
I would really like better support for writing code that produces branch-free traces. The big upside is that avoiding branches keeps you on-trace, which is the key to getting performance from the JIT. If the compiler could predictably compile some uses of the and/or operators into branch-free IR then that would be awesome. (I don't think it's as critical for the machine code to be branch-free.)
This would be similar to the way that the JIT treats min and max today. Here is the example rewritten to compile into branch-free IR:
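local a = {}
for i = 1, 2^10 do
  -- Parentheses matter: * and % share precedence, so 10 * i % 2 is (10 * i) % 2, which is always 0.
  a[i] = math.max(5, 10 * (i % 2))
end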
But life is really too short to write branch-free code using only min and max to combine values. We have sometimes resorted to this in Snabb and it always elicits groans during review.
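For comparison, here is a minimal sketch of the two spellings side by side. It assumes the branchy original used the usual and/or idiom (that code is not quoted in this comment, so treat it as a guess), and the function names fill_branchy and fill_minmax are made up for illustration.

```lua
-- Branchy spelling: the and/or idiom. Per the discussion above, this is
-- the form that is expected to introduce a branch into the trace.
local function fill_branchy(n)
  local a = {}
  for i = 1, n do
    a[i] = (i % 2 == 0) and 5 or 10
  end
  return a
end

-- min/max spelling: i % 2 is 0 or 1, so 10 * (i % 2) is 0 or 10,
-- and math.max yields 5 for even i and 10 for odd i.
local function fill_minmax(n)
  local a = {}
  for i = 1, n do
    a[i] = math.max(5, 10 * (i % 2))
  end
  return a
end

-- Both spellings should fill the array with the same values.
local x, y = fill_branchy(16), fill_minmax(16)
for i = 1, 16 do
  assert(x[i] == y[i])
end
```

The equivalence only holds because both outcomes are numbers that can be ordered, which is why the trick does not generalise to arbitrary values.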
from raptorjit.
Is it reasonable to look at this issue in two parts?
First - what is the set of code that we can successfully translate into a branch-free expression?
Second - what are the options to implement that translation? It could be done with the JIT, with new bytecodes, or even by translating into other existing bytecodes with a simple optimization pass in the front end.
I suppose there is also the question of when this is really desirable. On a CPU the choice between branching and branch-free algorithms is not a no-brainer: it depends on whether you are more worried about the control hazard of the branch or the data dependency of the conditional move. On the JIT there may be similar subtleties, e.g. do you have to worry about whether the runtime types of the various operands will be compatible (unknown during bytecode compilation)?
Fun topic :).
from raptorjit.
One more question...
Is it better for the compiler to decide when to use a conditional move, or for the programmer to make that decision? The compiler is already too unpredictable, so I would be reluctant to introduce new automatic optimizations that can backfire in significant ways. If the optimization is risky then we could instead consider introducing a new conditional move operator in the language, like a[i] = i % 2 == 0 ? 5 : 10.
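To make the intended semantics concrete, here is a small sketch in today's Lua. The helper name sel is hypothetical and stands in for the proposed operator; it is written with an ordinary if, so it only models the value semantics and says nothing about what the JIT would emit.

```lua
-- Hypothetical helper with the value semantics of the proposed `c ? a : b`.
-- Both a and b are evaluated before the call, like the operands of a
-- conditional move, so this only models the pure case.
local function sel(c, a, b)
  if c then return a else return b end
end

local a = {}
for i = 1, 8 do
  a[i] = sel(i % 2 == 0, 5, 10)
end

-- The existing and/or idiom is not a drop-in ternary: it falls through
-- to the third operand whenever the second one is false or nil.
local via_and_or = true and false or "fallback"
local via_sel = sel(true, false, "fallback")
```

Here via_and_or ends up as "fallback" while via_sel is false, which is one argument for a dedicated operator rather than pattern-matching and/or.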
from raptorjit.
So you guys are open to introducing language changes, huh?
I think it's going to be hard to make this optimisation "backfire" in any significant way; it would be easy to have a maximum IR length for a branch to be unbranched, and certain operations would be blacklisted regardless, e.g. impure operations.
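As a rough sketch of that gate: the limit, the blacklist, and the table shape below are all assumptions made for illustration. The opcode names are borrowed from LuaJIT's IR, but the set is not meant to be a complete list of impure operations.

```lua
-- Hypothetical limit and blacklist, chosen only for illustration.
local MAX_BRANCH_IR = 8
local IMPURE_OPS = { ASTORE = true, HSTORE = true, XSTORE = true, CALLS = true }

-- arm: array of { op = "NAME" } records for one side of a conditional.
-- Returns true when the arm is short enough and has no blacklisted op.
local function can_unbranch(arm)
  if #arm > MAX_BRANCH_IR then
    return false
  end
  for _, ins in ipairs(arm) do
    if IMPURE_OPS[ins.op] then
      return false
    end
  end
  return true
end

local pure_arm = { { op = "BAND" }, { op = "ADD" } }
local impure_arm = { { op = "ADD" }, { op = "ASTORE" } }
print(can_unbranch(pure_arm), can_unbranch(impure_arm))
```

A conditional would be converted only if both of its arms pass this check; anything else keeps the current guard-and-side-exit behaviour.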
In GCC the optimisation is called either if-conversion or if-conversion2, and this is what I found for if-conversion through a simple GitHub search:
https://github.com/gcc-mirror/gcc/blob/e000adb99a2cbe03e98b79878c28f1ca6fe986ef/gcc/ifcvt.h
https://github.com/gcc-mirror/gcc/blob/d592b7eb01d951389d742e4ebc5eae0d7fecacde/gcc/gimple-ssa-split-paths.c
https://github.com/gcc-mirror/gcc/blob/6e1b9a473ce28938f2e10e38b82aa995619c7bfb/gcc/tree-if-conv.c
https://github.com/gcc-mirror/gcc/blob/1911fdd760cb98bd9a84ddc6d6cb72ad99ea5796/gcc/tree-ssa-ifcombine.c
https://github.com/gcc-mirror/gcc/blob/967ec7427cf042db62ed85b4df6006f65fde3333/gcc/tree-vectorizer.c
https://github.com/gcc-mirror/gcc/blob/1911fdd760cb98bd9a84ddc6d6cb72ad99ea5796/gcc/ifcvt.c
https://github.com/gcc-mirror/gcc/blob/1cb6c2eb3b8361d850be8e8270c597270a1a7967/gcc/tree-ssa-loop-ch.c
https://github.com/gcc-mirror/gcc/blob/1cb6c2eb3b8361d850be8e8270c597270a1a7967/gcc/doc/passes.texi
https://github.com/gcc-mirror/gcc/blob/6e1b9a473ce28938f2e10e38b82aa995619c7bfb/gcc/tree-vectorizer.h
https://github.com/gcc-mirror/gcc/blob/2fbe7a3260952e247399aeef226839ced1f10d70/gcc/config/epiphany/epiphany.md
https://github.com/gcc-mirror/gcc/blob/1cb6c2eb3b8361d850be8e8270c597270a1a7967/gcc/doc/gimple.texi
https://github.com/gcc-mirror/gcc/blob/700a97608cadfe8adcd1a98e6388a5cbee9d76f6/gcc/config/frv/frv.h
https://github.com/gcc-mirror/gcc/blob/08b15fdc56232725a65c6eebb244bed13be06676/gcc/target.def
https://github.com/gcc-mirror/gcc/blob/505329ddc7cab005f1eaa5b2f7b51e69129d53b1/gcc/opts.c
https://github.com/gcc-mirror/gcc/blob/08b15fdc56232725a65c6eebb244bed13be06676/gcc/tree-vect-loop.c
https://github.com/gcc-mirror/gcc/blob/fd2ed0fe6f69e94df71a88ef0593d126956a30f1/gcc/match.pd
https://github.com/gcc-mirror/gcc/blob/6e1b9a473ce28938f2e10e38b82aa995619c7bfb/gcc/tree-vect-patterns.c
https://github.com/gcc-mirror/gcc/blob/505329ddc7cab005f1eaa5b2f7b51e69129d53b1/gcc/cfgexpand.c
https://github.com/gcc-mirror/gcc/blob/700a97608cadfe8adcd1a98e6388a5cbee9d76f6/gcc/config/frv/frv.c
https://github.com/gcc-mirror/gcc/blob/4bb28e46497f0819f5e3c4264cab7bbbd712b757/gcc/doc/tm.texi.in
from raptorjit.
Now that I think about it, unbranching will just be faaar too hard without proper IR support.
So yes, we need to be able to represent branches of some kind in the IR.
There doesn't need to be support for impure branches (yet) in the same trace, but pure branches could be handled through some PHI-like instruction that takes 3 arguments.
The first is a selector argument, which is a boolean, and the second and third are the values that can be selected. Let's call this SEL for now.
There won't really be a problem with matching the types, since you could just make the instruction guard if it can't be deduced that they will be that type.
For how to actually generate this: I am not very knowledgeable about the recorder, but I assume that it currently just follows one path, so the recorder has to somehow be aware of both branches, find where the branches merge, and then find the registers that change depending on the branch. If any of the branches are impure, this has to abort somehow.
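The merge step could look roughly like the sketch below. Everything in it is hypothetical: it assumes each arm has already been recorded and summarised as a map from register number to the IR reference written there, and it does not reflect how the real recorder is structured.

```lua
-- before:      register -> IR ref held before the conditional
-- then_writes: register -> IR ref written by the "then" arm
-- else_writes: register -> IR ref written by the "else" arm
-- Returns one SEL record per register whose value differs at the merge.
local function merge_arms(cond_ref, before, then_writes, else_writes)
  -- Collect every register touched by either arm, in a stable order.
  local changed = {}
  for reg in pairs(then_writes) do changed[reg] = true end
  for reg in pairs(else_writes) do changed[reg] = true end
  local regs = {}
  for reg in pairs(changed) do regs[#regs + 1] = reg end
  table.sort(regs)

  local sels = {}
  for _, reg in ipairs(regs) do
    -- An arm that did not write the register keeps the old value.
    local t = then_writes[reg] or before[reg]
    local e = else_writes[reg] or before[reg]
    if t ~= e then
      sels[#sels + 1] = { op = "SEL", reg = reg, cond = cond_ref, a = t, b = e }
    end
  end
  return sels
end

-- Register 7 gets ref 14 on one arm and ref 13 on the other; register 3 is untouched.
local sels = merge_arms(18, { [3] = 12, [7] = 11 }, { [7] = 14 }, { [7] = 13 })
```

Purity checking and finding the merge point are left out here; the sketch covers only the "find the registers that change" part.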
Example IR:
0013    num KNUM   +5
0014    num KNUM   +10
....        SNAP   #3   [ ---- ---- ---- ]
0015 >  int LE     0012  +1024
....        SNAP   #4   [ ---- ---- ---- 0012 ---- ---- 0012 ]
0016 ------ LOOP ------------
0017    int BAND   0012  +1
0018    int NE     0017  +0    ; no guard!
0019    num SEL    0018  0014  0013
....        SNAP   #6   [ ---- ---- ---- 0012 ---- ---- 0012 ]
0020 >  int ABC    0005  0012
0021    p32 AREF   0007  0012
0022    num ASTORE 0021  0019
0023  + int ADD    0012  +1
....        SNAP   #7   [ ---- ---- ---- ]
0024 >  int LE     0023  +1024
0025    int PHI    0012  0023
---- TRACE 1 stop -> loop
from raptorjit.
To answer your questions:
- Any pure code that is not too long.
- As described above
- There are (probably) very few cases where a pure short branch not being an actual "branch" would yield worse performance than now.
from raptorjit.