Comments (11)
Since you've asked for debugging info, I'll try to sort-of blog the debugging process as I do it. I'll summarize to avoid spamming the issue too much.
First off, I searched through the codebase for the message "No live subrange at use" and read through the code that produces the error. It looks like "subrange" in this context refers to a portion (subregister) of a virtual register; LLVM can track liveness of the different parts of a virtual register separately; for example, the low and high bytes of a 16-bit pointer in the zero page.
The error message is given for the virtual register %686. Searching for that, we find that all definitions and uses of %686 are in a small code snippet:
1668B undef %686.sublo:imag16 = COPY %684.sublo:imag16
1676B %687:gpr = COPY %686.sublo:imag16
1684B STAbsOffset %687:gpr, %stack.15, 0 :: (store 1 into %stack.15)
1692B %688:gpr = COPY %686.subhi:imag16
Sure enough, the straightforward reading of this leaves the high byte, %686.subhi, completely undefined, even though it's used at 1692B. So the next steps will be to scrub through the passes and watch this section of the generated code to see how and why this is emitted. As you suspected, the likely culprit is loadStoreRegToStackSlot, but it's also possible that a later pass is corrupting the output of that pass, so it's usually a good idea to double-check.
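As a rough illustration of the kind of check that fails here, the following is a toy model of per-subregister liveness over a straight-line snippet. It is not LLVM's actual data structure or the MachineVerifier's algorithm; the lane masks, the function name, and the `live_in` parameter are all made up for this sketch.

```python
# Toy model of per-subregister ("lane") liveness. Hypothetical names throughout;
# this does not mirror LLVM's LiveInterval or MachineVerifier code.
SUBLO, SUBHI = 0b01, 0b10  # assumed lane masks for the two bytes of an imag16
BYTE = 0b01                # an 8-bit gpr has a single lane in this model


def find_undefined_uses(instrs, live_in=None):
    """instrs: list of (slot, defs, uses), where defs/uses map vreg -> lane mask.

    Returns (slot, vreg, missing_lanes) for every use that reads a lane
    no earlier instruction has defined. Straight-line code only.
    """
    defined = dict(live_in or {})
    errors = []
    for slot, defs, uses in instrs:
        # Uses are read before this instruction's own defs take effect.
        for vreg, lanes in uses.items():
            missing = lanes & ~defined.get(vreg, 0)
            if missing:
                errors.append((slot, vreg, missing))
        for vreg, lanes in defs.items():
            defined[vreg] = defined.get(vreg, 0) | lanes
    return errors


# The four instructions quoted above, transcribed by hand.
snippet = [
    ("1668B", {"%686": SUBLO}, {"%684": SUBLO}),
    ("1676B", {"%687": BYTE}, {"%686": SUBLO}),
    ("1684B", {}, {"%687": BYTE}),
    ("1692B", {"%688": BYTE}, {"%686": SUBHI}),
]

errors = find_undefined_uses(snippet, live_in={"%684": SUBLO})
print(errors)
```

In this model the only flagged use is the read of %686's high lane at 1692B, which matches the reading above: nothing in the snippet ever defines %686.subhi.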
from llvm-mos.
I know offhand that loadStoreRegToStackSlot is pretty much exclusively used inside register allocation to spill and reload virtual registers, so -stop-after=greedy will be the easiest way to differentiate between the two cases: whether the code is being emitted wrong, or whether a later pass is corrupting it.
--stop-before=greedy doesn't include virtual register %686 at all, while --stop-after=greedy has the corrupted code fragment in all its glory. So the issue is definitely contained within register allocation. It may be in loadStoreRegToStackSlot alone, or it could be an optimization or interaction with the code in InlineSpiller.cpp or Greedy.cpp.
I can see what that code inserts via -debug-only=mos-instrinfo; from this, loadStoreRegStackSlot emits:
%687:gpr = COPY %686.sublo:imag16
STAbsOffset %687:gpr, %stack.15, 0 :: (store 1 into %stack.15)
%688:gpr = COPY %686.subhi:imag16
STAbsOffset %688:gpr, %stack.15, 1 :: (store 1 into %stack.15 + 1)
This means that it's not responsible for the generation of 686, only the use. The error is now more likely to have occurred somewhere earlier, in a location yet to be determined, but still somewhere within register allocation.
from llvm-mos.
There are two broad approaches to fix this (or any issue really):
A) Make the bad thing not happen
B) Make the bad thing not bad
The bad thing, in this case, I'll take to be wide virtual registers that are only partially used. On the surface, it doesn't seem like there are any clear cases where it'd be necessary to emit this case; if you're only using part of a virtual register, there's no big obvious reason why you couldn't use a smaller register class. So A seems like it may work.
Looking at why the instruction selector emits this in the first place, well, it seems to be a consequence of going through and considering i16 to be legal. It's only legal for certain instructions, but because of a hint in the LLVM docs, I made sure that it was legal to copy, PHI, extend, and truncate i16 as well, treating it as a "legal type." However, it doesn't look like there's anything that actually requires this; the legalizer doesn't really have the notion of a "legal type", only "legal type/operation combination". And with that lens, pretty clearly only G_CONSTANT i16 is actually legal, along with the int/pointer conversion. We only have i16s for use with addressing modes, so there's not actually any reason to have them legal outside that scope. We can combine a pair of i8s temporarily, use them, then discard the i16 afterwards. This should ensure that any i16s we generate are actually used as i16s somewhere; otherwise, they'll stay as an unrelated pair of i8s.
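To make the "combine a pair of i8s temporarily, use them, then discard the i16" idea concrete, here is a minimal sketch in plain integer arithmetic. The function names are hypothetical and this is not the legalizer code from the commit; it only shows that the two bytes survive the round trip unchanged, so nothing is lost by keeping values as separate i8s outside the point of use.

```python
def merge_i8_pair(lo, hi):
    """Combine two bytes into one 16-bit value, low byte in the low bits."""
    assert 0 <= lo <= 0xFF and 0 <= hi <= 0xFF
    return (hi << 8) | lo


def unmerge_i16(value):
    """Split a 16-bit value back into its (lo, hi) bytes."""
    assert 0 <= value <= 0xFFFF
    return value & 0xFF, value >> 8


# Form the 16-bit value only where an address is needed, then go back to bytes.
pointer = merge_i8_pair(0x34, 0x12)
lo, hi = unmerge_i16(pointer)
print(hex(pointer), hex(lo), hex(hi))
```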
So, I went through and lowered instructions where i16 didn't strictly need to be legal. This deleted a fair amount of code in the instruction selector, seems to have made the emitted code slightly more efficient, made 15 or so test cases compile, and fixed game.ll. That seems to be roughly the average per-issue payoff, which is why things are coming along as quickly as they are.
This should be fixed as of 9d8c865
from llvm-mos.
C source: https://github.com/sgadrat/6502-compilers-bench/blob/master/code_samples/ccgame/game_01_start.c
from llvm-mos.
The assert is being generated in MachineVerifier.cpp on line 2190.
from llvm-mos.
On an unrelated note, it's interesting to see the MOSIndexIV pass go at it and reduce everything to LDY/STY indirect. I can't imagine a legacy compiler doing anything like this.
from llvm-mos.
Thanks for the sanity check; I think this is about where I lost the thread of the debug session as well. At the very least, I'm pleased that there was not an obvious solution that I overlooked. I vaguely suspect the isDef and isUndef flags on MachineInstr's may not be getting set correctly somewhere, though I have no evidence to back this up.
from llvm-mos.
So, continuing backwards, running with straight -debug will give us the register allocation logs as well. The culprit line is the store to the low byte of %686, which is described in the following snippet of debug output:
selectOrSplit Imag16:%683 [1472r,3816r:0) 0@1472r L0000000000000002 [1472r,3816r:0) 0@1472r weight:2.351084e-02 w=2.351084e-02
RS_Spill Cascade 0
Inline spilling Imag16:%683 [1472r,3816r:0) 0@1472r L0000000000000002 [1472r,3816r:0) 0@1472r weight:2.351084e-02
From original %514
Merged spilled regs: SS#15 [1472r,3816r:0) 0@x weight:0.000000e+00
spillAroundUses %683
rewrite: 1472r undef %686.sublo:imag16 = COPY %684.sublo:imag16
It looks like the register allocator couldn't assign %683 to a register, so it decided to spill and reload it around each of its uses. Going back further, %683 is amongst a number of virtual registers originally split out of %514 ("From original %514"). In turn, %514 appears to be from the original input to greedy (-stop-before=greedy):
%511:ac = LDImm 0
undef %514.sublo:imag16 = COPY %511
Now, that's not that unusual by itself, but it looks like %514 is never actually used as an imag16 register: only its sublo is ever defined, and only its sublo is ever used. That appears to be the root cause of the original issue: loadStoreRegStackSlot gets called on a 16-bit register, but only half of the register is actually defined. Fingers crossed, but there shouldn't ever be a reason that this situation needs to occur; you can always define a narrower virtual register instead. So now the task becomes figuring out where %514 is coming from, and why it wasn't given a narrower register class such as anyi8.
from llvm-mos.
So, scrubbing backwards from greedy, the next natural place to look is where machine code is first generated, the instruction selector. But the output of the instruction selector looks totally fine in this case; a virtual register %14 is constructed from a 16-bit constant (the null pointer), and it's copied straight through to form %514, without alteration. So this code was corrupted somewhere along the way.
I'll save the tedium of describing the binary search between these two points, but it ends up that the code is fine before "rename-independent-subregs" and totally screwed up after. This means that either we failed to implement some target-specific interface needed by that pass, we implemented it incorrectly, or the pass's logic assumes that a case that can occur on the 6502 cannot. This makes sense, since I've never had to deal with this pass before, and have no idea what it does, how it does it, or why. But at least, that's where the problem is.
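For anyone repeating this kind of search, the bisection itself can be sketched as below. The pass list is a placeholder (only "instruction-select", "rename-independent-subregs" and "greedy" are named in this thread; "pass-a" and friends are invented), and the predicate is stubbed out. In practice it would mean running with -stop-after=<pass> and inspecting the output by hand or with a script. The sketch also assumes that once the code goes bad it stays bad in every later pass.

```python
def first_bad_pass(passes, is_good_after):
    """Binary search for the first pass whose output is bad.

    Assumes the output after the last pass is bad, and that the output is
    good after every pass before the culprit and bad from the culprit on.
    """
    lo, hi = 0, len(passes) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_good_after(passes[mid]):
            lo = mid + 1  # culprit runs later than passes[mid]
        else:
            hi = mid      # culprit is passes[mid] or earlier
    return passes[lo]


# Placeholder pipeline; not the real pass order.
passes = [
    "instruction-select",
    "pass-a",
    "pass-b",
    "rename-independent-subregs",
    "pass-c",
    "greedy",
]

# Stub predicate standing in for "run with -stop-after=<name> and inspect".
culprit_index = passes.index("rename-independent-subregs")


def is_good_after(name):
    return passes.index(name) < culprit_index


print(first_bad_pass(passes, is_good_after))
```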
from llvm-mos.
And the other shoe drops; the purpose of rename-independent-subregs is precisely to produce code of the form we see: larger virtual registers where only one subregister is live. There's even the following:
// TODO: We could attempt to recompute new register classes while visiting
// the operands: Some of the split register may be fine with less constraint
// classes than the original vreg.
So, the code we saw is considered good and correct, by virtue of the two halves of a larger virtual register being, after the transformation, allocatable to completely unrelated memory locations, which is actually a good thing. So now that I know what the situation is, I'll go heads down trying to think of a principled way of dealing with it.
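A loose illustration of the idea, not the pass's actual algorithm (which works on live intervals and their connected components): treat every operand that touches a virtual register as a lane mask, and keep two lanes together only if some operand touches both at once. Lanes that never appear together can be renamed into separate virtual registers. The function name and the two-lane setup are assumptions made for this sketch.

```python
def split_independent_lanes(accesses, num_lanes=2):
    """Group lanes of one vreg that are ever accessed together.

    accesses: iterable of lane masks, one per operand touching the vreg.
    Returns a sorted list of lane-mask groups; each group could in
    principle become its own virtual register.
    """
    parent = list(range(num_lanes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for mask in accesses:
        lanes = [i for i in range(num_lanes) if (mask >> i) & 1]
        for other in lanes[1:]:
            parent[find(other)] = find(lanes[0])

    groups = {}
    for lane in range(num_lanes):
        root = find(lane)
        groups[root] = groups.get(root, 0) | (1 << lane)
    return sorted(groups.values())


# Low and high bytes are only ever touched separately: two independent groups.
print(split_independent_lanes([0b01, 0b01, 0b10, 0b10]))
# One operand reads the full 16 bits: the lanes must stay together.
print(split_independent_lanes([0b01, 0b11]))
```

Under this model, a register whose halves are only ever copied byte by byte splits into two groups, each of which still carries the original 16-bit register class unless something narrows it, which is the situation the TODO comment describes.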
from llvm-mos.
Outstanding work. Especially, it's well done to take the more complicated but ultimately more long-term stable solution.
from llvm-mos.