I'm seeing a case where, when running llc with -verify-machineinstrs, that the Machine

"No live subrange at use" after register allocation about llvm-mos HOT 11 CLOSED

llvm-mos commented on July 20, 2024

"No live subrange at use" after register allocation

from llvm-mos.

Comments (11)

mysterymath commented on July 20, 2024 1

Since you've asked for debugging info, I'll try to sort-of blog the debugging process as I do it. I'll summarize to avoid spamming the issue too much.

First off, I searched through the codebase for the message "No live subrange at use" and read through the code that produces the error. It looks like "subrange" in this context refers to a portion (subregister) of a virtual register; LLVM can track liveness of the different parts of a virtual register separately; for example, the low and high bytes of a 16-bit pointer in the zero page.

The error message is given for the virtual register %686. Searching for that, we find that all definitions and uses of %686 are in a small code snippet:

1668B     undef %686.sublo:imag16 = COPY %684.sublo:imag16
1676B     %687:gpr = COPY %686.sublo:imag16
1684B     STAbsOffset %687:gpr, %stack.15, 0 :: (store 1 into %stack.15)
1692B     %688:gpr = COPY %686.subhi:imag16

Sure enough, the straightforward reading of this leaves the high byte, %686.subhi, completely undefined, even though it's used at 1692B. So the next steps will be to scrub through the passes and watch this section of the generated code to see how and why this is emitted. As you suspected, the likely culprit is loadStoreRegToStackSlot, but it's also possible that a later pass is corrupting the output of that pass, so it's usually a good idea to double check.
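To make the reading of that snippet concrete, here is a minimal sketch of the kind of per-lane bookkeeping involved. It is a toy model, not the MachineVerifier's actual algorithm: it only handles straight-line code, and the lane masks and the tuple encoding of instructions are assumptions made up for illustration.

```python
# Toy model of per-lane "is this subregister defined before it is used?" checking.
# Assumption: two hypothetical lane bits stand in for the bytes of an imag16 register.
LANE_MASK = {"sublo": 0b01, "subhi": 0b10}
FULL = 0b11  # used when an operand names the whole register (lane is None)


def undefined_uses(insts, vreg):
    """Return (slot, lane) for every use of `vreg` whose lanes were not defined earlier.

    `insts` is a list of (slot, defs, uses); defs and uses are lists of (reg, lane).
    Only straight-line code is modelled; there is no control flow.
    """
    defined = 0
    problems = []
    for slot, defs, uses in insts:
        # An instruction reads its operands before it writes its results.
        for reg, lane in uses:
            mask = LANE_MASK.get(lane, FULL)
            if reg == vreg and (defined & mask) != mask:
                problems.append((slot, lane))
        for reg, lane in defs:
            if reg == vreg:
                defined |= LANE_MASK.get(lane, FULL)
    return problems


# The four instructions from the snippet, transcribed by hand.
SNIPPET = [
    ("1668B", [("%686", "sublo")], [("%684", "sublo")]),
    ("1676B", [("%687", None)], [("%686", "sublo")]),
    ("1684B", [], [("%687", None)]),
    ("1692B", [("%688", None)], [("%686", "subhi")]),
]

print(undefined_uses(SNIPPET, "%686"))  # [('1692B', 'subhi')]
```

The only complaint is the read of subhi at 1692B, which matches the use the verifier objects to.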


mysterymath commented on July 20, 2024 1

I know offhand that loadStoreRegToStackSlot is pretty much exclusively used inside register allocation to spill and reload virtual registers, so -stop-after=greedy will be the easiest way to differentiate between the two cases: whether the code is being emitted wrong, or whether a later pass is corrupting it.

--stop-before=greedy doesn't include virtual register %686 at all, while --stop-after=greedy has the corrupted code fragment in all its glory. So the issue is definitely contained within register allocation. It may be in loadStoreRegToStackSlot alone, or it could be an optimization or interaction with the code in InlineSpiller.cpp or Greedy.cpp.

I can see what that code inserts via -debug-only=mos-instrinfo; from this, loadStoreRegStackSlot emits:

%687:gpr = COPY %686.sublo:imag16
STAbsOffset %687:gpr, %stack.15, 0 :: (store 1 into %stack.15)
%688:gpr = COPY %686.subhi:imag16
STAbsOffset %688:gpr, %stack.15, 1 :: (store 1 into %stack.15 + 1)

This means that it's not responsible for the generation of %686, only the use. The error is now more likely to have occurred somewhere earlier, in a location yet to be determined, but still somewhere within register allocation.


mysterymath commented on July 20, 2024 1

There are two broad approaches to fix this (or any issue really):
A) Make the bad thing not happen
B) Make the bad thing not bad

The bad thing, in this case, I'll take to be wide virtual registers that are only partially used. On the surface, it doesn't seem like there are any clear cases where it'd be necessary to emit this case; if you're only using part of a virtual register, there's no big obvious reason why you couldn't use a smaller register class. So A seems like it may work.

Looking at why the instruction selector emits this in the first place, well, it seems to be a consequence of going through and considering i16 to be legal. It's only legal for certain instructions, but because of a hint in the LLVM docs, I made sure that it was legal to copy, PHI, extend, and truncate i16 as well, treating it as a "legal type." However, it doesn't look like there's anything that actually requires this; the legalizer doesn't really have the notion of a "legal type", only "legal type/operation combination". And with that lens, pretty clearly only G_CONSTANT i16 is actually legal, along with the int/pointer conversion. We only have i16's for use with addressing modes, so there's not actually any reason to have them legal outside that scope. We can combine a pair of i8's temporarily, use them, then discard the i16 afterwards. This should ensure that any i16s we generate are actually used as i16s somewhere; otherwise, they'll stay as an unrelated pair of i8s.
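As a rough illustration of "legal type/operation combination", the sketch below keys legality on the (opcode, type) pair rather than on the type alone. It is a toy lookup, not the GlobalISel LegalizerInfo API; the table contents, the action strings, and the merge/unmerge helpers are assumptions made for illustration, with G_INTTOPTR and G_PTRTOINT standing in for "the int/pointer conversion".

```python
# Toy legality table: the key is (opcode, type), never the type by itself.
# Assumption: these three pairs mirror the cases named above; nothing else is modelled.
LEGAL = {
    ("G_CONSTANT", "i16"),
    ("G_INTTOPTR", "i16"),
    ("G_PTRTOINT", "i16"),
}


def legalize_action(opcode, ty):
    """Return a hypothetical action string for an (opcode, type) pair."""
    if (opcode, ty) in LEGAL:
        return "legal"
    if ty == "i16":
        # Any other i16 operation gets split into a pair of i8 operations.
        return "narrow to i8 halves"
    return "not modeled"


def merge(lo, hi):
    """Combine two i8 values into a temporary i16 (low byte first)."""
    return ((hi & 0xFF) << 8) | (lo & 0xFF)


def unmerge(value):
    """Split an i16 back into its (lo, hi) bytes."""
    return value & 0xFF, (value >> 8) & 0xFF


print(legalize_action("G_CONSTANT", "i16"))  # legal
print(legalize_action("G_PHI", "i16"))       # narrow to i8 halves
print(hex(merge(0x34, 0x12)))                # 0x1234
```

With a table like this, an i16 only exists between a merge and the instruction that needs the full 16 bits; everything else stays as two independent i8 values.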

So, I went through and lowered instructions where i16 didn't strictly need to be legal. This deleted a fair amount of code in the instruction selector, seems to have made the emitted code slightly more efficient, made 15 or so test cases compile, and fixed game.ll. That seems to be the roughly average per-issue payoff, which is why things are coming along as quickly as they are.

This should be fixed as of 9d8c865


johnwbyrd commented on July 20, 2024

C source: https://github.com/sgadrat/6502-compilers-bench/blob/master/code_samples/ccgame/game_01_start.c


johnwbyrd commented on July 20, 2024

The assert is being generated in MachineVerifier.cpp on line 2190.


johnwbyrd commented on July 20, 2024

On an unrelated note, it's interesting to see the MOSIndexIV pass go at it and reduce everything to LDY/STY indirect. I can't imagine a legacy compiler doing anything like this.


johnwbyrd commented on July 20, 2024

Thanks for the sanity check; I think this is about where I lost the thread of the debug session as well. At the very least, I'm pleased that there was not an obvious solution that I overlooked. I vaguely suspect the isDef and isUndef flags on MachineInstr's may not be getting set correctly somewhere, though I have no evidence to back this up.


mysterymath commented on July 20, 2024

So, continuing backwards, running with straight -debug will give us the register allocation logs as well. The culprit line is the store to the low byte of %686, which is described in the following snippet of debug output:

selectOrSplit Imag16:%683 [1472r,3816r:0)  0@1472r L0000000000000002 [1472r,3816r:0)  0@1472r weight:2.351084e-02 w=2.351084e-02
RS_Spill Cascade 0
Inline spilling Imag16:%683 [1472r,3816r:0)  0@1472r L0000000000000002 [1472r,3816r:0)  0@1472r weight:2.351084e-02
From original %514
Merged spilled regs: SS#15 [1472r,3816r:0)  0@x weight:0.000000e+00
spillAroundUses %683
        rewrite: 1472r  undef %686.sublo:imag16 = COPY %684.sublo:imag16

It looks like the register allocator couldn't assign %683 to a register, so it decided to spill and reload it around each of its uses. Going back further, %683 is amongst a number of virtual registers originally split out of %514 ("From original %514"). In turn, %514 appears to be from the original input to greedy (-stop-before=greedy):

%511:ac = LDImm 0
undef %514.sublo:imag16 = COPY %511

Now, that's not that unusual by itself, but it looks like %514 is never actually used as an imag16 register: only its sublo is ever defined, and only its sublo is ever used. That appears to be the root cause of the original issue: loadStoreRegStackSlot gets called on a 16-bit register, but only half of the register is actually defined. Fingers crossed, but there shouldn't ever be a reason that this situation needs to occur; you can always define a narrower virtual register instead. So now the task becomes figuring out where %514 is coming from, and why it wasn't given the register class, say, anyi8.


mysterymath commented on July 20, 2024

So, scrubbing backwards from greedy, the next natural place to look is where machine code is first generated, the instruction selector. But the output of the instruction selector looks totally fine in this case; a virtual register %14 is constructed from a 16-bit constant (the null pointer), and it's copied straight through to form %514, without alteration. So this code was corrupted somewhere along the way.

I'll save the tedium of describing the binary search between these two points, but it ends up that the code is fine before "rename-independent-subregs" and totally screwed up after. This means that we either failed to implement some target-specific interface that the pass needs, implemented it incorrectly, or the pass's logic assumes that a case that can happen on the 6502 cannot happen. This makes sense, since I've never had to deal with this pass before, and have no idea what it does, how it does it, or why. But at least, that's where the problem is.
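For anyone who wants to repeat that kind of search, a minimal sketch of the bisection follows. The pass list is an illustrative subset rather than the real pipeline, and the predicate is a stub: in practice it would mean running llc with -stop-after on the given pass and inspecting the output by hand or with a script.

```python
def first_bad_pass(passes, is_good_after):
    """Binary search for the first pass whose output is bad.

    Assumes the output is good before passes[0], bad after passes[-1],
    and that once it goes bad it stays bad.
    """
    lo, hi = 0, len(passes) - 1  # invariant: the first bad pass is in passes[lo..hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_good_after(passes[mid]):
            lo = mid + 1  # still fine here, so the culprit runs later
        else:
            hi = mid      # already broken, so the culprit is this pass or earlier
    return passes[lo]


# Illustrative subset of passes between instruction selection and greedy (an assumption).
PASSES = [
    "instruction-select",
    "finalize-isel",
    "phi-node-elimination",
    "twoaddressinstruction",
    "rename-independent-subregs",
    "machine-scheduler",
    "greedy",
]


def stub_is_good_after(pass_name):
    # Stand-in for "run llc -stop-after=<pass> and check that only sublo is half-defined".
    return PASSES.index(pass_name) < PASSES.index("rename-independent-subregs")


print(first_bad_pass(PASSES, stub_is_good_after))  # rename-independent-subregs
```

With seven candidates this takes three checks instead of up to seven, which matters more as the list of passes grows.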


mysterymath commented on July 20, 2024

And the other shoe drops; the purpose of rename-independent-subregs is precisely to produce code of the form we see: larger virtual registers where only one subregister is live. There's even the following:

  // TODO: We could attempt to recompute new register classes while visiting
  // the operands: Some of the split register may be fine with less constraint
  // classes than the original vreg.

So, the code we saw is considered good and correct, by virtue of the two halves of a larger virtual register being, after the transformation, allocatable to completely unrelated memory locations, which is actually a good thing. So now that I know what the situation is, I'll go heads down trying to think of a principled way of dealing with it.
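A heavily simplified sketch of the effect, as I understand it: if no operand ever touches the wide register as a whole, each lane can be given its own virtual register number while keeping the wide register class. The real pass works on connected components of live subranges; the rule used here ("no whole-register access"), the operand encoding, and the numbering scheme are assumptions made for illustration.

```python
def rename_independent_lanes(operands, vreg, next_vreg):
    """Give each lane of `vreg` its own register number when the lanes are independent.

    `operands` is a list of (reg, lane) in program order; lane None means the whole
    register. The first lane seen keeps `vreg`; later lanes get next_vreg, next_vreg + 1, ...
    """
    if any(reg == vreg and lane is None for reg, lane in operands):
        return list(operands)  # a full-width access ties the lanes together

    mapping = {}  # lane -> register number that now carries it
    renamed = []
    for reg, lane in operands:
        if reg != vreg:
            renamed.append((reg, lane))
            continue
        if lane not in mapping:
            mapping[lane] = vreg if not mapping else next_vreg + len(mapping) - 1
        renamed.append((mapping[lane], lane))
    return renamed


# Register 14 is only ever accessed through sublo and subhi, never as a whole.
ops = [(14, "sublo"), (14, "subhi"), (14, "sublo"), (14, "subhi")]
print(rename_independent_lanes(ops, 14, 100))
# [(14, 'sublo'), (100, 'subhi'), (14, 'sublo'), (100, 'subhi')]
```

After the renaming, each register is still nominally 16 bits wide but only one of its lanes is ever live, which is the shape seen in %514 and %686.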


johnwbyrd commented on July 20, 2024

Outstanding work. Especially, it's well done to take the more complicated but ultimately more long-term stable solution.

