
Comments (6)

juj commented on July 21, 2024

If you replace rand16() with e.g. a constant 1, does the inefficiency go away?

Maybe it might be the case that the compiler can't prove that rand16() doesn't touch memory and potentially i, so it thinks it needs to store X and reload it later.
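For reference, here is a minimal sketch of the two loop shapes being compared. The original source is not shown in this thread, so the count-down loop over an 8-bit local and the names `rand16_stub`, `sum_with_call` and `sum_with_constant` are assumptions made for illustration; the real rand16() lives in neslib's assembly.

```c
#include <stdint.h>

/* Hypothetical stand-in for neslib's rand16(), which is really hand-written
   assembly. The constants are arbitrary; only the call itself matters. */
static uint16_t prng_state = 1;

uint16_t rand16_stub(void) {
    prng_state = (uint16_t)(prng_state * 25173u + 13849u);
    return prng_state;
}

/* Loop body makes a call: the counter has to survive that call. */
uint16_t sum_with_call(uint8_t count) {
    uint16_t acc = 0;
    for (uint8_t i = count; i != 0; --i) {
        acc += rand16_stub();
    }
    return acc;
}

/* Same loop with the call replaced by the constant 1. */
uint16_t sum_with_constant(uint8_t count) {
    uint16_t acc = 0;
    for (uint8_t i = count; i != 0; --i) {
        acc += 1;
    }
    return acc;
}
```

To keep the callee opaque on godbolt, one would probably leave only a prototype for the stub in the translation unit so the compiler cannot inline it.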

Any chance you'd be able to reduce the test case into a minimal code line test case, so that it could e.g. be verifiable on https://godbolt.org/ ? That way it will be easier to see if the issue still persists in the future when newer versions of llvm-mos come up on godbolt.

from llvm-mos.

cogwheel commented on July 21, 2024

> Maybe it might be the case that the compiler can't prove that rand16() doesn't touch memory and potentially i, so it thinks it needs to store X and reload it later.

Even if it needs to use the value in $31 elsewhere, shouldn't it at least be able to eliminate the 2nd ldx? (unless i was declared volatile, maybe?)


juj commented on July 21, 2024

Yeah, probably. That was just a guess.


Memblers commented on July 21, 2024

> If you replace rand16() with e.g. a constant 1, does the inefficiency go away?

Good question, yeah it becomes a simple DEX / BNE in that case.

> Maybe it might be the case that the compiler can't prove that rand16() doesn't touch memory and potentially i, so it thinks it needs to store X and reload it later.

neslib's rand.s is in assembly. rand16 uses X, and JSRs to rand8 twice. Both of those subroutines are defined like:
.section .text.rand8,"ax",@progbits

> Any chance you'd be able to reduce the test case into a minimal code line test case, so that it could e.g. be verifiable on https://godbolt.org/ ? That way it will be easier to see if the issue still persists in the future when newer versions of llvm-mos come up on godbolt.

Sure, I put a reduced version here. rand16 may be a red herring, though: it's still in there, and the iterator test suddenly looks more reasonable (it does DEC ZP and then loads the value into Y, but this time the Y value is actually used at the top of the loop).
https://godbolt.org/z/PTWxKvdbr

While editing it down, I noticed something strange: code before the loop is affecting the loop, which doesn't seem like it should happen. This copy has one of the neslib function calls uncommented, and uncommenting any one of those calls seems to have the same effect. With this, it's back to doing the LDX / DEX / STX thing (the actual X value gets trashed by rand16).
https://godbolt.org/z/rdoaaqheG


Memblers commented on July 21, 2024

I'm also now noticing more inefficiency at the end of this same loop. Same code as linked above: https://godbolt.org/z/rdoaaqheG

        ldx     mos8(.Lmain_zp_stk+2)           ; 1-byte Folded Reload
        dex
        stx     mos8(.Lmain_zp_stk+2)           ; 1-byte Folded Spill
        ldx     mos8(.Lmain_zp_stk+2)           ; 1-byte Folded Reload
        bne     .LBB0_1
        ldx     #0
        rts

X was just tested by BNE, so X must be zero for execution to fall through, yet the next instruction is a redundant LDX #0. This seems related to this issue, but I could open it as a separate issue if that would help.
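To make the reasoning about the flags concrete, here is a small C model of that loop tail. It is only a sketch: it tracks nothing but X and the zero flag, and the names `Cpu` and `x_at_loop_exit` are made up for this illustration. It relies on LDX and DEX setting Z from their result and on STX leaving the flags untouched.

```c
#include <stdbool.h>
#include <stdint.h>

/* Just enough 6502 state to follow the loop tail: X and the zero flag. */
typedef struct {
    uint8_t x;
    bool z;
} Cpu;

/* LDX sets Z from the loaded value. */
static void ldx(Cpu *cpu, uint8_t value) {
    cpu->x = value;
    cpu->z = (value == 0);
}

/* DEX sets Z from the decremented value. */
static void dex(Cpu *cpu) {
    cpu->x = (uint8_t)(cpu->x - 1);
    cpu->z = (cpu->x == 0);
}

/* STX writes X out and leaves the flags alone. */
static uint8_t stx(const Cpu *cpu) {
    return cpu->x;
}

/* Runs the quoted tail until BNE falls through; returns X at that point. */
uint8_t x_at_loop_exit(uint8_t start) {
    Cpu cpu = {0, false};
    uint8_t slot = start;          /* models mos8(.Lmain_zp_stk+2) */
    do {
        ldx(&cpu, slot);           /* Folded Reload */
        dex(&cpu);
        slot = stx(&cpu);          /* Folded Spill */
        ldx(&cpu, slot);           /* second reload: same X, same Z */
    } while (!cpu.z);              /* bne .LBB0_1 */
    return cpu.x;                  /* value X holds when ldx #0 is reached */
}
```

In this model the second reload never changes X or Z, and X is always 0 when the branch falls through, which is why both the second LDX and the LDX #0 look removable.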


Memblers commented on July 21, 2024

I tried a copy/paste of the loop in question, so it runs another copy of that loop. The first copy does the unoptimized X register stuff, and the second copy uses DEC ZP. This happens after a neslib call; any single one of them will do it. If there are no neslib calls beforehand, both for loops use DEC ZP.

If I put another neslib call before the second copy of the loop, that makes the second copy also do the redundant X register stuff.

https://godbolt.org/z/d3n3Ezfn3
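The arrangement described above has roughly the following shape. This is a hedged sketch rather than the code behind the link: `neslib_call_stub` and `rand16_like` are hypothetical stand-ins for the real neslib routines, and the `CALL_BEFORE_SECOND_LOOP` macro is invented here to toggle the extra call.

```c
#include <stdint.h>

/* Hypothetical stand-in for rand16(); it just counts up so the result is
   easy to predict. */
static uint16_t fake_state = 0;

uint16_t rand16_like(void) {
    return ++fake_state;
}

/* Hypothetical stand-in for "any single neslib call". */
static uint8_t setup_calls = 0;

void neslib_call_stub(void) {
    ++setup_calls;
}

uint16_t two_loops(uint8_t count) {
    uint16_t acc = 0;

    neslib_call_stub();                 /* call before the first loop */
    for (uint8_t i = count; i != 0; --i) {
        acc += rand16_like();           /* first copy of the loop */
    }

#ifdef CALL_BEFORE_SECOND_LOOP
    neslib_call_stub();                 /* optional call before the second loop */
#endif
    for (uint8_t i = count; i != 0; --i) {
        acc += rand16_like();           /* second copy of the loop */
    }

    return acc;
}
```

Building it with and without `-DCALL_BEFORE_SECOND_LOOP` would be one way to compare the code generated for the second loop.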

