65816 support (llvm-mos issue, open)

johnwbyrd avatar johnwbyrd commented on July 2, 2024 11
65816 support

from llvm-mos.

Comments (32)

chorman0773 avatar chorman0773 commented on July 2, 2024 1

That is fair. We will get the gcc backend working and return then.

from llvm-mos.

rdrpenguin04 avatar rdrpenguin04 commented on July 2, 2024 1

Also, Connor mentioned to me that he will post his spec attempt on the 6502 forums for some feedback.

from llvm-mos.

Felice-Enellen avatar Felice-Enellen commented on July 2, 2024 1

Hey folks, former industry SNES gamedev here, and I just wanted to chime in on something said very early in this thread, which is that switching 8/16-bit register sizes is "a niche thing."

By now you may already know, but to be clear: it is NOT a niche thing.

It is often essential to good code, especially in terms of optimizing, partially because it effectively gives you a second "B" accumulator that you can switch to and from with the XBA (eXchange B and A) instruction. You absolutely will need good support for dynamic size switching in both code generation and optimization and should be making the heaviest use of it you can.
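To make the "second accumulator" point concrete, here is a toy model (a sketch, not taken from any real toolchain) of the 16-bit accumulator as a high byte B and a low byte A, with the exchange instruction XBA swapping the two. The class and method names are made up for illustration.

```python
class Accumulator:
    """Toy model of the 65816 accumulator: 16-bit C = B (high byte) : A (low byte)."""

    def __init__(self):
        self.c = 0  # full 16-bit value
        self.m = 1  # m flag: 1 = 8-bit accumulator, 0 = 16-bit accumulator

    def load(self, value):
        """LDA #value: in 8-bit mode only the low byte (A) changes; B is preserved."""
        if self.m:
            self.c = (self.c & 0xFF00) | (value & 0xFF)
        else:
            self.c = value & 0xFFFF

    def xba(self):
        """XBA: swap the A and B halves, regardless of the m flag."""
        self.c = ((self.c & 0xFF) << 8) | (self.c >> 8)

    @property
    def a(self):
        return self.c & 0xFF

    @property
    def b(self):
        return self.c >> 8


acc = Accumulator()
acc.load(0x12)  # A = $12
acc.xba()       # park $12 in B
acc.load(0x34)  # work with a second value in A
acc.xba()       # bring $12 back into A; $34 is now parked in B
print(hex(acc.a), hex(acc.b))
```

A code generator that understands this can treat B as a cheap spill slot for an 8-bit value instead of going through memory or the stack.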

I'd give examples, but I haven't worked on 65816 since the 90s and my memory is pretty swiss-cheese about it at this point, sorry. Good luck, it's a worthy project. :)

from llvm-mos.

asiekierka avatar asiekierka commented on July 2, 2024 1

We've had many discussions about this since, and I've done my own research - it is absolutely not niche for A, but it does seem to be rather niche for X/Y; which is good, as the latter are much, much harder to model in something generic like LLVM.

from llvm-mos.

Molive-0 avatar Molive-0 commented on July 2, 2024

Is there any easy way to implement handling changing the width of the registers from 8 to 16 bits and back? That seems like a niche feature.

from llvm-mos.

Selicre avatar Selicre commented on July 2, 2024

https://github.com/p4plus2/armadillo-spinach-muffin/ is a transpiler from i386 to 65816, which has a few tricks that might be of interest to whoever wants to work on this.

from llvm-mos.

johnwbyrd avatar johnwbyrd commented on July 2, 2024

Is there any easy way to implement handling changing the width of the registers from 8 to 16 bits and back? That seems like a niche feature.

Not a quick or obvious one. Codegen currently assumes that a, x, and y registers are 8 bits. One way to overcome that limitation is to model larger 16-bit registers, of which a, x and y are subregisters. This is possible, but not trivially so.

Re the transpiler: thanks for the link. LLVM's GlobalISel already has really strong opinions about how to optimize for arbitrary architectures. Strangely, performance optimization is not a limiting factor for the project at the moment -- generating consistently correct code is.

from llvm-mos.

johnwbyrd avatar johnwbyrd commented on July 2, 2024

Memorializing a few thoughts on a 65816 version.

The address space for the MOS backend would have to detect that it's targeting a 65816, and so the address space for all instructions would have to change from 16 to 32 bits. (LLVM does not support address sizes that are not a power of two.) This in itself is not a trivial change. As a matter of fact, it might be wise, first of all, to make sure that the 8-bit code generator would still continue to make the right choices, even if the address space for the target was 32 bits.

Once the current compiler survived all the smoke tests with the additional address space, it would then be time to add the additional 65816 instructions into the assembler. At this point you could verify the 65816 opcodes, and make sure that all the new instructions assemble and disassemble correctly. (I tried to write the assembler so that it could handle 65816 instructions when we get around to them.)

After that, you could take a look at codegen itself. Codegen would be the biggest issue. The entire legalizer would have a radically different set of lowering rules from the 6502. Many of the "lower everything to 8 bit" rules would be gone. And there'd be a lot of "if (65816)" rules added here and there in codegen. You'd assume that all addresses are 32 bits, and the assembler would render the low bytes of those 32-bit addresses into 24 or 16 bits as the case may be. See MOSDevices.td, and FeatureW65816, for the flag I intended to represent these processor features.
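As a rough illustration of "rendering the low bytes" of a 32-bit address, the sketch below truncates an address to a 1-, 2- or 3-byte little-endian operand. The function name `encode_operand` is hypothetical and this is not how the llvm-mos assembler is actually written; it only shows the arithmetic.

```python
def encode_operand(address, width):
    """Return the low `width` bytes of a 32-bit address, little-endian."""
    if width not in (1, 2, 3):
        raise ValueError("operand width must be 1, 2 or 3 bytes")
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address does not fit in 32 bits")
    mask = (1 << (8 * width)) - 1  # keep only the low `width` bytes
    return (address & mask).to_bytes(width, "little")


addr = 0x007E1234  # bank $7E, offset $1234, held in a 32-bit value
long_operand = encode_operand(addr, 3)      # for a long (24-bit) addressing mode
absolute_operand = encode_operand(addr, 2)  # for an absolute (16-bit) addressing mode
print(long_operand.hex(), absolute_operand.hex())
```

The top byte of the 32-bit value is never emitted; it exists only because the compiler's pointer width has to be a power of two.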

After codegen more or less works, then you would want to look at ways of intelligently segmenting 65816 code to make use of 16-bit addressing when possible, and ignoring the segment byte. GlobalISel can help some with this, but ultimately we may ask for some help from the user to make sure the right code goes into the right code segments -- e.g. a default linker script will be provided, but performance freaks would want to write their own.

In short, the 65816 is not a weekend project. But it could be done at some point, with a patient and organized effort. I don't think it's too wise to focus on it, until 6502 codegen is sane and relatively stable.

from llvm-mos.

chorman0773 avatar chorman0773 commented on July 2, 2024

For w65/65816, I'm working on my own toolchain (including eventually a C compiler, probably gcc). The support is being included in a fork of GNU binutils at https://github.com/chorman0773/binutils-gdb.
I think it may be a good idea, before support in either or both is too significantly locked in, to discuss the question of ABI, to promote compatibility between the toolchains.
Some details of note:

  • The target name used by config.sub for the 65816 processor is w65, with a default vendor of wdc. In patches submitted to [email protected], I have sought to add 65816, w65c816, and wdc65c816 as aliases to w65. In the same patch series I also sought to add m6502 and m65c02 for NMOS 6502 and CMOS 6502.
  • Support for the w65 elf target is using EM_WC65C816, 257.
  • I have a working document for an abi that can be used with the w65 processor, specified here: https://docs.google.com/document/d/1MYAY9Jn7BxH_FsQrGfZC_e5MjG9oCwgM2aYq4q9Nqrs/edit. Beyond the Section pertaining to Elf Files, much of this is open to comment and change. As Elf file support is already implemented in the previously linked fork of binutils, I do not recommend significant changes, beyond those relating to relaxation.

from llvm-mos.

johnwbyrd avatar johnwbyrd commented on July 2, 2024

Instead of running off and creating yet another ELF convention for 6502, please note the following. https://llvm-mos.org/wiki/ELF_specification

from llvm-mos.

chorman0773 avatar chorman0773 commented on July 2, 2024

Note that EM_65816 (257) is registered with generic-abi (see the conversation here: https://groups.google.com/g/generic-abi/c/qaPzp2lRzDA, and also note its inclusion in upstream GNU Binutils: https://github.com/bminor/binutils-gdb/blob/14a6b9b4b68c33f1182c8e6060dcb25514268af9/include/elf/common.h#L354). While changing the relocations to match those of this project may be doable, my preference is to use the registration that already exists, rather than a value that, as far as I can tell, is not officially defined in any way.

Note that the document linked is specifically for the w65 processor.

from llvm-mos.

rdrpenguin04 avatar rdrpenguin04 commented on July 2, 2024

It is also worth noting that @chorman0773 has been developing his ABI for the past few years, though not very publicly. Connor, do you have that link?

There were various considerations made in this ABI to solve some of the same problems you have been having, and it would be interesting to compare ideas to see if some of them can be reconciled.

from llvm-mos.

chorman0773 avatar chorman0773 commented on July 2, 2024

Connor, do you have that link?

If you are referring to the ABI Working Document, I linked it above.

from llvm-mos.

rdrpenguin04 avatar rdrpenguin04 commented on July 2, 2024

Oh right, I'm blind.

from llvm-mos.

chorman0773 avatar chorman0773 commented on July 2, 2024

Returning the discussion of 65816 vs. 6502 elf machine number from #86 (which primarily references the fact that EM_MCS6502 already existed and was registered with the maintainers of the ELF specification).
I would consider that 65816 and 6502 are different enough that they need different toolchain-level and programmer handling, similar to the EM_386 and EM_X86_64 distinction (albeit those also differ in another respect: 386 uses ELFCLASS32, like m6502 and w65, whereas x86_64 uses ELFCLASS64). w65 needs a different set of relocations and relaxations from m6502. For example, the ELF spec I wrote for it defines R_65C816_RELAX_JML, which can turn a jml instruction into a jmp or bra instruction. The m6502 wouldn't have such a link-time relaxation because it doesn't have the jml instruction. Likewise, a format for m6502 wouldn't have use for R_65C816_REL16, R_65C816_ABS24, or R_65C816_BANK. As a more substantial distinction, R_MOS_ABS8 and R_65C816_DIRECT have different behaviour: R_MOS_ABS8 overflows with a value > 255, whereas R_65C816_DIRECT only overflows for a value > 65535 (and then only uses the low-order byte).
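The difference between the two one-byte relocations can be sketched as follows. This is based only on the description in this comment, not on either project's linker source; the function names and the rejection of negative values are assumptions.

```python
class RelocationOverflow(Exception):
    """Raised when a symbol value does not fit the relocation's range."""


def apply_abs8(value):
    """8-bit absolute relocation as described for R_MOS_ABS8: must fit in one byte."""
    if not 0 <= value <= 0xFF:
        raise RelocationOverflow(f"value {value:#x} does not fit in 8 bits")
    return bytes([value])


def apply_direct(value):
    """As described for R_65C816_DIRECT: any 16-bit value is accepted, low byte stored."""
    if not 0 <= value <= 0xFFFF:
        raise RelocationOverflow(f"value {value:#x} does not fit in 16 bits")
    return bytes([value & 0xFF])


symbol = 0x1234  # a variable inside a direct page that starts at $1200
direct_byte = apply_direct(symbol)  # stores $34; the $12 part lives in the D register
try:
    apply_abs8(symbol)
    abs8_failed = False
except RelocationOverflow:
    abs8_failed = True
print(direct_byte.hex(), abs8_failed)
```

The same symbol value is a link error under one scheme and a valid one-byte operand under the other, which is the incompatibility being pointed out.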

from llvm-mos.

johnwbyrd avatar johnwbyrd commented on July 2, 2024

The document you referenced above, literally refers to itself as an "SNES Dev ABI." That alone should be disqualifying for its use as a generic standard across all 65xxx devices. LLVM-MOS is merely a compiler, and it should not, by definition, have opinions as to which 65xxx hardware implementation it should run on.

As far as I can tell, your core argument here is that certain kinds of relocations only have meaning on certain kinds of hardware. But it's up to the tools in all cases to figure out which features/bits should permit linkability between platforms.

Your R_65C816_RELAX_JML relocation is not a relocation. It's a hint for some types of assemblers to make a relaxation. LLVM handles instruction relaxation far earlier than the link step.

Do any C compilers or assemblers exist which follow your specification?

Have you reviewed our specification and relocation types yet? We've already put together a list of relocations that should cover all LLVM's use cases, across the 65xx and the 658xx series of processors and clones. See https://llvm-mos.org/wiki/ELF_specification for details.

from llvm-mos.

rdrpenguin04 avatar rdrpenguin04 commented on July 2, 2024

I will clarify a couple of things before chorman shows up to clarify the rest: the SNES-Dev ABI isn't meant for all 65xxx devices; it is meant for handling the 65816 processor on the SNES machine, similar to how x86_64-pc-linux-gnu is meant for handling the amd64 processor, on a PC-compatible machine, running Linux, using the GNU ABI for all of those things.

The compiler does need to target the hardware; otherwise, how do you know where RAM and ROM are? How do you know what math-related hardware registers exist (for prior art, see x87 vs MMX vs SSE and variants)? Moreover, for a hosted system, how do you know how to output data to a screen? All of these, to some extent or another, depend on the machine.

While a PSABI should only target the processor, and there should be other ABI supplements for machine, the two are fairly intertwined in the case of the SNES, as well as (presumably) the IIGS. This does not preclude breaking the SNES-Dev ABI into a PSABI and a machine-specific supplement, it's just not what he chose; the three of us can work together to do that if it makes sense.

We do have a binutils fork that follows the specification; besides that, official GNU binutils and the file command both recognize EM_65816. A GCC fork is also in progress.

Let us know if you have any other questions!

from llvm-mos.

chorman0773 avatar chorman0773 commented on July 2, 2024

Note that the above document is obsolete, and changed locations roughly a month ago. It is now in the SNES-Dev Project's github organization and repository, https://github.com/SNES-Dev/SNES-Dev/blob/main/docs/abi/v1.md.
It has also been redesignated as a general ABI for w65 in general (rather than a SNES-specific Machine Supplement) and in specific (rather than in general for all 65xx and 65xxx targets), though it remains a part of the SNES-Dev project as that is where it originated. This redesignation occurred as all existing SNES-specific portions had been previously removed from the ABI spec.

I have indeed reviewed the ELF specification, and find it lacks a few operations, including link-time relaxations, which are notably not specific to the proposed w65 psABI (RISC-V's psABI also defines operations that can be relaxed at link time when the assembler or compiler may not have the necessary knowledge). The other missing operations are R_WC65816_DIR, which is the same as (R_WC65816_ABS16)&0xff00, which is necessary for efficiently setting up the direct page for several adjacent accesses within the direct page, and R_WC65816_REL16, which is necessary to relocate the brl (branch long) instruction (which itself would be useful for position independent code).
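For readers unfamiliar with these two operations, here is a minimal sketch of the arithmetic involved. It follows the description in this comment plus the usual rule that a 65816 branch displacement is relative to the address of the next instruction; it assumes source and target are in the same bank, and the function names are made up.

```python
def direct_page_base(address):
    """Value described for R_WC65816_DIR: the 16-bit address with its low byte cleared."""
    return address & 0xFF00


def brl_displacement(instr_addr, target):
    """16-bit displacement for a 3-byte BRL at instr_addr, as little-endian bytes.

    Assumes instr_addr and target are 16-bit offsets within the same bank.
    """
    next_pc = (instr_addr + 3) & 0xFFFF      # BRL is opcode + 2 operand bytes
    disp = (target - next_pc) & 0xFFFF       # wraps within the bank
    return disp.to_bytes(2, "little")


dp = direct_page_base(0x12AB)                # $1200: load this into D once
forward = brl_displacement(0x8000, 0x9000)   # branch forward
backward = brl_displacement(0x8000, 0x7000)  # branch backward
print(hex(dp), forward.hex(), backward.hex())
```

Because the displacement depends only on the distance between the two addresses, code using brl keeps working when the whole section is moved, which is why it matters for position independent code.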

Link time relaxations are useful, especially for the w65 architecture where it can make sometimes meaningful differences in cycle counts and function sizes, notably allowing relaxations on undefined symbols.
Whether or not llvm may perform these operations is largely irrelevant as the relaxations are optional - a valid implementation is to ignore them entirely - though this may leave llvm inferior to toolchains that do use them.
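The kind of decision a jml relaxation has to make can be sketched like this. It is only an illustration of the idea, not the algorithm of any existing linker: the function name is hypothetical, bank-boundary wraparound for bra is ignored, and a real linker would have to iterate because shrinking one instruction moves everything after it.

```python
def relax_jump(instr_addr, target):
    """Pick the shortest jump from a 24-bit instr_addr to a 24-bit target.

    Returns (mnemonic, size_in_bytes).
    """
    same_bank = (instr_addr >> 16) == (target >> 16)
    if same_bank:
        # BRA is 2 bytes; its signed 8-bit displacement is relative to the next instruction.
        disp = (target & 0xFFFF) - ((instr_addr + 2) & 0xFFFF)
        if -128 <= disp <= 127:
            return ("bra", 2)
        return ("jmp", 3)  # 16-bit absolute jump within the current bank
    return ("jml", 4)      # full 24-bit jump


near = relax_jump(0x008000, 0x008010)
same_bank = relax_jump(0x008000, 0x009000)
far = relax_jump(0x008000, 0x018000)
print(near, same_bank, far)
```

For an undefined symbol the compiler cannot know which case applies, so without link-time relaxation it has to emit the 4-byte form everywhere.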

As was mentioned by @rdrpenguin04, I am willing to work with the llvm-mos project, to ensure that our projects may be compatible in operation with w65 and m6502 object files, though I also need to ensure usefulness of the SNES-Dev project and toolchain in producing reasonably useful SNES homebrew code, as that is the intention of the project.

from llvm-mos.

johnwbyrd avatar johnwbyrd commented on July 2, 2024

Link time relaxations are useful, especially for the w65 architecture where it can make sometimes meaningful differences in cycle counts and function sizes, notably allowing relaxations on undefined symbols.

They may well be so, but LLVM on the whole does not yet understand them. They are a uniquely compiler-specific gcc phenomenon as of this writing. LLVM lacks the infrastructure to regenerate machine code in a (non-LTCG) link step, or to change the locations of machine code once it has been placed relative to the beginning of a section. Therefore, no LLVM implementation can support these concepts, at least until the core of LLVM supports it.

65816 support is purely theoretical at this time for llvm-mos. So is SWEET 16. 65816 codegen is not a weekend project, and I want to leave as much flexibility as possible to Daniel and others to do codegen for 65816 as they see fit in the future, if ever. Our focus as of this writing, is getting 8-bit codegen clean, and leaving the door open to support other variants in the future.

Please help us on improving some of the more immediate 8-bit problems on this project, which will (in the fullness of time) improve any 24-bit support that we contemplate in the future.

from llvm-mos.

rdrpenguin04 avatar rdrpenguin04 commented on July 2, 2024

I believe chorman and I are trying to get something more-or-less nailed down so that, when we both do support codegen for 65c816, we don't have two mutually-incompatible ABIs. If you are not planning to work on this for a while, could you at least look at our ELF specification and give feedback on it, and then consider using it yourself later? That's our whole purpose of joining this discussion.

We may work on 8-bit codegen at some point, but that is not where our project is oriented at this time; as such, when/if we do get around to it, we would be trying to follow your existing ABI.

Let us know if you have any feedback; we are more than happy to help with ensuring future compatibility!

from llvm-mos.

rdrpenguin04 avatar rdrpenguin04 commented on July 2, 2024

"We both" is referring to your group and the SNES-Dev group, and "we" without "both" is referring to just SNES-Dev; I just realized that wasn't very clear.

from llvm-mos.

johnwbyrd avatar johnwbyrd commented on July 2, 2024

I believe chorman and I are trying to get something more-or-less nailed down so that, when we both do support codegen for 65c816, we don't have two mutually-incompatible ABIs. If you are not planning to work on this for a while, could you at least look at our ELF specification and give feedback on it, and then consider using it yourself later? That's our whole purpose of joining this discussion.

Do you have a functioning C compiler/assembler/linker, with source code, that we can look at?

from llvm-mos.

rdrpenguin04 avatar rdrpenguin04 commented on July 2, 2024

The repos are at the https://github.com/SNES-Dev organization; the mostly-functioning assembler is at https://github.com/SNES-Dev/binutils-gdb

from llvm-mos.

mysterymath avatar mysterymath commented on July 2, 2024

We may work on 8-bit codegen at some point, but that is not where our project is oriented at this time; as such, when/if we do get around to it, we would be trying to follow your existing ABI.

There may be a conflict in development methodologies between the two projects. I can only directly speak to the C side of this, but we don't provide any ABI stability or backwards-compatibility guarantees, and we're unlikely to for the considerable future. Our ABI isn't compatible with our ABI from 3 months ago, and it's reasonably likely that it won't be compatible with our ABI in 3 months' time.

Stabilizing an interface without at least one high-quality implementation is dangerous, since the interface might be constructed in such a way that non-obviously precludes high-quality implementations. History is littered with examples of this. Similarly, stabilizing an interface with only one high-quality implementation is less dangerous, but it usually guarantees that there's only ever going to be one high-quality implementation of that interface.

The 65816 ABI we should use for llvm-mos should be one that guarantees that llvm-mos can be a high-quality compiler for that platform. I'll consider that a given ABI might satisfy this only after someone shows me a high-quality compiler for it.

from llvm-mos.

rdrpenguin04 avatar rdrpenguin04 commented on July 2, 2024

That is definitely a fair argument, and that's why I'm suggesting we work together to make an ABI that works well for both of us. Connor's ABI is a great starting point, but it could probably use some growth and practical implementation.

For one, any ABI for any 65xx platform is going to be limited by trying to be compatible with itself; many cycles are wasted saving registers by calling convention and by forcing long jumps. Also, the SNES-Dev w65 ABI currently makes pointers 32-bit for the sake of alignment, and we're not sure if there's a way to make it 24-bit (or even, ideally, 16-bit and 24-bit) without causing some additional problem.

We'd love to hear input on the problem, and we want to collaborate; we don't want your implementation to lock into something bad any more than we do, and we don't want to lock out of each other's ABI either.

Connor also wanted to note that the ABI is not stabilized, and the intent is in fact to finish a compiler (likely GCC) before stabilizing the non-ELF-related sections.

The only other thing I have to say is that "Instead of running off and creating yet another ELF convention for 6502" sounds very hostile, even if it wasn't intended as such. We are not trying to stomp on your project, we are just trying to help make the best, compatible ABI we can.

Let us know if any of you have any questions!

from llvm-mos.

johnwbyrd avatar johnwbyrd commented on July 2, 2024

We're not against working together. Our issue is that, unless I am missing something, you do not seem to be ready to work with us yet.

Your order of operations to date, seems to have been:

  1. Allocate some constants with Xinuos;
  2. Add constants to binutils;
  3. Write an "official" technical specification.

We believe strongly in a "code first" development strategy, and that strategy has paid off well for us.

Our recommended order of operations is:

  1. Get an assembler working;
  2. Get ELF support sort of working;
  3. Publish a working draft of a standard in the 6502 forums;
  4. Get feedback on it;
  5. Get the C compiler working (this should take you a year minimum);
  6. Optimize the C compiler, without fixing the ABI;
  7. Get an SDK working and stable;
  8. Realize that a need exists to unify ELF support between LLVM-MOS and some other compiler (this is a big one -- literally no one is asking for this feature)
  9. Figure out ways to unify constants while disturbing the smallest number of people;
  10. Freeze the ABI (and we're not confident that we will ever do this);
  11. Allocate some constants with Xinuos;
  12. Unify the constants across compiler projects;
  13. Write an official technical specification.

Please understand, we're not at all saying that you cannot be part of this project. We welcome your reasoned code improvements and other contributions. But it's absolutely the wrong time to begin unifying the toolchains' output formats.

The road to porting a compiler to the 65xx series, is littered with good intentions. Do not underestimate how fantastically hard it is, to get a functional tool suite together.

We should revisit this issue once you have gotten your C compiler passing the gcc torture test suite.

from llvm-mos.

rdrpenguin04 avatar rdrpenguin04 commented on July 2, 2024

To be fair, we do have ELF support working, as well as a mostly working assembler in binutils. We choose to work to fit a specification so that we don't end up rewriting our code every time there's a significant problem; we can define first and find the problems there. Getting GCC working shouldn't be that difficult, it just hasn't been a priority for us; we've been using the assembler for a homebrew project, so the priority has been getting features into the assembler.

The constants were established for the specific purpose of beginning unification early; if the two of us design contrary systems, it will be hard to unify in the future. We're mainly hoping to solidify the ELF specification now, and we can revisit the C ABI later once one of us is closer to done with that.

Let us know if you have any other questions for us!

from llvm-mos.

mysterymath avatar mysterymath commented on July 2, 2024

Getting GCC working shouldn't be that difficult

This is the first time I've heard someone describe working with GCC this way, and given the project's history, I hope you can understand skepticism on our side about this. We're coming out of the tail end of making a minimum-viable C backend for LLVM, and despite the architectural improvements that LLVM offers over GCC, it was still considerably more work than anticipated.

if the two of us design contrary systems, it will be hard to unify in the future.

As John mentioned, this isn't really the order that things are done in industry. (EDIT: There are, of course, counterexamples, many of which are... unfortunate.) The successful standardization and unification efforts (USB, HTML5, even ANSI C itself!) were almost all based off of decades of joint experience in relatively well-understood problem spaces. Here, everything is far more experimental. AFAIK, there aren't really any high-quality C compilers for the 65816 or 6502, so much of what we do here is to some degree original research, or at the very least original development.

Also, if you build things right, it's not as difficult to make major interface changes as you might think. Fred Brooks gave the classic advice on this: "Build one to throw away; you will anyway." Pretty much our whole 6502 code generator is designed to be thrown away; each part is kept as independent as possible from every other part, and we are continuously replacing components wholesale as our understanding of the problem space evolves. We could replace our whole calling convention in a week or two (I just did), and I'm keeping close tabs on development velocity as the project grows.

Speaking only for myself, I'm not sure how I could offer effective criticism of any proposed ABI, either. For example, what are the performance implications of passing booleans via flags vs via 8-bit or 16-bit registers or zero/direct page memory locations? When ABI designers want to answer a question like that, they might code up both versions in a well-optimized compiler fork and run a full suite of benchmarks against both options. That's what I'd want to do too, but there's no well-tuned compiler in existence that I could modify to do this. As it is, I'd either be guessing or making armchair philosophical arguments, neither of which is scientific enough to call engineering.

from llvm-mos.

asiekierka avatar asiekierka commented on July 2, 2024

I'd argue it may be good to split this issue - unapproachable as it is - into multiple separate issues, based on John's rough outline:

  • "Add the additional 65816 instructions into the assembler" - this is, as far as I know, almost completed with my recent batch of PRs; the only remaining thing being writing an assembler test (and fixing omissions, if any) with regards to the handling of opcodes with 16-bit immediates. That's done too.
  • "Support the extended 65816 address space"
  • "Support emitting 65816 code in native 8-bit mode" - as far as I understand, once the address space issue is taken care of, this would mostly involve patching llvm-mos-sdk's crt0 to initialize the stack pointer to $01FF rather than $FF; the backend need not support 16-bit registers for one to be able to write inline assembly code to use them briefly. This would allow using llvm-mos as a 65816 C compiler in a sense, though not a very efficient one for sure! (However, this would still allow building support for 65816 targets in the meantime, as they could unlock later code generation benefits with time.)
  • "Support use of 16-bit accumulator and index register modes" - this is still a pretty big issue to tackle, of course.
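For context on what "native 8-bit mode" and "using 16-bit registers briefly" mean at the hardware level, here is a small model of the two width flags in the processor status register and the REP/SEP instructions that clear and set them. It is a sketch for illustration only; the class and method names are not from llvm-mos.

```python
M_FLAG = 0x20  # accumulator/memory width flag: set = 8-bit, clear = 16-bit
X_FLAG = 0x10  # index register width flag: set = 8-bit, clear = 16-bit


class Status:
    """Toy model of the width bits in the 65816 status register (native mode)."""

    def __init__(self):
        self.p = M_FLAG | X_FLAG  # native mode with 8-bit A, X and Y

    def rep(self, mask):
        """REP #mask: clear the selected status bits."""
        self.p &= ~mask & 0xFF

    def sep(self, mask):
        """SEP #mask: set the selected status bits."""
        self.p |= mask & 0xFF

    def accumulator_bits(self):
        return 8 if self.p & M_FLAG else 16

    def index_bits(self):
        return 8 if self.p & X_FLAG else 16


st = Status()
widths = [(st.accumulator_bits(), st.index_bits())]  # what compiled C code assumes
st.rep(M_FLAG)  # inline assembly: REP #$20, switch A to 16 bits
widths.append((st.accumulator_bits(), st.index_bits()))
st.sep(M_FLAG)  # inline assembly: SEP #$20, restore 8-bit A before returning to C
widths.append((st.accumulator_bits(), st.index_bits()))
print(widths)
```

As long as every inline assembly block restores the flags it changed, the compiler can keep assuming 8-bit registers everywhere, which is what makes the third bullet so much cheaper than the fourth.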

from llvm-mos.

mysterymath avatar mysterymath commented on July 2, 2024

This SGTM; feel free to file issues for the uncompleted portion of this; this can be kept as a tracking bug for the work overall.

from llvm-mos.

asiekierka avatar asiekierka commented on July 2, 2024

Done.

from llvm-mos.

johnwbyrd avatar johnwbyrd commented on July 2, 2024

@asiekierka writes on Discord:

Either way, the 65C816 is hard but it's not, IMO, a fundamental rethinking of code generation, at least for a first pass: the most invasive changes are roughly divisible into three categories
(a) Adding support for the 24-bit address space; tracking data accesses and distinguishing between "near" and "far" jumps.
(b) Adding support for 16-bit registers; following in the footsteps of almost every other C compiler and most real-world 65C816 development, that means X/Y become permanently 16-bit, while A is switched at runtime.
(c) Adding support for the 65C816's hard stack; it has a 16-bit pointer and addressing modes that allow reading stack-offsetted variables, making this much more viable than on the 6502.
The rest is fairly minor, and some of it has already been done (like supporting the TXY/TYX opcodes).
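Point (c) can be illustrated with a tiny simulation of stack-relative addressing (the `d,s` mode), where the effective address is the stack pointer plus a small offset in bank 0. This is a hand-written sketch assuming a 16-bit accumulator; the helper names are hypothetical.

```python
def push16(memory, s, value):
    """Push a 16-bit value (high byte first); return the new stack pointer."""
    memory[s] = (value >> 8) & 0xFF
    s = (s - 1) & 0xFFFF
    memory[s] = value & 0xFF
    s = (s - 1) & 0xFFFF
    return s


def stack_relative_load(memory, s, offset):
    """LDA offset,s with a 16-bit accumulator: read the word at S + offset."""
    addr = (s + offset) & 0xFFFF
    lo = memory[addr]
    hi = memory[(addr + 1) & 0xFFFF]
    return lo | (hi << 8)


memory = bytearray(0x10000)  # bank 0 only
s = 0x01FF                   # stack pointer after startup
s = push16(memory, s, 0x1234)  # first local variable
s = push16(memory, s, 0xBEEF)  # second local variable
top = stack_relative_load(memory, s, 1)    # most recently pushed word
below = stack_relative_load(memory, s, 3)  # the word pushed before it
print(hex(top), hex(below))
```

On the 6502 the same access needs TSX plus an indexed load from page 1 and only works a byte at a time, which is why llvm-mos uses a soft stack there.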

One minor thing that comes to mind is MVN/MVP-based inline block copies - there's already precedent with the HuC6280's specialized block copy opcodes - the 65C816 requires a very different implementation, but the hooks and domain knowledge are already there.
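For reference, the behaviour an MVN-based memcpy lowering would have to set up can be simulated as below. This is a sketch assuming 16-bit index registers; the function name and the flat `bytearray` memory are illustrative choices, not anything from llvm-mos.

```python
def mvn(memory, src_bank, dst_bank, c, x, y):
    """Simulate MVN: copy C+1 bytes from src_bank:X to dst_bank:Y, ascending.

    Returns the final (C, X, Y) register values.
    """
    while True:
        memory[(dst_bank << 16) | y] = memory[(src_bank << 16) | x]
        x = (x + 1) & 0xFFFF  # index registers wrap within their bank
        y = (y + 1) & 0xFFFF
        c = (c - 1) & 0xFFFF
        if c == 0xFFFF:       # the instruction repeats until C underflows
            break
    return c, x, y


memory = bytearray(0x20000)           # banks $00 and $01 only
memory[0x001000:0x001004] = b"SNES"   # source data at $00:1000
regs = mvn(memory, 0x00, 0x01, c=3, x=0x1000, y=0x2000)  # C holds length - 1
copied = bytes(memory[0x012000:0x012004])
print(copied, [hex(r) for r in regs])
```

Note that the length register holds the byte count minus one and that the banks are instruction operands rather than registers, so a compiler can only use this inline when both banks are known at compile time.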

from llvm-mos.
