"cc65 compatibility" about llvm-mos HOT 8 CLOSED

llvm-mos avatar llvm-mos commented on July 20, 2024
"cc65 compatibility"

from llvm-mos.

Comments (8)

oliverschmidt avatar oliverschmidt commented on July 20, 2024 3

I'm not sure if my comments actually fit here - but I was explicitly encouraged to participate here...

From my POV the question is (way) broader: it's how llvm-mos and cc65 are positioned. It's what "we" advise someone to use for which scenario / use case. Maybe we come up with answers that don't ask for compatibility at all.

I imagine it might help if I briefly describe how I believe people are using cc65 and what interests them:

First, there's a surprisingly large number of cc65 users not interested in the C compiler at all. They are only using the ca65 assembler and the ld65 linker. ca65 is a full-fledged, feature-rich macro assembler (the code generated by the cc65 C compiler doesn't make any use of that). However, I tend to believe that ca65 users favor it over other 65xx assemblers because of the separate ld65 linker. Using so-called linker config files the user can make ld65 do a lot of interesting things. In general it's about being able to define in a pretty flexible way which code/data segments are placed where in one or multiple output files and - totally independent from that placement - what absolute addresses should be assumed for those code/data segments at runtime. This is obligatory for e.g. generating ROMs for whatever use case.

Second, there are of course the users writing C code. I'd separate them in three subgroups:

The first (I think by far smallest) subgroup is "only" interested in writing (parts of) their business logic in C rather than 65xx assembly. Besides the so-called runtime library they don't make use of any library code coming with cc65. The runtime library contains functions not called explicitly by the user's C code but called implicitly by the code generated by the C compiler. The C compiler is very closely tied to the runtime library.

The second (I think largest) subgroup wants to write C programs for "their" target machine (like e.g. the C64). Often they don't know (anymore) how to do things directly on that machine. They usually rely on the C library (and other libraries coming with cc65) to do the right thing when they use console I/O or disk I/O. Please note that I personally have no insight into the extent to which people have created their own cc65-based libraries to be used by others, not even talking about how popular those libraries actually are!

The third (I think rather small) subgroup wants to write C programs for multiple target machines (like e.g. the Apple II and the ATARI). They additionally rely on the libraries that come with cc65 to behave as similarly as possible on as many target machines as possible.

While for the assembly programmers mostly the flexibility of ld65 is interesting, for the C programmers some rather out-of-the-box features are interesting too. I personally see here primarily support for "structured" output files like (CONVERTed) GEOS VLIR files and ATARI XEX files. BTW: The o65 files mentioned above only play a rather marginal role for cc65 from my POV.

Okay, so the first question for me is to ask: Which of those cc65 users should rather use llvm-mos? I might be totally wrong but I guess you're not after the assembly programmers using only ca65 and ld65.

Looking at the C programmers I presume that the first group not looking for libraries coming with the C compiler may be convinced quickly to use llvm-mos if it outperforms cc65 regarding the generated code speed and/or size. I presume they only need an at least somewhat capable (inline-)assembler to create the necessary glue code to have the C code "do something".

I guess that, from the llvm-mos perspective, the problematic/challenging users are the C programmers relying on mature libraries for one or even multiple targets.

Another perspective on the topic is that I personally believe that most code using cc65 is written at some point in time as a (most of the time) fun project. Those projects are at some point considered final and then left alone. I don't see many cc65-based projects being worked on over a longer time. So I personally don't see a big win in the ability to migrate existing projects from cc65 to llvm-mos.

The result from all I wrote so far is that I don't think that "cc65 compatibility" in itself is a desirable feature. But maybe "cc65 compatibility" is desirable as a means to reach other desirable goals.

As I already wrote on Usenet, I think it is desirable to not re-create all library FUNCTIONALITY for all targets coming with cc65 from scratch once more. So from my POV the primary/only question is how to leverage the existing cc65 library functionality.

So after this (too) long preamble...

The cc65 libraries are nearly completely written in hand-crafted, manually heavily optimized 65xx assembly. For cc65 that was the obvious way to go as it was clear that the C compiler always creates inferior 65xx code. But is that still (and supposed to stay) true for llvm-mos? Maybe it's desirable/necessary to have as much library code as possible written in C in order to allow llvm-mos to optimize the overall code. In that case the only thing I personally see is to look VERY closely at the cc65 library code and (try to) recreate it in C.

If it should however be desirable to have some/the library code written in 65xx assembly, then it might make sense to evaluate whether llvm-mos should generate code that's compatible enough with the cc65 ABI to allow calling into unchanged cc65 libraries. In that case one would have to choose between processing the library source code with something different than ca65 and converting the ca65 output. Please note that the library source code tends to not make heavy use of ca65 features.

johnwbyrd avatar johnwbyrd commented on July 20, 2024 1

4. o65 to ELF object file converter. We could just skip the whole business, and provide a command line tool to permit cc65 object files to be converted into an ELF compatible format. The resultant code would not be ABI compatible with llvm-mos, but it would work regardless of how you generated those object files. Again though, a good bit of pain for not a great deal of long term reward.

http://www.6502.org/users/andre/o65/fileformat.html

mysterymath avatar mysterymath commented on July 20, 2024

Of these, (2) seems like the one that would provide the most utility: the ability to mix existing cc65 libraries, ASM code, and linker scripts with llvm-mos generated code, without trying to port code written using compiler-specific extensions/linker scripts.

At the end of the day, either ld65 or lld will be making the binary (or binary-shaped ELF file), and the nature of "cc65 compatibility" would depend pretty strongly on which linker they're thinking to use.

If we want ld65 to be able to link code generated with clang/LLVM together with cc65 or ca65 code, then we'd need to be able to produce an object file format that ld65 is capable of ingesting, and we'd need to be able to have our code play nice with the cc65 ABI. Ideally, we'd have a sort of "cloister"; we'd reserve a zero page region for use with the new compiler, and any internal functions would use a better, faster calling convention, be LTOed together, etc. We'd probably want an in-C calling convention annotation to say "make this function here or this call here use the cc65 ABI for its arguments".

For the integration the other way around, we'd need lld to be able to read cc65's object file format and incorporate it into the link. In that case, the cc65 code would be the cloister: we'd reserve temporaries and so forth for it, probably sharing the same stack pointer though. All of the cc65 code would use the same cc65 calling convention; and we'd annotate any LLVM calls to that code with a "cc65 ABI" annotation, causing them to set things up on cc65's soft stack. Similarly, any LLVM functions that need to be called from within the cloister would use the cc65 ABI to get their arguments.

Supporting the cc65 calling convention as a C annotation ([clang::mos_cc65] void foo().... or something) is the common piece between the two, but it's doubtful how useful it would be on its own without linker support. It seems like the LLVM cloister in ld65 would be more useful for folks with existing cc65 projects, so I'd start with that one.
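
To make the idea of a "cc65 ABI" call concrete, here is a minimal host-side sketch of what such an annotated call would conceptually have to do: push the leading arguments onto a software parameter stack and hand the last one over directly (standing in for the A/X registers). It is a simplified model, not cc65's or llvm-mos's actual implementation; all names (soft_stack, push_u16, legacy_combine, call_legacy_combine) are hypothetical, and the exact stack layout is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical model of a software parameter stack (grows downward). */
#define SOFT_STACK_SIZE 64u

uint8_t soft_stack[SOFT_STACK_SIZE];
uint8_t soft_sp = SOFT_STACK_SIZE;      /* index of the current top of stack */

/* Push a 16-bit value, low byte at the lower address. */
void push_u16(uint16_t v)
{
    soft_sp -= 2;
    soft_stack[soft_sp]     = (uint8_t)(v & 0xFF);
    soft_stack[soft_sp + 1] = (uint8_t)(v >> 8);
}

/* Read a 16-bit value at a byte offset from the top of stack. */
uint16_t peek_u16(uint8_t offset)
{
    return (uint16_t)(soft_stack[soft_sp + offset] |
                      (soft_stack[soft_sp + offset + 1] << 8));
}

/* Stand-in for a routine inside the "cloister": it takes its last
 * argument directly (as if in registers), reads the others from the
 * soft stack, and removes them before returning. */
uint16_t legacy_combine(uint16_t last_arg)
{
    uint16_t b = peek_u16(0);           /* pushed last, so on top */
    uint16_t a = peek_u16(2);           /* pushed first, below b  */
    soft_sp += 4;                       /* callee drops its arguments */
    return (uint16_t)(a * 100 + b * 10 + last_arg);
}

/* What a call site marked with the proposed annotation would
 * conceptually expand to: arguments pushed left to right, the last
 * one passed directly. */
uint16_t call_legacy_combine(uint16_t a, uint16_t b, uint16_t c)
{
    push_u16(a);
    push_u16(b);
    return legacy_combine(c);
}
```

Calling call_legacy_combine(1, 2, 3) in this model yields 123 and leaves soft_sp where it started, which is the property either linker direction would have to preserve across the cloister boundary.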

It looks like cc65 is willing to both import and export the o65 object format, which is reasonably well documented, so that might be a natural way to go. If we could convert that to/from llvm-mos ELF, then the compatibility would work both ways. The o65 is also a pretty good candidate for a file format that allows practical dynamic linking and relocating to be done on real 6502 hardware, so we may want to support it independently of this feature.

I don't think it's an enormous amount of work necessarily, but I doubt it would ever be particularly easy to use. The only way around having to modify the cc65 linker scripts to build the cloister would be to make the cloisters as minimal as possible (i.e., basically just the stack pointer and the zero page locations already reserved by cc65), which as you mentioned would produce fairly bad performance. The cc65 in LLVM direction would have bad performance just because cc65 sux. The other way around might not actually be all that bad, if we can take advantage of all of the temporary locations that cc65 reserves for its own use. We'd really want to do an on-paper prototype here to see exactly what an integration would entail.

oliverschmidt avatar oliverschmidt commented on July 20, 2024

4. o65 to ELF object file converter. We could just skip the whole business, and provide a command line tool to permit cc65 object files to be converted into an ELF compatible format. The resultant code would not be ABI compatible with llvm-mos, but it would work regardless of how you generated those object files. Again though, a good bit of pain for not a great deal of long term reward.

http://www.6502.org/users/andre/o65/fileformat.html

This seems to be a misunderstanding. The ld65 linker is able to produce an o65 file as OUTPUT file. BTW: ld65 only supports a subset of what o65 files can contain/express. The object files created by ca65 and consumed by ld65 as INPUT files have nothing to do whatsoever with o65 files.
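
For orientation, an o65 file as described in the specification linked above is identified by a short fixed header. Below is a minimal sketch of a check for it, assuming the header layout from that specification (two marker bytes $01 $00, the ASCII magic "o65", then a version byte of 0). The function name looks_like_o65 is made up for illustration, and the check says nothing about which parts of the format ld65 actually emits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the buffer starts with what looks like an o65 header,
 * 0 otherwise. Only the fixed leading bytes are examined. */
int looks_like_o65(const uint8_t *buf, size_t len)
{
    /* Non-C64 marker ($01 $00) followed by the magic "o65". */
    static const uint8_t signature[5] = { 0x01, 0x00, 0x6F, 0x36, 0x35 };

    if (len < 6)
        return 0;
    if (memcmp(buf, signature, sizeof signature) != 0)
        return 0;
    return buf[5] == 0x00;              /* version byte */
}
```

A ca65 object file would fail this check, which matches the point above: the linker's input format and the o65 output format are unrelated.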

johnwbyrd avatar johnwbyrd commented on July 20, 2024

Thank you for commenting Oliver, your comments are always welcome here and I'm very glad that you're contributing your thoughts and opinions.

I can speak to the assembler portion of things, a bit. llvm-mos has support for assembler variants, of which ca65 might theoretically become one. I recall that the cc65 libraries use the assembler macro functionality in the source code, in a few places, but the library code doesn't exercise all of ca65's features. So it seems possible to convince llvm-mos to compile ca65 assembly files to ELF, in the cc65 calling convention, with llvm-mos style relocations. Then it'd be up to Daniel et al to convince llvm-mos to follow the cc65 calling conventions.

Not a trivial project I think, but possible.

oliverschmidt avatar oliverschmidt commented on July 20, 2024

Yep, the question likely isn't if it's feasible but if it's desirable.

Maybe it makes sense as a short/mid term solution to have sort of a jump-start. Being able to "do things" with llvm-mos might create the momentum necessary to grow the contributor community. And that community might then bring up individuals creating "native" llvm-mos libraries. This is supposed to be possible in a continuous step-by-step approach.

mysterymath avatar mysterymath commented on July 20, 2024

Oliver, thanks for the detailed and well-reasoned look at this issue.

As you've mentioned, I'd agree we can treat the assembly language and C portions of this somewhat separately. Regarding the C portions:

The first (I think by far smallest) subgroup is "only" interested in writing (parts of) their business logic in C rather than 65xx assembly.

While I'd agree that currently this group is relatively small, one of our major goals is to increase its size. My present soft goal is to generate code that is no more than twice as slow as expertly hand-optimized assembly; hopefully this is around the sweet spot where you can start trusting a compiler to do a relatively good job writing most of an application outside its very innermost loops. As this is a hobby platform, a sizeable group of folks just really enjoy writing 6502 assembly, so there are limits to how big this group can grow. I myself am mainly in this group though, which has definitely influenced my thoughts about code generation.

The cc65 libraries are nearly completely written in hand-crafted, manually heavily optimized 65xx assembly. For cc65 that was the obvious way to go as it was clear that the C compiler always creates inferior 65xx code. But is that still (and supposed to stay) true for llvm-mos? Maybe it's desirable/necessary to have as much library code as possible written in C in order to allow llvm-mos to optimize the overall code. In that case the only thing I personally see is to look VERY closely at the cc65 library code and (try to) recreate it in C.

Yep, for the minimum libcalls and C library functions we needed to get LLVM's test suite working, absolutely everything is written in C, well past the point of absurdity. Not because I think the compiler will ever be able to reliably match or beat a human at these innermost routines, but rather because once you write them in assembly, you lose the pressure that they place on the code generator.

Eventually, our libcalls should probably be written in assembly to maximize the performance of the C code that uses them, but the advantages will steadily decrease as the code generator improves. Accordingly, given the choice of adding optimization or rewriting a libcall, I'd always take the former, since it generalizes to other similar code; this policy would result in libcalls being rewritten once all available optimization opportunities are exhausted.

We'd probably never want to rewrite things like printf; for sufficiently large routines, better algorithms provide more bang for your buck than micro-optimization, and one of the real advantages of C is that it's easier to quickly write and compare approaches. For example, we really should be using double-dabble in our printf to convert from integers to strings, and it's really easy to write a C double dabble to replace the div/mod by 10 currently in our printf. This adds up; there's only so many hours in the day, and we could write a better C printf in the same time it'd take to write a worse assembly one.
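
As an illustration of how small a C double dabble can be, here is a minimal sketch for 16-bit unsigned values. It is not the printf code from llvm-mos; the function names double_dabble_u16 and u16_to_str are made up for this example. The conversion uses only shifts, compares and small additions, with no division or modulo.

```c
#include <stdint.h>

/* Convert a 16-bit value to five decimal digits using double dabble
 * (shift-and-add-3). digits[0] is the most significant digit. */
void double_dabble_u16(uint16_t value, uint8_t digits[5])
{
    for (int i = 0; i < 5; i++)
        digits[i] = 0;

    for (int bit = 0; bit < 16; bit++) {
        /* Dabble: any BCD digit >= 5 gets 3 added before the shift. */
        for (int i = 0; i < 5; i++)
            if (digits[i] >= 5)
                digits[i] += 3;

        /* Double: shift the BCD digits left by one bit, pulling in the
         * next most significant bit of the binary value. */
        uint8_t carry = (value & 0x8000u) ? 1 : 0;
        value = (uint16_t)(value << 1);
        for (int i = 4; i >= 0; i--) {
            uint8_t shifted = (uint8_t)((digits[i] << 1) | carry);
            carry = (shifted >> 4) & 1;
            digits[i] = shifted & 0x0F;
        }
    }
}

/* Write the decimal representation of value into out (at least 6 bytes),
 * without leading zeros. */
void u16_to_str(uint16_t value, char out[6])
{
    uint8_t digits[5];
    int first = 0;
    int n = 0;

    double_dabble_u16(value, digits);
    while (first < 4 && digits[first] == 0)
        first++;                        /* skip leading zeros, keep last digit */
    for (int i = first; i < 5; i++)
        out[n++] = (char)('0' + digits[i]);
    out[n] = '\0';
}
```

Whether this actually beats repeated div/mod by 10 on a 6502 depends on the code the compiler generates for the inner loops, so it would need measuring rather than assuming.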

So, overall, I'd agree that the benefits we'd get from ABI compatibility with cc65 are suspect, especially relative to the work involved. The folks most likely to benefit from better code generation are the most likely ones to be directly accessing hardware registers; the onus would then be on us to develop/port the OS calls and hardware registers that performance-conscious (and usually high-knowledge) developers would need.

Low-knowledge and performance-insensitive developers should likely continue using cc65, since they'd benefit greatly from the scope of its libraries. No matter how good we make "cc65 compatibility," I have doubts that it could ever be easy to mix llvm-mos and cc65 code together, and unless it's really really easy, we wouldn't have much to offer that group. Even if it were, it'd ring something like a BASIC accelerator, which is usually a nice-to-have, compared to the folks who would consider llvm-mos's code gen abilities a barrier to entry for the type of projects they might like to write with it.

Accordingly, I think it's reasonable to close this issue, at least until we get new information about something specific someone's trying to do with cc65/llvm-mos that bucks this model of our userbase.

EDIT: By the end of this post, I realized I had forgotten the CA65 compatibility angle, which I'm not really inclined to speak on; I just haven't written enough 6502 assembly to have good opinions. The proposal above is to close the C ABI portion of this, and if we want to keep asm compatibility open, to create a separate issue for that.

@johnwbyrd, WDYT?

mysterymath avatar mysterymath commented on July 20, 2024

Closing as won'tfix; issues created along these lines should be more specific.
