RFC: x64 Implementation Proposal (mal issue, closed)

kanaka commented on September 26, 2024
RFC: x64 Implementation Proposal

from mal.

Comments (7)

dubek commented on September 26, 2024

That's one big project... Good luck!

This mal implementation could be written using the 10 steps and could pass the test suite when fully implemented.

If I understand correctly, you suggest building a mal compiler, that is, a program that takes mal source and outputs ELF executable binaries. How does this work as, for example, step0, which should run an endless loop echoing the user's input? Of course, if you have a fully functioning compiler, you can compile the mal implementation of mal (mal/step0_repl.mal), run that, and pass all the tests. malc can do that (luckily the mal-in-mal doesn't use eval, which malc doesn't yet support).
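For reference, the step0 behavior in question is small enough to sketch in a few lines. The following is a minimal Python illustration, not taken from any mal implementation; the `repl` helper and its `write` parameter are hypothetical names chosen here. READ, EVAL and PRINT all act as the identity, so every input line is echoed back:

```python
def READ(text):
    # step0 does no parsing: the "AST" is just the input string
    return text


def EVAL(ast):
    # step0 does no evaluation either
    return ast


def PRINT(value):
    return value


def rep(line):
    return PRINT(EVAL(READ(line)))


def repl(lines, write=print):
    """Echo every line from `lines` until the input is exhausted."""
    for line in lines:
        write(rep(line.rstrip("\n")))
```

After `import sys`, calling `repl(sys.stdin)` would echo lines until end of input. An ahead-of-time compiler has to produce a binary that contains this loop, rather than run the loop itself.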

@kanaka and I discussed in the past how to add malc to the test suite, perhaps as yet another variant for the mal-in-mal code (see the common-lisp dir for an example of how to run/compile the same source code with different compilers). In that sense malc is in a separate repo (much like gcc and rustc are in separate repos).

dubek commented on September 26, 2024

BTW: @pstephens , you're welcome to join #mal on freenode if you're not already there.

kanaka commented on September 26, 2024

@pstephens Emitting raw machine/object code would definitely be an interesting challenge/puzzle, but it would be a lot of effort to learn not that much more than emitting assembly. I think the most interesting parts (in terms of both learning and results) are definitely the items that you enumerated (JIT/eval, GC, efficient persistent data structures, exceptions, interop). I think this is basically the conclusion you arrived at.

Also, unless you are really partial to x86 assembly (I certainly have nostalgia for it), I would suggest going the LLVM IR route. It's easier to emit (unless you're already an x86 assembly expert), more flexible (output to many architectures), and a more useful thing to learn (at least LLVM related stuff is hotter on a resume these days). But perhaps the biggest advantage is that you could leverage the well developed JIT infrastructure that already exists in LLVM.

Regarding where to do the project: I think it makes sense for the compiler itself to live in a separate repo. Once that exists, using that as an alternate build mode for the mal-in-mal in the mal project seems reasonable. I.e. we could have both @dubek's and your implementation as alternate compilers for building a compiled mal-in-mal implementation.

Also, definitely join #mal :-)

dubek commented on September 26, 2024

Another advantage of generating LLVM IR is that you can gain from optimizations at the IR level (LLVM's opt tool). For example, in malc a Mal expression like (* 3 (+ 2 5)) will be compiled to something like mal_integer_to_raw(mal_mul(make_integer(3), mal_add(make_integer(2), make_integer(5)))) and opt will optimize all that to the literal 21 in the generated optimized LLVM code.
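The kind of rewrite described here is constant folding. As a rough illustration of the idea only (this is not how LLVM's opt is implemented, and the nested-tuple expression format and the `fold` name are assumptions made for this sketch), a toy folder in Python could look like this:

```python
import operator

# Operators this toy folder knows how to evaluate at compile time.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(expr):
    """Fold constant sub-expressions of a nested (op, left, right) tuple."""
    if not isinstance(expr, tuple):
        return expr  # an integer literal or a symbol name (str)
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)  # both operands known: compute now
    return (op, left, right)  # otherwise keep the (partly folded) node


print(fold(("*", 3, ("+", 2, 5))))    # 21
print(fold(("*", "x", ("+", 2, 5))))  # ('*', 'x', 7)
```

The second call shows the partial case: the inner sum is folded to 7, but the multiplication stays because `x` is not known until run time.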

pstephens commented on September 26, 2024

@dubek wrote:

How does this work as, for example, step0, which should run an endless loop echoing the user's input?

I think the main distinction is AOT vs JIT. In order to support the REPL while also generating x64, a JIT will need to be implemented. Generating asm source will also be required to bootstrap from an existing mal implementation. Technically the whole thing could be written in plain assembly, but I kind of like the idea of using mal macros and fns for composition rather than the macro system of the assembler. AOT is a "nice to have" to improve startup time but not required to pass the test suite.

This whole thing could also be written as an interpreter but would only end up being a lower level version of the C implementation. (Still interesting though!)

@kanaka wrote:

Emitting raw machine/object code would definitely be an interesting challenge/puzzle, but it would be a lot of effort to learn not that much more than emitting assembly.

Yeah, I've already decided to emit NASM assembly source rather than an ELF binary. This will make it possible to bootstrap from any existing mal implementation, similarly to how @dubek bootstraps the malc implementation. The JIT compiler could also shell out to NASM, but it shouldn't be terribly hard to write an in-process assembler once the binary is bootstrapped. I'll only need to implement a subset of x64 anyway (no SSE, MMX, FP, etc.)
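To make "emit NASM assembly source" concrete, here is a minimal sketch in Python of a code generator that turns an integer arithmetic expression into NASM-syntax x86-64 instructions using the machine stack. It is an illustration under stated assumptions, not the design proposed in this thread: the nested-tuple input format, the `emit` and `compile_expr` names, and the choice of rax/rbx are all hypothetical, and only a code fragment is produced (no sections, entry point, or syscalls).

```python
# Maps each supported operator to the instruction that combines
# the left operand (rax) with the right operand (rbx).
INSTR = {"+": "add rax, rbx", "-": "sub rax, rbx", "*": "imul rax, rbx"}


def emit(expr, out):
    """Append instructions that leave the value of `expr` on the stack."""
    if isinstance(expr, int):
        out.append(f"mov rax, {expr}")
        out.append("push rax")
        return
    op, left, right = expr
    emit(left, out)
    emit(right, out)
    out.append("pop rbx")  # right operand
    out.append("pop rax")  # left operand
    out.append(INSTR[op])
    out.append("push rax")


def compile_expr(expr):
    """Return NASM-syntax text that computes `expr` into rax."""
    out = []
    emit(expr, out)
    out.append("pop rax")  # final result ends up in rax
    return "\n".join(out)


print(compile_expr(("*", 3, ("+", 2, 5))))
```

Because the output is plain text, a generator like this can run on any host language, which is the bootstrapping property mentioned above. An in-process assembler would replace the text output with encoded instruction bytes.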

@kanaka wrote:

I would suggest going the LLVM IR route.

LLVM is also interesting and useful. And definitely better for the resume. But I do want to go pretty low level on how JIT/GC/syscalls/stack unwinding/etc. work. For example, I've already read one paper on GC, gotten 1/3 of the way through one of the AMD64 reference volumes, and gotten part way through the persistent data structures book. So the learning strategy seems to be working.

For the "advanced mal course" it may be interesting to take a similar attack for other "virtual machines". I.e. target CIL, Java Byte Code, or LLVM IR. I use the term "virtual machine" loosely here because it may also make sense to cross compile to an existing high level language like JavaScript or Erlang.

@kanaka wrote:

I think it makes sense for the compiler itself to live in a separate repo.

Are you sure? I think this could follow the 11-step progression. It would have more "extra" files to cover the low level nature of the implementation. See https://github.com/pstephens/mal/tree/mal-x64/mal-x64 for my first take on this.

Or... this could be done separately as you suggest. But the 11-step progression would be less distinct. Maybe I'll keep implementing on the fork and we can defer this decision until later.

kanaka commented on September 26, 2024

kanaka commented on September 26, 2024

@pstephens I'm going to close this since it has been open for a while and in the meantime Ben Dudson has done a full x86 assembly/NASM implementation. If you do ever get back around to this alternate approach, I would be happy to consider including it too. I'm happy to include multiple implementations for the same language if they have very different approaches that still fit the overall mal structure and pedagogy goal.
