innodocs / vm

Virtual Machine dev

License: GNU General Public License v3.0

Standard ML 26.62% C++ 28.57% Scala 36.41% ANTLR 2.90% Java 1.04% Lex 1.72% Assembly 2.15% GAP 0.59%
virtual-machines scala cpp17 antlr4 smlnj lex yacc gap translator compiler

VM

Virtual Machine Development

Introduction

The goal of this "VM" project is to build a language system complete with virtual machines, compilers, assemblers and translators (GAP to C etc).

The system is built in small, incremental steps to keep it accessible. The material could also be useful in an introductory CS lecture on language systems. The chapters are as follows:

  • C1: abstract syntax for SLP (Straight Line Programs, [APPL1998], pp. 7-12); compiler/emitter of VM code; and Virtual Machine for SLPs (C++, SML-NJ, Scala)
  • C2: add emitter for assembly language to compiler; add simple assembler (Antlr, Java, Scala)
  • C3: lexers/parsers for SLP (Antlr, ml-lex, ml-yacc)
  • C4: replace SLP with the GAP language (Groups, Algorithms, and Programming [GAP2021]); implement 'if' and 'while' statements
  • C5: add 'for' and 'repeat' statements; move eval and pretty print code from compiler file into separate files.
  • C6: add string type to compiler, add string pool, string constants, string printing to VM/Asm
  • C7: add functionality for functions (function), local variable declarations (local), function returns (return), as well as loop break and continue statements.

Folder Structure

Each chapter folder is structured in the same way:

  • vm: the actual virtual machine (C++)
  • asm: the assembler (Scala, Antlr)
  • comp: the compiler for SLP (C1-C3) and GAP (from C4 on); the 'scala' sub-directory holds the Scala version of the compiler, 'sml' holds the SML-NJ version
  • test: test programs

Building the System

Given our goal of accessibility, we have tried to make building and experimenting with the system as simple as possible. We have therefore avoided build systems with involved folder structures, rules and learning curves, and instead placed all required files in a single folder together with a simple Ant build file. Even Ant is not required, as the various (sub)systems are simple enough to build by hand.

Building with Ant

Define the SCALA_HOME, ANTLR_HOME and ANTLR_CLASSPATH environment variables, e.g.:

    export SCALA_HOME="/usr/local/Cellar/scala/2.13.5/libexec/"
    export ANTLR_HOME="/usr/local/Cellar/antlr/4.9.2/"
    export ANTLR_CLASSPATH="$ANTLR_HOME/antlr-4.9.2-complete.jar"
    
    export PATH="$ANTLR_HOME/bin:$PATH"
    export CLASSPATH="$ANTLR_CLASSPATH:$CLASSPATH"

For each subsystem:

  • cd to the sub-system folder (vm, asm, comp/scala)
  • run ant:

    Task                  Command
    to build the system   ant
    to build and test     ant test
    to clean up           ant clean

  • if everything went OK, there will be vm, vm-asm and vm-comp binaries in the bin folder. To test/run:

    cd test
    ../bin/vm-asm asmtest.asm asmtest.vm
    ../bin/vm asmtest.vm
    
    ../bin/vm-comp test4.gap test4.vm
    ../bin/vm test4.vm
    

For the SML build:

    cd comp/sml
    sml sources.cm
    Test.run();

Notes on Migrating from SML to Scala

The file Sml-to-Scala.md in the dist root folder contains instructions on how to convert SML code to Scala. Note that this is simply how we handled the migration; there are probably other, better ways to do it.


C1

C2

C3

C4

C5

C6

Quick notes on the functionality added in this section: until now, printing was restricted to lists of integers:

    a := 1;
    b := 2;
    Print(a, b);

would print

    1, 2

We've now added support for string arguments to Print, so now we're able to format our output:

    Print("a = ", a, "b = ", b, "\n");

will output

    a = 1, b = 2

At the VM level, we added an instruction for string printing, SPRINT, and a string pool to the VM file with the following structure:

  +-------+-------+-------+-------+- - - -+-------+-------+- - - -+
  | string| total | str 1 | str 1 |       | str n | str n |       |
  | count | size  | len   | char 1|       | len   | char 1|       |
  +-------+-------+-------+-------+- - - -+-------+-------+- - - -+
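
As a rough illustration of how a VM might read such a pool, here is a minimal C++ sketch. It is not the project's actual loader: the function name `readStringPool`, the use of one 32-bit word per cell (one character per word), and the reading of "total size" as the number of words following the two-word header are all assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Assumed layout (one 32-bit word per cell, one character per word):
//   [count][total][len_1][chars...] ... [len_n][chars...]
// "total" is assumed to be the number of words after the two-word header.
std::vector<std::string> readStringPool(const std::vector<std::int32_t>& words) {
    if (words.size() < 2 || words[0] < 0 || words[1] < 0)
        throw std::runtime_error("string pool header invalid");
    const std::size_t count = static_cast<std::size_t>(words[0]);
    const std::size_t end = 2 + static_cast<std::size_t>(words[1]);
    if (words.size() < end)
        throw std::runtime_error("string pool body truncated");

    std::vector<std::string> pool;
    std::size_t pos = 2;
    for (std::size_t i = 0; i < count; ++i) {
        // Each entry starts with its length, followed by that many characters.
        if (pos >= end || words[pos] < 0)
            throw std::runtime_error("string entry header invalid");
        const std::size_t len = static_cast<std::size_t>(words[pos++]);
        if (len > end - pos)
            throw std::runtime_error("string entry truncated");
        std::string s;
        for (std::size_t j = 0; j < len; ++j)
            s.push_back(static_cast<char>(words[pos++]));
        pool.push_back(std::move(s));
    }
    return pool;
}
```

With this reading of the layout, an instruction such as SPRINT would only need to carry an index into the returned vector.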

Both the assembler and the compiler maintain a map of all strings encountered, so that duplicates are not included in the VM file, and emit the string pool as part of the VM file (stringpool.sml).
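
The de-duplicating map could look roughly like the following C++ sketch. The project implements this in stringpool.sml, so the class below is only an analogy: `StringPoolBuilder`, `intern` and `emit` are hypothetical names, and the serialised form assumes one 32-bit word per cell with "total size" counting the words after the header.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical builder: assigns each distinct string one pool index.
class StringPoolBuilder {
public:
    // Returns the pool index of s, adding s only the first time it is seen.
    std::int32_t intern(const std::string& s) {
        auto it = index_.find(s);
        if (it != index_.end()) return it->second;
        const auto id = static_cast<std::int32_t>(strings_.size());
        strings_.push_back(s);
        index_.emplace(s, id);
        return id;
    }

    // Serialises the pool as: count, total size, then (len, chars...) per string.
    std::vector<std::int32_t> emit() const {
        std::vector<std::int32_t> body;
        for (const auto& s : strings_) {
            body.push_back(static_cast<std::int32_t>(s.size()));
            for (unsigned char c : s) body.push_back(c);
        }
        std::vector<std::int32_t> out;
        out.push_back(static_cast<std::int32_t>(strings_.size()));
        out.push_back(static_cast<std::int32_t>(body.size()));
        out.insert(out.end(), body.begin(), body.end());
        return out;
    }

private:
    std::vector<std::string> strings_;                      // insertion order = pool index
    std::unordered_map<std::string, std::int32_t> index_;   // string -> pool index
};
```

Keeping the strings in a vector alongside the map preserves insertion order, so an index handed out during compilation still refers to the same string when the pool is written.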

The only real complication is the fact that GAP is a language without type declarations, which means that apart from having to add support for types to the compiler (type.sml)

    datatype ty = ANY
                | ANYVAL
                | BOOL
                | INT
                | ANYREF
                | STRING
                | ARRAY of ty
                | RECORD of (Symbol.symbol * ty) list
                | NULL
                | NOTHING
                | META of ty ref

we also had to implement a basic type inference algorithm (inferTypes in compiler.sml).
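
To give a feel for what inferring types in a declaration-free language involves, here is a deliberately small C++ sketch. It is not a port of inferTypes: it handles only straight-line assignments in a single forward pass, covers just a few of the types above, and does not model `META of ty ref`. All names (`Ty`, `Rhs`, `Assign`, `join`, `inferTypesSketch`) are invented for this example.

```cpp
#include <map>
#include <string>
#include <vector>

// A small subset of the compiler's types, for illustration only.
enum class Ty { Nothing, Int, Bool, String, Any };

// Right-hand side of an assignment: a literal of known type, or a variable.
struct Rhs {
    bool isVar;
    Ty literalTy;     // used when !isVar
    std::string var;  // used when isVar
};

struct Assign {
    std::string target;
    Rhs rhs;
};

// Least upper bound in a flat lattice: Nothing < {Int, Bool, String} < Any.
Ty join(Ty a, Ty b) {
    if (a == b) return a;
    if (a == Ty::Nothing) return b;
    if (b == Ty::Nothing) return a;
    return Ty::Any;
}

// One forward pass: a variable's type is the join of everything assigned to it.
std::map<std::string, Ty> inferTypesSketch(const std::vector<Assign>& prog) {
    std::map<std::string, Ty> env;
    for (const auto& a : prog) {
        Ty rhsTy = a.rhs.literalTy;
        if (a.rhs.isVar) {
            auto it = env.find(a.rhs.var);
            rhsTy = (it == env.end()) ? Ty::Any : it->second;  // unknown variable: give up
        }
        auto [slot, inserted] = env.emplace(a.target, rhsTy);
        if (!inserted) slot->second = join(slot->second, rhsTy);
    }
    return env;
}
```

Because the pass runs only once, a variable copied from another keeps the type its source had at that point; handling loops or later reassignments would need iteration to a fixed point or a unification-based approach.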

The other notable change was the replacement of the simplistic handling of environments with support for symbols and functional symbol tables (see [APPL1998], pp. 107-111) (symbol.sml).
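
The key property of a functional symbol table is that adding a binding yields a new table and leaves the old one untouched, so leaving a scope is just a matter of going back to the previous table. The C++ sketch below shows that idea with a shared, immutable linked list. It is an assumption-laden illustration rather than a translation of symbol.sml: the class name `Env` is made up, the method names `enter` and `look` merely echo the style of [APPL1998], and a real implementation would typically use a balanced tree or hash structure instead of a list.

```cpp
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Persistent (functional) symbol table: tables share their tails.
template <typename V>
class Env {
public:
    Env() = default;  // the empty table

    // Returns a new table with name bound to value; *this is unchanged.
    Env enter(std::string name, V value) const {
        return Env(std::make_shared<Node>(
            Node{std::move(name), std::move(value), head_}));
    }

    // Looks a name up; the most recently entered binding wins.
    std::optional<V> look(const std::string& name) const {
        for (const Node* n = head_.get(); n != nullptr; n = n->next.get())
            if (n->name == name) return n->value;
        return std::nullopt;
    }

private:
    struct Node {
        std::string name;
        V value;
        std::shared_ptr<const Node> next;  // older bindings, shared between tables
    };

    explicit Env(std::shared_ptr<const Node> head) : head_(std::move(head)) {}

    std::shared_ptr<const Node> head_;
};
```

A compiler pass can hand an extended table to the body of a block and simply keep using its own table afterwards; no explicit undo of bindings is needed.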

C7

In this chapter we add support for functions (GAP function), local variable declarations (GAP local) and the return, continue and break keywords. This requires reworking the code generator and emitter, as the current implementation only supports computing jump offsets over full blocks, whereas return can jump across multiple blocks.
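
One common way to handle jumps whose target is not yet known is backpatching: emit the jump with a placeholder operand, remember where it is, and fill in the offset once the target address is reached. The C++ sketch below illustrates that technique only; it is not necessarily the approach this project will take. The opcodes, the `Emitter` and `PendingJumps` names, and the convention that offsets are relative to the word after the operand are all assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes; the project's real instruction set is not shown here.
enum Op : std::int32_t { OP_NOP = 0, OP_JMP = 1, OP_RET = 2 };

class Emitter {
public:
    void emit(std::int32_t word) { code_.push_back(word); }
    std::size_t here() const { return code_.size(); }

    // Emits "JMP ?" and returns the index of the operand to fill in later.
    std::size_t emitJumpPlaceholder() {
        emit(OP_JMP);
        emit(0);
        return code_.size() - 1;
    }

    // Fills in a pending jump; the offset is relative to the word after the operand.
    void patch(std::size_t operandIndex, std::size_t target) {
        code_[operandIndex] = static_cast<std::int32_t>(target) -
                              static_cast<std::int32_t>(operandIndex + 1);
    }

    const std::vector<std::int32_t>& code() const { return code_; }

private:
    std::vector<std::int32_t> code_;
};

// Jumps that leave an enclosing construct (a loop for break, a function for return).
struct PendingJumps {
    std::vector<std::size_t> operands;

    void add(std::size_t operandIndex) { operands.push_back(operandIndex); }

    // Called once the exit address of the construct is known.
    void patchAll(Emitter& e, std::size_t target) {
        for (std::size_t idx : operands) e.patch(idx, target);
        operands.clear();
    }
};
```

A code generator would keep one `PendingJumps` per open loop and one per function: every `return`, however deeply nested, adds its placeholder to the function's list, and all of them are patched together when the function epilogue is emitted.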

Before diving into this we will start with an intermediate project and add a translator from GAP to C++, which is rather easy to implement on top of the current code base.


Bibliography

[APPL1998] Andrew W. Appel, Modern Compiler Implementation in ML; 1998. (https://www.cs.princeton.edu/~appel/modern/ml/)

[GAP2021] The GAP Group, GAP -- Groups, Algorithms, and Programming, Version 4.11.1; 2021. (https://www.gap-system.org)
