swisschili / vm

stack based virtual machine

License: GNU General Public License v3.0

C 95.78% C++ 2.55% Meson 0.53% Assembly 1.14%
virtual-machine vm c assembler

vm

This project contains a simple stack-based virtual machine. The instruction set is available on the wiki.

building

$ meson build
$ ninja -C build
# build/vm is the compiled binary

design

This virtual machine is designed with the following ideas in mind:

  • complete unicode support
  • stack-based
    • untyped stack: everything on the stack is a 32-bit integer, which can be either a pointer into the heap or a plain value

convenience > file size

I've decided several times during the design of this software to prioritize the convenience of the developer over everything else. By developer I mean both myself working on the virtual machine and anyone who might use it as a compiler target, write assembly for it, or otherwise work with it. For this reason I've chosen 32-bit integers as the basis for everything in this vm: strings are encoded in UTF-32, pointers are 32 bits wide, the heap is a vector of 32-bit ints, and even the bytecode encodes every instruction as a 32-bit integer. The drawbacks are pretty obvious: drastically increased file size and a possible increase in memory consumption (although it can't be as bad as the JVM... can it?).

types

strings

Strings are encoded in UTF-32 and terminated by the value 0xFFFF. They are stored on the stack in reverse order, so that the first character is on top:

    End of string     Top of stack
          v                v
[ ...  0xFFFF  o  l  l  e  H ]

And on the heap:

Start of string    End of string
       v                v
[ ...  H  e  l  l  o  0xFFFF  ... ]
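
The two layouts can be sketched in C roughly as follows. This is only an illustration under assumptions, not the project's code: the names vm_stack, push_string, pop_string_to_heap, and STR_END are hypothetical, and the only thing taken from the diagrams is the 0xFFFF terminator and the ordering of the cells.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_MAX 256
#define STR_END 0xFFFFu /* terminator value taken from the diagrams */

/* Hypothetical stack of 32-bit cells; data[top - 1] is the top of the stack. */
struct vm_stack {
    uint32_t data[STACK_MAX];
    size_t top;
};

static void push(struct vm_stack *s, uint32_t cell)
{
    assert(s->top < STACK_MAX);
    s->data[s->top++] = cell;
}

/* Push a UTF-32 string so that its first code point ends up on top:
 * the terminator goes first, then the code points in reverse order. */
static void push_string(struct vm_stack *s, const uint32_t *str, size_t len)
{
    push(s, STR_END);
    for (size_t i = len; i > 0; i--)
        push(s, str[i - 1]);
}

/* Pop a string off the stack into heap-style storage: code points in
 * reading order, followed by the terminator. Returns the string length. */
static size_t pop_string_to_heap(struct vm_stack *s, uint32_t *heap, size_t cap)
{
    size_t n = 0;
    while (s->top > 0) {
        uint32_t cell = s->data[--s->top];
        assert(n < cap);
        heap[n] = cell;
        if (cell == STR_END)
            return n;
        n++;
    }
    return n; /* stack exhausted without finding a terminator */
}
```

Because the string sits reversed on the stack, popping cell by cell yields the characters in reading order, which is why copying to the heap needs no extra reversal step.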

complex types

Complex types (structs, unions, etc.) should be constructed in a manner similar to lambda calculus. Their exact representation depends on the compiler. For instance, here is one way a C struct could be compiled to a lambda calculus (and vm) function:

struct foo
{
    int a;
    int b;
    char c;
};
-- foo :: Int -> Int -> Int -> Bool -> Bool -> Bool -> Int
λ a . λ b . λ c .
    λ a1 . λ b1 . λ c1 .
        a1 a (b1 b (c1 c 0))

This assumes, of course, that the Bool type mentioned is the typical lambda calculus boolean encoding:

-- true = 
λ a . λ b . a
-- false =
λ a . λ b . b

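Creating a value of type foo then involves calling the function with the first three arguments (the items to store in it). A stored value is accessed by passing three more booleans that select the field: the first true boolean picks its field, and if all three are false the result is 0.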
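
C has no closures, so the encoding can only be approximated there. The sketch below is a first-order stand-in, assuming the Church booleans behave like ordinary C booleans; foo_closure, foo_make, and foo_select are hypothetical names used for illustration and are not part of the vm.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stands in for the closure produced by applying foo to a, b and c. */
struct foo_closure {
    int32_t a, b, c;
};

/* "Construct" a foo: corresponds to supplying the first three arguments. */
static struct foo_closure foo_make(int32_t a, int32_t b, int32_t c)
{
    struct foo_closure f = { a, b, c };
    return f;
}

/* Corresponds to a1 a (b1 b (c1 c 0)): a Church boolean applied to two
 * arguments picks the first if it is true and the second if it is false. */
static int32_t foo_select(struct foo_closure f, bool a1, bool b1, bool c1)
{
    return a1 ? f.a : (b1 ? f.b : (c1 ? f.c : 0));
}
```

For example, foo_select(foo_make(1, 2, 3), false, true, false) selects the second field.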

usage

  • -D, --debug: show debug output, prints stack, registers, and operation at each iteration
  • -r, --run: text assembly file to execute
  • -e, --execute: compiled bytecode to execute

example

$ cat test.s
PSH 123
POP EAX
LDR EAX
$ vm -Dr test.s
Running test.s with debug mode 1
____________________________
PSH 123
lt: 0 gt: 0 eq: 0
EAX: 0 EBX: 0 ECX: 0 EDX: 0
[ 123 ]
____________________________
POP 0
lt: 0 gt: 0 eq: 0
EAX: 123 EBX: 0 ECX: 0 EDX: 0
[ ]
____________________________
LDR 0
lt: 0 gt: 0 eq: 0
EAX: 123 EBX: 0 ECX: 0 EDX: 0
[ 123 ]
Returned 123
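
What the three instructions in this trace do can be sketched as a tiny interpreter loop. This is a simplified illustration, not the actual implementation: the opcode numbering, the (opcode, operand) pair layout, and the names vm_run and struct vm are assumptions. Numbering EAX as 0 merely mirrors the trace, where POP EAX is printed as POP 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode and register numbering; the real encoding is on the wiki. */
enum { OP_PSH, OP_POP, OP_LDR };
enum { EAX, EBX, ECX, EDX, NUM_REGS };

#define STACK_MAX 64

struct vm {
    int32_t regs[NUM_REGS];
    int32_t stack[STACK_MAX];
    size_t top; /* number of cells in use */
};

/* Run `count` (opcode, operand) pairs and return the value left on top
 * of the stack, or 0 if the stack is empty. */
static int32_t vm_run(struct vm *vm, const int32_t *code, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        int32_t op = code[2 * i];
        int32_t arg = code[2 * i + 1];
        switch (op) {
        case OP_PSH: /* push an immediate value */
            assert(vm->top < STACK_MAX);
            vm->stack[vm->top++] = arg;
            break;
        case OP_POP: /* pop the top of the stack into a register */
            assert(vm->top > 0 && arg >= 0 && arg < NUM_REGS);
            vm->regs[arg] = vm->stack[--vm->top];
            break;
        case OP_LDR: /* push a register's value onto the stack */
            assert(vm->top < STACK_MAX && arg >= 0 && arg < NUM_REGS);
            vm->stack[vm->top++] = vm->regs[arg];
            break;
        default:
            assert(!"unknown opcode");
        }
    }
    return vm->top > 0 ? vm->stack[vm->top - 1] : 0;
}
```

Feeding it the program { OP_PSH, 123, OP_POP, EAX, OP_LDR, EAX } leaves 123 in EAX and on top of the stack, matching the final state in the trace.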

known issues

  • The parser uses scanf and is really finicky: it will not accept leading or trailing whitespace.
  • DUP instruction does not parse for some bizarre reason.
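
One possible way to make line parsing tolerant of surrounding whitespace is sketched below, still using the scanf family. It is not the parser in this repository: parse_line and struct parsed_line are hypothetical, and treating ';' as the start of a comment is an assumption based on the assembly shown elsewhere on this page.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed form of one assembly line. */
struct parsed_line {
    char mnemonic[32];
    char operand[32];
    int has_operand;
};

/* Parse "  MNEMONIC [OPERAND]  ; comment". Returns 1 on success and
 * 0 for blank or comment-only lines. */
static int parse_line(const char *line, struct parsed_line *out)
{
    char buf[128];
    strncpy(buf, line, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    char *comment = strchr(buf, ';'); /* cut off a trailing comment */
    if (comment)
        *comment = '\0';

    out->operand[0] = '\0';
    /* %s skips leading whitespace and stops at the next whitespace,
     * so indentation and trailing spaces are harmless. */
    int n = sscanf(buf, "%31s %31s", out->mnemonic, out->operand);
    if (n < 1)
        return 0;
    out->has_operand = (n == 2);
    return 1;
}
```

The field widths in the format string keep sscanf from overflowing the two buffers; a generated parser (as the roadmap suggests) would replace this entirely.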

roadmap

  • parse using a less horrible method (maybe bison or yacc?)
  • add a heap

license

GPL-3.0 License

Copyright (C) 2019 swissChili

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

vm's Issues

Debug output from map does not print when two LDRs are used next to each other.

Consider the following assembly. Whether it works or not is beside the point, but please note the LDR EBX instruction on line 21:

@start
    ; int goal = 10
    PSH 10
    POP EAX
    ; fact()
    CAL @fact
    END

@fact
    ; int i = 1
    PSH 1
    POP EBX
    ; int sum = 1
    PSH 1
    POP ECX

; Unused, for decoration
@fact:iterate
    ; else
    ; sum = sum * i
    LDR EBX
    LDR ECX ; EAX EBX ECX
    MLT     ; EAX EBX*ECX
    POP ECX ; EAX
    LDR EBX ; EAX EBX
    ; i++
    INC
    POP EBX

@fact:compare
    ; if (goal == i) goto end
    LDR EAX ; EAX
    LDR EBX ; EAX EBX
    CMP
    JGT @fact:iterate

@fact:end
    LDR ECX
    RET

If this assembly is run with vm -Dr foo.s, the normal output from the hashmap does not print, nor does a test puts call I put between the two while loops in asm.c (the label parser and the instruction parser).

If this LDR EBX instruction is commented out, the program works fine. That is, the expected debug output is printed and the program runs (although obviously it does not do what it is supposed to, since it lacks a rather important instruction).

I have no idea how this sort of bug is even possible. If I move the LDR instruction forward by one instruction, it does not work either. It seems that having these two LDRs next to each other breaks something. My best guess is that the label addresses are wrong, so the map reads garbage data, although I can't prove this.
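
For reference, one common way to compute label addresses in a first pass is sketched below. This is not the code in asm.c and is not claimed to be the fix: collect_labels, is_instruction, and struct label are hypothetical, and the sketch assumes that every instruction occupies one address and that labels, comments, and blank lines occupy none.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_LABELS 16

struct label {
    char name[32];
    size_t addr; /* index of the first instruction after the label */
};

/* Return 1 if the line holds an instruction (not blank, comment, or label). */
static int is_instruction(const char *line)
{
    while (isspace((unsigned char)*line))
        line++;
    return *line != '\0' && *line != ';' && *line != '@';
}

/* First pass: record each "@label" with the instruction index it refers to.
 * Returns the number of labels found. */
static size_t collect_labels(const char *const *lines, size_t n,
                             struct label *out)
{
    size_t pc = 0, found = 0;
    for (size_t i = 0; i < n; i++) {
        const char *p = lines[i];
        while (isspace((unsigned char)*p))
            p++;
        if (*p == '@' && found < MAX_LABELS) {
            size_t len = strcspn(p, " \t\r\n;");
            if (len >= sizeof out[found].name)
                len = sizeof out[found].name - 1;
            memcpy(out[found].name, p, len);
            out[found].name[len] = '\0';
            out[found].addr = pc;
            found++;
        } else if (is_instruction(p)) {
            pc++; /* only real instructions advance the address */
        }
    }
    return found;
}
```

If the label pass and the instruction pass disagree about which lines advance the address, every label after the first disagreement points at the wrong instruction, which is the kind of mismatch suspected above.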
