motapinto / java-minus-compiler

Development of a compiler for the Mini-Java language

JMM Compiler

Project Requirements

  • Java 15
  • JavaCC
  • Gradle

Compile

To compile the program, run gradle build. This will compile your classes to classes/main/java and copy the JAR file to the root directory. The JAR file will have the same name as the repository folder.

Run

To run the JAR on Windows, use the following command:

.\comp2021-1a.bat [-r=<num>] [-o] <input_file.jmm>

To run the JAR on Linux, use the following command:

./comp2021-1a [-r=<num>] [-o] <input_file.jmm>

The possible flags that can be used are the following:

  • -r=<num> | Enables register allocation for local variables, based on liveness analysis. <num> must be an integer greater than or equal to 1 and is the maximum number of registers that each function can use for local variables. If the local variables cannot be allocated within that limit, the compiler creates a report.
  • -o | Enables the -o optimizations: constant propagation, constant folding and dead code removal.
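
As an illustration of the flag rules above, here is a minimal sketch of how the command line could be validated. The class and field names (CompilerOptions, maxRegisters, optimize, inputFile) are hypothetical and are not taken from the project's sources; only the flag syntax and the "integer of at least 1" rule come from the description above.

```java
import java.util.OptionalInt;

/** Hypothetical holder for the parsed command-line options (not the project's class). */
final class CompilerOptions {
    final OptionalInt maxRegisters; // value of -r=<num>, empty when the flag is absent
    final boolean optimize;         // true when -o is present
    final String inputFile;         // path to the .jmm file

    private CompilerOptions(OptionalInt maxRegisters, boolean optimize, String inputFile) {
        this.maxRegisters = maxRegisters;
        this.optimize = optimize;
        this.inputFile = inputFile;
    }

    /** Parses arguments of the form: [-r=<num>] [-o] <input_file.jmm> */
    static CompilerOptions parse(String[] args) {
        OptionalInt regs = OptionalInt.empty();
        boolean optimize = false;
        String input = null;
        for (String arg : args) {
            if (arg.startsWith("-r=")) {
                // NumberFormatException is an IllegalArgumentException, so bad numbers fail here too
                int n = Integer.parseInt(arg.substring(3));
                if (n < 1) {
                    throw new IllegalArgumentException("-r must be an integer >= 1: " + arg);
                }
                regs = OptionalInt.of(n);
            } else if (arg.equals("-o")) {
                optimize = true;
            } else if (input == null) {
                input = arg;
            } else {
                throw new IllegalArgumentException("Unexpected argument: " + arg);
            }
        }
        if (input == null) {
            throw new IllegalArgumentException("Missing input file");
        }
        return new CompilerOptions(regs, optimize, input);
    }
}
```

With this sketch, CompilerOptions.parse(new String[] {"-r=3", "-o", "Test.jmm"}) yields a limit of 3 registers with the -o optimizations enabled, while -r=0 is rejected.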

Test

All the tests are located in test/JMMTest.java. To run them, enter the following command:

gradle test --tests "JMMTest"

Summary

  • Development of a compiler for .jmm files written in Java--, a subset of the Java language.
  • The compiler goes through 4 main steps:
    1. Lexical and Syntactic Analysis: parses a Java-- class file and generates a JSON file with its representation.
    2. Semantic Analysis: checks the program for semantic errors.
    3. Intermediate Representation: converts the Java-- class into a Low Level Intermediate Representation (LLIR).
    4. Code Generation: emits JVM instructions accepted by Jasmin, from which the .class files are generated.

Syntactic Errors

  • If the compiler finds a syntactic error inside a while statement, it does not halt; it recovers from the error and adds a Report with the error message.
  • After detecting a syntactic error inside a while statement, the compiler ignores every token until the next "{" or the next ")".
  • The generated .json file (in /generated/json) stores the AST if the program has no errors; otherwise it stores the list of reports.
  • All syntactic error messages include the line, column and expected token. One possible error message for the while statement is the following:
ERROR@SYNTATIC, line 4, col 4: Error(1) detected during parsing process. Unexpected token ')' ....
    Error Details:
        Line: 4          Column: 24
    Was expecting:
        "true" | "false" | "this" | "new" | "(" | "!" | <IDENTIFIER> | <INTEGER_LITERAL>
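
The recovery rule described above (skip tokens until the next "{" or ")") can be sketched as follows. This is a simplified illustration over a plain list of token strings, not the project's JavaCC code; the class and method names are hypothetical, and whether the real parser consumes the synchronization token itself is not stated in this README.

```java
import java.util.List;

/** Hypothetical sketch of skip-to-synchronization-token recovery for while statements. */
final class WhileRecovery {
    /**
     * Returns the index of the first "{" or ")" at or after position from,
     * or tokens.size() when no such token exists. Everything before the
     * returned index is ignored by the recovery.
     */
    static int skipToSyncToken(List<String> tokens, int from) {
        int i = from;
        while (i < tokens.size()
                && !tokens.get(i).equals("{")
                && !tokens.get(i).equals(")")) {
            i++; // discard the offending token and keep scanning
        }
        return i;
    }
}
```

For the token sequence while ( a < < b ) { }, an error reported at the second "<" (index 4) makes the sketch skip ahead to the ")" at index 6, where parsing could resume.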

Semantic Analysis

Main steps

  1. Build Analysis Table (Symbol Table)
  2. Type Analysis
  3. Initialization Analysis

Scenarios covered with reports

The compiler detects the following semantic errors:

  • Duplicated imports
  • Redeclaration of variables
  • Redeclaration of methods
  • Redeclaration of function parameters
  • Missing imports
  • Invalid types used
  • Accessing the length of a non-array
  • Method not found
  • Invalid parameters to method
  • Types don't match
  • Array assignment to a non-array variable
  • Array initialization with a type other than int
  • Variables not initialized
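
To illustrate one of the checks listed above, here is a minimal sketch of how a symbol table for a single method scope could report the redeclaration of a variable. The class MethodScope, its methods and the report wording are hypothetical; the project's actual symbol table and Report classes may look quite different.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical symbol table for one method scope (illustration only). */
final class MethodScope {
    private final Map<String, String> variables = new HashMap<>(); // name -> type
    private final List<String> reports = new ArrayList<>();

    /** Declares a variable; a second declaration of the same name produces a report. */
    void declare(String name, String type, int line) {
        if (variables.containsKey(name)) {
            reports.add("line " + line + ": variable '" + name + "' is already declared");
            return; // keep the first declaration
        }
        variables.put(name, type);
    }

    /** Returns the declared type, or null when the variable is unknown. */
    String typeOf(String name) {
        return variables.get(name);
    }

    List<String> reports() {
        return reports;
    }
}
```

Collecting reports instead of throwing lets the analysis continue and list several semantic errors in a single run.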

Intermediate Representation & Code Generation

Intermediate Representation

Intermediate Representation & Code Generation Features

  • Class
  • Fields
  • Methods
  • Instructions
  • Conditionals (if and if-else)
  • Loops (while)
  • Arrays
    • Array initialization (newarray int)
    • Array Store (astore)
    • Array Access (aload)
    • Array Position Store (iastore)
    • Array Position Access (iaload)
  • Limits (.limit stack and .limit locals)

Code Generation Instruction Selection (default)

  • iconst_, bipush, sipush, ldc for pushing an integer onto the stack with the lowest-cost instruction.
  • iinc for incrementing/decrementing local variables by a constant value.
  • ishl, ishr for replacing multiplications/divisions by a power of 2 with shifts.
  • ineg for subtracting a variable from 0.
  • iflt, ifge, ifgt, ifeq, ifne for if statements that compare with 0.
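
The selection rules above can be sketched as below. The value ranges follow the JVM instruction set (iconst_m1 to iconst_5, one signed byte for bipush, one signed short for sipush); the class and method names are hypothetical, and the project's own selector may cover a different set of cases.

```java
/** Hypothetical sketch of lowest-cost instruction selection for integer constants. */
final class PushSelector {
    /** Returns the cheapest Jasmin instruction that pushes value onto the stack. */
    static String pushInt(int value) {
        if (value == -1) return "iconst_m1";
        if (value >= 0 && value <= 5) return "iconst_" + value;
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) return "bipush " + value;
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) return "sipush " + value;
        return "ldc " + value; // anything larger is loaded from the constant pool
    }

    /**
     * Returns the shift amount when n is a positive power of 2, or -1 otherwise.
     * A result s means x * n can be emitted as x ishl s.
     */
    static int shiftFor(int n) {
        return (n > 0 && Integer.bitCount(n) == 1) ? Integer.numberOfTrailingZeros(n) : -1;
    }
}
```

Note that ishr rounds towards negative infinity while idiv rounds towards zero, so replacing a division with a shift only gives the same result for non-negative dividends, unless the generated code compensates for it.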

Optimizations

All the optimizations are done at the LLIR level either after the Semantic Analysis or after the generation of the Intermediate Representation.

Optimizations (-o)

  • Constant propagation
  • Constant folding

Optimizations (-r=)

  • Register allocation for local variables, using at most <num> registers
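
As a rough illustration of register allocation by graph coloring, the sketch below applies the classic simplify/select scheme to an interference graph. It assumes the graph is symmetric and that every variable appears as a key; the class name and the use of an empty Optional to signal "a report is needed" are hypothetical and not taken from the project.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of graph-coloring register allocation without spilling. */
final class RegisterAllocator {
    /** Returns variable -> register (0..k-1), or empty when simplification gets stuck. */
    static Optional<Map<String, Integer>> color(Map<String, Set<String>> graph, int k) {
        // Work on a copy so the original interference graph stays intact
        Map<String, Set<String>> work = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : graph.entrySet()) {
            work.put(e.getKey(), new HashSet<>(e.getValue()));
        }
        // Simplify: repeatedly remove a node with fewer than k neighbours
        Deque<String> stack = new ArrayDeque<>();
        while (!work.isEmpty()) {
            String pick = null;
            for (Map.Entry<String, Set<String>> e : work.entrySet()) {
                if (e.getValue().size() < k) { pick = e.getKey(); break; }
            }
            if (pick == null) return Optional.empty(); // no node can be removed safely
            for (String neighbour : work.remove(pick)) {
                Set<String> adjacent = work.get(neighbour);
                if (adjacent != null) adjacent.remove(pick);
            }
            stack.push(pick);
        }
        // Select: pop nodes and give each the lowest register unused by its neighbours
        Map<String, Integer> registers = new HashMap<>();
        while (!stack.isEmpty()) {
            String variable = stack.pop();
            Set<Integer> used = new HashSet<>();
            for (String neighbour : graph.get(variable)) {
                Integer r = registers.get(neighbour);
                if (r != null) used.add(r);
            }
            int r = 0;
            while (used.contains(r)) r++;
            registers.put(variable, r);
        }
        return Optional.of(registers);
    }
}
```

This heuristic is conservative: it can give up on graphs that are in fact k-colorable (a cycle of four variables with k = 2, for example), which is the usual trade-off of the simplify/select approach.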

Optimizations (default)

  • While loops generated using the do-while template
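
One common form of the do-while template is sketched below: the condition is placed after the body, so each iteration executes a single conditional jump instead of a conditional jump plus a goto. The label names and the emit method are hypothetical, and the sketch assumes the condition instructions leave 0 (false) or 1 (true) on the stack; the layout the project actually generates may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of emitting a while loop with the do-while template. */
final class WhileTemplate {
    /**
     * Layout:
     *   goto Cond_id        ; executed once, on loop entry
     * Body_id:
     *   body
     * Cond_id:
     *   condition
     *   ifne Body_id        ; the only jump executed per iteration
     */
    static List<String> emit(List<String> condition, List<String> body, int id) {
        List<String> out = new ArrayList<>();
        out.add("goto Cond_" + id);
        out.add("Body_" + id + ":");
        out.addAll(body);
        out.add("Cond_" + id + ":");
        out.addAll(condition);
        out.add("ifne Body_" + id);
        return out;
    }
}
```

The initial goto keeps while semantics: the condition is still tested before the first execution of the body.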

Extra features

  • Function overloading
  • Variables with keyword names: array, i32, ret, bool, field, method and void
  • Variables starting with $
  • Checks whether a variable is initialized
  • Pop instructions to discard unused values and avoid growth of the stack

Task Distribution

The project was developed collaboratively, using platforms such as Discord and VSCode Live Share. Tasks were constantly exchanged between members, and much of the code was written in a pair-programming setting, with all members taking part in frequent discussions about algorithm efficiency and data structures.

Pros

  • Function overloading
  • Efficient instructions in Jasmin
  • Checks if variable is initialized
  • Pop instructions to avoid the accumulation of stack size
  • Do-while template optimization
  • Constant folding optimization
  • Constant propagation optimization
  • Meaningful error/warning reports
  • Register allocation (graph coloring)
  • Code structure
  • Robustness of the compiler
  • Comprehensive tests in the JMMTest class
  • Storage of all steps
    • Saves .json AST file (/generated/json) while doing syntactic analysis.
    • Saves .symbol file in (/generated/symbol) with the symbol table contents.
    • Saves .ollir file (/generated/ollir) while doing intermediate representation step.
    • Saves .j file in (/generated/jasmin) while doing code generation step.
    • Saves .class file in (/generated/class) with the compiled result.

Cons

  • Grammar uses one local lookahead of 2

Contributors

joaobispo, kiko-g, luispramos, motapinto