Wombat SymX

Introduction

Wombat SymX is a symbolic executor that operates on LLVM IR (specifically, *.bc files) and uses a novel node-based approach.

Setup

LLVM 13 is required both to run the program and to create *.bc LLVM bitcode files. It is packaged with the following Rust compilers: rustc 1.60.* through 1.64.*.

To install Rust on macOS and Linux, run:

curl https://sh.rustup.rs -sSf | sh

The project is tested on a 2021 M1 Pro MacBook with rustc 1.60.0 (7737e0b5c 2022-04-04) and llvm-13.0.1.

Mac Dependency Installation

You will need the following dependencies:

  • CMake (brew install cmake)
  • Swig (brew install swig)
  • LLVM (brew install llvm@13)

Run brew info llvm@13 to see details of the LLVM installation, then run the command it prints to add LLVM to your $PATH.

Lastly, add the prefix for llvm (as seen in the installation path from brew info llvm@13) to the environment variable LLVM_SYS_130_PREFIX.

  • ex: export LLVM_SYS_130_PREFIX="/opt/homebrew/opt/llvm@13"

Build Project

To build the project, use:

cargo build

Update Project Dependencies

To update project dependencies, use:

cargo update

Note that it is wise to back up your local dependencies first, in case an external dependency has been updated in a breaking way without being properly versioned.

Runtime Execution

To see an overview of run commands, use the following:

cargo run -- --help

To run the project, use:

cargo run -- [rs-file-path] [function-name]

To run the project with debug output enabled, use:

cargo run -- -d [rs-file-path] [function-name]

Run Test Suite

To run all (integration) test case functions (optionally matching a prefix), use:

cargo test [test-prefix]

To run test case functions with output, use:

cargo test [test-prefix] -- --show-output

Creating LLVM IR files

To create bc files containing LLVM IR that Wombat SymX can use, run the following command:

rustc --emit=llvm-bc <file-name>.rs

A human-readable LLVM IR format can be created by using the following:

rustc --emit=llvm-ir <file-name>.rs

Linting and Formatting

Obtain a report from the linter by running the following command:

cargo clippy

Attempt to automatically fix linter issues by running the following command:

cargo clippy --fix

Standardize the format of the source code by running the following command:

cargo fmt

Benchmarking

Install KLEE

brew install klee

Install hyperfine

brew install hyperfine

Add to Include Path

Add the directory containing the KLEE header file to the C include path (check the installation location with brew info klee). The path looks like the following:

m1: "/opt/homebrew/Cellar/klee/2.3_4/include/klee"

intel: "/usr/local/Cellar/klee/2.3_4/include/klee"

Generate Test Cases

python3 generate_test_seq_br.py <language> <number_of_branches> <safety>

Compile C Code for KLEE

m1: clang -I "/opt/homebrew/Cellar/klee/2.3_4/include/klee" -emit-llvm -c -g -O0 -Xclang -disable-O0-optnone <c_file>

intel: clang -I "/usr/local/Cellar/klee/2.3_4/include/klee" -emit-llvm -c -g -O0 -Xclang -disable-O0-optnone <c_file>

Time KLEE

time klee <c_bc_file>

Time Wombat SymX

time cargo run -- <rust_file> test

Resources

LLVM Unsigned vs Signed

LLVM IR integer types do not distinguish signed from unsigned values, so all integers appear as signed here. Intrinsic functions still perform unsigned operations while taking these integers as arguments.

https://stackoverflow.com/questions/14723532/llvms-integer-types

wombat-symx's Issues

Support Showing Stack Trace for Unsafe Main Function

The current implementation for showing the stack trace of a crash for an unsafe function works by creating a modified copy of the Rust source code being analyzed.

In particular, it renames the original main function to _main and creates a new main function that calls the function being analyzed with arguments that cause the program to crash. This renaming of main to _main will probably cause issues if we try to analyze main (since it should now call _main instead of main).

We can consider what happens if _main already exists or we can maybe leave that as an unsupported edge case; it is easy enough for a user to manually rename a _main function to something else before using this tool.
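The renaming scheme described above, including the special case where main itself is the function being analyzed, can be sketched as follows. This is a minimal illustration and not the project's actual implementation: the helper name make_crash_harness, its parameters, and the purely textual rename are all assumptions, and the sketch ignores the case where _main already exists.

```rust
/// Hypothetical helper: build a modified copy of `source` whose new `main`
/// calls `function` with the crashing `args`.
/// The original `main` is renamed to `_main` by a naive textual replace;
/// a robust tool would parse the source instead.
fn make_crash_harness(source: &str, function: &str, args: &[&str]) -> String {
    let renamed = source.replace("fn main(", "fn _main(");
    // If the analyzed function is `main`, it now lives under the name `_main`,
    // so the generated entry point has to call `_main` instead.
    let target = if function == "main" { "_main" } else { function };
    format!(
        "{}\nfn main() {{\n    {}({});\n}}\n",
        renamed,
        target,
        args.join(", ")
    )
}
```

Calling make_crash_harness(src, "main", &[]) yields a new main that calls _main(), which is the behavior the issue asks for.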

Avoid checking for magic strings

Currently, the get_var_name function and code checking variable types parse magic strings.

We should find a less brittle alternative if possible.

Handle variable type domain restrictions

Currently only i32 and i1 (bool) variables have their domain properly constrained.

This sort of constraint should be applied to all types we support in the future.
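One way to generalize the domain constraint is to derive the bounds from the bit width instead of hard-coding them per type. The sketch below assumes two's-complement signed integers and uses a hypothetical helper name, signed_bounds; it is not taken from the Wombat SymX source. An i1 used as a bool would presumably still need its own {0, 1} domain.

```rust
/// Hypothetical helper: inclusive (min, max) range of a two's-complement
/// signed integer that is `bits` wide. Returns None for unsupported widths.
fn signed_bounds(bits: u32) -> Option<(i128, i128)> {
    if bits == 0 || bits > 128 {
        return None;
    }
    if bits == 128 {
        // 1 << 127 would overflow i128, so handle the widest type directly.
        return Some((i128::MIN, i128::MAX));
    }
    let half = 1i128 << (bits - 1);
    Some((-half, half - 1))
}

/// True when `value` fits in a signed integer that is `bits` wide.
fn in_domain(bits: u32, value: i128) -> bool {
    match signed_bounds(bits) {
        Some((min, max)) => min <= value && value <= max,
        None => false,
    }
}
```

Each symbolic variable of width n would then be constrained by min <= x and x <= max in the solver. Deriving both bounds from a single width also avoids copy-paste mistakes between neighboring types.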

Standardize Benchmarking

We should make sure that the performance comparisons between KLEE and Wombat SymX are between equivalent actions.

Specifically, we should not include the compilation time of the Rust source code in Wombat SymX performance time if such compilation is not timed in the KLEE workflow.

Use stable identifier for intrinsic function calls

Currently the panic function is matched to _ZN4core9panicking5panic17he60bb304466ccbafE which is not stable across machines (and possibly Rust/LLVM builds).

We need to find a way to identify such functions reliably to do codegen for LLVM call instructions.
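One possible approach is to compare the path segments of the symbol and ignore the trailing hash, which is the part that varies. The sketch below assumes the legacy Rust mangling scheme (the _ZN...E form shown above, with a final h followed by 16 hex digits); the helper names are hypothetical, and symbols using the newer v0 scheme or escaped characters are not handled.

```rust
/// True for the trailing hash segment of a legacy-mangled symbol:
/// the letter `h` followed by 16 hexadecimal digits.
fn is_hash(segment: &str) -> bool {
    segment.len() == 17
        && segment.starts_with('h')
        && segment[1..].chars().all(|c| c.is_ascii_hexdigit())
}

/// Split a legacy-mangled symbol into its length-prefixed path segments,
/// dropping the trailing hash segment if there is one.
fn legacy_path(symbol: &str) -> Option<Vec<String>> {
    let body = symbol.strip_prefix("_ZN")?.strip_suffix('E')?;
    let bytes = body.as_bytes();
    let mut segments = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        // Read the decimal length prefix of the next segment.
        let start = i;
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
        if start == i {
            return None; // a segment must begin with its length
        }
        let len: usize = body[start..i].parse().ok()?;
        let end = i.checked_add(len)?;
        segments.push(body.get(i..end)?.to_string());
        i = end;
    }
    if segments.last().map_or(false, |s| is_hash(s)) {
        segments.pop();
    }
    Some(segments)
}

/// True when the symbol demangles to the path core::panicking::panic.
fn is_core_panic(symbol: &str) -> bool {
    legacy_path(symbol).map_or(false, |p| p == ["core", "panicking", "panic"])
}
```

With this check, the symbol quoted above and the same path with a different hash are both recognized, while a different function such as panic_fmt is not.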

Fix Phi Codegen Implementation

Given a Phi instruction in basic block bb3 that assigns x <- 1 when coming from bb1 and x <- 2 when coming from bb2, it is not correct to push the assignments into bb1 and bb2, nor is it correct to condition the assignment on the entry conditions from bb1 to bb3 and from bb2 to bb3.

This is because such a conversion to dynamic single assignment incorrectly assumes that bb1 and bb2 are never on the same execution path of the control flow graph. For example, the issue arises when the control flow graph has the edges bb1 -> bb2, bb1 -> bb3, and bb2 -> bb3.

To fix this, we can create an intermediate dummy node for each edge in the original graph. Each dummy node is either empty or contains the assignments for the Phi instructions of its successor, and it always branches to its destination unconditionally. Because the control flow graph is assumed to be acyclic, a given execution path cannot enter the same destination node twice, so at most one such dummy node can be on a path to a given node.
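The edge-splitting idea can be sketched with a small standalone model of the control flow graph. The types Phi and EdgeNode and the function split_edges are hypothetical names chosen for illustration; the sketch uses plain strings and constant values rather than inkwell's types, and is not the project's codegen.

```rust
use std::collections::BTreeMap;

/// A Phi instruction: `var` receives `value` when control arrives from `pred`.
#[derive(Debug, Clone, PartialEq)]
struct Phi {
    var: String,
    incoming: Vec<(String, i64)>, // (pred, value)
}

/// The dummy node inserted on the edge `from -> to`.
#[derive(Debug, Clone, PartialEq)]
struct EdgeNode {
    name: String,
    from: String,
    to: String,
    assignments: Vec<(String, i64)>, // (var, value); empty if `to` has no Phi for `from`
}

/// Create one dummy node per edge. Each node carries exactly the Phi
/// assignments of its destination that correspond to its source block.
fn split_edges(edges: &[(&str, &str)], phis: &BTreeMap<String, Vec<Phi>>) -> Vec<EdgeNode> {
    let mut nodes = Vec::new();
    for &(from, to) in edges {
        let mut assignments = Vec::new();
        if let Some(block_phis) = phis.get(to) {
            for phi in block_phis {
                for (pred, value) in &phi.incoming {
                    if pred.as_str() == from {
                        assignments.push((phi.var.clone(), *value));
                    }
                }
            }
        }
        nodes.push(EdgeNode {
            name: format!("{}_to_{}", from, to),
            from: from.to_string(),
            to: to.to_string(),
            assignments,
        });
    }
    nodes
}
```

For the example graph with edges bb1 -> bb2, bb1 -> bb3, and bb2 -> bb3, the node on bb1 -> bb3 carries x <- 1, the node on bb2 -> bb3 carries x <- 2, and the node on bb1 -> bb2 is empty. A path through bb1, bb2, and bb3 therefore only picks up x <- 2.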

Extend signed integer support for i128

Currently all signed integer types below i128 are supported (i.e., i1, i8, i16, i32, i64). Because the Z3 Int values are created using from_i64, support could not be immediately extended to i128.

This can probably still be implemented by using other initialization functions for Z3 Ints.
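One option, sketched here without any Z3 calls, is to split an i128 into a signed high half and an unsigned low half that each fit in 64 bits. Since Z3 Ints are unbounded, the full value could then be rebuilt in the solver as hi * 2^64 + lo, assuming an unsigned 64-bit constructor is available alongside from_i64 (an assumption to verify against the z3 crate). The helper names below are hypothetical.

```rust
/// Split `v` into (hi, lo) such that v == hi * 2^64 + lo,
/// where `hi` is a signed and `lo` an unsigned 64-bit value.
fn split_i128(v: i128) -> (i64, u64) {
    // The arithmetic shift keeps the sign in the high half;
    // the cast to u64 keeps the low 64 bits unchanged.
    ((v >> 64) as i64, v as u64)
}

/// Inverse of `split_i128`, used here only to check the decomposition.
fn join_i128(hi: i64, lo: u64) -> i128 {
    ((hi as i128) << 64) + (lo as i128)
}
```

The factor 2^64 does not fit in an i64 either, but it can be formed in the solver as the product of two 2^32 constants. Passing the decimal string of the value to a string-based constructor, if one exists, would be a simpler alternative.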

Relevant:

Issue: #9

PR #27

Fix i16 bounds

The i16 bounds have a typo causing them to be bounded as if they were i8s.

Notify inkwell of order of conditional branch operands

The LLVM documentation gives the operand order of a conditional branch as predicate, true_block, false_block. However, inkwell switches the order of true_block and false_block. We should suggest that inkwell add this to their documentation.
