This project forked from piax93/uroboros

Uroboros

Infrastructure for Reassembleable Disassembling and Transformation

Fork motivation

This fork aims to extend this technique to ARM Thumb executables. In the process, the OCaml core has been completely rewritten in Python.

To date, the rewritten tool has been tested to work on the following executables: bzip, gzip, BLAKE2, Himeno benchmark, dcraw (with statically linked libjpeg and liblcms, ARM requires assumption 3), FLAC encoder (with statically linked libFLAC), dolfyn, OPUS encoder (with statically linked libopus, ARM requires assumption 3).

Installation

Uroboros uses the following utilities (version numbers are in line with what was used during development; older releases may work as well):

Tool         Version
python       2.7
objdump      ≥2.22
readelf      ≥2.22
awk          ≥3.18
libcapstone  3.0.5-rc3

and the following Python packages (available through pip repositories):

Package      Version
capstone     ≥3.0.4
termcolor    ≥1.1.0
pyelftools   ≥0.24

Build

Uroboros is now written entirely in Python on the allpy branch, so there is nothing to build. However, you may want to modify some values in config.py to match your system configuration. Also, the parser recognises a large number of operators but is not complete; if an invalid operator exception is raised, the missing operator can be added to the appropriate set in Types.py.

Usage

Uroboros supports 64-bit and 32-bit ELF x86 executables and, experimentally, also Thumb2 ARM binaries. To use Uroboros for disassembling:

 $> python uroboros.py path_to_bin

The disassembled output can be found in the workdir directory, named final.s. Uroboros will also assemble it back into an executable, a.out.
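
The invocation above can also be scripted. The following is a minimal sketch, not part of Uroboros itself: the helper names build_command and run_uroboros are hypothetical, and it assumes that uroboros.py sits in the current directory and that python resolves to a Python 2.7 interpreter. The optional -o and -a flags are the ones documented in the options list that follows.

```python
import subprocess


def build_command(binary, output=None, assumptions=(), script="uroboros.py"):
    """Return the argument list for one Uroboros run (nothing is executed).

    build_command is a hypothetical helper written for this example.
    """
    # ASSUMPTION: "python" is a Python 2.7 interpreter on this system.
    cmd = ["python", script, binary]
    if output is not None:
        cmd += ["-o", output]          # output path for the reassembled binary
    for number in assumptions:
        cmd += ["-a", str(number)]     # one -a flag per symbolization assumption
    return cmd


def run_uroboros(binary, **options):
    """Run the command; check_call raises CalledProcessError on a non-zero exit."""
    subprocess.check_call(build_command(binary, **options))


if __name__ == "__main__":
    # Only prints the command line; call run_uroboros() to actually execute it.
    print(" ".join(build_command("path_to_bin", assumptions=(3, 2))))
```

Keeping command construction separate from execution makes it possible to inspect or log the exact command line before anything runs.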

The startup Python script provides the following options:

  • -o output

    Specifies an output path for the reassembled binary.

  • -g

    Apply instrumentations. New instrumentations can be implemented by creating subpackages in the instrumentation package. These must contain at least two modules (see the example package):

    • a module having the same name as the package, with a function named perform, accepting a list of instructions and a list of function objects and returning the instrumented list of instructions, and a function named aftercompile. The former is invoked just after the symbol reconstruction phase is completed, while the latter allows further modifications after the code has already been adjusted for compilation;
    • a module named plaincode, which must contain three string variables named beforemain, aftercode and instrdata. These are inserted, respectively, at the beginning of the main function, at the end of the .text section and at the end of the source file.

    Instrumentations are applied in alphabetical order; preventing interference among different instrumentations is left to the user. If multiple instrumentations have been implemented but only a subset is to be used, listing their package names as strings in the instrumentors list in config.py causes only those to be loaded and executed (in this case, in the order specified by the user).

    Instrumentation against ROP attacks using an adaptation of the technique described in [2] is already available in this repository.

  • -gcc "parameters"

    String of additional arguments a user may want to pass to the compiler.

  • -ex exclusions_file

    Specifies a file containing, on each line, either a hexadecimal value to exclude from symbol search inside the code, or an address range, in the format hexaddress-hexaddress, within the data sections that will be skipped when searching for pointers.

  • -fex function_exclusion_file

    When a non-stripped binary is being analysed, specifies a file containing a list of symbols that should not be considered functions.

  • -a assumption_number

    This option configures the three symbolization assumptions proposed in the original Uroboros paper [1]. In the current version, the first assumption (n-byte alignment) is set by default; the other two can be set by the user.

    Assumption two requires placing the data sections (.data, .rodata and .bss) at their original starting addresses. Linker scripts can be used during reassembly (gcc -T ld_script.sty final.s). Users may write their own linker script; some examples are given in the ld_script folder.

    Assumption three requires knowing the function starting addresses. To obtain this information, Uroboros can take unstripped binaries as input: the function starting addresses are read from the input, which is then stripped before disassembling.

    These assumptions can also be used at the same time (python uroboros.py path_to_bin -a 3 -a 2).
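
Returning to the -g option described above, the two modules of an instrumentation package might be laid out as in the sketch below. This is only an illustration under stated assumptions: the package name myinstr and its path are hypothetical, both modules are shown in a single block for brevity, instruction and function objects are treated as opaque, and the arguments of aftercompile are not stated in this README, so the signature used here is a guess. The example package in the repository is the reference to check.

```python
# --- instrumentation/myinstr/myinstr.py (hypothetical package name and path) ---

def perform(instructions, functions):
    """Invoked after symbol reconstruction; returns the instrumented list."""
    instrumented = []
    for instruction in instructions:
        # A real instrumentation would inspect or rewrite instructions here.
        # This sketch passes every instruction through unchanged.
        instrumented.append(instruction)
    return instrumented


def aftercompile(*args):
    """Hook run after the code has been adjusted for compilation.

    ASSUMPTION: the real signature is not documented in this README, so this
    placeholder accepts any arguments and does nothing.
    """
    return None


# --- instrumentation/myinstr/plaincode.py ---

# The three string variables the README requires; left empty in this sketch.
beforemain = ""  # inserted at the beginning of the main function
aftercode = ""   # inserted at the end of the .text section
instrdata = ""   # inserted at the end of the source file
```

A pass-through perform is a convenient starting point: it lets the package be loaded and the pipeline exercised end to end before any real rewriting logic is added.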

Stuff it would be nice to do

  • More testing on real applications and, after that, even more testing.
  • Change the way data flow is managed: currently, to ease debugging, most of the data is passed along via files, which implies a lot of unnecessary I/O operations.

[1] Reassembleable Disassembling, by Shuai Wang, Pei Wang, and Dinghao Wu. In Proceedings of the 24th USENIX Security Symposium, Washington, D.C., August 12-14, 2015.

[2] G-Free: defeating return-oriented programming through gadget-less binaries, by Kaan Onarlioglu, Leyla Bilge, Andrea Lanzi, Davide Balzarotti, and Engin Kirda. In Proceedings of the 26th Annual Computer Security Applications Conference, pp. 49-58. ACM, 2010.
