umnsec / mlta


TypeDive: Multi-Layer Type Analysis (MLTA) for Refining Indirect-Call Targets

License: MIT License

mlta's Introduction

This project includes a prototype implementation (TypeDive) of MLTA. MLTA relies on an observation that function pointers are commonly stored into objects whose types have a multi-layer type hierarchy; before indirect calls, function pointers will be loaded from objects with the same type hierarchy layer by layer. By matching the multi-layer types of function pointers and functions, MLTA can dramatically refine indirect-call targets. MLTA's approach is highly scalable (e.g., finishing the analysis of the Linux kernel within minutes) and does not have false negatives in principle.
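As a rough illustration of the layer-by-layer matching described above, the following C++ sketch resolves an indirect call by starting from all address-taken functions with a matching signature and then intersecting that set with the functions recorded for each enclosing (struct, field) layer. It is a toy model, not TypeDive's implementation: `ToyMLTA`, `addFunction`, `recordStore`, `resolve`, the string-based type names, and the rule of stopping refinement at an escaped or unrecorded layer are all assumptions made for the example.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// One layer of a type chain: (struct type name, field index).
using Layer = std::pair<std::string, int>;
using FuncSet = std::set<std::string>;

// Hypothetical toy model of multi-layer type matching (not TypeDive's code).
struct ToyMLTA {
    std::map<std::string, FuncSet> bySignature;  // first layer: function type -> functions
    std::map<Layer, FuncSet> confined;           // outer layers: (struct, field) -> functions stored there
    std::set<std::string> escaped;               // struct types we give up on (e.g. cast away)

    // Register an address-taken function under its signature.
    void addFunction(const std::string &sig, const std::string &name) {
        bySignature[sig].insert(name);
    }

    // Record that `name` was stored through `chain`, listed innermost layer first.
    void recordStore(const std::vector<Layer> &chain, const std::string &name) {
        assert(!chain.empty() && "a store needs at least one enclosing layer");
        for (const Layer &layer : chain)
            confined[layer].insert(name);
    }

    // Resolve an indirect call whose pointer was loaded through `chain`.
    FuncSet resolve(const std::string &sig, const std::vector<Layer> &chain) const {
        FuncSet targets;
        auto sigIt = bySignature.find(sig);
        if (sigIt != bySignature.end())
            targets = sigIt->second;  // baseline: function-type matching only

        for (const Layer &layer : chain) {
            auto it = confined.find(layer);
            // Assumption of this sketch: stop refining (stay conservative)
            // when the type escaped or nothing was recorded for the layer.
            if (escaped.count(layer.first) || it == confined.end())
                break;
            FuncSet kept;
            for (const std::string &f : targets)
                if (it->second.count(f))
                    kept.insert(f);
            targets = std::move(kept);
        }
        return targets;
    }
};
```

For example, if `read_file`, `write_log`, and `parse` all share the signature `int(char*)` but only `read_file` is stored into field 0 of a hypothetical `file_ops` struct, signature matching alone yields three candidates, while resolving with the layer `("file_ops", 0)` yields only `read_file`.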

TypeDive has been tested with LLVM 15.0, at the O0 and O2 optimization levels, on the Linux kernel. The final results of TypeDive may have a few false negatives. Observed causes include hacky code in Linux (mainly out-of-bounds accesses from container_of), compiler bugs, and false negatives from the baseline (function-type matching).

How to use TypeDive

Build LLVM

	$ ./build-llvm.sh 
	# The tested LLVM is of commit e758b77161a7 

Build TypeDive

	# Build the analysis pass 
	# First update Makefile to make sure the path to the built LLVM is correct
	$ make 
	# Now, you can find the executable, `kanalyzer`, in `build/lib/`

Prepare LLVM bitcode files of OS kernels

  • First build IRDumper. Before running make, make sure the path to LLVM in IRDumper/Makefile is correct. It must be the same LLVM build that was used to build TypeDive
  • See irgen.py for details on how to generate bitcode/IR

Run TypeDive

	# To analyze a list of bitcode files, put the absolute paths of the bitcode files in a file, say "bc.list", then run:
	$ ./build/lib/kanalyzer @bc.list
	# Results will be printed out, or you can get the results from the map `Ctx->Callees`.
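One possible way to produce such a list is sketched below in C++17: it walks a directory tree and writes the absolute path of every `.bc` file to a list file. The helper names `collectBitcode` and `writeBitcodeList` are hypothetical, and the one-path-per-line layout is an assumption, not something this README specifies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Collect the absolute paths of all regular *.bc files under `root`,
// sorted so that repeated runs produce the same list.
std::vector<std::string> collectBitcode(const fs::path &root) {
    std::vector<std::string> paths;
    for (const fs::directory_entry &entry : fs::recursive_directory_iterator(root)) {
        if (entry.is_regular_file() && entry.path().extension() == ".bc")
            paths.push_back(fs::absolute(entry.path()).string());
    }
    std::sort(paths.begin(), paths.end());
    return paths;
}

// Write one absolute path per line (assumed layout) and return the count.
std::size_t writeBitcodeList(const fs::path &root, const fs::path &listFile) {
    assert(fs::is_directory(root) && "root must be an existing directory");
    std::vector<std::string> paths = collectBitcode(root);
    std::ofstream out(listFile);
    for (const std::string &p : paths)
        out << p << '\n';
    return paths.size();
}
```

Calling `writeBitcodeList("/path/to/bitcode", "bc.list")` from a small driver program would then produce a file that can be passed as `@bc.list`; the directory path here is a placeholder.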

Configurations

  • Config options can be found in Config.h
	# If precision is the priority, you can comment out `SOUND_MODE`
	# `SOURCE_CODE_PATH` should point to the source code 

More details

@inproceedings{mlta-ccs19,
  title        = {{Where Does It Go? Refining Indirect-Call Targets with Multi-Layer Type Analysis}},
  author       = {Kangjie Lu and Hong Hu},
  booktitle    = {Proceedings of the 26th ACM Conference on Computer and Communications Security (CCS)},
  month        = November,
  year         = 2019,
  address      = {London, UK},
}

mlta's People

Contributors

huhong789, kengiter

mlta's Issues

Hidden requirement to g++-10?

I noticed that the kanalyzer build requires the C++ header files which can be found in g++-10. In my case, installing g++-10 helped me pass that stage. Probably it was a missing requirement when TypeDive was upgraded to LLVM-15.

sudo apt install g++-10

Hope it helps somebody having the same problem.

Spell error

The variable CUR_DIR is referred to as CURDIR in many cases.

Potential false negative problem in type confinement

First, I would like to express my respect for this work. While reviewing the MLTA source code and some test cases, I found a case that may cause MLTA to produce more false negatives than FLTA.

Below is an example from the dovecot project. iostream_pump_flush is an address-taken function passed as an argument in the call o_stream_set_flush_callback(pump->output, iostream_pump_flush, pump);.

I noticed that MLTA does handle the case where an address-taken function is used as a call argument for type confinement. In this case, however, the call chain contains an indirect call: _stream->set_flush_callback(_stream, callback, context); calls o_stream_default_set_flush_callback, and the type confinement happens inside o_stream_default_set_flush_callback. Will this lead to insufficient confinement of iostream_pump_flush to the field ostream_private::callback?

void iostream_pump_start(struct iostream_pump *pump)
{
	i_assert(pump != NULL);
	i_assert(pump->callback != NULL);

	/* add flush handler */
	if (!pump->output->blocking) {
		o_stream_set_flush_callback(pump->output,
					    iostream_pump_flush, pump);
	}

	/* make IO objects */
	if (pump->input->blocking) {
		i_assert(!pump->output->blocking);
		o_stream_set_flush_pending(pump->output, TRUE);
	} else {
		pump->io = io_add_istream(pump->input,
					  iostream_pump_copy, pump);
		io_set_pending(pump->io);
	}
}


void o_stream_set_flush_callback(struct ostream *stream,
				 stream_flush_callback_t *callback,
				 void *context)
{
	struct ostream_private *_stream = stream->real_stream;

	_stream->set_flush_callback(_stream, callback, context);
}

// indirect invoked by _stream->set_flush_callback
static void
o_stream_default_set_flush_callback(struct ostream_private *_stream,
				    stream_flush_callback_t *callback,
				    void *context)
{
	if (_stream->parent != NULL)
		o_stream_set_flush_callback(_stream->parent, callback, context);

	_stream->callback = callback;
	_stream->context = context;
}

can't find `kanalyzer` in `build/lib/` after `make`

The output of make:
(mkdir -p /data/syx/cfg/mlta/IRDumper/build && cd /data/syx/cfg/mlta/IRDumper/build && PATH=/data/syx/cfg/mlta/IRDumper/../llvm-project/prefix/bin:/data/syx/env/miniconda3/bin:/data/syx/env/miniconda3/condabin:/data/syx/.vscode-server-insiders/cli/servers/Insiders-11bfd76a61a299156a9f3138ecfad70937af3527/server/bin/remote-cli:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/local/cuda/bin:/opt/pycharm-community-2020.2.3/bin:/snap/bin LLVM_ROOT_DIR=/data/syx/cfg/mlta/IRDumper/../llvm-project/prefix/bin LLVM_LIBRARY_DIRS=/data/syx/cfg/mlta/IRDumper/../llvm-project/prefix/lib LLVM_INCLUDE_DIRS=/data/syx/cfg/mlta/IRDumper/../llvm-project/prefix/include CC=clang CXX=clang++ cmake /data/syx/cfg/mlta/IRDumper/src -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS_RELEASE="-std=c++14 -fno-rtti -fpic -O3 -v" && make -j)
-- Found LLVM 15.0.0git
-- Using LLVMConfig.cmake in: /data/syx/cfg/mlta/llvm-project/prefix/lib/cmake/llvm
-- Configuring done
-- Generating done
-- Build files have been written to: /data/syx/cfg/mlta/IRDumper/build
make[1]: Entering directory '/data/syx/cfg/mlta/IRDumper/build'
make[2]: Entering directory '/data/syx/cfg/mlta/IRDumper/build'
make[3]: Entering directory '/data/syx/cfg/mlta/IRDumper/build'
make[3]: Leaving directory '/data/syx/cfg/mlta/IRDumper/build'
[ 33%] Built target DumperObj
make[3]: Entering directory '/data/syx/cfg/mlta/IRDumper/build'
make[3]: Entering directory '/data/syx/cfg/mlta/IRDumper/build'
make[3]: Leaving directory '/data/syx/cfg/mlta/IRDumper/build'
make[3]: Leaving directory '/data/syx/cfg/mlta/IRDumper/build'
[ 66%] Built target DumperStatic
[100%] Built target Dumper
make[2]: Leaving directory '/data/syx/cfg/mlta/IRDumper/build'
make[1]: Leaving directory '/data/syx/cfg/mlta/IRDumper/build'

Questions about some logic in the source code

https://github.com/umnsec/mlta/blob/acb8f4ca60cbae108f077202985c059f40391bc4/src/lib/MLTA.cc#L490C1-L496C7
First of all, I would like to express my respect for your work. While reading the MLTA source code, I came across a question about this part.
This logic seems to handle callback function types. Should the first argument of confineTargetFunction here be the address that the StoreInst stores to, rather than the entire instruction?
