Operation Mango [Paper PDF]

Fast taint-style static analysis based vulnerability discovery

Common vulnerability discovery techniques all follow a top-down scheme: they start from the entry point of the target program, reach as deep as possible, and examine all encountered program states against a set of security violations. These techniques are all limited by the complexity of programs and are prone to the path/state explosion problem.

Alternatively, we can start from the locations where vulnerabilities might occur (vulnerability sinks), trace back, and verify whether the corresponding data flow may lead to a vulnerability. On top of this, we need “assumed execution”: when tracing back from a vulnerability sink to its sources, we do not faithfully execute or analyze every function on the path. Instead, we assume a data flow based on prior knowledge or on static analysis performed in advance, and skip as many functions as possible during back tracing.
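
To make the idea concrete, here is a toy sketch of sink-to-source back tracing with assumed data-flow summaries. It is an illustration only, not Mango's actual algorithm or data structures: the `DEFS`, `SUMMARIES`, and `SOURCES` tables and the `trace_back` helper are all hypothetical names invented for this example.

```python
# Toy illustration only: not Mango's real implementation.
# Each variable is defined by one call: var -> (function name, argument variables).
DEFS = {
    "cmd": ("sprintf", ["fmt", "user"]),
    "user": ("nvram_get", ["key"]),
    "fmt": ("const", []),
    "key": ("const", []),
}

# Assumed summaries: which argument positions flow into the result.
# A summarized function is never analyzed; its data flow is simply assumed.
SUMMARIES = {
    "sprintf": [0, 1],
    "const": [],
}

# Functions treated as attacker-influenced input sources in this toy model.
SOURCES = {"nvram_get", "recv", "getenv"}


def trace_back(var, defs=DEFS, seen=None):
    """Return the set of source functions whose data may reach `var`."""
    seen = set() if seen is None else seen
    if var in seen or var not in defs:
        return set()
    seen.add(var)
    func, args = defs[var]
    if func in SOURCES:
        return {func}
    found = set()
    # Without a summary, conservatively follow every argument.
    for idx in SUMMARIES.get(func, range(len(args))):
        if idx < len(args):
            found |= trace_back(args[idx], defs, seen)
    return found


if __name__ == "__main__":
    # Start at the argument of a hypothetical system(cmd) sink and walk backwards.
    print(trace_back("cmd"))
```

Starting from the sink argument `cmd`, the walk only visits definitions that can influence it, and `sprintf` is skipped over via its summary rather than analyzed.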

Check out our experiment reproduction section to reproduce all the figures found in the paper.

Getting Started

There are several ways to run Operation Mango.

Docker

Bypass all this nonsense and just use the container.

Tip

Don't forget to add volumes with -v for both the binary directory and the results directory.

docker run -it cl4sm/operation-mango:latest
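
As a sketch of the tip above, the helper below builds (but does not run) a `docker run` argument list with two `-v` volumes. The container-side mount points `/analysis/bin` and `/analysis/results` and the helper name `docker_run_args` are assumptions made for illustration; the image does not require any particular paths as far as this README states.

```python
from pathlib import Path


def docker_run_args(binary_dir, results_dir,
                    image="cl4sm/operation-mango:latest",
                    container_bin="/analysis/bin",          # hypothetical mount point
                    container_res="/analysis/results"):     # hypothetical mount point
    """Build a `docker run` argument list that mounts both directories."""
    # Bind mounts need an absolute host path; a bare name would be
    # interpreted by docker as a named volume instead.
    host_bin = Path(binary_dir).resolve()
    host_res = Path(results_dir).resolve()
    return [
        "docker", "run", "-it",
        "-v", f"{host_bin}:{container_bin}",
        "-v", f"{host_res}:{container_res}",
        image,
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(" ".join(docker_run_args("./firmware_bins", "./mango_results")))
```

Inside the container, the binary and results paths passed to mango would then be the container-side paths rather than the host ones.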

Local

I highly recommend you create a separate Python virtualenv for this method.

python3 -m venv venv
source venv/bin/activate
git clone https://github.com/sefcom/operation-mango-public.git
cd operation-mango-public
pip install .

To build the Docker image locally:

cd operation-mango-public
docker build -f docker/Dockerfile . -t mango-user

Using Operation Mango

Once you install Operation Mango or use the docker container, you'll have access to two commands: mango and env_resolve.

mango

mango is your default command for running our basic taint analysis on binaries.

Tip

Using the --concise flag will significantly shrink the output size and speed up analysis.
It will not print the entire analysis taint trace in your results, but normally you won't need that.

mango /path/to/bin --results your_res_dir

will run the basic command injection taint analysis. Check out the --help flag for more options.

mango output structure

The output of this tool is fairly verbose. You'll be given the following:

Tip

Any value labeled TOP is unknown or unresolvable.

{category}_mango.out - The entire stdout/stderr of the mango run.
{category}_results.json - The JSON is structured as follows:

{
    "closures": [
        {
            "trace": {}, // Function trace starting from input down to sink
            "sink": {}, // Sink location
            "depth": int,
            "inputs": {
                "likely": [], // Functions that flow directly into the sink
                "possibly": [] // Functions seen along the way that are generally used as inputs
            },
            "rank": int // How confident we are that this function is a TruPoC
        },
    ],
    "cfg_time": float, // time it took to generate the cfg in seconds
    "vra_time": float, // time it took to run the variable recovery in seconds
    "mango_time": float, // time it took for actual mango analysis in seconds
    "path": str, // path to analyzed binary
    "name": str, // binary name
    "sha256": str, // sha256 of the file
    "error": str|None, // If an error occurred, it is printed here
    ... // Other timing info
}
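
As a minimal sketch of consuming this file, the snippet below loads a results JSON and lists closures from highest to lowest rank. It relies only on the `closures`, `rank`, and `error` keys shown above; the helper names `load_results` and `rank_closures` and the `min_rank` threshold are hypothetical, and the scale of `rank` is not assumed.

```python
import json


def load_results(path):
    """Read a {category}_results.json file into a dict."""
    with open(path) as fh:
        return json.load(fh)


def rank_closures(results, min_rank=0):
    """Return closures with rank >= min_rank, highest rank first."""
    closures = results.get("closures", [])
    kept = [c for c in closures if c.get("rank", 0) >= min_rank]
    return sorted(kept, key=lambda c: c.get("rank", 0), reverse=True)


if __name__ == "__main__":
    import sys

    results = load_results(sys.argv[1])
    if results.get("error"):
        print("analysis error:", results["error"])
    for closure in rank_closures(results):
        likely = closure.get("inputs", {}).get("likely", [])
        print(closure.get("rank"), closure.get("depth"), likely)
```

Run it with the path to a results file as the only argument, for example a file inside your_res_dir.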

{category}_closures/ - The folder containing the results of individual flows to the sink; all of these are unresolvable by our tool.
{category}_closures/0.{rank}_{entry_func@addr}_{sink_func@addr} - The individual closures, printed with extra information about likely and possible input sources,
e.g. 0.70_main_0x403e70_system_0x40143c:

|||system(
|||       a0: <BV32 0x4455f8> -> "<BV32 TOP>",
|||       ) @ 0x40143c -> <BV32 0x0>

INPUT SOURCES:
Likely:
NONE
Possibly:
----------
KEY: "accept(fd: 3)@0x403fb0_274_3"
Keywords: None
Binary Source - UNKNOWN
socket(AF_INET, SOCK_DGRAM, 0)_273_32
accept(fd: 3)@0x403fb0_274_32
recv(accept(fd: 3)@0x403fb0_274_32)@0x404088
----------
RANK: 0.700
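
If you want to sort or filter closure files by name, a small parser for the file name pattern can help. The sketch below is an assumption-based reading of the example name above: it treats the leading `0.{rank}` prefix as a decimal score and expects addresses in `0x` hex form. The `ClosureName` type and `parse_closure_name` function are hypothetical helpers, not part of the tool.

```python
import re
from typing import NamedTuple, Optional


class ClosureName(NamedTuple):
    rank: float
    entry_func: str
    entry_addr: int
    sink_func: str
    sink_addr: int


# 0.{rank}_{entry_func}_{addr}_{sink_func}_{addr}, with 0x-prefixed addresses.
_PATTERN = re.compile(
    r"^(?P<rank>\d+\.\d+)"
    r"_(?P<entry>.+?)_(?P<entry_addr>0x[0-9a-fA-F]+)"
    r"_(?P<sink>.+)_(?P<sink_addr>0x[0-9a-fA-F]+)$"
)


def parse_closure_name(name: str) -> Optional[ClosureName]:
    """Split a closure file name into its parts, or return None if it does not match."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    return ClosureName(
        rank=float(match["rank"]),
        entry_func=match["entry"],
        entry_addr=int(match["entry_addr"], 16),
        sink_func=match["sink"],
        sink_addr=int(match["sink_addr"], 16),
    )


if __name__ == "__main__":
    print(parse_closure_name("0.70_main_0x403e70_system_0x40143c"))
```

Function names containing underscores are handled by anchoring on the `0x` addresses rather than splitting on `_`.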

execv.json - This is mostly unused but should contain info about which other processes this binary tries to execute.

env_resolve

env_resolve performs a taint analysis of a given binary to find all uses of env and nvram variables. This is what enables our cross-binary bug finding.

env_resolve /path/to/bin --results your_res_dir

The output of this tool will be found at your_res_dir/env.json.
To feed this info into mango, merge all of the env.json files together (even if there is only one) with:

env_resolve /path/to/bin --results your_res_dir/env.json --merge

This will spit out the file your_res_dir/env.json.
Then feed it into mango.

mango /path/to/bin --env-dict your_res_dir/env.json --results your_res_dir
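
The three steps above can be scripted. The sketch below only assembles the command lines already shown in this section and optionally runs them; the `env_workflow` and `run_workflow` helpers are hypothetical, and no flags beyond `--results`, `--merge`, and `--env-dict` are assumed.

```python
import subprocess


def env_workflow(binary, results_dir):
    """Return the three commands from this section as argument lists."""
    env_json = f"{results_dir}/env.json"
    return [
        # 1. Collect env/nvram uses for the binary.
        ["env_resolve", binary, "--results", results_dir],
        # 2. Merge the env.json files (even if there is only one).
        ["env_resolve", binary, "--results", env_json, "--merge"],
        # 3. Feed the merged dictionary into mango.
        ["mango", binary, "--env-dict", env_json, "--results", results_dir],
    ]


def run_workflow(binary, results_dir):
    """Run each step in order, stopping at the first failure."""
    for cmd in env_workflow(binary, results_dir):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing executes by accident.
    for cmd in env_workflow("/path/to/bin", "your_res_dir"):
        print(" ".join(cmd))
```

Call `run_workflow` only in an environment where mango and env_resolve are installed, such as the virtualenv or container described earlier.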

env_resolve output structure

The env.json output from env_resolve follows the format of the results.json that mango outputs, e.g.:

{
    "results": {
        "func_name": // e.g. nvram_get
            {
                "key_name": // key used to retrieve the value, e.g. "http_passwd"
                {
                    "keywords": str, // Any frontend keywords used to retrieve this value
                    "1":  // position of the argument starting from "1" (i know...)
                    {
                        "arg_value": [ // arg value, in the case of getter funcs it's always the key name.
                            "0xaddr" // addr where the value is used
                        ]
                    }
                }
            }
    },
    "cfg_time": float, // time it took to generate the cfg in seconds
    "vra_time": float, // time it took to run the variable recovery in seconds
    "analysis_time": float, // time it took for actual mango analysis in seconds
    "path": str, // path to analyzed binary
    "name": str, // binary name
    "sha256": str, // sha256 of the file
    "error": str|None, // If an error occurred, it is printed here
    ... // Other timing info
}
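
The nesting above is easier to work with once flattened. The sketch below walks the `results` object and yields one tuple per use site; it assumes argument positions are numeric strings such as "1" and skips any other key (including `keywords`). The `iter_env_uses` helper and the sample addresses are hypothetical.

```python
def iter_env_uses(env):
    """Yield (func_name, key_name, arg_position, arg_value, addr) tuples."""
    for func_name, keys in env.get("results", {}).items():
        for key_name, entry in keys.items():
            for position, values in entry.items():
                # Only numeric keys are argument positions; "keywords" is skipped.
                if not position.isdigit():
                    continue
                for arg_value, addrs in values.items():
                    for addr in addrs:
                        yield func_name, key_name, int(position), arg_value, addr


if __name__ == "__main__":
    # Hand-written sample in the documented layout; the addresses are made up.
    sample = {
        "results": {
            "nvram_get": {
                "http_passwd": {
                    "keywords": None,
                    "1": {"http_passwd": ["0x401000", "0x401200"]},
                }
            }
        }
    }
    for use in iter_env_uses(sample):
        print(use)
```

To use it on real output, load your_res_dir/env.json with `json.load` and pass the resulting dict to `iter_env_uses`.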

Firmware Cross Binary and Frontend Keyword Bug Finding

If you're trying to find bugs in firmware samples as described in our paper, have a look at the mango_pipeline.
For further examples of how to use it, check out the Experiment Replication section.

Testing

# run all the tests for the developed features (isolated in the `package` module)
pip install pytest-cov
pytest

Handcrafted binaries

To ease testing, we crafted small binaries highlighting one (or several) case(s) we wanted to be able to handle properly. It was particularly helpful to drive the development of the Handlers.

They are located under the package/tests/binaries/ folder.

Binary - Description
after_values/program - Contains multiple calls to a sink in a single function.
layered/program - Nested calls running more than the default 7-depth limit before reaching the sink.
looper/program - Runs a loop before reaching a sink.
off_shoot/program - Calls multiple functions that alter the input in sub functions before reaching the sink.
recursive/program - Contains direct and indirect recursive calls (highlights the flaw of unresolvable call depth).
nested/program - Nested calls and returns before reaching a sink.
simple/program - Contains a call to the external function puts. Runs through nested functions, leading to different sinks (execve, system).
sprintf_resolved_and_unresolved/program - Contains two calls to system: one with constant data, the other with data that could be influenced by the program user.

To ensure reproducibility of testing, the binaries have been added to the repository. If you want to add a new one, a Makefile is provided for convenience.

# build some homemade light binaries
cd binaries/ && make && cd -

operation-mango-public's Issues

Mango fails to start with "Unable to find a loader backend" error

While trying to run Mango against a firmware image (Netgear R7000 router) it fails during startup with the following error:

root@28662db50d47:/operation-mango# mango /analysis/R7000-V1.0.11.216_10.2.122.chk --results /analysis/R7000 --concise
Traceback (most recent call last):
  File "/usr/local/bin/mango", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/operation-mango/package/argument_resolver/analysis/mango.py", line 767, in main
    analyzer = MangoAnalysis(**args.__dict__)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/operation-mango/package/argument_resolver/analysis/mango.py", line 41, in __init__
    super().__init__(*args, **kwargs)
  File "/operation-mango/package/argument_resolver/analysis/base.py", line 150, in __init__
    self.project = self.init_analysis(
                   ^^^^^^^^^^^^^^^^^^^
  File "/operation-mango/package/argument_resolver/analysis/base.py", line 191, in init_analysis
    project = angr.Project(self.bin_path, auto_load_libs=False)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/angr/lib/python3.11/site-packages/angr/project.py", line 147, in __init__
    self.loader = cle.Loader(self.filename, concrete_target=concrete_target, **load_options)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/angr/lib/python3.11/site-packages/cle/loader.py", line 188, in __init__
    self.initial_load_objects = self._internal_load(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/angr/lib/python3.11/site-packages/cle/loader.py", line 782, in _internal_load
    obj = self._load_object_isolated(main_spec)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/angr/lib/python3.11/site-packages/cle/loader.py", line 978, in _load_object_isolated
    raise CLECompatibilityError(
cle.errors.CLECompatibilityError: Unable to find a loader backend for /analysis/R7000-V1.0.11.216_10.2.122.chk.  Perhaps try the 'blob' loader?

Your paper mentions "Mango takes a firmware image as input, uses existing tools (such as binwalk [2]) to unpack the firmware sample, and finds all ELF executables" so I am assuming I don't need to unpack the image first, correct? The firmware image is fairly standard:

emba@emba:~$ binwalk R7000-V1.0.11.216_10.2.122.chk

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
58            0x3A            TRX firmware header, little endian, image size: 30666752 bytes, CRC32: 0xF6BF3E6A, flags: 0x0, version: 1, header size: 28 bytes, loader offset: 0x1C, linux kernel offset: 0x21E5A8, rootfs offset: 0x0
86            0x56            LZMA compressed data, properties: 0x5D, dictionary size: 65536 bytes, uncompressed size: 5436480 bytes
2221538       0x21E5E2        Squashfs filesystem, little endian, version 4.0, compression:xz, size: 28440547 bytes, 1868 inodes, blocksize: 131072 bytes, created: 2024-01-04 03:28:17

emba@emba:~$

I used the Docker image method from your documentation, I didn't build it by hand.

Thanks in advance!
