cuckoo-osx-analyzer's Issues

autoprobes.py => join similar probes together

For example, probes for functions with the same signature can be joined together like this:

/* Both foo() and bar() have this signature: (int, char *) -> int */
pid$target::foo:entry,
pid$target::bar:entry
{
    /* ... */
}

pid$target::foo:return,
pid$target::bar:return
{
    /* ... */
    printf(/* one format string for the shared (int, char *) -> int signature */);
    /* ... */
}
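
On the generator side, autoprobes.py could group API descriptions by signature before emitting the probe clauses. Here is a minimal sketch, assuming a hypothetical description format with "name", "args" and "retval_type" keys:

from collections import defaultdict

def group_by_signature(apis):
    """Group API descriptions so that functions sharing a signature
    end up in a single probe clause."""
    groups = defaultdict(list)
    for api in apis:
        signature = (tuple(api["args"]), api["retval_type"])
        groups[signature].append(api["name"])
    return groups

apis = [
    {"name": "foo", "args": ["int", "char *"], "retval_type": "int"},
    {"name": "bar", "args": ["int", "char *"], "retval_type": "int"},
]

for (args, retval), names in group_by_signature(apis).items():
    header = ",\n".join("pid$target::%s:entry" % name for name in names)
    print(header + "\n{\n    /* ... */\n}\n")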

Add support for struct arguments in dtrace probes

It would be nice to be able to see the contents of any structure, not just its address.


Consider the following API:

typedef struct {
    int hash;
    const char *definition;
} entry_t;


char *copy_entry_definition(entry_t *entry, int len);

One should be able to describe the entry_t type in the API definition (or somewhere globally) like so:

"entry_t" : [
    {"field_name": "hash",       "field_type": "int"},
    {"field_name": "definition", "field_type": "string"},
]

and receive a report similar to this from Cuckoo:

          API           |                 Arguments                  |   Status
------------------------|--------------------------------------------|-----------
copy_entry_definition   | entry: {hash => 971, definition => "hey"}  | SUCCEED
                        | len: 128                                   |
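
For illustration, the probe generator could translate such a description into the D typedef it would need before copying the struct in from the target's address space. A hypothetical sketch (the field-type mapping below is an assumption, not existing code):

# Map description types to the D types the generated script would use.
D_TYPES = {"int": "int", "string": "uintptr_t /* user-space char * */"}

def d_typedef(name, fields):
    lines = ["    %s %s;" % (D_TYPES[f["field_type"]], f["field_name"])
             for f in fields]
    return "typedef struct {\n%s\n} %s;\n" % ("\n".join(lines), name)

print(d_typedef("entry_t", [
    {"field_name": "hash",       "field_type": "int"},
    {"field_name": "definition", "field_type": "string"},
]))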

dtruss tests fail sometimes

Things break and I don't know why. It happens only on Travis, not on my machine, so it's hard to debug. I also don't think fixing it is important right now, since we don't use dtruss for analysis (yet?).

Thus I guess this issue should be marked with maybe or even wontfix until we need dtruss to do something real for us.

Bootstrapping

Need a bootstrap script that would integrate the analyser into an existing/new Cuckoo Sandbox repository clone automatically.

Here're the steps required so far:

# $ git clone https://github.com/cuckoobox/cuckoo/ ~/projects/cuckoo
$ cd ./cuckoo-osx-analyzer
$ ln -s "$(pwd)/analyzer/darwin" ~/projects/cuckoo/analyzer/  # absolute source path so the symlink doesn't dangle
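
A minimal sketch of what the bootstrap script could automate, shown in Python for illustration (the real script would probably be shell; the paths are the ones from the example above):

import os

# Symlink this repo's Darwin analyzer into an existing Cuckoo Sandbox clone.
analyzer_src = os.path.abspath("./analyzer/darwin")
cuckoo_repo  = os.path.expanduser("~/projects/cuckoo")
link_path    = os.path.join(cuckoo_repo, "analyzer", "darwin")

if not os.path.islink(link_path):
    # An absolute source path keeps the symlink from dangling.
    os.symlink(analyzer_src, link_path)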

Generate dtrace probes from API descriptions

There's a lot of boilerplate code required for installing a single probe; here's an example for system():

pid$target::system:entry
{
    self->deeplevel++;
    /* Save the arguments already stored here for our caller */
    self->arguments_stack[self->deeplevel, "arg0"] = self->arg0;
    /* And remember our own arguments */
    self->arg0 = arg0;
}

pid$target::system:return
{
    this->retval = arg1;
    this->timestamp_ms = walltimestamp/1000000;

    printf("{\"api\":\"%s\", \"args\":[\"%S\"], \"retval\":%d, \"timestamp\":%ld, \"pid\":%d, \"ppid\":%d, \"tid\":%d}\n",
        probefunc,
        copyinstr(self->arg0),
        (int)this->retval,
        this->timestamp_ms, pid, ppid, tid);

    /* Restore our caller's arguments */
    self->arg0 = self->arguments_stack[self->deeplevel, "arg0"];
    /* Release the memory for the current level stack */
    self->arguments_stack[self->deeplevel, "arg0"] = 0;
    --self->deeplevel;
}

So instead of copy-pasting this code manually, we should have an automatic probe generator.

We could use JSON for storing the descriptions, since it's readable by both humans and machines 🎱
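
A minimal sketch of what such a generator might look like; the JSON layout and the template below are hypothetical, and only the entry clause is shown (the return clause would be generated the same way):

import json

description = json.loads("""
{
    "api": "system",
    "args": ["string"],
    "retval_type": "int"
}
""")

def entry_probe(desc):
    # One save/remember pair per declared argument, mirroring the boilerplate above.
    save = "\n".join(
        '    self->arguments_stack[self->deeplevel, "arg%d"] = self->arg%d;' % (i, i)
        for i in range(len(desc["args"])))
    remember = "\n".join(
        "    self->arg%d = arg%d;" % (i, i)
        for i in range(len(desc["args"])))
    return ("pid$target::%s:entry\n{\n    self->deeplevel++;\n%s\n%s\n}\n"
            % (desc["api"], save, remember))

print(entry_probe(description))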

bootstrap_host.sh => make sure a host-only adapter (e.g. vboxnet0) is up and running

If it's not, you'll receive the following error from pfctl:

no IP address found for vboxnet0:network
./pfrules:1: could not parse host specification

Cuckoo itself also won't be able to run:

$ ./cuckoo.py
2015-07-11 01:44:24,398 [root] CRITICAL: CuckooCriticalError: Unable to bind ResultServer on 192.168.56.1:2042: [Errno 49] Can't assign requested address

The fix is described, for example, here and here, so we can include it right into the script.
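
A sketch of the kind of check the script could run before loading the pf rules, written in Python for illustration; the VBoxManage remedy is an assumption based on the fixes linked above:

import subprocess

def hostonly_adapter_ready(name="vboxnet0"):
    """Return True if the host-only adapter exists and has an IPv4 address."""
    try:
        output = subprocess.check_output(["ifconfig", name])
    except subprocess.CalledProcessError:
        return False            # the interface doesn't exist at all
    return b"inet " in output

if not hostonly_adapter_ready():
    # Assumed remedy: (re)create and configure the adapter via VBoxManage
    # before pfctl and Cuckoo's ResultServer try to bind to 192.168.56.1.
    subprocess.check_call(["VBoxManage", "hostonlyif", "create"])
    subprocess.check_call(["VBoxManage", "hostonlyif", "ipconfig", "vboxnet0",
                           "--ip", "192.168.56.1", "--netmask", "255.255.255.0"])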

Send analysis results to host [BSON]

Since we're using dtrace for analysis now, we have to assemble and upload the results from the analyzer itself. I guess there's a Python library for handling BSON (e.g. PyMongo ships a bson module).


P.S. I've been working on the analyser here, but it's not complete yet.
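
A minimal sketch of the sending side, assuming PyMongo's bson module is available in the guest and that the host's ResultServer (192.168.56.1:2042 in the error shown earlier) accepts raw BSON documents; the real ResultServer protocol may require extra framing:

import socket
from bson import BSON   # ships with PyMongo

def send_api_call(sock, call):
    # "call" is assumed to be a dict parsed from the dtrace output, e.g.
    # {"api": "system", "args": ["whoami"], "retval": 0, "pid": 1234}
    sock.sendall(BSON.encode(call))

sock = socket.create_connection(("192.168.56.1", 2042))
send_api_call(sock, {"api": "system", "args": ["whoami"], "retval": 0, "pid": 1234})
sock.close()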

package.py -- Unable to import package

Hi,

I am analyzing a Mach-O file, but it shows an error: "unable to import package, package doesn't exist".

Kindly tell me how to resolve this issue.

Thanks & Regards
Sean

bootstrap_guest.sh

Hi,
Two questions:

  1. Where should bootstrap_guest.sh be run? I ran it on my Mac mini host and it broke my network connectivity, as it assigned a 192.x address to eth0.
  2. I get a timeout error: "Error: OSX108: the guest initialization hit the critical timeout, analysis aborted." It looks like it's due to some network issue while detonating the sample in the OS X VM. Any suggestions?

TIA,
krishna

apicalls.d doesn't follow *all* children, only fork‘ed ones

There's a bug in my implementation of apicalls.d that prevents it from following exec’ed children 👎

On the other hand, if they were fork’ed, it works like a charm.


I guess the reason is that we attach dtrace way too early, when shared libraries are not yet loaded into the child's address space, and thus probes from the pid provider cannot be installed on them.

bootstrap_guest.sh => disable system integrity protection on OS X 10.11

There's a new thing in OS X 10.11 called SIP (system integrity protection) aka «Rootless». Basically it takes all privileges away from root: you can no longer write to protected system locations, modify system files and so on.

That's no good for us. Right now we have the ability to disable it on 10.11 machines with a new boot argument:

$ sudo nvram boot-args="rootless=0"

Apple may remove this argument in the release version of the OS (and they are going to). Let's just hope there will be a workaround we can use.

Resources

Tracing calls to APIs from libraries loaded at runtime

The problem is that the dtrace pid provider can't enable probes on functions that aren't loaded at program startup. This basically means that if a target process loads a library at runtime (e.g. via dlopen()), we won't be able to install probes on any symbol from that library.

Demo

// clang -o demo demo.c
#include <dlfcn.h>
int main(int argc, char const *argv[])
{
    void *h = dlopen("/usr/lib/libThaiTokenizer.dylib", RTLD_NOW);
    int (*isThai)(int) = (int (*)(int))dlsym(h, "isThai");
    int x = isThai(123);
    return 0;
}

Compile and try attaching dtrace to it:

sudo dtrace -n 'pid$target::isThai:entry { printf("isThai()"); }' -c ./demo

The following error will occur:

dtrace: invalid probe specifier pid::isThai:entry { printf("isThai()"); }: probe description pid::isThai:entry does not match any probes

OverflowError: MongoDB can only handle up to 8-byte ints

From my understanding, there's a problem with huge(?) argument values for some APIs. We need to review apicalls.d for invalid format specifiers and the like.

Here's an error log:

2015-06-14 04:33:42,723 [root] DEBUG: Starting analyzer from /Users/bigmike/tlihl
2015-06-14 04:33:42,723 [root] DEBUG: Storing results at: /var/folders/75/s_0qg5y16_3dmhs1_fxnn0wh0000gn/T/PYzMFCoV
2015-07-16 03:04:23,542 [modules.packages.zip] DEBUG: Missing file option, auto executing: CCMenu.app/
2015-07-16 03:04:23,588 [modules.packages.zip] DEBUG: Analysing file "CCMenu.app/" using package "App"
2015-07-16 03:04:43,081 [root] ERROR: Traceback (most recent call last):
  File "/Users/bigmike/tlihl/analyzer.py", line 119, in <module>
    success = analyzer.run()
  File "/Users/bigmike/tlihl/analyzer.py", line 48, in run
    self._analysis(package)
  File "/Users/bigmike/tlihl/analyzer.py", line 108, in _analysis
    package.start()
  File "/Users/bigmike/tlihl/modules/packages/zip.py", line 60, in start
    self.real_package.start()
  File "/Users/bigmike/tlihl/lib/core/packages.py", line 101, in start
    self.apicalls_analysis()
  File "/Users/bigmike/tlihl/lib/core/packages.py", line 112, in apicalls_analysis
    self.host.send_api(call)
  File "/Users/bigmike/tlihl/lib/core/host.py", line 85, in send_api
    "args" : self._prepare_args(thing)
  File "/Library/Python/2.7/site-packages/pymongo-3.0.2-py2.7-macosx-10.8-intel.egg/bson/__init__.py", line 888, in encode
    return cls(_dict_to_bson(document, check_keys, codec_options))
OverflowError: MongoDB can only handle up to 8-byte ints
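
Until the format specifiers in apicalls.d are fixed, a defensive coercion on the Python side could look like this (a sketch; the helper names are hypothetical):

import numbers

# BSON's int64 tops out at 8 bytes, so stringify anything outside that range.
MIN_INT64, MAX_INT64 = -2**63, 2**63 - 1

def bson_safe(value):
    if isinstance(value, numbers.Integral) and not (MIN_INT64 <= value <= MAX_INT64):
        return str(value)   # keep the exact digits, lose the numeric type
    return value

def prepare_args(args):
    return [bson_safe(arg) for arg in args]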

Add timeout to dtrace wrappers

Since some targets may run forever, we need to set a timeout for the dtrace tracing process, so it will only run for X seconds and no longer.
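
A minimal sketch, assuming the wrappers launch dtrace via Python's subprocess module (the tracebacks elsewhere here suggest a Python 2.7 analyzer, which has no built-in wait timeout, so a timer kills the child):

import subprocess
import threading

def run_dtrace(cmd, timeout=60):
    """Run a dtrace command line and terminate it after `timeout` seconds."""
    proc = subprocess.Popen(cmd)
    timer = threading.Timer(timeout, proc.terminate)
    timer.start()
    try:
        return proc.wait()
    finally:
        timer.cancel()

# e.g. run_dtrace(["sudo", "dtrace", "-s", "apicalls.d", "-c", "./target"], timeout=120)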

Document Mac OS X Setup

In order to use the Mac OS X analyzer, one will have to set up a Mac OS X environment: either bare metal (running the analyses on an actual Mac mini or similar) or Mac OS X running inside a virtual machine.

Regardless of the approach taken, it'd be very, very useful if you could document the steps you have taken to set it up, as I'd very much like to do the same at some point.

dtrace scripting

Need to figure out how to script dtrace so it can catch the following:

  • syscalls

    see dtruss(1), errinfo(1)

  • disk I/O

    see iosnoop(1) (but it doesn't catch cached data, i.e. reads served from RAM); opensnoop(1) (for open());

  • internet connections

  • fork(), exec() and stuff like this

  • API calls

All built-in scripts

See the output of $ man -k dtrace

Further reading:

[Meta] The overview

Core Strategy

Preparation

  • set up directories to store the results in
  • set up logging?
  • Parse the passed config
    • set the VM clock
    • read the default injector lib path
  • remember the target path (file analysis) or URL (URL analysis)

Run

  • read an analysis package from the config or detect it automatically (e.g. based on the file name extension)
  • initialise all the auxiliary modules and launch them
  • run the chosen analysis package
    • it could return a list of PIDs that we’ll add to our watch list (we won’t stop the analysis while any of these processes is still alive)
    • or it might return nothing, and we’ll terminate the analysis when the timeout hits (read its value from the config)
  • (in case of injection) listen to the injected library for analysis updates
  • upload analysis artefacts to the host
  • terminate all of the aux modules
  • kill the spawned processes if needed

Completion

  • (in case of dtrace) parse the dtrace output (files?), make sense of it, and send the results to the host
  • dump files dropped to us from the host

Required Stuff

  • Choose a way the analyser will talk to an injected process and receive updates back

bootstrap_guest.sh => install an anti-anti-dtrace kernel module

Checklist:

Cut apicalls.d into pieces

apicalls.d is just too big and has lots of responsibilities: it handles both API probes we need for analysis and all the "follow-my-children" stuff.

What we need is to move the latter into a new script (think follow_children.d) which will be #included into the master script.

See also: separate-dtrace-scripts branch.
