rodionovd / cuckoo-osx-analyzer
An OS X analyzer for Cuckoo Sandbox project
License: MIT License
For example, probes with the same signatures can be joined together in the following way:
/* Both foo() and bar() have this signature: (int, char *) -> int */
pid$target::foo:entry,
pid$target::bar:entry
{
    /* ... */
}

pid$target::foo:return,
pid$target::bar:return
{
    /* ... */
    printf(/*(int, char *) -> int*/);
    /* ... */
}
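A minimal Python sketch of how such joining could be automated. The input format (a list of dicts with `name`, `args` and `retval` keys) and the function name `join_probes_by_signature` are assumptions made for illustration; they are not part of the analyzer.

```python
from collections import defaultdict


def join_probes_by_signature(apis):
    """Emit one entry clause per distinct signature.

    `apis` is a hypothetical list of dicts such as
    {"name": "foo", "args": ["int", "char *"], "retval": "int"}.
    """
    groups = defaultdict(list)
    for api in apis:
        # Functions with identical argument and return types share a clause.
        groups[(tuple(api["args"]), api["retval"])].append(api["name"])

    clauses = []
    for (args, retval), names in groups.items():
        # Probe descriptions are joined with commas, one per line.
        header = ",\n".join("pid$target::%s:entry" % n for n in names)
        comment = "/* (%s) -> %s */" % (", ".join(args), retval)
        clauses.append("%s\n{\n    %s\n}" % (header, comment))
    return "\n".join(clauses)


if __name__ == "__main__":
    print(join_probes_by_signature([
        {"name": "foo", "args": ["int", "char *"], "retval": "int"},
        {"name": "bar", "args": ["int", "char *"], "retval": "int"},
    ]))
```

The same grouping would apply to the `return` clauses; only the probe name suffix changes.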
It would be nice to be able to see the contents of any structure, not just its address.
Consider the following API:
typedef struct {
    int hash;
    const char *definition;
} entry_t;

char *copy_entry_definition(entry_t *entry, int len);
One should be able to describe the entry_t type in the API definition (or somewhere globally) like so:
"entry_t" : [
    {"field_name": "hash", "field_type": "int"},
    {"field_name": "definition", "field_type": "string"}
]
and receive a report similar to this from Cuckoo:
API                    | Arguments                                  | Status
-----------------------|--------------------------------------------|------------
copy_entry_definition  | entry: {hash => 971, definition => "hey"}  | SUCCEED
                       | len: 128                                   |
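As a rough sketch of the reporting side, the Python snippet below renders a structure from a type description like the one proposed above. It assumes the field values have already been read from the target process and are available as a dict; the names `TYPES` and `render_struct` are hypothetical.

```python
import json

# Type descriptions in the proposed format (keyed by type name).
TYPES = json.loads("""
{
  "entry_t": [
    {"field_name": "hash", "field_type": "int"},
    {"field_name": "definition", "field_type": "string"}
  ]
}
""")


def render_struct(type_name, values, types=TYPES):
    """Format already-decoded field values in description order."""
    parts = []
    for field in types[type_name]:
        name = field["field_name"]
        value = values[name]
        if field["field_type"] == "string":
            rendered = '"%s"' % value  # strings are shown quoted
        else:
            rendered = str(value)
        parts.append("%s => %s" % (name, rendered))
    return "{%s}" % ", ".join(parts)


if __name__ == "__main__":
    print(render_struct("entry_t", {"hash": 971, "definition": "hey"}))
```

Reading the fields out of the traced process (for example with `copyin()` in the D script) is the harder part and is not covered here.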
Things break and I don't know why. It happens only on Travis, not on my machine, so it's hard to debug. I also don't think that fixing it is important right now, since we don't use dtruss for analysis (yet?).
Thus I guess this issue should be marked with maybe or even wontfix until we need dtruss to do something real for us.
Need a bootstrap script that would integrate the analyzer into an existing/new Cuckoo Sandbox repository clone automatically.
Here are the steps required so far:
# $ git clone https://github.com/cuckoobox/cuckoo/ ~/projects/cuckoo
$ cd ./cuckoo-osx-analyzer
$ ln -s "$(pwd)/analyzer/darwin" ~/projects/cuckoo/analyzer/
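A possible starting point for such a bootstrap script, sketched in Python. The function name `link_analyzer` is hypothetical, and the sketch assumes the directory layout shown in the steps above (`analyzer/darwin` in this repository, `analyzer/` in the Cuckoo clone). It uses an absolute path as the link target so the link resolves no matter where it is created.

```python
import os


def link_analyzer(analyzer_repo, cuckoo_repo):
    """Symlink <analyzer_repo>/analyzer/darwin into <cuckoo_repo>/analyzer/."""
    source = os.path.abspath(os.path.join(analyzer_repo, "analyzer", "darwin"))
    target_dir = os.path.join(cuckoo_repo, "analyzer")
    link = os.path.join(target_dir, "darwin")

    if not os.path.isdir(source):
        raise IOError("analyzer sources not found at %s" % source)
    if not os.path.isdir(target_dir):
        raise IOError("%s does not look like a Cuckoo clone" % cuckoo_repo)

    if os.path.islink(link):
        os.remove(link)  # replace a stale link left by a previous run
    os.symlink(source, link)
    return link


if __name__ == "__main__":
    print(link_analyzer(".", os.path.expanduser("~/projects/cuckoo")))
```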
There's a delay of about 30 seconds between starting the analysis and the moment when the application starts launching (Dock icon appears, windows open, etc).
There's a lot of boilerplate code required for installing a single probe; here's an example for system():
pid$target::system:entry
{
    self->deeplevel++;
    /* Save the arguments we've already got for our callee */
    self->arguments_stack[self->deeplevel, "arg0"] = self->arg0;
    /* And remember our own arguments */
    self->arg0 = arg0;
}

pid$target::system:return
{
    this->retval = arg1;
    this->timestamp_ms = walltimestamp/1000000;
    printf("{\"api\":\"%s\", \"args\":[\"%S\"], \"retval\":%d, \"timestamp\":%ld, \"pid\":%d, \"ppid\":%d, \"tid\":%d}\n",
        probefunc,
        copyinstr(self->arg0),
        (int)this->retval,
        this->timestamp_ms, pid, ppid, tid);
    /* Restore arguments for our callee */
    self->arg0 = self->arguments_stack[self->deeplevel, "arg0"];
    /* Release the memory for the current level stack */
    self->arguments_stack[self->deeplevel, "arg0"] = 0;
    --self->deeplevel;
}
So instead of copy-pasting this code manually we must have an automatic probe generator.
We may use JSON for storing descriptions as it's understandable for both humans and computers 🎱
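To make the idea concrete, here is a rough Python sketch of a generator that expands a JSON description into the boilerplate shown above. The description format (`api`, `args`) and the type names `int` and `string` are assumptions for illustration only, and the sketch always prints the return value as an `int`.

```python
import json

# Assumed type names mapped to (printf format, D expression template).
FORMATS = {
    "string": (r'\"%S\"', "copyinstr(self->arg{i})"),
    "int": ("%d", "(int)self->arg{i}"),
}


def generate_probe(desc):
    """Expand {"api": ..., "args": [...]} into entry/return clauses."""
    name, types = desc["api"], desc["args"]
    slot = 'self->arguments_stack[self->deeplevel, "arg%d"]'
    arg_formats = ", ".join(FORMATS[t][0] for t in types)

    lines = ["pid$target::%s:entry" % name, "{", "    self->deeplevel++;"]
    for i in range(len(types)):
        # Save the caller's copy, then remember our own argument.
        lines.append("    %s = self->arg%d;" % (slot % i, i))
        lines.append("    self->arg%d = arg%d;" % (i, i))
    lines += ["}", "", "pid$target::%s:return" % name, "{",
              "    this->retval = arg1;",
              "    this->timestamp_ms = walltimestamp/1000000;",
              r'    printf("{\"api\":\"%s\", \"args\":[' + arg_formats +
              r'], \"retval\":%d, \"timestamp\":%ld, \"pid\":%d, '
              r'\"ppid\":%d, \"tid\":%d}\n",',
              "        probefunc,"]
    for i, t in enumerate(types):
        lines.append("        %s," % FORMATS[t][1].format(i=i))
    lines += ["        (int)this->retval,",
              "        this->timestamp_ms, pid, ppid, tid);"]
    for i in range(len(types)):
        # Restore the caller's argument and release the stack slot.
        lines.append("    self->arg%d = %s;" % (i, slot % i))
        lines.append("    %s = 0;" % (slot % i))
    lines += ["    --self->deeplevel;", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_probe(json.loads('{"api": "system", "args": ["string"]}')))
```

For the `system` description this should reproduce the hand-written probe above, apart from comments.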
Given that all vboxmanage-related stuff is cross-platform, the only piece I have to add is traffic forwarding for Linux and Windows.
If it's not, you'll receive the following error from pfctl:
no IP address found for vboxnet0:network
./pfrules:1: could not parse host specification
Cuckoo itself also won't be able to run:
$ ./cuckoo.py
2015-07-11 01:44:24,398 [root] CRITICAL: CuckooCriticalError: Unable to bind ResultServer on 192.168.56.1:2042: [Errno 49] Can't assign requested address
The fix is described, for example, here and here, so we can include it right into the script.
Have no idea right now 👍
Anyway, since I'll play with dtrace first, it's not something that needs to be decided or implemented right now.
It's
$ sudo easy_install pymongo
as far as I can remember.
For now you can only launch the target as root because of #9: we use sudo -u for dropping privileges, and without children support we'd only get the sudo -u calls, not our target's.
Hi,
I am analyzing a Mach-O file, but it shows an error: unable to import package, package doesn't exist.
Kindly tell me how to resolve this issue.
Thanks & Regards
Sean
In particular it hangs when invoking the following test cases:
test_apicalls_children()
test_apicalls_children_root()
both from tests/test_apicalls.py.
I believe that it's a regression introduced with 224f470 (so it's related to my lock-file technique©®™).
Hi,
Two questions:
TIA,
krishna
There's a bug in my implementation of apicalls.d that prevents it from following exec'ed children 👎
On the other hand, if they were fork'ed, it works like a charm.
I guess the reason is that dtrace attaches way too early, when shared libraries are not yet loaded into the child's address space, and thus probes from the pid provider cannot be installed on them.
There's a new thing in OS X 10.11 called SIP (System Integrity Protection), aka «Rootless». Basically it takes privileges away from root: you can no longer write to protected system locations, modify system files and so on.
That's no good for us. Right now we can disable it on 10.11 machines with the new boot argument:
$ sudo nvram boot-args="rootless=0"
However, Apple may remove this argument (and is going to) in a release version of the OS. Let's just hope that there will be a workaround we could use.
The problem is that dtrace/pid can't enable probes on functions that weren't loaded at program startup. This basically means that if a target process loads a library at runtime (e.g. via dlopen()), we won't be able to install probes on any symbol from this library.
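// clang -o demo demo.c
#include <dlfcn.h>

int main(int argc, char const *argv[])
{
    /* Load the library at runtime, i.e. after the process has started */
    void *h = dlopen("/usr/lib/libThaiTokenizer.dylib", RTLD_NOW);
    if (h == NULL) return 1;
    int (*isThai)(int) = dlsym(h, "isThai");
    if (isThai == NULL) return 1;
    int x = isThai(123);
    return 0;
}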
Compile and try attaching dtrace to it:
sudo dtrace -n 'pid$target::isThai:entry { printf("isThai()"); }' -c ./demo
The following error will occur:
dtrace: invalid probe specifier pid::isThai:entry { printf("isThai()"); }: probe description pid::isThai:entry does not match any probes
From my understanding there's a problem with huge(?) argument values for some APIs. Need to check apicalls.d for invalid format specifiers and the like.
Here's an error log:
2015-06-14 04:33:42,723 [root] DEBUG: Starting analyzer from /Users/bigmike/tlihl
2015-06-14 04:33:42,723 [root] DEBUG: Storing results at: /var/folders/75/s_0qg5y16_3dmhs1_fxnn0wh0000gn/T/PYzMFCoV
2015-07-16 03:04:23,542 [modules.packages.zip] DEBUG: Missing file option, auto executing: CCMenu.app/
2015-07-16 03:04:23,588 [modules.packages.zip] DEBUG: Analysing file "CCMenu.app/" using package "App"
2015-07-16 03:04:43,081 [root] ERROR: Traceback (most recent call last):
  File "/Users/bigmike/tlihl/analyzer.py", line 119, in <module>
    success = analyzer.run()
  File "/Users/bigmike/tlihl/analyzer.py", line 48, in run
    self._analysis(package)
  File "/Users/bigmike/tlihl/analyzer.py", line 108, in _analysis
    package.start()
  File "/Users/bigmike/tlihl/modules/packages/zip.py", line 60, in start
    self.real_package.start()
  File "/Users/bigmike/tlihl/lib/core/packages.py", line 101, in start
    self.apicalls_analysis()
  File "/Users/bigmike/tlihl/lib/core/packages.py", line 112, in apicalls_analysis
    self.host.send_api(call)
  File "/Users/bigmike/tlihl/lib/core/host.py", line 85, in send_api
    "args" : self._prepare_args(thing)
  File "/Library/Python/2.7/site-packages/pymongo-3.0.2-py2.7-macosx-10.8-intel.egg/bson/__init__.py", line 888, in encode
    return cls(_dict_to_bson(document, check_keys, codec_options))
OverflowError: MongoDB can only handle up to 8-byte ints
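One possible workaround, sketched in Python under the assumption that the overflow comes from integers outside the signed 64-bit range (for example unsigned 64-bit values printed by the D script): convert such values to hex strings before the document is BSON-encoded. The helper name `make_bson_safe` is hypothetical and is not part of the analyzer or of pymongo.

```python
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def make_bson_safe(value):
    """Recursively replace out-of-range ints with hex strings."""
    if isinstance(value, bool):
        return value  # bool is a subclass of int; leave it alone
    if isinstance(value, int) and not INT64_MIN <= value <= INT64_MAX:
        return hex(value)
    if isinstance(value, dict):
        return {key: make_bson_safe(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [make_bson_safe(item) for item in value]
    return value


if __name__ == "__main__":
    print(make_bson_safe({"api": "foo", "args": [2**64 - 1, 5]}))
```

Storing the value as a string loses its numeric type in the database, so this is a trade-off rather than a real fix; making the D script print such arguments as signed values would be an alternative.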
Since some targets may run forever, we need to set a timeout for the dtrace tracing process, so that it runs for at most X seconds.
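A minimal sketch of such a timeout in Python 3, using only the standard `subprocess` module. The function name `run_with_timeout` is hypothetical, and the sketch assumes that terminating the tracing process is enough; whether the traced target also exits when dtrace is terminated is not verified here.

```python
import subprocess


def run_with_timeout(command, timeout_seconds):
    """Run `command`; terminate it if it outlives the timeout.

    Returns True if the process had to be terminated, False otherwise.
    """
    process = subprocess.Popen(command)
    try:
        process.wait(timeout=timeout_seconds)
        return False
    except subprocess.TimeoutExpired:
        process.terminate()  # sends SIGTERM on POSIX systems
        process.wait()       # reap the process so no zombie is left
        return True


if __name__ == "__main__":
    # Stand-in for the real dtrace invocation.
    print(run_with_timeout(["sleep", "5"], 1))
```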
It's not possible to pass any command line arguments for a target now.
In order to use the Mac OS X analyzer one will have to set up a Mac OS X environment: either bare metal (running the analyses on an actual Mac mini or so) or Mac OS X running inside a virtual machine.
Regardless of the approach taken, it'd be very, very useful if you could document the steps you have taken to set it up, as I'd very much like to do the same at some point.
Now it only captures the parent process' calls.
Currently it's always 1 for is_success and unknown for category.
is_success value
Need to figure out how to script dtrace so it can catch the following:
syscalls: see dtruss(1), errinfo(1)
disk I/O: see iosnoop(1) (but it doesn't catch cached data, i.e. data served from RAM); opensnoop(1) (for open())
internet connections
fork(), exec() and stuff like this
API calls
See the output of $ man -k dtrace
dtrace) parse the dtrace output (files?) and figure them out; send the results to the host
Checklist:
Make sure pt_deny_attach still works on modern OS X systems.
If it doesn't: replace it with something like this: https://github.com/gdbinit/onyx-the-black-cat/blob/master/kext/antidebug.c#L72
I'll also have to create a brand-new kernel module from this stuff (because onyx is an all-in-one solution we're not interested in right now).
As for now, I'm just sending dummy names like "arg0", "arg1", "arg2", etc.
apicalls.d is just too big and has lots of responsibilities: it handles both the API probes we need for analysis and all the "follow-my-children" stuff.
What we need is to move the latter into a new script (think of follow_children.d) which will be #included into the master script.
See also: separate-dtrace-scripts branch.