immunant / ibresolver

A QEMU TCG plugin for resolving indirect branches.

License: BSD 3-Clause "New" or "Revised" License

Makefile 3.44% C++ 32.23% C 52.62% Shell 2.86% Python 8.86%

ibresolver's Issues

Remove need to pass in indirect callsites

It'd be nice to have this tool find indirect callsites automatically instead of passing in a list of callsites. The two options are:

1. Shell out to objdump and grep for indirect branches like in find_indirect_*.sh. The grepping could probably be done from the plugin in the post-install initialization, but using objdump would require cross-compiling binutils for the target arch, which isn't as user-friendly. I'd also need to expand the ARM regex since it's missing some indirect jumps (e.g. ldr pc, Rn).
2. In the translation block callback, pattern-match each instruction against the target arch's indirect branches. For x64 this is reasonable since only unconditional jumps/calls can be indirect, so there are only a few patterns to check. I'm not sure how involved this would be for ARM (i.e. how many patterns we'd have to match against), but insn sizes are limited to 2 or 4 bytes and the instruction encoding manual is much easier to follow than Intel's.
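A minimal sketch of what the second option could look like, working on raw instruction bytes. The function names are hypothetical and nothing here comes from the plugin itself. The x64 check relies on indirect near/far calls and jumps all being opcode 0xFF with ModRM.reg in 2..5; the ARM check only covers A32 `bx Rm` / `blx Rm` and is far from a complete list (`ldr pc, ...`, `mov pc, ...` and Thumb encodings would need their own patterns).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True for x86 legacy prefix bytes: segment overrides, operand/address size,
// LOCK and REP/REPNE. 0x3E also serves as the CET "notrack" prefix.
static bool is_legacy_prefix(uint8_t b) {
    switch (b) {
    case 0x26: case 0x2E: case 0x36: case 0x3E: case 0x64: case 0x65:
    case 0x66: case 0x67: case 0xF0: case 0xF2: case 0xF3:
        return true;
    default:
        return false;
    }
}

// Hypothetical helper: true if the x86-64 instruction starting at `bytes` is
// an indirect call or jump, i.e. opcode 0xFF with ModRM.reg equal to
// 2 (call near), 3 (call far), 4 (jmp near) or 5 (jmp far).
// `ret` is deliberately not treated as an indirect branch here.
bool is_indirect_branch_x64(const uint8_t *bytes, size_t len) {
    assert(bytes != nullptr || len == 0);
    size_t i = 0;
    while (i < len && is_legacy_prefix(bytes[i])) {
        ++i;
    }
    if (i < len && (bytes[i] & 0xF0) == 0x40) {
        ++i; // optional REX prefix
    }
    if (i + 1 >= len || bytes[i] != 0xFF) {
        return false; // need both the opcode and the ModRM byte
    }
    const uint8_t reg = (bytes[i + 1] >> 3) & 0x7;
    return reg >= 2 && reg <= 5;
}

// Hypothetical helper: true for A32 (ARM mode) `bx Rm` and `blx Rm`,
// ignoring the condition field (bits 31:28) and the register (bits 3:0).
bool is_bx_or_blx_reg_a32(uint32_t insn) {
    const uint32_t masked = insn & 0x0FFFFFF0u;
    return masked == 0x012FFF10u || masked == 0x012FFF30u;
}
```

In the plugin, the bytes would presumably come from the instruction data QEMU exposes during translation; how that is wired up is left open here.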

Fixing this issue also means we don't have to pass in the binary name twice as explained here so the plugin would only require one arg for the output file.

Make resolving jumps to dynamically linked shared objects less arch-specific

To resolve jumps to dynamically linked shared objects, we currently trace mmap and openat syscalls to track what is in memory. Aside from the overhead of the extra tracing, the syscalls made by ELF interpreters for different architectures vary slightly, which makes it harder to support more architectures. Also, it seems that vaddrs for non-native binaries can't be dereferenced without adding an offset provided by QEMU, so even just checking the filename passed to openat on arm32 is a hassle.

An easier alternative may be to check what's in memory by looking at /proc/$PID/maps. It turns out that QEMU (and the plugin) and the emulated process have the same pid, so instead of tracing syscalls and manually tracking what's in memory, we can look at /proc/self/maps from the plugin whenever we need to resolve a jump. For more info on the format we'd need to parse, look for "/proc/[pid]/maps" on this page.
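As a rough sketch of the parsing involved, assuming the usual `start-end perms offset dev inode path` line layout: the `MapEntry` struct and the function names below are made up for illustration and are not part of the plugin.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// One line of /proc/[pid]/maps (hypothetical struct, for illustration only).
struct MapEntry {
    uint64_t start = 0;  // inclusive
    uint64_t end = 0;    // exclusive
    uint64_t offset = 0; // offset into the backing file
    std::string perms;   // e.g. "r-xp"
    std::string path;    // empty for anonymous mappings
};

// Parses "start-end perms offset dev inode [path]". Returns false on
// malformed input and leaves `out` untouched in that case.
bool parse_maps_line(const std::string &line, MapEntry &out) {
    uint64_t start = 0, end = 0, offset = 0;
    char perms[5] = {0};
    int consumed = 0; // only set by %n if dev and inode were also present
    const int matched = std::sscanf(
        line.c_str(), "%" SCNx64 "-%" SCNx64 " %4s %" SCNx64 " %*s %*s%n",
        &start, &end, perms, &offset, &consumed);
    if (matched != 4 || consumed == 0 || end < start) {
        return false;
    }
    const size_t path_pos =
        line.find_first_not_of(" \t", static_cast<size_t>(consumed));
    out.start = start;
    out.end = end;
    out.offset = offset;
    out.perms = perms;
    out.path = (path_pos == std::string::npos) ? "" : line.substr(path_pos);
    return true;
}

// Reads the mappings of the current process (QEMU plus the emulated binary).
std::vector<MapEntry> read_self_maps() {
    std::vector<MapEntry> maps;
    std::ifstream in("/proc/self/maps");
    std::string line;
    while (std::getline(in, line)) {
        MapEntry entry;
        if (parse_maps_line(line, entry)) {
            maps.push_back(entry);
        }
    }
    return maps;
}

// Returns the mapping containing `addr`, or nullptr if there is none.
const MapEntry *find_mapping(const std::vector<MapEntry> &maps, uint64_t addr) {
    for (const MapEntry &entry : maps) {
        assert(entry.start <= entry.end); // guaranteed by parse_maps_line
        if (addr >= entry.start && addr < entry.end) {
            return &entry;
        }
    }
    return nullptr;
}
```

Resolving a jump would then be something like `find_mapping(read_self_maps(), destination)`, with the result's `path` and `offset` identifying the shared object. Re-reading the file on every lookup is the simplest option; caching would be a later optimization.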

Consecutive indirect branches are not handled properly

When the plugin encounters consecutive indirect branches, the indirect_branch_exec callback is registered for both, but branch_skipped is also registered for the second. This means that results may vary depending on which callback is executed first.

This could be fixed by not registering the branch_skipped for the second instruction (i.e. the branch_skipped callback that corresponds to skipping the first branch). Since this scenario is rare in practice, the plugin currently just emits a warning to stdout when it runs into this. It'd be good to have some test cases before making the fix to verify it'll work as expected.
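To make the proposed fix concrete, here is a small model of the registration decision, detached from the QEMU plugin API. It assumes branch_skipped is normally registered on the instruction that follows an indirect branch; the `InsnCallbacks` struct and `plan_callbacks` are hypothetical names used only for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Which callbacks would be registered on a given instruction (hypothetical).
struct InsnCallbacks {
    bool indirect_branch_exec = false; // the insn is itself an indirect branch
    bool branch_skipped = false;       // the previous insn was a branch that fell through
};

// is_indirect[i] says whether instruction i is an indirect branch.
// With the proposed fix, branch_skipped is not registered on an instruction
// that is itself an indirect branch, so no instruction gets both callbacks.
std::vector<InsnCallbacks> plan_callbacks(const std::vector<bool> &is_indirect) {
    std::vector<InsnCallbacks> plan(is_indirect.size());
    for (size_t i = 0; i < is_indirect.size(); ++i) {
        if (!is_indirect[i]) {
            continue;
        }
        plan[i].indirect_branch_exec = true;
        const bool has_next = i + 1 < is_indirect.size();
        if (has_next && !is_indirect[i + 1]) {
            plan[i + 1].branch_skipped = true;
        }
    }
    return plan;
}
```

A test case for the fix could feed in two back-to-back indirect branches and check that the second one only carries indirect_branch_exec.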

Check if branches on ARM32 switch between THUMB/ARM mode

It might be possible to use binja like in #3 to check this. Basically for branch destinations with 32-bit instructions, we'd try to parse them with binja in both ARM and THUMB mode and see if one case fails. u32s that are valid in both may require a more involved solution (e.g. looking at cpu registers), but using binja would be a good first step.
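The decision logic could be sketched roughly as follows. The two booleans stand in for "does the destination decode in ARM mode" and "does it decode in THUMB mode" (however the disassembler reports that), and the fallback uses the T bit, bit 5 of the CPSR, for the case where both decodes succeed. All names are hypothetical, and how the plugin would obtain the CPSR value is left open.

```cpp
#include <cstdint>

enum class ArmMode { Arm, Thumb, Unknown };

// On ARM32, bit 5 of the CPSR is the T bit (set while executing Thumb code).
constexpr uint32_t kCpsrThumbBit = 1u << 5;

ArmMode mode_from_cpsr(uint32_t cpsr) {
    return (cpsr & kCpsrThumbBit) ? ArmMode::Thumb : ArmMode::Arm;
}

// First step from the issue: try decoding the destination in both modes and
// pick the mode that succeeds if exactly one does.
ArmMode guess_mode_by_decoding(bool decodes_as_arm, bool decodes_as_thumb) {
    if (decodes_as_arm && !decodes_as_thumb) {
        return ArmMode::Arm;
    }
    if (decodes_as_thumb && !decodes_as_arm) {
        return ArmMode::Thumb;
    }
    return ArmMode::Unknown; // valid in both modes, or in neither
}

// Falls back to the CPSR observed at the destination when decoding is ambiguous.
ArmMode resolve_mode(bool decodes_as_arm, bool decodes_as_thumb, uint32_t cpsr_at_dest) {
    const ArmMode guess = guess_mode_by_decoding(decodes_as_arm, decodes_as_thumb);
    return guess != ArmMode::Unknown ? guess : mode_from_cpsr(cpsr_at_dest);
}

// True if a branch goes from one known mode to the other.
bool branch_switches_mode(ArmMode source, ArmMode dest) {
    return source != ArmMode::Unknown && dest != ArmMode::Unknown && source != dest;
}
```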

Support Vivisect as a frontend

Something to consider once there is a clear need.

Handle branches that occur in the middle of a block

The tracer assumes that indirect branches always occur at the end of the translation blocks defined by QEMU to avoid the need for single-step mode. Currently input addresses that are found in the middle of a block will not show up in the output .csv even if they're executed.

While this assumption will very likely always be true for unconditional indirect branches, it'd be nice to log a warning to stderr when one of these inputs is encountered in block_trans_handler. To avoid needing to iterate through all input callsites in block_trans_handler, we should sort the callsites in qemu_plugin_install, then limit the callsites checked to those within the block being translated.
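A sketch of the sorted lookup, assuming callsites are plain 64-bit addresses and that the block's start, end, and last instruction address are known at translation time. The function names are hypothetical; only the sort-once-then-binary-search idea is taken from the paragraph above.

```cpp
#include <algorithm>
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Returns the callsites that fall in [block_start, block_end).
// `sorted_callsites` must already be sorted, e.g. once at plugin install time.
std::vector<uint64_t> callsites_in_block(const std::vector<uint64_t> &sorted_callsites,
                                         uint64_t block_start, uint64_t block_end) {
    assert(std::is_sorted(sorted_callsites.begin(), sorted_callsites.end()));
    const auto first =
        std::lower_bound(sorted_callsites.begin(), sorted_callsites.end(), block_start);
    const auto last = std::lower_bound(first, sorted_callsites.end(), block_end);
    return std::vector<uint64_t>(first, last);
}

// Warns on stderr about every callsite in the block that is not the block's
// last instruction, and returns how many warnings were emitted.
size_t warn_mid_block_callsites(const std::vector<uint64_t> &sorted_callsites,
                                uint64_t block_start, uint64_t last_insn_addr,
                                uint64_t block_end) {
    size_t warnings = 0;
    for (uint64_t addr : callsites_in_block(sorted_callsites, block_start, block_end)) {
        if (addr == last_insn_addr) {
            continue; // the expected case: branch at the end of the block
        }
        std::fprintf(stderr,
                     "warning: callsite 0x%" PRIx64 " is in the middle of block 0x%" PRIx64 "\n",
                     addr, block_start);
        ++warnings;
    }
    return warnings;
}
```

Each block translation then costs two binary searches over the callsite list rather than a full scan.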

Fix support for non-native binaries

Switching from tracing syscalls to reading /proc/self/maps in #2 seems to have broken support for non-native binaries. The args to the syscalls we were tracing had addresses in terms of the guest's memory map which is what we want. For non-native binaries (e.g. arm32), these addresses don't correspond to the addresses in /proc/self/maps so arm32 doesn't work anymore.

Adding (or subtracting) QEMU's guest_base lets you go from guest to host addresses and seems to solve the issue. Implementing this fix requires two things:

1. Find a way to get access to guest_base, probably by patching QEMU and modifying the plugin API. So far I've tested by adding extern uintptr_t guest_base to plugin.cpp, but this probably isn't reliable.
2. Add guest_base where required in block_trans_handler and mark_indirect_branch. We should probably use newtypes instead of uint64_t for the different types of addresses to make things explicit and avoid breaking other use cases.
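One possible shape for those newtypes, as a sketch only: `GuestAddr`, `HostAddr` and the conversion helpers are hypothetical names, and guest_base is passed in explicitly since how the plugin would obtain it is still open. The conversion assumes host = guest + guest_base.

```cpp
#include <cassert>
#include <cstdint>

// Address as the emulated program sees it.
struct GuestAddr {
    explicit GuestAddr(uint64_t v) : value(v) {}
    uint64_t value;
};

// Address in QEMU's own address space, i.e. what /proc/self/maps reports.
struct HostAddr {
    explicit HostAddr(uint64_t v) : value(v) {}
    uint64_t value;
};

// The explicit constructors mean a raw uint64_t (or the wrong address kind)
// cannot be passed where a GuestAddr or HostAddr is expected.
inline HostAddr to_host(GuestAddr guest, uint64_t guest_base) {
    return HostAddr(guest.value + guest_base);
}

inline GuestAddr to_guest(HostAddr host, uint64_t guest_base) {
    assert(host.value >= guest_base); // otherwise this is not a guest mapping
    return GuestAddr(host.value - guest_base);
}
```

For native binaries guest_base would presumably be 0, in which case both conversions are the identity and existing behavior is unchanged.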

Incorrect implementation of the Binary Ninja backend

For x86_64 indirect calls like call rax, Binary Ninja does not identify them as branch instructions.

Thus even BNBranchType::CallDestination cannot record these indirect calls.

Parsing the disassembly text token types might resolve this issue, as shown in https://api.binary.ninja/binaryninja.architecture-module.html#binaryninja.architecture.InstructionTextToken

Add trace output for DARPA AMP challenges

The Binaryninja backend doesn't mark indirect calls correctly

For some reason binaryninja doesn't mark indirect calls (blx on arm32 or callq on x64) as indirect branches so all tests are failing with this backend. I'm not sure if this is a bug or the expected behavior, but it's the same in both is_indirect_branch_default_impl in src/binaryninja_backend.cpp and the python equivalent of that function.

It'd be good to make a list of instructions that binaryninja marks as indirect branches to know what to expect in the results and add calls to that list if possible. Binaryninja does mark jmp *%rax, ldr pc ... and other indirect jumps correctly though, so a temporary workaround might be to use a custom backend to catch the instructions that binaryninja misses. This would require slight changes to the Makefile to allow custom backends to have reverse dependencies (i.e. allow dynamically loaded code to call is_indirect_branch_default_impl).
