
Interactivity or scripting (fcd issue, open)

lifting-bits avatar lifting-bits commented on June 3, 2024
Interactivity or scripting

from fcd.

Comments (9)

surovic avatar surovic commented on June 3, 2024

I don't have much experience with integrating Python interpreters into C++ projects or the other way around, but the linux.py and ELF.py approach sounds good. It's actually how the Mach-O and, I think, PE binary formats are supported. ELF is hardcoded in fcd.

I think one could actually get something like "function filtering" to work by delegating some of the entry point discovery from fcd to the scripts: discover some entry points in the .py scripts, omit stuff like __libc_start_main, and pass the remaining entry point addresses to fcd for lifting and decompilation.

The issue I see with this approach is that some entry points are discovered by Remill via recursive descent disassembly, and I'm not sure whether that would reintroduce some of the entry points filtered out by the scripts. Then again, one could also pass a list of filtered entry points from the script to fcd and have fcd omit them as well.
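
A minimal sketch of that filtering idea, in plain Python. The function names, the exclusion set, and the addresses are all made up for illustration; none of this is an existing fcd or Remill interface. The script side splits symbols into "lift these" and "omit these", and the omit list is applied again to whatever a later discovery pass turns up.

```python
# Hypothetical names and addresses throughout; fcd does not define these.
RUNTIME_SYMBOLS = {"__libc_start_main", "_start", "__libc_csu_init", "__libc_csu_fini"}


def filter_entry_points(symbols, excluded_names=RUNTIME_SYMBOLS):
    """Split (name, address) pairs into entry points to lift and addresses to omit."""
    keep, omit = [], []
    for name, address in symbols:
        if name in excluded_names:
            omit.append(address)
        else:
            keep.append((name, address))
    return keep, omit


def drop_rediscovered(discovered_addresses, omit):
    """Remove addresses that a later pass (e.g. recursive descent) found again."""
    blocked = set(omit)
    return [address for address in discovered_addresses if address not in blocked]


symbols = [("main", 0x401000), ("__libc_start_main", 0x400F00), ("helper", 0x401080)]
keep, omit = filter_entry_points(symbols)

# Pretend the disassembler rediscovered the filtered address plus a new one.
rediscovered = drop_rediscovered([0x400F00, 0x401080, 0x4010C0], omit)
```

Passing the omit list along with the kept entry points is what would stop the disassembler from quietly undoing the script's filtering.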


surovic avatar surovic commented on June 3, 2024

To add, I think the scripting approach is strictly better than the interactive one. But that's just my opinion.


pgoodman avatar pgoodman commented on June 3, 2024

This raises one question for me, which is: should the main binary loading / parsing be done by C++ code? If we made fcd's C++ side cooperate with a Python side, then we could bring in third-party packages like Angr's cle to load in binary images, and have the C++ side actually invoke CLE to do the reading. I envision something like microx, where a class is provided that can be extended, and the extension implements methods for reading virtual memory, etc. This would then generalize to handling actual process memory dumps.
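
To make the "class that can be extended" idea concrete, here is a rough sketch in plain Python. The names `Memory` and `SegmentMemory` are hypothetical, and the segment list stands in for whatever a loader such as CLE or a process memory dump would provide; it is not the microx or fcd API.

```python
from abc import ABC, abstractmethod


class Memory(ABC):
    """Hypothetical base class that the decompiler core would call into."""

    @abstractmethod
    def read(self, addr, num_bytes):
        """Return num_bytes bytes at virtual address addr, or raise KeyError."""


class SegmentMemory(Memory):
    """Backs reads with a list of (base_address, bytes) segments."""

    def __init__(self, segments):
        self.segments = sorted(segments)

    def read(self, addr, num_bytes):
        for base, data in self.segments:
            offset = addr - base
            # Only serve reads that fall entirely inside one segment.
            if 0 <= offset and offset + num_bytes <= len(data):
                return data[offset:offset + num_bytes]
        raise KeyError(f"unmapped read of {num_bytes} bytes at {addr:#x}")


# Made-up segment contents, standing in for a loaded image or a memory dump.
memory = SegmentMemory([(0x1000, b"\x55\x48\x89\xe5"), (0x2000, b"hello")])
```

The same interface would serve a file-backed loader and a process memory dump alike, since the caller only ever sees `read`.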


surovic avatar surovic commented on June 3, 2024

Yeah, this sounds pretty good. And I think fcd actually has some support for this already, from glancing over fcd/scripts and fcd/fcd/executables. I think the idea there is that a Python script needs to provide certain functions, like a function to translate virtual addresses and maybe others, to a C++ class. But I bet this could be modified to better suit things like Angr's cle.


surovic avatar surovic commented on June 3, 2024

I can also imagine *.py scripts being very useful in scenarios with packed and / or encrypted executables.
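
As a toy illustration of the packed or encrypted case: a script could undo the encoding before the decompiler ever sees the bytes. The sketch below assumes a single-byte XOR over a known address range, which is far simpler than any real packer, and `make_xor_reader` is a hypothetical helper, not part of fcd.

```python
def make_xor_reader(raw, base, key, encoded_range):
    """Return read(addr, num_bytes) that undoes a one-byte XOR over encoded_range."""
    start, end = encoded_range  # half-open range of virtual addresses

    def read(addr, num_bytes):
        offset = addr - base
        if offset < 0 or offset + num_bytes > len(raw):
            raise KeyError(f"unmapped read of {num_bytes} bytes at {addr:#x}")
        chunk = raw[offset:offset + num_bytes]
        # Decode only the bytes whose address lies inside the encoded range.
        return bytes(
            byte ^ key if start <= addr + i < end else byte
            for i, byte in enumerate(chunk)
        )

    return read


# Build a fake image: one plain byte followed by three XOR-encoded bytes.
plain = b"\x90\x90\xc3"
image = b"\x55" + bytes(byte ^ 0x5A for byte in plain)
read = make_xor_reader(image, base=0x1000, key=0x5A, encoded_range=(0x1001, 0x1004))
```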


pgoodman avatar pgoodman commented on June 3, 2024

So maybe something like...

import sys

import cle
import fcd

# Memory abstraction that will let the decompiler read memory. You could
# implement Memory here by invoking APIs from cle, Binary Ninja, IDA Pro, etc.
# You could also provide info to fcd from a McSema-lifted CFG file, which contains
# rich info.
class ExecutableMemory(fcd.ExecutableMemory):
  def __init__(self, ld):
    self.ld = ld

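  def read(self, addr, num_bytes):
    # Do something with self.ld, returning a list, tuple, or bytearray.
    raise NotImplementedError

# sys.argv[1] is the binary to load; sys.argv[0] would be this script itself.
ld = cle.Loader(sys.argv[1])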
memory = ExecutableMemory(ld)
decomp = fcd.Decompiler(memory)

decomp.add_entrypoint(0xf00, name="main")
# Fill in other named entrypoints from ld

# Maybe bring in Angr's CFGFast to invoke other APIs,
# e.g. decomp.mark_as_function() or something. Down
# the line, having the ability to mark indirect xrefs would
# be nifty.

# Now lift to bitcode
bc = decomp.lift()

# Show me the bitcode!
bc.dump(address=0xf00)
bc.dump(name="main")

# Eventually we could implement the emulator test suite
# via whatever bc is, e.g. bc.execute(cpu), where cpu is
# an object of a class implementing methods like
# read_register and read_memory.

bc.set_calling_convention(...)

bc.decompile(address=0xf00)
bc.decompile(name="main")


surovic avatar surovic commented on June 3, 2024

I think your example looks good, but it's also the reverse of what fcd currently does. Currently fcd uses Python to parse executables. Like for example...

import pefile
import bisect

# helper globals
stubs = {}
sectionStart = []
sectionInfo = {}

# fcd interface below (I assume this is what fcd's C++ Executable class requires)

executableType = "Portable Executable"
targetTriple = "unknown-unknown-win32"
entryPoints = []

def init(data):
  # fill stubs, sectionStart, sectionInfo, ...
  pass

def getStubTarget(target):
  # returns the target of a stub function (library functions, etc)
  pass

def mapAddress(address):
  # maps virtual addresses to actual addresses in the binary
  pass

The above script is then passed to fcd via a command-line flag, for example $ fcd -f scripts/pe.py pefile.exe, and during lifting, fcd calls the functions from the script to resolve stub targets, virtual addresses and what have you. fcd then does the actual reading of binary data on its own.
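
For a feel of what a mapAddress-style function might do, here is a self-contained sketch that uses bisect over sorted section start addresses. The section table is made up, and the snake_case names are mine; this is not the contents of fcd's pe.py, only one plausible way to translate a virtual address into a file offset.

```python
import bisect

# Made-up section table: (virtual_start, size, file_offset), sorted by virtual_start.
SECTIONS = [(0x401000, 0x200, 0x400), (0x402000, 0x100, 0x600)]
section_starts = [section[0] for section in SECTIONS]


def map_address(address):
    """Translate a virtual address to a file offset, or return None if unmapped."""
    # Find the last section whose start is <= address.
    index = bisect.bisect_right(section_starts, address) - 1
    if index < 0:
        return None
    start, size, file_offset = SECTIONS[index]
    if address >= start + size:
        return None  # falls in the gap after this section
    return file_offset + (address - start)
```

The binary search keeps each lookup logarithmic in the number of sections, which matters little for a typical executable but costs nothing to get right.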

In your example it seems to me that fcd would be more of a library with Python bindings than a standalone executable, which I'm not opposed to, but I assume it would be a bit more work. That being said, a C++ library with Python bindings seems to be the way a lot of projects go nowadays, so why do something different?


pgoodman avatar pgoodman commented on June 3, 2024

I think library-ifying it is something I could pull together in a reasonable amount of time. It'd be pretty cool to expose fcd to Binary Ninja, for example.


surovic avatar surovic commented on June 3, 2024

> It'd be pretty cool to expose fcd to Binary Ninja, for example.

That I completely agree with.

