strongarm's Introduction

strongarm

strongarm is a full-featured, cross-platform ARM64 Mach-O analysis library.

strongarm is production-ready and is used throughout DataTheorem's iOS static analyzer stack.

This repo contains multiple tools to explore strongarm and the API. In the scripts folder, several popular Mach-O analysis tools have been reimplemented in strongarm, to demonstrate real API usage. As strongarm is cross-platform, all of these tools are as well:

  • strongarm-cli: Static analysis REPL (try me!)
  • class-dump: Dump the Objective-C class information from a Mach-O with Objective-C declaration syntax
  • insert_dylib: Add a load command to a Mach-O
  • dsc_symbolicate: Given a dyld_shared_cache, generate a symbol map from the embedded system images
  • nm: List the symbol table of a Mach-O
  • lipo: Thin or fatten Mach-O files and slices
  • hexdump: Output the hex content of a byte range in a file
  • strings: Output the C-strings in a Mach-O
  • dump_entitlements: Print the code-signing information
  • bitcode_retriever: Extract the XAR archive containing LLVM bitcode from a Mach-O

Installation

strongarm is supported on macOS and Linux.

Via pip

pip install strongarm-ios

Via git (for local development)

To set up a local environment:

git clone ...
cd strongarm
python -m venv .venv
source .venv/bin/activate
pip install -U pip setuptools wheel 'pip-tools<7.0.0'
pip-sync requirements.txt requirements-dev.txt

If you modify requirements.in or requirements-dev.in:

pip-compile requirements.in
pip-compile requirements-dev.in
pip-sync requirements.txt requirements-dev.txt
git add requirements-dev.in requirements-dev.txt

Features

  • Access and cross-reference Mach-O info via an API
  • Dataflow analysis
  • Function-boundary detection

Mach-O parsing:

  • Metadata (architecture, endianness, etc)
  • Load commands
  • Symbol tables
  • String tables
  • Code signature
  • Dyld info
  • Objective-C info (classes, categories, protocols, methods, ivars, etc)

Mach-O analysis:

  • Cross-references (xrefs) of code and strings
  • Function boundary detection & disassembly
  • Track constant data movement in assembly
  • Dyld bound symbols & implementation stubs
  • Parse constant NSStrings and C strings
  • Basic block analysis

Mach-O editing:

  • Load command insertion
  • Write Mach-O structures
  • Byte-edit binaries

Quickstart

Pass an input file to MachoParser, which will read a Mach-O or FAT and provide access to individual MachoBinary slices.

import pathlib
from strongarm.macho import MachoParser, MachoBinary

# Load an input file (expanduser() resolves the leading "~", which Path does not expand on its own)
parser = MachoParser(pathlib.Path("~/Documents/MyApp.app/MyApp").expanduser())
# Read the ARM64 slice and perform some operations
binary: MachoBinary = parser.get_arm64_slice()
print(binary.get_entitlements().decode())
print(hex(binary.section_with_name("__text", "__TEXT").address))
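The difference between a thin Mach-O and a FAT file is visible in the first four bytes of the file. The following is a minimal sketch of that check using only the standard library. It does not show how MachoParser itself decides, and classify_macho is a hypothetical helper, not part of strongarm. It also ignores big-endian thin files.

```python
import struct

# Well-known magic numbers from <mach-o/loader.h> and <mach-o/fat.h>
MH_MAGIC = 0xFEEDFACE      # 32-bit thin Mach-O
MH_MAGIC_64 = 0xFEEDFACF   # 64-bit thin Mach-O
FAT_MAGIC = 0xCAFEBABE     # FAT (multi-architecture) container


def classify_macho(header: bytes) -> str:
    """Classify a file by its first four bytes (hypothetical helper)."""
    if len(header) < 4:
        return "unknown"
    # Thin headers use the target's byte order, little-endian on ARM64
    (magic_le,) = struct.unpack("<I", header[:4])
    # FAT headers are always stored big-endian
    (magic_be,) = struct.unpack(">I", header[:4])
    if magic_be == FAT_MAGIC:
        return "fat"
    if magic_le == MH_MAGIC_64:
        return "thin-64"
    if magic_le == MH_MAGIC:
        return "thin-32"
    return "unknown"


print(classify_macho(bytes.fromhex("cffaedfe")))
```

In practice you would pass the first four bytes read from the input file in binary mode.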

Advanced analysis

Some APIs that require more memory or cross-referencing are available through MachoAnalyzer:

from pathlib import Path
from strongarm.macho import MachoParser, MachoBinary, MachoAnalyzer

macho_parser = MachoParser(Path("~/Documents/MyApp.app/MyApp").expanduser())
binary: MachoBinary = macho_parser.get_arm64_slice()
# A MachoAnalyzer wraps a binary and allows deeper analysis
analyzer = MachoAnalyzer.get_analyzer(binary)

# Find all calls to -[UIAlertView init] in the binary
print(analyzer.objc_calls_to(["_OBJC_CLASS_$_UIAlertView"], ["init"], requires_class_and_sel_found=False))

# Print some interesting info
print(analyzer.imported_symbol_names_to_pointers)   # All the dynamically linked symbols which will be bound at runtime
print(analyzer.exported_symbol_names_to_pointers)   # All the symbols which this binary defines and exports
print(analyzer.get_functions())                     # Entry-point list of the binary. Each of these can be wrapped in an ObjcFunctionAnalyzer
print(analyzer.strings())                           # __cstring section
print(analyzer.get_imps_for_sel("viewDidLoad"))     # Convenience accessor for an ObjcFunctionAnalyzer

# Print the Objective-C class information
for objc_cls in analyzer.objc_classes():
    print(objc_cls.name)
    for objc_ivar in objc_cls.ivars:
        print(f"\tivar: {objc_ivar.name}")
    for objc_sel in objc_cls.selectors:
        print(f"\tmethod: {objc_sel.name} @ {hex(objc_sel.implementation)}")

Code analysis

Once you have a handle to an ObjcFunctionAnalyzer, which represents a function from the source code, you can analyze its code:

from pathlib import Path
from strongarm.macho import MachoParser, MachoBinary, MachoAnalyzer
from strongarm.objc import ObjcFunctionAnalyzer

macho_parser = MachoParser(Path("~/Documents/MyApp.app/MyApp").expanduser())
binary: MachoBinary = macho_parser.get_arm64_slice()
analyzer = MachoAnalyzer.get_analyzer(binary)
function_analyzer = ObjcFunctionAnalyzer.get_function_analyzer_for_signature(binary, "ViewController", "viewDidLoad")
print(function_analyzer.basic_blocks)   # Find the basic block boundaries

# Print some interesting info about Objective-C method calls in the function
for instr in function_analyzer.instructions:
    if not instr.is_msgSend_call:
        continue
    
    # In an Objective-C message send, x0 stores the receiver and x1 stores the selector being messaged.
    classref = function_analyzer.get_register_contents_at_instruction("x0", instr)
    selref = function_analyzer.get_register_contents_at_instruction("x1", instr)
    
    class_name = analyzer.class_name_for_class_pointer(classref.value)
    selector = analyzer.selector_for_selref(selref.value).name
   
    # Prints "0x100000000: _objc_msgSend(_OBJC_CLASS_$_UIView, @selector(alloc));"
    print(f"{hex(instr.address)}: {instr.symbol}({class_name}, @selector({selector}));")

Modifying Mach-O's

You can also modify Mach-O's by overwriting structures or inserting load commands:

from pathlib import Path
from strongarm.macho import MachoParser, MachoBinary
from strongarm.macho.macho_definitions import MachoSymtabCommand

macho_parser = MachoParser(Path("~/Documents/MyApp.app/MyApp").expanduser())
# Overwrite a structure
binary: MachoBinary = macho_parser.get_arm64_slice()
new_symbol_table = MachoSymtabCommand()
new_symbol_table.nsyms = 0
modified_binary = binary.write_struct(new_symbol_table, binary.symtab.address, virtual=True)

# Add a load command
modified_binary = modified_binary.insert_load_dylib_cmd("/System/Library/Frameworks/UIKit.framework/UIKit")

# Write the modified binary to a file
modified_binary.write_binary(Path(__file__).parent / "modified_binary")

MachoBinary provides several functions to facilitate binary modifications.

As modifying a MachoBinary may invalidate its public attributes, these APIs return a new MachoBinary object, which is re-parsed with the edits.

# Write raw bytes or Mach-O structures to a binary
MachoBinary.write_bytes(self, data: bytes, address: int, virtual=False) -> MachoBinary
MachoBinary.write_struct(self, struct: Structure, address: int, virtual=False) -> MachoBinary

# Insert a load command
MachoBinary.insert_load_dylib_cmd(self, dylib_path: str) -> MachoBinary

# Flush a modified slice to a thin Mach-O file, or a list of slices to a FAT Mach-O file:
MachoBinary.write_binary(self, path: pathlib.Path) -> None
@staticmethod
MachoBinary.write_fat(slices: List[MachoBinary], path: pathlib.Path) -> None
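The "return a new object" convention described above can be sketched without strongarm. In this minimal example, Blob is a hypothetical stand-in for a parsed binary (it is not a strongarm class); each write builds a fresh object and leaves the original untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blob:
    """Hypothetical stand-in for a parsed binary; not a strongarm class."""
    data: bytes

    def write_bytes(self, new: bytes, offset: int) -> "Blob":
        if offset < 0 or offset + len(new) > len(self.data):
            raise ValueError("write would fall outside the buffer")
        # Build a fresh object instead of mutating in place
        return Blob(self.data[:offset] + new + self.data[offset + len(new):])


original = Blob(b"\x00" * 8)
patched = original.write_bytes(b"\xde\xad", 2)
print(original.data.hex(), patched.data.hex())
```

The caller must keep the returned object: continuing to use the original means working with the unedited bytes.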

To make several modifications to a MachoBinary while triggering only one extra parse, use a MachoBinaryWriter:

from pathlib import Path
from strongarm.macho import MachoParser, MachoBinary
from ctypes import c_uint64, sizeof
from strongarm.macho.macho_binary_writer import MachoBinaryWriter

macho_parser = MachoParser(Path("~/Documents/MyApp.app/MyApp").expanduser())
binary: MachoBinary = macho_parser.get_arm64_slice()
# Initialise a batch binary writer
writer = MachoBinaryWriter(binary)

# Make a series of changes to the binary
with writer:
    for i in range(5):
        writer.write_word(word=c_uint64(0xdeadbeef), address=0x1000 + (i * sizeof(c_uint64)), virtual=False)

# `writer.modified_binary` contains the re-parsed binary containing the provided changes
# Persist the modified binary to disk
writer.modified_binary.write_binary(Path(__file__).parent / "modified_binary")
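The batching idea can be illustrated on its own: queue the writes inside a with block and rebuild the buffer once on exit. This is only a sketch of the pattern under that assumption. BatchWriter and its attributes are hypothetical names and do not describe how MachoBinaryWriter is implemented.

```python
from typing import List, Optional, Tuple


class BatchWriter:
    """Hypothetical batching writer; illustrates the pattern only."""

    def __init__(self, data: bytes) -> None:
        self._original = data
        self._pending: List[Tuple[int, bytes]] = []
        self.modified: Optional[bytes] = None
        self.rebuild_count = 0

    def write(self, offset: int, payload: bytes) -> None:
        if offset < 0 or offset + len(payload) > len(self._original):
            raise ValueError("write would fall outside the buffer")
        # Only record the edit; nothing is rebuilt yet
        self._pending.append((offset, payload))

    def __enter__(self) -> "BatchWriter":
        self._pending.clear()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        if exc_type is not None:
            return  # leave `modified` untouched if the block raised
        buf = bytearray(self._original)
        for offset, payload in self._pending:
            buf[offset:offset + len(payload)] = payload
        self.modified = bytes(buf)
        self.rebuild_count += 1  # one rebuild for the whole batch


writer = BatchWriter(bytes(16))
with writer:
    for i in range(4):
        writer.write(i * 4, b"\xef\xbe\xad\xde")
print(writer.rebuild_count, writer.modified.hex())
```

Four writes are queued, but the buffer is rebuilt only once, which mirrors the "only one extra parse" goal.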

License

AGPL license

strongarm's Issues

[Question] What does "Failed to find a corresponding ObjC class for ref ~ " mean?

When I run the code below, I get this output:
Failed to find a corresponding ObjC class for ref 0x10000000050ff8 from ObjcCategory(<Base class of Service category @ 0x100050210 will be populated later> (Service))

from pathlib import Path
from strongarm.macho import MachoParser, MachoBinary, MachoAnalyzer

macho_parser = MachoParser(Path("~/Desktop/MyApp.app/MyApp").expanduser())
binary: MachoBinary = macho_parser.get_arm64_slice()

analyzer = MachoAnalyzer.get_analyzer(binary)
for call in analyzer.objc_calls_to(["_OBJC_CLASS_$_UIAlertView"], ["init"], requires_class_and_sel_found=True):
    offset_within_func = call.caller_addr - call.caller_func_start_address
    print(f'Found call to -[UIAlertView init] at {call.caller_func_start_address} + {offset_within_func}')

What does this failure indicate?
Is there something wrong with my code? Or is it something I can ignore?

Thank you!

Is there any function to resize section/segment after modification?

It seems that strongarm does not automatically resize or readjust a section or segment after modification. If the data I write is longer than the data it replaces (I am trying to edit a string here), it overflows into the next section.

How can I implement this feature? Do you have any suggestions? @codyd51

Thanks.
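Resizing aside, one way to avoid the overflow described above is to refuse any replacement that does not fit in the space of the existing string and to pad shorter ones with NUL bytes. The following is a minimal sketch on a raw byte buffer. patch_c_string is a hypothetical helper, not a strongarm API, and it assumes the string is NUL-terminated.

```python
def patch_c_string(buf: bytes, offset: int, new_text: str) -> bytes:
    """Replace the NUL-terminated string at `offset` without growing it (hypothetical helper)."""
    end = buf.index(b"\x00", offset)   # terminator of the existing string
    capacity = end - offset            # bytes available, excluding the NUL
    encoded = new_text.encode("utf-8")
    if len(encoded) > capacity:
        raise ValueError(
            f"replacement needs {len(encoded)} bytes, only {capacity} available"
        )
    # Pad with NULs so every following byte keeps its original offset
    padded = encoded.ljust(capacity, b"\x00")
    return buf[:offset] + padded + buf[end:]


section = b"hello\x00world\x00"
print(patch_c_string(section, 0, "hi"))
```

The padded bytes have the same length as the original string, so they could then be written back at the same address, for example with write_bytes, without touching the neighbouring data.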

Can't install strongarm-dataflow

Hello, I tried to install strongarm-dataflow but got this error:
ERROR: Could not find a version that satisfies the requirement strongarm-dataflow (from versions: none)
ERROR: No matching distribution found for strongarm-dataflow
Could you help me?
I am using Windows 10 with Python 3.7.

Pointers may exist in the segment `__DATA_DIRTY`

I am using strongarm to analyze libsystem_kernel.dylib from iOS. I found pointers in the __data section of the __DATA_DIRTY segment. I tried to read them with MachoBinary.read_pointer_section, but this segment is filtered out.

I suggest adding an optional parameter segment_name to MachoBinary.read_pointer_section to specify the segment.

Thanks.
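The suggested optional parameter could look roughly like the sketch below: when segment_name is omitted, sections are matched by name alone, and when it is given, only that segment is considered. Section and find_sections are hypothetical names used for illustration and do not reflect strongarm's actual types or the current signature of read_pointer_section.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Section:
    """Hypothetical section record; strongarm's own types differ."""
    segment: str
    name: str


def find_sections(
    sections: Iterable[Section],
    name: str,
    segment_name: Optional[str] = None,
) -> List[Section]:
    """Match by section name, optionally restricted to a single segment."""
    return [
        s
        for s in sections
        if s.name == name and (segment_name is None or s.segment == segment_name)
    ]


layout = [
    Section("__DATA", "__data"),
    Section("__DATA_DIRTY", "__data"),
    Section("__TEXT", "__text"),
]
print(find_sections(layout, "__data", segment_name="__DATA_DIRTY"))
```

Defaulting segment_name to None keeps existing callers working while letting new callers pick the segment explicitly.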

capstone 4.x could not be found, is the capstone backend installed?

I am trying the advanced analysis example from the README without any changes.
I installed capstone with pip install capstone.
I am running on a Mac with an M2 chip.

$ pip3 show capstone
Name: capstone
Version: 4.0.2
Summary: Capstone disassembly engine
Home-page: http://www.capstone-engine.org
Author: Nguyen Anh Quynh
Author-email: [email protected]
License: UNKNOWN
Location: /Users/<user>/Library/Python/3.9/lib/python/site-packages
Requires: 
Required-by: strongarm-dataflow, strongarm-ios

When I run the script, I get this error:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
File /opt/anaconda3/lib/python3.11/site-packages/strongarm/macho/macho_analyzer.py:237, in MachoAnalyzer._compute_function_basic_blocks(self, entry_point, end_address)
    236 try:
--> 237     from strongarm_dataflow.dataflow import compute_function_basic_blocks_fast
    238 except ImportError as e:

ImportError: dlopen(/opt/anaconda3/lib/python3.11/site-packages/strongarm_dataflow/dataflow.cpython-311-darwin.so, 0x0002): Library not loaded: /opt/homebrew/opt/capstone/lib/libcapstone.4.dylib
  Referenced from: <227343E6-529F-3E5C-B807-54436434ECF9> /opt/anaconda3/lib/python3.11/site-packages/strongarm_dataflow/dataflow.cpython-311-darwin.so
  Reason: tried: '/opt/homebrew/opt/capstone/lib/libcapstone.4.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/opt/capstone/lib/libcapstone.4.dylib' (no such file), '/opt/homebrew/opt/capstone/lib/libcapstone.4.dylib' (no such file), '/usr/local/lib/libcapstone.4.dylib' (no such file), '/usr/lib/libcapstone.4.dylib' (no such file, not in dyld cache), '/opt/homebrew/Cellar/capstone/5.0.1/lib/libcapstone.4.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/Cellar/capstone/5.0.1/lib/libcapstone.4.dylib' (no such file), '/opt/homebrew/Cellar/capstone/5.0.1/lib/libcapstone.4.dylib' (no such file), '/usr/local/lib/libcapstone.4.dylib' (no such file), '/usr/lib/libcapstone.4.dylib' (no such file, not in dyld cache)

During handling of the above exception, another exception occurred:

SystemExit                                Traceback (most recent call last)
    [... skipping hidden 1 frame]

Cell In[1], line 7
      6 # A MachoAnalyzer wraps a binary and allows deeper analysis
----> 7 analyzer = MachoAnalyzer.get_analyzer(binary)
      9 # Find all calls to -[UIAlertView init] in the binary

File /opt/anaconda3/lib/python3.11/site-packages/strongarm/macho/macho_analyzer.py:415, in MachoAnalyzer.get_analyzer(cls, binary)
    414     return cls._ANALYZER_CACHE[binary]
--> 415 return MachoAnalyzer(binary)

File /opt/anaconda3/lib/python3.11/site-packages/strongarm/macho/macho_analyzer.py:183, in MachoAnalyzer.__init__(self, binary)
    182 self._build_callable_symbol_index()
--> 183 self._build_function_boundaries_index()
    185 self._cfstring_to_stringref_map = self._build_cfstring_map()

File /opt/anaconda3/lib/python3.11/site-packages/strongarm/macho/macho_analyzer.py:286, in MachoAnalyzer._build_function_boundaries_index(self)
    284 for entry_point, end_address in pairwise(sorted_entry_points):
    285     # The end address of the function is the last instruction in the last basic block
--> 286     basic_blocks = [x for x in self._compute_function_basic_blocks(entry_point, end_address)]
    287     # If we found a function with no code, just skip it
    288     # This can happen in the assembly unit tests, where we insert a jump to a dummy __text label

File /opt/anaconda3/lib/python3.11/site-packages/strongarm/macho/macho_analyzer.py:243, in MachoAnalyzer._compute_function_basic_blocks(self, entry_point, end_address)
    242     print("\ncapstone 4.x could not be found, is the capstone backend installed?\n")
--> 243     sys.exit(1)
    244 raise

SystemExit: 1

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
    [... skipping hidden 1 frame]

File /opt/anaconda3/lib/python3.11/site-packages/IPython/core/interactiveshell.py:2121, in InteractiveShell.showtraceback(self, exc_tuple, filename, tb_offset, exception_only, running_compiled_code)
   2118 if exception_only:
   2119     stb = ['An exception has occurred, use %tb to see '
   2120            'the full traceback.\n']
-> 2121     stb.extend(self.InteractiveTB.get_exception_only(etype,
   2122                                                      value))
   2123 else:
   2125     def contains_exceptiongroup(val):

File /opt/anaconda3/lib/python3.11/site-packages/IPython/core/ultratb.py:710, in ListTB.get_exception_only(self, etype, value)
    702 def get_exception_only(self, etype, value):
    703     """Only print the exception type and message, without a traceback.
    704 
    705     Parameters
   (...)
    708     value : exception value
    709     """
--> 710     return ListTB.structured_traceback(self, etype, value)

File /opt/anaconda3/lib/python3.11/site-packages/IPython/core/ultratb.py:568, in ListTB.structured_traceback(self, etype, evalue, etb, tb_offset, context)
    565     chained_exc_ids.add(id(exception[1]))
    566     chained_exceptions_tb_offset = 0
    567     out_list = (
--> 568         self.structured_traceback(
    569             etype,
    570             evalue,
    571             (etb, chained_exc_ids),  # type: ignore
    572             chained_exceptions_tb_offset,
    573             context,
    574         )
    575         + chained_exception_message
    576         + out_list)
    578 return out_list

File /opt/anaconda3/lib/python3.11/site-packages/IPython/core/ultratb.py:1435, in AutoFormattedTB.structured_traceback(self, etype, evalue, etb, tb_offset, number_of_lines_of_context)
   1433 else:
   1434     self.tb = etb
-> 1435 return FormattedTB.structured_traceback(
   1436     self, etype, evalue, etb, tb_offset, number_of_lines_of_context
   1437 )

File /opt/anaconda3/lib/python3.11/site-packages/IPython/core/ultratb.py:1326, in FormattedTB.structured_traceback(self, etype, value, tb, tb_offset, number_of_lines_of_context)
   1323 mode = self.mode
   1324 if mode in self.verbose_modes:
   1325     # Verbose modes need a full traceback
-> 1326     return VerboseTB.structured_traceback(
   1327         self, etype, value, tb, tb_offset, number_of_lines_of_context
   1328     )
   1329 elif mode == 'Minimal':
   1330     return ListTB.get_exception_only(self, etype, value)

File /opt/anaconda3/lib/python3.11/site-packages/IPython/core/ultratb.py:1173, in VerboseTB.structured_traceback(self, etype, evalue, etb, tb_offset, number_of_lines_of_context)
   1164 def structured_traceback(
   1165     self,
   1166     etype: type,
   (...)
   1170     number_of_lines_of_context: int = 5,
   1171 ):
   1172     """Return a nice text document describing the traceback."""
-> 1173     formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
   1174                                                            tb_offset)
   1176     colors = self.Colors  # just a shorthand + quicker name lookup
   1177     colorsnormal = colors.Normal  # used a lot

File /opt/anaconda3/lib/python3.11/site-packages/IPython/core/ultratb.py:1063, in VerboseTB.format_exception_as_a_whole(self, etype, evalue, etb, number_of_lines_of_context, tb_offset)
   1060 assert isinstance(tb_offset, int)
   1061 head = self.prepare_header(str(etype), self.long_header)
   1062 records = (
-> 1063     self.get_records(etb, number_of_lines_of_context, tb_offset) if etb else []
   1064 )
   1066 frames = []
   1067 skipped = 0

File /opt/anaconda3/lib/python3.11/site-packages/IPython/core/ultratb.py:1131, in VerboseTB.get_records(self, etb, number_of_lines_of_context, tb_offset)
   1129 while cf is not None:
   1130     try:
-> 1131         mod = inspect.getmodule(cf.tb_frame)
   1132         if mod is not None:
   1133             mod_name = mod.__name__

AttributeError: 'tuple' object has no attribute 'tb_frame'
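The first ImportError in the traceback above shows dlopen searching for libcapstone.4.dylib, including under a Homebrew Cellar directory for capstone 5.0.1. This suggests, but does not prove, that the native library on the system is a different major version from the one the extension was linked against. A quick way to check which candidate names can actually be loaded is sketched below with ctypes. first_loadable is a hypothetical helper, and the candidate paths are simply copied from the traceback.

```python
import ctypes
from typing import Iterable, Optional


def first_loadable(candidates: Iterable[str]) -> Optional[str]:
    """Return the first library name or path that the dynamic loader accepts."""
    for name in candidates:
        try:
            ctypes.CDLL(name)
        except OSError:
            continue  # not found or not loadable, try the next candidate
        return name
    return None


found = first_loadable(
    [
        "/opt/homebrew/opt/capstone/lib/libcapstone.4.dylib",
        "libcapstone.4.dylib",
    ]
)
print(found or "libcapstone.4.dylib could not be loaded")
```

Note that the pip package named capstone provides the Python bindings, which is a separate matter from whether the loader can find the native libcapstone.4.dylib that the extension asks for.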

CodeSearch not working

Hi,

I am exploring the code search functionality of this library. While running the script 'api-search-for-function-use.py', I get:

ImportError: cannot import name 'CodeSearch' from 'strongarm.objc' (/opt/homebrew/lib/python3.11/site-packages/strongarm/objc/__init__.py)

Looking in the objc folder, I found no corresponding code either. Is this functionality internal to DataTheorem, or am I missing something?

Symbol not found in dataflow.cpython-39-darwin.so

Hi,
I installed strongarm-ios using pip install strongarm-ios,
but I got the following error:

ImportError: dlopen(/opt/homebrew/lib/python3.9/site-packages/strongarm_dataflow/dataflow.cpython-39-darwin.so, 2): Symbol not found: _cs_disasm
  Referenced from: /opt/homebrew/lib/python3.9/site-packages/strongarm_dataflow/dataflow.cpython-39-darwin.so
  Expected in: flat namespace
 in /opt/homebrew/lib/python3.9/site-packages/strongarm_dataflow/dataflow.cpython-39-darwin.so

I am using macOS 11.4 with Python 3.9,
installed from strongarm_dataflow-2.1.5-cp39-cp39-macosx_11_0_arm64.whl.
Thanks.
