capstone-engine / capstone

Capstone disassembly/disassembler framework for ARM, ARM64 (ARMv8), BPF, Ethereum VM, M68K, M680X, Mips, MOS65XX, PPC, RISC-V(rv32G/rv64G), SH, Sparc, SystemZ, TMS320C64X, TriCore, Webassembly, XCore and X86.

Home Page: http://www.capstone-engine.org

C++ 11.25% C 80.46% Makefile 0.09% Python 2.11% Java 1.32% Shell 0.05% OCaml 0.74% Tcl 0.01% Smalltalk 0.06% C# 3.54% CMake 0.05% PowerShell 0.04% Batchfile 0.03% VBA 0.02% Cython 0.03% Visual Basic 6.0 0.19% Ruby 0.02%
reverse-engineering disassembler security framework arm arm64 x86 sparc powerpc mips

capstone's People

Contributors

adamjseitz, aquynh, bughoho, catenacyber, danghvu, emoon, fay59, finnwilkinson, fuzzysecurity, fvrmatteo, hardtobelieve, ibabushkin, imbillow, kabeor, mrexodia, nplanel, oleavr, opntr, peace-maker, pranith, radare, rhelmot, riptl, rot127, scudette, sidneyp, stephengroat, tacoxnguyen, tandasat, tmfink

capstone's Issues

PC operand missing for vldr

$ test-as -mthumb 'vldr d18, [pc, #108]'

/tmp/foo.s.o: file format elf32-littlearm

Disassembly of section .text:

00000000 <.text>:
0: eddf 2b1b vldr d18, [pc, #108] ; 70 <.text+0x70>

$ ./a.out -t eddf 2b1b
2b1beddf: vldr nrop=1
writeback: no
update_flags: no
cond: ARM_CC_AL
operand 00: type=ARM_OP_REG reg=d18
group: ARM_GRP_VFP2

next : regs_read returning incorrect registers for "\x55" (pushl %ebp)

Perhaps I'm doing this wrong, but when disassembling a "push ebp" instruction, the regs_read field seems to contain "esp" instead of "ebp" as I would expect. My library was built via CMake using the code within the "next" branch. Here's some Python code:

CODE = "\x55\x48\x8b\x05\xb8\x13\x00\x00"
import capstone
cs = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_32)
cs.detail = True
insn = cs.disasm(CODE, 0).next()

print insn.mnemonic,insn.op_str
print map(insn.reg_name,insn.regs_read),map(insn.reg_name,insn.regs_write)

assert 'ebp' in map(insn.reg_name,insn.regs_read), "whoops"
assert 'esp' in map(insn.reg_name,insn.regs_write), "omgosh"

which outputs

push ebp
[u'esp'] [u'esp']

I would expect regs_write to contain the esp register and regs_read to contain ebp, right?

MIPS pseudo-instructions

IDA Pro is disassembling MIPS into pseudo-instruction:

.text:00416CB4 26 00 01 3C+                li      $at, 0x269EC

while capstone is not:

0x00416cb4    2600013c     lui at, 0x26
0x00416cb8    c39e2134     ori at, at, 0x9ec3

It would be nice to be able to get both in capstone :)
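
As a rough illustration of what producing the pseudo-instruction would involve, the sketch below folds a lui/ori pair that targets the same register into a single li. The tuple layout and the fold_li helper are hypothetical and are not part of Capstone's API.

```python
def fold_li(insns):
    """Merge adjacent `lui rX, hi` / `ori rX, rX, lo` pairs into `li rX, imm`.

    Each instruction is a hypothetical (mnemonic, operands) tuple, where
    operands is a tuple of register names and integer immediates.
    """
    out = []
    i = 0
    while i < len(insns):
        mnem, ops = insns[i]
        if mnem == "lui" and i + 1 < len(insns):
            nxt_mnem, nxt_ops = insns[i + 1]
            # ori must both read and write the register that lui just loaded
            if nxt_mnem == "ori" and nxt_ops[0] == ops[0] and nxt_ops[1] == ops[0]:
                # lui fills the upper half, ori (zero-extended) fills the lower half
                imm = ((ops[1] & 0xFFFF) << 16) | (nxt_ops[2] & 0xFFFF)
                out.append(("li", (ops[0], imm)))
                i += 2
                continue
        out.append((mnem, ops))
        i += 1
    return out


pair = [("lui", ("at", 0x26)), ("ori", ("at", "at", 0x9EC3))]
for mnem, ops in fold_li(pair):
    print(mnem, ops[0], hex(ops[1]))  # li at 0x269ec3
```

Keeping the original pair available next to the folded form would let a caller choose either view, which is what the request asks for.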

Crash in AArch64 when detail is false

Test case:

from capstone import *
from capstone.arm64 import *

code = "\x1a\x7d\x07\xb8"
md = Cs(CS_ARCH_ARM64, CS_MODE_ARM)
md.detail = False # this makes it crash
for insn in md.disasm (code, 1):
        print "ok"

Fix:

diff --git a/arch/AArch64/AArch64InstPrinter.c b/arch/AArch64/AArch64InstPrinter.c
index 11baf01..c4dfe86 100644
--- a/arch/AArch64/AArch64InstPrinter.c
+++ b/arch/AArch64/AArch64InstPrinter.c
@@ -805,7 +805,7 @@ static void printVectorList(MCInst *MI, unsigned OpNum,
 void AArch64_post_printer(csh handle, cs_insn *flat_insn, char *insn_asm)
 {
        // check if this insn requests write-back
-       if (strrchr(insn_asm, '!') != NULL)
+       if (flat_insn && flat_insn->detail && strrchr(insn_asm, '!') != NULL)
                flat_insn->detail->arm64.writeback = true;
 }

`groups` is unexpectedly empty for many x86 instructions

When I enable detail mode, I expected the groups field to have at least one entry. But many x86 instructions are returned with empty groups. Here's a minimal test case:

from capstone import *

CODE = "\x55\x48\x89\xe5"

md = Cs(CS_ARCH_X86, CS_MODE_64)
md.detail = True
for i in md.disasm(CODE, 0x1000):
    print "0x%x:\t%s\t%s\tgroups = %s" % (
        i.address, i.mnemonic, i.op_str, str(i.groups)
    )

Two instructions will be printed: the first instruction has X86_GRP_MODE64 set; that looks right. But the second instruction has empty groups. Certainly this instruction should also be at least part of the same X86_GRP_MODE64 group?

Actual output:

0x1000: push    rbp     groups = [17]
0x1001: mov     rbp, rsp        groups = []

This bug seems to affect many, many instructions, not just mov.

jz vs je

capstone disassembles the 74/75 opcodes as 'je/jne' while other disassemblers use 'jz/jnz'. In fact, both mnemonics assemble to the same instruction, so that's just an aesthetic issue.

I personally prefer the 'jz/jnz' form, should we change this? What's your preference?

unable to install cython binding

running build_ext
cythoning pyx/capstone.pyx to pyx/capstone.c
error: /var/tmp/portage/dev-python/capstone-python-2.0/work/capstone-2.0/bindings/python/pyx/capstone.pyx: No such file or directory
/pyx $ ls
README       arm.py    arm64_const.py  capstone.py    ccapstone.pyx  mips_const.py  ppc_const.py  x86_const.py
__init__.py  arm64.py  arm_const.py    ccapstone.pxd  mips.py        ppc.py         x86.py

MIPS disassembler segfault

$ lldb -- rasm2 -a mips.cs -d 04110001
Current executable set to 'rasm2' (x86_64).
(lldb) r
Process 96820 launched: '/usr/bin/rasm2' (x86_64)
Process 96820 stopped
* thread #1: tid = 0x54470, 0x000000010100205a libcapstone.dylib`insn_find(m=0x0000000101488bb0, max=<unavailable>, id=0) + 58 at utils.c:30, queue = 'com.apple.main-thread, stop reason = EXC_BAD_ACCESS (code=1, address=0x9701488a84)
    frame #0: 0x000000010100205a libcapstone.dylib`insn_find(m=0x0000000101488bb0, max=<unavailable>, id=0) + 58 at utils.c:30
   27
   28       while(begin <= end) {
   29           i = (begin + end) / 2;
-> 30           if (id == m[i].id)
   31               return i;
   32           else if (id < m[i].id)
   33               end = i - 1;
(lldb) bt
* thread #1: tid = 0x54470, 0x000000010100205a libcapstone.dylib`insn_find(m=0x0000000101488bb0, max=<unavailable>, id=0) + 58 at utils.c:30, queue = 'com.apple.main-thread, stop reason = EXC_BAD_ACCESS (code=1, address=0x9701488a84)
    frame #0: 0x000000010100205a libcapstone.dylib`insn_find(m=0x0000000101488bb0, max=<unavailable>, id=0) + 58 at utils.c:30
    frame #1: 0x0000000101033cd0 libcapstone.dylib`Mips_get_insn_id(insn=0x00007fff5fbeb720, id=<unavailable>) + 288 at mapping.c:1410
    frame #2: 0x00000001010016ee libcapstone.dylib`fill_insn(handle=<unavailable>, insn=0x00007fff5fbeb720, buffer=0x00007fff5fbeb518, mci=0x00007fff5fbfcd30, printer=0x0000000000000000, code=0x0000000100501480) + 78 at cs.c:179
    frame #3: 0x00000001010018ec libcapstone.dylib`cs_disasm_dyn(ud=4300216320, buffer=0x0000000100501480, size=4, offset=0, count=1, insn=0x00007fff5fbfd430) + 348 at cs.c:321
    frame #4: 0x00000001003ebd89 asm_mips_cs.dylib`disassemble + 169
    frame #5: 0x00000001000b271e libr_asm.dylib`r_asm_disassemble(a=0x0000000100403900, op=0x00007fff5fbfd500, buf=0x0000000100501480, len=4) + 110 at asm.c:307
    frame #6: 0x00000001000b2c78 libr_asm.dylib`r_asm_mdisassemble(a=0x0000000100403900, buf=0x0000000100501480, len=4) + 440 at asm.c:370
    frame #7: 0x000000010000224d rasm2`rasm_disasm(buf=0x00007fff5fbffd14, offset=0, len=4, bits=32, ascii=0, bin=0, hex=0) + 893 at rasm2.c:101
    frame #8: 0x0000000100001a95 rasm2`main(argc=5, argv=0x00007fff5fbffbe0) + 4149 at rasm2.c:364
    frame #9: 0x00007fff8af065fd libdyld.dylib`start + 1
(lldb) disassemble -p
libcapstone.dylib`insn_find + 58 at utils.c:30:
-> 0x10100205a:  movl   (%rdi,%rcx), %ecx
   0x10100205d:  cmpl   %edx, %ecx
   0x10100205f:  je     0x10100206a               ; insn_find + 74 at utils.c:40
   0x101002061:  leal   -1(%rax), %esi
(lldb) register read
General Purpose Registers:
       rax = 0x000000007fffffff
       rbx = 0x0000000000000258
       rcx = 0x00000095fffffed4           <----- this value is 'i' and that's an out of bounds read op
       rdx = 0x0000000000000000
       rdi = 0x0000000101488bb0  insns
       rsi = 0x00000000ffffffff
       rbp = 0x00007fff5fbeb440
       rsp = 0x00007fff5fbeb440
        r8 = 0x0000000000000000
        r9 = 0x00000000ffffffff
       r10 = 0x001004003200c803
       r11 = 0xfffffffffffee7e0
       r12 = 0x00007fff5fbeb720
       r13 = 0x00007fff5fbfcd30
       r14 = 0x0000000101488bb0  insns
       r15 = 0x00007fff5fbeb518
       rip = 0x000000010100205a  libcapstone.dylib`insn_find + 58 at utils.c:30
    rflags = 0x0000000000010206
        cs = 0x000000000000002b
        fs = 0x0000000000000000
        gs = 0x000000007fff0000
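
The backtrace places the fault inside the binary-search loop of insn_find, but the report does not include the whole function, so the root cause is not confirmed here. As a sketch only, this is a search over a sorted table that cannot index outside it, including for an empty table or an id that is absent. The (id, payload) table layout is an assumption made for the example.

```python
def insn_find(table, insn_id):
    """Binary-search a list of (id, payload) entries sorted by id.

    Returns the index of the matching entry, or None when the table is
    empty or the id is absent, so the caller never indexes out of range.
    """
    begin, end = 0, len(table) - 1
    while begin <= end:
        # Midpoint written so it cannot overflow in fixed-width languages
        mid = begin + (end - begin) // 2
        entry_id = table[mid][0]
        if entry_id == insn_id:
            return mid
        if insn_id < entry_id:
            end = mid - 1
        else:
            begin = mid + 1
    return None


table = [(1, "a"), (5, "b"), (9, "c")]
print(insn_find(table, 5), insn_find(table, 0), insn_find([], 0))
```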

Remove unused defines and update version script in Makefile

From 84d3a315cdded403d5478d9bae22b1fc4769e5c7 Mon Sep 17 00:00:00 2001
From: pancake <[email protected]>
Date: Wed, 11 Dec 2013 01:42:12 +0100
Subject: [PATCH] Remove unused defines and update version script in Makefile

---
 Makefile | 2 +-
 cs.c     | 4 ----
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index 9ab50de..d6aef0f 100644
--- a/Makefile
+++ b/Makefile
@@ -60,7 +60,7 @@ LIBRARY = lib$(LIBNAME).$(EXT)
 ARCHIVE = lib$(LIBNAME).$(AR_EXT)
 PKGCFGF = $(LIBNAME).pc

-VERSION=$(shell echo `grep -e PKG_MAJOR -e PKG_MINOR cs.c | grep -v = | awk '{print $$3}'` | awk '{print $$1"."$$2}')
+VERSION=$(shell echo `grep -e CS_API_ include/capstone.h | grep -v = | awk '{print $$3}'` | awk '{print $$1"."$$2}')

 .PHONY: all clean install uninstall

diff --git a/cs.c b/cs.c
index b517f4e..964a5fc 100644
--- a/cs.c
+++ b/cs.c
@@ -28,10 +28,6 @@

 #include "utils.h"

-// Package version
-#define PKG_MAJOR 1
-#define PKG_MINOR 0
-

 void cs_version(int *major, int *minor)
 {
-- 
1.8.5.1


Wrong x86-16 disassembly

Using this shell function we will test the issues:

$ chk() { rasm2 -a x86.cs -b16 -d $1 ; rasm2 -a x86 -b16 -d $1; }

Some 16bit x86 instructions are disassembled as 32bit

$ chk 257175
and eax, 0x7571
and ax, 0x7571
$ chk 2d0066
sub eax, 0x6600
sub ax, 0x6600
$ chk 6f           ######### FIXED
outsw
outsw
$ chk 357320
xor eax, 0x2073
xor ax, 0x2073
$ chk ed
in eax, dx
in ax, dx

Some look wrong on udis86 and capstone...

$ chk 67653a20
cmp ah, byte ptr gs:[eax]
cmp ah, [gs:eax]
$ chk 66696c65202e2e2e
imul ebp, word ptr [si + 0x65], 0x2e2e2e20
imul ebp, [si+0x65], 0x2e2e2e20
$ chk 0f57c0
xorps xmm0, xmm0
xorps xmm0, xmm0

No distinction between insb and insw opcodes

$ chk 6c  ######## FIXED
insb
insb
$ chk 6d
insd
insw

Use instruction groups for classification of instruction operations

Right now the groups are limited to just the different feature flags supported by LLVM (as I understand it). It would be useful to have additional groups that define how the instruction operates.

So far the "JUMP" group is the only one similar. But I can imagine having groups for "Reads memory", "Writes memory", "Arithmetic", "Conditional", "Floating point", "Coprocessor/DSP", ... to name just a few examples. The only instructions which should not be a member of any groups are NOP, and SKIPDATA.

Missing operands for "xchg eax, ebx"

Hi,

When disassembling xchg eax, ebx, 'eax' is not included in the operands array. However, op_str correctly returns "eax, ebx".

I've attached python code to trigger the bug, but the problem also exists when using the C API directly.

import capstone

md = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_32)
# xchg eax, ebx
# same problem for all other 1byte xchg
instr = md.disasm("\x93",0x10,1)[0]
print len(instr.operands) # 1
print instr.op_str # eax, ebx
print instr.reg_name(instr.operands[0].value.reg) # ebx

Thanks in advance.

-- Felix

Segfault in x86 disassembler

The killer sequence of bytes is: "\xff\x8c\xf9\xff\xff\x9b\xf9"

You can reproduce the crash with this program:

#include <stdio.h>
#include <capstone.h>

int main() {
        int i, n, ret;
        csh handle;
        cs_insn *insn;

        ret = cs_open (CS_ARCH_X86, CS_MODE_32, &handle);
        if (ret) {
                printf ("Failed\n");
                return 1;
        }
        n = cs_disasm_dyn (handle, "\xff\x8c\xf9\xff\xff\x9b\xf9", 7, 0, 0, &insn);
        if (n>0)
        for (i=0; i<n; i++) {
                printf ("%d -> (sz=%d) : %s %s\n", i,
                        insn[i].size,
                        insn[i].mnemonic,
                        insn[i].op_str);
        }
        cs_close (handle);
        return 0;
}

backtrace:

(lldb) bt
* thread #1: tid = 0x5e686, 0x00007fff908a0866 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread, stop reason = signal SIGABRT
    frame #0: 0x00007fff908a0866 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff90cd235c libsystem_pthread.dylib`pthread_kill + 92
    frame #2: 0x00007fff9097fbba libsystem_c.dylib`abort + 125
    frame #3: 0x00007fff9097fd31 libsystem_c.dylib`abort_report_np + 181
    frame #4: 0x00007fff909a38c5 libsystem_c.dylib`__chk_fail + 48
    frame #5: 0x00007fff909a3895 libsystem_c.dylib`__chk_fail_overflow + 16
    frame #6: 0x00007fff909a3ae4 libsystem_c.dylib`__strcpy_chk + 83
    frame #7: 0x000000010000d77d a.out`X86_Intel_printInst [inlined] get_first_op(buffer=<unavailable>) + 3101 at X86IntelInstPrinter.c:178
    frame #8: 0x000000010000d718 a.out`X86_Intel_printInst(MI=0x00007fff909a3895, O=<unavailable>, Info=<unavailable>) + 3000 at X86IntelInstPrinter.c:208
    frame #9: 0x0000000100001f45 a.out`cs_disasm_dyn(ud=4300224704, buffer=0x0000000100049378, size=<unavailable>, offset=0, count=<unavailable>, insn=0x00007fff5fbffb60) + 725 at cs.c:270
    frame #10: 0x000000010000150d a.out`main + 125
    frame #11: 0x00007fff9630a5fd libdyld.dylib`start + 1
(lldb)
frame #7: 0x000000010000d77d a.out`X86_Intel_printInst [inlined] get_first_op(buffer=<unavailable>) + 3101 at X86IntelInstPrinter.c:178
   175              memcpy(firstop, tab + 1, comma - tab - 1);
   176              firstop[comma - tab - 1] = '\0';
   177          } else
-> 178              strcpy(firstop, tab + 1);
   179      } else  // no op
   180          firstop[0] = '\0';
   181  }

Missing operand for Thumb 4919 ldr r1, [pc, #100]

$ ./a.out -t 4919
00004919: ldr nrop=1
writeback: no
update_flags: no
cond: ARM_CC_AL
operand 00: type=ARM_OP_REG reg=r1
group: ARM_GRP_THUMB
group: ARM_GRP_THUMB1ONLY

There's a PC-relative memory operand here too. Substituting another register, say, r0, results in correct output:

$ ./a.out -t 6e41
00006e41: ldr nrop=2
writeback: no
update_flags: no
cond: ARM_CC_AL
operand 00: type=ARM_OP_REG reg=r1
operand 01: type=ARM_OP_MEM base=r0 index=none scale=1 disp=100
group: ARM_GRP_THUMB
group: ARM_GRP_THUMB1ONLY

16bit segment bounds error

Using this test from radare2, capstone returns the wrong result:

NAME="16bit segment bounds - capstone"
FILE=malloc://1024k
CMDS='
e asm.arch=x86.cs
e asm.bits=16
e anal.hasnext=0
wx e9c300 @ f000:ffaa
s f000:ffaa
pi 1
'
EXPECT='jmp 0xf0070
'
run_test
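
The expected value in this test follows from 16-bit IP wrap-around inside the segment. A small sketch of the arithmetic, where jmp_rel16_target is a hypothetical helper and not a radare2 or Capstone function:

```python
def jmp_rel16_target(segment, offset, insn_len, rel16):
    """Linear target of a near `jmp rel16` executed in 16-bit real mode."""
    # Interpret the 16-bit displacement as a signed value.
    if rel16 & 0x8000:
        rel16 -= 0x10000
    # IP arithmetic wraps at 64 KiB; the segment base is unaffected.
    new_ip = (offset + insn_len + rel16) & 0xFFFF
    return (segment << 4) + new_ip


# e9 c3 00 at f000:ffaa: a 3-byte instruction with displacement 0x00c3
print(hex(jmp_rel16_target(0xF000, 0xFFAA, 3, 0x00C3)))  # 0xf0070
```

Without the 16-bit mask the sum would be 0x10070, which lands past the end of the segment instead of wrapping to offset 0x0070.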

Python 3 and capstone

Hi,
it's great that an excellent library such as capstone is available for Python, but there is a problem with Python 3, since it doesn't support implicit relative imports, in contrast to Python 2.
So instead of doing, in __init__.py, from capstone import Cs (other imports omitted for the sake of clarity), you can do it this way: from . import Cs. In my opinion it would be even better to remove the capstone.py module file and move its contents directly into __init__.py. Other modules also need to be slightly corrected.
Besides that, in some distributions like Arch, python is a symbolic link to python3 (or, to be exact, to python3.4), so building the bindings fails due to Python 2-style prints. It can be easily fixed; e.g. in test.py the prints are Python 3 friendly (except line 47, print to_hex(code)), but you forgot to do from __future__ import print_function. If you want, I can fix these defects and create a pull request.

libcapstone.dylib is placed in the wrong location, causes errors with the python binding

The error:

2502:python jesus$ python test_x86.py
Traceback (most recent call last):
  File "test_x86.py", line 5, in <module>
    from capstone import *
  File "/Users/xxx/development/capstone-2.0/bindings/python/capstone/__init__.py", line 1, in <module>
    from capstone import Cs, CsError, cs_disasm_quick, cs_version, cs_support, CS_API_MAJOR, CS_API_MINOR, CS_ARCH_ARM, CS_ARCH_ARM64, CS_ARCH_MIPS, CS_ARCH_X86, CS_ARCH_PPC, CS_ARCH_ALL, CS_MODE_LITTLE_ENDIAN, CS_MODE_ARM, CS_MODE_THUMB, CS_OPT_SYNTAX, CS_OPT_SYNTAX_DEFAULT, CS_OPT_SYNTAX_INTEL, CS_OPT_SYNTAX_ATT, CS_OPT_SYNTAX_NOREGNAME, CS_OPT_DETAIL, CS_OPT_ON, CS_OPT_OFF, CS_MODE_16, CS_MODE_32, CS_MODE_64, CS_MODE_BIG_ENDIAN, CS_MODE_MICRO, CS_MODE_N64
  File "/Users/xxx/development/capstone-2.0/bindings/python/capstone/capstone.py", line 150, in <module>
    raise ImportError("ERROR: fail to load the dynamic library.")
ImportError: ERROR: fail to load the dynamic library.

bindings/python/capstone.py when imported looks for libcapstone.dll, libcapstone.so, and libcapstone.dylib in a number of locations. There is a libcapstone.dylib file that is produced in the root directory of the project. Moving this to the same folder as capstone.py (in reality it should work anywhere as long as it's in $PATH) fixes the error. I faced this problem using both brew&pip and compiling from source. I did not have this issue on Linux.

Multiple segmentation faults with CS_OPT_SKIPDATA

Whenever an instruction is marked as SKIPDATA, capstone should not try to access insn->detail, as it will be null, thus causing segmentation faults (see, for example, cs_insn_group).

I think the correct way to handle this is to

  • Set handle->errnum to CS_ERR_SKIPDATA if insn is actually skipped due to CS_OPT_SKIPDATA
  • Always check for insn->detail to be not null before accessing it (Maybe throwing something different from CS_ERR_DETAIL to identify the two cases)

Wrong ARM32 BL decoding

Hello, I've noticed that capstone fails to correctly decode the following BLNE instruction:

# test1.py
from capstone import *

CODE = "\x83\x74\x21\x1B" # 0xFF7A46EC -> BLNE 0x1900

md = Cs(CS_ARCH_ARM, CS_MODE_ARM)
for i in md.disasm(CODE, 0xFF7A46EC):
    print "0x%x:\t%s\t%s" %(i.address, i.mnemonic, i.op_str)

Output (wrong):

0xff7a46ec: blne    #0x85d20c
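
For reference, the expected target can be checked by hand: the printed value 0x85d20c equals the raw branch offset (imm24 shifted left by 2) rather than the offset added to the PC. A minimal sketch of the A32 B/BL target computation, with arm_branch_target as a hypothetical helper name:

```python
import struct


def arm_branch_target(code, address):
    """Target of an A32 B/BL instruction given its 4 little-endian bytes."""
    (word,) = struct.unpack("<I", code)
    imm24 = word & 0x00FFFFFF
    # Sign-extend the 24-bit field, then scale by the 4-byte instruction size.
    if imm24 & 0x00800000:
        imm24 -= 0x01000000
    offset = imm24 << 2
    # In ARM state the PC reads as the instruction address plus 8;
    # the result wraps around the 32-bit address space.
    return (address + 8 + offset) & 0xFFFFFFFF


print(hex(arm_branch_target(b"\x83\x74\x21\x1b", 0xFF7A46EC)))  # 0x1900
```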

ARM_GRP_JUMP is empty

The ARM_GRP_JUMP contains no instructions.

This is because no ARM instruction is marked .branch or .indirect_branch inside ARMMapping.c

64-bit addresses on 16bit disassembly

The problem is not only the address but also the length of the instruction: in capstone that call is 5 bytes long, as in 32bit, but it should be 3.

$ rasm2 -o 0x100000000 -a x86.cs -b16  -d e8c6020000
call 0x1000002cb

$ rasm2 -o 0x100000000 -a x86 -b16  -d e8c602
call 0x2c9
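
The difference between the two results comes down to the immediate size and the width the target is truncated to. A simplified sketch under the assumption that no prefix bytes are present; near_call is a hypothetical helper and handles only opcode e8 in 16-bit and 32-bit modes:

```python
def near_call(address, code, bits):
    """Decode a near `call rel` (opcode 0xE8) with the mode's default operand size.

    Returns (instruction_length, target). Prefix bytes are not handled.
    """
    if code[0] != 0xE8:
        raise ValueError("not a near call")
    if bits not in (16, 32):
        raise ValueError("only 16-bit and 32-bit modes are sketched here")
    imm_size = 2 if bits == 16 else 4
    rel = int.from_bytes(code[1:1 + imm_size], "little", signed=True)
    length = 1 + imm_size
    # The instruction pointer is truncated to the operand size.
    mask = 0xFFFF if bits == 16 else 0xFFFFFFFF
    return length, (address + length + rel) & mask


length, target = near_call(0x100000000, bytes.fromhex("e8c6020000"), 16)
print(length, hex(target))  # 3 0x2c9
```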

GCC MIPS: Default macro issue

GCC MIPS toolchain has the following default macro defined:

mips-linux-gcc -dM -E - < /dev/null | grep mips
[...]
#define __mips__ 1
#define mips 1
#define _mips 1
[...]

The mips default GCC macro clashes with fields in structures, e.g. cs_mips mips becomes cs_mips 1

Adding an #undef mips at the beginning of include/mips.h solves the issue and capstone happily compiles with the MIPS toolchain.

Shift seems to be attached to wrong ARM operand

Shouldn't we be attaching the shift to the second operand? Unless I'm misunderstanding what the assembly does, the only thing that happens to r1 is a word being stored in it.

$ test-as 'ldr r1, [r2, r0, lsl #3]'

/tmp/foo.s.o: file format elf32-littlearm

Disassembly of section .text:

00000000 <.text>:
0: e7921180 ldr r1, [r2, r0, lsl #3]

$ ./a.out e7921180
e7921180: ldr nrop=2
writeback: no
update_flags: no
cond: ARM_CC_AL
operand 00: type=ARM_OP_REG reg=r1 sft=ARM_SFT_LSL #3
operand 01: type=ARM_OP_MEM base=r2 index=r0 scale=1 disp=0
group: ARM_GRP_ARM

~/Desktop

Undocumented dependency on Cs object in python API

Hi,

the python API fails silently when the Cs object goes out of scope. I am not sure about the best way to fix this. The __del__ method of Cs calls cs_close which makes a big chunk of the API unusable.
Adding a reference to Cs in each object is an easy fix, but adds an (unacceptable) overhead to each instruction object. A better way might be some kind of internal reference counting to ensure that cs_close is only called when all instructions are out of scope as well.

Example code:

import capstone

def test_disasm():
    md = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_32)
    # jmp esp
    instr = md.disasm("\xff\xe4",0x10,1)[0]
    print instr.reg_name(instr.operands[0].value.reg) # "esp"
    return instr

instr2 = test_disasm()
print instr2.reg_name(instr2.operands[0].value.reg) # "None"

If this is the desired behavior, it should be documented more clearly.

-- Felix
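
To make the "reference in each object" option concrete, here is a toy model of it. Handle and Insn are stand-in classes written for this example, not the real binding classes; the only point is that an instruction holding a strong reference keeps its handle usable after the creating function returns.

```python
class Handle:
    """Stand-in for an engine handle that becomes unusable once closed."""

    def __init__(self):
        self.is_open = True

    def reg_name(self, reg_id):
        # Mirrors the reported behaviour: lookups fail after close
        return "reg%d" % reg_id if self.is_open else None

    def __del__(self):
        self.is_open = False


class Insn:
    def __init__(self, handle, reg_id):
        self._handle = handle  # strong reference keeps the handle alive
        self.reg_id = reg_id

    def reg_name(self):
        return self._handle.reg_name(self.reg_id)


def disasm_one():
    handle = Handle()  # local variable, goes out of scope on return
    return Insn(handle, 4)


insn = disasm_one()
print(insn.reg_name())  # reg4
```

The cost mentioned in the report is visible here too: every instruction object carries one extra pointer-sized attribute.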

Incorrect offset in ARM64 LDP/STP

When disassembling the following ARM64 instruction:
STP W0, W1, [X13, #-8]

Capstone will produce the following output:
STP W0, W1, [X13, #4294967288]

It seems the immediate offset isn't sign-extended properly
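
The printed value 4294967288 is 0xFFFFFFF8, which is -8 when read as a signed 32-bit integer. A short sketch of the sign extension involved; sign_extend is a hypothetical helper, and the second line assumes the 32-bit LDP/STP form, whose 7-bit signed immediate is scaled by 4:

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


print(sign_extend(4294967288, 32))  # -8
print(sign_extend(0x7E, 7) * 4)     # -8
```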

Segfault in AArch64

$ lldb -- rasm2 -a arm.cs -b64 -d 0000004c
Current executable set to 'rasm2' (x86_64).
(lldb) r
Process 90750 launched: '/usr/bin/rasm2' (x86_64)
Process 90750 stopped
* thread #1: tid = 0x4634c, 0x0000000101099218 libcapstone.dylib`printVectorList [inlined] getRegisterName(RegNo=<unavailable>) + 9 at AArch64GenAsmWriter.inc:8847, queue = 'com.apple.main-thread, stop reason = EXC_BAD_ACCESS (code=1, address=0x5010e591c)
    frame #0: 0x0000000101099218 libcapstone.dylib`printVectorList [inlined] getRegisterName(RegNo=<unavailable>) + 9 at AArch64GenAsmWriter.inc:8847
   8844   //for (i = 0; i < sizeof(RegAsmOffset)/4; i++)
   8845   //      printf("%s = %u\n", AsmStrs+RegAsmOffset[i], i + 1);
   8846   //printf("*************************\n");
-> 8847   return AsmStrs+RegAsmOffset[RegNo-1];
   8848 }
   8849
   8850 #ifdef PRINT_ALIAS_INSTR
(lldb) bt
* thread #1: tid = 0x4634c, 0x0000000101099218 libcapstone.dylib`printVectorList [inlined] getRegisterName(RegNo=<unavailable>) + 9 at AArch64GenAsmWriter.inc:8847, queue = 'com.apple.main-thread, stop reason = EXC_BAD_ACCESS (code=1, address=0x5010e591c)
    frame #0: 0x0000000101099218 libcapstone.dylib`printVectorList [inlined] getRegisterName(RegNo=<unavailable>) + 9 at AArch64GenAsmWriter.inc:8847
    frame #1: 0x000000010109920f libcapstone.dylib`printVectorList(MI=<unavailable>, OpNum=<unavailable>, O=0x00007fff5fbeb4e8, MRI=0x0000000100601650, Layout=<unavailable>, Count=<unavailable>) + 143 at AArch64InstPrinter.c:641
    frame #2: 0x000000010109607f libcapstone.dylib`AArch64InstPrinter_printInstruction(MI=<unavailable>, O=<unavailable>, MRI=0x0000000100601650) + 7519 at AArch64GenAsmWriter.inc:6743
    frame #3: 0x0000000101077928 libcapstone.dylib`cs_disasm_dyn(ud=4301264240, buffer=0x00000001006011f0, size=4, offset=0, count=1, insn=0x00007fff5fbfd410) + 312 at cs.c:316
    frame #4: 0x00000001003c4d0f asm_arm_cs.dylib`disassemble + 303
    frame #5: 0x00000001000b1cd8 libr_asm.dylib`r_asm_disassemble(a=0x00000001004038d0, op=0x00007fff5fbfd4e0, buf=0x00000001006011f0, len=4) + 120 at asm.c:308
    frame #6: 0x00000001000b2258 libr_asm.dylib`r_asm_mdisassemble(a=0x00000001004038d0, buf=0x00000001006011f0, len=4) + 440 at asm.c:374
    frame #7: 0x000000010000224d rasm2`rasm_disasm(buf=0x00007fff5fbffd08, offset=0, len=4, bits=64, ascii=0, bin=0, hex=0) + 893 at rasm2.c:101
    frame #8: 0x0000000100001a95 rasm2`main(argc=6, argv=0x00007fff5fbffbc8) + 4149 at rasm2.c:364
    frame #9: 0x00007fff943805fd libdyld.dylib`start + 1
    frame #10: 0x00007fff943805fd libdyld.dylib`start + 1
(lldb)

Optimize MIPS disassembler

The file arch/Mips/mapping.c contains a lot of O(n) functions that can be reduced to O(1) by using an indirection table instead of iterating over all items of the array.
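
A sketch of the indirection idea, independent of the actual layout of mapping.c: build, once, a table indexed by id that stores the position of each entry, then look entries up by direct indexing. The entry format and the ids below are arbitrary values chosen for the example.

```python
def build_index(entries):
    """Return a list where index[id] is the position of that id in `entries`.

    `entries` is a list of (id, name) pairs with small non-negative ids;
    ids that do not occur map to None.
    """
    size = max((insn_id for insn_id, _ in entries), default=-1) + 1
    index = [None] * size
    for pos, (insn_id, _) in enumerate(entries):
        index[insn_id] = pos
    return index


def lookup(entries, index, insn_id):
    """Constant-time replacement for a linear scan over `entries`."""
    if 0 <= insn_id < len(index) and index[insn_id] is not None:
        return entries[index[insn_id]]
    return None


entries = [(7, "addiu"), (3, "lw"), (12, "jalr")]
index = build_index(entries)
print(lookup(entries, index, 3), lookup(entries, index, 4))
```

The table costs memory proportional to the largest id, which is the usual trade-off for this kind of lookup.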

Error on Fedora Linux 20, Capstone 2.1.2 while searching for libcapstone.so library

Enabling this debug line https://github.com/aquynh/capstone/blob/master/bindings/python/capstone/capstone.py#L133, shows that the library is searched here:
from capstone import *
Trying to load: /usr/lib/python2.7/site-packages/capstone/libcapstone.dll
Trying to load: /usr/lib/python2.7/site-packages/capstone/libcapstone.so
Trying to load: /usr/lib/python2.7/site-packages/capstone/libcapstone.dylib
While the lib is installed actually into:
/usr/lib/libcapstone.so

Note: I've installed capstone directly downloading it from the git repo, I've done "./make.sh" from the main dir followed by su -c "./make.sh install" , and from the python bindings dir I've done 'su -c "make install"'
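
One possible way to cover a system-wide install such as /usr/lib/libcapstone.so is to fall back to the platform's library search when the package directory has no copy. The sketch below uses the standard ctypes.util.find_library call; find_capstone and LIB_NAMES are names invented for this example and are not taken from the binding.

```python
import ctypes.util
import os

LIB_NAMES = ("libcapstone.dll", "libcapstone.so", "libcapstone.dylib")


def find_capstone(package_dir):
    """Return a path or library name to load, or None if nothing was found."""
    # First look next to the Python package, as the binding already does.
    for name in LIB_NAMES:
        path = os.path.join(package_dir, name)
        if os.path.exists(path):
            return path
    # Then ask the platform's usual search mechanism (e.g. ldconfig on Linux).
    return ctypes.util.find_library("capstone")


print(find_capstone(os.path.join(os.sep, "nonexistent-capstone-dir")))
```

The returned value can be passed to ctypes.cdll.LoadLibrary; what find_library returns depends on the system, so the printed result varies from machine to machine.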

Small issues in web doc lang_python.html

There are a few small inconsistencies in
http://www.capstone-engine.org/lang_python.html

Extra modes are said to be used combined with [...] CS_MODE_MIPS
CS_MODE_MIPS doesn't exist
You probably meant to say they are to be combined with CS_MODE_32 or CS_MODE_64 of CS_ARCH_MIPS

in 5. Architecture-dependent details:

The example code is missing the line
md.detail = True

I had to remove the unknown function to_x() to get it working as in the example output

Example output is showing addresses starting at 0x38 but example code puts starting address = 0x1000

Remove tabs

I have found that there are many places where '\t' is inside the opcode name or the operands.

Here's a small python test case:

$ cat cs.py
from capstone import *
MIPS_CODE = "\x04\x11\x00\x01"
md = Cs(CS_ARCH_MIPS, CS_MODE_32+CS_MODE_BIG_ENDIAN)
for insn in md.disasm(MIPS_CODE, 0x1000):
        print("==> ", insn.op_str)

$ python cs.py
('==> ', 'bal\t0x8')

Finding the tabs

$ grep -re '\\t' arch/

Bad disassembly of MOV pc, lr

Capstone reports "mov pc, lr", e1a0f00e, as having no operands, even though it correctly disassembles other forms of mov.

~/Desktop
$ test-as 'mov pc, lr'

/tmp/foo.s.o: file format elf32-littlearm

Disassembly of section .text:

00000000 <.text>:
0: e1a0f00e mov pc, lr

~/Desktop
$ ./a.out e1a0f00e
e1a0f00e: mov nop=0
writeback: no
update_flags: no
cond: ARM_CC_AL
group: ARM_GRP_ARM

~/Desktop
$ test-as 'mov pc, ip'

/tmp/foo.s.o: file format elf32-littlearm

Disassembly of section .text:

00000000 <.text>:
0: e1a0f00c mov pc, ip

~/Desktop
$ ./a.out e1a0f00c
e1a0f00c: mov nop=2
writeback: no
update_flags: no
cond: ARM_CC_AL
operand 00: type=ARM_OP_REG reg=pc
operand 01: type=ARM_OP_REG reg=ip
group: ARM_GRP_ARM

~/Desktop

Invalid decomposition of MIPS instruction

When disassembling "\x1c\x00\x40\x14" (which is a bnez opcode), checking the .id field returns BNE instead of BNEZ.

Here's the test case:

/* Vala/Capstone Example - 2013 - pancake */

using Capstone;

void *bytes = "\x1c\x00\x40\x14";
int bytes_len = 4;

void main() {
    Insn* insn;
    Handle handle;

    var ret = open (Arch.MIPS, Mode.@32, out handle);
    if (ret != Capstone.Error.OK) {
        stderr.printf ("Error initializing capstone\n");
        return;
    }

    var n = disasm_dyn (handle, bytes, bytes_len, 0x01000, 0, out insn);
    if (n<1) {
        print ("invalid\n");
    } else if (n>0) {
        for (int i = 0; i<n; i++) {
            var op = &insn[i];
            print ((string)op.mnemonic+" "+(string)op.op_str+"\n");
            if (op.id == MipsInsn.BNEZ) {
                print ("Works fine!\n");
            } else {
                print ("Invalid decomposition :(!\n");
                print ("op.id=%d (should be %d)\n", (int)op.id, MipsInsn.BNEZ);
            }
        }
    }
    close (handle);
}

Execution of the test:

$ ./test-mips
bnez $2, 0x74
Invalid decomposition :(!
op.id=51 (should be 64)

OSX pkgconfig file locations

The install script for OSX puts .pc files in a location that is not in the default path. I don't even know what locations ARE in the default path for vanilla OSX, but macports uses /opt/local/lib/pkgconfig

Until we get capstone into macports, can we detect ports in the install script on OSX and use the port path? The other common package manager is Homebrew, I don't know where they put .pc files.

If we can do this then I can resolve bnagy/gapstone#2 by just using pkgconfig for my Go bindings

next : cmake win32 exports directory

When building the "next" branch w/ CMake on win32 (specifically with -G "NMake Makefiles") there are no exports defined within the linked .dll

You can use generate_export_header in CMakeLists.txt to generate a file to include, and then just include that somewhere (like in platform.h). It'll give you some defines to prefix all your exported functions with. Although, it might be better to just not use CMake to accomplish this and do it manually since there's other more preferable methods of building the library available.

X86 Prefix ordering

I have found an issue with prefix ordering that causes the disassembler to ignore both the osz and the repe/repne prefixes:

% ./quickcs 66 f2 af
scasd eax, dword ptr es:[edi]
% ./quickcs f2 66 af
repne scasw ax, word ptr es:[edi]

This is currently a problem in LLVM ToT, but the prefix handling in this base is much closer to reality. Thanks!

MIPS register naming in operand string

When using the most straightforward way to produce disassembly (ie. insn.mnemonic + insn.op_str) I'd expect Capstone to output register names instead of numeric register identifiers.

Objdump:

5c980:  8f998010    lw  t9,-32752(gp)
5c984:  03e07821    move    t7,ra
5c988:  0320f809    jalr    t9

Capstone:

0x30:   lw  $25, -0x7ff0($gp)
0x34:   move    $15, $ra
0x38:   jalr    $25

Looking at /arch/Mips/mapping.c it has names commented out in favor of numbered registers, which seems rather counterintuitive to me.

(I was also unable to patch the names back in, as it seemingly requires regenerating the .inc files - I don't know how these are produced)

New bytes appearing when disassembling

I have a very simple shellcode, and when I disassemble it with capstone, two new random bytes appear in my code out of nowhere.. It's really annoying, since it is destroying all the relative offset jumps..
("75 F6" becomes "0A 09 75 F6")
disasm

Windows 7 32bits
Python 2.7.2
fresh install of capstone with the windows binary install

Code used:
from capstone import *
from binascii import hexlify

CODE = """\xeb\x11\x5e\x31\xc9\xb1\x27\x80\x6c\x0e\xff\x35\x80\xe9\x01
\x75\xf6\xeb\x05\xe8\xea\xff\xff\xff\x20\x4a\x66\xf5\xe5\x44
\x90\x66\xfe\x9b\xee\x34\x36\x02\xb5\x66\xf5\xe5\x36\x66\x10
\x02\xb5\x1d\x1b\x34\x34\x34\x64\x9a\xa9\x98\x64\xa5\x96\xa8
\xa8\xac\x99"""

md = Cs(CS_ARCH_X86, CS_MODE_32)
for i in md.disasm(CODE, 0x0000):
    print "0x%s:\t%s\t\t%s\t%s" %(i.address, hexlify(i.bytes), i.mnemonic, i.op_str)
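
One plausible explanation, not confirmed in the report: a triple-quoted string keeps its line breaks, so every newline inside CODE becomes a 0x0a byte, and any leading whitespace on a continuation line (a tab is 0x09) is included as well. The sketch below shows the effect on a short excerpt and one way to split a long literal without adding bytes.

```python
from binascii import hexlify

# Triple-quoted literal: the line break is part of the data.
broken = b"""\x80\xe9\x01
\x75\xf6"""

# Adjacent literals inside parentheses are concatenated with nothing in between.
fixed = (
    b"\x80\xe9\x01"
    b"\x75\xf6"
)

print(hexlify(broken))  # b'80e9010a75f6'
print(hexlify(fixed))   # b'80e90175f6'
```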

Incorrect condition code for bx instruction

The condition code should be ARM_CC_AL, not ARM_CC_INVALID.

/tmp/foo.s.o: file format elf32-littlearm
Disassembly of section .text:

00000000 <.text>:
0: e12fff10 bx r0

~/Desktop
$ ./a.out e12fff10
e12fff10: bx nop=1
writeback: no
update_flags: no
cond: ARM_CC_INVALID
operand 00: type=ARM_OP_REG reg=r0
group: ARM_GRP_ARM
group: ARM_GRP_V4T
group: ARM_GRP_JUMP

~/Desktop

Files built without respecting LDFLAGS have been detected

  • QA Notice: Files built without respecting LDFLAGS have been detected
  • Please include the following list of files in your report:
  • /usr/share/capstone/tests/test_x86
  • /usr/share/capstone/tests/test_arm64
  • /usr/share/capstone/tests/test_detail
  • /usr/share/capstone/tests/test
  • /usr/share/capstone/tests/test_arm
  • /usr/share/capstone/tests/test_mips

Do not overwrite LDFLAGS variable in your Makefile. You should inherit it instead
