Building with Stack
Flexdis is meant to build with stack. For example,
stack --stack-yaml stack-8.6.yaml build
A library for disassembling x86-64 binaries.
License: BSD 3-Clause "New" or "Revised" License
If you run DumpInstr on a push instruction with an immediate argument, you'll get something like this:
$ cabal run DumpInstr -- 6a 00
II {iiLockPrefix = NoLockPrefix, iiAddrSize = Size64, iiOp = "push", iiArgs = [(ByteImm 0,OpType ImmediateSource BSize)], iiPrefixes = Prefixes {_prLockPrefix = NoLockPrefix, _prSP = SegmentPrefix {unwrapSegmentPrefix = 0}, _prREX = 0b00000000, _prVEX = Nothing, _prASO = False, _prOSO = False}, iiRequiredPrefix = Nothing, iiOpcode = [106], iiRequiredMod = Nothing, iiRequiredReg = Nothing, iiRequiredRM = Nothing}
push 0x0
Notice that the address size iiAddrSize (and in this case, the operand size as well) is Size64 (8 bytes), while the operand is ByteImm 0 (only 1 byte). This is a bit dubious. Per the ISA manual:
If the source operand is an immediate of size less than the operand size, a sign-extended value is pushed on the stack. If the source operand is a segment register (16 bits) and the operand size is 64-bits, a zero-extended value is pushed on the stack; if the operand size is 32-bits, either a zero-extended value is pushed on the stack or the segment selector is written on the stack using a 16-bit move. For the last case, all recent Core and Atom processors perform a 16-bit move, leaving the upper portion of the stack location unmodified.
Since push sign-extends the operand in some cases, ideally we would indicate this in the InstructionInstance itself. In the example above, for instance, we should be using ByteSignedImm instead of ByteImm. Fixing this may be as simple as editing the parts of optable.xml that pertain to push.
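To illustrate why the distinction between ByteImm and ByteSignedImm matters, here is a minimal sketch contrasting sign-extension and zero-extension of an 8-bit immediate to 64 bits. The function names are hypothetical and are not part of flexdis86; the sketch only shows the arithmetic that the ISA manual describes for push with a small immediate.

```haskell
import Data.Int (Int8)
import Data.Word (Word8, Word64)

-- | Reinterpret an 8-bit immediate as signed, then widen it to 64 bits.
-- This mirrors what push does with an imm8 under a 64-bit operand size.
-- (Hypothetical helper, not a flexdis86 function.)
signExtendImm8 :: Word8 -> Word64
signExtendImm8 b = fromIntegral (fromIntegral b :: Int8)

-- | Zero-extension, shown for contrast: this is the reading that an
-- unsigned ByteImm would suggest.
zeroExtendImm8 :: Word8 -> Word64
zeroExtendImm8 = fromIntegral
```

For an immediate such as 0x00 the two agree, which is why the example above looks harmless. For 0x80 they differ: sign-extension gives 0xffffffffffffff80, zero-extension gives 0x80.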
The utils/dump.sh script is handy, but it always assumes that the input is in Intel syntax. It would be handy to allow it to accept AT&T syntax as well, especially since the test cases here use AT&T syntax. Similarly for utils/dump_bytes.sh.
binary-symbols is a general-purpose library for describing symbols in binaries, which makes its placement in flexdis86 (ostensibly an x86-specific library) somewhat confusing. We should move binary-symbols to a different home to reflect its broader scope.
This instruction will fail the roundtrip test.
mov %rdi,0x601000: FAIL
(This came up in Refurbish, but it's almost certainly in Flexdis.)
The musl implementation of floating-point formatting has the instructions
40103e: db ac 24 90 1d 00 00 fldt 0x1d90(%rsp)
401045: 48 8d 05 f8 33 00 00 lea 0x33f8(%rip),%rax # 404444 <_fini+0xa0>
Refurbish rewrites them as
40103e: db 2c 24 fldt (%rsp)
401041: 90 nop
401042: 1d 00 00 48 8d sbb $0x8d480000,%eax
401047: 05 f8 33 00 00 add $0x33f8,%eax
What seems to be happening is that the ac in fldt gets misread as 2c, which shortens the instruction, causing the next few instructions to get mangled.
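One way to see why this single byte changes the instruction length is to split both values into their ModRM fields. The sketch below uses hypothetical helper names (not flexdis86 functions) and ignores the special cases noted in the comments.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word8)

-- | Split a ModRM byte into its (mod, reg, rm) fields.
-- (Hypothetical helper for illustration.)
modRMFields :: Word8 -> (Word8, Word8, Word8)
modRMFields b = (b `shiftR` 6, (b `shiftR` 3) .&. 7, b .&. 7)

-- | Number of displacement bytes implied by the mod field alone.
-- This deliberately ignores the mod=0 special cases (rm=5, or a SIB
-- byte whose base field is 5), which carry a 32-bit displacement.
dispBytesForMod :: Word8 -> Int
dispBytesForMod 1 = 1
dispBytesForMod 2 = 4
dispBytesForMod _ = 0
```

Here modRMFields 0xac is (2, 5, 4) and modRMFields 0x2c is (0, 5, 4): the reg and rm fields agree, and only mod differs. With mod = 2 the SIB byte is followed by a 4-byte displacement (the 90 1d 00 00 above), while with mod = 0 there is none, which matches the 7-byte original and the 3-byte rewritten fldt.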
You can test this by adding a line to test-fp.c in the Refurbish tests:
int main() {
entry();
printf("%.9f", -0.169075164); /* new line */
return 0;
}
This will cause a segfault when someone tries to access the nonsense pointer.
The parse tables occupy about 400MB in memory after they are constructed, as can be seen in this profile collected by @RyanGlScott: verify-RSA.saw.pdf. There are two factors to this memory consumption:
flexdis86/src/Flexdis86/Disassembler.hs
Lines 415 to 420 in c19b55e
flexdis86/src/Flexdis86/Disassembler.hs
Lines 171 to 180 in c19b55e
Addressing the former is tricky. One could use a simple DFA to parse prefix bytes separately to save an enormous amount of space. However, not all prefixes are valid for all instructions; those restrictions are currently properly encoded in the fully elaborated tables. To separate out prefix parsing, it would be necessary to add a post-parsing check to see if the parse was valid or not.
Addressing the latter might be less tricky, as we could change the representation of the tables. Another disassembler uses a mostly unboxed structure: https://github.com/travitch/dismantle/blob/48433e7ccb02924b2f4695c8c9f09fb9cfccdfc4/dismantle-tablegen/src/Dismantle/Tablegen/LinearizedTrie.hs#L34. The x86 case is a bit trickier as the parser has more states than the parsers generated by dismantle. However, we might be able to take inspiration from the more compact parse table representation and adapt it for flexdis.
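As a rough illustration of the first idea (parsing prefixes separately and validating them afterwards), here is a minimal sketch. All names are hypothetical, none of this is flexdis86's actual API, and it only handles legacy prefix bytes. REX and VEX are left out, as are mandatory prefixes that are really part of an opcode.

```haskell
import Data.Word (Word8)

-- | Legacy x86 prefix bytes: lock/rep, segment overrides, and the
-- operand-size and address-size overrides.
isLegacyPrefix :: Word8 -> Bool
isLegacyPrefix b =
  b `elem` [0xf0, 0xf2, 0xf3, 0x2e, 0x36, 0x3e, 0x26, 0x64, 0x65, 0x66, 0x67]

-- | Step 1: peel the prefix bytes off the front of the byte stream, so
-- the remaining bytes can be looked up in a prefix-free opcode table.
splitPrefixes :: [Word8] -> ([Word8], [Word8])
splitPrefixes = span isLegacyPrefix

-- | Step 2: the post-parsing check. Given the prefixes that the parsed
-- instruction permits (an assumed per-instruction list), reject any
-- parse that saw a prefix outside that list.
prefixesAllowed :: [Word8] -> [Word8] -> Bool
prefixesAllowed allowed = all (`elem` allowed)
```

For example, splitPrefixes [0x66, 0x90] yields ([0x66], [0x90]). The trade-off is the one described above: the tables shrink because prefix combinations are no longer elaborated, but validity moves from table construction to a runtime check.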
E.g., adding the test ("movslq 0x601018,%rax", ["movslq 0x601018,%rax"]) to the twoOperandOpcodes block of round trip tests causes a failure:
movslq 0x601018,%rax: FAIL
tests/Roundtrip.hs:297:
Assembled bytes
[II {iiLockPrefix = NoLockPrefix, iiAddrSize = Size64, iiOp = "movsxd", iiArgs = [(QWordReg rax,OpType ModRM_reg QSize),(Mem32 (Addr_64 ds Nothing Nothing (Disp32 0x601018)),OpType ModRM_rm DSize)], iiPrefixes = Prefixes {_prLockPrefix = NoLockPrefix, _prSP = SegmentPrefix {unwrapSegmentPrefix = 0}, _prREX = 0b01001000, _prVEX = Nothing, _prASO = False, _prOSO = False}, iiRequiredPrefix = Nothing, iiOpcode = [99], iiRequiredMod = Nothing, iiRequiredReg = Nothing, iiRequiredRM = Nothing}]
Length: 7 (0x7) bytes
0000: 48 63 05 18 10 60 00 Hc...`.
expected: "Hc\EOT%\CAN\DLE`\NUL"
but got: "Hc\ENQ\CAN\DLE`\NUL"
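The expected and actual strings are easier to compare as hex. The small helper below (hypothetical, not part of the test suite) renders each character of the shown strings as a byte.

```haskell
import Data.Char (ord)
import Numeric (showHex)

-- | Render each character of a byte string as two hex digits.
hexBytes :: String -> String
hexBytes = unwords . map (pad . flip showHex "" . ord)
  where
    pad s = replicate (2 - length s) '0' ++ s

-- The two strings from the test failure above.
expectedHex, actualHex :: String
expectedHex = hexBytes "Hc\EOT%\CAN\DLE`\NUL"
actualHex   = hexBytes "Hc\ENQ\CAN\DLE`\NUL"
```

This gives 48 63 04 25 18 10 60 00 for the expected bytes and 48 63 05 18 10 60 00 for the assembled ones. The expected form uses ModRM 04 followed by SIB 25, which encodes an absolute 32-bit displacement. The assembled form uses ModRM 05 with no SIB byte, which in 64-bit mode is RIP-relative. That suggests the assembler picks the RIP-relative encoding for an operand that was parsed as an absolute address.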
During my update to 8.8 compatibility, I changed a few calls to fail into calls to error. I thought that this was safe, but there has been a regression in the renovate tests that makes it seem like this might not have been as safe as I thought. In particular, it seems like the failure in parseValue
(
flexdis86/src/Flexdis86/Disassembler.hs
Line 685 in c263871
fail
in this case is challenging because neither of the instances of the ByteReader monad can easily implement it in a way besides calling error.
flexdis86 currently fails to build on GHC 8.8 with the following MonadFail-related error:
src/Flexdis86/ByteReader.hs:74:24: error:
• Could not deduce (MonadFail m) arising from a use of ‘fail’
from the context: ByteReader m
bound by the class declaration for ‘ByteReader’
at src/Flexdis86/ByteReader.hs:52:35-44
Possible fix:
add (MonadFail m) to the context of
the type signature for:
invalidInstruction :: forall a. m a
or the class declaration for ‘ByteReader’
• In the expression: fail "Invalid instruction"
In an equation for ‘invalidInstruction’:
invalidInstruction = fail "Invalid instruction"
|
74 | invalidInstruction = fail "Invalid instruction"
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
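GHC's second suggestion, adding MonadFail to the class context, can be sketched as follows. This is a cut-down, hypothetical stand-in for the ByteReader class with a toy instance, assuming GHC 8.8 or later (where MonadFail is exported from the Prelude). It is not the real flexdis86 definition, and the real instances may not be able to implement fail this cleanly.

```haskell
import Data.Word (Word8)

-- Hypothetical, simplified stand-in for flexdis86's ByteReader class.
-- The MonadFail superclass is what lets the default method call fail.
class MonadFail m => ByteReader m where
  readByte :: m Word8

  invalidInstruction :: m a
  invalidInstruction = fail "Invalid instruction"

-- A toy reader over a list of bytes; Nothing signals a failed parse.
newtype ListReader a =
  ListReader { runListReader :: [Word8] -> Maybe (a, [Word8]) }

instance Functor ListReader where
  fmap f (ListReader g) = ListReader $ \bs ->
    case g bs of
      Nothing -> Nothing
      Just (a, rest) -> Just (f a, rest)

instance Applicative ListReader where
  pure a = ListReader $ \bs -> Just (a, bs)
  ListReader f <*> ListReader g = ListReader $ \bs ->
    case f bs of
      Nothing -> Nothing
      Just (h, rest) -> case g rest of
        Nothing -> Nothing
        Just (a, rest') -> Just (h a, rest')

instance Monad ListReader where
  ListReader g >>= k = ListReader $ \bs ->
    case g bs of
      Nothing -> Nothing
      Just (a, rest) -> runListReader (k a) rest

-- Failure is representable here, so fail does not need to call error.
instance MonadFail ListReader where
  fail _ = ListReader (const Nothing)

instance ByteReader ListReader where
  readByte = ListReader $ \bs -> case bs of
    [] -> Nothing
    (b : rest) -> Just (b, rest)
```

In this toy setting, running invalidInstruction returns Nothing rather than throwing. Whether the same approach fits the actual ByteReader instances depends on whether their underlying monads have a way to represent failure.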
It looks like some newer GCCs will emit the endbr instructions at targets of indirect branches to make use of Intel's Indirect Branch Tracking feature. It is semantically a no-op.
Constructing the NextOpcodeTable at runtime is rather expensive according to my recent profiling. This seems like something that could run at compile-time.
#12 introduced some tests that disassemble the "tiny" binaries in sample-binaries. Optimally, we would be able to roundtrip the .text sections of all these binaries. At the moment, that test code is commented out because it fails.
The output of utils/dump.sh on xchg ax,ax and xchg eax,eax reveals something unusual:
$ ./utils/dump.sh xchg ax,ax
66 90 xchg ax,ax
II {iiLockPrefix = NoLockPrefix, iiAddrSize = Size64, iiOp = "xchg", iiArgs = [(DWordReg eax,OpType (Opcode_reg 0) VSize),(DWordReg eax,OpType (Reg_fixed 0) VSize)], iiPrefixes = Prefixes {_prLockPrefix = NoLockPrefix, _prSP = SegmentPrefix {unwrapSegmentPrefix = 0}, _prREX = 0b00000000, _prVEX = Nothing, _prASO = False, _prOSO = False}, iiRequiredPrefix = Nothing, iiOpcode = [102,144], iiRequiredMod = Nothing, iiRequiredReg = Nothing, iiRequiredRM = Nothing}
nop
$ ./utils/dump.sh xchg eax,eax
87 c0 xchg eax,eax
II {iiLockPrefix = NoLockPrefix, iiAddrSize = Size64, iiOp = "xchg", iiArgs = [(DWordReg eax,OpType ModRM_rm VSize),(DWordReg eax,OpType ModRM_reg VSize)], iiPrefixes = Prefixes {_prLockPrefix = NoLockPrefix, _prSP = SegmentPrefix {unwrapSegmentPrefix = 0}, _prREX = 0b00000000, _prVEX = Nothing, _prASO = False, _prOSO = False}, iiRequiredPrefix = Nothing, iiOpcode = [135], iiRequiredMod = Nothing, iiRequiredReg = Nothing, iiRequiredRM = Nothing}
nop
The unusual part is that despite these instructions having different operand sizes (16-bit in the former and 32-bit in the latter), flexdis86 claims that both instructions have 32-bit (DWord) operands. The former mistakenly claims that the 16-bit operands are 32-bit because of xchg's janky treatment in data/optable.xml:
Lines 8389 to 8393 in 7109bdc
This has both an operand-size override (oso, or 0x66) as well as an opcode beginning with 0x66. But the ISA manual entry for xchg shows the opcode being just 0x90. I suspect that the reason things are this way is to make the opcode table work with the current disassembler machinery, which encodes all prefix combinations and therefore needs the 0x66 bit to disambiguate xchg from nop, whose opcode is also 0x90. (See also c20797e, the commit which gave xchg this special treatment in the opcode table.)
Besides the bug above, one other strange consequence of this is that flexdis86 thinks that 66 66 90 is a different instruction than 66 90:
$ cabal run exe:DumpInstr -- 66 90
II {iiLockPrefix = NoLockPrefix, iiAddrSize = Size64, iiOp = "xchg", iiArgs = [(DWordReg eax,OpType (Opcode_reg 0) VSize),(DWordReg eax,OpType (Reg_fixed 0) VSize)], iiPrefixes = Prefixes {_prLockPrefix = NoLockPrefix, _prSP = SegmentPrefix {unwrapSegmentPrefix = 0}, _prREX = 0b00000000, _prVEX = Nothing, _prASO = False, _prOSO = False}, iiRequiredPrefix = Nothing, iiOpcode = [102,144], iiRequiredMod = Nothing, iiRequiredReg = Nothing, iiRequiredRM = Nothing}
nop
$ cabal run exe:DumpInstr -- 66 66 90
II {iiLockPrefix = NoLockPrefix, iiAddrSize = Size64, iiOp = "xchg", iiArgs = [(WordReg ax,OpType (Opcode_reg 0) VSize),(WordReg ax,OpType (Reg_fixed 0) VSize)], iiPrefixes = Prefixes {_prLockPrefix = NoLockPrefix, _prSP = SegmentPrefix {unwrapSegmentPrefix = 0}, _prREX = 0b00000000, _prVEX = Nothing, _prASO = False, _prOSO = True}, iiRequiredPrefix = Nothing, iiOpcode = [102,144], iiRequiredMod = Nothing, iiRequiredReg = Nothing, iiRequiredRM = Nothing}
xchg ax,ax
Moreover, this reveals that there is a special case in the pretty-printer for xchg ax,ax that causes it to be printed as nop, but there is not a corresponding special case for xchg eax,eax. This seems inconsistent.
Modern x86_64 processors support a notrack prefix (0x3e) that can be attached to indirect jmp and call instructions, which can be used to ignore CET indirect branch tracking. Here is an example from a cat binary found on a Linux machine:
3e ff e0 notrack jmpq *%rax
Currently, flexdis86 fails to disassemble this:
$ cabal run exe:DumpInstr -- 3e ff e0
Up to date
DumpInstr: No parse: Invalid instruction
CallStack (from HasCallStack):
error, called at utils/DumpInstr.hs:59:24 in main:Main
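For reference, the byte pattern in question can be recognized with a check like the following. This is a minimal sketch with a hypothetical function name, not a proposed patch: it assumes the 0x3e byte immediately precedes the ff opcode and ignores any other prefixes or a REX byte in between.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word8)

-- | True when the bytes start with 3e ff /2 (indirect near call) or
-- 3e ff /4 (indirect near jmp), i.e. a notrack-prefixed indirect branch.
-- (Hypothetical helper; does not handle REX or additional prefixes.)
isNotrackBranch :: [Word8] -> Bool
isNotrackBranch (0x3e : 0xff : modrm : _) = regField `elem` [2, 4]
  where
    -- The reg field of ModRM selects the operation within opcode ff.
    regField = (modrm `shiftR` 3) .&. 7
isNotrackBranch _ = False
```

For the example above, isNotrackBranch [0x3e, 0xff, 0xe0] is True: the ModRM byte e0 has reg = 4 (jmp) and rm = 0 (rax). Outside of this context 0x3e is the DS segment override prefix, so a disassembler has to decide the meaning based on the instruction that follows.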