repz lifting question about b2r2 HOT 6 CLOSED

b2r2-org commented on August 15, 2024
repz lifting question

from b2r2.

Comments (6)

sangkilc commented on August 15, 2024

Okay here is what's happening.

EIP := ... may or may not correspond to an InterJmp statement. If you look at https://github.com/B2R2-org/B2R2/blob/master/src/BinIR/LowUIR.fs#L412, there is really no difference between InterJmp and Put statements after pretty-printing, which is bad in my opinion. When you reach an InterJmp statement, you should not continue to the next statement. Instead, you should change your PC (EIP) value and go to the corresponding next instruction.

Now, if you look at the loop -0x3 case, EIP := (EIP + 0x19146A:I32) is not an InterJmp; it is merely a Put statement, which should be handled differently. In this case, you update EIP but keep executing the next IR statement. In other words, we stop executing IR statements only when we encounter either an InterJmp or an InterCJmp. An inter-jump in B2R2 basically means that we should stop executing/interpreting the current IR statements and immediately jump to the next instruction. If you look at https://github.com/B2R2-org/B2R2/blob/master/src/BinIR/LowUIR.Eval.fs#L219, those statements are passed to the endBlock function for that reason.
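To make the distinction concrete, here is a minimal Python sketch of the rule described above. It is a toy model, not B2R2's API: the `Put` and `InterJmp` classes and the `run_block` function are hypothetical names invented for illustration, and InterCJmp as well as the intra-instruction Jmp/CJmp statements are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

Regs = Dict[str, int]


@dataclass
class Put:
    """Toy stand-in for a Put: assign a register, then keep going."""
    dst: str
    value: Callable[[Regs], int]


@dataclass
class InterJmp:
    """Toy stand-in for an InterJmp: set EIP and end the IR block."""
    target: Callable[[Regs], int]


Stmt = Union[Put, InterJmp]


def run_block(stmts: List[Stmt], regs: Regs) -> int:
    """Run one instruction's statements; return how many were executed."""
    executed = 0
    for stmt in stmts:
        executed += 1
        if isinstance(stmt, Put):
            # A Put to EIP is an ordinary assignment: fall through.
            regs[stmt.dst] = stmt.value(regs) & 0xFFFFFFFF
        else:
            # An InterJmp updates EIP and stops interpreting this block.
            regs["EIP"] = stmt.target(regs) & 0xFFFFFFFF
            break
    return executed


def dec_ecx(r: Regs) -> int:
    return r["ECX"] - 1


# Both first statements would pretty-print as "EIP := 0x40102E".
regs_put = {"EIP": 0, "ECX": 5}
n_put = run_block([Put("EIP", lambda r: 0x40102E), Put("ECX", dec_ecx)], regs_put)

regs_jmp = {"EIP": 0, "ECX": 5}
n_jmp = run_block([InterJmp(lambda r: 0x40102E), Put("ECX", dec_ecx)], regs_jmp)
```

In this sketch the Put version runs both statements and decrements ECX, while the InterJmp version stops after the first statement and leaves ECX untouched, even though EIP ends up with the same value in both cases.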

Nonetheless, the loop instruction looks wrong, because it should not contain an internal (within an instruction) loop, as you pointed out. I will create two follow-up issues to handle this problem. First, I don't like the current way of pretty-printing InterJmp statements, because it is confusing and they cannot easily be distinguished from Put statements. Second, the semantics of the loop instruction are buggy, and we should fix them.

enkomio commented on August 15, 2024

OK, understood. I updated my code to follow your suggestion.

Thanks for the explanation :)

sangkilc commented on August 15, 2024

Yes, I understand your concern.

It may seem weird, but here, we would like to consider each iteration of a REPZ instruction as a "separate" instruction. In fact, such an assumption is not uncommon. For example, when you are taking an instruction-level execution trace from Pin, it will instrument every iteration of a REPZ instruction too. When you do single-stepping with GDB, the same assumption holds. So we decided to explicitly separate each iteration with EIP := .... This way we can easily align B2R2's execution engine with execution traces obtained from Pin or similar tools.
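As a rough illustration of the per-iteration view, the following Python sketch steps through a `repz movsb` one iteration at a time, following the lifted statements quoted in this thread. It is a toy model under stated assumptions: the function `step_repz_movsb`, the dictionary-based registers and memory, and the initial values are all hypothetical, and every EIP assignment is treated as ending the current step.

```python
from typing import Dict, List

MASK32 = 0xFFFFFFFF


def step_repz_movsb(regs: Dict[str, int], mem: Dict[int, int],
                    addr: int = 0x40102E, length: int = 2) -> None:
    """Execute one step of the instruction at addr, then return."""
    if regs["ECX"] == 0:
        # Exit branch: move on to the instruction after repz movsb.
        regs["EIP"] = (addr + length) & MASK32
        return
    # Continue branch: copy one byte, then advance the pointers.
    mem[regs["EDI"]] = mem[regs["ESI"]]
    delta = -1 if regs["DF"] else 1
    regs["ESI"] = (regs["ESI"] + delta) & MASK32
    regs["EDI"] = (regs["EDI"] + delta) & MASK32
    regs["ECX"] = (regs["ECX"] - 1) & MASK32
    # EIP points back at the same instruction, which ends this step.
    regs["EIP"] = addr


regs = {"EIP": 0x40102E, "ECX": 3, "ESI": 0x1000, "EDI": 0x2000, "DF": 0}
mem = {0x1000: 0x41, 0x1001: 0x42, 0x1002: 0x43}

trace: List[int] = []
while regs["EIP"] == 0x40102E:
    trace.append(regs["EIP"])  # one trace entry per step
    step_repz_movsb(regs, mem)
```

With ECX starting at 3, this toy loop records the same address once per step: three copying steps plus a final step that only takes the exit branch. How closely that final step matches what a given tracing tool reports is not something this sketch tries to settle.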

enkomio commented on August 15, 2024

Ok gotcha. Thanks for the clarification :)

enkomio commented on August 15, 2024

Hi @sangkilc,

Sorry to bother you again about this, but I am struggling to understand how the statements should be interpreted, in particular for the following instructions:

0x19146D:    E2 FB                          loop -0x3 ; 0x19146A

which is lifted to the following statements:

-------------ISMark (19146D, 2)-------------
T_710:I32 := ECX
-------------LMark (Loop)-------------
T_710:I32 := (T_710:I32 - 0x1:I32)
if(T_710:I32 != 0x0:I32) then Jmp (Continue, 127) else Jmp (End, 128)
-------------LMark (Continue)-------------
EIP := (EIP + 0x19146A:I32)
Jmp (Loop, 126)
-------------LMark (End)-------------
-------------IEMark (19146F)-------------

and

0x40102E:    F3 A4                          repz movsb

which is lifted to the following statements:

-------------ISMark (40102E, 2)-------------
if(ECX = 0x0:I32) then Jmp (Exit, 114) else Jmp (Continue, 115)
-------------LMark (Continue)-------------
[EDI] := [ESI]:I8
ESI := (ite (DF) ((ESI - 0x1:I32)) ((ESI + 0x1:I32)))
EDI := (ite (DF) ((EDI - 0x1:I32)) ((EDI + 0x1:I32)))
ECX := (ECX - 0x1:I32)
EIP := 0x40102E:I32
-------------LMark (Exit)-------------
EIP := 0x401030:I32
-------------IEMark (401030)-------------

Initially I assumed that the statements have to be emulated in sequential order, and for the first sample that makes sense. For the second example, however, this strategy doesn't work, since EIP would always end up with the same value.

Also, I have to admit that it is not very clear to me why the loop instruction is lifted that way :)

sangkilc commented on August 15, 2024

Okay, it seems that you found a bug! The loop instruction in particular looks wrong. I will follow up on this soon. Sorry about the brevity.
