
Relative calls lifting · b2r2 (closed, 7 comments)


Comments (7)

sangkilc commented on August 15, 2024

Thanks for the report. In fact, this is not a bug: the target address computation is based on the concrete address of the instruction itself, as you indicated. Do you have any specific reason why RIP should appear in the lifted IR? IMHO, it will only complicate the formulas anyway.


tathanhdinh commented on August 15, 2024

Thank you for the response.
There are cases where we need a symbolic RIP, for example in this snippet:

0x0        e8 fb ff ff ff        call -0x5
0x5        58                    pop rax

which is more or less mov rax, rip.

Because RIP is always concrete, it is not possible to prove rax == rip in LowIR, and this has effects on optimization and caching.
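
To make the contrast concrete, here is a minimal sketch in plain Python (these are not B2R2's LowIR types; the helpers lift_concrete and lift_symbolic are hypothetical). Concrete lifting bakes the popped value in as a literal that changes with the snippet's placement, while a symbolic RIP would give one placement-independent expression from which rax == rip + 5 (the rip of the call) can be read off syntactically.

def lift_concrete(call_addr, call_len=5):
    # Concrete lifting: the pushed return address is a literal constant.
    return ("const", call_addr + call_len)

def lift_symbolic(call_len=5):
    # Symbolic lifting: the return address is an expression over RIP,
    # independent of where the snippet is loaded.
    return ("add", ("reg", "rip"), ("const", call_len))

print(lift_concrete(0x0))   # ('const', 5): tied to this particular placement
print(lift_concrete(0x42))  # ('const', 71), i.e. 0x47: a new constant per placement
print(lift_symbolic())      # ('add', ('reg', 'rip'), ('const', 5)): one formula for all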

For example, when implementing a DBI we may encounter several copies of the same basic block at different addresses; one way to accelerate the DBI is to optimize such a block and put the result into a cache. Given the block below (well, it is not a true basic block because of the call and ret, but it is just an example):

0x0        53                    push rbx
0x1        c3                    ret
0x2        e8 f4 ff ff ff        call -0xc
0x7        58                    pop rax
0x8        48 31 d8              xor rax, rbx

an optimizer would be able to know that rax = 0 at the end of the block, but that is not yet possible in the current B2R2.


sangkilc commented on August 15, 2024

0x0        e8 fb ff ff ff        call -0x5
0x5        58                    pop rax

In your example above, the call instruction will jump to itself and go into an infinite loop. So I think you wanted to say something like:

0x0        e8 00 00 00 00        call 0x5
0x5        58                    pop rax

Now, this is effectively the same as mov rax, 0x5, where 0x5 is the return address pushed by the call (the address of the pop instruction). If the call is instead placed at 0x42, then the snippet will be lifted as mov rax, 0x47, and so on.

We still know that rax will always be a fixed constant determined by the address of the first instruction (the call), and an optimizer can benefit from that regardless of whether we have a symbolic RIP or not. Let me know if I am missing something.


tathanhdinh commented on August 15, 2024

Thank you for the response.

...the call instruction will jump to itself and go into an infinite loop. So I think you wanted to say something like:

I feel ashamed :(. You are right; this is indeed call 0x5 (or call 0 in Intel's assembly).

Yes, the optimizer will reduce this snippet to mov rax, 0x5 (instead of mov rax, rip + 0x5). But that is exactly the problem I want to point out: if we have many copies of

call 0x5
pop rax

at different base addresses, then we cannot reduce them to a single general formula rax = rip + 0x5; the only thing we can do is enumerate them one by one, so the optimizer has to re-optimize each block with its corresponding base rip.
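
To illustrate the enumeration problem, here is a hypothetical sketch in plain Python (not any real DBI or B2R2 code; optimize_concrete is a stand-in for a per-copy optimizer). Because the folded constant depends on the base address, the cache has to be keyed on (code bytes, base) and every copy is optimized separately.

CODE = bytes.fromhex("e800000000" "58")   # call 0x5 ; pop rax

def optimize_concrete(code, base):
    # Stand-in optimizer: for this snippet the block collapses to
    # "rax := base + 5", with the constant already folded in.
    assert code == CODE
    return ("mov_rax_const", base + 5)

cache = {}
for base in (0x1000, 0x2000, 0x3000):     # three copies of the same bytes
    key = (CODE, base)                    # keying on CODE alone would be unsound here
    if key not in cache:
        cache[key] = optimize_concrete(CODE, base)   # re-optimized once per base

print(len(cache))   # 3: one cache entry and one optimization pass per base address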


sangkilc commented on August 15, 2024

Sorry, but I am still not sure about "reducing multiple of those call-pop pairs".

Suppose we have consecutive call-pop pairs like this:

0  call 5
5  pop rax
6  call 11
11 pop rax
12 call 17
17 pop rax

Now, we can translate this into

mov rax, 5
mov rax, 11
mov rax, 17

Now, with what you have mentioned, we would have:

mov rax, rip
mov rax, rip
mov rax, rip

But this is wrong, because RIP changes as you convert the instruction to mov rax, rip: the address of the instruction changes, and the corresponding RIP changes with it. This means that after executing the above three instructions, we will have a different value in rax. So, to me, it is more accurate to use concrete addresses.
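
A small sanity check of this point, as a plain Python sketch (the addresses and instruction sizes are made up): evaluating the concrete translation always ends with rax = 17, whereas the naive mov rax, rip rewrite ends with rax equal to the address of the last rewritten instruction, which is unrelated to 17 and moves whenever the rewritten code moves.

def run_concrete():
    rax = None
    for value in (5, 11, 17):          # mov rax, 5 / mov rax, 11 / mov rax, 17
        rax = value
    return rax

def run_naive_rip(rewrite_base, insn_size):
    rax = None
    for i in range(3):                 # mov rax, rip, three times
        rax = rewrite_base + i * insn_size   # rip taken as the instruction's own address
    return rax

print(run_concrete())                  # 17
print(hex(run_naive_rip(0x4000, 7)))   # 0x400e, and it changes with the placement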


tathanhdinh commented on August 15, 2024

Thank you for the response; I really appreciate it.

Sorry for my bad explanation: I did not mean that we should replace the "call-pop" pair with a single mov rax, rip + 0x5, and your example

mov rax, rip
mov rax, rip
mov rax, rip

is indeed a very good example of why that would not work.

What I mean is that, by symbolizing RIP, we get a general formula rax = base_address + 0x5. Though "call-pop" is too trivial an example to show the effect, we can imagine a more sophisticated basic block in which the resulting values of some registers depend on the base address, and of which there are many copies at different base addresses.

An emulator (using LowIR as its internal IR and optimizing with a symbolic PC) would then have a chance to compute the result directly, without replaying each concrete basic block; a sketch of this idea follows.
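
Continuing the hypothetical cache sketch from above (plain Python, not B2R2's API): with a symbolic PC the optimizer can produce one block summary that is a function of the base address, cache it keyed on the code bytes alone, and instantiate it for every copy without re-optimizing.

CODE = bytes.fromhex("e800000000" "58")   # call 0x5 ; pop rax

def optimize_symbolic(code):
    # Stand-in optimizer: the block's effect is summarized once, parametrically,
    # as rax = base + 5 (that is, rax = rip_of_call + 5).
    assert code == CODE
    return lambda base: {"rax": base + 5}

cache = {}
summary = cache.setdefault(CODE, optimize_symbolic(CODE))   # optimized exactly once

for base in (0x1000, 0x2000, 0x3000):                       # three copies, no re-optimization
    print(hex(base), {reg: hex(val) for reg, val in summary(base).items()})
# 0x1000 {'rax': '0x1005'}
# 0x2000 {'rax': '0x2005'}
# 0x3000 {'rax': '0x3005'}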


sangkilc commented on August 15, 2024

Thank you for the explanation; I now see what you mean. I think such a global optimization could definitely be beneficial, but we will not change our design decision at this point, for several reasons:

  1. Having an explicit RIP for every relative branch instruction makes the ASTs unnecessarily big. It is only one extra node per instruction, but there are a great many relative branches in a program.
  2. The optimization you mentioned is difficult to implement in practice, and we are not sure it is critical for the efficiency of an analyzer or an emulator.
  3. Making RIP explicit is not difficult in itself, but it requires careful modification across the entire front end of our system.

So, all in all, if we see more need for this, we may resurrect this issue and the PR, but for now we will not handle it. I hope this makes sense to you, and thanks for the fruitful discussion.

