
Dedupe Traits (binlex, closed)

c3rb3ru5d3d53c commented on May 20, 2024
Dedupe Traits

from binlex.

Comments (11)

jbx81-1337 commented on May 20, 2024

If someone can link an executable that generates duplicates, I will investigate.


c3rb3ru5d3d53c commented on May 20, 2024

binlex/src/decompiler.cpp

Lines 518 to 527 in 0f84757

if (sections[index].discovered.empty()){
    uint64_t tmp_addr = MaxAddress(sections[index].coverage);
    if (tmp_addr < sections[index].data_size){
        sections[index].discovered.push(tmp_addr);
        sections[index].addresses[tmp_addr] = DECOMPILER_OPERAND_TYPE_FUNCTION;
        sections[index].visited[tmp_addr] = DECOMPILER_VISITED_QUEUED;
        continue;
    }
    break;
}

It looks like there is no check here for whether the address has already been visited, which may be why we are getting dupes.
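For illustration, here is a minimal sketch of what such a guard could look like. It uses simplified stand-ins rather than binlex's real types: the `Section` struct, the `QueueIfUnvisited` helper, and the numeric constant values are all assumptions. Only the field names `discovered`, `addresses`, and `visited` and the two constant names are taken from the snippet above.

```cpp
#include <cstdint>
#include <map>
#include <queue>

// Hypothetical values; only the names appear in the snippet above.
constexpr int DECOMPILER_OPERAND_TYPE_FUNCTION = 1;
constexpr int DECOMPILER_VISITED_QUEUED = 0;

// Simplified stand-in for the per-section state (assumption).
struct Section {
    std::queue<uint64_t> discovered;    // addresses waiting to be decompiled
    std::map<uint64_t, int> addresses;  // address -> operand type
    std::map<uint64_t, int> visited;    // address -> visit state
};

// Queue addr only if it has no entry in `visited` yet.
// Returns true when the address was newly queued.
bool QueueIfUnvisited(Section &s, uint64_t addr) {
    if (s.visited.find(addr) != s.visited.end()) {
        return false;  // already queued or analyzed, so skip it
    }
    s.discovered.push(addr);
    s.addresses[addr] = DECOMPILER_OPERAND_TYPE_FUNCTION;
    s.visited[addr] = DECOMPILER_VISITED_QUEUED;
    return true;
}
```

With a helper like this, the linear-sweep fallback would call `QueueIfUnvisited(sections[index], tmp_addr)` instead of pushing unconditionally.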


jbx81-1337 commented on May 20, 2024

This check does not seem to fix the issue. I am trying a 'guard' check after the DecompilerWorker takes an element from the queue, to verify that the address has not already been visited (it should not have been). If address scheduling turns out to be fine, then the bug is in TraitWorker.
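A rough sketch of that kind of diagnostic guard, assuming a plain queue of addresses and a map of visit states. The function name `FindRescheduled` and the two state constants are hypothetical and not taken from binlex.

```cpp
#include <cstdint>
#include <map>
#include <queue>
#include <vector>

// Hypothetical visit states (assumption).
constexpr int VISITED_QUEUED = 0;
constexpr int VISITED_ANALYZED = 1;

// Drains a copy of the queue the way a worker would and returns every
// address that was dequeued after it had already been marked analyzed.
// An empty result suggests that address scheduling is not the culprit.
std::vector<uint64_t> FindRescheduled(std::queue<uint64_t> queue,
                                      std::map<uint64_t, int> &visited) {
    std::vector<uint64_t> rescheduled;
    while (!queue.empty()) {
        uint64_t addr = queue.front();
        queue.pop();
        auto it = visited.find(addr);
        if (it != visited.end() && it->second == VISITED_ANALYZED) {
            rescheduled.push_back(addr);  // guard tripped: scheduled twice
            continue;
        }
        visited[addr] = VISITED_ANALYZED;  // stands in for the real decompile
    }
    return rescheduled;
}
```

If this returns nothing on a binary that still produces duplicate traits, the duplicates are being created after scheduling, which points toward the trait-building side.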


c3rb3ru5d3d53c commented on May 20, 2024

@jbx81-1337 you may want to try checking out the staging branch, where we keep all our bleeding-edge changes, to see if there is any difference.


c3rb3ru5d3d53c commented on May 20, 2024

@herrcore did we address this, or is this issue still ongoing?


c3rb3ru5d3d53c commented on May 20, 2024

binlex/src/decompiler.cpp

Lines 229 to 257 in dd107c1

if (block == true && IsConditionalInsn(insn) > 0){
    b_trait.tmp_trait = TrimRight(b_trait.tmp_trait);
    b_trait.tmp_bytes = TrimRight(b_trait.tmp_bytes);
    b_trait.size = GetByteSize(b_trait.tmp_bytes);
    b_trait.offset = sections[index].offset + myself.pc - b_trait.size;
    AppendTrait(&b_trait, sections, index);
    ClearTrait(&b_trait);
    if (function == false){
        ClearTrait(&f_trait);
        break;
    }
}
if (block == true && IsEndInsn(insn) == true){
    b_trait.tmp_trait = TrimRight(b_trait.tmp_trait);
    b_trait.tmp_bytes = TrimRight(b_trait.tmp_bytes);
    b_trait.size = GetByteSize(b_trait.tmp_bytes);
    b_trait.offset = sections[index].offset + myself.pc - b_trait.size;
    AppendTrait(&b_trait, sections, index);
    ClearTrait(&b_trait);
}
if (function == true && IsEndInsn(insn) == true){
    f_trait.tmp_trait = TrimRight(f_trait.tmp_trait);
    f_trait.tmp_bytes = TrimRight(f_trait.tmp_bytes);
    f_trait.size = GetByteSize(f_trait.tmp_bytes);
    f_trait.offset = sections[index].offset + myself.pc - f_trait.size;
    AppendTrait(&f_trait, sections, index);
    ClearTrait(&f_trait);
    break;


jbx81-1337 commented on May 20, 2024

It seems AppendTrait is called multiple times for the same trait. I don't understand whether it is possible to trigger more than one of the conditional paths in the code section provided by @c3rb3ru5d3d53c. I tried switching from 'if' to 'else if', but the result seems to be the same. My suggestion is to keep track of traits already processed by AppendTrait in the 'visited' map, set the address to DECOMPILER_VISITED_APPENDED, and remove the duplicated trait.
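A minimal sketch of the "append only once" idea, under some assumptions. The `Trait` and `TraitStore` structs and the `AppendTraitOnce` helper are hypothetical stand-ins, not binlex code. Note one deliberate deviation from the suggestion: instead of marking the address in the `visited` map, this sketch remembers (type, offset) pairs in a separate set, so that a function trait and a block trait at the same offset can both be kept.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, trimmed-down trait record (assumption).
struct Trait {
    std::string type;   // e.g. "block" or "function"
    uint64_t offset;    // file offset of the trait
    std::string bytes;  // hex string of the trait bytes
};

// Hypothetical container for collected traits (assumption).
struct TraitStore {
    std::vector<Trait> traits;
    std::set<std::pair<std::string, uint64_t>> appended;  // (type, offset) seen
};

// Appends the trait unless one with the same type and offset was already
// appended. Returns true when the trait was stored.
bool AppendTraitOnce(const Trait &t, TraitStore &store) {
    auto key = std::make_pair(t.type, t.offset);
    if (!store.appended.insert(key).second) {
        return false;  // duplicate, drop it
    }
    store.traits.push_back(t);
    return true;
}
```

The set lookup is logarithmic in the number of traits, so the extra cost per append is small, although it does add memory proportional to the number of traits collected.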


c3rb3ru5d3d53c commented on May 20, 2024

Investigating this one now 😄


c3rb3ru5d3d53c commented on May 20, 2024

> It seems AppendTrait is called multiple times for the same trait. I don't understand whether it is possible to trigger more than one of the conditional paths in the code section provided by @c3rb3ru5d3d53c. I tried switching from 'if' to 'else if', but the result seems to be the same. My suggestion is to keep track of traits already processed by AppendTrait in the 'visited' map, set the address to DECOMPILER_VISITED_APPENDED, and remove the duplicated trait.

I thought about this a decent amount. There are times when a jump instruction can be referenced multiple times, and I also noticed that functions are not duplicated. As such, I think this issue does stem from what you are describing. The hard part is determining where best to implement the fix. We could add a check to the method CollectOperands, under each switch case, by checking sections[index].addresses and the associated types. I think this would be preferable to adding it to the method AppendTrait, which I think would be inefficient.
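To make the idea concrete, here is a small sketch of a check at operand-collection time. It does not reproduce the real CollectOperands method: the `CollectOperandOnce` helper, the `OperandType` enum, and the way `addresses` and `discovered` are passed in are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <map>
#include <queue>

// Hypothetical operand types; names and values are assumptions.
enum OperandType { OPERAND_TYPE_BLOCK = 0, OPERAND_TYPE_FUNCTION = 1 };

// Records a branch/call target unless the same target was already recorded
// with the same type. Returns true when the target was newly queued.
bool CollectOperandOnce(std::map<uint64_t, int> &addresses,
                        std::queue<uint64_t> &discovered,
                        uint64_t target, OperandType type) {
    switch (type) {
    case OPERAND_TYPE_BLOCK:
    case OPERAND_TYPE_FUNCTION: {
        auto it = addresses.find(target);
        if (it != addresses.end() && it->second == type) {
            return false;  // second reference to the same target, skip
        }
        // A different type overwrites the entry here, because the map
        // holds a single value per address.
        addresses[target] = type;
        discovered.push(target);
        return true;
    }
    }
    return false;
}
```

Two jumps to the same target would then queue that target once, at the cost of one map lookup per operand rather than a scan per appended trait.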


c3rb3ru5d3d53c commented on May 20, 2024

I believe I found the issue: we need to split sections[index].addresses into two sets, which would also be easier to maintain. I could be incorrect, but it appears we cannot store two different values for the same key in a map. We need to, because a function starts with a block at the same address.
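A minimal sketch of the two-set layout, assuming nothing about how binlex ended up implementing it. The `SectionAddresses` struct and the `AddFunction` and `AddBlock` helpers are hypothetical names.

```cpp
#include <cstdint>
#include <set>

// Hypothetical replacement for a single address -> type map (assumption):
// one set per type, so the same address can appear in both.
struct SectionAddresses {
    std::set<uint64_t> functions;
    std::set<uint64_t> blocks;
};

// Each helper returns true only the first time an address is added
// to its own set; the other set is not consulted.
bool AddFunction(SectionAddresses &a, uint64_t addr) {
    return a.functions.insert(addr).second;
}

bool AddBlock(SectionAddresses &a, uint64_t addr) {
    return a.blocks.insert(addr).second;
}
```

With a single `std::map<uint64_t, int>`, writing a block type for an address that already holds a function type replaces the earlier value. With two sets, a function entry point and its first block can share an address, and the return value of `insert` tells the caller whether that address was already known for that type.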


jbx81-1337 commented on May 20, 2024

OK, I don't really understand where the issue is, because I noticed that the duplicated traits are all type=block, and in any case the current code doesn't push the address onto the worker queue if it is already visited. I also ran some checks against older versions of binlex, and they seem to have the same bug.
