Comments (10)

theyoucheng avatar theyoucheng commented on August 16, 2024

from cbmc.

theyoucheng avatar theyoucheng commented on August 16, 2024

from cbmc.

quiveringlemon avatar quiveringlemon commented on August 16, 2024

WRT problem 1: I meant to say that line 18 executes just in case the condition at line 17 is true (line 18 is in the body of the conditional statement at line 17). The two lines should therefore have the same suspiciousness, which they don't.

WRT problem 2: I'm worried about this. As we are still in the testing phase, we want to test pfl according to specification (it was shown to be a lot more accurate than sbo in my HVC paper). However, pfl assigns fault probabilities from the coverage matrix, and this matrix must conform to the specification I described, in which each line of code gets exactly one column. I presume that, with the current setup, pfl makes its assessments from the matrix it is given, in which each unwinding of a loop gets a new column.

Very roughly speaking (and without getting into the statistical particularities of pfl), the probability that a covered program artefact is the bug, given a row of the matrix, is 1/n, where n is the number of 1s in that row (that is at least how I wrote the C++ code I passed to you, which I gather is implemented in the current branch). Suppose the execution in question covers only a single line of code, that line is faulty, and it happens to be inside a loop. If we have unwound that loop 1000 times, and each unwinding features in the matrix and is executed, then the current setup will report the probability of that line being faulty as 1/1000, which is completely wrong (the correct probability is 1, as it is the only line executed in an error trace).

Thus, in general, it is very important that the coverage matrix given to the fault localisation measures conforms to the specification. Simply removing repeated lines from the suspiciousness report won't solve this problem.
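To make the 1/1000 example concrete, here is a minimal sketch of the rough 1/n rule described above. It is not the pfl implementation. The function names, the source line number 12, and the idea of merging unwound columns back onto their source line are assumptions made purely for illustration.

```python
from collections import defaultdict

def per_row_share(row):
    """Rough 1/n rule: each covered column gets 1/n, n = number of 1s in the row."""
    n = sum(row)
    if n == 0:
        return [0.0] * len(row)
    return [1.0 / n if bit else 0.0 for bit in row]

def collapse_by_line(row, column_lines):
    """Merge all columns that map to the same source line (logical OR)."""
    merged = defaultdict(int)
    for bit, line in zip(row, column_lines):
        merged[line] |= bit
    lines = sorted(merged)
    return lines, [merged[line] for line in lines]

# Hypothetical failing execution: only line 12 (inside a loop) is covered,
# and the loop was unwound 1000 times with one column per unwinding.
column_lines = [12] * 1000
row = [1] * 1000

unwound = per_row_share(row)                       # one share per unwinding
lines, collapsed_row = collapse_by_line(row, column_lines)
collapsed = per_row_share(collapsed_row)           # one share per source line

print(unwound[0])        # 0.001
print(lines, collapsed)  # [12] [1.0]
```

With one column per unwinding the single covered line is diluted to 1/1000. With one column per source line it gets the full share of 1.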

from cbmc.

theyoucheng avatar theyoucheng commented on August 16, 2024

from cbmc.

theyoucheng avatar theyoucheng commented on August 16, 2024

from cbmc.

quiveringlemon avatar quiveringlemon commented on August 16, 2024

WRT problem 1. I think I see what is going on now - I was for some reason presuming that line 17 (and indeed any conditional expression) gets a 1 in the matrix just in case i) flow of control evaluates the truth value of the condition in the given execution and ii) that condition evaluates to true, and 0 if i) but not ii) holds. I see what is actually happening is it gets a 1 in the matrix whenever i) holds simpliciter.

If I have understood what is going on in the implementation I think your solution is probably the better one, but it suffers from the problem that any condition which is always evaluated simply turns up as a vertical column of 1s in the coverage matrix and therefore fault localisation data becomes pretty uninformative.
In contrast there is also a problem with the alternative proposal above: that is, if a programmer creates a bug in a conditional expression which makes it unsatisfiable, then it will never turn up in the matrix at all and any fault localisation method will be ineffective.

I think for the purposes of our experiments we can perhaps leave this issue. But in the future I think the ideal solution would be to have two columns in the matrix for each conditional expression: one column gets 1 just in case the expression is evaluated and comes out true, the other gets 1 just in case it is evaluated and comes out false. This way different evaluations of conditional expressions can have different degrees of suspiciousness, and both problems outlined above are avoided. However, for the moment I imagine adding this distinction to the implementation is more trouble than it is worth.
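The three encodings discussed in this comment can be compared side by side. This is only a sketch: `encode_condition`, the scheme names, and the example outcome lists are hypothetical and do not come from the cbmc code base.

```python
def encode_condition(outcomes, scheme):
    """Encode one conditional expression as coverage-matrix column(s).

    outcomes: one entry per execution; True/False is the value the condition
    evaluated to, None means the condition was never reached.
    Returns one list per execution (one or two columns wide).
    """
    if scheme == "evaluated":   # 1 whenever the condition is evaluated at all
        return [[int(o is not None)] for o in outcomes]
    if scheme == "true_only":   # 1 only when it is evaluated and true
        return [[int(o is True)] for o in outcomes]
    if scheme == "split":       # two columns: [evaluated-true, evaluated-false]
        return [[int(o is True), int(o is False)] for o in outcomes]
    raise ValueError(f"unknown scheme: {scheme}")

always_evaluated = [True, False, True, False]
never_true = [False, False, False, False]  # e.g. a bug made it unsatisfiable

print(encode_condition(always_evaluated, "evaluated"))  # [[1], [1], [1], [1]]
print(encode_condition(never_true, "true_only"))        # [[0], [0], [0], [0]]
print(encode_condition(never_true, "split"))            # [[0, 1], [0, 1], [0, 1], [0, 1]]
```

The first output is the uninformative vertical column of 1s, the second is the unsatisfiable condition vanishing from the matrix, and the third shows that the split encoding still records it.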

from cbmc.

theyoucheng avatar theyoucheng commented on August 16, 2024

from cbmc.

quiveringlemon avatar quiveringlemon commented on August 16, 2024

WRT conditional expressions: some of my benchmarks now perform substantially worse, where previously the bug was reported as the most suspicious. Here is the command:

cbmc byte_add_false-unreach-call_true-no-overflow.c --incremental --stop-on-fail --unwind 8 --localize-faults --localize-faults-max-traces 20 --localize-faults-method sbo --localize-faults-max-display 100 --verbosity 0 --slice-formula

Here is the output (the bug is the conditional expression at 76):

** Most likely fault location:
Fault localization scores:
[main.assertion.1]: Single Bug Optimal Fault Localization
[score: 8.55] ##file byte_add_false-unreach-call_true-no-overflow.c line 77 function mp_add
[score: 8.1] ##file byte_add_false-unreach-call_true-no-overflow.c line 62 function mp_add
[score: 8.05] ##file byte_add_false-unreach-call_true-no-overflow.c line 65 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 66 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 67 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 68 function mp_add
[score: 8.05] ##file byte_add_false-unreach-call_true-no-overflow.c line 71 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 72 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 73 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 74 function mp_add
[score: 8.05] ##file byte_add_false-unreach-call_true-no-overflow.c line 107 function main ##file byte_add_false-unreach-call_true-no-overflow.c line 108 function main ##file byte_add_false-unreach-call_true-no-overflow.c line 110 function main ##file byte_add_false-unreach-call_true-no-overflow.c line 31 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 32 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 33 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 34 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 35 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 36 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 37 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 38 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 40 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 50 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 61 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 64 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 70 function mp_add ##file byte_add_false-unreach-call_true-no-overflow.c line 76

The bug is the conditional expression at line 76, which is now very low on the report. In a previous implementation, however, it was top equal (1st) along with line 77. Line 77 is executed immediately after the conditional expression at line 76 is evaluated.

To solve this (and problems like it), I suggested two possible solutions earlier in this thread:

  1. We have two columns in the matrix for each conditional expression, one for when it evaluates to true, the other for when it evaluates to false. If this were implemented, then 77 would have the same suspiciousness as 76 and the bug would be reported as top equal suspicious. In general, this seems like the right thing to do in any case. In the current scenario, conditional expressions which are always evaluated turn up as always covered in the matrix, whereas presumably we want to measure the suspiciousness of the events in which the conditional expression is evaluated to true and to false respectively.

  2. the conditional expression only gets 1 in the matrix if it evaluates to true in the execution. We ignore it when it is false.

I think option 1 is the preferred and correct solution, while option 2 might be more of a quick fix, though I don't know how it will perform down the line. Would either of these take a long time to implement?
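As a rough illustration of why either option puts the condition and its body on the same score, the sketch below uses the Ochiai measure as a stand-in (it is not the sbo measure used by the tool), and the four executions are made-up data, not taken from the byte_add benchmark.

```python
import math

def ochiai(column, failed):
    """Ochiai suspiciousness of one coverage column (stand-in measure)."""
    ef = sum(1 for c, f in zip(column, failed) if c and f)      # covered, failing
    ep = sum(1 for c, f in zip(column, failed) if c and not f)  # covered, passing
    nf = sum(1 for c, f in zip(column, failed) if not c and f)  # uncovered, failing
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

# Hypothetical data: the condition is evaluated in every execution, and the
# executions fail exactly when it evaluates to true.
cond_outcome = [True, True, False, False]
failed = [True, True, False, False]

body_line = [int(o) for o in cond_outcome]   # body runs only when the condition is true
cond_evaluated = [1] * len(cond_outcome)     # current encoding: 1 whenever evaluated
cond_true = [int(o) for o in cond_outcome]   # "true" column of option 1, or option 2

print(ochiai(cond_evaluated, failed) < ochiai(body_line, failed))  # True
print(ochiai(cond_true, failed) == ochiai(body_line, failed))      # True
```

Because the "true" column of the condition is identical to the column of the first line in its body, any measure computed per column gives them the same score.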

from cbmc.

quiveringlemon avatar quiveringlemon commented on August 16, 2024

On further analysis, either of the above solutions would mean lines 17 and 18 get the same suspiciousness for insertion_sort, so the results aren't significantly affected for that benchmark (both would be top equal, which is still an excellent result).

from cbmc.

theyoucheng avatar theyoucheng commented on August 16, 2024

from cbmc.
