llmhyy / microbat

A feedback-based debugger for interactively recommending suspicious steps in a buggy program execution.

Java 99.99% Python 0.01%
debugger-visualizer feedback time-travelling

microbat's People

Contributors

bchenghi, dingyuchen, griffinw-guanjie, llmhyy, lylytran, nime-sha256, sianghwee, songxuezhi

microbat's Issues

Parse Regular Expression for Allowed Library Class

Suppose we need Microbat to analyze the Java classes under the java.awt.event package. We should allow the user to customize such a restriction in the preference page.

In the current implementation, JPDA only allows us to use * to describe an excluded library. For example, when a pattern like java.* is fed to JPDA as an excluded class pattern, it omits all classes whose names start with "java.". Therefore, if we want to additionally include library classes such as those under java.awt.event, we need to list all the string patterns except the ones under java.awt.event, for example, java.awt.A*, java.awt.B*, .... Hence, when the user inputs java.awt.event.* in the preference page, we need to automatically translate it into a list of class patterns that covers everything except java.awt.event.*.
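The translation described above can be sketched as follows. This is a minimal sketch, not Microbat's implementation: the class name ExclusionPatternExpander, the method expand, and the ASCII-only alphabet are assumptions made for illustration. It builds the complement of the allowed prefix one character at a time, so every emitted pattern ends with a single trailing *.

```java
import java.util.ArrayList;
import java.util.List;

final class ExclusionPatternExpander {
    // Characters assumed to occur in fully qualified class names (ASCII only; an assumption).
    private static final String ALPHABET =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_$.";

    /**
     * Returns patterns that match names starting with excludedPrefix
     * but diverging from allowedPrefix at some character position.
     */
    static List<String> expand(String excludedPrefix, String allowedPrefix) {
        if (!allowedPrefix.startsWith(excludedPrefix)) {
            throw new IllegalArgumentException("allowed prefix must extend the excluded prefix");
        }
        List<String> patterns = new ArrayList<>();
        for (int i = excludedPrefix.length(); i < allowedPrefix.length(); i++) {
            String head = allowedPrefix.substring(0, i); // part shared with the allowed prefix
            char keep = allowedPrefix.charAt(i);         // the only character that stays allowed here
            for (char c : ALPHABET.toCharArray()) {
                if (c != keep) {
                    patterns.add(head + c + "*");
                }
            }
        }
        return patterns;
    }
}
```

For example, expand("java.", "java.awt.event.") contains java.b*, java.ab* and java.awt.f*, but not java.awt.e*. Names that are a proper prefix of the allowed prefix are not covered and would need exact-match patterns.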

Fail to Retrieve Variable ID When Switching Special Context

Test case: org.apache.commons.math.distribution.NormalDistributionTest#testSetMean

At line 50 of NormalDistributionImpl.java, when the non-static field mean is initialized, I cannot retrieve its parent object. I observe that the step executed before the initialization is a step in the super() method. I have probably not handled this special case.

Space in Directory

Microbat cannot correctly record the trace if there is a space in the directory path of
(1) Java workspace
(2) installed Eclipse

Returned Variable Name and Array Name May Not Exist

In the current implementation, I access the returned variable value and the array element value (in library code) by name. The name is obtained by analyzing the Java bytecode. In general, when I find an *aload/*astore/*return instruction, I search its preceding instructions to estimate the name. Nevertheless, if an array or returned variable comes from a method invocation, its name does not exist; the JVM simply retrieves a temporary value from the JVM frame stack. In such a case, I need to do more instrumentation to generate a virtual name so that I can access the value. I leave this feature for the future.

Possible Missing of Some Read Variable

In the current implementation of Microbat, it is possible to miss some variables that share the same name. For example, in a step executing the statement a.x = b.x, I can only handle one x field because the BCEL API only provides line number information. As both x fields are located on the same line, I cannot distinguish them.

This implementation issue should be fixed in the future.

Message for "could not load classfile" Error of Soot

Soot may report an error like "java.lang.RuntimeException: Could not load classfile". The likely reason is that soot.2.5.0.jar (standard version) was compiled under an old JDK version while the runtime environment is a newer JDK version (like 1.8). Please check https://www.se.informatik.uni-kiel.de/en/research/science-blog/soot-tutorial for more detail.

One observation is that the error disappears after running the program several times; in my case, 5 times. Then I can generate Jimple code successfully. My speculation is that some class files are loaded and cached during the earlier Soot runs.

JPDA Issue (No Step Event in Static Block)

Some Java code cannot be visited by the step event of JPDA, for example, a static block or a method invoked in the initialization of a static field. Microbat only records the trace steps that can be visited by step events. I may need to consider this for future tool improvement.

Test case 1: org.apache.commons.math.distribution.NormalDistributionTest#testSetMean

Line 62 of Gamma.java invokes FastMath.log(); however, the code in log() is not recorded at that time by Microbat.

Another case:
FastMath contains a static block that involves some heavy computation. Microbat cannot accurately predict the progress because the steps executed in the block are not recorded.

JDI Timeout Issue

The correct trace of 4376 steps is to be generated for org.apache.commons.math.util.OpenIntToDoubleHashMapTest#testPutKeysWithCollisions
JVM is started...
progress: ==========10%==========20%==========30%==========40%==========50%==========60%==========70%========
JVM is ended.
org.eclipse.jdi.TimeoutException: Timeout occurred while waiting for packet -1118735817.
at org.eclipse.jdi.internal.connect.PacketReceiveManager.getReply(PacketReceiveManager.java:187)
at org.eclipse.jdi.internal.connect.PacketReceiveManager.getReply(PacketReceiveManager.java:198)
at org.eclipse.jdi.internal.MirrorImpl.requestVM(MirrorImpl.java:192)
at org.eclipse.jdi.internal.MirrorImpl.requestVM(MirrorImpl.java:227)
at org.eclipse.jdi.internal.MirrorImpl.requestVM(MirrorImpl.java:243)
at org.eclipse.jdi.internal.ObjectReferenceImpl.referenceType(ObjectReferenceImpl.java:526)
at org.eclipse.jdi.internal.ObjectReferenceImpl.type(ObjectReferenceImpl.java:545)
at microbat.codeanalysis.runtime.variable.VariableValueExtractor.appendClassVarVal(VariableValueExtractor.java:453)
at microbat.codeanalysis.runtime.variable.VariableValueExtractor.appendVarVal(VariableValueExtractor.java:388)
at microbat.codeanalysis.runtime.variable.VariableValueExtractor.appendArrVarVal(VariableValueExtractor.java:542)
at microbat.codeanalysis.runtime.variable.VariableValueExtractor.appendVarVal(VariableValueExtractor.java:362)
at microbat.codeanalysis.runtime.variable.VariableValueExtractor.appendClassVarVal(VariableValueExtractor.java:454)
at microbat.codeanalysis.runtime.variable.VariableValueExtractor.appendVarVal(VariableValueExtractor.java:388)
at microbat.codeanalysis.runtime.variable.VariableValueExtractor.collectValue(VariableValueExtractor.java:222)
at microbat.codeanalysis.runtime.variable.VariableValueExtractor.extractValue(VariableValueExtractor.java:147)
at microbat.codeanalysis.runtime.ProgramExecutor.extractValuesAtLocation(ProgramExecutor.java:1343)
at microbat.codeanalysis.runtime.ProgramExecutor.collectValueOfPreviousStep(ProgramExecutor.java:1311)
at microbat.codeanalysis.runtime.ProgramExecutor.constructTrace(ProgramExecutor.java:271)
at microbat.codeanalysis.runtime.ProgramExecutor.run(ProgramExecutor.java:143)
at tregression.TraceModelConstructor.constructTraceModel(TraceModelConstructor.java:64)
at tregression.junit.TestCaseAnalyzer.runEvaluationForSingleTrial(TestCaseAnalyzer.java:393)
at tregression.junit.TestCaseAnalyzer.runEvaluationForSingleTestCase(TestCaseAnalyzer.java:319)
at tregression.junit.TestCaseAnalyzer.runEvaluation(TestCaseAnalyzer.java:243)
at tregression.handler.EvaluationAllHandler$1.run(EvaluationAllHandler.java:54)
at org.eclipse.core.internal.jobs.Worker.run(Worker.java:55)
test case has exception when generating trace:
Trial [testCaseName=org.apache.commons.math.util.OpenIntToDoubleHashMapTest#testPutKeysWithCollisions, mutatedFile=C:\microbat_evaluation_regression\apache-common-math-2.2\org.apache.commons.math.util.OpenIntToDoubleHashMap_355_21_2\OpenIntToDoubleHashMap.java, isBugFound=false, jumpSteps=null, totalSteps=0]

R/W Abstraction for APIs

Given a variable var, it could be written inside a method invocation inv(). I think Microbat can support the abstraction that the inv() method writes var. Based on that, the developer may check the details later.

This function is suggested to be implemented with dynamic slicing, similar to the Whyline code. It should also work on third-party libraries whose source code is unavailable.

DB Recording Exception

Hi @lylytran,

It appears again. It probably happens for Lang-17.

Exception in thread "Thread-17" sav.common.core.SavRtException: javax.xml.transform.TransformerException: org.xml.sax.SAXException: Invalid UTF-16 surrogate detected: d842 ?
java.io.IOException: Invalid UTF-16 surrogate detected: d842 ?
at microbat.handler.xml.VarValueXmlWriter.writeXml(VarValueXmlWriter.java:94)
at microbat.handler.xml.VarValueXmlWriter.generateXmlContent(VarValueXmlWriter.java:69)
at microbat.sql.TraceRecorder.generateXmlContent(TraceRecorder.java:102)
at microbat.sql.TraceRecorder.insertSteps(TraceRecorder.java:92)
at microbat.sql.TraceRecorder.insertTrace(TraceRecorder.java:68)
at tregression.io.RegressionRecorder.record(RegressionRecorder.java:39)
at tregression.empiricalstudy.TrialGenerator$DBRecording.run(TrialGenerator.java:300)
at java.lang.Thread.run(Unknown Source)
Caused by: javax.xml.transform.TransformerException: org.xml.sax.SAXException: Invalid UTF-16 surrogate detected: d842 ?
java.io.IOException: Invalid UTF-16 surrogate detected: d842 ?
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(Unknown Source)
at microbat.handler.xml.VarValueXmlWriter.writeXml(VarValueXmlWriter.java:92)
... 7 more
Caused by: org.xml.sax.SAXException: Invalid UTF-16 surrogate detected: d842 ?
java.io.IOException: Invalid UTF-16 surrogate detected: d842 ?
at com.sun.org.apache.xml.internal.serializer.ToStream.characters(Unknown Source)
at com.sun.org.apache.xml.internal.serializer.ToUnknownStream.characters(Unknown Source)
at com.sun.org.apache.xml.internal.serializer.ToUnknownStream.characters(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(Unknown Source)
... 10 more
Caused by: java.io.IOException: Invalid UTF-16 surrogate detected: d842 ?
at com.sun.org.apache.xml.internal.serializer.ToStream.accumDefaultEscape(Unknown Source)
at com.sun.org.apache.xml.internal.serializer.ToStream.processDirty(Unknown Source)
... 20 more
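The message "Invalid UTF-16 surrogate detected: d842" suggests that a recorded variable value contains a high surrogate without its low surrogate. One possible mitigation, sketched below under that assumption, is to replace unpaired surrogates before the value reaches the XML writer. The class SurrogateSanitizer and its method are hypothetical and are not part of Microbat.

```java
final class SurrogateSanitizer {
    /** Replaces every unpaired UTF-16 surrogate with U+FFFD and keeps valid pairs intact. */
    static String replaceUnpairedSurrogates(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean startsValidPair = Character.isHighSurrogate(c)
                    && i + 1 < text.length()
                    && Character.isLowSurrogate(text.charAt(i + 1));
            if (startsValidPair) {
                sb.append(c).append(text.charAt(++i)); // copy both halves of the pair
            } else if (Character.isSurrogate(c)) {
                sb.append('\uFFFD');                   // lone high or low surrogate
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

The sketch does not handle other characters that XML 1.0 forbids, such as most control characters.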

Create database when initializing the plugin

Before inserting content into the MySQL database, we need to first check whether the database exists. If not, we may need to create it based on the SQL scripts under \microbat\ddl\
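The check-then-create step could look roughly like the sketch below, which uses only the standard java.sql API. The class DatabaseInitializer, its method names, and the naive split on ';' are assumptions for illustration. It also assumes that the MySQL driver reports databases as JDBC catalogs and that dbName comes from trusted configuration, since it is concatenated into SQL.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

final class DatabaseInitializer {
    /** Splits a DDL script on ';' (naive: does not handle ';' inside string literals). */
    static List<String> splitStatements(String script) {
        List<String> statements = new ArrayList<>();
        for (String part : script.split(";")) {
            String sql = part.trim();
            if (!sql.isEmpty()) {
                statements.add(sql);
            }
        }
        return statements;
    }

    /** Returns true if the server lists a catalog (database) with the given name. */
    static boolean databaseExists(Connection conn, String dbName) throws SQLException {
        try (ResultSet rs = conn.getMetaData().getCatalogs()) {
            while (rs.next()) {
                if (dbName.equalsIgnoreCase(rs.getString("TABLE_CAT"))) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Creates the database and runs the DDL script only when the database is missing. */
    static void ensureDatabase(String serverUrl, String user, String password,
                               String dbName, String ddlScript) throws SQLException {
        try (Connection conn = DriverManager.getConnection(serverUrl, user, password)) {
            if (databaseExists(conn, dbName)) {
                return;
            }
            try (Statement create = conn.createStatement()) {
                create.executeUpdate("CREATE DATABASE `" + dbName + "`");
            }
            conn.setCatalog(dbName); // switch to the new database before running the DDL
            try (Statement ddl = conn.createStatement()) {
                for (String sql : splitStatements(ddlScript)) {
                    ddl.execute(sql);
                }
            }
        }
    }
}
```

The script text would be read from the files under \microbat\ddl\ when the plugin starts; how those files are located is left open here.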

Simulation optimization

The simulations for the same pair of mutated and correct traces can share a lot of data, so there is no need to recalculate it.

Possible Optimization for Simulation

In the simulation, there could be many options for a wrong-variable-value feedback. More specifically, if a step has k (k>2) wrong variables, we can simulate k wrong-variable-value feedbacks. I maintain a confusing stack to record the state before simulating a wrong-variable-value feedback on a certain step if this step has more than two wrong variables.

Therefore, if there are m steps with wrong-variable-value feedback and each step has an average of n wrong variables, up to n^m attempts may be needed to find the mutated bug, which is time-consuming. Currently, I set a threshold h to limit the number of attempts. However, it is possible to optimize the process by recording the results of some attempts so that the exponential tree can be pruned.

Enhance handling the case of missing control dominator

During the simulation evaluation of Microbat, the control dominator of a certain trace node may be missing, which causes the failure to find the mutated bug. I need to provide more informative feedback during evaluation.

Possible Buggy Abstract Level Construction

When building the abstraction levels of trace steps, I should distinguish the loop parent from the invocation parent. However, Microbat might be buggy when encountering recursive methods. In my algorithm, I keep a stack to track the influential loop parent; all the trace steps after the influential loop parent will be its direct loop children. Therefore, I need to identify two things:

  1. When should the influential loop parent be popped so that its influence ends?
  2. When should a new influential loop parent be pushed?

For now, I have defined a set of heuristics for identifying when a loop parent should be popped off the stack. However, I may not have considered the following case:

recMethod(){
   for(...){
       recMethod();
       if(...){
          break;
       }
   }
}

In the above case, the execution of the break statement should remove the influence of the for-loop; however, I may not handle this case because of the recursive method.

Side Effect of toString() Method Invocation

During debugging, the string value of an object variable is recorded by dynamically invoking its toString() method. However, a side effect of this operation is that the program state is modified if toString() writes some fields of the object.

A possible solution is to remember the program state before the toString() invocation and restore it after the invocation finishes.
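The snapshot-and-restore idea can be illustrated in-process with reflection. This is only a sketch under stated assumptions: Microbat invokes toString() on the debuggee through the debugger interface, whereas the hypothetical SafeToString class below runs in the same JVM, copies only the object's own non-static, non-final fields (a shallow snapshot), and ignores inherited fields and nested objects.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

final class SafeToString {
    /** Hypothetical class whose toString() writes a field. */
    static final class Counter {
        int reads = 0;

        @Override
        public String toString() {
            reads++; // the side effect described above
            return "Counter(reads=" + reads + ")";
        }
    }

    /** Calls toString() and then restores the fields declared by the object's class. */
    static String toStringWithoutSideEffect(Object target) {
        Map<Field, Object> snapshot = new LinkedHashMap<>();
        try {
            for (Field f : target.getClass().getDeclaredFields()) {
                int mods = f.getModifiers();
                if (Modifier.isStatic(mods) || Modifier.isFinal(mods)) {
                    continue; // only mutable instance state is saved
                }
                f.setAccessible(true);
                snapshot.put(f, f.get(target));
            }
            String text = target.toString();
            for (Map.Entry<Field, Object> e : snapshot.entrySet()) {
                e.getKey().set(target, e.getValue()); // undo writes made by toString()
            }
            return text;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a Counter c, toStringWithoutSideEffect(c) returns "Counter(reads=1)" while c.reads is 0 again afterwards. Writes to objects reachable from the target are not undone by a shallow snapshot.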

Same Reference Error

When building data dependency, I use the heap address to uniquely identify a reference variable. It works fine in most cases; however, alias variables complicate the situation. See the following example.

Integer i1 = a[i2];
Integer i2 = a[i2];
Integer result = i1+i2;

In the 1st line, we should use i1 instead of i2; thus, it is a bug. When tracing backward to the root cause, the user may indicate that i1 is wrong at the 3rd line. However, Microbat will return the 2nd line (instead of the 1st line) as the suspicious step because i2 and i1 refer to the same heap address. Microbat only recommends the step where the heap address corresponding to the user-selected variable is assigned.

One possible solution is to add more information to the variable table so that I can build the data-dominance relation more precisely. An alternative is to build the static data dependency and match it to the dynamic trace.
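The aliasing itself is easy to reproduce. In the small sketch below (the class and method names are made up for illustration), two reads of the same array slot yield the same object, so an identifier based only on the heap address cannot tell the two local variables apart.

```java
final class AliasDemo {
    /** Reads the same slot twice and reports whether both locals refer to one object. */
    static boolean readsYieldSameObject(Integer[] a, int index) {
        Integer i1 = a[index];
        Integer i2 = a[index];
        return i1 == i2; // reference comparison, not value comparison
    }
}
```

readsYieldSameObject(new Integer[] {1000, 2000}, 1) returns true: i1 and i2 are distinct variables assigned on different lines, yet they share one heap object.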

Interference Issue

In the simulation, if I simulate the non-loop version first and the loop version second, the loop inference loses its effect.

Clue: the shouldStopOnCheckedNode() method in the StepRecommender class may have a bug, as I find that the pattern can be detected but the return value of this method impairs the effect of jumping by loop inference.

Missing Control/Data Dominance due to Framework

When the program uses some OO mechanisms, the data dominance relation can be missing.

Test Case:
org.apache.commons.math.MathExceptionTest#testPrintStackTrace

For example, in the MathException class of the apache-common-math-2.2 project, if we mutate line 176, the program executes MathException() first and Exception() second, then it invokes the overridden getLocalizedMessage() where the mutated line, i.e., the bug, lies. However, from the top level, we can only observe that one field, detailedMessage, in the ex variable is wrong, and there is no direct data dominance for the wrong detailedMessage field in the ex variable. This is because the code logic exists in the JDK framework, and Microbat does not analyze third-party libraries. It impairs evaluation, but a human could manually check the problem.

An example of missing control dominance is the following case:

Test case: org.apache.commons.math.estimation.GaussNewtonEstimatorTest#testMoreEstimatedParametersUnsorted has exception when simulating debugging
Mutated File: C:\microbat_evaluation\apache-common-math-2.2\1125_13_1\ArrayRealVector.java, unclearRate: 0.0, enableLoopInference: true

The getMessage() method at line 401 of MathRuntimeException.java is called without being explicitly invoked.

Over-long Text for Variable Value

When the string value of a variable is too long, the user may not be able to inspect it. We may consider a text tooltip or something similar for this case. It is a user experience issue.

Incorrect Invocation Parent When Exception Happens in Uninteresting Library Code

The invocation parent may be incorrect when the following case happens:

Background
I listen to the method entry/exit events to infer when a step invokes a method or returns from a method. In order to keep the invocation parent/child relation, I keep a method stack. When I detect a method entry event, I push the method onto the stack. When I detect a method exit event, I pop the method off the stack. If an exception happens, I can capture the location that catches the exception. Then, I keep popping the stack until the method at the top of the stack contains the location catching the exception.

Problem
When the exception happens in library code that we are not interested in, I am not able to know when to stop popping methods. For example, suppose the invocation chain is a->b->c->d (a is application code, b is uninteresting library code, and c and d are interesting library code). The exception happens in d and is caught by b. Note that we only keep the method stack a->c->d, as b is not a method we are interested in. By right, we should pop c and d from the method stack. However, without recording b, we do not know when to stop popping the stack. In this case, I choose not to pop at all. As a result, the invocation parent may be incorrect when the exception happens in uninteresting library code.
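The method stack from the Background, together with the choice not to pop when the catching method was never recorded, can be sketched as follows. The class MethodStackTracker and its method names are hypothetical, and methods are identified by plain strings here, which is a simplification of whatever location information the debugger events carry.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class MethodStackTracker {
    private final Deque<String> stack = new ArrayDeque<>();

    /** Method entry event: the entered method becomes the current invocation parent. */
    void onMethodEntry(String method) {
        stack.push(method);
    }

    /** Method exit event: control returns to the caller. */
    void onMethodExit() {
        if (!stack.isEmpty()) {
            stack.pop();
        }
    }

    /**
     * Exception caught in catchingMethod: pop until that method is on top.
     * If the method was never recorded (uninteresting library code), the stack
     * is left untouched and false is returned.
     */
    boolean onExceptionCaught(String catchingMethod) {
        if (!stack.contains(catchingMethod)) {
            return false;
        }
        while (!catchingMethod.equals(stack.peek())) {
            stack.pop();
        }
        return true;
    }

    String currentMethod() {
        return stack.peek();
    }

    int depth() {
        return stack.size();
    }
}
```

With the recorded stack a->c->d, onExceptionCaught("b") returns false and keeps all three frames, which reproduces the inaccuracy described in the Problem, while onExceptionCaught("a") unwinds to a.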

Discussion
This seems to be the cost we pay for not recording all the runtime information. JPDA is really slow if I record everything in the trace. In the future, it is better to replace JPDA with code instrumentation.

[Mutation] Provide New Mutation Function

Generate mutated traces on a given open source project. For now, we can use Apache Math (https://github.com/llmhyy/apache-common-math-2.2). We can start the new code in microbat_trace_predication project.

After running a passed test case, we will have a list of the code covered by that test case. We are going to apply three mutations to the covered code: (1) remove nested if-brackets, (2) remove an if-return block, and (3) remove a variable definition. A threshold will be set for the number of mutations we are going to apply.

Then we are going to compile the mutated code with the "javac" command. If the mutated code has no compilation error, we will have a new mutation trace. We will store both the original trace and the mutation trace in the database.
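The compilation check could be sketched as below. Note that this sketch swaps the "javac" command line for the standard javax.tools API, which invokes the same compiler in-process; the class MutantCompileCheck and its method are hypothetical. It compiles a single self-contained source string into a temporary directory, so a real mutated file would additionally need the project's classpath passed as compiler arguments.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class MutantCompileCheck {
    /** Returns true if the given source compiles without errors. */
    static boolean compiles(String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler available; run on a JDK");
        }
        try {
            Path dir = Files.createTempDirectory("mutant"); // not cleaned up in this sketch
            Path file = dir.resolve(className + ".java");
            Files.write(file, source.getBytes(StandardCharsets.UTF_8));
            ByteArrayOutputStream diagnostics = new ByteArrayOutputStream(); // swallow error text
            int exitCode = compiler.run(null, null, diagnostics,
                    "-d", dir.toString(), file.toString());
            return exitCode == 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A mutant for which compiles(...) returns false would be discarded before any trace is collected.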

==from 1 Mar 2018
We need to add one more mutation, which needs to combine dynamic and static code analysis. We can call it aim-oriented mutation.

We first collect the trace of a test case. Then, we detect trace steps that meet the following requirements: (1) the step writes a variable, and (2) the step is control dominated by a step with a conditional expression (let's call the conditional expression CE). If we find such a step, we mutate CE. In general, if CE does not contain '!', we add a not operator '!' to CE; otherwise, we remove the ! operator.
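The mutation of CE can be sketched on the source text of the condition. This is a simplified variant of the rule above, not the actual mutator: instead of testing whether CE contains '!' anywhere (which would also match operators such as !=), the hypothetical toggleNegation below removes a leading "!( ... )" only when it wraps the whole expression and otherwise adds one. Parentheses inside string or character literals are not handled.

```java
final class ConditionMutator {
    /** Negates the condition, or removes a negation that wraps the whole expression. */
    static String toggleNegation(String ce) {
        String e = ce.trim();
        if (e.startsWith("!(") && closesAtEnd(e, 1)) {
            return e.substring(2, e.length() - 1);
        }
        return "!(" + e + ")";
    }

    /** True if the parenthesis opened at index open is closed by the last character. */
    private static boolean closesAtEnd(String e, int open) {
        int depth = 0;
        for (int i = open; i < e.length(); i++) {
            char c = e.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && --depth == 0) {
                return i == e.length() - 1;
            }
        }
        return false;
    }
}
```

For example, toggleNegation("a > b") gives "!(a > b)", toggleNegation("!(a > b)") gives "a > b", and toggleNegation("!(a) && (b)") gives "!(!(a) && (b))" because the leading negation covers only the first operand.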

Support Debugging Configurations

It would be good to imitate the Eclipse debugging UI design to record the running configuration for each debugging session. For example, a program can be debugged with different VM options. Thus, we can have a list of recorded running configurations for debugging.

I am willing to do this, but I am short of manpower to help me with this issue.

Possible Risky Optimization

When retrieving a read/written variable, I use the ExpressionParser class in ProgramExecutor.retriveExpression. In order to avoid thread deadlock, I disabled all the event requests for the JVM. However, the disable operation seems to take a lot of time. Thus, for the sake of speed, I have turned off that event-disabling step for now.

Multiple ClassPrepareRequest in ProgramExecutor

Please note that there are multiple class prepare requests when starting the program executor. Therefore, if I want to stop listening to such events, I need to keep a LIST of class prepare requests in order to stop suspending the JVM.

Update DB Schema of Trace

Hi @lylytran

I added one more column "is_multithread" to the Trace table and a field "isMultiThread" to the Trace class. Please kindly update the corresponding DB recording/retrieving code. Many thanks!

Evaluation for Method Invocation

When a step reads some field of a returned object, such as m().attr, this version of Microbat reports an exception.

The reason is that the invokeMethod API in the JDK might be stuck in a deadlock. I will check this issue later after the deadline.

Bug reproduction:

Apache BCEL Project:
String testClassName = "org.apache.bcel.CounterVisitorTestCase";
String testMethodName = "testAnnotationEntryCount";
String mutationFile = "C:\\microbat_evaluation\\commons-bcel\\75_18_1\\ConstantPool.java";

JVM TimeOutException

In the runSingleTrial() method, the following code can generate a JVM TimeOutException:

// String testClassName = "org.apache.commons.math.analysis.interpolation.LinearInterpolatorTest";
// String testMethodName = "testInterpolateLinear";
// String mutationFile = "C:\\Users\\YUNLIN~1\\AppData\\Local\\Temp\\"
// + "apache-common-math-2.2\\2081_22_1\\MathUtils.java";
// String mutatedClass = "org.apache.commons.math.util.MathUtils";

Bug for Exclude Classes (Not Package)

hi @lylytran

If I set some classes like java.lang.StringBuilder and java.lang.AbstractStringBuilder to be excluded, the code also excludes java.lang.Runtime. Would you please kindly have a look? Thanks!

Identify Variable Assignment

Given a variable, we would like to know all the places in the project where this variable can be assigned. Let me talk more about it tomorrow.

Entry/exit of <clinit> is not paired

I am not sure of the reason yet. When parsing the method entry/exit events, I need to ignore the <clinit> method event, i.e., the static method that initializes a class when a static method of this class is invoked. With my code, the entry and exit events of these methods seem not to be paired. Therefore, the hierarchical structure of method callers and callees can sometimes be wrong if <clinit> is not ignored.

It is necessary to find out why this happens. However, I am too tired now. I will leave this problem for the future.

Test Case:
org.apache.commons.math.analysis.interpolation.BicubicSplineInterpolatorTest#testPreconditions
mutation:
C:\microbat_evaluation\apache-common-math-2.2\129_17_1\BicubicSplineInterpolatingFunction.java

wrong-path feedback can be followed by correct-step feedback

In the past, I thought this was not possible, as a wrong-path feedback must be caused by a wrong condition. However, I was wrong. I failed to consider the case where some if condition may be missing between the correct step and the wrong-path step.

if(){
  if(){
    return;
  }

  print();
}

The print(); statement has two control dominators. I need to fix the problem for this special case.
