squizz617 / vuddy

VUDDY: A Scalable and Accurate Vulnerable Code Clone Detector (S&P'17)

Home Page: https://iotcube.net

License: MIT License

Python 20.46% Shell 0.45% ANTLR 7.31% Java 14.03% C 57.75%

vuddy's People

Contributors

dngthe93, ied206, ktb88, squizz617

vuddy's Issues

Parser optimization

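Generating the parser with ANTLR 4.5.3 showed severe performance problems, so I searched for a fix and found antlr/antlr4#192.
Following sharwell's comment there, I tried to process the grammars in the src folder with the antlr-4.0-opt build, which is said to be much faster, but the errors below occurred.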

[When running the build script]

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
	at java.util.ArrayList.rangeCheck(ArrayList.java:653)
	at java.util.ArrayList.get(ArrayList.java:429)
	at org.antlr.v4.misc.OrderedHashMap.getElement(OrderedHashMap.java:46)
	at org.antlr.v4.analysis.LeftRecursiveRuleTransformer.setAltASTPointers(LeftRecursiveRuleTransformer.java:241)
	at org.antlr.v4.analysis.LeftRecursiveRuleTransformer.translateLeftRecursiveRule(LeftRecursiveRuleTransformer.java:162)
	at org.antlr.v4.analysis.LeftRecursiveRuleTransformer.translateLeftRecursiveRules(LeftRecursiveRuleTransformer.java:89)
	at org.antlr.v4.semantics.SemanticPipeline.process(SemanticPipeline.java:94)
	at org.antlr.v4.Tool.processNonCombinedGrammar(Tool.java:399)
	at org.antlr.v4.Tool.process(Tool.java:384)
	at org.antlr.v4.Tool.processGrammarsOnCommandLine(Tool.java:343)
	at org.antlr.v4.Tool.main(Tool.java:190)
./Main.java:261: error: cannot find symbol
	private void _init(FunctionParser parser) {
	                   ^
  symbol:   class FunctionParser
  location: class BodyParser
./Main.java:485: error: cannot find symbol
	private void _init(ModuleParser parser) {
	                   ^
  symbol:   class ModuleParser
  location: class TreeParser
./Main.java:744: error: cannot find symbol
	private void _init(ModuleParser parser) {
	                   ^
  symbol:   class ModuleParser
  location: class TreeParser1
./Main.java:302: error: cannot find symbol
			FunctionParser parser = new FunctionParser(tokens);
			^
  symbol:   class FunctionParser
  location: class BodyParser
./Main.java:302: error: cannot find symbol
			FunctionParser parser = new FunctionParser(tokens);
			                            ^
  symbol:   class FunctionParser
  location: class BodyParser
./Main.java:402: error: package FunctionParser does not exist
				if (p1 instanceof FunctionParser.FuncCallContext) {
				                                ^
./Main.java:529: error: cannot find symbol
			ModuleLexer lexer = new ModuleLexer(antlrFileStream);
			^
  symbol:   class ModuleLexer
  location: class TreeParser
./Main.java:529: error: cannot find symbol
			ModuleLexer lexer = new ModuleLexer(antlrFileStream);
			                        ^
  symbol:   class ModuleLexer
  location: class TreeParser
./Main.java:531: error: cannot find symbol
			ModuleParser parser = new ModuleParser(tokens);
			^
  symbol:   class ModuleParser
  location: class TreeParser
./Main.java:531: error: cannot find symbol
			ModuleParser parser = new ModuleParser(tokens);
			                          ^
  symbol:   class ModuleParser
  location: class TreeParser
./Main.java:788: error: cannot find symbol
			ModuleLexer lexer = new ModuleLexer(antlrFileStream);
			^
  symbol:   class ModuleLexer
  location: class TreeParser1
./Main.java:788: error: cannot find symbol
			ModuleLexer lexer = new ModuleLexer(antlrFileStream);
			                        ^
  symbol:   class ModuleLexer
  location: class TreeParser1
./Main.java:790: error: cannot find symbol
			ModuleParser parser = new ModuleParser(tokens);
			^
  symbol:   class ModuleParser
  location: class TreeParser1
./Main.java:790: error: cannot find symbol
			ModuleParser parser = new ModuleParser(tokens);
			                          ^
  symbol:   class ModuleParser
  location: class TreeParser1
14 errors
*.class : no such file or directory

[When running java -jar antlr-4.0-complete.jar ./src/Function.g4]

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
	at java.util.ArrayList.rangeCheck(ArrayList.java:653)
	at java.util.ArrayList.get(ArrayList.java:429)
	at org.antlr.v4.misc.OrderedHashMap.getElement(OrderedHashMap.java:46)
	at org.antlr.v4.analysis.LeftRecursiveRuleTransformer.setAltASTPointers(LeftRecursiveRuleTransformer.java:241)
	at org.antlr.v4.analysis.LeftRecursiveRuleTransformer.translateLeftRecursiveRule(LeftRecursiveRuleTransformer.java:162)
	at org.antlr.v4.analysis.LeftRecursiveRuleTransformer.translateLeftRecursiveRules(LeftRecursiveRuleTransformer.java:89)
	at org.antlr.v4.semantics.SemanticPipeline.process(SemanticPipeline.java:94)
	at org.antlr.v4.Tool.processNonCombinedGrammar(Tool.java:399)
	at org.antlr.v4.Tool.process(Tool.java:384)
	at org.antlr.v4.Tool.processGrammarsOnCommandLine(Tool.java:343)
	at org.antlr.v4.Tool.main(Tool.java:190)

Please check this and run a performance test using the 4.0-opt version.

Java language support

Does vuddy support finding vulnerable code in Java programs? From the README, I cannot tell which languages vuddy currently supports.

Parser bugs

Two bugs were found in commit aaf7fcb:

  • A 'const' between two '*'s in a pointer declaration is not recognized.
// The function below is not recognized because of the 'const' between the two '*'s.
void module_layout(struct module *mod,
                   struct modversion_info *ver,
                   struct kernel_param *kp,
                   struct kernel_symbol *ks,
                   struct tracepoint * const *tp)
{
}
  • Two-dimensional arrays are not identified (multi-dimensional arrays probably behave the same way).
// a1 is missing from the local variable list
void func(void)
{
        int a0[10];
        int a1[10][10];
        return;
}

Is VulnDBGen private?

Hi,
Is your VulnDBGen repository private?
I'm doing research on vulnerabilities and I could really use your vulnerability database or its generator.
The link to the repository is broken. Is there any way to get it?
Thanks in advance.

Uploading .hidx to the server

I got the following error message:

Latest server version: 4.0.1
Current local version: 3.1.0 (out-of-date)
[-] Your hmark is not up-to-date.

So I bypassed the update check and ran the tool. After generating the .hidx files, I uploaded them at https://iotcube.net/process/type/wf1, but it does not show any results. Instead, it asks me to download the latest binaries.

I also downloaded hmark for Linux x64 (https://iotcube.net/tools/wf1/hmark_4.0.1_linux_x64.tar.gz), but when I try to run it on my Linux machine it shows the following error:

[8512] Error loading Python lib '/tmp/_MEIdkjLT1/libpython3.8.so.1.0': dlopen: /lib/x86_64-linux-gnu/libm.so.6: version GLIBC_2.29' not found (required by /tmp/_MEIdkjLT1/libpython3.8.so.1.0)
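
The dlopen message says the bundled libpython3.8 requires glibc 2.29 or newer, so the machine presumably ships an older glibc. A minimal Python sketch for checking this is below. It relies on the standard-library call platform.libc_ver(); the helper names are hypothetical and not part of hmark.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.29' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def glibc_is_sufficient(required, installed):
    """Return True if the installed glibc version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


def installed_glibc_version():
    """Return the glibc version of the running interpreter, or None if unknown."""
    lib, version = platform.libc_ver()
    return version if lib == "glibc" and version else None


if __name__ == "__main__":
    current = installed_glibc_version()
    if current is None:
        print("Could not determine the glibc version on this system.")
    elif glibc_is_sufficient("2.29", current):
        print("glibc %s should satisfy the GLIBC_2.29 requirement." % current)
    else:
        print("glibc %s is older than 2.29; the prebuilt binary will not load." % current)
```

Versions are compared as integer tuples rather than strings, so that 2.9 sorts below 2.29.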

No functions being processed

Hey
I tried running hmark on my projects and the testcode directory but ctags seems to be unable to parse any file.
I always get more or less the following result:

[+] Hash index successfully generated.
[+] Saving hash index to file... (Done)

[+] Elapsed time: 0.07 sec.
Program statistics:
 - 4 files;
 - 0 functions;
 - 0 lines of code.

The command I'm using is, for instance:

./hmark_4.0.1_linux_x64 -c testcode/ OFF

ctags mostly returns a non-zero exit code, which is reported as a parser error:

Parser Error: Command '"/tmp/_MEIbtoICk/ctags" -f - --kinds-C=* --fields=neKSt "testcode/configs.c"' returned non-zero exit status 127.

I tried the prebuilt binaries for Linux and Mac OS X. For Mac OS X, I also tried building ctags from their new repository.

Am I missing something here?

Thanks in advance and kind regards,
Tom

Tree Parsing error

Sometimes, tree parsing doesn't work correctly.
Below is the error message.

python TreeParser.py async.c
DTYPE
Traceback (most recent call last):
  File "TreeParser.py", line 198, in <module>
    main(sys.argv)
  File "TreeParser.py", line 176, in main
    functionInstanceList = TreeParser().ParseFile(argv[1])
  File "TreeParser.py", line 81, in ParseFile
    ParseTreeWalker().walk(self, tree)
  File "/usr/local/lib/python2.7/dist-packages/antlr4/tree/Tree.py", line 172, in walk
    self.walk(listener, child)
  File "/usr/local/lib/python2.7/dist-packages/antlr4/tree/Tree.py", line 172, in walk
    self.walk(listener, child)
  File "/usr/local/lib/python2.7/dist-packages/antlr4/tree/Tree.py", line 172, in walk
    self.walk(listener, child)
  File "/usr/local/lib/python2.7/dist-packages/antlr4/tree/Tree.py", line 173, in walk
    self.exitRule(listener, t)
  File "/usr/local/lib/python2.7/dist-packages/antlr4/tree/Tree.py", line 189, in exitRule
    listener.exitEveryRule(ctx)
  File "TreeParser.py", line 137, in exitEveryRule
    self.functionInstance.dataTypeList.append(self.typeNameStr.rstrip())
AttributeError: 'NoneType' object has no attribute 'dataTypeList'

HMark path-related error

HMark does not properly handle paths with spaces or parentheses.

Error message:
Parser Error: Command 'java -Xmx1024m -jar /PATH/HMark(1)/FuncParser.jar /TARGETPATH/file.c 0' return non-zero exit status 2

Maybe escape spaces and parentheses in the path with a backslash before concatenating it into the command string?
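
An alternative to escaping by hand is to pass the command as an argument list, so that no shell ever parses the path. The sketch below assumes the parser is launched from Python through subprocess (the wording of the error suggests this, but it is not confirmed here). The function names are hypothetical; the java arguments are copied from the error message above.

```python
import shlex
import subprocess


def build_parser_argv(jar_path, target_path, abstraction_flag="0"):
    """Build the command as a list so spaces and parentheses need no escaping."""
    return ["java", "-Xmx1024m", "-jar", jar_path, target_path, abstraction_flag]


def build_shell_command(jar_path, target_path, abstraction_flag="0"):
    """Build one POSIX shell string, quoting every argument with shlex.quote."""
    argv = build_parser_argv(jar_path, target_path, abstraction_flag)
    return " ".join(shlex.quote(arg) for arg in argv)


def run_parser(jar_path, target_path, abstraction_flag="0"):
    """Run the parser without a shell and return its raw output."""
    return subprocess.check_output(
        build_parser_argv(jar_path, target_path, abstraction_flag)
    )
```

The list form works on any platform. The shlex.quote variant is only meant for POSIX shells and is needed only if the command has to stay a single string.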

[tbkim] False-Positive from CVE-2016-1879

Hash option: abstraction ON

cve-2016-1879
[-] size_t rc;
[+] int rc;

  • This result is usually produced for variadic functions.
  • The original source code has already been patched, but it is still detected (one example follows).

int __obstack_printf_chk (struct obstack *obstack, int flags, const char *format, ...)
{
    int result;
    va_list ap;
    va_start (ap, format);
    result = __obstack_vprintf_chk (obstack, flags, format, ap);
    va_end (ap);
    return result;
}

"get_source_from_cvepatch.py" file has an error

There is an error in the get_source_from_cvepatch.py file (lines 175 to 205):

        for f in hitOldFunctionList:
            for num in range(f.lines[0], f.lines[1]+1):
                try:
                    listIndex = lnList.index(num)
                except ValueError:
                    pass
                else:
                    if lnList.count(num) > 1:
                        listIndex += 1
                    if pmList[listIndex] == '+' or pmList[listIndex] == '-':
                        flag = 0
                        for commentKeyword in ["/*", "*/", "//", "*"]:
                            if chunkLines[listIndex][1:].lstrip().startswith(commentKeyword):
                                flag = 1
                                break
                        if flag:
                            pass
                        else:
                            finalOldFunctionList.append(f)
                            break
                    else:
                        pass

diff --git a/contrib/libarchive/libarchive/archive_ppmd7.c b/contrib/libarchive/libarchive/archive_ppmd7.c
index fe0b031..1aed922 100644
--- a/contrib/libarchive/libarchive/archive_ppmd7.c
+++ b/contrib/libarchive/libarchive/archive_ppmd7.c
@@ -126,6 +126,11 @@ static Bool Ppmd7_Alloc(CPpmd7 *p, UInt32 size, ISzAlloc *alloc)
 {
   if (p->Base == 0 || p->Size != size)
   {
+    /* RestartModel() below assumes that p->Size >= UNIT_SIZE
+       (see the calculation of m->MinContext). */
+    if (size < UNIT_SIZE) {
+      return False;
+    }
     Ppmd7_Free(p, alloc);
     p->AlignOffset =
       #ifdef PPMD_32BIT

In this diff chunk, the first line of the '+' block starts with "/*", so flag is set to 1, the line is skipped, and control returns to the for loop.

Next, we should check the second line of the '+' block, but the code skips all the remaining '+' lines.

So in cases like this one, where the first line of a '+' (or '-') block is a comment line, the Python script cannot find the vulnerable function.
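
One possible fix is to stop relying on lnList.index(num), which only ever returns the first matching entry, and instead scan every chunk line whose number falls inside the function. The sketch below is only a suggestion, not the project's actual patch. It assumes lnList, pmList and chunkLines are parallel lists (they share one index in the code above), and the snake_case names are mine.

```python
COMMENT_PREFIXES = ("/*", "*/", "//", "*")


def is_comment_line(chunk_line):
    """Same prefix heuristic as the original loop: drop the diff marker, then test."""
    return chunk_line[1:].lstrip().startswith(COMMENT_PREFIXES)


def function_has_code_change(start, end, ln_list, pm_list, chunk_lines):
    """Return True if any added or removed non-comment line lies in [start, end].

    Every entry is inspected, so a leading comment line no longer hides
    the changed code lines that follow it.
    """
    for i, num in enumerate(ln_list):
        if not (start <= num <= end):
            continue  # line is outside the function
        if pm_list[i] not in ("+", "-"):
            continue  # context line, not a change
        if is_comment_line(chunk_lines[i]):
            continue  # skip this line only, keep scanning
        return True
    return False
```

With a helper like this, the outer loop would reduce to appending f to finalOldFunctionList whenever function_has_code_change(f.lines[0], f.lines[1], lnList, pmList, chunkLines) is true.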