This project forked from tos-kamiya/agec

Agec, an arbitrary-granularity execution clone detection tool

This project has moved to a new repository: https://github.com/tos-kamiya/agec2

agec

Agec generates all possible execution sequences from Java byte code and detects identical execution sub-sequences that occur at distinct places in the source files.

Usage

Agec's core programs are:

gen_ngram.py. Generates n-grams of execution sequences of a Java program.

det_clone.py. Identifies the same n-grams and reports them as code clones.

Agec also includes the following utilities:

tosl_clone.py. Converts locations of code clones (from byte-code index) to line numbers of source files.

exp_clone.py. Calculates some metrics from each code clone.

run_disasm.py. Disassembles a jar file (with the 'javap' disassembler) and generates disassembly result files.

gen_ngram.py

gen_ngram.py reads the given disassembled Java byte-code files, generates n-grams of method invocations from them, and writes the n-grams to the standard output.

usage: gen_ngram.py -a asm_directory -n size > ngram

Here, 'asm_directory' is a directory that contains the disassembly result files (*.asm). 'size' is the length of each n-gram (the default value is 6).
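To make the role of 'size' concrete, here is a minimal sketch of n-gram extraction over a single, already linear sequence of method invocations. It is not gen_ngram.py's implementation: the function name `ngrams` and the sample invocation strings are hypothetical, and the sketch ignores branches and method calls, which Agec takes into account when it enumerates possible execution sequences.

```python
from typing import List, Tuple


def ngrams(invocations: List[str], size: int = 6) -> List[Tuple[str, ...]]:
    """Return every contiguous window of `size` items from `invocations`."""
    if size <= 0:
        raise ValueError("size must be positive")
    # A sequence shorter than `size` yields no n-grams at all.
    return [
        tuple(invocations[i:i + size])
        for i in range(len(invocations) - size + 1)
    ]


# Hypothetical invocation sequence, for illustration only.
calls = ["A.open()", "A.read()", "B.parse()", "A.close()"]
for gram in ngrams(calls, size=3):
    print(gram)
```

With a sequence of four invocations and a size of 3, the sketch yields two overlapping n-grams. A larger 'size' yields fewer, more specific n-grams.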

Note that each disassembly file needs to be generated from a *.class file with the command 'javap -c -p -l -constants', because gen_ngram.py requires the line number of each byte code.
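The javap invocation above could be scripted roughly as follows. This is a sketch and not the code of run_disasm.py: the helper names `javap_command` and `disassemble` are hypothetical, and the sketch assumes that 'javap' is on the PATH and that the class can be found on the given classpath.

```python
import subprocess
from pathlib import Path
from typing import List

# The flags named in this README; -l emits the line number tables.
JAVAP_FLAGS = ["-c", "-p", "-l", "-constants"]


def javap_command(class_name: str, classpath: str = ".") -> List[str]:
    """Build the javap argument list for one class (hypothetical helper)."""
    return ["javap", "-classpath", classpath, *JAVAP_FLAGS, class_name]


def disassemble(class_name: str, out_dir: Path, classpath: str = ".") -> Path:
    """Run javap and write its output to <out_dir>/<class_name>.asm."""
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / (class_name + ".asm")
    with out_file.open("w") as f:
        # check=True raises CalledProcessError if javap fails.
        subprocess.run(javap_command(class_name, classpath), stdout=f, check=True)
    return out_file
```

For example, `disassemble("ShowWeekday", Path("disasm"))` would write disasm/ShowWeekday.asm, provided ShowWeekday.class is in the current directory.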

det_clone.py

det_clone.py reads an n-gram file, identifies identical n-grams, and writes them as code clones to the standard output.

usage: det_clone.py ngram_file > clone_index

Here, 'ngram_file' is an n-gram file that has been generated with gen_ngram.py.

Each location in the result is given as a byte-code index. To convert these locations to line numbers of source files, use tosl_clone.py.
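The idea of identifying identical n-grams can be sketched by grouping n-grams in a dictionary and keeping those that occur at more than one location. This is only an illustration, not det_clone.py's algorithm or file format: the function name `find_clones` and the (method, byte-code index) location tuples are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Ngram = Tuple[str, ...]
Location = Tuple[str, int]  # hypothetical: (method name, byte-code index)


def find_clones(
    located: Iterable[Tuple[Location, Ngram]]
) -> Dict[Ngram, List[Location]]:
    """Group locations by n-gram and keep n-grams seen at 2+ distinct places."""
    groups: Dict[Ngram, List[Location]] = defaultdict(list)
    for location, ngram in located:
        groups[ngram].append(location)
    return {
        ngram: locations
        for ngram, locations in groups.items()
        if len(set(locations)) > 1
    }


# Hypothetical input data, for illustration only.
data = [
    (("Foo.run", 4), ("open", "read", "close")),
    (("Bar.load", 12), ("open", "read", "close")),
    (("Bar.load", 30), ("read", "parse", "close")),
]
for ngram, locations in find_clones(data).items():
    print(ngram, "at", locations)
```

Here only the first n-gram is reported, because it appears in both Foo.run and Bar.load.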

tosl_clone.py

tosl_clone.py reads Java byte-code files and a code-clone detection result, converts each code-clone location into a line number, and writes the converted code-clone data to the standard output.

usage: tosl_clone.py -a asm_directory clone_index > clone-linenum

Here, 'asm_directory' is a directory containing disassembled result files and 'clone_index' is the code-clone detection result that has been generated with det_clone.py.
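The conversion relies on the line number tables that 'javap -l' prints, whose entries have the form 'line N: I' (source line N starts at byte-code index I). The following sketch shows one way to map a byte-code index to a line number for a single method. It is not tosl_clone.py's implementation: the function names and the sample table are hypothetical, and a real .asm file contains one table per method.

```python
import bisect
import re
from typing import List, Tuple

# Matches one LineNumberTable entry, e.g. "      line 4: 8".
LINE_ENTRY = re.compile(r"^\s*line (\d+): (\d+)\s*$")


def parse_line_table(text: str) -> List[Tuple[int, int]]:
    """Return (start_index, line_number) pairs sorted by start_index."""
    entries = []
    for raw in text.splitlines():
        match = LINE_ENTRY.match(raw)
        if match:
            entries.append((int(match.group(2)), int(match.group(1))))
    return sorted(entries)


def line_for_index(entries: List[Tuple[int, int]], index: int) -> int:
    """Return the source line whose byte-code range contains `index`."""
    starts = [start for start, _ in entries]
    # The last entry starting at or before `index` covers it.
    pos = bisect.bisect_right(starts, index) - 1
    if pos < 0:
        raise ValueError("index precedes the first table entry")
    return entries[pos][1]


# Hypothetical table for one method, for illustration only.
sample = """\
    LineNumberTable:
      line 3: 0
      line 4: 8
      line 5: 15
"""
table = parse_line_table(sample)
print(line_for_index(table, 10))
```

In the sample table, byte-code index 10 falls between the entries starting at 8 and 15, so it maps to source line 4.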

A Small Example

This example detects code clones in a single Java file, ShowWeekday.java.

$ javac ShowWeekday.java
$ mkdir -p disasm
$ javap -c -p -l -constants ShowWeekday > disasm/ShowWeekday.asm
$ gen_ngram.py -a disasm > ngrams.txt
$ det_clone.py ngrams.txt > clone-indices.txt
$ tosl_clone.py -a disasm clone-indices.txt > clone-linenums.txt

Publication

  • Toshihiro Kamiya, "Agec: An Execution-Semantic Clone Detection Tool," Proc. IEEE ICPC 2013, pp. 227-229.

License

Agec is distributed under the MIT License.
