GithubHelp home page GithubHelp logo

conanoutlooklvtbs / apted Goto Github PK

View Code? Open in Web Editor NEW

This project forked from databasegroup/apted

0.0 1.0 0.0 205 KB

APTED algorithm for the Tree Edit Distance

Home Page: http://tree-edit-distance.dbresearch.uni-salzburg.at/

License: MIT License

Java 100.00%

apted's Introduction

APTED algorithm for the Tree Edit Distance

Information

This is an implementation of the APTED algorithm, the state-of-the-art solution for computing the tree edit distance [1,2], which supersedes the RTED algorithm [3].

You can find more information on our Tree Edit Distance website http://tree-edit-distance.dbresearch.uni-salzburg.at/

Deprecated API

As we've been pointed, our API had incorrect packaging causing some troubles (especially, the util package). We've fixed the packaging. For the sake of current users, we've left also the old one that we've annotated as deprecated in both, source code and javadoc. We're planning on removing it from the repository at some point.

Citing APTED

If you want to refer to APTED in a publication, please cite [1] and [2].

Licence

The source code is published under the MIT licence found in the root directory of the project and in the header of each source file.

Input

Currently, we support only the so-called bracket notation for the input trees, for example, encoding {A{B{X}{Y}{F}}{C}} corresponds to the following tree:

    A
   / \
  B   C
 /|\
X Y F

Output

Our tool computes two outputs:

  • tree edit distance value - the minimum cost of transforming the source tree into the destination tree.
  • tree edit mapping - a mapping between nodes that corresponds to the tree edit distance value. Nodes that are not mapped are deleted (source tree) or inserted (destination tree).

Customising

If the nodes of your trees have labels different from simple strings and you need a more sophisticated cost model than unit cost, you can customise that. There are three elements that you have to consider. See Javadoc documentation for further details.

Parsing the input

Our current parser BracketStringInputParser takes the bracket-encoded input tree as a string and transforms it to tree structure composed of Node objects. If you'd like to use other encoding, you have to write a custom class that implements InputParser interface.

Node data

The parser creates nodes and stores the corresponding information in Node.nodeData. We use StringNodeData to store simple string labels. If you need anything else, you have to implement your own class. It can be anything, we don't provide any interface.

Cost model

The cost model decides on the costs of edit operations for every node (insertion and deletion) and every node pair (rename). We've implemented a simple StringUnitCostModel that returns 1 for deleting and inserting any node. The rename cost depends on label (StringNodeData) equality.

Write a class that implements CostModel interface if you need a more sophisticated cost model. See PerEditOperationStringNodeDataCostModel which allows different costs for each edit operation.

Using customised APTED

When you have all the bricks ready (MyInputParser, MyNodeData, MyCostModel), execute APTED as follows for sourceTree and destinationTree:

// Parse the input and transform to Node objects storing node information in MyNodeData.
MyInputParser parser = new MyInputParser();
Node<MyNodeData> t1 = parser.fromString(sourceTree);
Node<MyNodeData> t2 = parser.fromString(destinationTree);
// Initialise APTED.
APTED<MyCostModel, MyNodeData> apted = new APTED<>(new MyCostModel());
// Execute APTED.
float result = apted.computeEditDistance(t1, t2);

Execution manual

Execute java -jar apted.jar -h for manual and help.

Building APTED

You can clone the code, compile, and build the JAR file the regular command-line way.

We use Gradle for convenience.

  • install Gradle
  • run gradle test for unit tests (currently correctness tests)
  • run gradle build to find the apted.jar file in build/libs/

Gradle wrapper

We intentionally do not put automatically generated Gradle wrapper files in the repository. We don't like that. However, if it helps, we've added wrapper task section to build.gradle file.

Javadoc documentation

Run gradle javadoc to generate documentation. Then, open in your browser build/docs/javadoc/index.html.

The current and future documentation should cover all classes and their members, including private. The internals of the algorithms and methods are documented within the source code. If anything is missing or unclear, please send us a feedback.

References

  1. M. Pawlik and N. Augsten. Tree edit distance: Robust and memory- efficient. Information Systems 56. 2016.

  2. M. Pawlik and N. Augsten. Efficient Computation of the Tree Edit Distance. ACM Transactions on Database Systems (TODS) 40(1). 2015.

  3. M. Pawlik and N. Augsten. RTED: A Robust Algorithm for the Tree Edit Distance. PVLDB 5(4). 2011.

apted's People

Contributors

augsten avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.