GithubHelp home page GithubHelp logo

prey176 / ri Goto Github PK

View Code? Open in Web Editor NEW

This project forked from infomics/ri

0.0 0.0 0.0 53 KB

RI is a general purpose algorithm for one-to-one exact subgraph isomorphism problem maintaining topological constraints.

License: Other

C++ 97.46% Makefile 0.36% C 2.18%

ri's Introduction

RI

Version: 3.6

RI is a general purpose algorithm for one-to-one exact subgraph isomorphism problem maintaining topological constraints.


Description

RI is a general purpose algorithm for one-to-one exact subgraph isomorphism problem maintaining topological constraints. It is both a C++ library and a standalone tool, providing developing API and a command line interface, with no dependencies out of standard GNU C++ library. RI works on Unix and Mac OS X systems with G++ installed, and it can be compiled under Windows using Gygwin. Working graphs may be directed, undirected, multigraphs with optional attributes both on nodes and edges. Customizable features allow user-defined behaviors for attribute comparisons and the algorithm's flow.

RI aims to provide a better search strategy for the common used backtracking approach to the subgraph isomorphism problem. It can be integrated with additional preprocessing steps or it can be used for the verification of candidate structures coming from data mining, data indexing or other filtering techniques. RI is able to find graphs isomorphisms, subgraph isomorphisms and induced subgraph isomorphisms. It is distributed in several versions divide chiefly in two groups respectively for static or dynamically changing attributes. All proposed versions are developed taking into account trade-offs between time and memory requirements. Optional behaviors such as stop at first encountered match, processing of result matches, type of isomorphism and additional features may be enabled thanks to high modularity and library's API.

The RI project also aims to provide a comparison of existing exact subgraph matching algorithms on synthetic and real life graphs. An initial collection of datasets is proposed, it includes synthetic and biological graphs but further types of data regarding other research areas (i.e. engineering, computer vision, etc...) are coming. A list of proposed applications is also given.

Please send us an email to get software sources or datasets (see Contacts).


Usage

RI - CLI (Command Line Interface)

Before using the RI command line interface of all distributions, please rebuilt it by make -B All versions of RI take in input the same parameters:

 ./ri ISO_TYPE INPUT_FORMAT target_graph pattern_graph    
ISO_TYPE specify isomorphism
iso bijective graphs isomorphism.
ind induced subgraph isomorphism
mono monomorphism, the classical subgraph matching
INPUT_FORMAT specify input file format
gfd directed graphs with attributes only on nodes.
gfu undirected graphs with attributes only on nodes.
ged directed graphs with attributes both on nodes and edges.
geu undirected graphs with attributes both on nodes and edges.
vfu Sansone et al. file format for labeled directed graphs with attributes only on nodes.
Query extractor

Before using the query extractor tool please rebuilt it by digiting make -B querygen, then the usage of the built executable is the following:

./querygen [gfu gfd] input_graph number_of_nodes number_of_edges output_file number_of_queries

The query extractor tries to extract a total amount of number_of_queries subgraphs, from the given input_graph, with the specified number_of_nodes and number_of_edges, saving them at the specified output prefix output_file.

You can set the number_of_nodes or number_of_edges to -1 to do not specify them. For example, let the number_of_edges fixed to a value and the number_of_nodes set to the -1, than the extractor tries to extract a subgraph just with the given number of edges not caring the number of nodes. You can also set number_of_nodes to a decimal value between 0 and 1 to specify it as a percentage of the number_of_edges.

Note that if the input parameters do not reflect the properties of the input_graph you can obtain an unwanted resulting subgraph. So, the extractor tries first to generate a graph with the specified number of edges and the maximum possible number of nodes near to the input value. This does not exclude that a subgraph with the specified number of edges will be extracted.

FocusSearch-C++

Before use the FocusSearch-C++ command line interface of all distributions, please rebuilt it by make -B The tool usage is:

./fsearch ISO_TYPE INPUT_FORMAT target_graph pattern_graph
ISO_TYPE specify isomorphism
mono monomorphism, the classical subgraph matching
INPUT_FORMAT specify input file format
gfd directed graphs with attributes only on nodes.
gfu undirected graphs with attributes only on nodes.
vfu Sansone et al. file format for labeled directed graphs with attributes only on nodes.
Default graph file format

The RI project provides two graph file format gfu and gfd, respectively for undirected and directed graphs with attributes only on nodes.

Graphs are stored in text files containing one or more items. The current input format allows the description of undirect graphs with labels on nodes.

#[graph_name]
[number of nodes]
[label_of_first_node]
[label_of_second_node]
...
[number of edges]
[node id] [node id]
[node id] [node id]
...

Node ids are assigned following the order in which they are written in the input file, starting from 0. [graph_name] and labels can not contain blank characters (spaces). Labels are case sensitive.

An example of input file is the following:

#my_graph
4
A
B
C
Br
5
0 1
2 1
2 3
0 3
0 2

Indeed, an example of input file in geu format (undirected graph with labels both on nodes and attibutes) is:

#my_graph
4
A
B
C
Br
5
0 1 a
2 1 n
2 3 m
0 3 k
0 2 a


Datasets

A biochemical dataset originally used to evaluate the performance of RI and RI-DS is availiabe at this repository.

A further syntetic dataset used to evaluate RI and RI-DS in 2018 is available at the following repository.


License

RI is distributed under the MIT license. This means that it is free for both academic and commercial use. Note however that some third party components in RI require that you reference certain works in scientific publications. You are free to link or use RI inside source code of your own program. If do so, please reference (cite) RI and this website. We appreciate bug fixes and would be happy to collaborate for improvements. License


Citation

 Bonnici, V., Giugno, R., Pulvirenti, A., Shasha, D., & Ferro, A. (2013).
 A subgraph isomorphism algorithm and its application to biochemical data. 
 BMC bioinformatics, 14(7), S13.

 Bonnici V, Giugno R. 
 On the variable ordering in subgraph isomorphism algorithms. 
 IEEE/ACM transactions on computational biology and bioinformatics. 2016 Jan 7;14(1):193-203.

References

if you have used any of the RI project software, please cite the following paper:

Vincenzo Bonnici, Rosalba Giugno, Alfredo Pulvirenti, Dennis Shasha and Alfredo Ferro. A subgraph isomorphism algorithm and its application to biochemical data. BMC Bioinformatics 2013, 14(Suppl 7):S13 doi:10.1186/1471-2105-14-S7-S13.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.