tonellotto / hashtomin

This project forked from teto1992/hashtomin

A MapReduce implementation of HashToMin for finding Connected Components in a graph.

License: MIT License

Java 100.00%

HashToMin

A Hadoop MapReduce implementation of the HashToMin algorithm for finding the connected components of a graph G. The input file specifies either the edges of the graph or the adjacency list of each node: each line represents either an edge or an already formed cluster within G, with vertex identifiers separated by a space or a tab. The output file contains one connected component per line: the node that labels the component, followed by a tab and all of the component's nodes separated by spaces. Sample input files can be found in the folder inputfiles.
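
To make the input and output formats concrete, here is a minimal in-memory sketch of the HashToMin iteration in plain Java. It is not the project's MapReduce code. It follows the usual description of the algorithm: in every round each node v sends its cluster C_v to the smallest node in C_v, and sends that smallest node to every member of C_v. The class and method names are hypothetical, and the sketch assumes that vertex identifiers are integers, that the label of a component is its smallest identifier, and that the node list after the tab includes the label itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical illustration class, not part of the project.
final class HashToMinSketch {

    /** Builds the initial clusters C_v = {v} plus neighbours from edge or adjacency-list lines. */
    static TreeMap<Long, TreeSet<Long>> parse(List<String> lines) {
        TreeMap<Long, TreeSet<Long>> clusters = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Identifiers are separated by spaces or tabs (assumed numeric here).
            String[] ids = trimmed.split("[ \t]+");
            long first = Long.parseLong(ids[0]);
            TreeSet<Long> firstCluster = clusters.computeIfAbsent(first, k -> new TreeSet<>());
            for (String id : ids) {
                long v = Long.parseLong(id);
                firstCluster.add(v);
                // Link every node on the line back to the first one, so the graph is undirected.
                TreeSet<Long> cv = clusters.computeIfAbsent(v, k -> new TreeSet<>());
                cv.add(v);
                cv.add(first);
            }
        }
        return clusters;
    }

    /** One round: v sends C_v to min(C_v), and sends min(C_v) to every node in C_v. */
    static TreeMap<Long, TreeSet<Long>> round(Map<Long, TreeSet<Long>> clusters) {
        TreeMap<Long, TreeSet<Long>> next = new TreeMap<>();
        for (TreeSet<Long> cluster : clusters.values()) {
            long min = cluster.first();
            next.computeIfAbsent(min, k -> new TreeSet<>()).addAll(cluster);
            for (long u : cluster) {
                next.computeIfAbsent(u, k -> new TreeSet<>()).add(min);
            }
        }
        return next;
    }

    /** Iterates until no cluster changes, then formats one component per line. */
    static List<String> connectedComponents(List<String> lines) {
        TreeMap<Long, TreeSet<Long>> clusters = parse(lines);
        while (true) {
            TreeMap<Long, TreeSet<Long>> next = round(clusters);
            if (next.equals(clusters)) {
                break;
            }
            clusters = next;
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, TreeSet<Long>> e : clusters.entrySet()) {
            // Only the smallest node of a component ends up holding the whole component.
            if (e.getKey().equals(e.getValue().first())) {
                String nodes = e.getValue().stream()
                        .map(String::valueOf)
                        .collect(Collectors.joining(" "));
                out.add(e.getKey() + "\t" + nodes);
            }
        }
        return out;
    }
}
```

Under these assumptions, the three input lines `1 2`, `2 3` and `4 5` yield two output lines: label 1 followed by a tab and `1 2 3`, and label 4 followed by a tab and `4 5`.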

Usage is simple. Instantiate the class

public ConnectedComponents (String input,
                            String output, 
                            int reduceTasksNumber,
                            boolean verifyResult,
                            boolean secondarySort) 

where:

  • input and output specify the input and output file paths,
  • reduceTasksNumber specifies the number of reducers used by every job except the Export procedure, which must output a single file,
  • verifyResult runs the CountNodes and Verifier jobs when set to true,
  • secondarySort selects the version of the algorithm: HashToMinSecondarySort runs when it is true.

Then call the run() method on the new object.
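
A call could look like the sketch below. The nested ConnectedComponents class is only a stand-in with the constructor signature shown above, included so that the example compiles on its own; in a real driver it would be removed and the project's class imported instead. The paths, the reducer count, the void return type of run() and the describe() helper are assumptions made for illustration.

```java
// Hypothetical driver class, not part of the project.
final class RunHashToMin {

    // Stand-in for the project's ConnectedComponents class (assumption).
    // Delete it and import the real class when using the project.
    static final class ConnectedComponents {
        private final String input;
        private final String output;
        private final int reduceTasksNumber;
        private final boolean verifyResult;
        private final boolean secondarySort;

        ConnectedComponents(String input, String output, int reduceTasksNumber,
                            boolean verifyResult, boolean secondarySort) {
            this.input = input;
            this.output = output;
            this.reduceTasksNumber = reduceTasksNumber;
            this.verifyResult = verifyResult;
            this.secondarySort = secondarySort;
        }

        // Helper that exists only in this stand-in.
        String describe() {
            return input + " -> " + output + " (reducers=" + reduceTasksNumber
                    + ", verify=" + verifyResult + ", secondarySort=" + secondarySort + ")";
        }

        // The real run() launches the Hadoop jobs; the stand-in only prints its settings.
        void run() {
            System.out.println(describe());
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical paths: 4 reducers, verification on, plain HashToMin (no secondary sort).
        ConnectedComponents cc =
                new ConnectedComponents("inputfiles/graph.txt", "output", 4, true, false);
        cc.run();
    }
}
```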

Alternatively, run the jar on an input file by issuing the command

hadoop jar ./target/HashToMin-1.0.jar <input> <output> <numberOfReducers>

from the project folder.

Contributors: teto1992