
dist-community's Introduction

Distributed Girvan-Newman algorithm for community detection

This repository contains the code for a paper I wrote during the summer seminar at Petnica (2023). The poster for this paper can be seen here and the paper itself can be read here (page 181). The poster and the paper are only available in Serbian.

Abstract

This paper presents a distributed version of the Girvan-Newman algorithm for community detection. A community is a group of nodes in a graph such that the number of edges within the group is much larger than the number of edges connecting the group to the rest of the graph. Communities are often found in real networks, and detecting them can provide useful insight into the observed network. The Girvan-Newman algorithm works by iteratively finding and deleting the most central edge (the edge with the highest centrality value). After each deletion, the modularity of the graph is calculated, which measures how pronounced the communities in the graph are. The result of the algorithm is the graph for which the maximum modularity was calculated. The disadvantage of this algorithm is that, for graphs with thousands of nodes, execution takes a long time. The idea in this paper is to investigate whether the execution time can be reduced by running the algorithm on multiple computers. The distributed version of the algorithm calculates centrality and modularity on different computers. Computers that calculate centrality form one cluster, while computers that calculate modularity form another cluster. Each computer in a cluster is assigned an interval of nodes over which it performs its calculation. In addition to these clusters, the system contains a main computer whose task is to initialize the entire system, coordinate the clusters, and construct the result from the values calculated by the computers in the clusters. The system was tested with clusters containing 1, 2, and 5 computers. The results show that the execution time of the algorithm decreases as the number of computers in the system increases. However, the optimal size of the system was not found, which is a topic for further research.
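The scheme described in the abstract can be illustrated with a minimal single-process sketch. This is not the code from this repository: all function names below are hypothetical, the "workers" are simulated sequentially in one process, edge betweenness is assumed as the centrality measure, and modularity is computed in one place rather than on a second cluster. The sketch only shows that centrality can be computed per interval of source nodes and summed by a coordinator.

```python
from collections import deque, defaultdict

def partial_edge_betweenness(adj, sources):
    """Edge betweenness contributions from shortest paths starting at `sources`."""
    bet = defaultdict(float)
    for s in sources:
        dist, order = {s: 0}, []
        sigma = defaultdict(float)   # number of shortest paths from s
        sigma[s] = 1.0
        preds = defaultdict(list)    # predecessors on shortest paths
        queue = deque([s])
        while queue:                 # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):    # accumulate dependencies back towards s
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += c
                delta[v] += c
    return bet

def split_intervals(nodes, workers):
    """Split the node list into `workers` contiguous intervals of near-equal size."""
    k, r = divmod(len(nodes), workers)
    out, start = [], 0
    for i in range(workers):
        end = start + k + (1 if i < r else 0)
        out.append(nodes[start:end])
        start = end
    return out

def components(adj):
    """Connected components of the current graph (the candidate communities)."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = [], [s]
        seen.add(s)
        while stack:
            v = stack.pop()
            comp.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def modularity(orig_adj, comps):
    """Modularity of a partition, measured on the original graph (must have edges)."""
    m = sum(len(ns) for ns in orig_adj.values()) / 2
    q = 0.0
    for comp in comps:
        members = set(comp)
        inside = sum(1 for v in comp for w in orig_adj[v] if w in members) / 2
        degree = sum(len(orig_adj[v]) for v in comp)
        q += inside / m - (degree / (2 * m)) ** 2
    return q

def girvan_newman(orig_adj, workers=2):
    """Remove the most central edge repeatedly; keep the partition with the best modularity."""
    adj = {v: set(ns) for v, ns in orig_adj.items()}
    nodes = sorted(adj)
    best = components(adj)
    best_q = modularity(orig_adj, best)
    while any(adj.values()):
        total = defaultdict(float)
        # Each interval stands in for one computer of the centrality cluster.
        for interval in split_intervals(nodes, workers):
            for edge, c in partial_edge_betweenness(adj, interval).items():
                total[edge] += c
        u, v = max(total, key=total.get)   # coordinator picks the most central edge
        adj[u].discard(v)
        adj[v].discard(u)
        comps = components(adj)
        q = modularity(orig_adj, comps)
        if q > best_q:
            best_q, best = q, comps
    return best_q, best

# Two triangles joined by a single bridge edge (2, 3).
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(girvan_newman(graph, workers=2))
```

In this toy graph the bridge carries every shortest path between the two triangles, so it is the first edge removed. Splitting by source node works because the betweenness of an edge is a sum over sources, so partial results from different intervals can simply be added.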

The system was tested on Google Cloud virtual machines. ZooKeeper was used to coordinate the system. All important code is located on the master branch. The cluster2 and cluster5 branches were used only to simplify the setup of the virtual machines in the cloud.

[Figure: system diagram] Schematic of the system

Contributors

blin04
