puzzlef / louvain-communities-openmp-dynamic


Design of OpenMP-based Parallel Dynamic Louvain algorithm for community detection.

Home Page: https://arxiv.org/abs/2404.19634

License: MIT License


louvain-communities-openmp-dynamic's Introduction



Community detection is the problem of recognizing natural divisions in networks. A relevant challenge in this problem is to find communities on rapidly evolving graphs. In this report, we present our Parallel Dynamic Frontier (DF) Louvain algorithm. Given a batch update of edge deletions and insertions, it incrementally identifies and processes an approximate set of affected vertices in the graph with minimal overhead, while using a novel approach of incrementally updating the weighted degrees of vertices and the total edge weights of communities. We also present our parallel implementations of Naive-dynamic (ND) and Delta-screening (DS) Louvain. On a server with a 64-core AMD EPYC-7742 processor, our experiments show that DF Louvain obtains speedups of 179x, 7.2x, and 5.3x on real-world dynamic graphs compared to Static, ND, and DS Louvain, respectively, and is 183x, 13.8x, and 8.7x faster, respectively, on large graphs with random batch updates. Moreover, DF Louvain improves its performance by 1.6x for every doubling of threads.
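To make the incremental update concrete, here is a minimal serial sketch of how a batch of edge deletions and insertions could adjust the weighted degrees and community totals carried over from the previous snapshot, and mark an initial set of affected vertices. The names `Edge`, `applyBatch`, `vcom`, `vtot`, `ctot`, and `affected` are hypothetical and not taken from the repository. The marking rule (deletions inside a community, insertions across communities) is an assumption about the approach; refer to the technical report for the actual rule and to `inc/louvain.hxx` for the real, OpenMP-parallel code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical edge record; the repository's own graph types may differ.
struct Edge {
  std::size_t u, v;  // endpoints; each undirected edge is assumed listed in both directions
  double w;          // edge weight
};

// Apply a batch update to the auxiliary state kept from the previous snapshot.
//   vcom[u]     : community of vertex u from the previous run
//   vtot[u]     : weighted degree of vertex u
//   ctot[c]     : total edge weight of community c
//   affected[u] : set to 1 when u should be reprocessed by local-moving
void applyBatch(const std::vector<Edge>& deletions,
                const std::vector<Edge>& insertions,
                const std::vector<std::size_t>& vcom,
                std::vector<double>& vtot,
                std::vector<double>& ctot,
                std::vector<std::uint8_t>& affected) {
  assert(vcom.size() == vtot.size() && vcom.size() == affected.size());
  for (const Edge& e : deletions) {
    vtot[e.u] -= e.w;
    ctot[vcom[e.u]] -= e.w;
    // Assumed rule: removing an intra-community edge may let the community split.
    if (vcom[e.u] == vcom[e.v]) affected[e.u] = 1;
  }
  for (const Edge& e : insertions) {
    vtot[e.u] += e.w;
    ctot[vcom[e.u]] += e.w;
    // Assumed rule: adding a cross-community edge may pull u toward v's community.
    if (vcom[e.u] != vcom[e.v]) affected[e.u] = 1;
  }
}
```

The cost of this step is proportional to the batch size rather than the graph size. A parallel version would additionally need atomic updates, or a per-source partition of the batch, to avoid races on `vtot` and `ctot`.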


Below we illustrate the mean runtime and modularity of communities obtained with our parallel implementation of Static, Naive-dynamic (ND), Delta-screening (DS), and Dynamic Frontier (DF) Louvain on real-world dynamic graphs on batch updates of size 10^-5|Eᴛ| to 10^-3|Eᴛ| (where |Eᴛ| is the number of temporal edges). In (a), the speedup of each approach with respect to Static Louvain is labeled.

Next, we plot the average time taken by Static, ND, DS, and DF Louvain on large graphs with random batch updates of size 10^-7|E| to 0.1|E|. In (a), the speedup of each approach with respect to Static Louvain is labeled.

Refer to our technical report for more details:
DF Louvain: Fast Incrementally Expanding Approach for Community Detection on Dynamic Graphs.


Note

You can just copy main.sh to your system and run it.
For the code, refer to main.cxx.



Code structure

The code structure of our multicore implementation of Dynamic Frontier (DF) Louvain is as follows:

- inc/_algorithm.hxx: Algorithm utility functions
- inc/_bitset.hxx: Bitset manipulation functions
- inc/_cmath.hxx: Math functions
- inc/_ctypes.hxx: Data type utility functions
- inc/_cuda.hxx: CUDA utility functions
- inc/_debug.hxx: Debugging macros (LOG, ASSERT, ...)
- inc/_iostream.hxx: Input/output stream functions
- inc/_iterator.hxx: Iterator utility functions
- inc/_main.hxx: Main program header
- inc/_mpi.hxx: MPI (Message Passing Interface) utility functions
- inc/_openmp.hxx: OpenMP utility functions
- inc/_queue.hxx: Queue utility functions
- inc/_random.hxx: Random number generation functions
- inc/_string.hxx: String utility functions
- inc/_utility.hxx: Runtime measurement functions
- inc/_vector.hxx: Vector utility functions
- inc/batch.hxx: Batch update generation functions
- inc/bfs.hxx: Breadth-first search algorithms
- inc/csr.hxx: Compressed Sparse Row (CSR) data structure functions
- inc/dfs.hxx: Depth-first search algorithms
- inc/duplicate.hxx: Graph duplicating functions
- inc/Graph.hxx: Graph data structure functions
- inc/louvain.hxx: Louvain algorithm functions
- inc/main.hxx: Main header
- inc/mtx.hxx: Graph file reading functions
- inc/properties.hxx: Graph property functions
- inc/selfLoop.hxx: Graph self-looping functions
- inc/symmetrize.hxx: Graph symmetrization functions
- inc/transpose.hxx: Graph transpose functions
- inc/update.hxx: Update functions
- main.cxx: Experimentation code
- process.js: Node.js script for processing output logs

Note that each branch in this repository contains code for a specific experiment. The main branch contains code for the final experiment. If the intention of a branch is unclear, or if you have comments on our technical report, feel free to open an issue.




louvain-communities-openmp-dynamic's People

Contributors: wolfram77

louvain-communities-openmp-dynamic's Issues

Preallocate memory for improving performance

Pass jump seems to cause more louvainAggregate() calls, which are slow, most likely because memory is not allocated ahead of time. This includes memory for storing the community vertices (comv) as well as memory needed to store the aggregated graph. comv can be preallocated if we count the number of vertices in each community beforehand (either just before aggregation, or by tracking the counts from the start of the local-moving phase). But how do we preallocate memory for storing the graph?

  • Preallocate memory for storing community vertices (need prefix sum?)
  • Preallocate memory for storing graph (CSR like?)
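The first item can be sketched as a counting pass followed by a prefix sum, which yields a CSR-like layout that needs exactly one allocation per array. This is only an illustration of the idea raised above, not the repository's implementation; `CommunityVertices` and `groupByCommunity` are hypothetical names, and the sketch is serial.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CSR-like layout: the vertices of community c are data[offsets[c]] .. data[offsets[c + 1] - 1].
struct CommunityVertices {
  std::vector<std::size_t> offsets;  // size = numCommunities + 1
  std::vector<std::size_t> data;     // size = number of vertices
};

// Group vertex ids by community, allocating both arrays up front.
CommunityVertices groupByCommunity(const std::vector<std::size_t>& vcom,
                                   std::size_t numCommunities) {
  CommunityVertices cv;
  cv.offsets.assign(numCommunities + 1, 0);
  cv.data.resize(vcom.size());
  // 1. Count the vertices of each community, shifted by one slot.
  for (std::size_t c : vcom) {
    assert(c < numCommunities);
    ++cv.offsets[c + 1];
  }
  // 2. A running sum over the shifted counts turns them into start offsets.
  for (std::size_t c = 0; c < numCommunities; ++c)
    cv.offsets[c + 1] += cv.offsets[c];
  // 3. Scatter each vertex into its community's range using a per-community cursor.
  std::vector<std::size_t> cursor(cv.offsets.begin(), cv.offsets.end() - 1);
  for (std::size_t u = 0; u < vcom.size(); ++u)
    cv.data[cursor[vcom[u]]++] = u;
  return cv;
}
```

For the second item, one option (an assumption, not something the issue settles) is to reserve for each community the sum of the degrees of its member vertices. A community cannot have more distinct neighbouring communities than that, so the same count-then-prefix-sum pattern gives a safe, if loose, upper bound for a CSR-like aggregated graph.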

Use refine-based local-moving phase for the first pass

Given a community membership, performing the local-moving phase directly may prevent existing communities from splitting, because no vertex inside a community belongs to another community. In the refine-based local-moving phase, the community membership of vertices is reset, and then local-moving is performed within each community. This allows existing communities to split, and is likely to discover communities of higher quality (as with the Leiden algorithm). Of course, it is likely to have higher overhead than a plain local-moving phase. Further, the refine-based local-moving phase can be performed as a randomized algorithm instead of a greedy one. With randomized local-moving, the chance of moving a vertex to a given community can be proportional to the delta modularity of that move.

  • Performance of greedy refine-based local-moving phase
  • Performance of randomized refine-based local-moving phase
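The randomized variant could pick a target community as sketched below, with probability proportional to the positive delta modularity of each candidate move. This is a hedged illustration only: `pickCommunity` is a hypothetical helper, the delta-modularity values are assumed to be computed elsewhere, and treating non-positive gains as weight zero is one possible choice rather than something the issue specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// deltaQ[i] is the delta modularity of moving the vertex to candidate community i.
// Returns the index of the chosen candidate, or deltaQ.size() to mean
// "stay in the current community" when no candidate has a positive gain.
template <class Rng>
std::size_t pickCommunity(const std::vector<double>& deltaQ, Rng& rng) {
  std::vector<double> weights(deltaQ.size());
  double total = 0.0;
  for (std::size_t i = 0; i < deltaQ.size(); ++i) {
    weights[i] = deltaQ[i] > 0.0 ? deltaQ[i] : 0.0;  // ignore non-improving moves
    total += weights[i];
  }
  if (total <= 0.0) return deltaQ.size();
  // std::discrete_distribution normalizes the weights into probabilities.
  std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
  return dist(rng);
}
```

The greedy variant corresponds to always taking the candidate with the largest `deltaQ`. In a multithreaded setting, each thread would need its own random engine so that sampling does not become a point of contention.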
