gbbs's Introduction

GBBS: Graph Based Benchmark Suite

Organization

This repository contains code for our SPAA paper "Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable" (SPAA'18). It includes implementations of the following parallel graph algorithms:

Clustering Problems

  • SCAN Graph Clustering
  • Graph-Based Hierarchical Agglomerative Clustering (Graph HAC)

Connectivity Problems

  • Low-Diameter Decomposition
  • Connectivity
  • Spanning Forest
  • Biconnectivity
  • Minimum Spanning Tree
  • Strongly Connected Components

Covering Problems

  • Coloring
  • Maximal Matching
  • Maximal Independent Set
  • Approximate Set Cover

Eigenvector Problems

  • PageRank

Substructure Problems

  • Triangle Counting
  • Approximate Densest Subgraph
  • k-Core (Coreness)
  • Degeneracy Ordering (Low-Outdegree Orientation)
  • k-Clique Counting
  • 5-Cycle Counting
  • k-Truss

Shortest Path Problems

  • Unweighted SSSP (Breadth-First Search)
  • General Weight SSSP (Bellman-Ford)
  • Integer Weight SSSP (Weighted Breadth-First Search)
  • Single-Source Betweenness Centrality
  • Single-Source Widest Path
  • k-Spanner

The code for these applications is located in the benchmarks directory. The implementations are based on the Ligra/Ligra+/Julienne graph processing frameworks. The framework code is located in the gbbs directory.

If you use our work, please cite our paper:

@inproceedings{dhulipala2018theoretically,
  author    = {Laxman Dhulipala and
               Guy E. Blelloch and
               Julian Shun},
  title     = {Theoretically Efficient Parallel Graph Algorithms Can Be Fast and
               Scalable},
  booktitle = {ACM Symposium on Parallelism in Algorithms and Architectures (SPAA)},
  year      = {2018},
}

Compilation

Compiler:

  • g++ >= 7.4.0 with support for Cilk Plus
  • g++ >= 7.4.0 with pthread support (Homemade Scheduler)

Build system:

  • Bazel 2.1.0
  • Make. Although our primary build system is Bazel, we also maintain Makefiles for those who wish to run benchmarks without installing Bazel.

The default compilation uses a lightweight scheduler developed at CMU (Homemade) for parallelism, which results in comparable performance to Cilk Plus. The half-lengths for certain functions such as histogramming are lower using Homemade, which results in better performance for codes like KCore.

The benchmark supports both uncompressed and compressed graphs. The uncompressed format is identical to the uncompressed format in Ligra. The compressed format, called bytepd_amortized (bytepda), is similar to the parallelByte format used in Ligra+, with some additional functionality to efficiently support packs, filters, and other operations over neighbor lists.

To compile codes for graphs with more than 2^32 edges, the GBBSLONG command-line parameter should be set. If the graph has more than 2^32 vertices, the GBBSEDGELONG command-line parameter should be set. Note that the codes have not been tested with more than 2^32 vertices, so if any issues arise please contact Laxman Dhulipala.

To compile with the Cilk Plus scheduler instead of the Homemade scheduler, use the Bazel configuration --config=cilk. To compile using OpenMP instead, use the Bazel configuration --config=openmp. To compile serially instead, use the Bazel configuration --config=serial. (For the Makefiles, instead set the environment variables CILK, OPENMP, or SERIAL respectively.)

To build:

# Load external libraries as submodules. (This only needs to be run once.)
git submodule update --init

# For Bazel:
$ bazel build  //...  # compiles all benchmarks

# For Make:
# First set the appropriate environment variables, e.g., first run
# `export CILK=1` to compile with Cilk Plus.
# After that, build using `make`.
$ cd benchmarks/BFS/NonDeterministicBFS  # go to a benchmark
$ make

Note that the default compilation mode in Bazel is to build optimized binaries (stripped of debug symbols). You can compile debug binaries by supplying -c dbg to the bazel build command.

The following commands clean the directory:

# For Bazel:
$ bazel clean  # removes all executables

# For Make:
$ make clean  # removes executables for the current directory

Running code

The applications take the input graph as an argument, as well as an optional flag "-s" to indicate a symmetric graph. Symmetric graphs should be run with the "-s" flag for better performance. For example:

# For Bazel:
$ bazel run //benchmarks/BFS/NonDeterministicBFS:BFS_main -- -s -src 10 ~/gbbs/inputs/rMatGraph_J_5_100
$ bazel run //benchmarks/IntegralWeightSSSP/JulienneDBS17:wBFS_main -- -s -w -src 15 ~/gbbs/inputs/rMatGraph_WJ_5_100

# For Make:
$ ./BFS -s -src 10 ../../../inputs/rMatGraph_J_5_100
$ ./wBFS -s -w -src 15 ../../../inputs/rMatGraph_WJ_5_100

Note that the codes that compute single-source shortest paths (or centrality) take an extra -src flag. The benchmark is run four times by default; this can be changed by passing the -rounds flag followed by an integer indicating the number of runs.

On NUMA machines, adding the command "numactl -i all " when running the program may improve performance for large graphs. For example:

$ numactl -i all bazel run [...]

Running code on compressed graphs

We make use of the bytePDA format in our benchmark, which is similar to the parallelByte format of Ligra+, extended with additional functionality. We have provided a converter utility which takes as input an uncompressed graph and outputs a bytePDA graph. The converter can be used as follows:

# For Bazel:
bazel run //utils:compressor -- -s -o ~/gbbs/inputs/rMatGraph_J_5_100.bytepda ~/gbbs/inputs/rMatGraph_J_5_100
bazel run //utils:compressor -- -s -w -o ~/gbbs/inputs/rMatGraph_WJ_5_100.bytepda ~/gbbs/inputs/rMatGraph_WJ_5_100

# For Make:
./compressor -s -o ../inputs/rMatGraph_J_5_100.bytepda ../inputs/rMatGraph_J_5_100
./compressor -s -w -o ../inputs/rMatGraph_WJ_5_100.bytepda ../inputs/rMatGraph_WJ_5_100

After an uncompressed graph has been converted to the bytepda format, applications can be run on it by passing in the usual command-line flags, with an additional -c flag.

# For Bazel:
$ bazel run //benchmarks/BFS/NonDeterministicBFS:BFS_main -- -s -c -src 10 ~/gbbs/inputs/rMatGraph_J_5_100.bytepda

# For Make:
$ ./BFS -s -c -src 10 ../../../inputs/rMatGraph_J_5_100.bytepda
$ ./wBFS -s -w -c -src 15 ../../../inputs/rMatGraph_WJ_5_100.bytepda

When processing large compressed graphs, using the -m command-line flag can help if the file is already in the page cache, since the compressed graph data can be mmap'd. Application performance will be affected if the file is not already in the page cache. We have found that using -m when the compressed graph is backed by SSD results in a slow first run, followed by fast subsequent runs.

Running code on binary-encoded graphs

We make use of a binary-graph format in our benchmark. The binary representation stores the representation we use for in-memory processing (compressed sparse row) directly on disk, which enables applications to avoid string-conversion overheads associated with the adjacency graph format described below. We have provided a converter utility which takes as input an uncompressed graph (e.g., in adjacency graph format) and outputs this graph in the binary format. The converter can be used as follows:

# For Bazel:
bazel run //utils:compressor -- -s -o ~/gbbs/inputs/rMatGraph_J_5_100.binary ~/gbbs/inputs/rMatGraph_J_5_100

# For Make:
./compressor -s -o ../inputs/rMatGraph_J_5_100.binary ../inputs/rMatGraph_J_5_100

After an uncompressed graph has been converted to the binary format, applications can be run on it by passing in the usual command-line flags, with an additional -b flag. Note that the application will always load the binary file using mmap.

# For Bazel:
$ bazel run //benchmarks/BFS/NonDeterministicBFS:BFS_main -- -s -b -src 10 ~/gbbs/inputs/rMatGraph_J_5_100.binary

# For Make:
$ ./BFS -s -b -src 10 ../../../inputs/rMatGraph_J_5_100.binary

Note that application performance will be affected if the file is not already in the page cache. We have found that using -m when the binary graph is backed by SSD or disk results in a slow first run, followed by fast subsequent runs.

Input Formats

We support the adjacency graph format used by the Problem Based Benchmark Suite and Ligra.

The adjacency graph format starts with a sequence of offsets, one for each vertex, followed by a sequence of directed edges ordered by their source vertex. The offset for a vertex i refers to the location of the start of a contiguous block of out-edges for vertex i in the sequence of edges. The block continues until the offset of the next vertex, or the end if i is the last vertex. All vertices and offsets are 0-based and represented in decimal. The specific format is as follows:

AdjacencyGraph
<n>
<m>
<o0>
<o1>
...
<o(n-1)>
<e0>
<e1>
...
<e(m-1)>

This file is represented as plain text.

Weighted graphs are represented in the weighted adjacency graph format. The file should start with the string "WeightedAdjacencyGraph". The m edge weights should be stored after all of the edge targets in the .adj file.
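
As an illustration of the layout described above, here is a minimal C++ sketch of a reader for the unweighted format. It is not GBBS's own loader: the names AdjGraph, parse_adjacency_graph, and out_degree are hypothetical, and the sketch reads sequentially from any std::istream instead of using the library's parallel I/O.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory form of an unweighted adjacency graph file.
struct AdjGraph {
  std::size_t n = 0;                 // number of vertices
  std::size_t m = 0;                 // number of directed edges
  std::vector<std::size_t> offsets;  // offsets[i] = start of vertex i's block
  std::vector<std::uint32_t> edges;  // edge targets, grouped by source vertex
};

// Reads the header, then n, m, the n offsets, and the m edge targets.
inline AdjGraph parse_adjacency_graph(std::istream& in) {
  std::string header;
  if (!(in >> header) || header != "AdjacencyGraph") {
    throw std::runtime_error("expected AdjacencyGraph header");
  }
  AdjGraph g;
  if (!(in >> g.n >> g.m)) throw std::runtime_error("missing n or m");
  g.offsets.resize(g.n);
  g.edges.resize(g.m);
  for (auto& o : g.offsets) {
    if (!(in >> o)) throw std::runtime_error("truncated offsets");
  }
  for (auto& e : g.edges) {
    if (!(in >> e)) throw std::runtime_error("truncated edges");
  }
  return g;
}

// The block for vertex i ends at the next offset, or at m for the last vertex.
inline std::size_t out_degree(const AdjGraph& g, std::size_t i) {
  assert(i < g.n);
  const std::size_t end = (i + 1 < g.n) ? g.offsets[i + 1] : g.m;
  return end - g.offsets[i];
}
```

For example, a 3-vertex graph with edges 0→1, 0→2, and 2→0 is written as the lines AdjacencyGraph, 3, 3, 0, 2, 2, 1, 2, 0. With that input, out_degree gives 2, 0, and 1 for vertices 0, 1, and 2.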

Using SNAP graphs

Graphs from the SNAP dataset collection are commonly used for graph algorithm benchmarks. We provide a tool that converts the most common SNAP graph format to the adjacency graph format that GBBS accepts. Usage example:

# Download a graph from the SNAP collection.
wget https://snap.stanford.edu/data/wiki-Vote.txt.gz
gzip --decompress ${PWD}/wiki-Vote.txt.gz
# Run the SNAP-to-adjacency-graph converter.
# Run with Bazel:
bazel run //utils:snap_converter -- -s -i ${PWD}/wiki-Vote.txt -o <output file>
# Or run with Make:
#   cd utils
#   make snap_converter
#   ./snap_converter -s -i <input file> -o <output file>

gbbs's People

Contributors

akatsukis, dnezam, eisenstatdavid, hrl20, jeshi96, ldhulipala, qqliu, tomtseng, wangyiqiu, xiaoenj, yushangdi

gbbs's Issues

No makefiles

The readme says you can use make to build, and that there should be a makefile inside each benchmark's folder. However, the repo includes none of those.

Bugs in Computing the In-degree of the Last Vertex

In the gbbs/graph_io.cc file, lines 187-191, computing v_in_data[n-1].degree uses tOffsets[n], but tOffsets only has size n, so the value of v_in_data[n-1].degree is wrong. It can be fixed by adding v_in_data[n-1].degree = m - tOffsets[n-1]; after the parallel_for.
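
The boundary case described here can be sketched in isolation. The function below is a hypothetical, sequential stand-in, not the code in graph_io.cc: it assumes an offsets array of exactly n entries with no sentinel at index n, and a plain loop replaces the parallel_for.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Degrees from an offsets array of length n (no extra entry at index n).
inline std::vector<std::size_t> degrees_from_offsets(
    const std::vector<std::size_t>& offsets, std::size_t m) {
  const std::size_t n = offsets.size();
  std::vector<std::size_t> degree(n);
  // Every vertex except the last ends where the next vertex begins.
  for (std::size_t i = 0; i + 1 < n; ++i) {
    assert(offsets[i] <= offsets[i + 1]);
    degree[i] = offsets[i + 1] - offsets[i];
  }
  // The last vertex's block runs to the end of the edge array, i.e. to m.
  if (n > 0) {
    assert(offsets[n - 1] <= m);
    degree[n - 1] = m - offsets[n - 1];
  }
  return degree;
}
```

With offsets {0, 2, 2} and m = 3, this yields degrees 2, 0, and 1, and it never reads offsets[n].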

collect2: error ld returned 1 exit status

Hi, I am trying to use your BFS and PageRank but received a "returned 1 exit status" error when I ran 'make'. I am trying to run this on Ubuntu 20.04 under WSL2 and ran the following commands.

$ sudo apt update
$ sudo apt upgrade -y
$ sudo apt-get install g++ -y
$ sudo apt install make -y
$ sudo apt update
$ mkdir graph_benchmarks
$ git clone https://github.com/ParAlg/gbbs ~/graph_benchmarks
$ export OPENMP=1
$ cd graph_benchmarks/benchmarks/BFS/NonDeterministicBFS
$ make

Please let me know if there is any other information you need from me.

I see these notes and warnings occur during the make process. I have the full results attached.

../../../gbbs/edge_map_blocked.h:446:23: warning: moving a local object in a return statement prevents copy elision [-Wpessimizing-move]
446 | return std::move(ret);
| ^
../../../gbbs/edge_map_blocked.h:446:23: note: remove ‘std::move’ call

BFS_Error.txt

Sage - Performance is off

Hey there,

I have a system that is similar to the one described in the paper and was trying to compare some of the results on the WDC'12 graph but found really poor results. In particular, I found a ~15x difference for BFS (11 vs 150) in 1LM. I was wondering if it could be due to specific system settings (BIOS, etc.), myself not using the software correctly, or the input graph itself.

As an example of running Sage, I used:
NUM_THREADS=96 ./sage/benchmarks/BFS/SemiAsymmetric/BFS -src 551626522 -c -m -f1 /mnt/pmem0/csr-wdc12.bytepda -f2 /mnt/pmem1/csr-wdc12.bytepda

An important difference from the paper is that I'm using the directed form at the moment, but should the performance be that drastically worse? I also turned on THP, but it didn't seem to make much of a difference. Any and all help is much appreciated!

Unsupported Input

I have an issue with unsupported input when using files from snap.stanford.edu and converting them with snap_converter in utils. I downloaded both wiki-Vote.txt.gz and roadNet-TX.txt.gz and tested them with BFS, PageRank, and Color Graph. I have attached my steps. Did I miss something?

Thank you
Unsupported_Input_Steps.txt

Utils (and compressor by extension) don't build

Hello.

The tools in the utils directory do not build. It seems to be some kind of signature mismatch.

Additionally, the Makefile refers to "converter", which no longer exists in the directory. I assume it should be "compressor" (which also has build problems).

I've attached build error output for the add_weights executable here. This is a result of running "make" in the utils directory with gcc/6.1.

build_error.txt

Regards,
Loc Hoang

Missing return value in graph.h (Undefined Behavior?)

Hello,

while trying to implement an algorithm using your library, I have come across issues when trying to use std::move on a symmetric graph, namely bad_alloc or stack smashing.

After running the code with sanitizers (address and undefined), I got the following message:
gbbs/graph.h:150:20: runtime error: execution reached the end of a value-returning function without returning a value
It seems that the move assignments in graph.h are missing return *this. This might be undefined behaviour, but I am not sure as I am not too familiar with C++. After adding return *this to the end of symmetric_graph& operator=(symmetric_graph&& other) noexcept , the algorithm ran without crashing.

Let me know if you'd like more details regarding when exactly these issues occurred.

Thanks,
Daniel

clustering with average linkage

Hi,

I've come across the preprint on hierarchical clustering (https://arxiv.org/abs/2106.05610), and this method seems to be exactly what I need.
I managed to install the gbbs library using bazel and also to run HierarchicalAgglomerativeClustering using the python bindings, but only for single and complete linkage.
from HAC_api.h in benchmarks/Clustering/SeqHAC it looks like these are the only ones exported to the API, but an earlier commit (1ecf43c) used to have the other methods in there. I failed to successfully use the library on that commit, though, because graph input and output changed and HierarchicalAgglomerativeClustering is not available as method.
I also did not manage to include the other linkage options in HAC_api.h, apparently because the call signatures changed somewhat in between.

is there a way to make average linkage clustering available via python bindings, or alternatively from the command line? I did not understand how to run the clustering this way.

any help would be greatly appreciated!

Thanks!

deletion_fn() is called too late, resulting in SEGFAULT

Hello,

after compiling my code with -fsanitize=address to debug a SEGFAULT, I found out that the code (which mainly consists of library calls to parlay and gbbs) tries to access a freed location. While investigating, it seems that the pointer is freed during the move assignment:

  symmetric_graph& operator=(symmetric_graph&& other) noexcept {
    n = other.n;
    m = other.m;
    v_data = other.v_data;
    e0 = other.e0;
    vertex_weights = other.vertex_weights;
    deletion_fn();  // <--
    deletion_fn = std::move(other.deletion_fn);
    other.v_data = nullptr;
    other.e0 = nullptr;
    other.vertex_weights = nullptr;
    other.deletion_fn = []() {};
    return *this;
  }

To my understanding, deletion_fn() (marked with <--) frees the pointer of other, and not this. At the end of the move, the original data of this is leaked + the current data of this is invalid, which probably leads to the SEGFAULT in my code. A possible fix would be to move the call to the beginning, i.e.:

  symmetric_graph& operator=(symmetric_graph&& other) noexcept {
    deletion_fn();  // <--
    n = other.n;
    m = other.m;
    v_data = other.v_data;
    e0 = other.e0;
    vertex_weights = other.vertex_weights;
    deletion_fn = std::move(other.deletion_fn);
    other.v_data = nullptr;
    other.e0 = nullptr;
    other.vertex_weights = nullptr;
    other.deletion_fn = []() {};
    return *this;
  }

With this, my code doesn't SEGFAULT.
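
The ordering proposed above (release what the target owns, take over the source's state, reset the source, return *this) can be sketched on a much smaller type. OwnedBuffer below is a hypothetical stand-in and not the symmetric_graph class from graph.h; it assumes that ownership is expressed only through the deletion_fn callback, as in the snippets above.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for a type that owns a buffer through a callback.
struct OwnedBuffer {
  int* data = nullptr;
  std::function<void()> deletion_fn = []() {};

  OwnedBuffer() = default;
  OwnedBuffer(int* d, std::function<void()> del)
      : data(d), deletion_fn(std::move(del)) {
    assert(deletion_fn);  // an empty std::function would throw when called
  }

  OwnedBuffer(const OwnedBuffer&) = delete;
  OwnedBuffer& operator=(const OwnedBuffer&) = delete;

  OwnedBuffer(OwnedBuffer&& other) noexcept
      : data(other.data), deletion_fn(std::move(other.deletion_fn)) {
    other.data = nullptr;
    other.deletion_fn = []() {};
  }

  OwnedBuffer& operator=(OwnedBuffer&& other) noexcept {
    if (this != &other) {
      deletion_fn();      // release what *this currently owns
      data = other.data;  // then take over other's state
      deletion_fn = std::move(other.deletion_fn);
      other.data = nullptr;  // leave other empty but safe to destroy
      other.deletion_fn = []() {};
    }
    return *this;  // a value-returning operator= must return
  }

  ~OwnedBuffer() { deletion_fn(); }
};
```

After x = std::move(y), the buffer that x held is released exactly once, x owns y's former buffer, and y's destructor runs only the no-op callback.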

In particular, the graph that is being moved in and out is generated by symmetric_graph<symmetric_vertex, Wgh> sym_graph_from_edges in graph.h:

static inline symmetric_graph<symmetric_vertex, Wgh> sym_graph_from_edges(
  ...
  return symmetric_graph<symmetric_vertex, Wgh>(v_data, n, m,
                                                [=]() {
                                                  gbbs::free_array(v_data, n);
                                                  gbbs::free_array(edges, m);
                                                },
                                                (edge_type*)edges);
}

Best,
Daniel

Puzzle: Confused about the kcore reordering used in CliqueCounting

In benchmarks/CliqueCounting/Clique.h, I tried to get the reordered graph after relabel_graph like this.
image

And I got the following log after running cmd:
bazel run benchmarks/CliqueCounting:Clique_main -- -rounds 1 -k 4 --directType KCORE --parallelType VERT -s /root/gbbs/inputs/com-lj.ungraph.adj

image

What makes me confused is that k_{max} = 360, but the max_degree in reordered graph is 9022. Also, in one paper, it says that the com-lj's max degree is 360 after KCORE pre-processing. (Actually, I haven't fully understood what KCORE does to reduce the max degree, so I am quite confused.)

Hope that you could help me clarify my confusion. Thanks!
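
One possible reading of these numbers, offered as an assumption and not as a confirmed description of Clique.h: k_{max} bounds the out-degree of each vertex once every edge is directed from the endpoint that comes earlier in the k-core (degeneracy) order to the one that comes later, whereas relabeling alone leaves the undirected maximum degree unchanged. The sketch below shows the difference on a small graph using a simple sequential peeling. The names degeneracy_order and max_out_degree are hypothetical, and this is not the parallel implementation in GBBS.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<std::size_t>>;  // undirected adjacency lists

// Repeatedly removes a minimum-degree vertex (quadratic time, illustration only).
// Returns {removal order, degeneracy}; the degeneracy is the largest remaining
// degree observed at removal time, which equals the maximum core number.
inline std::pair<std::vector<std::size_t>, std::size_t> degeneracy_order(
    const Graph& g) {
  const std::size_t n = g.size();
  std::vector<std::size_t> deg(n), order;
  std::vector<bool> removed(n, false);
  for (std::size_t v = 0; v < n; ++v) deg[v] = g[v].size();
  std::size_t degeneracy = 0;
  order.reserve(n);
  for (std::size_t step = 0; step < n; ++step) {
    std::size_t best = n;  // n means "no candidate yet"
    for (std::size_t v = 0; v < n; ++v) {
      if (!removed[v] && (best == n || deg[v] < deg[best])) best = v;
    }
    degeneracy = std::max(degeneracy, deg[best]);
    removed[best] = true;
    order.push_back(best);
    for (std::size_t u : g[best]) {
      if (!removed[u]) --deg[u];
    }
  }
  return {order, degeneracy};
}

// Directs each edge from the endpoint removed earlier to the one removed
// later and returns the largest resulting out-degree.
inline std::size_t max_out_degree(const Graph& g,
                                  const std::vector<std::size_t>& order) {
  assert(order.size() == g.size());
  std::vector<std::size_t> rank(g.size());
  for (std::size_t i = 0; i < order.size(); ++i) rank[order[i]] = i;
  std::size_t best = 0;
  for (std::size_t v = 0; v < g.size(); ++v) {
    std::size_t out = 0;
    for (std::size_t u : g[v]) {
      if (rank[v] < rank[u]) ++out;
    }
    best = std::max(best, out);
  }
  return best;
}
```

On a star with one center and five leaves, the undirected maximum degree is 5, while the degeneracy and the maximum out-degree after orientation are both 1. If the 9022 in the log is an undirected degree, it would be consistent with k_{max} = 360 under this reading.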
