graph-benchmarks's Introduction

Benchmark of popular graph / network packages

A comparison of 5 different packages:

For a more detailed description of the process and results, please refer to the following blog post.

Results

The benchmark was run on a Google Compute Engine n1-standard-16 instance (16 vCPUs, Haswell 2.3 GHz, 60 GB memory).

Each algorithm was run 100 times on the Amazon and Google datasets and 10 times on the Pokec dataset, with the exception of NetworkX.

The median run time is shown in the table below. Because profiling techniques and code implementations differ between packages, the results may differ. Please refer to the respective code bases for implementation details.
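As a rough illustration of how a median run time over repeated runs can be measured, here is a minimal Python sketch. It is not the repository's profiling code: `median_runtime` and `example_workload` are hypothetical names, and the workload is a stand-in for a real graph algorithm call.

```python
import statistics
import time


def median_runtime(func, repetitions):
    """Run func() `repetitions` times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def example_workload():
    # Hypothetical stand-in for a graph algorithm call (e.g. a shortest-path run).
    return sum(i * i for i in range(10_000))


median_seconds = median_runtime(example_workload, repetitions=100)
print(f"median: {median_seconds:.6f} s")
```

The median is less sensitive than the mean to occasional slow runs caused by other activity on the machine.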

Setup

Setup and installation instructions can be found in setup.md.

Data

Datasets are downloaded from https://snap.stanford.edu/data/ and are stored in the data folder. Amazon refers to amazon0302, google to web-Google and pokec to soc-Pokec. A download_data.sh script is provided in the data folder to automate the download and pre-processing of the SNAP datasets.
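For readers who want to inspect the data directly, the following Python sketch parses an edge list. It assumes the file holds one whitespace-separated "source target" pair per line, with optional comment lines starting with `#`. The exact format produced by download_data.sh is not described here, so treat that as an assumption; `read_edge_list` is a hypothetical helper, not part of the repository.

```python
def read_edge_list(lines):
    """Parse whitespace-separated 'source target' pairs, skipping '#' comments and blank lines."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        source, target = line.split()[:2]
        edges.append((int(source), int(target)))
    return edges


# Hypothetical sample in the assumed format, not real SNAP data.
sample = [
    "# Directed graph (hypothetical sample)",
    "# FromNodeId\tToNodeId",
    "0\t1",
    "0\t2",
    "1\t2",
]
edges = read_edge_list(sample)
print(edges)
```

Because the function accepts any iterable of lines, an open file object (for example one opened on data/amazon0302.txt) can be passed in place of the sample list.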

Code

Profiling code is located in the code folder. A particular benchmark can be run using the helper bash script run_profiler.sh [profiling code] [dataset path] [number of repetitions] [output path]. For example, to replicate the igraph benchmark on the amazon dataset with 100 repetitions, run run_profiler.sh code/igraph_profile.py data/amazon0302.txt 100 output/igraph_amazon.txt.

graph-benchmarks's People

Contributors

sbromberger, timlrx

graph-benchmarks's Issues

NetworKit shortest path benchmark

Hi, I've noticed that the NetworKit shortest-path benchmark is executed with

distance.BFS(g, node_index).run()

However, with this API NetworKit will also store all shortest paths from node_index to all the other nodes, which implies a significant memory and time overhead. This behavior is a bit counterintuitive and should be better documented: one does not expect BFS to store all shortest paths by default.
Since the other tools only compute the shortest distances, for a fair comparison you should run this:

distance.BFS(g, node_index, storePaths=False).run()
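To illustrate where the overhead described above can come from, here is a pure-Python sketch of a BFS that optionally records shortest-path predecessors. It is not NetworKit's implementation; the function `bfs`, its `store_paths` flag, and the adjacency-dict representation are assumptions made for illustration only.

```python
from collections import deque


def bfs(adjacency, source, store_paths=False):
    """Breadth-first search from `source`.

    Returns (distances, predecessors). `predecessors` is None unless
    store_paths is True, in which case it maps each reached node to the
    list of its predecessors on shortest paths from the source.
    """
    distances = {source: 0}
    predecessors = {source: []} if store_paths else None
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                # First visit: the hop count is final in an unweighted graph.
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
                if store_paths:
                    predecessors[neighbor] = [node]
            elif store_paths and distances[neighbor] == distances[node] + 1:
                # Another equally short route: extra bookkeeping per edge.
                predecessors[neighbor].append(node)
    return distances, predecessors


# Hypothetical diamond-shaped graph: two shortest paths from 0 to 3.
graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
dist_only, no_paths = bfs(graph, 0)
dist_full, preds = bfs(graph, 0, store_paths=True)
print(dist_only, preds)
```

Both calls produce the same distances, but the path-storing variant allocates a predecessor list per reached node and does extra work on already-visited neighbors, which is the kind of cost a distance-only comparison would want to exclude.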

Graph construction performance

Hi, I read your blog post about benchmarking graph network packages; nice work. Have you run any tests on the performance of building out a graph, node by node, or do you know of any? Thanks.

Any thoughts on memory consumption?

I know you said that memory consumption is out of scope of your study, but I am curious about your intuition on this. I am looking for the package that is the most memory efficient. My raw list of edges (in numpy) takes 16 GB, but creating a networkx instance from the edge list requires more than 64 GB of memory :(

Could you include Weighted Betweenness Centrality?

I am wondering if we can infer the differences in performance while calculating Weighted Betweenness Centrality from the Shortest Path results you show.

If one algorithm is faster on the shortest path, does this mean it is faster also on betweenness?
Does the shortest path algorithm consider arc weights?

It would be great if you could include betweenness (in the version that considers arc weights) in the next benchmarks!

Use pagerank_scipy instead of pagerank [networkx]?

In terms of raw performance, networkx.pagerank_scipy can be 4-5x faster than networkx.pagerank. For example, for the google.txt file on my local machine:

In [4]: %%timeit
   ...: nx.pagerank(G, alpha=0.85, tol=1e-3, max_iter=10000000)
   ...:
40.1 s ± 5.19 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [5]: %%timeit
   ...: nx.pagerank_scipy(G, alpha=0.85, tol=1e-3, max_iter=10000000)
   ...:
8.89 s ± 48.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
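For context on what the alpha, tol and max_iter arguments in the calls above control, here is a minimal pure-Python power-iteration sketch of PageRank. It is not NetworkX's code and makes no claim to match either function's performance; the function name, the dict-of-lists graph format, and the convergence test (L1 change below n * tol) are assumptions chosen for illustration.

```python
def pagerank(out_links, alpha=0.85, tol=1e-3, max_iter=100):
    """Power-iteration PageRank on a dict mapping each node to its list of successors.

    Every successor must also appear as a key. alpha is the damping factor,
    tol the per-node convergence tolerance, max_iter the iteration cap.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - alpha) / n + alpha * dangling / n for node in nodes}
        for node in nodes:
            targets = out_links[node]
            for target in targets:
                new_rank[target] += alpha * rank[node] / len(targets)
        error = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if error < n * tol:
            return rank
    raise RuntimeError("power iteration did not converge within max_iter")


# Hypothetical three-node graph.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
print(ranks)
```

A looser tol stops the iteration earlier, so timing comparisons between implementations are only meaningful when both use the same tolerance and damping factor, as the two calls above do.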

LightGraphs version in the benchmark

Hi @timlrx, thanks for the very interesting work. I am trying to run the benchmarks following your instructions but am having some trouble running lightgraphs.jl. I am using the LightGraphs master branch as suggested in the file (and also the master branch of graph-benchmarks), but its implementations of functions like ShortestPaths and Centrality seem to differ from the ones expected in lightgraphs.jl. For example, I found LightGraphs.ShortestPaths as LightGraphs.Experimental.ShortestPaths, but could not find ThreadedBFS and Centrality. Could you suggest which version I should use to run the lightgraphs.jl benchmark? Thanks!
