gunrock / mini

mini is mini

License: Apache License 2.0

Makefile 6.75% C++ 81.90% Cuda 10.99% Shell 0.36%
cuda gpu graph-primitives gunrock mini-gunrock traversal-operators workload-mapping-strategies

mini's Introduction

Gunrock: CUDA/C++ GPU Graph Analytics

Gunrock [1] is a CUDA library for graph processing designed specifically for the GPU. It uses a high-level, bulk-synchronous/asynchronous, data-centric abstraction focused on operations on vertex or edge frontiers. Gunrock balances performance and expressiveness by coupling high-performance GPU computing primitives and optimization strategies, particularly fine-grained load balancing, with a high-level programming model. That model lets programmers quickly develop new graph primitives that scale from one to many GPUs on a node, with small code size and minimal GPU programming knowledge.

| Branch  | Purpose | Version | Status |
|---------|---------|---------|--------|
| main    | Default branch, ported from gunrock/essentials, serves as the official release branch. | ≥ 2.x.x | Active |
| develop | Development feature branch, ported from gunrock/essentials. | ≥ 2.x.x | Active |
| master  | Previous release branch for gunrock/gunrock version 1.x.x interface, preserves all commit history. | ≤ 1.x.x | Deprecated |
| dev     | Previous development branch for gunrock/gunrock. All changes now merged in master. | ≤ 1.x.x | Deprecated |

Quick Start Guide

Before building Gunrock, make sure you have the CUDA Toolkit [2] installed on your system. Other external dependencies, such as NVIDIA/thrust and NVIDIA/cub, are fetched automatically by CMake.

git clone https://github.com/gunrock/gunrock.git
cd gunrock
mkdir build && cd build
cmake .. 
make sssp # or for all algorithms, use: make -j$(nproc)
bin/sssp ../datasets/chesapeake/chesapeake.mtx

Implementing Graph Algorithms

For a detailed explanation, please see the full documentation. The following example uses Gunrock's data-centric, bulk-synchronous programming model to implement Breadth-First Search on GPUs. It skips the setup phase of creating the problem_t and enactor_t structs and jumps straight into the algorithm itself.

We first prepare our frontier with the initial source vertex to begin push-based BFS traversal. A simple f->push_back(source) places the initial vertex we will use for our first iteration.

void prepare_frontier(frontier_t* f,
                      gcuda::multi_context_t& context) override {
  auto P = this->get_problem();
  f->push_back(P->param.single_source);
}

We then begin our iterative loop, which iterates until a convergence condition has been met. If no condition has been specified, the loop converges when the frontier is empty.

void loop(gcuda::multi_context_t& context) override {
  auto E = this->get_enactor();   // Pointer to enactor interface.
  auto P = this->get_problem();   // Pointer to problem (data) interface.
  auto G = P->get_graph();        // Graph that we are processing.

  auto single_source = P->param.single_source;  // Initial source node.
  auto distances = P->result.distances;         // Distances array for BFS.
  auto visited = P->visited.data().get();       // Visited map.
  auto iteration = this->iteration;             // Iteration we are on.

  // The following lambda expression is applied to every (source,
  // neighbor, edge, weight) tuple during the traversal.
  // It lowers the neighbor's distance when a smaller one is found, and
  // returns whether the neighbor goes into the output frontier.
  auto search = [=] __host__ __device__(
                      vertex_t const& source,    // ... source
                      vertex_t const& neighbor,  // neighbor
                      edge_t const& edge,        // edge
                      weight_t const& weight     // weight (tuple).
                      ) -> bool {
    auto old_distance =
      math::atomic::min(&distances[neighbor], iteration + 1);
    return (iteration + 1 < old_distance);
  };

  // Execute advance operator on the search lambda expression.
  // Uses load_balance_t::block_mapped algorithm (try others for perf. tuning.)
  operators::advance::execute<operators::load_balance_t::block_mapped>(
    G, E, search, context);
}

Source: include/gunrock/algorithms/bfs.hxx
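
To make the frontier logic easier to follow without a GPU, here is a minimal sequential sketch of the same idea in plain C++. It is not Gunrock's implementation: the csr_graph struct and the bfs_frontier function are hypothetical names introduced for illustration, the graph is assumed to be stored in CSR form, and an ordinary comparison stands in for the atomic minimum used in the GPU lambda above.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical minimal CSR graph (an assumption for this sketch, not a Gunrock type).
struct csr_graph {
  std::vector<int> row_offsets;     // size = number of vertices + 1
  std::vector<int> column_indices;  // neighbor list, grouped by source vertex
};

// Sequential frontier-based BFS. Returns the hop distance from `source`
// to every vertex; unreachable vertices keep the maximum int value.
std::vector<int> bfs_frontier(const csr_graph& g, int source) {
  const int n = static_cast<int>(g.row_offsets.size()) - 1;
  assert(source >= 0 && source < n);

  std::vector<int> distances(n, std::numeric_limits<int>::max());
  distances[source] = 0;

  // "prepare_frontier": the frontier starts with only the source vertex.
  std::vector<int> frontier{source};

  // "loop": iterate until the frontier is empty (the default convergence).
  for (int iteration = 0; !frontier.empty(); ++iteration) {
    std::vector<int> next;
    for (int src : frontier) {
      for (int e = g.row_offsets[src]; e < g.row_offsets[src + 1]; ++e) {
        const int neighbor = g.column_indices[e];
        // Same test as the search lambda: keep the neighbor only if this
        // iteration gives it a strictly smaller distance than before.
        const int old_distance = distances[neighbor];
        if (iteration + 1 < old_distance) {
          distances[neighbor] = iteration + 1;
          next.push_back(neighbor);
        }
      }
    }
    frontier.swap(next);  // The output frontier becomes the next input.
  }
  return distances;
}
```

On a graph with edges 0->1, 0->2, 1->3, 2->3 and an isolated vertex 4, calling bfs_frontier(g, 0) visits vertices 1 and 2 in the first iteration and vertex 3 in the second, and vertex 4 is never reached. On the GPU, many threads may test the same neighbor at once, which is why the original code uses an atomic minimum instead of a plain read and write.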

How to Cite Gunrock & Essentials

Thank you for citing our work.

@article{Wang:2017:GGG,
  author =	 {Yangzihao Wang and Yuechao Pan and Andrew Davidson
                  and Yuduo Wu and Carl Yang and Leyuan Wang and
                  Muhammad Osama and Chenshan Yuan and Weitang Liu and
                  Andy T. Riffel and John D. Owens},
  title =	 {{G}unrock: {GPU} Graph Analytics},
  journal =	 {ACM Transactions on Parallel Computing},
  year =	 2017,
  volume =	 4,
  number =	 1,
  month =	 aug,
  pages =	 {3:1--3:49},
  doi =		 {10.1145/3108140},
  ee =		 {http://arxiv.org/abs/1701.01170},
  acmauthorize = {https://dl.acm.org/doi/10.1145/3108140?cid=81100458295},
  url =		 {http://escholarship.org/uc/item/9gj6r1dj},
  code =	 {https://github.com/gunrock/gunrock},
  ucdcite =	 {a115},
}
@InProceedings{Osama:2022:EOP,
  author =	 {Muhammad Osama and Serban D. Porumbescu and John D. Owens},
  title =	 {Essentials of Parallel Graph Analytics},
  booktitle =	 {Proceedings of the Workshop on Graphs,
                  Architectures, Programming, and Learning},
  year =	 2022,
  series =	 {GrAPL 2022},
  month =	 may,
  pages =	 {314--317},
  doi =		 {10.1109/IPDPSW55747.2022.00061},
  url =          {https://escholarship.org/uc/item/2p19z28q},
}

Copyright & License

Gunrock is copyright The Regents of the University of California. The library, examples, and all source code are released under Apache 2.0.

Footnotes

  1. This repository has been moved from https://github.com/gunrock/essentials; the previous history is preserved with tags and under the master branch. Read more about Gunrock and Essentials in our vision paper: Essentials of Parallel Graph Analytics.

  2. CUDA v11.5.1 or higher is recommended because it supports stream-ordered memory allocators.

mini's People

Contributors: slashspirit, yzhwang

mini's Issues

invalid __shared__ read in moderngpu/src/moderngpu/cta_load_balance.hxx:171

I compiled pr and ran it with the soc-LiveJournal1.mtx dataset, and got this error:

cuda-memcheck ./bin/test_pr__x86_64 --file=../../../../gunrock/dataset/large/soc-LiveJournal1/soc-LiveJournal1.mtx

========= Invalid __shared__ read of size 4
=========     at 0x00002010 in /mnt/daisy-mount/mini/gunrock/tests/pr/../../../external/moderngpu/src/moderngpu/cta_load_balance.hxx:171:_ZN4mgpu16launch_box_cta_kINS_12launch_box_tIJNS_7arch_20INS_12launch_cta_tILi128ELi11ELi8ELi0EEENS_7empty_tEEENS_7arch_35INS3_ILi128ELi7ELi5ELi0EEES5_EENS_7arch_52IS4_S5_EEEEEZNS_13lbs_segreduceIS5_ZNS_13lbs_segreduceIS5_ZN7gunrock5oprtr12neighborhood19neighborhood_kernelINSF_2pr12pr_problem_tENSJ_12pr_functor_tEfNS_6plus_tIfEELb0ELb0EEEiSt10shared_ptrIT_ERSO_INSF_10frontier_tIiEEESU_PT1_SV_iRNS_18standard_context_tEEUliiiE_PiPfSN_fJEEEvT0_iSV_iT2_T3_T4_RNS_9context_tEDpT5_EUliiiNS_5tupleIJEEEE_S10_S1B_S11_SN_fJEEEvS12_iSV_iS13_S14_S15_S18_S17_DpT6_EUliiE_JEEEvS12_iDpSV_
=========     by thread (126,0,0) in block (29205,0,0)
=========     Address 0x09015178 is out of bounds

This seems to be a bug in moderngpu.

Compiling Mini Gunrock fails with a PTX assembly error

I tried to compile Mini Gunrock bfs and got the following error:

ptxas /tmp/tmpxft_00009acd_00000000-5_test_bfs.ptx, line 8902; error   : Instruction 'shfl' without '.sync' is not supported on .target sm_70 and higher from PTX ISA version 6.4
ptxas /tmp/tmpxft_00009acd_00000000-5_test_bfs.ptx, line 8906; error   : Instruction 'shfl' without '.sync' is not supported on .target sm_70 and higher from PTX ISA version 6.4
ptxas fatal   : Ptx assembly aborted due to errors
Makefile:18: recipe for target 'bin/test_bfs__x86_64' failed
make: *** [bin/test_bfs__x86_64] Error 255

I compiled with CUDA 10.1 on daisy (V100):

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Apr_24_19:10:27_PDT_2019
Cuda compilation tools, release 10.1, V10.1.168

I suspect this is a moderngpu problem.

Thanks!
Yuxin

Deadlock problem with the ballot function

The ballot function has a problem similar to the one in issue #3:

https://github.com/yzhwang/moderngpu/blob/9a6c3167fc12ed8b459b7f4376dd89069cad3eb1/src/moderngpu/cta_segscan.hxx#L39

    if(tid < num_warps) {
      int cta_bits = ballot(0 != storage.delta[tid]);
      int warp_segment = 31 - clz(cta_mask & cta_bits);
      int start = (-1 != warp_segment) ?
        (31 - clz(storage.delta[warp_segment]) + 32 * warp_segment) : 0;
      storage.delta[num_warps + tid] = start;
    }

It should be changed to:

    if(tid < num_warps) {
      unsigned mask = __activemask();
      int cta_bits = ballot(0 != storage.delta[tid], mask);
      int warp_segment = 31 - clz(cta_mask & cta_bits);
      int start = (-1 != warp_segment) ?
        (31 - clz(storage.delta[warp_segment]) + 32 * warp_segment) : 0;
      storage.delta[num_warps + tid] = start;
    }
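
To illustrate what this snippet computes and why the set of participating lanes matters, here is a host-side C++ emulation. It is only a sketch under stated assumptions, not moderngpu's code: clz32, emulate_ballot, and segment_start are hypothetical names, and cta_mask is assumed to select the bits strictly below the calling thread's index, which the issue does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count leading zeros of a 32-bit value. Returns 32 for zero, which is what
// makes "31 - clz(...)" evaluate to -1 when no bit is set.
int clz32(std::uint32_t x) {
  int n = 0;
  for (std::uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1) {
    ++n;
  }
  return n;
}

// Emulated warp vote: bit i of the result is set only if lane i is part of
// active_mask AND its predicate is true. Lanes outside the mask contribute 0.
std::uint32_t emulate_ballot(const std::vector<bool>& pred,
                             std::uint32_t active_mask) {
  std::uint32_t bits = 0;
  for (std::size_t lane = 0; lane < pred.size() && lane < 32; ++lane) {
    if (((active_mask >> lane) & 1u) && pred[lane]) {
      bits |= (1u << lane);
    }
  }
  return bits;
}

// Emulates the body of "if (tid < num_warps)" for one thread. delta[w] is a
// bitfield of flags for warp w; the result is the position of the last flag
// in any warp before tid, or 0 if there is none.
int segment_start(const std::vector<std::uint32_t>& delta, int tid) {
  const int num_warps = static_cast<int>(delta.size());
  assert(num_warps >= 1 && num_warps <= 32);
  assert(tid >= 0 && tid < num_warps);

  // Only lanes 0..num_warps-1 enter the branch, so only they take part.
  const std::uint32_t active =
      (num_warps >= 32) ? 0xffffffffu : ((1u << num_warps) - 1u);

  std::vector<bool> pred(delta.size());
  for (std::size_t i = 0; i < delta.size(); ++i) pred[i] = (delta[i] != 0);
  const std::uint32_t cta_bits = emulate_ballot(pred, active);

  // Assumption: cta_mask keeps the bits strictly below the calling thread.
  const std::uint32_t cta_mask = 0x7fffffffu >> (31 - tid);

  const int warp_segment = 31 - clz32(cta_mask & cta_bits);
  return (-1 != warp_segment)
             ? (31 - clz32(delta[warp_segment]) + 32 * warp_segment)
             : 0;
}
```

With four warps where only warp 1 has a flag at lane 3 and warp 3 has a flag at lane 31, threads 0 and 1 find no earlier flagged warp and get 0, while threads 2 and 3 both resolve to warp 1, giving 3 + 32 * 1 = 35. The emulation also shows that a vote restricted to a partial mask ignores the predicates of non-participating lanes, which is the property the proposed mask argument relies on.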

moderngpu stuck in a deadlock on Volta GPU

I am trying to run mini Gunrock BFS on daisy, which has a Volta GPU.

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Apr_24_19:10:27_PDT_2019
Cuda compilation tools, release 10.1, V10.1.168

gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

However, it gets stuck in a deadlock. We found that the code hangs in moderngpu's transform_scan.

If I run the same code on Luigi, which has a Tesla K40:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Mini Gunrock BFS runs correctly and successfully.

Segmentation Fault 139 (core dumped) occurs when building the code

When building the code I receive this error:

$ make
mkdir -p bin
"/usr/local/cuda-7.5/bin/nvcc" -gencode=arch=compute_30,code="sm_30,compute_30" -std=c++11 -ccbin=/usr/bin/g++-4.8 -Xcompiler="-Wundef" -O2 -g -Xcompiler="-Werror" -lineinfo --expt-extended-lambda -use_fast_math -Xptxas="-v" -o bin/test_bfs__x86_64 test_bfs.cu -m64 -I.. -I../../src -I"../../../external/moderngpu/src"
Segmentation fault (core dumped)
make: *** [bin/test_bfs__x86_64] Error 139

My system:
Ubuntu 14.04 64-bit,
GeForce GT 740 : 1058.500 Mhz (Ordinal 0) Compute Capability sm_30 Mem Clock: 2500.000 Mhz x 128 bits ( 80.0 GB/s)

I can build the main gunrock project successfully, so I cannot figure out what the problem is.
