MGG-Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms.

Artifact for OSDI'23 paper

Yuke Wang, et al. Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms. OSDI'23.

[Paper] [Bibtex] DOI

1. Setup (skip to Section-2 if evaluating on the provided GCP)

1.1. Clone this project from GitHub.

git clone --recursive git@github.com:YukeWang96/MGG-OSDI23-AE.git

1.2. Download libraries and datasets.

  • Download libraries (cudnn-v8.2, nvshmem_src_2.0.3-0, openmpi-4.1.1).
wget https://storage.googleapis.com/mgg_data/local.tar.gz
tar -zxvf local.tar.gz && rm local.tar.gz
tar -zxvf local/nvshmem_src_2.0.3-0/build_cu112.tar.gz
  • Download datasets.
wget https://storage.googleapis.com/mgg_data/dataset.tar.gz && tar -zxvf dataset.tar.gz && rm dataset.tar.gz
  • Set up the DGL baseline.
cd dgl_pydirect_internal
wget https://storage.googleapis.com/mgg_data/graphdata.tar.gz && tar -zxvf graphdata.tar.gz && rm graphdata.tar.gz
cd ..
  • Set up the ROC baseline.
cd roc-new
git submodule update --init --recursive
wget https://storage.googleapis.com/mgg_data/data.tar.gz && tar -zxvf data.tar.gz && rm -rf data.tar.gz

or

gsutil cp -r gs://mgg_data/roc-new/ .
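
The download steps in Section 1.2 can also be scripted. The following is a minimal Python sketch, assuming the archives stay under the https://storage.googleapis.com/mgg_data prefix used by the wget commands above. The names ARCHIVES, archive_url, and fetch_and_extract are hypothetical helpers for illustration and are not part of the MGG repository.

```python
import tarfile
import urllib.request
from pathlib import Path

# Bucket prefix taken from the wget commands in Section 1.2.
BASE_URL = "https://storage.googleapis.com/mgg_data"

# Archive name -> directory to extract into, relative to the repository root.
# These pairs mirror the "cd <dir>" + wget sequences shown above.
ARCHIVES = {
    "local.tar.gz": ".",
    "dataset.tar.gz": ".",
    "graphdata.tar.gz": "dgl_pydirect_internal",
    "data.tar.gz": "roc-new",
}


def archive_url(name: str) -> str:
    """Return the download URL for one archive (hypothetical helper)."""
    return f"{BASE_URL}/{name}"


def fetch_and_extract(name: str, dest: str, root: Path = Path(".")) -> None:
    """Download one archive, unpack it into root/dest, then delete the archive."""
    target_dir = root / dest
    target_dir.mkdir(parents=True, exist_ok=True)
    archive_path = target_dir / name
    urllib.request.urlretrieve(archive_url(name), archive_path)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_dir)
    archive_path.unlink()
```

Calling fetch_and_extract(name, dest) for each entry in ARCHIVES reproduces the wget, tar, and rm sequence. The sketch does not cover the nested build_cu112.tar.gz extraction or the git submodule update, which still need to be run as shown above.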

1.3. Launch Docker for MGG.

cd docker 
./launch.sh

1.4. Compile implementation.

mkdir build && cd build && cmake .. && cd ..
./build.sh

2. Run initial test experiment.

  • Run the study experiments in Section-3.4 and Section-3.5 below.

3. Reproduce the major results from paper.

3.1 Compare with UVM on 4xA100 and 8xA100 (Fig.8a and Fig.8b).

./0_run_MGG_UVM_4GPU_GCN.sh
./0_run_MGG_UVM_4GPU_GIN.sh
./0_run_MGG_UVM_8GPU_GCN.sh
./0_run_MGG_UVM_8GPU_GIN.sh

Note that the results can be found at Fig_8_UVM_MGG_4GPU_GCN.csv, Fig_8_UVM_MGG_4GPU_GIN.csv, Fig_8_UVM_MGG_8GPU_GCN.csv, and Fig_8_UVM_MGG_8GPU_GIN.csv.
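
After the four scripts finish, it can be useful to confirm that every expected CSV is present. The sketch below assumes the script-to-CSV naming pattern listed in this section (0_run_MGG_UVM_<config>.sh produces Fig_8_UVM_MGG_<config>.csv) and that the CSVs land in the directory you pass in. The names expected_csv and missing_outputs are hypothetical helpers, not part of the artifact.

```python
from pathlib import Path

# Run scripts listed in Section 3.1, in the order they appear.
RUN_SCRIPTS = [
    "0_run_MGG_UVM_4GPU_GCN.sh",
    "0_run_MGG_UVM_4GPU_GIN.sh",
    "0_run_MGG_UVM_8GPU_GCN.sh",
    "0_run_MGG_UVM_8GPU_GIN.sh",
]

SCRIPT_PREFIX = "0_run_MGG_UVM_"
SCRIPT_SUFFIX = ".sh"


def expected_csv(script: str) -> str:
    """Map a run script name to the CSV name listed in this section."""
    if not (script.startswith(SCRIPT_PREFIX) and script.endswith(SCRIPT_SUFFIX)):
        raise ValueError(f"unexpected script name: {script}")
    config = script[len(SCRIPT_PREFIX):-len(SCRIPT_SUFFIX)]  # e.g. "4GPU_GCN"
    return f"Fig_8_UVM_MGG_{config}.csv"


def missing_outputs(directory: Path) -> list[str]:
    """Return the expected CSV names that do not exist in the directory."""
    names = [expected_csv(script) for script in RUN_SCRIPTS]
    return [name for name in names if not (directory / name).is_file()]
```

An empty list from missing_outputs(Path(".")) means all four Fig. 8 result files were found in the current directory.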

3.2 Compare with DGL on 8xA100 for GCN and GIN (Fig.7a and Fig.7b).

./launch_docker.sh
cd gcn/
./0_run_gcn.sh
cd ../gin/
./0_run_gin.sh

Note that the DGL results can be found at 1_dgl_gin.csv and 1_dgl_gcn.csv. Our MGG reference results are in MGG_GCN_8GPU.csv and MGG_8GPU_GIN.csv.

3.3 Compare with ROC on 8xA100 (Fig.9).

cd roc-new/docker
./launch.sh
./run_all.sh

Note that the results can be found at Fig_9_ROC_MGG_8GPU_GCN.csv and Fig_9_ROC_MGG_8GPU_GIN.csv.

The ROC results should be similar to the following.

Dataset         Time (ms)
reddit          425.67
enwiki-2013     619.33
it-2004         5160.18
paper100M       8179.35
ogbn-products   529.74
ogbn-proteins   423.82
com-orkut       571.62

3.4 Compare NP with w/o NP (Fig.10a).

python 2_MGG_NP.py

Note that the results can be found at MGG_NP_study.csv. They should be similar to the following table.

Dataset        MGG_WO_NP   MGG_W_NP   Speedup (x)
Reddit         76.797      16.716     4.594
enwiki-2013    290.169     88.249     3.288
ogbn-product   86.362      26.008     3.321
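
The speedup column is the ratio of the time without NP to the time with NP. The sketch below recomputes it from the sample rows above. It assumes the CSV is comma-separated and uses the column names shown in the table, which may differ from the file that 2_MGG_NP.py actually writes. The speedups function is a hypothetical helper.

```python
import csv
import io

# Sample rows copied from the table above. The comma-separated layout and
# header names are assumptions about MGG_NP_study.csv.
SAMPLE = """Dataset,MGG_WO_NP,MGG_W_NP
Reddit,76.797,16.716
enwiki-2013,290.169,88.249
ogbn-product,86.362,26.008
"""


def speedups(csv_text, baseline_col="MGG_WO_NP", optimized_col="MGG_W_NP"):
    """Return {dataset: baseline / optimized}, rounded to three decimals."""
    reader = csv.DictReader(io.StringIO(csv_text))
    result = {}
    for row in reader:
        ratio = float(row[baseline_col]) / float(row[optimized_col])
        result[row["Dataset"]] = round(ratio, 3)
    return result
```

For the sample rows, speedups(SAMPLE) reproduces the Speedup (x) column of the table. To check a real run, pass the contents of MGG_NP_study.csv instead of SAMPLE and adjust the column names if the file uses different headers.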

3.5 Compare WL with w/o WL (Fig.10b).

python 3_MGG_WL.py

Note that the results can be found at MGG_WL_study.csv. They should be similar to the following table.

Dataset        MGG_WO_NP   MGG_W_NP   Speedup (x)
Reddit         75.035      18.92      3.966
enwiki-2013    292.022     104.878    2.784
ogbn-product   86.632      29.941     2.893

3.6 Compare API (Fig.10c).

python 4_MGG_API.py

Note that the results can be found at MGG_API_study.csv. They should be similar to the following table.

Norm.Time w.r.t. Thread   MGG_Thread   MGG_Warp   MGG_Block
Reddit                    1.0          0.299      0.295
enwiki-2013               1.0          0.267      0.263
ogbn-product              1.0          0.310      0.317
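
Because the times in this table are normalized to the thread-level variant, the implied speedup over MGG_Thread is the reciprocal of each entry. The sketch below works on the sample values above; implied_speedup and fastest_variant are hypothetical helpers for illustration.

```python
# Normalized times copied from the table above (MGG_Thread is the 1.0 baseline).
NORMALIZED = {
    "Reddit": {"MGG_Thread": 1.0, "MGG_Warp": 0.299, "MGG_Block": 0.295},
    "enwiki-2013": {"MGG_Thread": 1.0, "MGG_Warp": 0.267, "MGG_Block": 0.263},
    "ogbn-product": {"MGG_Thread": 1.0, "MGG_Warp": 0.310, "MGG_Block": 0.317},
}


def implied_speedup(normalized_time: float) -> float:
    """Speedup over the baseline implied by a normalized time."""
    return 1.0 / normalized_time


def fastest_variant(row: dict) -> str:
    """Name of the variant with the smallest normalized time in one row."""
    return min(row, key=row.get)
```

On these sample values, the warp-level and block-level variants are within a few percent of each other on every dataset, and both are roughly 3x faster than the thread-level variant.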

3.7 Design Space Search (Fig.11a)

python 5_MGG_DSE_4GPU.py

Note that the results can be found at Reddit_4xA100_dist_ps.csv and Reddit_4xA100_dist_wpb.csv. They should be similar to the following tables.

  • Reddit_4xA100_dist_ps.csv
dist\ps   1        2        4        8        16       32
1         17.866   17.459   16.821   16.244   16.711   17.125
2         17.247   16.722   16.437   16.682   17.053   17.808
4         16.826   16.41    16.583   17.217   17.627   18.298
8         16.271   16.725   17.193   17.655   18.426   18.99
16        16.593   17.214   17.617   18.266   19.009   19.909
  • Reddit_4xA100_dist_wpb.csv
dist\wpb   1        2        4        8        16
1          34.773   23.164   16.576   15.235   16.519
2          34.599   23.557   17.254   15.981   19.56
4          34.835   23.616   17.674   17.034   22.084
8          34.729   23.817   18.302   18.708   25.656
16         34.803   24.161   18.879   23.44    32.978

python 5_MGG_DSE_8GPU.py

Note that the results can be found at Reddit_8xA100_dist_ps.csv and Reddit_8xA100_dist_wpb.csv.
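
A design space search ends by picking the best cell in each grid. The sketch below does this for the 4xA100 sample grids in this section, assuming each cell is a runtime where lower is better (the source does not state the unit). The best_config function is a hypothetical helper, and the grids are hard-coded rather than read from the CSV files.

```python
# Grids copied from the 4xA100 tables in Section 3.7.
# Keys are dist values; each list follows the column order of the table.
PS_VALUES = [1, 2, 4, 8, 16, 32]
DIST_PS = {
    1: [17.866, 17.459, 16.821, 16.244, 16.711, 17.125],
    2: [17.247, 16.722, 16.437, 16.682, 17.053, 17.808],
    4: [16.826, 16.41, 16.583, 17.217, 17.627, 18.298],
    8: [16.271, 16.725, 17.193, 17.655, 18.426, 18.99],
    16: [16.593, 17.214, 17.617, 18.266, 19.009, 19.909],
}

WPB_VALUES = [1, 2, 4, 8, 16]
DIST_WPB = {
    1: [34.773, 23.164, 16.576, 15.235, 16.519],
    2: [34.599, 23.557, 17.254, 15.981, 19.56],
    4: [34.835, 23.616, 17.674, 17.034, 22.084],
    8: [34.729, 23.817, 18.302, 18.708, 25.656],
    16: [34.803, 24.161, 18.879, 23.44, 32.978],
}


def best_config(grid, column_values):
    """Return (dist, column_value, runtime) for the smallest runtime in the grid."""
    best = None
    for dist, row in grid.items():
        for column_value, runtime in zip(column_values, row):
            if best is None or runtime < best[2]:
                best = (dist, column_value, runtime)
    return best
```

For the sample grids, best_config(DIST_PS, PS_VALUES) selects dist=1 with ps=8, and best_config(DIST_WPB, WPB_VALUES) selects dist=1 with wpb=8. Results from your own run may land on a different cell.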

4. Use MGG as a Tool or Library for your project.

Building a new design on top of MGG with NVSHMEM takes only a few steps:

4.1 Build the C++ design based on our existing examples

  • Create a new .cu file under src/. An example is shown below.

https://github.com/YukeWang96/MGG-OSDI23-AE/blob/0024bdd68d9684b0434547d69462b01e225fe420/src/mgg_np_div_kernel.cu#L78-L87

4.2 Build the CUDA kernel design based on our existing examples.

  • Add a kernel design in include/neighbor_utils.cuh. An example is shown below.

https://github.com/YukeWang96/MGG-OSDI23-AE/blob/0024bdd68d9684b0434547d69462b01e225fe420/include/neighbor_utils.cuh#L770-L785

https://github.com/YukeWang96/MGG-OSDI23-AE/blob/0024bdd68d9684b0434547d69462b01e225fe420/include/neighbor_utils.cuh#L1281-L1389

https://github.com/YukeWang96/MGG-OSDI23-AE/blob/0024bdd68d9684b0434547d69462b01e225fe420/include/neighbor_utils.cuh#L277-L292

4.3 Register the new design to CMake.

  • Add a compilation entry in CMakeLists.txt.
  • Add a command make filename.cu in 0_mgg_build.sh.
  • An example is shown below. Make sure the filename matches the .cu file you created in Section-4.1.

https://github.com/YukeWang96/MGG-OSDI23-AE/blob/0024bdd68d9684b0434547d69462b01e225fe420/CMakeLists.txt#L60-L64

https://github.com/YukeWang96/MGG-OSDI23-AE/blob/0024bdd68d9684b0434547d69462b01e225fe420/CMakeLists.txt#L218-L249

4.4 Launch the MGG docker and recompile.

  • The compiled executable will be located under build/.
cd docker 
./launch.sh
cd build && cmake ..
cd .. && ./0_mgg_build.sh

4.5 Run the compiled executable.

https://github.com/YukeWang96/MGG-OSDI23-AE/blob/0024bdd68d9684b0434547d69462b01e225fe420/bench_MGG.py#L5-L51
