cea-hpc / harp

Small tool for profiling the performance of hardware-accelerated Rust code using OpenCL and CUDA

License: Apache License 2.0

Cuda 12.34% C 4.38% Python 6.18% Rust 77.10%
cuda gpgpu-computing hpc opencl rust

Contributors: dssgabriel

harp's Issues

Panic with `IllegalAddress` error in `scan` kernel

Description

When running the scan algorithm benchmark, the program panics with an `IllegalAddress` error while synchronizing the stream after the first launch of the `scan` kernel (depth 0, before any recursion).

Expected behavior

The `scan` kernel should complete without an `IllegalAddress` panic and compute the correct result.
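
To make "the correct result" checkable, a CPU reference can be compared against the device output. The sketch below assumes that `iscan` denotes an inclusive prefix sum over `i32` values (the element type is inferred from `size_of::<i32>()` in the launch code) and that overflow wraps; the function names are hypothetical and not part of HARP.

```rust
/// CPU reference for an inclusive prefix sum over `i32` values.
/// Wrapping addition is an assumption; the GPU kernel may treat overflow differently.
fn inclusive_scan_reference(input: &[i32]) -> Vec<i32> {
    let mut running = 0i32;
    input
        .iter()
        .map(|&x| {
            running = running.wrapping_add(x);
            running
        })
        .collect()
}

/// Returns the index of the first element where the device output differs
/// from the reference, or `None` if both agree everywhere.
/// A length mismatch is reported at the shorter length.
fn first_mismatch(input: &[i32], device_output: &[i32]) -> Option<usize> {
    let expected = inclusive_scan_reference(input);
    if expected.len() != device_output.len() {
        return Some(expected.len().min(device_output.len()));
    }
    expected
        .iter()
        .zip(device_output)
        .position(|(e, d)| e != d)
}
```

Calling `first_mismatch(&host_input, &host_copy_of_d_out)` after copying the device buffer back would then point at the first wrong element, which helps tell a crash apart from a wrong-result bug.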

Current behavior

The program panics with the following error:

thread 'main' panicked at 'failed to synchronize kernel `scan` at depth 0: IllegalAddress', src/kernels.rs:163:14

The relevant code snippet is the following:

HARP/src/kernels.rs

Lines 147 to 163 in 93c9643

// Launch first step of the kernel
unsafe {
    launch!(
        scan_kernel<<<grid_size, block_size, smem_size * size_of::<i32>() as u32, stream>>>(
            d_in.as_device_ptr(),
            d_in.len(),
            d_out.as_device_ptr(),
            block_sums.as_device_ptr(),
            max_elems_per_block,
            smem_size
        )
    )
    .expect("failed to launch kernel `scan`");
}
stream
    .synchronize()
    .expect(format!("failed to synchronize kernel `scan` at depth {depth}").as_str());
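
Since the kernel body is not shown here, one thing that can be ruled out on the host is whether the buffers passed to the launch are large enough. The following is only a sketch of such a precondition check, not a diagnosis: it assumes the usual layout where `block_sums` holds one slot per thread block and `d_out` is at least as long as `d_in`. All function names are hypothetical. Note that in `smem_size * size_of::<i32>() as u32` the `as` cast binds tighter than `*`, so the expression is the slot count times 4 bytes.

```rust
use std::mem::size_of;

/// Number of thread blocks needed to cover `len` elements (ceiling division).
fn blocks_needed(len: usize, max_elems_per_block: usize) -> usize {
    assert!(max_elems_per_block > 0, "max_elems_per_block must be non-zero");
    (len + max_elems_per_block - 1) / max_elems_per_block
}

/// Dynamic shared memory size in bytes for `smem_size` slots of `i32`.
/// Same expression as in the launch: `as` applies to `size_of` only.
fn smem_bytes(smem_size: u32) -> u32 {
    smem_size * size_of::<i32>() as u32
}

/// Checks the assumed buffer-size preconditions before a launch and
/// describes the first one that does not hold.
fn check_launch(
    in_len: usize,
    out_len: usize,
    block_sums_len: usize,
    max_elems_per_block: usize,
) -> Result<(), String> {
    if out_len < in_len {
        return Err(format!("output has {out_len} elements, input has {in_len}"));
    }
    let blocks = blocks_needed(in_len, max_elems_per_block);
    if block_sums_len < blocks {
        return Err(format!(
            "block_sums has {block_sums_len} slots, but {blocks} blocks are needed"
        ));
    }
    Ok(())
}
```

If these checks pass, the out-of-bounds access is more likely inside the kernel's own indexing (global or shared memory), which a tool such as NVIDIA's `compute-sanitizer` can help locate.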

Additional error information

When running with RUST_BACKTRACE=1, the call stack is the following:

thread 'main' panicked at 'failed to synchronize kernel `scan` at depth 0: IllegalAddress', src/kernels.rs:163:14
stack backtrace:
   0: rust_begin_unwind
             at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/core/src/panicking.rs:67:14
   2: core::result::unwrap_failed
             at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/core/src/result.rs:1651:5
   3: harp::kernels::device::scan
   4: harp::drivers::device::cuda_scan
   5: harp::drivers::scan
   6: harp::main
For the full backtrace:
thread 'main' panicked at 'failed to synchronize kernel `scan` at depth 0: IllegalAddress', src/kernels.rs:163:14
stack backtrace:
   0:     0x55f9f71d3891 - std::backtrace_rs::backtrace::libunwind::trace::h4e5cd7155e2ebaac
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x55f9f71d3891 - std::backtrace_rs::backtrace::trace_unsynchronized::hb4d504f8def07b70
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x55f9f71d3891 - std::sys_common::backtrace::_print_fmt::h270ee65403a6a640
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/sys_common/backtrace.rs:65:5
   3:     0x55f9f71d3891 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hfa127bbe4d370ae8
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x55f9f71f60cf - core::fmt::rt::Argument::fmt::h975c0825ea1bb836
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/core/src/fmt/rt.rs:138:9
   5:     0x55f9f71f60cf - core::fmt::write::hb200bbda235147d0
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/core/src/fmt/mod.rs:1094:21
   6:     0x55f9f71d1691 - std::io::Write::write_fmt::hf4eeaa80392fd692
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/io/mod.rs:1713:15
   7:     0x55f9f71d36a5 - std::sys_common::backtrace::_print::h2427e2e0721aca68
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/sys_common/backtrace.rs:47:5
   8:     0x55f9f71d36a5 - std::sys_common::backtrace::print::h8c074174f5a65b94
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/sys_common/backtrace.rs:34:9
   9:     0x55f9f71d4b67 - std::panicking::default_hook::{{closure}}::habdecd03f278805d
  10:     0x55f9f71d4954 - std::panicking::default_hook::hfeee4c9ec6e7984a
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/panicking.rs:288:9
  11:     0x55f9f71d501c - std::panicking::rust_panic_with_hook::h50748255142a0809
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/panicking.rs:705:13
  12:     0x55f9f71d4f17 - std::panicking::begin_panic_handler::{{closure}}::h1532befb1017034b
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/panicking.rs:597:13
  13:     0x55f9f71d3cc6 - std::sys_common::backtrace::__rust_end_short_backtrace::h36f919598d3260ac
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/sys_common/backtrace.rs:151:18
  14:     0x55f9f71d4c62 - rust_begin_unwind
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/panicking.rs:593:5
  15:     0x55f9f70f9703 - core::panicking::panic_fmt::h637089c9b9878b43
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/core/src/panicking.rs:67:14
  16:     0x55f9f70f9b43 - core::result::unwrap_failed::h18e2f5da912951f3
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/core/src/result.rs:1651:5
  17:     0x55f9f7131c3d - harp::kernels::device::scan::h847138cb3248648f
  18:     0x55f9f71028fe - harp::drivers::device::cuda_scan::h8108c82dbccdbc57
  19:     0x55f9f71270f4 - harp::drivers::scan::h89ef3c614e92ed26
  20:     0x55f9f713c4c8 - harp::main::h8849d6b7566eb6a4
  21:     0x55f9f7118d13 - std::sys_common::backtrace::__rust_begin_short_backtrace::h5177598e656e9a5e
  22:     0x55f9f7118d29 - std::rt::lang_start::{{closure}}::h6d198479f9d90738
  23:     0x55f9f71cc225 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::hea64a749880f8ff2
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/core/src/ops/function.rs:284:13
  24:     0x55f9f71cc225 - std::panicking::try::do_call::h767f64e3f6e064fb
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/panicking.rs:500:40
  25:     0x55f9f71cc225 - std::panicking::try::h90cf534a1e5ea4ae
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/panicking.rs:464:19
  26:     0x55f9f71cc225 - std::panic::catch_unwind::h9bac3c528abd1cb9
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/panic.rs:142:14
  27:     0x55f9f71cc225 - std::rt::lang_start_internal::{{closure}}::h8709a6d2fd226842
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/rt.rs:148:48
  28:     0x55f9f71cc225 - std::panicking::try::do_call::h1408f9ff8d60cf9d
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/panicking.rs:500:40
  29:     0x55f9f71cc225 - std::panicking::try::hc2659d179f01b076
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/panicking.rs:464:19
  30:     0x55f9f71cc225 - std::panic::catch_unwind::h8e83755629085503
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/panic.rs:142:14
  31:     0x55f9f71cc225 - std::rt::lang_start_internal::hd3b3887afec46100
                               at /rustc/371994e0d8380600ddda78ca1be937c7fb179b49/library/std/src/rt.rs:148:20
  32:     0x55f9f713c4f5 - main
  33:     0x7f9431229d90 - __libc_start_call_main
                               at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
  34:     0x7f9431229e40 - __libc_start_main_impl
                               at ./csu/../csu/libc-start.c:392:3
  35:     0x55f9f70f9e05 - _start
  36:                0x0 - <unknown>

Steps to reproduce

  1. git checkout scan-illegal-address
  2. cargo run --release -- iscan --lengths <VECTOR_LENGTH>
  3. The program panics with the error described above

Environment

OS: Ubuntu 22.04
Kernel: 5.19.0-43
Toolchains:

  • rustc 1.72.0-nightly (371994e0d 2023-06-13)
  • gcc 11.3.0
  • nvcc 12.1.105 (CUDA SDK 12.1)

Additional general information

The equivalent C++ code in the cpp_scan directory does not crash with this error. However, it does not produce the expected result either: it does not seem to add the computed partial sums to the subsequent thread blocks.

To run the code:

cd cpp_scan
make run
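
For reference, the step that the C++ version appears to skip (propagating partial sums to later thread blocks) can be modelled on the CPU. This is a minimal sketch of a blocked inclusive scan in three phases, assuming `i32` elements and wrapping addition; `blocked_inclusive_scan` and `block_len` are illustrative names, and this is not the code from HARP or cpp_scan. On the GPU, phase 2 is typically done by scanning the block sums with the same kernel, which is presumably what the recursion depth in the panic message refers to.

```rust
/// Blocked inclusive scan on the CPU, mirroring the phases of a multi-block GPU scan.
/// `block_len` stands in for the number of elements handled by one thread block.
fn blocked_inclusive_scan(input: &[i32], block_len: usize) -> Vec<i32> {
    assert!(block_len > 0, "block_len must be non-zero");
    let mut out = input.to_vec();

    // Phase 1: scan each block independently and record its total.
    let mut block_sums = Vec::with_capacity((input.len() + block_len - 1) / block_len);
    for block in out.chunks_mut(block_len) {
        let mut running = 0i32;
        for x in block.iter_mut() {
            running = running.wrapping_add(*x);
            *x = running;
        }
        block_sums.push(running);
    }

    // Phase 2: exclusive scan of the block totals yields each block's offset.
    let mut offset = 0i32;
    for s in block_sums.iter_mut() {
        let total = *s;
        *s = offset;
        offset = offset.wrapping_add(total);
    }

    // Phase 3: add each block's offset to all of its elements.
    for (block, &off) in out.chunks_mut(block_len).zip(&block_sums) {
        for x in block.iter_mut() {
            *x = x.wrapping_add(off);
        }
    }
    out
}
```

If phase 3 is left out, every block restarts from its own first element, so only the first block matches a sequential scan. Output of that shape would be consistent with the symptom described for the C++ version, though that is not confirmed here.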
