
Comments (2)

bmyerz commented on July 17, 2024

I think more information is needed, starting with where the signal is raised.
Try building in Debug mode and running with freeze-on-error enabled (see https://github.com/uwsampa/grappa/blob/master/doc/debugging.md#debugging).

If the process freezes on the signal, ssh into the node that received the signal and attach with gdb -p <pid> (or run attach <pid> from inside gdb). You can find the pid of the running grappa process with something like ps aux | grep grappa. Once attached, you can get a backtrace.
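
The attach step could be scripted roughly as follows. This is only a sketch: the helper names find_pids and backtrace_pid are made up for illustration, and the pattern "grappa" is an assumption that should be replaced with the name of your binary.

```shell
# Sketch: dump a backtrace from a frozen process on the current node.
# find_pids and backtrace_pid are hypothetical helper names.

find_pids() {
    # -f matches against the full command line, similar to `ps aux | grep`.
    pgrep -f "$1"
}

backtrace_pid() {
    # -p attaches to a running process by pid.
    # -batch makes gdb exit after running the -ex commands.
    gdb -p "$1" -batch -ex "thread apply all bt"
}
```

For example, for pid in $(find_pids grappa); do backtrace_pid "$pid"; done prints the stacks of every matching process on that node. Attaching may require the same user as the target process, or relaxed ptrace restrictions on some systems.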

If the process doesn't freeze on the signal, you can have mpirun launch the processes through gdb (see the second answer to question 6 at https://www.open-mpi.org/faq/?category=debugging).

jeffhammond commented on July 17, 2024

Here is a stack trace:

[jrhammon@esgmonster prk-repo]$ mpirun -n 1 gdb GRAPPA/Transpose/transpose 10 3600 32
Excess command line arguments ignored. (3600 ...)
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-90.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/jrhammon/Work/INTEL/PCL/ESG/PRK/github-official/GRAPPA/Transpose/transpose...done.
Attaching to program: /home/jrhammon/Work/INTEL/PCL/ESG/PRK/github-official/GRAPPA/Transpose/transpose, process 10
ptrace: Operation not permitted.
/home/jrhammon/Work/INTEL/PCL/ESG/PRK/github-official/10: No such file or directory.
(gdb) run 10 1000 32
Starting program: /home/jrhammon/Work/INTEL/PCL/ESG/PRK/github-official/GRAPPA/Transpose/transpose 10 1000 32
[Thread debugging using libthread_db enabled]
warning: File "/opt/gcc/5.3.0/lib64/libstdc++.so.6.0.21-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "/usr/share/gdb/auto-load:/usr/lib/debug:/usr/bin/mono-gdb.py".
To enable execution of this file add
    add-auto-load-safe-path /opt/gcc/5.3.0/lib64/libstdc++.so.6.0.21-gdb.py
line to your configuration file "/home/jrhammon/.gdbinit".
To completely disable this security protection add
    set auto-load safe-path /
line to your configuration file "/home/jrhammon/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
    info "(gdb)Auto-loading safe path"
I0704 15:54:00.408108 110487 Allocator.hpp:185] Allocator is responsible for addresses from 0 to 0x1f6787000
I0704 15:54:00.408323 110487 GlobalMemory.cpp:67] Initialized GlobalMemory with 8430055424 bytes of shared heap.
I0704 15:54:00.412102 110487 Grappa.cpp:647] 
-------------------------
Shared memory breakdown:
  node total:                   62.8088 GB
  locale shared heap total:     31.4044 GB
  locale shared heap per core:  31.4044 GB
  communicator per core:        0.125 GB
  tasks per core:               0.0156631 GB
  global heap per core:         7.8511 GB
  aggregator per core:          0.00247955 GB
  shared_pool current per core: 4.76837e-07 GB
  shared_pool max per core:     7.8511 GB
  free per locale:              23.4102 GB
  free per core:                23.4102 GB
-------------------------
Parallel Research Kernels version 2.16
Grappa matrix transpose: B = A^T
Parallel Research Kernels version 2.16
Grappa matrix transpose: B = A^T
Number of cores         = 1
Matrix order            = 1000
Number of iterations    = 10
Tile size               = 32
Solution validates
Rate (MB/s): 6500.35 Avg time (s): 0.00246141
Summed errors: 0

Program received signal SIGSEGV, Segmentation fault.
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::c_str() const () at /tmp/gcc-5.3.0/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.h:1889
    in /tmp/gcc-5.3.0/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.h

The code GDB is trying to point to is:

      // String operations:
      /**
       *  @brief  Return const pointer to null-terminated contents.
       *
       *  This is a handle to internal data.  Do not modify or dire things may
       *  happen.
      */
      const _CharT*
      c_str() const _GLIBCXX_NOEXCEPT
      { return _M_data(); }
