llnl / caliper

Caliper is an instrumentation and performance profiling library

Home Page: http://software.llnl.gov/Caliper/

License: BSD 3-Clause "New" or "Revised" License

CMake 3.40% C++ 70.38% C 12.42% Python 8.71% Fortran 2.28% CWeb 1.68% Makefile 0.02% TeX 0.51% Shell 0.40% Forth 0.07% XSLT 0.13%
caliper annotation-apis instrumentation performance-analysis performance-monitoring trace hpc performance radiuss cpp

caliper's Introduction

Caliper: A Performance Analysis Toolbox in a Library

Caliper is a performance instrumentation and profiling library for HPC (high-performance computing) programs. It provides source-code annotation APIs for marking regions of interest in C, C++, and Fortran code, as well as a set of built-in performance measurement recipes for a wide range of performance engineering use cases, such as lightweight always-on profiling, event tracing, or performance monitoring. Alternatively, users can create custom measurement configurations for specialized use cases.

Caliper can either generate simple human-readable reports or machine-readable JSON or .cali files for automated data processing with user-provided scripts or analysis frameworks like Hatchet and Thicket. It can also generate detailed event traces for timeline visualizations with Perfetto and the Google Chrome trace viewer.

Features include:

  • Low-overhead source-code annotation API
  • Configuration API to control performance measurements from within an application
  • Recording program metadata for analyzing collections of runs
  • Flexible key:value data model to capture application-specific features for performance analysis
  • Fully threadsafe implementation, support for parallel programming models like MPI
  • Event-based as well as sample-based performance measurements
  • Trace and profile recording
  • Connection to third-party tools, e.g. NVIDIA's Nsight tools, AMD ROCProf, or Intel(R) VTune(tm)
  • Measurement and profiling functionality such as timers, PAPI hardware counters, and Linux perf_events
  • Memory annotations to associate performance measurements with memory regions

Documentation

Extensive documentation is available here: https://software.llnl.gov/Caliper/

Usage examples of the C++, C, and Fortran annotation and ConfigManager APIs are provided in the examples directory.

See the "Getting started" section below for a brief tutorial.

Building and installing

You can install Caliper with the spack package manager:

$ spack install caliper

To build Caliper manually, you need CMake 3.12+ and a C++11-compatible compiler. Clone Caliper from GitHub and proceed as follows:

$ git clone https://github.com/LLNL/Caliper.git
$ cd Caliper
$ mkdir build && cd build
$ cmake -DCMAKE_INSTALL_PREFIX=<path to install location> ..
$ make
$ make install

Link Caliper to a program by adding libcaliper:

$ g++ -o app app.o -L<path to install location>/lib64 -lcaliper

There are many build flags to enable optional features, such as -DWITH_MPI for MPI support. See the "Build and install" section in the documentation for further information.

Getting started

Typically, we integrate Caliper into a program by marking source-code sections of interest with descriptive annotations. Performance profiling can then be enabled through the Caliper ConfigManager API or environment variables. Alternatively, third-party tools can connect to Caliper and access information provided by the source-code annotations.

Source-code annotations

Caliper's source-code annotation API allows you to mark source-code regions of interest in your program. Much of Caliper's functionality depends on these region annotations.

Caliper provides macros and functions for C, C++, and Fortran to mark functions, loops, or sections of source-code. For example, use CALI_CXX_MARK_FUNCTION to mark a function in C++:

#include <caliper/cali.h>

void foo()
{
    CALI_CXX_MARK_FUNCTION;
    // ...
}

You can mark arbitrary code regions with the CALI_MARK_BEGIN and CALI_MARK_END macros or the corresponding cali_begin_region() and cali_end_region() functions:

#include <caliper/cali.h>

// ...
CALI_MARK_BEGIN("my region");
// ...
CALI_MARK_END("my region");
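Begin/end markers are meant to nest like a stack: each end marker should name the innermost region that is still open. The following is a minimal sketch of that rule, not Caliper's implementation. RegionStack and its methods are hypothetical names used only for illustration, and the code does not call Caliper.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a region stack; it does not call Caliper.
class RegionStack {
public:
    // Open a region, in the spirit of CALI_MARK_BEGIN(name).
    void begin(const std::string& name) { open_.push_back(name); }

    // Close a region, in the spirit of CALI_MARK_END(name).
    // Returns false when `name` is not the innermost open region.
    bool end(const std::string& name) {
        if (open_.empty() || open_.back() != name)
            return false;
        open_.pop_back();
        return true;
    }

    // Slash-separated path of the currently open regions, e.g. "main/foo".
    std::string path() const {
        std::string p;
        for (const std::string& r : open_) {
            if (!p.empty()) p += '/';
            p += r;
        }
        return p;
    }

private:
    std::vector<std::string> open_;
};
```

With this sketch, opening "main" and then "my region" yields the path "main/my region", and trying to close "main" first is rejected.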

The cxx-example, c-example, and fortran-example example apps show how to use Caliper in C++, C, and Fortran, respectively.

Recording performance data

With the source-code annotations in place, we can run performance measurements. By default, Caliper does not record data - we have to activate performance profiling at runtime. An easy way to do this is to use one of Caliper's built-in measurement recipes. For example, the runtime-report config prints out the time spent in the annotated regions. You can activate built-in measurement configurations with the ConfigManager API or with the CALI_CONFIG environment variable. Let's try this on Caliper's cxx-example program:

$ cd Caliper/build
$ make cxx-example
$ CALI_CONFIG=runtime-report ./examples/apps/cxx-example
Path       Min time/rank Max time/rank Avg time/rank Time %
main            0.000119      0.000119      0.000119  7.079120
  mainloop      0.000067      0.000067      0.000067  3.985723
    foo         0.000646      0.000646      0.000646 38.429506
  init          0.000017      0.000017      0.000017  1.011303

The runtime-report config works for MPI and non-MPI programs. It reports the minimum, maximum, and average exclusive time (seconds) spent in each marked code region across MPI ranks (the values are identical in non-MPI programs).
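To make the columns concrete, here is a small sketch of how exclusive time and the per-rank minimum, maximum, and average could be computed from raw numbers. It is only an illustration of the arithmetic, assuming one time value per rank; RankStats, summarize, and exclusive_time are hypothetical names and are not part of Caliper.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

struct RankStats { double min, max, avg; };

// Summarize one region's exclusive time (seconds), one entry per rank.
// Precondition: `times` is non-empty.
RankStats summarize(const std::vector<double>& times) {
    assert(!times.empty());
    auto mm = std::minmax_element(times.begin(), times.end());
    double sum = std::accumulate(times.begin(), times.end(), 0.0);
    return { *mm.first, *mm.second, sum / times.size() };
}

// Exclusive time: inclusive time minus the inclusive time of direct children.
double exclusive_time(double inclusive, const std::vector<double>& child_inclusive) {
    return inclusive
         - std::accumulate(child_inclusive.begin(), child_inclusive.end(), 0.0);
}
```

With a single rank, the three statistics collapse to the same value, which matches the identical columns in the non-MPI output above.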

You can customize the report with additional options. Some options enable additional Caliper functionality, such as profiling MPI and CUDA functions in addition to the user-defined regions, or additional metrics like memory usage. Other measurement configurations besides runtime-report include:

  • loop-report: Print summary and time-series information for loops.
  • mpi-report: Print time spent in MPI functions.
  • callpath-sample-report: Print time spent in functions using call-path sampling.
  • event-trace: Record a trace of region enter/exit events in .cali format.
  • hatchet-region-profile: Record a region time profile for processing with Hatchet or cali-query.

See the "Builtin configurations" section in the documentation to learn more about different configurations and their options.

You can also create entirely custom measurement configurations by selecting and configuring Caliper services manually. See the "Manual configuration" section in the documentation to learn more.

ConfigManager API

A distinctive Caliper feature is the ability to enable performance measurements programmatically with the ConfigManager API. For example, we often let users activate performance measurements with a command-line argument.

With the C++ ConfigManager API, built-in performance measurement and reporting configurations can be activated within a program using a short configuration string. This configuration string can be hard-coded in the program or provided by the user in some form, e.g. as a command-line parameter or in the program's configuration file.

To use the ConfigManager API, create a cali::ConfigManager object, add a configuration string with add(), start the requested configuration channels with start(), and trigger output with flush():

#include <caliper/cali-manager.h>
// ...
cali::ConfigManager mgr;
mgr.add("runtime-report");
// ...
mgr.start(); // start requested performance measurement channels
// ... (program execution)
mgr.flush(); // write performance results

The cxx-example program uses the ConfigManager API to let users specify a Caliper configuration with the -P command-line argument, e.g. -P runtime-report:

$ ./examples/apps/cxx-example -P runtime-report
Path       Min time/rank Max time/rank Avg time/rank Time %
main            0.000129      0.000129      0.000129  5.952930
  mainloop      0.000080      0.000080      0.000080  3.691740
    foo         0.000719      0.000719      0.000719 33.179511
  init          0.000021      0.000021      0.000021  0.969082
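A program that wants a similar -P option needs to pull the configuration string out of its arguments. The sketch below shows one simple way to do that with the standard library only. profile_config_from_args is a hypothetical helper, and the argument parsing in cxx-example itself may well differ.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Return the value following "-P" on the command line, or an empty string
// if "-P" is absent or has no value after it.
// Hypothetical helper; cxx-example's own argument parsing may differ.
std::string profile_config_from_args(int argc, const char* const argv[]) {
    for (int i = 1; i + 1 < argc; ++i)
        if (std::strcmp(argv[i], "-P") == 0)
            return argv[i + 1];
    return "";
}
```

A non-empty result could then be passed to the ConfigManager's add() before calling start(), as in the snippet earlier in this section.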

See the Caliper documentation for more examples and the full API and configuration reference.

Authors

Caliper was created by David Boehme, [email protected].

A complete list of contributors is available on GitHub.

Major contributors include:

Citing Caliper

To reference Caliper in a publication, please cite the following paper:

On GitHub, you can copy this citation in APA or BibTeX format via the "Cite this repository" button. Or, see the comments in CITATION.cff for the raw BibTeX.

Release

Caliper is released under a BSD 3-clause license. See LICENSE for details.

LLNL-CODE-678900

caliper's People

Contributors

aagimene, adayton1, adrienbernede, alfredo-gimenez, cdwdirect, daboehme, davidbeckingsale, davidpoliakoff, davis274, garretthooten, gonsie, ibaned, jrmadsen, junghans, khuck, kshoga1, mplegendre, pearce8, rblake-llnl, rombur, slabasan, srini009, t-nojiri, tepperly, termi-official, timmah, vsurjadidjaja

caliper's Issues

Segmentation fault on Intel Xeon Phi Knights Landing

I am trying to run LULESH with Caliper annotations on an Intel Xeon Phi Knights Landing system.

The OpenMP settings are as follows:
export OMP_NUM_THREADS=144
export OMP_PROC_BIND=spread

The run always ends in a segmentation fault, with the following call stack:

#0  0x00002ab98923605f in ?? ()
#1  0x00002ab988c4a041 in (anonymous namespace)::cali_pthread_create_wrapper (thread=0x7ffd41457058, attr=0x7ffd414572b8,
    fn=0x2ab9892eaeb0 <_INTERNAL_26_______src_z_Linux_util_cpp_313effc4::__kmp_launch_worker(void*)>, arg=0x2ab98c736400)
    at /work/03915/taoncsu/stampede2/Caliper/src/services/pthread/PthreadService.cpp:83
#2  0x00002ab9892eb9c9 in __kmp_create_worker (gtid=1095069784, th=0x7ffd414572b8, stack_size=46976351903408) at ../../src/z_Linux_util.cpp:878
#3  0x00002ab9892b8579 in __kmp_allocate_thread (root=0x7ffd41457058, team=0x7ffd414572b8, new_tid=-2000380240)
    at ../../src/kmp_runtime.cpp:4521
#4  0x00002ab9892bd2a7 in __kmp_allocate_team (root=0x7ffd41457058, new_nproc=1095070392, max_nproc=-2000380240, new_proc_bind=31192640,
    new_icvs=0x0, argc=16, master=0x2ab98c737200) at ../../src/kmp_runtime.cpp:5138
#5  0x00002ab9892bba18 in __kmp_fork_call (loc=0x7ffd41457058, gtid=1095070392, call_context=(unknown: 2294587056), argc=31192640,
    microtask=0x0, invoker=0x10, ap=0x7ffd41457730) at ../../src/kmp_runtime.cpp:2150
#6  0x00002ab989291e2a in __kmpc_fork_call (loc=0x7ffd41457058, argc=1095070392,
    microtask=0x2ab988c49eb0 <(anonymous namespace)::thread_wrapper(void*)>) at ../../src/kmp_csupport.cpp:328
#7  0x000000000040438d in main ()

LULESH is compiled with the following information:

icpc version 17.0.4

Library information:
	linux-vdso.so.1 =>  (0x00007ffd987a1000)
	/opt/apps/xalt/1.7/lib64/libxalt_init.so (0x00002ad1b40a2000)
	libm.so.6 => /usr/lib64/libm.so.6 (0x00002ad1b42a9000)
	libcaliper.so => /work/03915/taoncsu/tools/caliper.debug/lib64/libcaliper.so (0x00002ad1b45ab000)
	libstdc++.so.6 => /opt/apps/gcc/5.4.0/lib64/libstdc++.so.6 (0x00002ad1b4981000)
	libiomp5.so => /opt/intel/compilers_and_libraries/linux/lib/intel64/libiomp5.so (0x00002ad1b4cfb000)
	libgcc_s.so.1 => /opt/apps/gcc/5.4.0/lib64/libgcc_s.so.1 (0x00002ad1b509f000)
	libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x00002ad1b52b5000)
	libc.so.6 => /usr/lib64/libc.so.6 (0x00002ad1b54d1000)
	libdl.so.2 => /usr/lib64/libdl.so.2 (0x00002ad1b5892000)
	libuuid.so.1 => /usr/lib64/libuuid.so.1 (0x00002ad1b5a96000)
	/lib64/ld-linux-x86-64.so.2 (0x00002ad1b3e7f000)
	libgotcha.so.0 => /work/03915/taoncsu/tools/caliper.debug/lib64/libgotcha.so.0 (0x00002ad1b5c9b000)
	librt.so.1 => /usr/lib64/librt.so.1 (0x00002ad1b5eab000)
	libcaliper-reader.so => /work/03915/taoncsu/tools/caliper.debug/lib64/libcaliper-reader.so (0x00002ad1b60b3000)
	libcaliper-common.so => /work/03915/taoncsu/tools/caliper.debug/lib64/libcaliper-common.so (0x00002ad1b639a000)
	libimf.so => /opt/intel/compilers_and_libraries/linux/lib/intel64/libimf.so (0x00002ad1b6634000)
	libsvml.so => /opt/intel/compilers_and_libraries/linux/lib/intel64/libsvml.so (0x00002ad1b6b21000)
	libirng.so => /opt/intel/compilers_and_libraries/linux/lib/intel64/libirng.so (0x00002ad1b7a3a000)
	libintlc.so.5 => /opt/intel/compilers_and_libraries/linux/lib/intel64/libintlc.so.5 (0x00002ad1b7daf000)

Possibly broken builds on regular systems

@jarusified mentioned an attempted build on a regular system at school.

We should try to reproduce this. Notably, MPI was not detected even though the build used the MPI compilers. If we can't reproduce it within three months we should close this issue, but I'm worried that our FindMPI is brittle.

Runtime with latest master using PGI compiler

Hello,

I am building one of our applications with the latest master and the PGI compiler:

$ spack spec coreneuron@develop~nmodl %pgi ^caliper@develop
Input spec
--------------------------------
coreneuron@develop%pgi~nmodl
    ^caliper@develop

Concretized
--------------------------------
coreneuron@develop%[email protected] cxxflags="-D_GLIBCXX_USE_CXX11_ABI=0" ~amd~arm build_type=RelWithDebInfo +caliper~debug~derivimplicit~gpu~ispc~knl+mpi~nmodl~openmp~profile+report+shared~skl+sympy~sympyopt~tests arch=linux-rhel7-x86_64
    ^[email protected]%[email protected] cxxflags="-D_GLIBCXX_USE_CXX11_ABI=0"  arch=linux-rhel7-x86_64
    ^caliper@develop%[email protected] cxxflags="-D_GLIBCXX_USE_CXX11_ABI=0"  build_type=RelWithDebInfo ~callpath~dyninst~gotcha~libpfm+mpi~papi~sampler~sosflow arch=linux-rhel7-x86_64
        ^[email protected]%[email protected] cxxflags="-D_GLIBCXX_USE_CXX11_ABI=0" ~doc+ncurses+openssl+ownlibs~qt arch=linux-rhel7-x86_64
        ^[email protected]%[email protected] cxxflags="-D_GLIBCXX_USE_CXX11_ABI=0"  arch=linux-rhel7-x86_64
        ^[email protected]%[email protected] cxxflags="-D_GLIBCXX_USE_CXX11_ABI=0" +dbm~optimizations patches=123082ab3483ded78e86d7c809e98a804b3465b4683c96bd79a2fd799f572244 +pic+pythoncmd+shared~tk~ucs4 arch=linux-rhel7-x86_64
    ^[email protected]%[email protected] cxxflags="-D_GLIBCXX_USE_CXX11_ABI=0" +lex arch=linux-rhel7-x86_64
    ^[email protected]%[email protected] cxxflags="-D_GLIBCXX_USE_CXX11_ABI=0"  build_type=RelWithDebInfo ~profile+shared~tests arch=linux-rhel7-x86_64

At the end of execution, I get the following:

== CALIPER: (0): Finishing ...
== CALIPER: (0): default: Flushing Caliper data
MPT ERROR: Rank 0(g:0) received signal SIGBUS(7).
        Process ID: 271342, Host: r2i3n5, Program: /gpfs/my-path//review/soft/install/linux-rhel7-x86_64/pgi-19.4/coreneuron-develop-j6yl2e/bin/coreneuron_exec
        MPT Version: HPE MPT 2.16  06/02/17 01:08:38

MPT: --------stack traceback-------
MPT: Attaching to program: /proc/271342/exe, process 271342
MPT: [Thread debugging using libthread_db enabled]
MPT: Using host libthread_db library "/usr/lib64/libthread_db.so.1".
MPT: (no debugging symbols found)...done.
MPT: (no debugging symbols found)...done.
MPT: (no debugging symbols found)...done.
MPT: (no debugging symbols found)...done.
MPT: (no debugging symbols found)...done.
MPT: Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/da/3deca31a7ad9416897c1a4a430798d77146ad4.debug
MPT: (no debugging symbols found)...done.
MPT: (no debugging symbols found)...done.
MPT: (no debugging symbols found)...done.
MPT: 0x00002aaaac010189 in waitpid () from /usr/lib64/libpthread.so.0
MPT: warning: File "/gpfs/my-path-soft/linux-rhel7-x86_64/gcc-4.8.5/gcc-6.4.0-i6lyqfscua/lib64/libstdc++.so.6.0.22-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load:/usr/bin/mono-gdb.py".
MPT: To enable execution of this file add
MPT:    add-auto-load-safe-path /gpfs/my-path-soft/linux-rhel7-x86_64/gcc-4.8.5/gcc-6.4.0-i6lyqfscua/lib64/libstdc++.so.6.0.22-gdb.py
MPT: line to your configuration file "/gpfs/my-path//review/.gdbinit".
MPT: To completely disable this security protection add
MPT:    set auto-load safe-path /
MPT: line to your configuration file "/gpfs/my-path//review/.gdbinit".
MPT: For more information about this security protection see the
MPT: "Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
MPT:    info "(gdb)Auto-loading safe path"
MPT: Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7_4.2.x86_64 libbitmask-2.0-sgi716r63.rhel73.x86_64 libcpuset-1.0-sgi716r94.rhel73.x86_64 libibverbs-41mlnx1-OFED.4.1.0.1.1.41102.x86_64 libmlx4-41mlnx1-OFED.4.1.0.1.0.41102.x86_64 libmlx5-41mlnx1-OFED.4.1.0.1.5.41102.x86_64 libnl-1.1.4-3.el7.x86_64 xpmem-1.6-sgi716r125.rhel73.x86_64
MPT: (gdb) #0  0x00002aaaac010189 in waitpid () from /usr/lib64/libpthread.so.0
MPT: #1  0x00002aaaac975806 in mpi_sgi_system (
MPT: #2  MPI_SGI_stacktraceback (
MPT:     header=header@entry=0x7fffffffa900 "MPT ERROR: Rank 0(g:0) received signal SIGBUS(7).\n\tProcess ID: 271342, Host: r2i3n5, Program: /gpfs/bbp.cscs.ch/project/proj20/pramod_scratch/SC_BENCHMARKS/review/soft/install/linux-rhel7-x86_64/pgi-1"...) at sig.c:339
MPT: #3  0x00002aaaac975a08 in first_arriver_handler (signo=signo@entry=7,
MPT:     stack_trace_sem=stack_trace_sem@entry=0x2aaab0280500) at sig.c:488
MPT: #4  0x00002aaaac975deb in slave_sig_handler (signo=7, siginfo=<optimized out>,
MPT:     extra=<optimized out>) at sig.c:563
MPT: #5  <signal handler called>
MPT: #6  0x00002aaaae39881d in __memset_sse2 () from /usr/lib64/libc.so.6
MPT: #7  0x00002aaaab438cbb in ~_Hashtable ()
MPT:     at /gpfs/my-path-soft/linux-rhel7-x86_64/gcc-4.8.5/gcc-6.4.0-i6lyqfscua/include/c++/6.4.0/bits/hashtable.h:1902
MPT: #8  cali::MpiTracing::~MpiTracing ()
MPT:     at /gpfs/my-path//review/spack/var/spack/stage/caliper-develop-vrxygjo4crqc3leut3xikhjfyi5mnq7i/Caliper/mpi/mpi-rt/services/mpiwrap/MpiTracing.cpp:347
MPT: #9  0x00002aaaab461e21 in ~MpiWrapperConfig ()
MPT:     at /gpfs/my-path//review/soft/.stage/kumbhar/spack-stage/spack-stage-9VnH6N/Caliper/spack-build/mpi/mpi-rt/services/mpiwrap/Wrapper.cpp:867
MPT: #10 cali::mpiwrap_init(cali::Caliper*, cali::Channel*)::{lambda(cali::Caliper*, cali::Channel*)#1} (c=<optimized out>, chn=<optimized out>)
MPT:     at /gpfs/my-path//review/soft/.stage/kumbhar/spack-stage/spack-stage-9VnH6N/Caliper/spack-build/mpi/mpi-rt/services/mpiwrap/Wrapper.cpp:14605
MPT: #11 0x00002aaaab6db202 in cali::Caliper ()
MPT:     at /gpfs/my-path//review/spack/var/spack/stage/caliper-develop-vrxygjo4crqc3leut3xikhjfyi5mnq7i/Caliper/src/caliper/Caliper.cpp:31
MPT: #12 0x00002aaaab6dd61f in cali::Caliper::ThreadData ()
MPT:     at /gpfs/my-path//review/spack/var/spack/stage/caliper-develop-vrxygjo4crqc3leut3xikhjfyi5mnq7i/Caliper/src/caliper/Caliper.cpp:275
MPT: #13 0x00002aaaab6d513a in std::unique_ptr::~unique_ptr ()
MPT:     at /gpfs/my-path-soft/linux-rhel7-x86_64/gcc-4.8.5/gcc-6.4.0-i6lyqfscua/include/c++/6.4.0/bits/unique_ptr.h:76
MPT: #14 0x00002aaaae347dda in __cxa_finalize () from /usr/lib64/libc.so.6
MPT: #15 0x00002aaaab6b7c53 in __do_global_dtors_aux ()
MPT:    from /gpfs/my-path//review/soft/install/linux-rhel7-x86_64/pgi-19.4/caliper-develop-vrxygj/lib64/libcaliper.so.2
MPT: #16 0x00007fffffffc410 in ?? ()
MPT: #17 0x00002aaaaaabab3a in _dl_fini () from /lib64/ld-linux-x86-64.so.2
MPT: Backtrace stopped: frame did not save the PC
MPT: (gdb) A debugging session is active.

Everything works fine if I use 1.9.1 instead.

Issue building to get CPU Information

Hi,

I was trying to build on CentOS 6.10 with the compiler noted below, and had to change this line from

if (syscall(SYS_getcpu, &cpu, &node, NULL) == 0)

to:

if (syscall(SYS_get_cpu, &cpu, &node, NULL) == 0)

I am compiling with GCC 7.3.0.

I don't know whether it's specific to my system or whether this changed recently.

Kind Regards,

Dean

if (syscall(SYS_getcpu, &cpu, &node, NULL) == 0) {
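One defensive option, sketched here under the assumption that the SYS_getcpu constant can be missing from some system headers, is to guard the call at compile time and report "unknown" otherwise. current_cpu is a hypothetical helper for illustration and is not Caliper's code.

```cpp
#include <cassert>
#include <sys/syscall.h>
#include <unistd.h>

// Returns the CPU the calling thread is running on, or -1 if unavailable.
// Hypothetical helper, not Caliper's implementation.
int current_cpu() {
#ifdef SYS_getcpu
    unsigned cpu = 0, node = 0;
    if (syscall(SYS_getcpu, &cpu, &node, nullptr) == 0)
        return static_cast<int>(cpu);
#endif
    return -1;  // constant not defined at build time, or the call failed
}
```

This keeps the build working where the constant is absent, at the cost of losing the CPU information on such systems.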

MacOS cali-query output broken

Cali file attached as txt to placate Github:

I'm trying to get JSON-formatted cali-query output. The results are correct on Linux but incorrect on macOS.

qyburn:cali_interactive_explorer poliakoff1$ cali-query -q "SELECT time.inclusive.duration format json()" *.cali
,
{"time.inclusive.duration":5001708},
{"time.inclusive.duration":5001878}
]

Interestingly

qyburn:cali_interactive_explorer poliakoff1$ cali-query -q "SELECT * format json()" *.cali

works fine. Everything works on Linux, and the table formatter is fine on both operating systems.

171110-133459_16234_soqJ6T0qUh3E.cali.txt

CMake Process Stalls when unable to get to Internet

Caliper appears to download source code while configuring the build. This is very unhelpful on machines without internet connectivity. Can you provide a build system that doesn't require this (maybe with recursive or something?)?

Limiting annotation depth in CalQL

This is more of a question than an issue report. Is there a way to limit the depth of annotations in CalQL? For instance, if I have:

annotation mpi.rank time.offset time.inclusive.duration
main
main/foo
main/foo/bar
main/foo
main/foo/baz
main/foo
main
...

and I want to print only the summed time for main and main/foo, is there an argument that achieves this? One way to express this is to limit the depth; another would be to explicitly specify the annotation prefixes I'm interested in.

If not, perhaps this is a feature request after all.
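Until there is an answer on the CalQL side, the depth limit can be applied as a post-processing step on exported rows. The sketch below assumes the rows are already available as (annotation path, duration) pairs; path_depth and sum_by_path are hypothetical helpers and do not represent a CalQL feature.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Number of components in a '/'-separated annotation path ("main/foo" -> 2).
std::size_t path_depth(const std::string& path) {
    if (path.empty()) return 0;
    return 1 + static_cast<std::size_t>(
                   std::count(path.begin(), path.end(), '/'));
}

// Sum durations per annotation path, skipping rows nested deeper than max_depth.
std::map<std::string, double>
sum_by_path(const std::vector<std::pair<std::string, double>>& rows,
            std::size_t max_depth) {
    std::map<std::string, double> totals;
    for (const auto& row : rows)
        if (path_depth(row.first) <= max_depth)
            totals[row.first] += row.second;
    return totals;
}
```

Calling sum_by_path(rows, 2) on the listing above would keep only main and main/foo. Filtering by prefix instead of depth would only change the condition inside the loop.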

A little more rigor in prefixing CALI in CMake options

I've been looking at integrating Caliper into someone else's large build. As part of that, I found myself having to set BUILD_TESTING to OFF in order to make Caliper not integrate GTest. Could you do a once-over and make that (and anything else that might not be Caliper-exclusive nomenclature) CALI_BUILD_TESTING? I worry that somebody else has BUILD_TESTING as an option that means something else, and we wind up being the source of a nasty bug.

Build breaks on MacOS with default settings

Could we guard the WITH_GOTCHA setting based on whether we're on a system that supports Gotcha? Right now on a Mac if I just do CMake and let Caliper figure out what's appropriate it just dies.

Also I don't trust those Gotcha developers.

Error while enabling TAU

Hello,

I was trying to enable TAU support but am getting the following error:

cd /tmp/kumbhar/spack-stage/spack-stage-2V5hVI/Caliper/spack-build/src/common && /gpfs/bbp.cscs.ch/project/proj7/kumbhar/spack/lib/spack/env/intel/icpc  -Dcaliper_common_EXPORTS -I/tmp/kumbhar/spack-stage/spack-stage-2V5hVI/Caliper/spack-build/include -I/gpfs/bbp.cscs.ch/project/proj7/kumbhar/spack/var/spack/stage/caliper-master-iuak4wkzhfp5ohn3ttyoa43gqgfrqcvj/Caliper/include -I/tmp/kumbhar/spack-stage/spack-stage-2V5hVI/Caliper/spack-build -I/gpfs/bbp.cscs.ch/project/proj7/kumbhar/spack/var/spack/stage/caliper-master-iuak4wkzhfp5ohn3ttyoa43gqgfrqcvj/Caliper/src  -std=c++11 -O2 -g -DNDEBUG -fPIC   -o CMakeFiles/caliper-common.dir/SnapshotBuffer.cpp.o -c /gpfs/bbp.cscs.ch/project/proj7/kumbhar/spack/var/spack/stage/caliper-master-iuak4wkzhfp5ohn3ttyoa43gqgfrqcvj/Caliper/src/common/SnapshotBuffer.cpp
/gpfs/bbp.cscs.ch/project/proj7/kumbhar/spack/var/spack/stage/caliper-master-iuak4wkzhfp5ohn3ttyoa43gqgfrqcvj/Caliper/src/services/tau/tau.cpp(95): error: expected a ")"
              Tau_stop((const char*)(value.data());
                                                  ^

/gpfs/bbp.cscs.ch/project/proj7/kumbhar/spack/var/spack/stage/caliper-master-iuak4wkzhfp5ohn3ttyoa43gqgfrqcvj/Caliper/include/caliper/AnnotationBinding.h(204): error #140: too many arguments in function call
                  binding->finalize(c,chn);
                                      ^
          detected during instantiation of "void cali::AnnotationBinding::make_binding<BindingT>(cali::Caliper *, cali::Channel *) [with BindingT=<unnamed>::TAUBinding]" at line 105 of "/gpfs/bbp.cscs.ch/project/proj7/kumbhar/spack/var/spack/stage/caliper-master-iuak4wkzhfp5ohn3ttyoa43gqgfrqcvj/Caliper/src/services/tau/tau.cpp"

compilation aborted for /gpfs/bbp.cscs.ch/project/proj7/kumbhar/spack/var/spack/stage/caliper-master-iuak4wkzhfp5ohn3ttyoa43gqgfrqcvj/Caliper/src/services/tau/tau.cpp (code 2)
make[2]: *** [src/services/tau/CMakeFiles/caliper-tau.dir/tau.cpp.o] Error 2
make[2]: Leaving directory `/tmp/kumbhar/spack-stage/spack-stage-2V5hVI/Caliper/spack-build'
make[1]: *** [src/services/tau/CMakeFiles/caliper-tau.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

I have used Caliper instrumentation in the application and I get a summary like the one below:

== CALIPER: (0): Flushing Caliper data
== CALIPER: (0): Aggregate: flushed 58 snapshots.
Path                         Min time/rank  Max time/rank  Avg time/rank  Time % (total)
main                         1734832.000000 1854442.000000 1800110.055556      85.460410
  MPI_Finalized                    1.000000       2.000000       1.472222       0.000070
  MPI_Initialized                  1.000000       9.000000       2.972222       0.000141
  MPI_Finalize                    38.000000   18670.000000    6490.166667       0.308121
  simulation                    1955.000000    3952.000000    2304.138889       0.109389
    spike-exchange                 1.000000       4.000000       2.194444       0.000104
      communication                1.000000       4.000000       2.166667       0.000103
        MPI_Allgather              6.000000      10.000000       7.972222       0.000378
      imbalance                    2.000000       5.000000       3.222222       0.000153
        MPI_Barrier                4.000000       7.000000       6.361111       0.000302
    timestep                   16869.000000   31784.000000   18752.777778       0.890290
      state-update             10661.000000   19779.000000   11818.472222       0.561083
        state-hh                6243.000000   11275.000000    6926.583333       0.328840
        state-ExpSyn            3481.000000    6472.000000    3896.888889       0.185005
        state-pas               2673.000000    5033.000000    2998.777778       0.142367
      update                    3548.000000    6855.000000    4053.083333       0.192421
      second_order_cur          2620.000000    4989.000000    2947.277778       0.139922
      matrix-solver             6605.000000   13429.000000    8186.250000       0.388643
      setup_tree_matrix        28000.000000   52004.000000   31500.194444       1.495475
        cur-hh                  5729.000000   10515.000000    6361.777778       0.302026
        cur-ExpSyn              5645.000000   10426.000000    6270.027778       0.297670
        cur-k_ion               2610.000000    4897.000000    2914.444444       0.138364
        cur-na_ion              2720.000000    5105.000000    3088.611111       0.146632
        cur-pas                 7176.000000   13420.000000    8167.000000       0.387729
      deliver_events            6171.000000   11643.000000    6917.055556       0.328388
        spike-exchange           149.000000     322.000000     178.527778       0.008476
          communication          144.000000     275.000000     173.444444       0.008234
            MPI_Allgatherv      1727.000000    1914.000000    1842.527778       0.087474
            MPI_Allgather        633.000000     927.000000     823.777778       0.039109
          imbalance              113.000000     233.000000     132.555556       0.006293
            MPI_Barrier          446.000000     643.000000     582.333333       0.027646
        MPI_Barrier             1362.000000   98208.000000   85062.916667       4.038371
    MPI_Barrier                   13.000000   22021.000000   13972.194444       0.663331
  MPI_Gatherv                      5.000000      14.000000       6.722222       0.000319

All good. I was wondering whether I could export this information to the TAU profiler so that I don't have to use a different instrumentation mechanism. Is this the intended use of TAU support in Caliper?

Can caliper only record time for OMP master thread?

I have a code as follows:

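// n and m are placeholder loop bounds
#pragma omp parallel for
for (int i = 0; i < n; ++i)
{
    for (int j = 0; j < m; ++j) { /* loop1 body */ }
    for (int k = 0; k < m; ++k) { /* loop2 body */ }
}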

I want to measure the runtime of loop1 and loop2 with Caliper.
But simply adding CALI_MARK_LOOP_BEGIN/END instrumentation does not work because of the issue in #77.

So I wonder whether there is a Caliper runtime configuration that measures only the master thread's runtime. That would let me achieve my initial goal.
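As a workaround idea, measurements can be recorded only on thread 0 by checking the thread number inside the parallel loop. The sketch below uses std::chrono rather than Caliper annotations, so it says nothing about whether Caliper offers such a configuration. on_master_thread, LoopTimes, and run are hypothetical names, and the loop bodies are placeholders.

```cpp
#include <cassert>
#include <chrono>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// True on OpenMP thread 0, and always true when built without OpenMP.
inline bool on_master_thread() {
#ifdef _OPENMP
    return omp_get_thread_num() == 0;
#else
    return true;
#endif
}

struct LoopTimes { double loop1 = 0.0, loop2 = 0.0; };  // seconds

// Runs two inner loops per outer iteration; records timings on thread 0 only.
LoopTimes run(std::vector<double>& data, int inner) {
    using clock = std::chrono::steady_clock;
    LoopTimes t;  // written only by thread 0, so there is no data race
    const int n = static_cast<int>(data.size());
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        const bool record = on_master_thread();
        auto t0 = clock::now();
        for (int j = 0; j < inner; ++j) data[i] += 1.0;   // loop1
        auto t1 = clock::now();
        for (int j = 0; j < inner; ++j) data[i] -= 0.5;   // loop2
        auto t2 = clock::now();
        if (record) {
            t.loop1 += std::chrono::duration<double>(t1 - t0).count();
            t.loop2 += std::chrono::duration<double>(t2 - t1).count();
        }
    }
    return t;
}
```

The recorded times cover only the iterations that thread 0 happened to execute, so they are a sample of the per-thread cost rather than a total.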

[ feature request ] recorder filename patterns

As it stands, the recorder filename can either be auto-generated, or it can be a single static value. It would be quite helpful to be able to specify a pattern from which to generate the name, or to be able to specify it through an API at runtime.
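To illustrate what such a pattern could look like, the sketch below expands %key% placeholders from a key/value map. The %key% syntax and the expand_pattern function are hypothetical and only meant to make the request concrete; they do not describe an existing Caliper option.

```cpp
#include <cassert>
#include <map>
#include <string>

// Replace each %key% in `pattern` with values[key]; unknown keys are left as-is.
// The %key% syntax is a hypothetical illustration.
std::string expand_pattern(const std::string& pattern,
                           const std::map<std::string, std::string>& values) {
    std::string out;
    std::size_t pos = 0;
    while (pos < pattern.size()) {
        std::size_t open = pattern.find('%', pos);
        if (open == std::string::npos) break;
        std::size_t close = pattern.find('%', open + 1);
        if (close == std::string::npos) break;       // unmatched '%': keep literally
        out.append(pattern, pos, open - pos);        // text before the placeholder
        auto it = values.find(pattern.substr(open + 1, close - open - 1));
        if (it != values.end())
            out += it->second;                       // known key: substitute
        else
            out.append(pattern, open, close - open + 1);  // unknown key: keep
        pos = close + 1;
    }
    out.append(pattern, pos, std::string::npos);     // remaining tail
    return out;
}
```

For example, the pattern "run-%rank%.cali" with rank mapped to "3" would expand to "run-3.cali".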

Sequoia to BlueOS architecture/compiler testing

Via Mike Collette:

We're noticing that builds fail once we get outside of a Cab build environment. I believe sampling and callpaths don't work on either platform. At the very least, these should be smartly disabled so that people don't get errors. Ideally, Caliper would be ported to those architectures and regularly tested across all their compilers (PGI to ICC) so that we don't fall on our faces that way.

(Note: on both this and #100, Mike expressed these ideas a lot more politely.)

Infinite recursion in Filter.h

Self-assigning so I don't forget to fix this. In Filter.h, if a class deriving from Filter doesn't implement initialize, we get an infinite loop. Oops.

Minimal resolution: hotfix so we don't get an infinite loop
Preferred resolution: Refactoring of Filter to be more effective
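The actual Filter.h is not shown here, so the following is only a guess at the shape of the problem: a CRTP-style base whose entry point calls a function of the same name on the derived class, which falls back to the base and recurses when the derived class does not define it. A minimal sketch of the hotfix idea is to give the default hook a distinct name. FilterBase, do_initialize, PlainFilter, and CountingFilter are hypothetical names.

```cpp
#include <cassert>

// Hypothetical CRTP base; the real Filter.h may be structured differently.
template <class Derived>
class FilterBase {
public:
    // Public entry point: dispatches to the derived class's hook.
    void initialize() { static_cast<Derived*>(this)->do_initialize(); }

    // Default hook with a distinct name. If Derived does not declare
    // do_initialize(), the call above resolves here and simply returns
    // instead of re-entering initialize().
    void do_initialize() {}
};

// A filter that does not customize initialization.
struct PlainFilter : FilterBase<PlainFilter> {};

// A filter that provides its own hook.
struct CountingFilter : FilterBase<CountingFilter> {
    int calls = 0;
    void do_initialize() { ++calls; }
};
```

With this layout, PlainFilter().initialize() returns immediately, and CountingFilter's hook runs exactly once per call.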

Edit: I can't assign myself as responsible. But this will remind @daboehme to bother me about it

"Attempt to free invalid pointer" on Cray

Hi,

I'm trying to use Caliper on a Cray XC30 (ARCHER). The environment is as follows:

modules/3.2.10.6
eswrap/1.3.3-1.020200.1280.0
switch/1.0-1.0502.60522.1.61.ari
cce/8.6.5
craype-network-aries
craype/2.5.10
cray-libsci/16.11.1
udreg/2.3.2-1.0502.10518.2.17.ari
ugni/6.0-1.0502.10863.8.29.ari
pmi/5.0.13
dmapp/7.0.1-1.0502.11080.8.76.ari
gni-headers/4.0-1.0502.10859.7.8.ari
xpmem/0.1-2.0502.64982.5.3.ari
dvs/2.5_0.9.0-1.0502.2188.1.116.ari
alps/5.2.4-2.0502.9774.31.11.ari
rca/1.0.0-2.0502.60530.1.62.ari
atp/2.1.0
PrgEnv-cray/5.2.82
pbs/12.2.401.141761
craype-ivybridge
cray-mpich/7.7.0
packages-archer
bolt/0.6
nano/2.2.6
leave_time/1.3.0
quickstart/1.0
ack/2.14
xalt/0.6.0
openssl/1.1.0g_build1
curl/7.58.0_build1
git/2.16.2_build1
epcc-tools/8.0
cmake/3.10.2
pcre/8.35

I build Caliper like this (all static libraries as the platform doesn't advise dynamic linking):

git clone https://github.com/LLNL/Caliper caliper
cd caliper
git checkout v1.7.0
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE="Release" -DCMAKE_INSTALL_PREFIX=$HOME -DBUILD_SHARED_LIBS=OFF -DWITH_MPI=ON -DWITH_GOTCHA=OFF ..
make && make install

(I disable GOTCHA as Cray doesn't like the -fvisibility flags. I've tried building Caliper with GOTCHA and manually disabling these flags, and it builds but gives the same end problem).

I've successfully used Caliper with my application on other machines, but here, when I run it, the application terminates immediately with the following errors and backtrace:

Error:

craylibs//google-perftools/src/tcmalloc.cc:644] Attempt to free invalid pointer: 0x4000000000000000
craylibs//google-perftools/src/tcmalloc.cc:644] Attempt to free invalid pointer: 0x4000000000000000
craylibs//google-perftools/src/tcmalloc.cc:644] Attempt to free invalid pointer: 0x4000000000000000
craylibs//google-perftools/src/tcmalloc.cc:644] Attempt to free invalid pointer: 0x4000000000000000
_pmiu_daemon(SIGCHLD): [NID 02550] [c5-1c0s13n2] [Sat Jun 30 15:37:31 2018] PE RANK 0 exit signal Aborted
[NID 02550] 2018-06-30 15:37:31 Apid 31272615: initiated application termination

Backtrace:

#0  0x000000000069e8ab in raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1  0x0000000000adde31 in abort () at abort.c:92
#2  0x0000000000acd1a9 in TCMalloc_CRASH_internal(bool, char const*, int, char const*, __va_list_tag*) ()
#3  0x0000000000acd444 in TCMalloc_CrashReporter::PrintfAndDie(char const*, ...) ()
#4  0x0000000000ac9ebe in (anonymous namespace)::InvalidFree(void*) ()
#5  0x0000000000b6f07d in tc_deletearray ()
#6  0x0000000000403f8b in void std::_Destroy_aux<false>::__destroy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) ()
    at /opt/gcc/6.1.0/snos/include/g++/bits/stl_construct.h:103
#7  0x0000000000403cf9 in std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::~vector() () at /opt/gcc/6.1.0/snos/include/g++/bits/stl_vector.h:426
#8  0x00000000005e4a06 in AggregateDB::init_static_data() () at /.../timrlaw/repos/caliper/src/services/aggregate/Aggregate.cpp:415
#9  0x00000000005e70bc in AggregateDB::aggregate_register(cali::Caliper*) () at /.../timrlaw/repos/caliper/src/services/aggregate/Aggregate.cpp:844
#10 0x00000000005d1a53 in cali::Services::ServicesImpl::register_services(cali::Caliper*) () at /.../timrlaw/repos/caliper/src/services/Services.cpp:106
#11 0x00000000005b9e44 in cali::Caliper::GlobalData::init() () at /.../timrlaw/repos/caliper/src/caliper/Caliper.cpp:327
#12 0x00000000005b909f in cali::Caliper::GlobalData::GlobalData() () at /.../timrlaw/repos/caliper/src/caliper/Caliper.cpp:269
#13 0x00000000005b825f in cali::Caliper::instance() () at /.../timrlaw/repos/caliper/src/caliper/Caliper.cpp:1358
#14 0x00000000005b50da in cali::Caliper::Caliper() () at /.../timrlaw/repos/caliper/src/caliper/Caliper.cpp:1324
#15 0x00000000005acba1 in cali::Function::Function(char const*) () at /.../timrlaw/repos/caliper/src/caliper/Annotation.cpp:59
#16 0x0000000000401cad in main () at ...

Any ideas?

Thanks,
Tim

Standardize Unit Metadata

Your daily issue from me! Also pinging @alfredo-gimenez for no reason whatsoever:

qyburn:build poliakoff1$ cali-query -q "SELECT * FORMAT JSON()" --list-attributes *.cali
[
...
{"class.aggregatable":"true","time.unit":"usec","cali.attribute.prop":"85","cali.attribute.type":"uint","cali.attribute.name":"time.inclusive.duration","attribute.id":55}
...
]

Here we do something that is so close to perfect: we record the units of our metrics. However, the way in which they're recorded is (I imagine) nonstandard. We have "time.unit" for this unit, but somebody else might have (the equivalent of) "time.inclusive.duration.unit", and that might be correct. I'd like it if, in the same way we know that time.inclusive.duration is a "uint", we knew that it's in units of "usec".

My end goal is to replace (augment?) "time.inclusive.duration" with "usec" on the y-axis labels

[Image: screen shot 2017-11-15 at 9 56 52 am]
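As a rough illustration of the lookup a plotting script would need today, the sketch below searches an attribute record for a unit. The rule "any key ending in `.unit` holds the unit" and the helper name `find_unit` are assumptions made for this example, not a convention Caliper defines.

```python
import json

# Attribute record copied from the cali-query output above.
record = json.loads(
    '{"class.aggregatable":"true","time.unit":"usec",'
    '"cali.attribute.prop":"85","cali.attribute.type":"uint",'
    '"cali.attribute.name":"time.inclusive.duration","attribute.id":55}'
)


def find_unit(attr):
    """Return the unit of an attribute record, or None if none is found.

    Assumption: any key ending in ".unit" holds the unit. This is a guess
    at a convention, not something Caliper guarantees.
    """
    for key, value in attr.items():
        if key.endswith(".unit"):
            return value
    return None


# Use the unit as the axis label when there is one, else fall back to the name.
unit = find_unit(record)
label = unit if unit is not None else record["cali.attribute.name"]
print(label)
```

With a standardized key the search loop would collapse to a single dictionary lookup, which is the point of the request.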

Record Caliper configuration in output

I'm writing processing pipelines for Caliper profiles. One of the tricky things is that a lot of my current processing assumes profiles configured with serial-trace, which is quite possibly wrong. I currently have no way to tell which configuration produced the Caliper profiles I'm looking at. It would be nice if there was a way to see this in the profiles. Specifically, pipelines will be very curious whether the data they're examining is a trace or a profile, as this will impact whether we should do aggregation in cases like this:

[Image: frame]

Here we have trace entries which are essentially "intermediate" ones, created as metric Annotations are started, stopped, and set.

Caliper not saving profiles when MPI enabled

Hello,

I have seen this a few times and am really confused by this behaviour:

I have correctly linked caliper to my application:

$ ldd /gpfs/bbp.cscs.ch/project/proj7/kumbhar/HPCTM-1227/soft/install/linux-rhel7-x86_64/intel-18.0.1/coreneuron-develop-npo4yx/bin/coreneuron_exec
	linux-vdso.so.1 =>  (0x00007fffedb05000)
	libcorenrnmech.so => /gpfs/bbp.cscs.ch/project/proj7/kumbhar/HPCTM-1227/soft/install/linux-rhel7-x86_64/intel-18.0.1/coreneuron-develop-npo4yx/lib64/libcorenrnmech.so (0x00007fffed8ff000)
	libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x00007fffed6a9000)
	libmpi.so => /opt/hpe/hpc/mpt/mpt-2.16/lib/libmpi.so (0x00007fffed2d4000)
	libmpi++abi1002.so => /opt/hpe/hpc/mpt/mpt-2.16/lib/libmpi++abi1002.so (0x00007fffed098000)
	libcoreneuron.so.1 => /gpfs/bbp.cscs.ch/project/proj7/kumbhar/HPCTM-1227/soft/install/linux-rhel7-x86_64/intel-18.0.1/coreneuron-develop-npo4yx/lib64/libcoreneuron.so.1 (0x00007fffecdac000)
	libcaliper-mpi.so.1 => /gpfs/bbp.cscs.ch/project/proj7/kumbhar/HPCTM-1227/soft/install/linux-rhel7-x86_64/intel-18.0.1/caliper-1.9.1-esvxyt/lib64/libcaliper-mpi.so.1 (0x00007fffecb50000)
	libcaliper.so.1 => /gpfs/bbp.cscs.ch/project/proj7/kumbhar/HPCTM-1227/soft/install/linux-rhel7-x86_64/intel-18.0.1/caliper-1.9.1-esvxyt/lib64/libcaliper.so.1 (0x00007fffec885000)
	libcaliper-mpi-common.so.1 => /gpfs/bbp.cscs.ch/project/proj7/kumbhar/HPCTM-1227/soft/install/linux-rhel7-x86_64/intel-18.0.1/caliper-1.9.1-esvxyt/lib64/libcaliper-mpi-common.so.1 (0x00007fffec674000)
	libcaliper-reader.so.1 => /gpfs/bbp.cscs.ch/project/proj7/kumbhar/HPCTM-1227/soft/install/linux-rhel7-x86_64/intel-18.0.1/caliper-1.9.1-esvxyt/lib64/libcaliper-reader.so.1 (0x00007fffec3ec000)
	libcaliper-common.so.1 => /gpfs/bbp.cscs.ch/project/proj7/kumbhar/HPCTM-1227/soft/install/linux-rhel7-x86_64/intel-18.0.1/caliper-1.9.1-esvxyt/lib64/libcaliper-common.so.1 (0x00007fffec1b1000)
	libstdc++.so.6 => /gpfs/bbp.cscs.ch/apps/compilers/install/gcc-6.4.0/lib64/libstdc++.so.6 (0x00007fffebe30000)
	libm.so.6 => /usr/lib64/libm.so.6 (0x00007fffebb2e000)
	libgcc_s.so.1 => /gpfs/bbp.cscs.ch/apps/compilers/install/gcc-6.4.0/lib64/libgcc_s.so.1 (0x00007fffeb917000)
	libc.so.6 => /usr/lib64/libc.so.6 (0x00007fffeb554000)
	libdl.so.2 => /usr/lib64/libdl.so.2 (0x00007fffeb34f000)
	libimf.so => /gpfs/bbp.cscs.ch/apps/compilers/install/intel-18.0.1/lib/intel64_lin/libimf.so (0x00007fffeadc1000)
	libsvml.so => /gpfs/bbp.cscs.ch/apps/compilers/install/intel-18.0.1/lib/intel64_lin/libsvml.so (0x00007fffe970d000)
	libirng.so => /gpfs/bbp.cscs.ch/apps/compilers/install/intel-18.0.1/lib/intel64_lin/libirng.so (0x00007fffe9399000)
	libintlc.so.5 => /gpfs/bbp.cscs.ch/apps/compilers/install/intel-18.0.1/lib/intel64_lin/libintlc.so.5 (0x00007fffe912c000)
	/lib64/ld-linux-x86-64.so.2 (0x0000555555554000)
	librt.so.1 => /usr/lib64/librt.so.1 (0x00007fffe8f23000)
	libcpuset.so.1 => /usr/lib64/libcpuset.so.1 (0x00007fffe8d16000)
	libbitmask.so.1 => /usr/lib64/libbitmask.so.1 (0x00007fffe8b10000)

If I run with the non-MPI configuration, I see correct profiles:

$ CALI_CONFIG_PROFILE=runtime-report srun -n 1 /gpfs/bbp.cscs.ch/project/proj7/kumbhar/HPCTM-1227/soft/install/linux-rhel7-x86_64/intel-18.0.1/coreneuron-develop-npo4yx/bin/coreneuron_exec -mpi -e 10 -d coredat/
 num_mpi=1
 num_omp_thread=1

 Duke, Yale, and the BlueBrain Project -- Copyright 1984-2015
.....

== CALIPER: (0): Flushing Caliper data
== CALIPER: (0): Aggregate: flushed 33 snapshots.
Path                      Inclusive time (usec) Exclusive time (usec) Time %
main                             2065686.000000           1105.000000  0.053478
  checkpoint                           7.000000              7.000000  0.000339
  output-spike                      5387.000000           5387.000000  0.260713
  simulation                     2025676.000000           4591.000000  0.222190
    spike-exchange                   370.000000             29.000000  0.001404
      communication                  314.000000            314.000000  0.015197
      imbalance                       27.000000             27.000000  0.001307
....

All good. But with CALI_CONFIG_PROFILE=mpi-runtime-report:

$ CALI_CONFIG_PROFILE=mpi-runtime-report srun -n 2 /gpfs/bbp.cscs.ch/project/proj7/kumbhar/HPCTM-1227/soft/install/linux-rhel7-x86_64/intel-18.0.1/coreneuron-develop-npo4yx/bin/coreneuron_exec -mpi -e 10 -d coredat/
 num_mpi=2
 num_omp_thread=1


 Duke, Yale, and the BlueBrain Project -- Copyright 1984-2015
 version id unimplemented

 Additional mechanisms from files
 halfgap.mod

== CALIPER: (0): Registered aggregation service
== CALIPER: (0): Registered event trigger service
== CALIPER: (0): Registered timestamp service
== CALIPER: (0): Registered MPI service
== CALIPER: (0): Initialized
 Memory (MBs) :             After mk_mech : Max 9.4883, Min 9.2969, Avg 9.3926
 num_mpi=2
 num_omp_thread=1
....


== CALIPER: (0): Flushing Caliper data
== CALIPER: (0): Aggregate: flushed 33 snapshots.

We do have MPI_Finalize in the application. Any hints to debug this?

How to "set" nonstandard datatypes

Right now cali::Annotation will set an int, double, const char*, or Variant. Are there plans to expand this group? I worry about casting (for example) uint64_t's to doubles all over the place.
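To make the casting worry concrete: a double has a 53-bit significand, so integers above 2^53 are not all representable. The sketch below uses Python floats, which are IEEE-754 doubles like a C++ `double`, to show where a 64-bit integer stops surviving the round trip. The helper name `survives_double` is made up for this example.

```python
# Python floats are IEEE-754 doubles, the same format as a C++ double.
LIMIT = 2**53  # up to this value, every integer is exactly representable


def survives_double(value):
    """True if an integer round-trips through a double unchanged."""
    return int(float(value)) == value


print(survives_double(LIMIT))      # True
print(survives_double(LIMIT + 1))  # False: rounds to 2**53
print(survives_double(2**64 - 1))  # False: rounds to 2**64
```

Counters and sizes below 2^53 are therefore safe to pass as doubles, while values such as large addresses, hashes, or full-range uint64_t values are not.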

Does cali-query do multi-threaded processing?

cali-query: processing 1 files using 1 thread.
== CALIPER: Trace: Flushed 13 snapshots.
== CALIPER: Recorder: Wrote 90 records.

Does it mean there is a switch to enable multi-threaded processing of cali-query?
When the caliper output is large, it would be nice to parallelize the processing.

NVPROF does not cycle range colors

When using the caliper nvprof backend and writing out a file, the ranges all show up as green in NVVP. This limits usability when importing the data into NVVP.

Example run on LLNL SIERRA:
lrun -n 1 nvprof -o output.nvprof myexecutable
(with caliper linked in and using NVPROF backend)

Importing the output.nvprof file into NVVP shows the ranges as all green. @DavidPoliakoff mentioned that Caliper should be cycling the choice of color for each new Caliper annotation in code, so you could differentiate these in NVVP.

Thanks
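For reference, the color cycling described above amounts to a round-robin assignment over a fixed palette. The sketch below is only a model of that idea: the palette values and the function `assign_colors` are hypothetical and do not reflect how Caliper's NVPROF backend is written.

```python
from itertools import cycle

# Hypothetical ARGB palette; the values are illustrative only.
PALETTE = [0xFF1F77B4, 0xFFFF7F0E, 0xFF2CA02C, 0xFFD62728]


def assign_colors(annotation_names, palette=PALETTE):
    """Give each distinct annotation name the next palette color.

    A name keeps its color on repeat occurrences; once the palette is
    exhausted, assignment wraps around to the first color.
    """
    colors = {}
    next_color = cycle(palette)
    for name in annotation_names:
        if name not in colors:
            colors[name] = next(next_color)
    return colors


colors = assign_colors(["init", "solve", "init", "io", "comm", "finalize"])
for name, argb in colors.items():
    print(f"{name}: {argb:#010x}")
```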

Data tracking not recording active memory in aggregation configuration

I have two configurations

  if(memoryEvents){
    caliper_kokkos_track_memory = true;
    additional_services+=":alloc";
  }
  cali::config_map_t default_config {
   {"CALI_RECORDER_FILENAME",fileOutput},
   {"CALI_ALLOC_RECORD_ACTIVE_MEM","true"},
   {"CALI_SERVICES_ENABLE","timestamp:event:aggregate:recorder" + additional_services}
  };
  cali::config_map_t trace_config {
   {"CALI_RECORDER_FILENAME",fileOutput},
   {"CALI_SERVICES_ENABLE","timestamp:event:trace:recorder" + additional_services}
  };

When I run in the trace mode, I see memory events and allocation sizes, everything works, but in aggregation mode I'm not seeing the allocation sizes. Any thoughts?

CALI_ATTR_HIDDEN seems underutilized

I was writing up a PR to introduce this new attribute for Caliper Attributes, CALI_ATTR_INTERNAL, which wouldn’t show up in output services, when I realized it already exists as CALI_ATTR_HIDDEN. Why aren’t cali.event.[begin/set/end] CALI_ATTR_HIDDEN, also cali.caliper.version and maybe cali.channel? This would reduce the number of times you need to specify a query for the report service.

Am I missing something there?

Outdated reference to SigsafeRWLock.h

It looks like NetOut.cpp still references SigsafeRWLock.h, which was removed in commit 207cb46. I ran into this with libcurl enabled:

[ 47%] Building CXX object src/services/netout/CMakeFiles/caliper-netout.dir/NetOut.cpp.o
/home/jistone/src/caliper/src/services/netout/NetOut.cpp:10:27: fatal error: SigsafeRWLock.h: No such file or directory
 #include <SigsafeRWLock.h>
                           ^
compilation terminated.
make[2]: *** [src/services/netout/CMakeFiles/caliper-netout.dir/NetOut.cpp.o] Error 1
make[1]: *** [src/services/netout/CMakeFiles/caliper-netout.dir/all] Error 2
make: *** [all] Error 2

Segmentation fault linking Caliper in Fortran

I'm trying to call caliper in some fortran code. I put in some basic initialization like this:

#if(USE_CALIPER==1)
         CALL caliper_mp_cali_begin_byname('initialization')
         cali_ret = 0
         CALL caliper_mp_cali_end_byname('initialization')
#endif

And yet when I run this, I get the following:

#0  0x0000000000adf6d7 in __intel_avx_rep_memcpy ()
#1  0x0000000000a65ee3 in for_trim ()
#2  0x0000000000a042a8 in caliper_mp_cali_begin_byname_ ()
#3  0x0000000000404d42 in hf_odd () at hfodd.f:1218
#4  0x0000000000404c42 in main ()
#5  0x000015554c9e4b97 in __libc_start_main (main=0x404c10 <main>, argc=1, argv=0x7fffffffb1b8, init=<optimized out>,
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffb1a8) at ../csu/libc-start.c:310
#6  0x0000000000404b2a in _start ()

To be clear, I'm compiling this with -lcaliper and -cxxlib using the Intel 2019 Fortran compiler. I build the caliper.o file separately and link it into the main executable.

Need simpler API for Caliper config

@daboehme as we discussed in person, I need a simpler API for Caliper configuration for application integration. The use case is that an application may get configured from the command line or input deck, ala:

./myapp --caliper=

An example value might be "spot,mpi,topdown".

I'd like the parsing, and the configuration settings derived from it, to live inside Caliper rather than in the application. That way we can expand Caliper capabilities without having to go back to the option parsing of each application that integrated Caliper.

We'll also need some kind of error checking for the configuration string, which is separate from the configuration itself. Many apps will want to check whether they were passed a valid string before they move on to the initialize phase, and only Caliper will know whether the string is valid.
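The separation between checking and configuring could look roughly like the sketch below. It is a mock-up of the requested behaviour, not a Caliper API: `KNOWN_OPTIONS` and `check_config_string` are hypothetical names, and the option set is simply taken from the example string.

```python
# Hypothetical set of known option names, taken from the example string above.
KNOWN_OPTIONS = {"spot", "mpi", "topdown"}


def check_config_string(config, known=KNOWN_OPTIONS):
    """Return a list of error messages; an empty list means the string is valid.

    Checking never changes any state, so an application can call it while
    parsing its command line and configure later.
    """
    errors = []
    for token in config.split(","):
        name = token.strip()
        if not name:
            errors.append("empty option name")
        elif name not in known:
            errors.append(f"unknown option: {name}")
    return errors


print(check_config_string("spot,mpi,topdown"))
print(check_config_string("spot,bogus"))
```

Returning messages rather than a boolean lets the application print something useful before it exits.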

caliper-config.cmake.in needs update

set(caliper_LIB_DIR        ${caliper_INSTALL_PREFIX}/lib)

This will often need to be caliper_INSTALL_PREFIX/lib64, not sure how you want to make that happen

Incorrect runtime-report for `CALI_ATTR_NESTED | CALI_ATTR_SCOPE_PROCESS`

Problem Description

  • Start "total" marker at top of function
  • Create lambda that runs fibonacci calc
  • Start "worker" marker before launching thread that runs fibonacci(n)
  • Stop "worker" marker after launching thread but before join()
    • Thus should report ~0.0
  • Start "master" marker that runs fibonacci(n-1)
    • This should finish before the worker thread finishes by a fairly significant factor
  • Join worker thread
  • Stop "total" marker

Caliper Usage

cali_id_t id = cali_create_attribute("timemory", CALI_TYPE_STRING, (CALI_ATTR_NESTED | CALI_ATTR_SCOPE_PROCESS));
cali_begin(id, label);
// ...
cali_end(id);

Pseudo-code

intmax_t
fibonacci(intmax_t n)
{
    return (n < 2) ? n : fibonacci(n - 1) + fibonacci(n - 2);
}

intmax_t
time_fibonacci(intmax_t n, const std::string& scope_tag, const std::string& type_tag)
{
    // caliper marker here
    return fibonacci(n);
}

intmax_t
test_caliper(intmax_t nfib, const std::string& scope_tag)
{
    std::atomic<int64_t> ret{0};  // zero-initialize before both threads accumulate into it
    auto run_fibonacci = [&](long n, const std::string& type_tag) {
        ret += time_fibonacci(n, scope_tag, type_tag);
    };

    // START MARKER:    "total"

    // START MARKER:    "worker"
    std::thread t(run_fibonacci, nfib, "worker");
    // STOP MARKER:     "worker" 

    // START MARKER:    "master"
    run_fibonacci(nfib - 1, "master");
    // STOP MARKER:     "master"

    t.join();

    // STOP MARKER:     "total"

    return ret.load();
}

Command

  • time ./test_caliper 47
    • runs fibonacci(47) on worker thread in ~8.2 seconds
    • runs fibonacci(46) on master thread in ~5 seconds

Result from Caliper

== CALIPER: Finishing ...
== CALIPER: default: Flushing Caliper data
== CALIPER: default: Aggregate: flushed 8 snapshots.
Path                                          Inclusive time Exclusive time Time %    
test_caliper[total-process-scope]                  16.431375       3.266690 19.879825 
  test_caliper[worker-thread-process-scope]         0.000032       0.000032  0.000195 
  test_caliper[master-thread-process-scope]        13.164653       8.215328 49.995341 
    time_fibonacci[process-master]                  4.949325       0.000020  0.000122 
      time_fibonacci[process-worker]                4.949305       4.949305 30.119576 
== CALIPER: Finished

Result from time

real	0m8.256s
user	0m13.113s
sys	0m0.037s

Result from timemory

> [cxx] test_caliper[total-process-scope]           :     8.216059 sec real, 1 laps, depth 0 (exclusive:  39.8%)
> [cxx] |_test_caliper[worker-thread-process-scope] :     0.000031 sec real, 1 laps, depth 1
> [cxx] |_test_caliper[master-thread-process-scope] :     4.949330 sec real, 1 laps, depth 1 (exclusive:   0.0%)
> [cxx]   |_time_fibonacci[process-master]          :     4.949297 sec real, 1 laps, depth 2
> [cxx]     |_time_fibonacci[process-worker]        :     8.215310 sec real, 1 laps, depth 3

[user]> Outputting 'timemory-test-caliper-output/user.txt'... Done

> [cxx] test_caliper[total-process-scope]           :    13.100000 sec user, 1 laps, depth 0 (exclusive:  24.7%)
> [cxx] |_test_caliper[worker-thread-process-scope] :     0.000000 sec user, 1 laps, depth 1
> [cxx] |_test_caliper[master-thread-process-scope] :     9.860000 sec user, 1 laps, depth 1 (exclusive:   0.0%)
> [cxx]   |_time_fibonacci[process-master]          :     9.860000 sec user, 1 laps, depth 2
> [cxx]     |_time_fibonacci[process-worker]        :    13.100000 sec user, 1 laps, depth 3

[sys]> Outputting 'timemory-test-caliper-output/sys.txt'... Done

> [cxx] test_caliper[total-process-scope]           :     0.020000 sec sys, 1 laps, depth 0 (exclusive:   0.0%)
> [cxx] |_test_caliper[worker-thread-process-scope] :     0.000000 sec sys, 1 laps, depth 1
> [cxx] |_test_caliper[master-thread-process-scope] :     0.020000 sec sys, 1 laps, depth 1 (exclusive:   0.0%)
> [cxx]   |_time_fibonacci[process-master]          :     0.020000 sec sys, 1 laps, depth 2 (exclusive:   0.0%)
> [cxx]     |_time_fibonacci[process-worker]        :     0.020000 sec sys, 1 laps, depth 3

[cpu_util]> Outputting 'timemory-test-caliper-output/cpu_util.txt'... Done

> [cxx] test_caliper[total-process-scope]           :   159.687500 % cpu_util, 1 laps, depth 0
> [cxx] |_test_caliper[worker-thread-process-scope] :     0.000000 % cpu_util, 1 laps, depth 1
> [cxx] |_test_caliper[master-thread-process-scope] :   199.622787 % cpu_util, 1 laps, depth 1
> [cxx]   |_time_fibonacci[process-master]          :   199.623795 % cpu_util, 1 laps, depth 2 (exclusive:  20.0%)
> [cxx]     |_time_fibonacci[process-worker]        :   159.701706 % cpu_util, 1 laps, depth 3

Error Description

This is incorrect:

test_caliper[total-process-scope]                  16.431375       3.266690 19.879825 

The CPU time was ~13 seconds and the wall-clock time was ~8.2 seconds, so the 16.43 seconds reported here includes ~3.3 seconds that neither measurement accounts for.

  test_caliper[master-thread-process-scope]        13.164653       8.215328 49.995341 

actually reports the correct values.

If I make the problem larger (time ./test_caliper 48), this results in 4.35 seconds extra time reported:

Caliper

test_caliper[total-process-scope]                  24.625204       4.345012 17.643928 
  test_caliper[worker-thread-process-scope]         0.000064       0.000064  0.000260 
  test_caliper[master-thread-process-scope]        20.280128      12.312165 49.996398 
    time_fibonacci[process-master]                  7.967963       0.000019  0.000077 
      time_fibonacci[process-worker]                7.967944       7.967944 32.355682 

time

real	0m12.325s
user	0m20.253s
sys	0m0.020s

time ./test_caliper 49 results in ~8.3 seconds of extra time reported, time ./test_caliper 50 results in ~11.9 seconds of extra time reported, and so on...
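The extra time quoted for the 47 and 48 runs can be reproduced from the numbers above, assuming it is defined as Caliper's inclusive time for the "total" marker minus the CPU time (user + sys) reported by `time`. The helper name is made up for this check.

```python
# (Caliper inclusive time of "total", user CPU, sys CPU), all in seconds,
# copied from the two runs shown above.
runs = {
    47: (16.431375, 13.113, 0.037),
    48: (24.625204, 20.253, 0.020),
}


def extra_time(inclusive, user, sys_time):
    """Reported inclusive time minus total CPU time (user + sys)."""
    return inclusive - (user + sys_time)


for n, (inclusive, user, sys_time) in runs.items():
    print(f"fibonacci({n}): {extra_time(inclusive, user, sys_time):.2f} s extra")
```

This gives about 3.28 s and 4.35 s, in line with the figures quoted in the description.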

Side Note

The labels Inclusive time and Exclusive time are misleading in this mode. In general, Inclusive time is much closer to CPU time than to wall-clock time, and Exclusive time is closer to a thread's wall-clock time.

How to Reproduce

  • Issue exists on both macOS and Linux (Ubuntu)
  • Assumes Caliper is in CMAKE_PREFIX_PATH
  • Depending on when this is reproduced, you may have to remove -b caliper below
git clone -b caliper https://github.com/NERSC/timemory.git
mkdir build-timemory
cd build-timemory
export CALI_CONFIG_PROFILE=runtime-report
cmake -DTIMEMORY_BUILD_EXAMPLES=ON ../timemory
make test_caliper && time ./test_caliper 47

mpi-trace analysis

I have an MPI program P, instrumented with Caliper.
With CALI_CONFIG_PROFILE=mpi-trace, P produces one .cali file per MPI rank.

Since the file names are randomized, how do I determine which file to use to get Caliper timings for the different hot loops within P, and how do I analyze these files?

Thanks a lot!

A question about accessing memory information

Hello there!
After reading the Caliper documentation, I still do not know how to use Caliper flexibly.
After running a program, how can I see its memory access information through Caliper? The more detailed the memory access information, the better.
Thanks very much!

[ feature request ] API for printing report

Hello,

I didn't find it in the documentation: is there an API to print the runtime-report from within the code?
I would like to print it to a file at different steps of my code so that I can follow how it evolves.

Thanks for your work,

regards,

XL.

Attribute filtering by name patterns (re or glob or..?)

Using the current whitelist feature gets us a good bit of the way, but it would be quite useful for some of the flux-core use-cases to be able to specify a whitelist, or blacklist, pattern for attribute names to include rather than the full list of explicit names. It would actually make the need for #6 rather less pressing.
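As a sketch of what glob-style filtering would mean in practice, the example below applies include and exclude patterns to a list of attribute names using Python's `fnmatch`. The function `filter_attributes` and its parameters are hypothetical; this is not an existing Caliper option.

```python
from fnmatch import fnmatchcase


def filter_attributes(names, include=("*",), exclude=()):
    """Keep names matching any include pattern and no exclude pattern."""
    return [
        name
        for name in names
        if any(fnmatchcase(name, pat) for pat in include)
        and not any(fnmatchcase(name, pat) for pat in exclude)
    ]


names = ["time.duration", "time.inclusive.duration", "cali.event.begin", "mpi.rank"]

# Whitelist by pattern: everything under "time.".
print(filter_attributes(names, include=["time.*"]))

# Blacklist by pattern: everything except Caliper-internal attributes.
print(filter_attributes(names, exclude=["cali.*"]))
```

Globs cover prefix and suffix cases without the escaping that regular expressions would require for the dots in attribute names.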

Odd fork/exec/pthread interaction behavior

It appears that some interaction between pthreads and fork/exec is causing caliper to emit records in the child during the exec phase as well as in the parent, causing duplicate records in the output. A small reduced example using just fork and exec does not exhibit this behavior, so it may be related to finalization methods on a per-thread basis.

Unknown reason for more than 5X overhead

The runtime measured by Caliper is more than 5X the actual runtime.

The setting for measurement is as follows.

Application: Cloverleaf, https://github.com/UK-MAC/CloverLeaf
Caliper configuration: CALI_CONFIG_PROFILE=thread-trace
OMP configuration:
export OMP_NUM_THREADS=8
export KMP_AFFINITY=verbose,granularity=fine,proclist=[0-7],explicit
Compiler: Intel icc/ifort
Machine: a Broadwell node

Data processing:
cali-query -s time.inclusive.duration --table a.cali

Then, aggregate by loop names.

Sample output: format is <loop name, per-iteration runtime, loop total runtime, #loop iterations>
accelerate 7877.680 236330.400 30
cell1 4471.017 536522.000 120
cell2 3791.833 455020.000 120
cell3 5261.455 1262749.100 240
cell4 9991.950 2398068.100 240
cell5 4459.765 535171.800 120
cell6 3822.062 458647.500 120
cell7 6235.129 1496431.000 240
cell8 10099.415 2423859.600 240
dt 12053.170 361595.100 30
field 6103.430 183102.900 30
flux 6289.313 188679.400 30
gas 3801.574 250903.900 66
mom10 1879.398 902111.100 480
mom11 2650.991 1272475.900 480
mom12 2572.809 1234948.500 480
mom13 3468.235 1664752.900 480
mom14 3229.828 1550317.600 480
mom1 4523.797 1085711.400 240
mom2 4513.701 1083288.300 240
mom3 3827.700 918647.900 240
mom4 3845.044 922810.600 240
mom5 1886.059 905308.200 480
mom6 2651.195 1272573.600 480
mom7 2570.735 1233952.800 480
mom8 3288.488 1578474.000 480
mom9 3222.277 1546693.100 480
pdv1 9435.020 2264404.800 240
pdv2 10848.792 2603710.000 240
revert 3050.347 91510.400 30
viscosity 7645.947 229378.400 30

Sum of loop total runtime: 3.31482e+07 usec = 33.1482 seconds
Total runtime measured by timer: 5.5733 seconds
The measured loops account for 92% of the total runtime (obtained by profiling)
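The "more than 5X" figure follows from these totals, assuming the durations are in microseconds (the unit recorded for time.inclusive.duration) and the timer value is in seconds:

```python
USEC_PER_SEC = 1_000_000

loop_total_usec = 3.31482e7  # sum of the "loop total runtime" column
timer_total_sec = 5.5733     # total runtime from the application timer (assumed seconds)

loop_total_sec = loop_total_usec / USEC_PER_SEC
ratio = loop_total_sec / timer_total_sec
print(f"{loop_total_sec:.4f} s measured, {ratio:.2f}x the timer value")
```

The run uses OMP_NUM_THREADS=8, so whether the per-loop durations are summed over all threads is worth checking before treating the whole ratio as overhead.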

I used similar settings for another benchmark before, and my impression was that Caliper overhead is very low and negligible.

Does anybody know why the overhead is so high for Cloverleaf?

Thanks a lot.

High overhead on KNL for LULESH2.0

@tgamblin @davidbeckingsale
Caliper overhead on KNL (LULESH2.0) is 20%. What would be the possible cause?

#configurations
CALI_CONFIG_PROFILE = serial-trace
ICC 17.0.4
Linked w/ iomp5 or gomp (results similar)

#threads | O3 | O3 w/ caliper instrumentation | overhead
--- | --- | --- | ---
8 | 33.77 | 42.18 | -0.199
16 | 21.55 | 25.44 | -0.153
32 | 21.81 | 24.84 | -0.122
64 | 24.83 | 28.13 | -0.117
128 | 24.98 | 27.01 | -0.075
256 | 30.72 | 32.92 | -0.067

How to sort at each node level by inclusive time or name?

Hello,

I have the following config:

CALI_SERVICES_ENABLE=aggregate,event,mpi,mpireport,timestamp
CALI_TIMER_INCLUSIVE_DURATION=true
CALI_MPIREPORT_CONFIG="select avg(sum#time.inclusive.duration) GROUP BY annotation,function FORMAT tree"

and that gives me the following result:

Path                             avg#sum#time.inclusive.duration
main                                                    5.000000
  finitialize                                       53828.000000
    cur-na3                                            13.000000
    cur-kmb                                            10.000000
    cur-kdr                                            72.000000
    cur-kdrb                                           15.000000
    cur-kdb                                             2.000000
    cur-kd2                                             2.000000
    cur-kca                                            82.000000
    cur-kap                                            20.000000
    cur-kad                                            81.000000
    cur-hd                                             57.000000
    cur-nax                                            93.000000
  load-model                                       241803.000000
  checkpoint                                            4.000000
  output-spike                                       1578.000000
  simulation                                      4748402.000000
    timestep                                      4746831.000000
      update                                         9292.000000
      second_order_cur                                539.000000
      matrix-solver                                 54642.000000
      deliver_events                                 3128.000000
      setup_tree_matrix                           1395976.000000
        cur-ProbGABAAB_EMS                         142826.000000
        cur-ProbAMPANMDA_EMS                        96334.000000
        cur-nax                                     33397.000000
        cur-na3                                      3433.000000
        cur-kmb                                      3595.000000
        cur-kdr                                     25821.000000
        cur-kdrb                                     5230.000000
        cur-kdb                                       604.000000
        cur-kd2                                       674.000000
        cur-kca                                     29835.000000
      state-update                                3277329.000000
        state-can                                  455254.000000
        state-cal                                  358501.000000
        state-cagk                                 309065.000000
        state-cacum                                 35340.000000
        state-cacumb                                 3803.000000
        state-kdrb                                  40839.000000
        state-kdb                                     663.000000
        state-kd2                                     690.000000

How can I sort the children of finitialize, setup_tree_matrix and state-update by inclusive time or even name? I tried adding ORDER BY time.duration but that didn't help.
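Independently of what the query language supports, the desired ordering can be stated precisely as "sort the children of every node, level by level". The sketch below does this as a post-processing step on the printed tree. It assumes two-space indentation per level and node names without spaces; the function names are made up, and the sample is a subset of the report above.

```python
def parse_tree(lines, indent=2):
    """Parse 'name value' lines whose nesting is given by leading spaces."""
    root = {"name": None, "value": None, "children": []}
    stack = [root]  # stack[d] is the current parent for nodes at depth d
    for line in lines:
        if not line.strip():
            continue
        depth = (len(line) - len(line.lstrip(" "))) // indent
        name, value = line.split()
        node = {"name": name, "value": float(value), "children": []}
        del stack[depth + 1:]  # drop deeper levels left over from earlier lines
        stack[depth]["children"].append(node)
        stack.append(node)
    return root


def sort_tree(node, key, reverse=False):
    """Sort the children of every node in place, level by level."""
    node["children"].sort(key=key, reverse=reverse)
    for child in node["children"]:
        sort_tree(child, key, reverse)


def render(node, depth=-1, indent=2):
    """Yield the tree back as indented 'name value' lines."""
    if node["name"] is not None:
        yield f"{' ' * (indent * depth)}{node['name']} {node['value']:f}"
    for child in node["children"]:
        yield from render(child, depth + 1, indent)


report = """\
main 5.000000
  finitialize 53828.000000
    cur-na3 13.000000
    cur-kdr 72.000000
    cur-nax 93.000000
  simulation 4748402.000000
""".splitlines()

tree = parse_tree(report)
sort_tree(tree, key=lambda n: n["value"], reverse=True)  # largest time first
print("\n".join(render(tree)))
```

Passing `key=lambda n: n["name"]` instead sorts each level alphabetically.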

MPI not saving to correct file

Hi,

I have an issue where caliper doesn't export to the correct filename.

In my submission script I have:

export CALI_RECORDER_FILENAME=$PBS_O_WORKDIR/caliper/caliper-%mpi.rank%.cali

After the application has run all I have in caliper is 1 file called: "caliper-.cali"

I am running this on a Cray XC50 system.

My submission script looks like:

#!/bin/bash
#PBS -q arm
#PBS -l select=2
#PBS -l walltime=00:10:00


export OMP_NUM_THREADS=1

export LD_LIBRARY_PATH=$HOME/local/caliper/lib64:$LD_LIBRARY_PATH
export PBS_O_WORKDIR=$(readlink -f $PBS_O_WORKDIR)
cd $PBS_O_WORKDIR
mkdir "$PBS_O_WORKDIR/caliper"

export CALI_SERVICES_ENABLE=trace,event,mpi,timestamp,recorder
export CALI_TIMER_SNAPSHOT_DURATION=true
export CALI_TIMER_INCLUSIVE_DURATION=true
export CALI_MPI_WHITELIST=all
export CALI_RECORDER_FILENAME=$PBS_O_WORKDIR/caliper/caliper-%mpi.rank%.cali
#echo $CALI_RECORDER_FILENAME
aprun -n 56 ./tealeaf

In my stderr file I have lines that have been overwritten so I think this could be down to how the job is being run from aprun.

Caliper was configured like so:

cmake -DCMAKE_INSTALL_PREFIX=$HOME/local/caliper/ -DCMAKE_C_COMPILER=/opt/cray/pe/craype/2.5.18/bin/cc -DCMAKE_CXX_COMPILER=/opt/cray/pe/craype/2.5.18/bin/CC -DWITH_MPI=On -DMPI_C_COMPILER=/opt/cray/pe/craype/2.5.18/bin/cc -DWITH_TOOLS=On -DWITH_MPI=On -DWITH_GOTCHA=Off -DWITH_SAMPLER=On ..

Any ideas what might be going on?

How to average timings across MPI ranks?

Hello,

Thank you for this tool, looks promising!

I am new to Caliper and going through the documentation. It would be great if you could help me with the following questions:

  • How do I sum timings across MPI ranks? I did the instrumentation and now I see the output below. I assume I am seeing one record for each MPI rank:
$ CALI_CONFIG_PROFILE=runtime-report mpirun -n 2 my_mpi_app

== CALIPER: (0): Flushing Caliper data
== CALIPER: (0): Aggregate: flushed 12 snapshots.
Path                       sum#time.duration
MAIN                            16782.000000
  CHECKPOINT                        2.000000
  OUTPUT_SPIKE                   8285.000000
  SOLVER                         2204.000000
    FIXED_STEP_THREAD            2060.000000
      UPDATE                      331.000000
      SOLVE_MINIMAL            317211.000000
      SETUP_TREE_MATRIX           258.000000
      LAST_PART                   420.000000
      DELIVER_NET_EVENTS          832.000000
  LOAD_MODEL                    53419.000000
Path                       sum#time.duration
MAIN                            17115.000000
  CHECKPOINT                        2.000000
  OUTPUT_SPIKE                   8303.000000
  SOLVER                         2875.000000
    FIXED_STEP_THREAD            2206.000000
      UPDATE                      311.000000
      SOLVE_MINIMAL            316482.000000
      SETUP_TREE_MATRIX           265.000000
      LAST_PART                   418.000000
      DELIVER_NET_EVENTS          761.000000
  LOAD_MODEL                    53493.000000
  • I would like to see inclusive timings. What do I need to change?

  • I am using CALI_MARK_BEGIN/END(name) in my application. Following the examples, I created caliper.config as:

→ cat caliper.config
CALI_SERVICES_ENABLE=aggregate,event,report,timestamp
CALI_EVENT_ENABLE_SNAPSHOT_INFO=false
CALI_TIMER_SNAPSHOT_DURATION=true
CALI_TIMER_INCLUSIVE_DURATION=true
CALI_REPORT_FILENAME=stderr

But I don't think I am seeing inclusive timers:

== CALIPER: (0): Aggregate: flushed 12 snapshots.
min#time.duration max#time.duration sum#time.duration avg#time.duration count annotation
       829.000000        829.000000        829.000000        829.000000     1
         4.000000      14314.000000      25887.000000       5177.400000     5 MAIN
    111795.000000     111795.000000     111795.000000     111795.000000     1 MAIN/LOAD_MODEL
         0.000000        127.000000        718.000000          0.896380   801 MAIN/SOLVER
         0.000000         26.000000       4248.000000          0.885000  4800 MAIN/SOLVER/FIXED_STEP_THREAD
         0.000000         11.000000       1414.000000          1.767500   800 MAIN/SOLVER/FIXED_STEP_THREAD/DELIVER_NET_EVENTS
         0.000000         12.000000        596.000000          0.745000   800 MAIN/SOLVER/FIXED_STEP_THREAD/SETUP_TREE_MATRIX
       909.000000       2693.000000     815823.000000       1019.778750   800 MAIN/SOLVER/FIXED_STEP_THREAD/SOLVE_MINIMAL
         0.000000         18.000000        911.000000          1.138750   800 MAIN/SOLVER/FIXED_STEP_THREAD/UPDATE
         0.000000          4.000000        605.000000          0.756250   800 MAIN/SOLVER/FIXED_STEP_THREAD/LAST_PART
     10638.000000      10638.000000      10638.000000      10638.000000     1 MAIN/OUTPUT_SPIKE
         2.000000          2.000000          2.000000          2.000000     1 MAIN/CHECKPOINT
  • I tried adding CALI_REPORT_CONFIG but I didn't get useful output:
CALI_REPORT_CONFIG="SELECT function, annotation, sum(time.duration) ORDER BY time.duration FORMAT table"
# OR
CALI_REPORT_CONFIG="SELECT *, sum(time.duration) FORMAT table ORDER BY time.inclusive.duration DESC"
== CALIPER: (0): Flushing Caliper data
== CALIPER: (0): Aggregate: flushed 12 snapshots.
annotation
MAIN
MAIN/LOAD_MODEL
annotation
MAIN
MAIN/LOAD_MODEL
MAIN/SOLVER
MAIN/SOLVER/FIXED_STEP_THREAD                    MAIN/SOLVER
MAIN/SOLVER/FIXED_STEP_THREAD
MAIN/SOLVER/FIXED_STEP_THREAD/DELIVER_NET_EVENTS
MAIN/SOLVER/FIXED_STEP_THREAD/LAST_PART
MAIN/SOLVER/FIXED_STEP_THREAD/SETUP_TREE_MATRIX
MAIN/SOLVER/FIXED_STEP_THREAD/SOLVE_MINIMAL
MAIN/SOLVER/FIXED_STEP_THREAD/DELIVER_NET_EVENTS
MAIN/SOLVER/FIXED_STEP_THREAD/LAST_PART
MAIN/SOLVER/FIXED_STEP_THREAD/SETUP_TREE_MATRIX
MAIN/SOLVER/FIXED_STEP_THREAD/SOLVE_MINIMAL
MAIN/SOLVER/FIXED_STEP_THREAD/UPDATE

MAIN/SOLVER/FIXED_STEP_THREAD/UPDATE
MAIN/OUTPUT_SPIKE
MAIN/CHECKPOINT
MAIN/OUTPUT_SPIKE
MAIN/CHECKPOINT
  • I am trying to understand how to define configurations, but I cannot make sense of the syntax. For example: CALI_SERVICES_ENABLE=event:recorder:timestamp:mpi. What does each colon-separated value mean, and how are they ordered? (Note that the services documentation uses comma-separated values.) Some more examples for common use cases would be great!
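
For the first two questions above, one option is to post-process the reports outside of Caliper. The sketch below is a minimal illustration in plain Python, not a Caliper feature. It copies values from the tables in this issue, sums and averages selected regions across the two ranks, and rolls per-path sums up into inclusive totals. It assumes that sum#time.duration holds exclusive time per path; the function names across_ranks and inclusive are hypothetical, and the row with an empty annotation is left out.

```python
from collections import defaultdict

# sum#time.duration for a few regions, one dict per rank, copied from the
# runtime-report output in this issue (in the order the ranks were printed).
per_rank = [
    {"MAIN": 16782.0, "MAIN/LOAD_MODEL": 53419.0, "MAIN/OUTPUT_SPIKE": 8285.0},
    {"MAIN": 17115.0, "MAIN/LOAD_MODEL": 53493.0, "MAIN/OUTPUT_SPIKE": 8303.0},
]


def across_ranks(records):
    """Return {path: (sum, average)} over all ranks that report the path."""
    values = defaultdict(list)
    for rank_record in records:
        for path, duration in rank_record.items():
            values[path].append(duration)
    return {p: (sum(v), sum(v) / len(v)) for p, v in values.items()}


def inclusive(exclusive_times):
    """Add each path's value to the path itself and to every ancestor path."""
    totals = defaultdict(float)
    for path, duration in exclusive_times.items():
        parts = path.split("/")
        for depth in range(1, len(parts) + 1):
            totals["/".join(parts[:depth])] += duration
    return dict(totals)


# Per-path sums from the second table (ASSUMED to be exclusive times).
exclusive = {
    "MAIN": 25887.0,
    "MAIN/LOAD_MODEL": 111795.0,
    "MAIN/SOLVER": 718.0,
    "MAIN/SOLVER/FIXED_STEP_THREAD": 4248.0,
    "MAIN/SOLVER/FIXED_STEP_THREAD/DELIVER_NET_EVENTS": 1414.0,
    "MAIN/SOLVER/FIXED_STEP_THREAD/SETUP_TREE_MATRIX": 596.0,
    "MAIN/SOLVER/FIXED_STEP_THREAD/SOLVE_MINIMAL": 815823.0,
    "MAIN/SOLVER/FIXED_STEP_THREAD/UPDATE": 911.0,
    "MAIN/SOLVER/FIXED_STEP_THREAD/LAST_PART": 605.0,
    "MAIN/OUTPUT_SPIKE": 10638.0,
    "MAIN/CHECKPOINT": 2.0,
}

rank_totals = across_ranks(per_rank)
inclusive_totals = inclusive(exclusive)

for path in sorted(inclusive_totals):
    print(f"{path:<50} {inclusive_totals[path]:>12.1f}")
```

The roll-up relies only on the "/"-separated path, so it would work the same way on output from more than two ranks once the per-rank values have been summed.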

Issue while building on OS X: "error: constexpr constructor never produces a constant expression"

I am building Caliper with Spack and see the following:

$ spack install -j 1 -v caliper ^[email protected]
....
→ spack install -v caliper ^[email protected]
==> [email protected] : externally installed in /usr/local-fake
==> [email protected] : already registered in DB
==> [email protected] : externally installed in /usr/local/Cellar/open-mpi/4.0.0
==> [email protected] : already registered in DB
==> [email protected] : externally installed in /System/Library/Frameworks/Python.framework/Versions/2.7
==> [email protected] : already registered in DB
==> Installing caliper
==> Searching for binary cache of caliper
==> Warning: No Spack mirrors are currently configured
==> No binary for caliper found: installing from source
==> Using cached archive: /Users/kumbhar/workarena/software/sources/spack/var/spack/cache/caliper/caliper-2.0.1.tar.gz
==> Warning: Fetching from mirror without a checksum!
  This package is normally checked out from a version control system, but it has been archived on a spack mirror.  This means we cannot know a checksum for the tarball in advance. Be sure that your connection to this mirror is secure!
==> Staging archive: /Users/kumbhar/workarena/software/sources/spack/var/spack/stage/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv/caliper-2.0.1.tar.gz
==> Created stage in /Users/kumbhar/workarena/software/sources/spack/var/spack/stage/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv
==> No patches needed for caliper
==> Building caliper [CMakePackage]
==> Executing phase: 'cmake'
==> [2019-03-19-20:27:18.052209] 'cmake' '/Users/kumbhar/workarena/software/sources/spack/var/spack/stage/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv/Caliper' '-G' 'Unix Makefiles' '-DCMAKE_INSTALL_PREFIX:PATH=/Users/kumbhar/workarena/software/sources/spack/opt/spack/darwin-mojave-x86_64/clang-9.0.0-apple/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv' '-DCMAKE_BUILD_TYPE:STRING=RelWithDebInfo' '-DCMAKE_VERBOSE_MAKEFILE:BOOL=ON' '-DCMAKE_FIND_FRAMEWORK:STRING=LAST' '-DCMAKE_FIND_APPBUNDLE:STRING=LAST' '-DCMAKE_INSTALL_RPATH_USE_LINK_PATH:BOOL=FALSE' '-DCMAKE_INSTALL_RPATH:STRING=/Users/kumbhar/workarena/software/sources/spack/opt/spack/darwin-mojave-x86_64/clang-9.0.0-apple/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv/lib;/Users/kumbhar/workarena/software/sources/spack/opt/spack/darwin-mojave-x86_64/clang-9.0.0-apple/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv/lib64;/usr/local/Cellar/open-mpi/4.0.0/lib' '-DCMAKE_PREFIX_PATH:STRING=/System/Library/Frameworks/Python.framework/Versions/2.7;/usr/local/Cellar/open-mpi/4.0.0;/usr/local-fake' '-DBUILD_TESTING=Off' '-DBUILD_DOCS=Off' '-DBUILD_SHARED_LIBS=On' '-DWITH_DYNINST=Off' '-DWITH_CALLPATH=Off' '-DWITH_GOTCHA=Off' '-DWITH_PAPI=Off' '-DWITH_LIBPFM=Off' '-DWITH_SOSFLOW=Off' '-DWITH_SAMPLER=Off' '-DWITH_MPI=On' '-DMPI_C_COMPILER=/usr/local/Cellar/open-mpi/4.0.0/bin/mpicc' '-DMPI_CXX_COMPILER=/usr/local/Cellar/open-mpi/4.0.0/bin/mpic++'
-- The C compiler identification is AppleClang 9.0.0.9000037
-- The CXX compiler identification is AppleClang 9.0.0.9000037
-- Check for working C compiler: /Users/kumbhar/workarena/software/sources/spack/lib/spack/env/clang/clang
-- Check for working C compiler: /Users/kumbhar/workarena/software/sources/spack/lib/spack/env/clang/clang -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /Users/kumbhar/workarena/software/sources/spack/lib/spack/env/clang/clang++
-- Check for working CXX compiler: /Users/kumbhar/workarena/software/sources/spack/lib/spack/env/clang/clang++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - found
-- Found Threads: TRUE
-- Found MPI_C: /usr/local/Cellar/open-mpi/4.0.0/lib/libmpi.dylib (found version "3.1")
-- Found MPI_CXX: /usr/local/Cellar/open-mpi/4.0.0/lib/libmpi.dylib (found version "3.1")
-- Found MPI: TRUE (found version "3.1")
-- Found PythonInterp: /System/Library/Frameworks/Python.framework/Versions/2.7/bin/python (found version "2.7.10")
-- Caliper configuration summary:
-- Caliper version           : 2.0.1
-- Build type                : RelWithDebInfo
-- Compiler                  : AppleClang 9.0.0.9000037 (/Users/kumbhar/workarena/software/sources/spack/lib/spack/env/clang/clang++)
-- System                    : Darwin-18.0.0 (x86_64)
-- Install dir               : /Users/kumbhar/workarena/software/sources/spack/opt/spack/darwin-mojave-x86_64/clang-9.0.0-apple/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv
-- Build shared libs         : On
-- Build Caliper tools       : ON
-- GOTCHA support            : No
-- PAPI support              : No
-- Libpfm support            : No
-- Libunwind support         : No
-- Dyninst support           : No
-- Sampler support           : No
-- SOSFlow support           : No
-- MPI support               : Yes, using /usr/local/Cellar/open-mpi/4.0.0/lib/libmpi.dylib
-- MPIWRAP support           : Yes, using PMPI
-- MPIT support              : No
-- OMPT support              : No
-- NVProf support            : No
-- CUpti support             : No
-- TAU support               : No
-- VTune support             : No
-- Configuring done
-- Generating done
..........

[ 40%] Building CXX object src/services/CMakeFiles/caliper-services.dir/aggregate/Aggregate.cpp.o
cd /tmp/kumbhar/spack-stage/spack-stage-IWqgNK/Caliper/spack-build/src/services && /Users/kumbhar/workarena/software/sources/spack/lib/spack/env/clang/clang++   -I/tmp/kumbhar/spack-stage/spack-stage-IWqgNK/Caliper/spack-build/include -I/Users/kumbhar/workarena/software/sources/spack/var/spack/stage/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv/Caliper/include -I/tmp/kumbhar/spack-stage/spack-stage-IWqgNK/Caliper/spack-build -I/Users/kumbhar/workarena/software/sources/spack/var/spack/stage/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv/Caliper/src -I/tmp/kumbhar/spack-stage/spack-stage-IWqgNK/Caliper/spack-build/src/services  -O2 -g -DNDEBUG -fPIC   -std=gnu++11 -o CMakeFiles/caliper-services.dir/aggregate/Aggregate.cpp.o -c /Users/kumbhar/workarena/software/sources/spack/var/spack/stage/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv/Caliper/src/services/aggregate/Aggregate.cpp
In file included from /Users/kumbhar/workarena/software/sources/spack/var/spack/stage/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv/Caliper/src/services/aggregate/Aggregate.cpp:49:
/Users/kumbhar/workarena/software/sources/spack/var/spack/stage/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv/Caliper/include/caliper/common/util/spinlock.hpp:17:15: error: constexpr constructor never produces a constant expression [-Winvalid-constexpr]
    constexpr spinlock()
              ^
/Users/kumbhar/workarena/software/sources/spack/var/spack/stage/caliper-2.0.1-abbxhgovgy4z7s3lw36rrbd4uvejyzfv/Caliper/include/caliper/common/util/spinlock.hpp:18:11: note: non-constexpr constructor 'atomic_flag' cannot be used in a constant expression
        : m_lock { 0 }
          ^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/atomic:1700:5: note: declared here
    atomic_flag(bool __b) _NOEXCEPT : __a_(__b) {} // EXTENSION
    ^
1 error generated.
make[2]: *** [src/services/CMakeFiles/caliper-services.dir/aggregate/Aggregate.cpp.o] Error 1
make[1]: *** [src/services/CMakeFiles/caliper-services.dir/all] Error 2
make: *** [all] Error 2
==> Error: ProcessError: Command exited with status 2:
    'make'

Unexpected query result when annotating code

I have a halo exchange annotated like so:

#ifdef INSTRUMENT
    CALI_MARK_BEGIN("CG_HALO_EXCHANGE"); 
#endif
    halo_update_driver(chunks, settings, settings->halo_depth);
#ifdef INSTRUMENT
    CALI_MARK_END("CG_HALO_EXCHANGE"); 
#endif

I was hoping to just query with the annotation to filter out my MPI function calls with a query like so:
cali-query -t -q "SELECT * WHERE annotation=CG_HALO_EXCHANGE, NOT mpi.function" caliper-0.cali

Yet this query seems to return quite a lot of records:

annotation        mpi.world.size  time.duration  mpi.rank  time.inclusive.duration
CG_HALO_EXCHANGE  28              53             0
CG_HALO_EXCHANGE  28              5              0
CG_HALO_EXCHANGE  28              2              0
CG_HALO_EXCHANGE  28              14             0
CG_HALO_EXCHANGE  28              1              0
CG_HALO_EXCHANGE  28              1              0
CG_HALO_EXCHANGE  28              5              0         135
CG_HALO_EXCHANGE  28              70             0
CG_HALO_EXCHANGE  28              2              0
CG_HALO_EXCHANGE  28              2              0
CG_HALO_EXCHANGE  28              26             0
CG_HALO_EXCHANGE  28              1              0
CG_HALO_EXCHANGE  28              1              0
CG_HALO_EXCHANGE  28              5              0         665

When I perform a count, I get numbers like so (query: "SELECT Count() WHERE annotation=CG_HALO_EXCHANGE, NOT mpi.function"):

count
6764
6764
6764
3382

I was expecting to see 1 line saying 3382. I get the expected output when I filter on an MPI function I have captured:

cali-query -t -q "SELECT Count() WHERE annotation=CG_HALO_EXCHANGE, mpi.function=MPI_Isend" caliper-0.cali
count
6764

Where this rank communicates with 2 other processes.

My caliper configuration looks like:

export CALI_SERVICES_ENABLE=trace,event,mpi,timestamp,recorder
export CALI_TIMER_SNAPSHOT_DURATION=true
export CALI_TIMER_INCLUSIVE_DURATION=true
export CALI_MPI_WHITELIST=all
export CALI_RECORDER_FILENAME=caliper-%mpi.rank%.cali

So what are the additional lines in the query without the MPI functions?
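
In the sample output above, only two of the rows carry a value in the time.inclusive.duration column. A minimal sketch in plain Python (not part of Caliper) that separates those rows from the rest is shown below. It assumes the whitespace-separated layout of the cali-query -t output as posted here, and the helper name split_rows is hypothetical. Whether the rows with an inclusive duration correspond to the end of each CG_HALO_EXCHANGE region is an assumption, not something confirmed in this issue.

```python
# Rows copied from the cali-query output in this issue. Columns, in order:
# annotation, mpi.world.size, time.duration, mpi.rank, time.inclusive.duration
rows = """\
CG_HALO_EXCHANGE 28 53 0
CG_HALO_EXCHANGE 28 5 0
CG_HALO_EXCHANGE 28 2 0
CG_HALO_EXCHANGE 28 14 0
CG_HALO_EXCHANGE 28 1 0
CG_HALO_EXCHANGE 28 1 0
CG_HALO_EXCHANGE 28 5 0 135
CG_HALO_EXCHANGE 28 70 0
CG_HALO_EXCHANGE 28 2 0
CG_HALO_EXCHANGE 28 2 0
CG_HALO_EXCHANGE 28 26 0
CG_HALO_EXCHANGE 28 1 0
CG_HALO_EXCHANGE 28 1 0
CG_HALO_EXCHANGE 28 5 0 665
""".splitlines()


def split_rows(lines):
    """Split rows into those with and without a fifth (inclusive) field."""
    with_inclusive, without_inclusive = [], []
    for line in lines:
        fields = line.split()
        if len(fields) == 5:
            with_inclusive.append(fields)
        else:
            without_inclusive.append(fields)
    return with_inclusive, without_inclusive


with_inclusive, without_inclusive = split_rows(rows)
inclusive_values = [int(fields[4]) for fields in with_inclusive]

print(len(with_inclusive), "rows with time.inclusive.duration:", inclusive_values)
print(len(without_inclusive), "rows without it")
```

If the assumption holds, counting only the rows that have time.inclusive.duration would give one row per region instance rather than one row per recorded snapshot.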

How does Caliper measure code region runtime?

I am using Caliper instrumentation to measure end-to-end runtime, and I am confused by the following 2 cases.

Suppose the program is instrumented as follows:

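#include <caliper/cali.h>

int main()
{
    /* outermost region: attribute "loop", value "main" */
    cali_begin_string_byname("loop", "main");
    ...

    cali_begin_string_byname("loop", "loop1");
    /* loop1 code */
    cali_end_byname("loop");

    ...
    cali_begin_string_byname("loop", "loop2");
    /* loop2 code */
    cali_end_byname("loop");
    ...
    /* closes the outermost "main" region */
    cali_end_byname("loop");
}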

After data collection, I process the data with
cali-query -s time.inclusive.duration --table a.cali

I expect the aggregated value for the Caliper entry "main" to be the end-to-end runtime.
However, I observe two cases:

  1. For some apps, the result is as expected. The entries are:
    main
    main/loop1
    main/loop2

  2. For others, there is no "main" entry, only entries like:
    main/loop1
    main/loop2

I am not sure why I see the differences between case 1 and case 2.
Could anyone explain this?
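
One possible, unconfirmed explanation for case 2 is that an inclusive duration can only be recorded once a region has ended, so an outer region whose end is never reached leaves no entry. The toy model below illustrates that behavior in plain Python. It is a sketch under that assumption and not Caliper's implementation; the class RegionTimer, the function run, and the integer timestamps are all hypothetical.

```python
class RegionTimer:
    """Toy model: an inclusive duration is recorded only when a region ends."""

    def __init__(self):
        self.stack = []    # (name, begin_time) for each open region
        self.records = {}  # region path -> accumulated inclusive duration

    def begin(self, name, now):
        self.stack.append((name, now))

    def end(self, now):
        # The path is built from all currently open regions, e.g. "main/loop1".
        path = "/".join(name for name, _ in self.stack)
        _, begin_time = self.stack.pop()
        self.records[path] = self.records.get(path, 0) + (now - begin_time)


def run(close_outer):
    """Replay the instrumentation pattern from the question with fake times."""
    timer = RegionTimer()
    timer.begin("main", 0)
    timer.begin("loop1", 1)
    timer.end(4)
    timer.begin("loop2", 5)
    timer.end(9)
    if close_outer:
        timer.end(10)  # the final end call for the outer "main" region
    return timer.records


print("outer region closed:    ", run(close_outer=True))
print("outer region left open: ", run(close_outer=False))
```

In this model the "main" entry only appears when the final end call is executed, so it may be worth checking whether the affected apps exit or flush their data before that call is reached.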

Complete source tree for release versions

Via Mike Collette:

In order to be usable in production applications, we need release versions of Caliper to have a full source tree. The pattern in which we fetch Gotcha (who writes that, anyway) from GitHub is not valid under their release paradigms.

The most likely alternative would be something like a git submodule or just including a version of the source wholesale.
