OpenCL Kernel Profiler

opencl-kernel-profiler is a perfetto-based OpenCL kernel profiler that uses the layering capability of the OpenCL-ICD-Loader.

Legal

opencl-kernel-profiler is licensed under the terms of the Apache 2.0 license.

Dependencies

opencl-kernel-profiler depends on the following:

  • OpenCL-Headers
  • OpenCL-ICD-Loader
  • perfetto

opencl-kernel-profiler also (obviously) depends on an OpenCL implementation.

Building

opencl-kernel-profiler uses CMake for its build system.

To compile it, please run:

cmake -B <build_dir> -S <path-to-opencl-kernel-profiler> -DOPENCL_HEADER_PATH=<path-to-opencl-header> -DPERFETTO_SDK_PATH=<path-to-perfetto-sdk>
cmake --build <build_dir>

For real life examples, have a look at:

Build options

  • OPENCL_HEADER_PATH (REQUIRED): path to OpenCL-Headers.
  • PERFETTO_SDK_PATH (REQUIRED): path to perfetto sdk (opencl-kernel-profiler is looking for PERFETTO_SDK_PATH/perfetto.cc and PERFETTO_SDK_PATH/perfetto.h).
  • PERFETTO_LIBRARY: name of a perfetto library that is already available (avoids having to compile perfetto.cc).
  • BACKEND: perfetto backend to use
    • InProcess (default): the application itself generates the traces (see the perfetto documentation). Build options and environment variables can be used to control the maximum size of the traces and the destination file where they are recorded.
    • System: the perfetto traced daemon is responsible for generating the traces (see the perfetto documentation).
  • TRACE_MAX_SIZE (only with InProcess backend): maximum size (in KB) of the traces that can be recorded. Can be overridden at runtime using the environment variable CLKP_TRACE_MAX_SIZE (default: 1024).
  • TRACE_DEST (only with InProcess backend): file where the traces will be recorded. Can be overridden at runtime using the environment variable CLKP_TRACE_DEST (default: opencl-kernel-profiler.trace).
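
As a rough illustration of the two runtime overrides above, the sketch below raises the trace size limit and redirects the trace file for a single run. The size, the destination path, and the application name ./my_opencl_app are placeholders chosen for this example, not values from the project.

```shell
# Allow up to 4096 KB of traces for this run instead of the build-time value.
export CLKP_TRACE_MAX_SIZE=4096
# Record the trace in a per-run file instead of the build-time destination.
export CLKP_TRACE_DEST="/tmp/my-app-$(date +%Y%m%d-%H%M%S).trace"

# Hypothetical application name; replace with the OpenCL program to profile.
APP=./my_opencl_app
if [ -x "$APP" ]; then
    "$APP"
    ls -l "$CLKP_TRACE_DEST"
else
    echo "skipping run: $APP not found"
fi
```

Both variables only apply to the InProcess backend; with the System backend the traced daemon decides where the traces go.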

Running with OpenCL Kernel Profiler

To run an application with the opencl-kernel-profiler, one needs to ensure the following points:

  • The application links with the OpenCL-ICD-Loader. If that is not the case, one can override LD_LIBRARY_PATH to point to the directory containing the libOpenCL.so that comes from the ICD Loader.
  • The ICD Loader is built with layers enabled (ENABLE_OPENCL_LAYERS=ON).
  • The ICD Loader is using the correct OpenCL implementation. If that is not the case, one can override OCL_ICD_FILENAMES to point to the appropriate OpenCL implementation library.
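
The points above can be sketched as an environment setup like the one below. Every path is a hypothetical placeholder, and the file name libopencl-kernel-profiler.so is an assumption. The use of OPENCL_LAYERS to tell the ICD Loader which layer to load is also an assumption here, since this document does not name the variable; check the OpenCL-ICD-Loader documentation for your version.

```shell
# All paths below are hypothetical placeholders.
ICD_LOADER_DIR="/opt/OpenCL-ICD-Loader/build"         # holds libOpenCL.so built with ENABLE_OPENCL_LAYERS=ON
OPENCL_IMPL_LIB="/opt/my-opencl/lib/libmy_opencl.so"  # OpenCL implementation to profile
PROFILER_LIB="/opt/opencl-kernel-profiler/build/libopencl-kernel-profiler.so"  # assumed file name

# Make the application pick up the ICD Loader's libOpenCL.so first.
export LD_LIBRARY_PATH="$ICD_LOADER_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# Point the ICD Loader at the desired OpenCL implementation.
export OCL_ICD_FILENAMES="$OPENCL_IMPL_LIB"
# Assumption: the ICD Loader loads the layers listed in OPENCL_LAYERS.
export OPENCL_LAYERS="$PROFILER_LIB"

# Report anything that is missing before launching the application.
for f in "$ICD_LOADER_DIR/libOpenCL.so" "$OPENCL_IMPL_LIB" "$PROFILER_LIB"; do
    [ -e "$f" ] || echo "warning: $f not found" >&2
done
```

With these variables exported, the application is started as usual from the same shell.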

On ChromeOS

Make sure to have emerged and deployed the opencl-icd-loader as well as the opencl-kernel-profiler.

Then run the application using opencl-kernel-profiler.sh. This script will take care of setting all the environment variables needed to run with the opencl-kernel-profiler.

Using the trace

Once traces have been generated, one can view them using the perfetto trace viewer.

It is also possible to make SQL queries using the trace_processor tool of perfetto (see the perfetto quickstart on SQL-based analysis).

Here is a simple example to extract every kernel source code from the trace:

echo "SELECT EXTRACT_ARG(arg_set_id, 'debug.string') FROM slice WHERE slice.name='clCreateProgramWithSource-args'" | ./trace_processor -q /dev/stdin <opencl-kernel-profiler.trace>
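
As a further sketch along the same lines, the query below lists the ten slice names with the largest accumulated duration. It relies only on the name and dur columns of perfetto's slice table; which slice names correspond to kernels is not specified in this document, so the result has to be interpreted by hand. The trace file name is the default TRACE_DEST, and the location of trace_processor is an assumption.

```shell
TRACE="opencl-kernel-profiler.trace"   # default TRACE_DEST; adjust if overridden
QUERY_FILE="$(mktemp)"

# Aggregate slices by name and sort by total duration.
cat > "$QUERY_FILE" <<'EOF'
SELECT name, COUNT(*) AS calls, SUM(dur) AS total_dur
FROM slice
GROUP BY name
ORDER BY total_dur DESC
LIMIT 10;
EOF

# Run the query only if both the tool and the trace are present.
if [ -x ./trace_processor ] && [ -f "$TRACE" ]; then
    ./trace_processor -q "$QUERY_FILE" "$TRACE"
else
    echo "trace_processor or $TRACE not found; query saved in $QUERY_FILE"
fi
```

Keeping the query in a file rather than piping it through /dev/stdin makes it easier to reuse across traces.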

Extracting the kernel sources without perfetto

Running an application without perfetto but with the opencl-kernel-profiler layer enabled will dump the kernel source code into the directory pointed to by CLKP_KERNEL_DIR. If CLKP_KERNEL_DIR is not set, nothing gets written to disk.
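
A minimal sketch of that workflow, assuming the layer is already enabled in the current environment. The directory name is a placeholder, and creating it up front is a precaution: this document does not say whether the layer creates it.

```shell
# Hypothetical output directory for the dumped kernel sources.
export CLKP_KERNEL_DIR="${TMPDIR:-/tmp}/clkp-kernels"
# Assumption: the layer may not create the directory itself, so create it first.
mkdir -p "$CLKP_KERNEL_DIR"

# ... run the application here with the layer enabled ...

# List whatever was written; the file naming scheme is not specified here.
ls -l "$CLKP_KERNEL_DIR"
```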

How does it work

opencl-kernel-profiler intercepts the following calls to generate perfetto traces:

  • clCreateCommandQueue: it modifies properties to enable profiling (CL_QUEUE_PROFILING_ENABLE).
  • clCreateCommandQueueWithProperties: it adds CL_QUEUE_PROPERTIES with CL_QUEUE_PROFILING_ENABLE, or just sets CL_QUEUE_PROFILING_ENABLE if CL_QUEUE_PROPERTIES is already set.
  • clCreateProgramWithSource: it creates instant traces with the program source strings and initializes internal structures.
  • clCreateKernel: it initializes internal structures.
  • clEnqueueNDRangeKernel: it registers a callback on the kernel completion. The callback creates traces for the kernel with the proper timestamps, using the timestamps coming from clGetEventProfilingInfo.

Every intercepted call also generates a trace for the function.
