
onemkl's Introduction

oneAPI Math Kernel Library (oneMKL) Interfaces


oneMKL Interfaces is an open-source implementation of the oneMKL Data Parallel C++ (DPC++) interface according to the oneMKL specification. It works with multiple devices (backends) using device-specific libraries underneath.

oneMKL is part of oneAPI.

| User Application | oneMKL Layer | Third-Party Library | Hardware Backend |
|------------------|--------------|---------------------|------------------|
| oneMKL interface | oneMKL selector | Intel(R) oneAPI Math Kernel Library (oneMKL) | x86 CPU, Intel GPU |
| | | NVIDIA cuBLAS | NVIDIA GPU |
| | | NVIDIA cuSOLVER | NVIDIA GPU |
| | | NVIDIA cuRAND | NVIDIA GPU |
| | | NVIDIA cuFFT | NVIDIA GPU |
| | | NETLIB LAPACK | x86 CPU |
| | | AMD rocBLAS | AMD GPU |
| | | AMD rocSOLVER | AMD GPU |
| | | AMD rocRAND | AMD GPU |
| | | AMD rocFFT | AMD GPU |
| | | portBLAS | x86 CPU, Intel GPU, NVIDIA GPU, AMD GPU |
| | | portFFT | x86 CPU, Intel GPU, NVIDIA GPU, AMD GPU |

Support and Requirements

Supported Usage Models:

Host API

There are two oneMKL selector layer implementations:

  • Run-time dispatching: The application is linked with the oneMKL library and the required backend is loaded at run-time based on device vendor (all libraries should be dynamic).

    Example of app.cpp with run-time dispatching:

    #include "oneapi/mkl.hpp"
    
    ...
    sycl::device cpu_dev = sycl::device(sycl::cpu_selector());
    sycl::device gpu_dev = sycl::device(sycl::gpu_selector());
    
    sycl::queue cpu_queue(cpu_dev);
    sycl::queue gpu_queue(gpu_dev);
    
    oneapi::mkl::blas::column_major::gemm(cpu_queue, transA, transB, m, ...);
    oneapi::mkl::blas::column_major::gemm(gpu_queue, transA, transB, m, ...);

    How to build an application with run-time dispatching:

    On Linux, use the icpx compiler; on Windows, use the icx compiler. Linux example:

    $> icpx -fsycl -c -I$ONEMKL/include app.cpp
    $> icpx -fsycl app.o -L$ONEMKL/lib -lonemkl
  • Compile-time dispatching: The application uses a templated backend selector API where the template parameters specify the required backends and third-party libraries and the application is linked with the required oneMKL backend wrapper libraries (libraries can be static or dynamic).

    Example of app.cpp with compile-time dispatching:

    #include "oneapi/mkl.hpp"
    
    ...
    sycl::device cpu_dev = sycl::device(sycl::cpu_selector());
    sycl::device gpu_dev = sycl::device(sycl::gpu_selector());
    
    sycl::queue cpu_queue(cpu_dev);
    sycl::queue gpu_queue(gpu_dev);
    
    oneapi::mkl::backend_selector<oneapi::mkl::backend::mklcpu> cpu_selector(cpu_queue);
    
    oneapi::mkl::blas::column_major::gemm(cpu_selector, transA, transB, m, ...);
    oneapi::mkl::blas::column_major::gemm(oneapi::mkl::backend_selector<oneapi::mkl::backend::cublas> {gpu_queue}, transA, transB, m, ...);

    How to build an application with compile-time dispatching:

    $> clang++ -fsycl -c -I$ONEMKL/include app.cpp
    $> clang++ -fsycl app.o -L$ONEMKL/lib -lonemkl_blas_mklcpu -lonemkl_blas_cublas

Refer to Selecting a Compiler for the choice between icpx/icx and clang++ compilers.

Device API

The header-based and backend-independent Device API can be called within a SYCL kernel or from host code (device-rng-usage-model-example). Currently, the following domains support the Device API:

  • RNG. To use the RNG Device API functionality, include the oneapi/mkl/rng/device.hpp header file. A sketch of the usage model follows.
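
Below is a minimal sketch of kernel-side generation with the Device API, assuming the engine and distribution names from the oneMKL specification (oneapi::mkl::rng::device::philox4x32x10, uniform, generate); treat it as an illustration rather than a verbatim excerpt from the project's examples:

    #include <sycl/sycl.hpp>
    #include "oneapi/mkl/rng/device.hpp"

    int main() {
        sycl::queue q;
        constexpr std::size_t n = 1024;
        float* r = sycl::malloc_shared<float>(n, q);

        q.parallel_for(sycl::range<1>(n), [=](sycl::item<1> item) {
            std::size_t i = item.get_id(0);
            // Each work-item constructs its own engine, offset by its global id,
            // so that work-items draw from independent subsequences.
            oneapi::mkl::rng::device::philox4x32x10<> engine(42, i);
            oneapi::mkl::rng::device::uniform<float> distr(0.0f, 1.0f);
            r[i] = oneapi::mkl::rng::device::generate(distr, engine);
        }).wait();

        sycl::free(r, q);
        return 0;
    }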

Supported Configurations:

Supported domains include: BLAS, LAPACK, RNG, DFT, SPARSE_BLAS

Supported compilers include:

  • Intel(R) oneAPI DPC++ Compiler: Intel proprietary compiler that supports CPUs and Intel GPUs. Intel(R) oneAPI DPC++ Compiler will be referred to as "Intel DPC++" in the "Supported Compiler" column of the tables below.
  • oneAPI DPC++ Compiler: Open source compiler that supports CPUs and Intel, NVIDIA, and AMD GPUs. oneAPI DPC++ Compiler will be referred to as "Open DPC++" in the "Supported Compiler" column of the tables below.
  • AdaptiveCpp Compiler (formerly known as hipSYCL): Open source compiler that supports CPUs and Intel, NVIDIA, and AMD GPUs.
    Note: The source code and some documents in this project still use the previous name hipSYCL during this transition period.

Linux*

| Domain | Backend | Library | Supported Compiler | Supported Link Type |
|--------|---------|---------|--------------------|---------------------|
| BLAS | x86 CPU | Intel(R) oneMKL | Intel DPC++, AdaptiveCpp | Dynamic, Static |
| BLAS | x86 CPU | NETLIB LAPACK | Intel DPC++, Open DPC++, AdaptiveCpp | Dynamic, Static |
| BLAS | x86 CPU | portBLAS | Intel DPC++, Open DPC++ | Dynamic, Static |
| BLAS | Intel GPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |
| BLAS | Intel GPU | portBLAS | Intel DPC++, Open DPC++ | Dynamic, Static |
| BLAS | NVIDIA GPU | NVIDIA cuBLAS | Open DPC++, AdaptiveCpp | Dynamic, Static |
| BLAS | NVIDIA GPU | portBLAS | Open DPC++ | Dynamic, Static |
| BLAS | AMD GPU | AMD rocBLAS | Open DPC++, AdaptiveCpp | Dynamic, Static |
| BLAS | AMD GPU | portBLAS | Open DPC++ | Dynamic, Static |
| LAPACK | x86 CPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |
| LAPACK | Intel GPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |
| LAPACK | NVIDIA GPU | NVIDIA cuSOLVER | Open DPC++ | Dynamic, Static |
| LAPACK | AMD GPU | AMD rocSOLVER | Open DPC++ | Dynamic, Static |
| RNG | x86 CPU | Intel(R) oneMKL | Intel DPC++, AdaptiveCpp | Dynamic, Static |
| RNG | Intel GPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |
| RNG | NVIDIA GPU | NVIDIA cuRAND | Open DPC++, AdaptiveCpp | Dynamic, Static |
| RNG | AMD GPU | AMD rocRAND | Open DPC++, AdaptiveCpp | Dynamic, Static |
| DFT | x86 CPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |
| DFT | x86 CPU | portFFT (limited API support) | Intel DPC++ | Dynamic, Static |
| DFT | Intel GPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |
| DFT | Intel GPU | portFFT (limited API support) | Intel DPC++ | Dynamic, Static |
| DFT | NVIDIA GPU | NVIDIA cuFFT | Open DPC++ | Dynamic, Static |
| DFT | NVIDIA GPU | portFFT (limited API support) | Open DPC++ | Dynamic, Static |
| DFT | AMD GPU | AMD rocFFT | Open DPC++ | Dynamic, Static |
| DFT | AMD GPU | portFFT (limited API support) | Open DPC++ | Dynamic, Static |
| SPARSE_BLAS | x86 CPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |
| SPARSE_BLAS | Intel GPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |

Windows*

| Domain | Backend | Library | Supported Compiler | Supported Link Type |
|--------|---------|---------|--------------------|---------------------|
| BLAS | x86 CPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |
| BLAS | x86 CPU | NETLIB LAPACK | Intel DPC++, Open DPC++ | Dynamic, Static |
| BLAS | Intel GPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |
| LAPACK | x86 CPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |
| LAPACK | Intel GPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |
| RNG | x86 CPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |
| RNG | Intel GPU | Intel(R) oneMKL | Intel DPC++ | Dynamic, Static |

Hardware Platform Support

  • CPU
    • Intel Atom(R) Processors
    • Intel(R) Core(TM) Processor Family
    • Intel(R) Xeon(R) Processor Family
  • Accelerators
    • Intel(R) Arc(TM) A-Series Graphics
    • Intel(R) Data Center GPU Max Series
    • NVIDIA(R) A100 (Linux* only)
    • AMD(R) GPUs (see here); tested on AMD Vega 20 (gfx906)

Supported Operating Systems

Linux*

| Backend | Supported Operating System |
|---------|----------------------------|
| x86 CPU | Red Hat Enterprise Linux* 9 (RHEL* 9) |
| Intel GPU | Ubuntu 22.04 LTS |
| NVIDIA GPU | Ubuntu 22.04 LTS |

Windows*

| Backend | Supported Operating System |
|---------|----------------------------|
| x86 CPU | Microsoft Windows* Server 2022 |
| Intel GPU | Microsoft Windows* 11 |

Software Requirements

What should I download?

General:

| Functional Testing | Build Only | Documentation |
|--------------------|------------|---------------|
| CMake (version 3.13 or newer) | CMake (version 3.13 or newer) | CMake (version 3.13 or newer) |
| Linux*: GNU* GCC 5.1 or higher; Windows*: MSVS* 2017 or MSVS* 2019 (version 16.5 or newer) | Linux*: GNU* GCC 5.1 or higher; Windows*: MSVS* 2017 or MSVS* 2019 (version 16.5 or newer) | Linux*: GNU* GCC 5.1 or higher; Windows*: MSVS* 2017 or MSVS* 2019 (version 16.5 or newer) |
| Ninja (optional) | Ninja (optional) | Ninja (optional) |
| GNU* FORTRAN Compiler | - | Sphinx |
| NETLIB LAPACK | - | - |

Hardware and OS Specific:

| Operating System | Device | Package |
|------------------|--------|---------|
| Linux*/Windows* | x86 CPU | Intel(R) oneAPI DPC++ Compiler or oneAPI DPC++ Compiler; Intel(R) oneAPI Math Kernel Library |
| Linux*/Windows* | Intel GPU | Intel(R) oneAPI DPC++ Compiler; Intel GPU driver; Intel(R) oneAPI Math Kernel Library |
| Linux* only | NVIDIA GPU | oneAPI DPC++ Compiler or AdaptiveCpp with CUDA backend and dependencies |
| Linux* only | AMD GPU | oneAPI DPC++ Compiler or AdaptiveCpp with ROCm backend and dependencies |

Product and Version Information:

| Product | Supported Version | License |
|---------|-------------------|---------|
| CMake | 3.13 or higher | The OSI-approved BSD 3-clause License |
| Ninja | 1.10.0 | Apache License v2.0 |
| GNU* FORTRAN Compiler | 7.4.0 or higher | GNU General Public License, version 3 |
| Intel(R) oneAPI DPC++ Compiler | Latest | End User License Agreement for the Intel(R) Software Development Products |
| AdaptiveCpp | Later than 2cfa530 | BSD-2-Clause License |
| oneAPI DPC++ Compiler binary for x86 CPU | Daily builds | Apache License v2 |
| oneAPI DPC++ Compiler source for NVIDIA and AMD GPUs | Daily source releases | Apache License v2 |
| Intel(R) oneAPI Math Kernel Library | Latest | Intel Simplified Software License |
| NVIDIA CUDA SDK | 12.0 | End User License Agreement |
| AMD rocBLAS | 4.5 | AMD License |
| AMD rocRAND | 5.1.0 | AMD License |
| AMD rocSOLVER | 5.0.0 | AMD License |
| AMD rocFFT | rocm-5.4.3 | AMD License |
| NETLIB LAPACK | 5d4180c | BSD like license |
| portBLAS | 0.1 | Apache License v2.0 |
| portFFT | 0.1 | Apache License v2.0 |

Documentation


Contributing

See CONTRIBUTING for more information.


License

Distributed under the Apache License 2.0. See LICENSE for more information.


FAQs

oneMKL

Q: What is the difference between the following oneMKL items?

A:

  • The oneAPI Specification for oneMKL defines the DPC++ interfaces for performance math library functions. The oneMKL specification can evolve faster and more frequently than implementations of the specification.

  • The oneAPI Math Kernel Library (oneMKL) Interfaces Project is an open source implementation of the specification. The project goal is to demonstrate how the DPC++ interfaces documented in the oneMKL specification can be implemented for any math library and work for any target hardware. While the implementation provided here may not yet be the full implementation of the specification, the goal is to build it out over time. We encourage the community to contribute to this project and help to extend support to multiple hardware targets and other math libraries.

  • The Intel(R) oneAPI Math Kernel Library (oneMKL) product is the Intel product implementation of the specification (with DPC++ interfaces) as well as similar functionality with C and Fortran interfaces, and is provided as part of Intel® oneAPI Base Toolkit. It is highly optimized for Intel CPU and Intel GPU hardware.

Q: I'm trying to use oneMKL Interfaces in my project using FetchContent, but I keep running into an "ONEMKL::SYCL::SYCL target was not found" problem when I try to build the project. What should I do?

A: Make sure you set the compiler when you configure your project. E.g. cmake -Bbuild . -DCMAKE_CXX_COMPILER=icpx.

Q: I'm trying to use oneMKL Interfaces in my project using find_package(oneMKL). I set up the oneMKL/oneTBB and compiler environment first, then built and installed oneMKL Interfaces, and finally tried to build my project against the installed oneMKL Interfaces (e.g. cmake -Bbuild -GNinja -DCMAKE_CXX_COMPILER=icpx -DoneMKL_ROOT=<path_to_installed_oneMKL_interfaces> .). I noticed that CMake adds the installed oneMKL Interfaces headers as system includes, which end up with lower priority than the oneMKL package includes I set earlier when building oneMKL Interfaces. As a result, I get conflicts between the oneMKL headers and the installed oneMKL Interfaces headers. What should I do?

A: Adding the installed oneMKL Interfaces headers with -I instead of as system includes (with -isystem) resolves this problem. We use INTERFACE_INCLUDE_DIRECTORIES to add the paths to the installed oneMKL Interfaces headers (check oneMKLTargets.cmake in lib/cmake to find it). It is a known limitation that INTERFACE_INCLUDE_DIRECTORIES adds header paths as system includes. To avoid that:

  • Option 1: Use CMake >= 3.25. In this case, oneMKL Interfaces is built with the EXPORT_NO_SYSTEM property set to true and you won't see the issue.
  • Option 2: If you use CMake < 3.25, set PROPERTIES NO_SYSTEM_FROM_IMPORTED true for your target, e.g. set_target_properties(test PROPERTIES NO_SYSTEM_FROM_IMPORTED true).

onemkl's People

Contributors

aacostadiaz, aelizaro, aidanbeltons, akabalov, andrewtbarker, andreyfe1, dnhsieh-intel, ericlars, fitchbe, fmarno, hdelan, hjabird, jasukhar, jle-quel, mcao59, mkrainiuk, mmeterel, muhammad-tanvir-1211, nadyaten, npmiller, ouadielfarouki, pasaulais, pgorlani, rbiessy, s-nick, sbalint98, sknepper, tejax-alaghari, vmalia, vrpascuzzi


onemkl's Issues

Overriding CMAKE_BUILD_TYPE

The following code:

oneMKL/CMakeLists.txt

Lines 26 to 32 in 7f40c12

option(BUILD_DEBUG "" OFF)
if(BUILD_DEBUG)
  set(CMAKE_BUILD_TYPE "Debug")
else()
  set(CMAKE_BUILD_TYPE "Release")
endif()

overrides the CMAKE_BUILD_TYPE passed on the command line and expects users to use the BUILD_DEBUG flag instead. This is not the way CMake users typically expect to build programs; some might say it is not idiomatic. In practice, it causes conflicts with Jenkins scripts and with user expectations about how the project is built.

Is there any particular reason for this? Otherwise, CMAKE_BUILD_TYPE should be enough to cover this use case...

Unit tests on Cuda device deplete the device memory

Summary

When running the unit tests on a Cuda device the tests fail since the GPU runs out of memory.

I am trying to run the tests on a gtx1080Ti with 11178MiB of global memory, but after executing the first few tests, a runtime exception is thrown because of insufficient device memory (CUDA_ERROR_OUT_OF_MEMORY) (see log below)

Version

The current oneMKL develop head is used, e.g. 1ed12c7

Environment

  • HW you use
    Intel Gold 6130 CPU with Nvidia gtx1080 GPUs
  • Backend library version
    Cuda 10.0
    MKL, and TBB obtained via intel installer version 2021.2.0
  • OS name and version
    Ubuntu 20.04 (fakeroot singularity container)
  • Compiler version
    dpc++ compiler cloned from develop with hash: 4e26734cb87c451e0562559d5d6f83b7eabcaea3
    compiled with:
    buildbot/configure.py --cuda
    and buildbot/compile.py
  • CMake
    cmake.md

Steps to reproduce

Let the CUDA-enabled DPC++ compiler be installed in <cuda-DPC++-dir>.
Configure and build oneMKL:

LD_LIBRARY_PATH=<cuda-DPC++-dir>/lib/ \
CXX=<cuda-DPC++-dir>/bin/clang++ \
CC=<cuda-DPC++-dir>/bin/clang cmake  \
-DCMAKE_BUILD_TYPE=Debug \
-DTBB_ROOT=/opt/intel/oneapi/tbb/2021.2.0/ \
-DMKL_ROOT=/opt/intel/oneapi/mkl/2021.2.0/ \
-DENABLE_CUBLAS_BACKEND=ON \
-DENABLE_CURAND_BACKEND=OFF \
-DENABLE_MKLGPU_BACKEND=OFF \
-DCMAKE_INSTALL_PREFIX=/home/sbalint/hipSYCL-main/oneMKL-install/ \
..
LD_LIBRARY_PATH=<cuda-DPC++-dir>/lib/ make -j 64
LD_LIBRARY_PATH=<cuda-DPC++-dir>/lib/:$LD_LIBRARY_PATH bin/test_main_blas_ct

Observed behavior

After the first few tests, all GPU test fail because of CUDA_ERROR_OUT_OF_MEMORY. Checking nvidia-smi while running the tests confirms that the allocated memory is continuously increasing over time. Possible memory leak?
cuda_test_out.log

Expected behavior

GPU tests shouldn't fail because of a lack of device memory

os.errno deprecated as of Python 3.7

Summary

The helper scripts for template generation -- e.g. scripts/generate_wrappers.py -- use os.errno which has been deprecated as of Python 3.7. Moreover, according to this Python bug tracker discussion, os.errno should not be used; rather, the errno module is recommended.

Version

0ecbf93f6c

Environment

Python 3.8.5

Steps to reproduce

python3.8 \
scripts/generate_wrappers.py \
include/oneapi/mkl/rng/detail/curand/onemkl_rng_newbackend.hpp \
src/rng/function_table.hpp \
src/rng/backends/newbackend/mkl_rng_newbackend_wrappers.cpp \
newbackend

Observed behavior

Running the above step results in:

<...>
Generate src/rng/backends/curand/wrappers.cpp
Formatting with clang-format src/rng/backends/curand/wrappers.cpp
Generate src/rng/backends/curand/curand_wrappers_table_dyn.cpp
Traceback (most recent call last):
  File "/devel/src/intel/onemkl/scripts/generate_wrappers.py", line 127, in <module>
    os.makedirs(os.path.dirname(table_file))
  File "/usr/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: 'src/rng/backends/curand'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/devel/src/intel/onemkl/scripts/generate_wrappers.py", line 129, in <module>
    if exc.errno != os.errno.EEXIST:
AttributeError: module 'os' has no attribute 'errno'

Expected behavior

After importing errno and replacing os.errno with errno, running the above "Steps to reproduce" should output:

<...>
Generate src/rng/backends/curand/wrappers.cpp
Formatting with clang-format src/rng/backends/curand/wrappers.cpp
Generate src/rng/backends/curand/curand_wrappers_table_dyn.cpp
Formatting with clang-format src/rng/backends/curand/curand_wrappers_table_dyn.cpp

Proposed solution

Use errno rather than os.errno. This should not require a change in the Python version (3.6) requirement.

Compilation error when building cuRAND or cuBLAS tests

Summary

When compiling oneMKL with tests and either the cuRAND or cuBLAS backend enabled, a compilation error occurs.

Version

The current oneMKL develop head is used, e.g. 1ed12c7

Environment

  • HW you use
    Intel Gold 6130 CPU with Nvidia gtx1080 GPUs
  • Backend library version
    Cuda 10.0
    MKL, and TBB obtained via intel installer version 2021.1.1
  • OS name and version
    Ubuntu 20.04
  • Compiler version
    dpc++ compiler cloned from develop with hash: 4e26734cb87c451e0562559d5d6f83b7eabcaea3
    compiled with buildbot/configure.py --cuda and buildbot/compile.py

Steps to reproduce

git clone https://github.com/oneapi-src/oneMKL.git
mkdir build && cd build

LD_LIBRARY_PATH=/root/hipSYCL-main/dpc++-hand/llvm/build/install/lib/:$LD_LIBRARY_PATH \
CXX=/root/hipSYCL-main/dpc++-hand/llvm/build/install/bin/clang++ \
CC=/root/hipSYCL-main/dpc++-hand/llvm/build/install/bin/clang \
cmake -G Ninja \
-DCMAKE_BUILD_TYPE=Debug \
-DTBB_ROOT=/root/hipSYCL-main/dpc++/tbb/latest \
-DMKL_ROOT=/root/hipSYCL-main/dpc++/mkl/latest \
-DREF_BLAS_ROOT=/root/spack/opt/spack/linux-centos7-skylake_avx512/gcc-9.3.1/openblas-0.3.14-npb5lv7dhfygc3lgh6zx3x6chlyt4kth/ \
-DENABLE_CUBLAS_BACKEND=OFF \
-DENABLE_CURAND_BACKEND=ON \
-DENABLE_MKLGPU_BACKEND=OFF ..

LD_LIBRARY_PATH=/root/hipSYCL-main/dpc++-hand/llvm/build/install/lib/:$LD_LIBRARY_PATH ninja

Observed behavior

When either of ENABLE_CURAND_BACKEND and ENABLE_CUBLAS_BACKEND is enabled on its own, the compilation fails. I believe this can be traced back to the following issues:

  • In case ENABLE_CURAND_BACKEND=ON ENABLE_CUBLAS_BACKEND=OFF
    The compilation terminates with an error. I suspect this is caused by the code in test_helper.hpp lines 70-81. When ENABLE_CURAND_BACKEND is defined, the compilation fails since TEST_RUN_NVIDIAGPU_CURAND_SELECT will be defined with the backend selector oneapi::mkl::backend::curand, and no such blas functions are defined in blas_ct_backends.hpp.
    compile_error_curand.log

  • In case ENABLE_CURAND_BACKEND=OFF ENABLE_CUBLAS_BACKEND=ON
    The compilation fails since the cuRAND tests are compiled with the cuBLAS backend selector.
    compile_error_cublas.log
    A possible workaround, in case only cuBLAS is of interest, is to comment out adding the rng domain in the root-level CMakeLists.txt. In that case the compilation is successful; only a few SYCL 2020 deprecation warnings are displayed.

Expected behavior

All combinations should compile successfully. I believe a possible fix might be to use a single CUDA backend selector instead of separate cuBLAS and cuRAND ones?

oneapi::mkl::lapack::getrf pivot vectors are wrong

Summary

The oneapi::mkl::lapack::getrf routine takes a std::int64_t pointer for the pivots, but the pivot output seems wrong: it looks like the pivots are generated as 32-bit numbers. The corresponding getrs code works with the output from oneapi::mkl::lapack::getrf.

Version

oneapi/2020.12.15.005

Environment

  • Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz

Steps to reproduce

#include <iostream>
#include <CL/sycl.hpp>
#include "oneapi/mkl.hpp"

int main(int argc, char* argv[]) {
  cl::sycl::queue q(cl::sycl::host_selector{});
  int n = 10;
  std::vector<double> A(n*n);
  for (int i=0; i<n; i++) A[i+i*n] = 1.;
  auto iwork = oneapi::mkl::lapack::getrf_scratchpad_size<double>(q, n, n, n);
  std::vector<double> work(iwork);
  std::vector<std::int64_t> piv(n);
  oneapi::mkl::lapack::getrf(q, n, n, A.data(), n, piv.data(), work.data(), iwork);
  for (auto p : piv) std::cout << p << " ";
  std::cout << std::endl;
  for (int i=0; i<n; i++) std::cout << reinterpret_cast<int*>(piv.data())[i] << " ";
  std::cout << std::endl;
}

Observed behavior

Output of the above code:
8589934593 17179869187 25769803781 34359738375 42949672969 0 0 0 0 0
1 2 3 4 5 6 7 8 9 10

Note that 8589934593 = 0x0000000200000001, i.e. the 32-bit pivots 1 and 2 packed into a single 64-bit element; that is why reinterpreting the array as int recovers 1 through 10.

Expected behavior

1 2 3 4 5 6 7 8 9 10

Investigate switching to external oneMKL DPC++ APIs for mklcpu/mklgpu backend

Summary

This issue relates to the Intel oneMKL symbols used in the mklgpu backend. Currently, internal symbols are used to call oneMKL routines. However, these symbols can change between releases, which will cause build failures in the oneMKL open-source interfaces.

Problem statement

The internal symbols can change between releases and cause build failures.

Details

Switching to external Intel oneMKL APIs will improve stability of the repo between releases.

Compilation error for `mklgpu/mklgpu_batch.cpp`

Summary

Using the name "MAJOR" here

cgh.single_task<class MAJOR>([]() {});

causes compilation problems with the version of DPC++ I have (Intel(R) oneAPI DPC++/C++ Compiler 2022.1.0 (2022.x.0.20211025)). It looks like the issue is that there's a class named MAJOR, which becomes MKL_COL_MAJOR via #define MAJOR MKL_COL_MAJOR, colliding with an enum member defined as MKL_COL_MAJOR:

typedef enum { MKL_ROW_MAJOR = 101, MKL_COL_MAJOR = 102 } MKL_LAYOUT;

As a workaround I changed class MAJOR to "class test" and it compiled fine. If there's a better workaround or I messed something else up, let me know!

Version

oneMKL git commit d06919ca0f1c675b170df94259f319eb1d020d5c

Environment

  • HW you use: Intel GPU
  • Backend library version: MKL
  • OS name and version: openSUSE 15.3
  • Compiler version: DPC++ Intel(R) oneAPI DPC++/C++ Compiler 2022.1.0 (2022.x.0.20211025)
  • CMake output log: See output below

Steps to reproduce


    git clone https://github.com/oneapi-src/oneMKL.git
    cd oneMKL
    rm -rf build
    mkdir build
    cd build
    CXX=`which dpcpp` cmake ../ -DMKL_ROOT=$MKLROOT -DREF_BLAS_ROOT=${blas_root} -DREF_LAPACK_ROOT=${blas_root} -DSYCL_LIBRARY=${SDK_ROOT}/compiler/latest/linux/lib/libsycl.so
    cmake --build . -j1

Observed behavior

When I try to build it fails with:

[  4%] Building CXX object bin/blas/backends/mklgpu/CMakeFiles/onemkl_blas_mklgpu_obj.dir/mklgpu_batch.cpp.o
In file included from /gpfs/jlse-fs0/users/bertoni/oneMKL/oneMKL/src/blas/backends/mklgpu/mklgpu_batch.cpp:34:
/gpfs/jlse-fs0/users/bertoni/oneMKL/oneMKL/src/blas/backends/mklgpu/mklgpu_batch.cxx:84:50: error: 'MKL_COL_MAJOR' does not refer to a value
    ::oneapi::mkl::gpu::sgemv_batch_sycl(&queue, MAJOR, mkl_convert(transa), m, n, alpha, &a, lda,
                                                 ^
/gpfs/jlse-fs0/users/bertoni/oneMKL/oneMKL/src/blas/backends/mklgpu/mklgpu_batch.cpp:33:15: note: expanded from macro 'MAJOR'
#define MAJOR MKL_COL_MAJOR
              ^
/tmp/mklgpu_batch-header-7d3491.h:8:7: note: declared here
class MKL_COL_MAJOR;
      ^

Expected behavior

I expect it to compile.

oneapi-src/oneMKL and RNG

[Possibly related to #8.]

I am having a troubling experience with oneAPI and oneMKL. To build my application against oneMKL from oneAPI/beta07 using intel/llvm (1b762a8), it seems that I need

set(LINK_LIBS
  mkl_sycl mkl_intel_ilp64 mkl_tbb_thread
  mkl_core tbb sycl OpenCL pthread m dl)

[UPDATE]
I can actually build with just:

mkl_sycl mkl_intel_ilp64 mkl_tbb_thread mkl_core tbb

At first, I thought I had found the perfect configuration, but it turns out that when running a simple unit test, top shows ~14GB of virtual memory being used, since I'm pulling in all these additional libraries.

I thought I'd build oneMKL from this repo against the oneAPI/beta07 oneMKL -- i.e. setting MKL_ROOT to the oneAPI/<...>/oneMKL install directory -- to see if that'd help. However, I can't even compile against the built library:

$ cmake --build .
[1/12] Linking CXX shared library x86_64-centos7-clang110-opt/lib/libSyclRng.so
FAILED: x86_64-centos7-clang110-opt/lib/libSyclRng.so 
: && /opt/modulefiles/dpcpp/dpcpp_wrapper/clang++ -fPIC -g -O2 -fsycl -std=c++17 -Wno-unknown-cuda-version -O2 -g -DNDEBUG  -Wl,--as-needed -Wl,--no-undefined -Wl,-z,max-page-size=0x1000 -Wl,--hash-style=both -shared -Wl,-soname,libSyclRng.so -o x86_64-centos7-clang110-opt/lib/libSyclRng.so FastCaloSycl/SyclRng/CMakeFiles/SyclRng.dir/src/SimHitRng.cxx.o -L/bld4/atlas/root/v6-14-08_gcc93/lib -Wl,-rpath,/bld4/atlas/root/v6-14-08_gcc93/lib:/home/vrpascuzzi/atlas/dev/build.FastCaloSim-GPU.dpcpp/x86_64-centos7-clang110-opt/lib:  x86_64-centos7-clang110-opt/lib/libSyclCommon.so  -lonemkl  -lonemkl_blas_mklcpu  -lonemkl_blas_mklgpu && :
/tmp/SimHitRng-00fcad.o: In function `SimHitRng::Dealloc()':
/home/vrpascuzzi/atlas/dev/source/FastCaloSim-GPU/FastCaloSimAnalyzer/FastCaloSycl/SyclRng/src/SimHitRng.cxx:105: undefined reference to `mkl::rng::philox4x32x10::~philox4x32x10()'
/tmp/SimHitRng-00fcad.o: In function `SimHitRng::Init(unsigned int, unsigned short, unsigned long, unsigned long long)':
/home/vrpascuzzi/atlas/dev/source/FastCaloSim-GPU/FastCaloSimAnalyzer/FastCaloSycl/SyclRng/src/SimHitRng.cxx:47: undefined reference to `mkl::rng::philox4x32x10::philox4x32x10(cl::sycl::queue&, unsigned long)'
/tmp/SimHitRng-00fcad.o: In function `SimHitRng::Generate(unsigned int)':
/home/vrpascuzzi/atlas/dev/source/FastCaloSim-GPU/FastCaloSimAnalyzer/FastCaloSycl/SyclRng/src/SimHitRng.cxx:58: undefined reference to `cl::sycl::event mkl::rng::generate<mkl::rng::uniform<float, mkl::rng::uniform_method::standard>, mkl::rng::philox4x32x10>(mkl::rng::uniform<float, mkl::rng::uniform_method::standard> const&, mkl::rng::philox4x32x10&, long, mkl::rng::uniform<float, mkl::rng::uniform_method::standard>::result_type*, std::vector<cl::sycl::event, std::allocator<cl::sycl::event> > const&)'
clang-11: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.

using now:

set(LINK_LIBS onemkl onemkl_blas_mklcpu onemkl_blas_mklgpu)

Can someone shed some light on this for me? How should I be compiling against oneMKL to get access to the RNGs?

is MKL now open?

It appears you have open-sourced the MKL library for sgemm. Thanks.

PyPI package mkl==2021.1.1 takes up too much space

Summary

The MKL library installed via pip3 install mkl==2021.1.1 has duplicate shared library (.so) entries and takes up too much space on disk, which, when built into Docker images, significantly increases the final image size.

Version

mkl == 2021.1.1 release

Environment

Ubuntu Linux 18.04/20.04
Python3.6/Python3.8
pip3 20.3.1

Steps to reproduce

pip3 install mkl==2021.1.1
pip3 show -f mkl 
ll /usr/local/lib/libmkl_*  -lh

Observed and Expected behavior

ls -lh shows the following files

-rwxr-xr-x 1 root root  47M Dec 10 20:19 /usr/local/lib/libmkl_avx2.so.1*
-rwxr-xr-x 1 root root  64M Dec 10 20:19 /usr/local/lib/libmkl_avx512_mic.so.1*
-rwxr-xr-x 1 root root  61M Dec 10 20:19 /usr/local/lib/libmkl_avx512.so.1*
-rwxr-xr-x 1 root root  50M Dec 10 20:19 /usr/local/lib/libmkl_avx.so.1*
-rwxr-xr-x 1 root root 513K Dec 10 20:19 /usr/local/lib/libmkl_blacs_intelmpi_ilp64.so*
-rwxr-xr-x 1 root root 513K Dec 10 20:19 /usr/local/lib/libmkl_blacs_intelmpi_ilp64.so.1*
-rwxr-xr-x 1 root root 310K Dec 10 20:19 /usr/local/lib/libmkl_blacs_intelmpi_lp64.so*
-rwxr-xr-x 1 root root 310K Dec 10 20:19 /usr/local/lib/libmkl_blacs_intelmpi_lp64.so.1*
-rwxr-xr-x 1 root root 514K Dec 10 20:19 /usr/local/lib/libmkl_blacs_openmpi_ilp64.so*
-rwxr-xr-x 1 root root 514K Dec 10 20:19 /usr/local/lib/libmkl_blacs_openmpi_ilp64.so.1*
-rwxr-xr-x 1 root root 315K Dec 10 20:19 /usr/local/lib/libmkl_blacs_openmpi_lp64.so*
-rwxr-xr-x 1 root root 315K Dec 10 20:19 /usr/local/lib/libmkl_blacs_openmpi_lp64.so.1*
-rwxr-xr-x 1 root root 513K Dec 10 20:19 /usr/local/lib/libmkl_blacs_sgimpt_ilp64.so*
-rwxr-xr-x 1 root root 513K Dec 10 20:19 /usr/local/lib/libmkl_blacs_sgimpt_ilp64.so.1*
-rwxr-xr-x 1 root root 310K Dec 10 20:19 /usr/local/lib/libmkl_blacs_sgimpt_lp64.so*
-rwxr-xr-x 1 root root 310K Dec 10 20:19 /usr/local/lib/libmkl_blacs_sgimpt_lp64.so.1*
-rwxr-xr-x 1 root root 166K Dec 10 20:19 /usr/local/lib/libmkl_cdft_core.so*
-rwxr-xr-x 1 root root 166K Dec 10 20:19 /usr/local/lib/libmkl_cdft_core.so.1*
-rwxr-xr-x 1 root root 129M Dec 10 20:19 /usr/local/lib/libmkl_core.so*
-rwxr-xr-x 1 root root 129M Dec 10 20:19 /usr/local/lib/libmkl_core.so.1*
-rwxr-xr-x 1 root root  40M Dec 10 20:19 /usr/local/lib/libmkl_def.so.1*
-rwxr-xr-x 1 root root  12M Dec 10 20:19 /usr/local/lib/libmkl_gf_ilp64.so*
-rwxr-xr-x 1 root root  12M Dec 10 20:19 /usr/local/lib/libmkl_gf_ilp64.so.1*
-rwxr-xr-x 1 root root  13M Dec 10 20:19 /usr/local/lib/libmkl_gf_lp64.so*
-rwxr-xr-x 1 root root  13M Dec 10 20:19 /usr/local/lib/libmkl_gf_lp64.so.1*
-rwxr-xr-x 1 root root  30M Dec 10 20:19 /usr/local/lib/libmkl_gnu_thread.so*
-rwxr-xr-x 1 root root  30M Dec 10 20:19 /usr/local/lib/libmkl_gnu_thread.so.1*
-rwxr-xr-x 1 root root  12M Dec 10 20:19 /usr/local/lib/libmkl_intel_ilp64.so*
-rwxr-xr-x 1 root root  12M Dec 10 20:19 /usr/local/lib/libmkl_intel_ilp64.so.1*
-rwxr-xr-x 1 root root  13M Dec 10 20:19 /usr/local/lib/libmkl_intel_lp64.so*
-rwxr-xr-x 1 root root  13M Dec 10 20:19 /usr/local/lib/libmkl_intel_lp64.so.1*
-rwxr-xr-x 1 root root  62M Dec 10 20:19 /usr/local/lib/libmkl_intel_thread.so*
-rwxr-xr-x 1 root root  62M Dec 10 20:19 /usr/local/lib/libmkl_intel_thread.so.1*
-rwxr-xr-x 1 root root  48M Dec 10 20:19 /usr/local/lib/libmkl_mc3.so.1*
-rwxr-xr-x 1 root root  46M Dec 10 20:19 /usr/local/lib/libmkl_mc.so.1*
-rwxr-xr-x 1 root root  40M Dec 10 20:19 /usr/local/lib/libmkl_pgi_thread.so*
-rwxr-xr-x 1 root root  40M Dec 10 20:19 /usr/local/lib/libmkl_pgi_thread.so.1*
-rwxr-xr-x 1 root root 6.8M Dec 10 20:19 /usr/local/lib/libmkl_rt.so*
-rwxr-xr-x 1 root root 6.8M Dec 10 20:19 /usr/local/lib/libmkl_rt.so.1*
-rwxr-xr-x 1 root root 7.4M Dec 10 20:19 /usr/local/lib/libmkl_scalapack_ilp64.so*
-rwxr-xr-x 1 root root 7.4M Dec 10 20:19 /usr/local/lib/libmkl_scalapack_ilp64.so.1*
-rwxr-xr-x 1 root root 7.4M Dec 10 20:19 /usr/local/lib/libmkl_scalapack_lp64.so*
-rwxr-xr-x 1 root root 7.4M Dec 10 20:19 /usr/local/lib/libmkl_scalapack_lp64.so.1*
-rwxr-xr-x 1 root root  28M Dec 10 20:19 /usr/local/lib/libmkl_sequential.so*
-rwxr-xr-x 1 root root  28M Dec 10 20:19 /usr/local/lib/libmkl_sequential.so.1*
-rwxr-xr-x 1 root root 617M Dec 10 20:19 /usr/local/lib/libmkl_sycl.so*
-rwxr-xr-x 1 root root 617M Dec 10 20:19 /usr/local/lib/libmkl_sycl.so.1*
-rwxr-xr-x 1 root root  40M Dec 10 20:19 /usr/local/lib/libmkl_tbb_thread.so*
-rwxr-xr-x 1 root root  40M Dec 10 20:19 /usr/local/lib/libmkl_tbb_thread.so.1*
-rwxr-xr-x 1 root root  15M Dec 10 20:19 /usr/local/lib/libmkl_vml_avx2.so.1*
-rwxr-xr-x 1 root root  15M Dec 10 20:19 /usr/local/lib/libmkl_vml_avx512_mic.so.1*
-rwxr-xr-x 1 root root  14M Dec 10 20:19 /usr/local/lib/libmkl_vml_avx512.so.1*
-rwxr-xr-x 1 root root  15M Dec 10 20:19 /usr/local/lib/libmkl_vml_avx.so.1*
-rwxr-xr-x 1 root root 7.4M Dec 10 20:19 /usr/local/lib/libmkl_vml_cmpt.so.1*
-rwxr-xr-x 1 root root 8.3M Dec 10 20:19 /usr/local/lib/libmkl_vml_def.so.1*
-rwxr-xr-x 1 root root  14M Dec 10 20:19 /usr/local/lib/libmkl_vml_mc2.so.1*
-rwxr-xr-x 1 root root  14M Dec 10 20:19 /usr/local/lib/libmkl_vml_mc3.so.1*
-rwxr-xr-x 1 root root  14M Dec 10 20:19 /usr/local/lib/libmkl_vml_mc.so.1*

Many of the libmkl_XXX.so and libmkl_XXX.so.1 pairs share the same checksum; each libmkl_XXX.so should instead be a symbolic link to its corresponding libmkl_XXX.so.1.

And the same problem exists with tbb==2021.1.1.

USM with sycl::half support

Hello,
I built oneMKL with cuBLAS and everything works fine; all the tests passed. But I'm still unable to pass matrices of sycl::half to oneapi::mkl::blas::column_major::gemm when using the USM syntax; the overload is not found. I get the following:

/home/michel/Documents/oneAPI_build/sample/mkl_matmult_usm.cpp:86:13: error: no matching function for call to 'gemm'
[build]             gemm(my_queue, transpose::nontrans, transpose::nontrans, m, n, k, alpha, A.get(), ldA, B.get(), ldB, beta, C.get(), ldC);
[build]             ^~~~
[build] /home/michel/sycl_workspace/deploy/include/oneapi/mkl/blas.hxx:265:20: note: candidate function not viable: no known conversion from 'sycl::detail::half_impl::half *' to 'cl::sycl::buffer<half, 1> &' (aka 'buffer<sycl::detail::half_impl::half, 1> &') for 8th argument
[build] static inline void gemm(cl::sycl::queue &queue, transpose transa, transpose transb, std::int64_t m,
[build]                    ^     ^
[build] /home/michel/sycl_workspace/deploy/include/oneapi/mkl/blas/detail/blas_ct_backends.hxx:417:20: note: candidate function not viable: no known conversion from 'sycl::queue' to 'backend_selector<backend::mklgpu>' for 1st argument
[build] static inline void gemm(backend_selector<backend::BACKEND> selector, transpose transa,
[build]                    ^
...

Am I doing something wrong? sycl::half works with the buffer syntax.

Compile-time error trying to build code and run tests

Summary

I'm seeing a compile-time error when I try to build and run the tests. Any insight or tip about what I'm doing wrong is appreciated!

Version

This is with oneMKL pulled from github, git commit 596ba0a7ce75547698a311f607629bbd4fac03ac

Environment

  • HW you use: Iris Gen9
  • Backend library version: OpenCL
  • OS name and version: linux, openSUSE, 15.2
  • Compiler version:
> dpcpp -v
Intel(R) oneAPI DPC++ Compiler 2021.1 (2020.10.0.1113)

Steps to reproduce

    blas_root=$PWD
    # first get lapack                                                                                                                                                                                                              
    git clone https://github.com/Reference-LAPACK/lapack.git
    cd lapack/
    mkdir build
    cd build/
    cmake -DBUILD_SHARED_LIBS=ON -DCBLAS=ON -DCMAKE_INSTALL_LIBDIR=${blas_root} -DCMAKE_INSTALL_PREFIX=${blas_root} ..
    cmake --build . -j4 --target install
    cd ../..

    git clone https://github.com/oneapi-src/oneMKL.git
    cd oneMKL
    rm -rf build
    mkdir build
    cd build
    CXX=`which dpcpp` cmake ../ -DMKL_ROOT=$MKLROOT -DREF_BLAS_ROOT=${blas_root} -DSYCL_LIBRARY=${SDK_ROOT}/compiler/latest/linux/lib/libsycl.so
    cmake --build . -j4
    ctest

Observed behavior

The code isn't compiling for me. Am I doing something obviously wrong? The build output ends with:

...
Scanning dependencies of target gtest_main
[ 96%] Building CXX object deps/googletest/CMakeFiles/gtest_main.dir/src/gtest_main.cc.o
[ 96%] Building CXX object tests/unit_tests/blas/level2/CMakeFiles/blas_level2_rt.dir/spr_usm.cpp.o
[ 96%] Linking CXX shared library ../../lib/libgtest_main.so
[ 96%] Built target gtest_main
[ 96%] Building CXX object tests/unit_tests/blas/level2/CMakeFiles/blas_level2_rt.dir/spr2_usm.cpp.o
Scanning dependencies of target test_main_rng_ct
[ 96%] Building CXX object tests/unit_tests/CMakeFiles/test_main_rng_ct.dir/main_test.cpp.o
Scanning dependencies of target test_main_blas_ct
[ 96%] Building CXX object tests/unit_tests/CMakeFiles/test_main_blas_ct.dir/main_test.cpp.o
Scanning dependencies of target test_main_rng_rt
[ 96%] Building CXX object tests/unit_tests/CMakeFiles/test_main_rng_rt.dir/main_test.cpp.o
[ 97%] Building CXX object tests/unit_tests/blas/level2/CMakeFiles/blas_level2_rt.dir/spmv_usm.cpp.o
[ 98%] Linking CXX executable ../../bin/test_main_rng_ct

[ FATAL ] /home/bertoni/gpu_tests/testing/source/benchmarks/conformance/r.oneMKL/oneMKL/deps/googletest/include/gtest/internal/gtest-param-util.h:562:: Condition test_param_names.count(param_name) == 0 failed. Duplicate parameterized test name 'Intel_R__Xeon_R__CPU_E3_1585_v5___3_50GHz', in /home/bertoni/gpu_tests/testing/source/benchmarks/conformance/r.oneMKL/oneMKL/tests/unit_tests/rng/statistics_check/uniform.cpp line 98

CMake Error at /soft/packaging/spack-builds/linux-rhel7-x86_64/gcc-9.3.0/cmake-3.18.2-mwdhwbhynfd7dcpegq4iq6xkzvavcmsh/share/cmake-3.18/Modules/GoogleTestAddTests.cmake:77 (message):
  Error running test executable.

    Path: '/home/bertoni/gpu_tests/testing/source/benchmarks/conformance/r.oneMKL/oneMKL/build/bin/test_main_rng_ct'
    Result: Child aborted
    Output:


Call Stack (most recent call first):
  /soft/packaging/spack-builds/linux-rhel7-x86_64/gcc-9.3.0/cmake-3.18.2-mwdhwbhynfd7dcpegq4iq6xkzvavcmsh/share/cmake-3.18/Modules/GoogleTestAddTests.cmake:173 (gtest_discover_tests_impl)


gmake[2]: *** [tests/unit_tests/CMakeFiles/test_main_rng_ct.dir/build.make:133: bin/test_main_rng_ct] Error 1
gmake[2]: *** Deleting file 'bin/test_main_rng_ct'
gmake[1]: *** [CMakeFiles/Makefile2:922: tests/unit_tests/CMakeFiles/test_main_rng_ct.dir/all] Error 2
gmake[1]: *** Waiting for unfinished jobs....

Expected behavior

I expect it to compile and run.

Update sycl:: functionality deprecated for 2020 specification

Summary

Align all sycl:: APIs with the SYCL 2020 spec version.

Problem statement

The llvm compiler has already added the new APIs and deprecated the old ones (example: get_count() for sycl::buffer), but the dpcpp compiler doesn't support the new versions yet (see the sketch after the list below).

Preferred solution

  • Update functionality marked as deprecated in sycl 2020 to the new versions
  • Remove SYCL2020_DISABLE_DEPRECATION_WARNINGS macro
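
As a hedged illustration of the kind of rename involved (not a patch to this repository), here is the deprecated buffer query from the example above next to its SYCL 2020 replacement:

#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    sycl::buffer<int, 1> buf{sycl::range<1>(8)};

    // SYCL 1.2.1 spelling, deprecated by SYCL 2020:
    //   auto n = buf.get_count();
    // SYCL 2020 replacement:
    auto n = buf.size();

    std::cout << "buffer holds " << n << " elements\n";
    return 0;
}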

Intel oneAPI Installation Not Working

I have been trying to register myself on Intel's Developer Zone to get access to the oneMKL library, but the download page is sending me back to the registration form every time. Is there any workaround to downloading oneMKL without registering first on Intel?

Couldn't load selected backend

I've compiled intel/llvm (a7ad8b8) for CUDA:

python3 ${LLVM_SRC}/buildbot/configure.py --shared-libs --cmake-gen "Unix Makefiles" --cuda -o .

and using it built oneapi-src/oneMKL (f805087) against oneAPI/mkl/beta08 (beta09 wasn't supported in this revision):

export CXX=`which clang++`
source <path_to>/mkl/2021.1-beta08/env/vars.sh
cmake -DBUILD_FUNCTIONAL_TESTS=OFF  ${ONEMKL_SRC}

I'm using CUDA 10.2 (> 10.0), so it should be compatible. To compile the test program (below):

$ clang++ -fsycl -std=c++17 -fsycl-targets=nvptx64-nvidia-cuda-sycldevice \
  -Wno-unknown-cuda-version -DSYCL_TARGET_CUDA test_mkl.cc -lonemkl

the test compiles cleanly, but when running I get:

 $ SYCL_DEVICE_FILTER=cuda:* ./a.out 
Running on: GeForce RTX 2080 SUPER
terminate called after throwing an instance of 'oneapi::mkl::backend_not_found'
  what():  oneMKL: Couldn't load selected backend
Aborted (core dumped)

When adding -lonemkl_rng_mklgpu to the compile line:

$ clang++ -fsycl -std=c++17 -fsycl-targets=nvptx64-nvidia-cuda-sycldevice -Wno-unknown-cuda-version -DSYCL_TARGET_CUDA test_mkl.cc -lonemkl -lonemkl_rng_mklgpu
//bld4/opt/intel/inteloneapi/mkl/2021.1-beta08/lib/intel64/libmkl_sycl.so: undefined reference to `cl::sycl::level0::make_platform(unsigned long)'
//bld4/opt/intel/inteloneapi/mkl/2021.1-beta08/lib/intel64/libmkl_sycl.so: undefined reference to `cl::sycl::context::context(cl::sycl::device const&, std::function<void (cl::sycl::exception_list)>, bool)'
//bld4/opt/intel/inteloneapi/mkl/2021.1-beta08/lib/intel64/libmkl_sycl.so: undefined reference to `cl::sycl::level0::make_device(cl::sycl::platform const&, unsigned long)'
//bld4/opt/intel/inteloneapi/mkl/2021.1-beta08/lib/intel64/libmkl_sycl.so: undefined reference to `clCreateProgramWithIL'
//bld4/opt/intel/inteloneapi/mkl/2021.1-beta08/lib/intel64/libmkl_sycl.so: undefined reference to `cl::sycl::level0::make_program(cl::sycl::context const&, unsigned long)'
//bld4/opt/intel/inteloneapi/mkl/2021.1-beta08/lib/intel64/libmkl_sycl.so: undefined reference to `cl::sycl::level0::make_queue(cl::sycl::context const&, unsigned long)'
clang-12: error: linker command failed with exit code 1 (use -v to see invocation)

N.B. In Intel GPU tests I also see these linker errors and so do not use -lonemkl_rng_mklgpu during compilation.

Another thought was to enable CUBLAS (a long shot) in the oneMKL build, but this doesn't compile at all:

$ cmake -DBUILD_FUNCTIONAL_TESTS=OFF -DENABLE_CUBLAS_BACKEND=on ${ONEMKL_SRC}
<...>
$ cmake --build . -- -j70
<...>
In file included from /opt/dpcpp/2020.10.02-a7ad8b8-cuda/bin/../include/sycl/CL/sycl/detail/common.hpp:11:
/opt/dpcpp/2020.10.02-a7ad8b8-cuda/bin/../include/sycl/CL/cl_ext_intel.h:431:9: error: unknown type name 'cl_properties'
typedef cl_properties cl_mem_properties_intel;
        ^
In file included from /home/vrpascuzzi/sw/intel/oneMKL/src/blas/backends/cublas/cublas_scope_handle.cpp:19:
In file included from /home/vrpascuzzi/sw/intel/oneMKL/src/blas/backends/cublas/cublas_scope_handle.hpp:21:
In file included from /opt/dpcpp/2020.10.02-a7ad8b8-cuda/bin/../include/sycl/CL/sycl.hpp:11:
In file included from /opt/dpcpp/2020.10.02-a7ad8b8-cuda/bin/../include/sycl/CL/sycl/ONEAPI/atomic.hpp:11:
In file included from /opt/dpcpp/2020.10.02-a7ad8b8-cuda/bin/../include/sycl/CL/sycl/ONEAPI/atomic_accessor.hpp:11:
In file included from /opt/dpcpp/2020.10.02-a7ad8b8-cuda/bin/../include/sycl/CL/sycl/ONEAPI/atomic_enums.hpp:12:
In file included from /opt/dpcpp/2020.10.02-a7ad8b8-cuda/bin/../include/sycl/CL/sycl/access/access.hpp:10:
In file included from /opt/dpcpp/2020.10.02-a7ad8b8-cuda/bin/../include/sycl/CL/sycl/detail/common.hpp:121:
In file included from /opt/dpcpp/2020.10.02-a7ad8b8-cuda/bin/../include/sycl/CL/sycl/exception.hpp:15:
/opt/dpcpp/2020.10.02-a7ad8b8-cuda/bin/../include/sycl/CL/sycl/detail/pi.h:228:7: error: use of undeclared identifier 'CL_DEVICE_QUEUE_ON_DEVICE_PROPERTIES'
      CL_DEVICE_QUEUE_ON_DEVICE_PROPERTIES,
      ^
/opt/dpcpp/2020.10.02-a7ad8b8-cuda/bin/../include/sycl/CL/sycl/detail/pi.h:229:45: error: use of undeclared identifier 'CL_DEVICE_QUEUE_ON_HOST_PROPERTIES'
  PI_DEVICE_INFO_QUEUE_ON_HOST_PROPERTIES = CL_DEVICE_QUEUE_ON_HOST_PROPERTIES,
                                            ^
/opt/dpcpp/2020.10.02-a7ad8b8-cuda/bin/../include/sycl/CL/sycl/detail/pi.h:233:31: error: use of undeclared identifier 'CL_DEVICE_IL_VERSION_KHR'
  PI_DEVICE_INFO_IL_VERSION = CL_DEVICE_IL_VERSION_KHR,
<...>

plus a load of other related errors.

Thanks,
Vince

##############
# test_mkl.cc
##############

#include <math.h>

#include <CL/sycl.hpp>
#include <iostream>
#include <oneapi/mkl.hpp>
#include <vector>

// Value to initialize random number generator
#define SEED 7777777

// Value of Pi with many exact digits to compare with estimated value of Pi
#define PI 3.1415926535897932384626433832795

#ifdef SYCL_TARGET_CUDA
class CUDASelector : public cl::sycl::device_selector {
 public:
  int operator()(const cl::sycl::device& device) const override {
    const std::string device_vendor = device.get_info<cl::sycl::info::device::vendor>();
    const std::string device_driver =
        device.get_info<cl::sycl::info::device::driver_version>();
    const std::string device_name = device.get_info<cl::sycl::info::device::name>();

    if (device.is_gpu() &&
        (device_vendor.find("NVIDIA") != std::string::npos) &&
        (device_driver.find("CUDA") != std::string::npos) &&
        (device_name.find("2080") != std::string::npos)) {
      return 1;
    };
    return -1;
  }
};
#endif

// Gets the target device, as defined by the build configuration.
static inline cl::sycl::device GetTargetDevice() {
  cl::sycl::device dev;
#if defined SYCL_TARGET_CUDA
  CUDASelector cuda_selector;
  try {
    dev = cl::sycl::device(cuda_selector);
  } catch (...) {
  }
#elif defined SYCL_TARGET_DEFAULT
  dev = cl::sycl::device(cl::sycl::default_selector());
#elif defined SYCL_TARGET_CPU
  dev = cl::sycl::device(cl::sycl::cpu_selector());
#elif defined SYCL_TARGET_GPU
  dev = cl::sycl::device(cl::sycl::gpu_selector());
#else
  dev = cl::sycl::device(cl::sycl::host_selector());
#endif

  return dev;
}

void test_rng(size_t n_points) {
  auto exception_handler = [](cl::sycl::exception_list exceptions) {
    for (std::exception_ptr const& e : exceptions) {
      try {
        std::rethrow_exception(e);
      } catch (cl::sycl::exception const& e) {
        std::cout << "Caught asynchronous SYCL exception:\n"
                  << e.what() << std::endl;
      }
    }
  };

  // Choose device to run on and create queue
  cl::sycl::device dev = GetTargetDevice();
  cl::sycl::queue queue(dev, exception_handler);

  // Create usm allocator
  cl::sycl::usm_allocator<float, cl::sycl::usm::alloc::shared> allocator(
      queue.get_context(), queue.get_device());

  // Allocate storage for random numbers
  std::vector<float, decltype(allocator)> x(n_points, allocator);

  std::cout << "Running on: "
            << queue.get_device().get_info<cl::sycl::info::device::name>()
            << std::endl;

  try {
    // Generator initialization
    oneapi::mkl::rng::philox4x32x10 engine(queue, SEED);
    oneapi::mkl::rng::uniform<float> distr(0.0f, 1.0f);

    oneapi::mkl::rng::generate(distr, engine, n_points, x.data());
    // wait to finish generation
    queue.wait_and_throw();
  } catch (cl::sycl::exception const& e) {
    std::cout << "\t\tSYCL exception \n" << e.what() << std::endl;
  }
}

int main() {
  size_t n_points = 120000000;

  test_rng(n_points);

  return 0;
}

Rng mklcpu backend uses ambiguous kernel names

Summary

In the rng mklcpu backend, in philox4x32x10.cpp, the same kernel names are generated for both the USM and buffer APIs. This does not cause an error when using dpc++, but according to the SYCL standard it is illegal, and it causes problems with some SYCL implementations (tested with hipSYCL).

Observed behavior

For example, the kernels at lines 328 and 68 in philox4x32x10.cpp appear to have the same name. This is a result of the combination of the type of distr and philox4x32x10_impl being used as the kernel name. Since kernels are declared for both the USM and buffer interfaces with the same distr type, this results in ambiguous kernel names.

Expected behavior

All kernels should have a unique name.
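
A minimal sketch of one way to achieve this, using hypothetical tag types to fold the API path into the kernel name (an illustration of the technique, not the actual oneMKL fix):

#include <sycl/sycl.hpp>

// Hypothetical tag types distinguishing the two API paths.
struct buffer_api;
struct usm_api;

// The kernel name encodes the API path in addition to the distribution and
// engine types, so the two submissions below get distinct names even with
// identical distribution/engine template arguments.
template <typename Distr, typename Engine, typename ApiTag>
class rng_kernel;

int main() {
    sycl::queue q;
    q.single_task<rng_kernel<float, int, buffer_api>>([] {});
    q.single_task<rng_kernel<float, int, usm_api>>([] {});
    q.wait();
    return 0;
}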

Building with CMake fails in installation

Hi there,

I am trying to install oneMKL using CMake on Ubuntu 18.04.
I have successfully finished the build step, and ctest passes all the tests; however, when I attempt to install using "cmake --install . --prefix ../install", it fails with the following error:

-- Up-to-date: /data4/salar/oneMKL/install/include
-- Up-to-date: /data4/salar/oneMKL/install/include/oneapi
-- Up-to-date: /data4/salar/oneMKL/install/include/oneapi/mkl.hpp
<... many more "-- Up-to-date" lines for the installed headers and CMake config files ...>
-- Up-to-date: /data4/salar/oneMKL/install/include/oneapi/mkl/detail/config.hpp
CMake Error at bin/cmake_install.cmake:45 (file):
  file INSTALL cannot find
  "/data4/salar/oneMKL/build/bin/CMakeFiles/CMakeRelink.dir/libonemkl.so.0":
  No such file or directory.
Call Stack (most recent call first):
  cmake_install.cmake:73 (include)

I would really appreciate it if you could help me figure out what the issue might be.

CMake issue, -fsycl-unnamed-lambda passed to C/Fortran compiler

If I add

find_package(MKL REQUIRED)
target_link_libraries(mylib PUBLIC MKL::MKL_DPCPP)

in CMakeLists.txt and run CMake with -DCMAKE_CXX_COMPILER=dpcpp,
then I get these errors:

gfortran: error: unrecognized command-line option '-fsycl-unnamed-lambda'
gcc: error: unrecognized command-line option '-fsycl-unnamed-lambda'

It does work when I add this -DCMAKE_C_COMPILER=icx -DCMAKE_Fortran_COMPILER=ifx when running CMake
but it still gives these warnings:

clang-13: warning: argument unused during compilation: '-fsycl-unnamed-lambda' [-Wunused-command-line-argument]
ifx: command line warning #10006: ignoring unknown option '-fsycl-unnamed-lambda'

Error compiling a sample example for LLVM compiler and NVIDIA

Summary

I am trying to use oneMKL and the open-source LLVM compiler on an NVIDIA GPU; the build and testing processes went fine. However, compiling a program that calls the "oneapi::mkl::blas::column_major::gemm" function fails at the linking step.

Version

Environment

Steps to reproduce

I also include the build process to confirm that everything is OK:

mkdir build && cd build
export CXX=~/sycl_workspace/llvm/build/bin/clang++ # path to the LLVM compiler
cmake .. -DENABLE_CUBLAS_BACKEND=True -DENABLE_MKLCPU_BACKEND=False -DENABLE_MKLGPU_BACKEND=False
cmake --build .
ctest # 100% tests pass
mkdir ~/sycl_workspace/llvm/build/include/oneMKL
cmake --install . --prefix ~/sycl_workspace/llvm/build/include/oneMKL

Now I have tried to compile the following example:

#include <CL/sycl.hpp>
#include <iostream>
#include "oneapi/mkl.hpp"

using namespace std;
using namespace cl::sycl;

// Matrix size constants
#define SIZE 4800  // Must be a multiple of 8.
#define M SIZE / 8
#define N SIZE / 4
#define P SIZE / 2

class CUDASelector : public cl::sycl::device_selector {
  public:
    int operator()(const cl::sycl::device &Device) const override {
      //using namespace cl::sycl::info;
      const std::string DriverVersion = Device.get_info<info::device::driver_version>();

      if (Device.is_gpu() && (DriverVersion.find("CUDA") != std::string::npos)) {
       // std::cout << " CUDA device found " << std::endl;
        return 1;
      };
      return 0;
    }
};


int main() {
  oneapi::mkl::transpose transA = oneapi::mkl::transpose::nontrans;
  oneapi::mkl::transpose transB = oneapi::mkl::transpose::nontrans;

  // matrix data sizes
  int m = M;
  int n = P;
  int k = N;

  // leading dimensions of data
  int ldA = m;
  int ldB = k;
  int ldC = m;

  // set scalar fp values
  float alpha = 1.0;
  float beta = 0.0;

  CUDASelector Selector;
  cl::sycl::queue device_queue(Selector);
  std::cout << "Running on " << device_queue.get_device().get_info<sycl::info::device::name>() << std::endl;

  // 1D arrays on host side

  float* A = malloc_shared<float>(M*N, device_queue);
  float* B = malloc_shared<float>(N*P, device_queue);
  float* C = malloc_shared<float>(M*P, device_queue);
  
  // prepare matrix data with column-major style
  int i, j;
  // A(M, N) is a matrix whose values are column number plus one
  for (i = 0; i < N; i++)
    for (j = 0; j < M; j++) A[i * M + j] = i + 1.0;

  // B(N, P) is matrix whose values are row number plus one
  for (i = 0; i < P; i++)
    for (j = 0; j < N; j++) B[i * N + j] = j + 1.0;

  cout << "Problem size: c(" << M << "," << P << ") = a(" << M << "," << N
       << ") * b(" << N << "," << P << ")" << std::endl;

  oneapi::mkl::blas::column_major::gemm(device_queue, transA, transB, m, n, k, alpha, A, ldA, B,
                    ldB, beta, C, ldC);

  free(A, device_queue);
  free(B, device_queue);
  free(C, device_queue);

  return 0;
}

Observed behavior

I compiled with:
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -I$ONEMKL/include mkl.cpp

Where "$ONEMKL" is an env var that stores the library path (in my case: "~/sycl_workspace/llvm/build/include/oneMKL"). That gets the following error:

warning: linking module '/home/user/sycl_workspace/llvm/build/lib/clang/14.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc': Linking two modules of different target triples: '/home/user/sycl_workspace/llvm/build/lib/clang/14.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc' is 'nvptx64-unknown-nvidiacl' whereas 'mkl.cpp' is 'nvptx64-nvidia-cuda'
 [-Wlinker-warnings]
1 warning generated.
/usr/bin/ld: /tmp/mkl-4ee840.o: in function `oneapi::mkl::blas::column_major::gemm(cl::sycl::queue&, oneapi::mkl::transpose, oneapi::mkl::transpose, long, long, long, float, float const*, long, float const*, long, float, float*, long, std::vector<cl::sycl::event, std::allocator<cl::sycl::event> > const&)':
mkl-a9d4b7.cpp:(.text+0x960): undefined reference to `oneapi::mkl::blas::column_major::detail::gemm(oneapi::mkl::device, cl::sycl::queue&, oneapi::mkl::transpose, oneapi::mkl::transpose, long, long, long, float, float const*, long, float const*, long, float, float*, long, std::vector<cl::sycl::event, std::allocator<cl::sycl::event> > const&)'
clang-14: error: linker command failed with exit code 1 (use -v to see invocation)

Do you have any idea why "oneapi::mkl::blas::column_major::detail::gemm" is not defined?

Expected behavior

Compile without errors.

Refactor build rules for unit tests

Summary

Enabling BUILD_FUNCTIONAL_TESTS compiles unit tests in both rng and blas domains.

Problem statement

If working in a single domain with BUILD_FUNCTIONAL_TESTS enabled, irrelevant unit tests are compiled, adding a non-negligible overhead in build times. For example, if one is working exclusively in the rng domain and enables BUILD_FUNCTIONAL_TESTS, blas unit tests are compiled and the total build time is substantially greater than if only rng domain unit tests (the relevant tests) were compiled.

# `blas' + `rng' unit tests
$ time ( cmake --build <...> -- -j16 )
<...>
real	10m59.050s
user	158m41.756s
sys	2m28.292s

vs.

# `rng' unit tests
$ time ( cmake --build <...> -- -j16 )
<...>
real	1m24.986s
user	9m7.986s
sys	0m11.170s

Of course, going the other direction -- i.e. working in the blas domain and compiling rng domain tests -- does not add much more compilation time.

Details

The top-level CMakeLists.txt adds the tests subdirectory if BUILD_FUNCTIONAL_TESTS is enabled, then tests/CMakeLists.txt builds the GoogleTest infrastructure and adds the unit_tests subdirectory.

Furthermore, tests/unit_tests/CMakeLists.txt currently handles all the condition checking for both domains. The checks can quickly become difficult to manage when adding additional back-ends. For example, rng domain tests aren't built if either the cublas or netlib back-end is enabled. Should one wish to add an additional CUDA-based back-end (say, for cuRAND), this could be an issue; it is conceivable one would want to build tests for both cuBLAS and cuRAND.

Proposed solution(s)

Some possibilities:

  1. Clean up tests/unit_tests/CMakeLists.txt
    Use conditionals in a cleaner way to separate domain-specific rules more generally.
  2. Split tests by domain, keeping existing cmake options
    Refactor the existing CMakeLists.txt files to build unit tests based on whether blas or rng options are specified.
  3. Split tests by domain, modifying existing cmake options
    Similar to the previous solution, but also introducing domain-specific options BUILD_FUNCTIONAL_TESTS_BLAS and BUILD_FUNCTIONAL_TESTS_RNG (or some variation of the names), and keeping BUILD_FUNCTIONAL_TESTS as a catch-all.

Other ideas are of course welcome.

[CUDA] MKL and RNG

Hi,

After spending some time getting oneMKL to build with cuBLAS support [1], I can now use oneMKL with CUDA devices. However, I overlooked the fact that this brings in only the cuBLAS backend, and not the random-number generators available for Intel hardware.

Is there any plan to integrate RNGs for CUDA devices, e.g. with a cuRAND backend? While I understand CUDA support is in general experimental, I test my codes on both Intel and NVIDIA GPUs to ensure the software runs on both platforms, i.e. we are aiming for heterogeneous solutions.
I'd be very interested to help in such an effort, but don't want to reinvent the wheel if something is already in the works.

Thanks.

[1] intel/llvm#1548

Could NOT find cuBLAS (missing CUBLAS_INCLUDE_DIR)

Summary

Building with the cublas back-end fails to find cuBLAS library:

$ cmake -DBUILD_FUNCTIONAL_TESTS=ON -DENABLE_MKLGPU_BACKEND=OFF -DENABLE_MKLCPU_BACKEND=OFF -DENABLE_CUBLAS_BACKEND=ON -DENABLE_CURAND_BACKEND=OFF -DOPENCL_INCLUDE_DIR=/opt/khronos/ocl-headers/include -DREF_BLAS_ROOT=/opt/netlib/lapack/3.9.0 $SRCDIR
-- CMAKE_BUILD_TYPE: None, set to Release by default
-- The CXX compiler identification is Clang 12.0.0
-- Check for working CXX compiler: /opt/intel/llvm/2020.12.27-6ca33e2df283-cuda/bin/clang++
-- Check for working CXX compiler: /opt/intel/llvm/2020.12.27-6ca33e2df283-cuda/bin/clang++ -- works
<...>
-- Found CUDA: /opt/nvidia/cuda/10.2 (found suitable version "10.2", minimum required is "10.0") 
CMake Error at /usr/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message):
  Could NOT find cuBLAS (missing: CUBLAS_INCLUDE_DIR)
Call Stack (most recent call first):
  /usr/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393 (_FPHSA_FAILURE_MESSAGE)
  cmake/FindcuBLAS.cmake:39 (find_package_handle_standard_args)
  src/blas/backends/cublas/CMakeLists.txt:22 (find_package)
<...>

causing the CMake configuration to fail. This is because CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES is not defined by default.

Version

0.1.0, a99dde8

Environment

CUDA: cuda/10.2
Compiler: intel/llvm@6ca33e2 (compiled with CUDA support)

Steps to reproduce

Setup environment and use the command in "Summary".

Observed behavior

cmake configuration fails. (See "Summary".)

Expected behavior

$ cmake <...> -DENABLE_CUBLAS_BACKEND=ON  <...>
<...>
-- Found cuBLAS: /opt/nvidia/cuda/10.2/targets/x86_64-linux/include
<...>

Solution

The preferred method is to add enable_language(CUDA) to cmake/FindcuBLAS.cmake:

diff --git a/cmake/FindcuBLAS.cmake b/cmake/FindcuBLAS.cmake
index 06fe6fe..f79d571 100644
--- a/cmake/FindcuBLAS.cmake
+++ b/cmake/FindcuBLAS.cmake
@@ -18,6 +18,7 @@
 #=========================================================================
 
 find_package(CUDA 10.0 REQUIRED)
+enable_language(CUDA)
 find_path(CUBLAS_INCLUDE_DIR "cublas_v2.h" HINTS ${CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES})
 get_filename_component(SYCL_BINARY_DIR ${CMAKE_CXX_COMPILER} DIRECTORY)
 # the OpenCL include file from cuda is opencl 1.1 and it is not compatible with DPC++

Killed by signal 9 Problem about running HPL test of MKL in intel/oneapi-hpckit docker container

I want to run the HPL test script in a Docker container. I start the container with the command below:

docker run -d --privileged -it intel/oneapi-hpckit

Then I enter the container and run the following commands:

$ cd /opt/intel/oneapi
$ source setvars.sh --force
$ cd /opt/intel/oneapi/mkl/latest/benchmarks/mp_linpack
$ ./runme_intel64_dynamic

When I run the HPL test of MKL in the intel/oneapi-hpckit Docker container, an error like the one below occurs:

[error output screenshot attached]

My docker info:

[screenshot attached]

My HPL.dat file:

[screenshot attached]

Building problems

Hello.

I'm getting the following error in the building process, more specifically in the first cmake command.

CMake Error at /usr/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message): Could NOT find CBLAS (missing: CBLAS_file)

I have built lapack from the link listed in the dependencies section of the instructions.

Does anyone know how I can fix this?

Enable context caching for cuBLAS backend to improve performance

Summary

As noted in #106, performance with the cuBLAS backend is low and can be improved if the CUDA context is cached.
Opening this issue as a tracker.

Version

Appears in latest.

Environment

cuBLAS backend

Steps to reproduce

See #106

Observed behavior

Running the cuBLAS backend through oneMKL is slower than running cuBLAS directly.

Expected behavior

Running the cuBLAS backend through oneMKL should match or come very close to pure cuBLAS performance.

[CUDA] Add support for cuSparse

oneMKL provides support for sparse BLAS operations, and NVIDIA provides an interface to sparse BLAS operations optimized for their hardware in the cuSparse library. Much as oneMKL provides support for cuBLAS, I propose it add support for cuSparse.

The unified interface provided by oneMKL will be much more convenient for users interested in performance portability across platform sets that include NVIDIA GPUs.

The mechanisms and design work that are relevant for cuBLAS are likely to be reusable for this work. Because sparse interfaces are generally younger and more flexible, it may be that this is best done incrementally.

Enable using oneMKL with hipSYCL

Summary

With a set of Pull requests, we would like to upstream changes to enable the use of oneMKL BLAS with hipSYCL. The changes and added features encompass the following:

  • PR #101: in case of not compiling with DPC++, use the add_sycl_to_target CMake integration
  • PR #102: add the option to disable the functions using half data types
  • PR #103: use an additional layer of abstraction when invoking host-tasks for the cuBLAS backend
  • Add a hipSYCL-specific cuBLAS context handler and host-task invocation
  • PR #100, PR #104, PR #105: replace non-standard types and member function calls, e.g. half -> cl::sycl::half and .get_cl_code() -> .what()

Problem statement

Our intention is to make oneMKL more available to the general SYCL community. These changes will allow easier integration with other SYCL implementations and enable adding ROCm libraries to oneMKL.

If the PRs #100, #101, #102, #104, and #105 are merged, adding the actual hipSYCL support could look like this: https://github.com/sbalint98/oneMKL/tree/ustream-hipsycl-specific-changes .

When #100-#103 are merged into the current develop (e8e3dab), all tests pass locally: int_test_oneapi.log

Static library version of libonemkl.so

Will a static library version of the dispatcher libonemkl.so (with the same PIC versions of the objects as in the shared library) be provided? I'm not sure if linking with a static library makes sense for oneMKL, but even if it doesn't, this static library may be useful for building custom oneMKL shared libraries, similar to this for regular Intel MKL.

https://software.intel.com/content/www/us/en/develop/documentation/mkl-linux-developer-guide/top/linking-your-application-with-the-intel-math-kernel-library/building-custom-shared-objects.html

oneMKL full code examples

Hello.

Does anyone know where I can find full oneMKL code examples?

Intel MKL (not oneMKL) comes with some SYCL examples, however there are some discrepancies between their syntax and the oneMKL syntax as given in the oneAPI spec (https://spec.oneapi.com/versions/latest/index.html). Just to give an example, in the Intel MKL SYCL examples the mkl::sparse::init_matrix_handle function is used for initializing sparse matrix handlers while in the oneAPI specs the same is done with the onemkl::sparse::matrixInit function.

I'm a little bit confused by this. What exactly is the difference between Intel MKL SYCL and oneMKL?

Add DFT support

Summary

Add support for discrete Fourier transform.

Problem statement

The oneMKL specification shows the API for the DFT. However, I'm unable to find the necessary mkl_dfti_sycl.hpp header referenced there.

Preferred solution

An example of using oneMKL with DFT.

CUDA_ERROR_ILLEGAL_ADDRESS when using level 1 and higher-level routines in the same queue

Summary

When level 1 and higher-level kernels are submitted in the same queue, a CUDA_ERROR_ILLEGAL_ADDRESS runtime error is thrown for the cuBLAS backend.

I believe this is because, for some of the level 1 functions, the pointer mode is set to CUBLAS_POINTER_MODE_DEVICE but never set back to the default value, CUBLAS_POINTER_MODE_HOST; the device setting therefore remains active for all subsequent calls with that cuBLAS handle, which seems to cause problems. Adding the line cublasSetPointerMode(handle, CUBLAS_POINTER_MODE_HOST); to the respective functions resolves the issue.

The tests create a queue for every BLAS function, therefore this issue hasn't surfaced there, but it can be triggered with a simple test program.
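For illustration, here is a minimal sketch of the fix pattern described above (assuming a raw cuBLAS handle; this is not the actual oneMKL source):

#include <cublas_v2.h>

// Sketch: a level 1 routine that needs a device-side result sets
// CUBLAS_POINTER_MODE_DEVICE, then restores the default
// CUBLAS_POINTER_MODE_HOST so that subsequent calls sharing the same
// handle (e.g. gemv with host-side alpha/beta) are unaffected.
cublasStatus_t dot_with_device_result(cublasHandle_t handle, int n,
                                      const float *x, int incx,
                                      const float *y, int incy,
                                      float *device_result) {
    cublasStatus_t status = cublasSetPointerMode(handle, CUBLAS_POINTER_MODE_DEVICE);
    if (status != CUBLAS_STATUS_SUCCESS)
        return status;
    status = cublasSdot(handle, n, x, incx, y, incy, device_result);
    // Restore the default pointer mode before the handle is used again.
    cublasSetPointerMode(handle, CUBLAS_POINTER_MODE_HOST);
    return status;
}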

Version

The current oneMKL develop head is used, e.g. 1ed12c7.

Environment

  • HW you use
    Intel Gold 6130 CPU with Nvidia gtx1080 GPUs
  • Backend library version
    Cuda 10.0
    MKL, and TBB obtained via intel installer version 2021.2.0
  • OS name and version
    Ubuntu 20.04 (fakeroot singularity container)
  • Compiler version
    dpc++ compiler cloned from develop with hash: 4e26734cb87c451e0562559d5d6f83b7eabcaea3
    compiled with:
    buildbot/configure.py --cuda
    and buildbot/compile.py
  • CMake
    cmake.md

Steps to reproduce

Use the following simple test program:

#include "oneapi/mkl.hpp"
#include <iostream>
#include <CL/sycl.hpp>

int main(){
  std::vector<double> M = {1, 1, 1, 1};
  std::vector<double> y = {3, 4};
  std::vector<double> x = {1, 1};
  
  std::vector<double> x1 = {1,1};
  std::vector<double> x2 = {2,2};

  double result = -1;

  cl::sycl::buffer<double, 1> M_buffer = cl::sycl::buffer(M.data(), cl::sycl::range<1>(M.size()));
  cl::sycl::buffer<double, 1> y_buffer = cl::sycl::buffer(y.data(), cl::sycl::range<1>(y.size()));
  cl::sycl::buffer<double, 1> x_buffer = cl::sycl::buffer(x.data(), cl::sycl::range<1>(x.size())); 

  cl::sycl::buffer<double, 1> x1_buffer = cl::sycl::buffer(x1.data(), cl::sycl::range<1>(x1.size())); 
  cl::sycl::buffer<double, 1> x2_buffer = cl::sycl::buffer(x2.data(), cl::sycl::range<1>(x2.size())); 
  
  cl::sycl::buffer<double, 1> result_buffer = cl::sycl::buffer(&result, cl::sycl::range<1>(1)); 

 auto gpu_dev = sycl::device(sycl::gpu_selector());
 sycl::queue gpu_queue(gpu_dev);
 
 oneapi::mkl::backend_selector<oneapi::mkl::backend::cublas> gpu_selector(gpu_queue);
  
 oneapi::mkl::blas::column_major::dot(gpu_selector, 2, x1_buffer, 1, x2_buffer, 1, result_buffer);
 oneapi::mkl::blas::column_major::gemv(gpu_selector, oneapi::mkl::transpose::nontrans, 2, 2,
                                   1.0, M_buffer, 2, x_buffer, 1, 1.0, y_buffer, 1);
}

compile:
LD_LIBRARY_PATH=/home/sbalint/hipSYCL-main/dpc++-hand/llvm/build/install/lib/:/opt/hipSYCL/cuda/lib64:$LD_LIBRARY_PATH /home/sbalint/hipSYCL-main/dpc++-hand/llvm/build/install/bin/clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice -I /home/sbalint/hipSYCL-main/oneMKL-install/include/ -L/home/sbalint/hipSYCL-main/oneMKL-install/lib/ -lonemkl_blas_cublas test.cpp
and run:
LD_LIBRARY_PATH=/home/sbalint/hipSYCL-main/dpc++-hand/llvm/build/install/lib/:/opt/hipSYCL/cuda/lib64:/home/sbalint/hipSYCL-main/oneMKL-install/lib/:$LD_LIBRARY_PATH ./a.out

Observed behavior

The following runtime error is displayed:

Singularity> LD_LIBRARY_PATH=/home/sbalint/hipSYCL-main/dpc++-hand/llvm/build/install/lib/:/opt/hipSYCL/cuda/lib64:/home/sbalint/hipSYCL-main/oneMKL-install/lib/:$LD_LIBRARY_PATH ./a.out 
Hello

PI CUDA ERROR:
        Value:           700
        Name:            CUDA_ERROR_ILLEGAL_ADDRESS
        Description:     an illegal memory access was encountered
        Function:        cuda_piEnqueueMemBufferRead
        Source Location: /root/hipSYCL-main/dpc++-hand/llvm/sycl/plugins/cuda/pi_cuda.cpp:2199


PI CUDA ERROR:
        Value:           700
        Name:            CUDA_ERROR_ILLEGAL_ADDRESS
        Description:     an illegal memory access was encountered
        Function:        wait
        Source Location: /root/hipSYCL-main/dpc++-hand/llvm/sycl/plugins/cuda/pi_cuda.cpp:447


PI CUDA ERROR:
        Value:           700
        Name:            CUDA_ERROR_ILLEGAL_ADDRESS
        Description:     an illegal memory access was encountered
        Function:        wait
        Source Location: /root/hipSYCL-main/dpc++-hand/llvm/sycl/plugins/cuda/pi_cuda.cpp:447


PI CUDA ERROR:
        Value:           700
        Name:            CUDA_ERROR_ILLEGAL_ADDRESS
        Description:     an illegal memory access was encountered
        Function:        wait
        Source Location: /root/hipSYCL-main/dpc++-hand/llvm/sycl/plugins/cuda/pi_cuda.cpp:447


PI CUDA ERROR:
        Value:           700
        Name:            CUDA_ERROR_ILLEGAL_ADDRESS
        Description:     an illegal memory access was encountered
        Function:        enqueueEventWait
        Source Location: /root/hipSYCL-main/dpc++-hand/llvm/sycl/plugins/cuda/pi_cuda.cpp:473


PI CUDA ERROR:
        Value:           700
        Name:            CUDA_ERROR_ILLEGAL_ADDRESS
        Description:     an illegal memory access was encountered
        Function:        _pi_event
        Source Location: /root/hipSYCL-main/dpc++-hand/llvm/sycl/plugins/cuda/pi_cuda.cpp:331


PI CUDA ERROR:
        Value:           700
        Name:            CUDA_ERROR_ILLEGAL_ADDRESS
        Description:     an illegal memory access was encountered
        Function:        wait
        Source Location: /root/hipSYCL-main/dpc++-hand/llvm/sycl/plugins/cuda/pi_cuda.cpp:447

Expected behavior

The program executes without errors

Wrong compiler lib directory inferred in VS16 toolset v142 integration .props/.targets files

Summary

Part of the (relative) directory path for the compiler libraries hardcoded into MSBuild\Microsoft\VC\v160\Platforms\x64\PlatformToolsets\v142\ImportAfter\Intel.Libs.oneMKL.v142.props does not match the actual installation directory of the compiler libraries.

Version

w_onemkl_p_2021.1.1.52_offline.exe

Environment

VS16.8.3
Building x64 target

More details

Right out of the box, after installing the oneMKL package and enabling MKL in the project options, the file Intel.Libs.oneMKL.v142.targets:42 complains about being unable to locate the compiler library directory during a build attempt:

    <ICMessage Code="WRN001" Type="Warning" Arguments="oneMKLOmpLibDir;oneMKL" Condition="'$(oneMKLOmpLibDir)'==''" />

Is this the right repo for reporting issues with the toolchain glue files? There is an obvious typo in the .props file, easy to fix, but I cannot find the VS .props/.targets files anywhere in the open. Since I'm already reporting it: the originally resolved directory is compiler/latest/windows/compiler/lib/intel64_win, while the actual directory where the compiler DLLs are installed is compiler/latest/windows/lib/x64.

The machine has never had any Intel MKL or related performance library product installed--maybe that's the reason? Also, I did not install any other packages except mentioned above.

Here's the fix to match the correct directory:

$ diff -U2 Intel.Libs.oneMKL.v142.props.orig Intel.Libs.oneMKL.v142.props
--- Intel.Libs.oneMKL.v142.props.orig   2021-01-06 23:40:58.705937900 -0800
+++ Intel.Libs.oneMKL.v142.props   2021-01-07 12:32:41.599610400 -0800
@@ -45,5 +45,5 @@
     <oneMKLIncludeDir>$([MSBuild]::GetRegistryValueFromView('HKEY_LOCAL_MACHINE\SOFTWARE\Intel\PerfLibSuites\$(_oneMKLSubKey)\oneMKL\$(ICPlatform)', 'IncludeDir', null, RegistryView.Registry32))</oneMKLIncludeDir>
     <oneMKLLibDir>$([MSBuild]::GetRegistryValueFromView('HKEY_LOCAL_MACHINE\SOFTWARE\Intel\PerfLibSuites\$(_oneMKLSubKey)\oneMKL\$(ICPlatform)', 'LibDir', null, RegistryView.Registry32))</oneMKLLibDir>
-    <_oneMKLOmpLibDir>$([System.IO.Path]::Combine($(oneMKLProductDir), ..\..\compiler\latest\windows\compiler\lib\$(IntelPlatform)_win))</_oneMKLOmpLibDir>
+    <_oneMKLOmpLibDir>$([System.IO.Path]::Combine($(oneMKLProductDir), ..\..\compiler\latest\windows\lib\$(PlatformTarget)))</_oneMKLOmpLibDir>
     <_oneMKLOmpLibDir>$([System.IO.Path]::GetFullPath($(_oneMKLOmpLibDir)))</_oneMKLOmpLibDir>
     <oneMKLOmpLibDir Condition="Exists('$(_oneMKLOmpLibDir)')">$(_oneMKLOmpLibDir)</oneMKLOmpLibDir>

I suspect it's not the right place to report the issue, but hope you might forward this internally--this VS integration script does not seem to be part of the OSS project. Thanks.

ROCm/HIP backend support for oneMKL BLAS domain

The current CUDA backend for the BLAS domain seems to utilize "CL/sycl/backends/cuda.hpp" (cublas_scope_handle.hpp) to communicate with the CUDA runtime. But there doesn't seem to be any equivalent header file in the "CL/sycl/backends" folder for supporting a ROCm/HIP backend.

Given the context, what is the recommended way to add ROCm/HIP backend support for oneMKL domain libs?

Be able to setup oneMKL with the open source intel compiler ( https://github.com/intel/llvm )

Summary

Presently the oneMKL setup revolves around a full oneAPI installation. It would be nice to be able to easily set up and build oneMKL with just the open-source version of the Intel compiler for SYCL ( https://github.com/intel/llvm ).

The Intel compiler for SYCL is built with CMake, Ninja and Python 3, and it has the user configure the various dependencies and backends themselves (LevelZero, OpenCL, TBB) with instructions. Ideally the matching oneMKL setup could build atop that (rather than assuming Conan or sudo access).

Please provide an open source licensed backend option

While oneMKL is licensed under the Apache2 license, which is an open source license (thanks for that!), to the best of my knowledge both of the currently available backends are currently licensed under proprietary, non-open-source licenses: Intel MKL is licensed under the Intel Simplified Software License, while NVIDIA cuBLAS is licensed under NVIDIA's proprietary EULA.

It'd be great if there was an alternative backend option that was open source, like e.g. OpenBLAS. Thanks for the consideration and for your efforts in this project!

A better way to measure performance of SGEMM using cuBLAS backend

Summary

oneMKL shows underwhelming SGEMM performance for small matrices.

Version

oneMKL v0.2

Environment

  • HW you use: NVIDIA V100
  • Backend library version: CUDA/10.2
  • OS name and version: CentOS 7.4.1708
  • Compiler: Intel-llvm 2021-06-08

Steps to reproduce

I attached the source files required to measure the performance of SGEMM (GFLOPS).
Timing is obtained using std::chrono.

Observed behavior

For m = n = k = 4096, the observed performance is ~500 GFlops. In comparison, native cuBLAS achieves up to 10 TFlops.
I think one possible reason is the high cost of creating the cuBLAS context under the hood.
For native cuBLAS, the creation of the context can be effectively excluded from the region timed with cudaEventRecord().
Is there a better way to measure the performance of SGEMM with the cuBLAS backend?
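For reference, this is the kind of measurement I have in mind (a sketch only; it assumes USM pointers already allocated, and that the backend caches its cuBLAS handle after the first call, so an untimed warm-up call can absorb the one-time setup cost):

#include <chrono>
#include <cstdint>
#include <iostream>
#include <CL/sycl.hpp>
#include "oneapi/mkl.hpp"

// Sketch: time one gemm after an untimed warm-up call, so that one-time
// setup (context/handle creation) is excluded from the measured region.
void time_gemm(sycl::queue &q, std::int64_t m, std::int64_t n, std::int64_t k,
               const float *A, const float *B, float *C) {
    using oneapi::mkl::transpose;
    float alpha = 1.0f, beta = 0.0f;

    // Warm-up: the first call pays any context-creation cost.
    oneapi::mkl::blas::column_major::gemm(q, transpose::nontrans, transpose::nontrans,
                                          m, n, k, alpha, A, m, B, k, beta, C, m);
    q.wait();

    auto t0 = std::chrono::steady_clock::now();
    oneapi::mkl::blas::column_major::gemm(q, transpose::nontrans, transpose::nontrans,
                                          m, n, k, alpha, A, m, B, k, beta, C, m);
    q.wait();
    auto t1 = std::chrono::steady_clock::now();

    double sec = std::chrono::duration<double>(t1 - t0).count();
    std::cout << (2.0 * m * n * k) / sec / 1e9 << " GFLOP/s" << std::endl;
}

If the handle is instead re-created on every call, the warm-up will not hide that cost, which could itself explain the low numbers.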

Thanks
sgemm.zip

Building tests with both cuBLAS and cuRAND enabled hits compilation issues

Summary

Trying to build the tests with both the cuBLAS and cuRAND backends enabled hits compilation issues.

Version

Using the latest version: 19c43b0

Environment

This should be pretty straightforward to reproduce in any environment but let me know if you need more details for this section.

Steps to reproduce

The issue happens if all of these are provided:

  • -DENABLE_CUBLAS_BACKEND=True
  • -DENABLE_CURAND_BACKEND=True
  • -DBUILD_FUNCTIONAL_TESTS=True

Observed behavior

This was discussed in #91 and #126 (comment), however it looks like that ticket fixed the issue for building one of them but not both of them at the same time.

TARGET_DOMAINS also doesn't seem to work properly with cuBLAS and cuRAND; it always gets reset, because it's checking the specified target domains against a domain list that is only populated when using the MKLCPU or MKLGPU backends.

I tried hacking that part of the CMake to let a blas;rng target domain list go through; however, this still seemed to fail when building the tests.

Expected behavior

I see two main possible outcomes for this:

  • Mark this as unsupported and make CMake throw out an error if someone is trying to build this configuration
  • Fix the build issue/CMake so that this combination is supported

Use of sycl::half, with CL/sycl.hpp SYCL header included

In #143, the half data types were replaced with sycl::half, while including the CL/sycl.hpp SYCL header. According to the SYCL specification section 4.3, when that header is included all SYCL types should exist inside the ::cl::sycl namespace.

Unfortunately, this change also breaks compilation with hipSYCL, since sycl::half is not defined. In order to solve this problem I see three possible solutions:

  1. Use, like for all other SYCL types, the ::cl::sycl namespace for half as well
  2. Move to the ::sycl namespace in case of all SYCL types, and include the sycl/sycl.hpp header instead of CL/sycl.hpp
  3. Add a hipSYCL specific workaround similar to #122 (comment)

I think the most consistent and least error-prone solution would be the first one. Can you give some feedback on what would be the preferred solution?

Compiling oneMKL (CUDA backends) on Windows parsing commands incorrectly

Summary

While the build README says that the CUDA backend is only supported on Linux, I wanted to naively try building it for Windows anyway - the oneAPI SYCL compiler on llvm/clang builds on Windows while only being tested on Linux, so I expected this to be fairly trivial as well. However, it looks like at the build step, some MSVC-style (cl) compile options are passed to clang++, which parses them incorrectly as files/directories.

Version

Latest develop commit.

Environment


  • HW you use: AMD 5900x, NVIDIA RTX 3090
  • Backend library version: CUDA Toolkit 11.5
  • OS name and version: Windows 11 Pro
  • Compiler version: oneAPI's DPC++ 2021-09 (latest on https://github.com/intel/llvm)
  • CMake output log: in Observed behavior section.

Steps to reproduce

Following the build guide on the README, but on Windows.

Observed behavior

CMake was able to configure the project (without the functional tests):

C:\Users\valen\Documents\GitHub\oneMKL\build>cmake .. -DCMAKE_C_COMPILER="C:\Users\valen\Documents\GitHub\sycl_workspace\llvm\build\bin\clang.exe" -DENABLE_CUBLAS_BACKEND=True -DENABLE_CURAND_BACKEND=True -DENABLE_MKLCPU_BACKEND=False -DENABLE_MKLGPU_BACKEND=False -DBUILD_FUNCTIONAL_TESTS=False
-- Building for: Visual Studio 17 2022
-- CMAKE_BUILD_TYPE: None, set to Release by default
-- The CXX compiler identification is MSVC 19.30.30706.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.30.30705/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- TARGET_DOMAINS: blas;rng
-- Configuring done
-- Generating done
-- Build files have been written to: C:/Users/valen/Documents/GitHub/oneMKL/build

Upon trying to build it however, I get the following output:

[1/3] Building CXX object bin/blas/CMakeFiles/onemkl_blas.dir/blas_loader.cpp.obj
FAILED: bin/blas/CMakeFiles/onemkl_blas.dir/blas_loader.cpp.obj
C:\Users\valen\Documents\GitHub\sycl_workspace\llvm\build\bin\clang++.exe -fsycl /nologo  -IC:/Users/valen/Documents/GitHub/oneMKL/include -IC:/Users/valen/Documents/GitHub/oneMKL/src -IC:/Users/valen/Documents/GitHub/oneMKL/src/include -IC:/Users/valen/Documents/GitHub/oneMKL/build/bin /EHsc -Wno-unused-function -w -DSYCL2020_DISABLE_DEPRECATION_WARNINGS -O3 -DNDEBUG -D_DLL -D_MT -Xclang --dependent-lib=msvcrt -Donemkl_EXPORTS -MD -MT bin/blas/CMakeFiles/onemkl_blas.dir/blas_loader.cpp.obj -MF bin\blas\CMakeFiles\onemkl_blas.dir\blas_loader.cpp.obj.d /Fobin/blas/CMakeFiles/onemkl_blas.dir/blas_loader.cpp.obj -c C:/Users/valen/Documents/GitHub/oneMKL/src/blas/blas_loader.cpp
clang++: error: no such file or directory: '/nologo'
clang++: error: no such file or directory: '/EHsc'
clang++: error: no such file or directory: '/Fobin/blas/CMakeFiles/onemkl_blas.dir/blas_loader.cpp.obj'
[2/3] Building CXX object bin/rng/CMakeFiles/onemkl_rng.dir/rng_loader.cpp.obj
FAILED: bin/rng/CMakeFiles/onemkl_rng.dir/rng_loader.cpp.obj
C:\Users\valen\Documents\GitHub\sycl_workspace\llvm\build\bin\clang++.exe -fsycl /nologo  -IC:/Users/valen/Documents/GitHub/oneMKL/include -IC:/Users/valen/Documents/GitHub/oneMKL/src -IC:/Users/valen/Documents/GitHub/oneMKL/src/include -IC:/Users/valen/Documents/GitHub/oneMKL/build/bin /EHsc -Wno-unused-function -w -DSYCL2020_DISABLE_DEPRECATION_WARNINGS -O3 -DNDEBUG -D_DLL -D_MT -Xclang --dependent-lib=msvcrt -Donemkl_EXPORTS -MD -MT bin/rng/CMakeFiles/onemkl_rng.dir/rng_loader.cpp.obj -MF bin\rng\CMakeFiles\onemkl_rng.dir\rng_loader.cpp.obj.d /Fobin/rng/CMakeFiles/onemkl_rng.dir/rng_loader.cpp.obj -c C:/Users/valen/Documents/GitHub/oneMKL/src/rng/rng_loader.cpp
clang++: error: no such file or directory: '/nologo'
clang++: error: no such file or directory: '/EHsc'
clang++: error: no such file or directory: '/Fobin/rng/CMakeFiles/onemkl_rng.dir/rng_loader.cpp.obj'
ninja: build stopped: subcommand failed.

Expected behavior

For the backend to build cleanly, without any errors.

API to return oneMKL version number?

Summary

Should there be an API, callable from both the dispatcher (libonemkl.so) and the underlying backend libraries, to report the oneMKL version supported by each library?

Problem statement

Using oneMKL could involve libraries from multiple vendors (or other sources) providing backends for different devices. How can a user easily detect whether the versions involved are inconsistent with one another? This could occur if one backend has not been updated in line with the dispatcher and other backends, or because of a misconfiguration of LD_LIBRARY_PATH, etc., when multiple versions are installed on the system.

Preferred solution

One possible solution would be a onemkl::utility namespace with a routine to return the MAJOR, MINOR and PATCH numbers, e.g. version_number(major,minor,patch). With auto backend selection, the dispatcher would need to have its own version of the routine too.

I'm assuming all header files would be at the top level. If not, inconsistent versions would need to be considered there too. Providing a simple program to call these routines to check consistency would also be helpful.
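A minimal sketch of what such a routine and check program could look like (all names here are hypothetical; this is not an existing oneMKL API):

#include <iostream>

// Hypothetical sketch only: a version-reporting routine that the dispatcher
// and each backend library would compile with their own version numbers
// (hard-coded to 0.2.0 here purely for illustration).
namespace onemkl {
namespace utility {
inline void version_number(int &major, int &minor, int &patch) {
    major = 0; // would come from the library's own version macros
    minor = 2;
    patch = 0;
}
} // namespace utility
} // namespace onemkl

// Simple consistency-check program; each library involved would be queried
// the same way and the results compared.
int main() {
    int major = 0, minor = 0, patch = 0;
    onemkl::utility::version_number(major, minor, patch);
    std::cout << "oneMKL version " << major << '.' << minor << '.' << patch << '\n';
    return 0;
}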

/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:224:18: error: unknown type name 'CBLAS_LAYOUT'

Summary

Everything fails like this:

In file included from /opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/gemm_bias.cpp:32:
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/onemkl_blas_helper.hpp:64:8: error: unknown type name 'CBLAS_LAYOUT'
inline CBLAS_LAYOUT convert_to_cblas_layout(oneapi::mkl::layout is_column) {
       ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/onemkl_blas_helper.hpp:65:61: error: use of undeclared identifier 'CBLAS_LAYOUT'
    return is_column == oneapi::mkl::layout::column_major ? CBLAS_LAYOUT::CblasColMajor
                                                            ^

Version

jehammond@dgx-a100-math:/opt/intel/onemkl-cublas/build$ git log -n1
commit 3cb60dd57e5606b71c9fd7e53d42a6999ceb9082 (HEAD -> develop, origin/develop)
Author: Andrew T. Barker <[email protected]>
Date:   Tue Dec 14 00:22:55 2021 +0000

    Fix empty kernel name issue for old and new compilers. (#150)

Environment


  • HW you use
    DGX A100 station: AMD 7742 and NVIDIA A100
  • Backend library version
    ?
  • OS name and version
jehammond@dgx-a100-math:/opt/intel/onemkl-cublas$ uname -a 
Linux dgx-a100-math 5.4.0-91-generic #102-Ubuntu SMP Fri Nov 5 16:31:28 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  • Compiler version

DPC++

jehammond@dgx-a100-math:/opt/intel/onemkl-cublas/build$ /opt/intel/dpcpp-cuda/build/install/bin/clang++  --version
clang version 14.0.0 (https://github.com/intel/llvm.git 7b7e044cc73977f5a6a3d434487252bcd45ae3da)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/intel/dpcpp-cuda/build/install/bin

GCC

jehammond@dgx-a100-math:/opt/intel/onemkl-cublas$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.3.0-17ubuntu1~20.04' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04) 
  • CMake output log
jehammond@dgx-a100-math:/opt/intel/onemkl-cublas/build$ cmake .. -DCMAKE_CXX_COMPILER=/opt/intel/dpcpp-cuda/build/install/bin/clang++ \
>          -DCMAKE_C_COMPILER=/opt/intel/dpcpp-cuda/build/install/bin/clang \
>          -DENABLE_CUBLAS_BACKEND=True  \
>          -DENABLE_MKLCPU_BACKEND=False \
>          -DENABLE_MKLGPU_BACKEND=False \
>          -DREF_BLAS_ROOT=/opt/intel/onemkl-cublas/lapack/build/lib
-- CMAKE_BUILD_TYPE: None, set to Release by default
-- The CXX compiler identification is Clang 14.0.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/intel/dpcpp-cuda/build/install/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- TARGET_DOMAINS: blas
-- Looking for dpc++
-- Performing Test is_dpcpp
-- Performing Test is_dpcpp - Success
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found CUDA: /usr/local/cuda-11.4 (found suitable version "11.4", minimum required is "10.0") 
-- Found cuBLAS: /usr/local/cuda-11.4/include  
-- The C compiler identification is Clang 14.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/intel/dpcpp-cuda/build/install/bin/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
CMake Deprecation Warning at deps/googletest/CMakeLists.txt:53 (cmake_minimum_required):
  Compatibility with CMake < 2.8.12 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- Found PythonInterp: /usr/bin/python (found version "3.8.10") 
-- Found CBLAS: /opt/intel/onemkl-cublas/lapack/build/lib/libcblas.so  
-- Found CBLAS: /opt/intel/onemkl-cublas/lapack/build/lib/libblas.so  
-- Found CBLAS: /usr/include/x86_64-linux-gnu  
-- Configuring done
-- Generating done
-- Build files have been written to: /opt/intel/onemkl-cublas/build
jehammond@dgx-a100-math:/opt/intel/onemkl-cublas/build$ cmake --build .
[  0%] Building CXX object bin/blas/CMakeFiles/onemkl_blas.dir/blas_loader.cpp.o

[  0%] Built target onemkl_blas
[  1%] Linking CXX shared library ../lib/libonemkl.so
[  1%] Built target onemkl
[  2%] Building CXX object bin/blas/backends/cublas/CMakeFiles/onemkl_blas_cublas_obj.dir/cublas_level1.cpp.o
[  2%] Building CXX object bin/blas/backends/cublas/CMakeFiles/onemkl_blas_cublas_obj.dir/cublas_level2.cpp.o

[  3%] Building CXX object bin/blas/backends/cublas/CMakeFiles/onemkl_blas_cublas_obj.dir/cublas_level3.cpp.o
[  3%] Building CXX object bin/blas/backends/cublas/CMakeFiles/onemkl_blas_cublas_obj.dir/cublas_batch.cpp.o
[  3%] Building CXX object bin/blas/backends/cublas/CMakeFiles/onemkl_blas_cublas_obj.dir/cublas_extensions.cpp.o
[  4%] Building CXX object bin/blas/backends/cublas/CMakeFiles/onemkl_blas_cublas_obj.dir/cublas_scope_handle.cpp.o
[  4%] Building CXX object bin/blas/backends/cublas/CMakeFiles/onemkl_blas_cublas_obj.dir/cublas_wrappers.cpp.o
[  4%] Built target onemkl_blas_cublas_obj
[  4%] Linking CXX shared library ../../../../lib/libonemkl_blas_cublas.so
[  4%] Built target onemkl_blas_cublas
[  4%] Building CXX object deps/googletest/CMakeFiles/gtest.dir/src/gtest-all.cc.o
[  5%] Linking CXX shared library ../../lib/libgtest.so
[  5%] Built target gtest
[  5%] Building CXX object deps/googletest/CMakeFiles/gtest_main.dir/src/gtest_main.cc.o
[  5%] Linking CXX shared library ../../lib/libgtest_main.so
[  5%] Built target gtest_main
[  5%] Building CXX object tests/unit_tests/blas/extensions/CMakeFiles/blas_extensions_ct.dir/gemm_bias.cpp.o
In file included from /opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/gemm_bias.cpp:32:
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/onemkl_blas_helper.hpp:64:8: error: unknown type name 'CBLAS_LAYOUT'
inline CBLAS_LAYOUT convert_to_cblas_layout(oneapi::mkl::layout is_column) {
       ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/onemkl_blas_helper.hpp:65:61: error: use of undeclared identifier 'CBLAS_LAYOUT'
    return is_column == oneapi::mkl::layout::column_major ? CBLAS_LAYOUT::CblasColMajor
                                                            ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/onemkl_blas_helper.hpp:66:61: error: use of undeclared identifier 'CBLAS_LAYOUT'
                                                          : CBLAS_LAYOUT::CblasRowMajor;
                                                            ^
In file included from /opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/gemm_bias.cpp:33:
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:118:41: error: unknown type name 'CBLAS_LAYOUT'
static inline void copy_mat(T_src &src, CBLAS_LAYOUT layout, CBLAS_TRANSPOSE trans, int row,
                                        ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:138:41: error: unknown type name 'CBLAS_LAYOUT'
static inline void copy_mat(T_src &src, CBLAS_LAYOUT layout, CBLAS_TRANSPOSE trans, int row,
                                        ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:158:41: error: unknown type name 'CBLAS_LAYOUT'
static inline void copy_mat(T_src &src, CBLAS_LAYOUT layout, int row, int col, int ld,
                                        ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:195:41: error: unknown type name 'CBLAS_LAYOUT'
static inline void update_c(T_src &src, CBLAS_LAYOUT layout, CBLAS_UPLO upper_lower, int row,
                                        ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:224:18: error: unknown type name 'CBLAS_LAYOUT'
static void gemm(CBLAS_LAYOUT layout, CBLAS_TRANSPOSE transa, CBLAS_TRANSPOSE transb, const int *m,
                 ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:229:11: error: unknown type name 'CBLAS_LAYOUT'
void gemm(CBLAS_LAYOUT layout, CBLAS_TRANSPOSE transa, CBLAS_TRANSPOSE transb, const int *m,
          ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:261:11: error: unknown type name 'CBLAS_LAYOUT'
void gemm(CBLAS_LAYOUT layout, CBLAS_TRANSPOSE transa, CBLAS_TRANSPOSE transb, const int *m,
          ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:268:11: error: unknown type name 'CBLAS_LAYOUT'
void gemm(CBLAS_LAYOUT layout, CBLAS_TRANSPOSE transa, CBLAS_TRANSPOSE transb, const int *m,
          ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:275:11: error: unknown type name 'CBLAS_LAYOUT'
void gemm(CBLAS_LAYOUT layout, CBLAS_TRANSPOSE transa, CBLAS_TRANSPOSE transb, const int *m,
          ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:283:11: error: unknown type name 'CBLAS_LAYOUT'
void gemm(CBLAS_LAYOUT layout, CBLAS_TRANSPOSE transa, CBLAS_TRANSPOSE transb, const int *m,
          ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:292:18: error: unknown type name 'CBLAS_LAYOUT'
static void gemm(CBLAS_LAYOUT layout, CBLAS_TRANSPOSE transa, CBLAS_TRANSPOSE transb, const int *m,
                 ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:297:11: error: unknown type name 'CBLAS_LAYOUT'
void gemm(CBLAS_LAYOUT layout, CBLAS_TRANSPOSE transa, CBLAS_TRANSPOSE transb, const int *m,
          ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:320:11: error: unknown type name 'CBLAS_LAYOUT'
void gemm(CBLAS_LAYOUT layout, CBLAS_TRANSPOSE transa, CBLAS_TRANSPOSE transb, const int *m,
          ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:344:18: error: unknown type name 'CBLAS_LAYOUT'
static void symm(CBLAS_LAYOUT layout, CBLAS_SIDE left_right, CBLAS_UPLO uplo, const int *m,
                 ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:349:11: error: unknown type name 'CBLAS_LAYOUT'
void symm(CBLAS_LAYOUT layout, CBLAS_SIDE left_right, CBLAS_UPLO uplo, const int *m, const int *n,
          ^
/opt/intel/onemkl-cublas/tests/unit_tests/blas/extensions/../include/reference_blas_templates.hpp:356:11: error: unknown type name 'CBLAS_LAYOUT'
void symm(CBLAS_LAYOUT layout, CBLAS_SIDE left_right, CBLAS_UPLO uplo, const int *m, const int *n,
          ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
make[2]: *** [tests/unit_tests/blas/extensions/CMakeFiles/blas_extensions_ct.dir/build.make:76: tests/unit_tests/blas/extensions/CMakeFiles/blas_extensions_ct.dir/gemm_bias.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:794: tests/unit_tests/blas/extensions/CMakeFiles/blas_extensions_ct.dir/all] Error 2
make: *** [Makefile:146: all] Error 2

Steps to reproduce


cd /opt/intel/onemkl-cublas
git clone https://github.com/Reference-LAPACK/lapack
mkdir -p build && cd build && git clean -dfx
cmake .. -G Ninja -DBUILD_SHARED_LIBS=ON -DCBLAS=ON -DCMAKE_Fortran_COMPILER=gfortran -DCMAKE_C_COMPILER=gcc 
cmake --build .

# oneMKL CUBLAS

cd /opt/intel/onemkl-cublas
mkdir -p build && cd build && git clean -dfx
cmake .. -DCMAKE_CXX_COMPILER=/opt/intel/dpcpp-cuda/build/install/bin/clang++ \
         -DCMAKE_C_COMPILER=/opt/intel/dpcpp-cuda/build/install/bin/clang \
         -DENABLE_CUBLAS_BACKEND=True  \
         -DENABLE_MKLCPU_BACKEND=False \
         -DENABLE_MKLGPU_BACKEND=False \
         -DREF_BLAS_ROOT=/opt/intel/onemkl-cublas/lapack/build/lib
cmake --build .
#ctest
#cmake --install . --prefix /opt/intel/onemkl-cublas/install

Reference i?amin produces unexpected results with NaN input

Summary

When compiled with recent dpcpp compilers, the cblas_i?amin routines in src/blas/backends/netlib/netlib_level1.cpp and the iamin routines in tests/unit_tests/blas/include/reference_blas_templates.hpp produce incorrect results when there are NaNs in the input, because std::isnan always returns false under dpcpp's default fast floating-point mode unless you compile with -fp-model=precise.

Version

Current develop branch.

Steps to reproduce

Using the cblas_isamin function from netlib_level1.cpp, compile and run the following test:

#include <cstdlib>
#include <iostream>
#include <limits>

// Declaration assumed for this standalone test; the definition is the
// cblas_isamin implementation from netlib_level1.cpp.
int cblas_isamin(int n, const float *x, int incx);

// fnan was left undefined in the original snippet; define it here.
static const float fnan = std::numeric_limits<float>::quiet_NaN();

int main(int argc, char *argv[]) {
    float * x = (float*) malloc(3 * sizeof(float));
    int idx = 0;

    auto check_three = [&](float a, float b, float c, int expected) -> int {
        x[0] = a; x[1] = b; x[2] = c;
        idx = cblas_isamin(3, x, 1);
        std::cout << "    idx: " << idx << ", expected: " << expected << std::endl;
        if (idx != expected) return 1;
        return 0;
    };

    int result = 0;
    result += check_three(fnan, -0.5, 1.0, 0);
    result += check_three(-0.5, fnan, 1.0, 1);
    result += check_three(-0.5, 1.0, fnan, 2);
    result += check_three(0.0, fnan, 1.0, 1);
    result += check_three(1.0, fnan, 0.0, 1);
    result += check_three(-0.3, 2.1, fnan, 2);
    result += check_three(2.1, -0.3, fnan, 2);
    result += check_three(fnan, -0.3, 2.1, 0);
    result += check_three(fnan, fnan, 1.0, 0);
    result += check_three(fnan, 1.0, fnan, 0);
    result += check_three(1.0, fnan, fnan, 1);
    if (result) std::cout << "  FAILED!\n"; 

    free(x);
    return (result != 0);
}

Observed/expected behavior

Output of build and run:

issue.cpp:17:29: warning: comparison with NaN always evaluates to false in fast floating point modes [-Wtautological-constant-compare]
        bool is_first_nan = std::isnan(curr_val) && !std::isnan(min_val);
                            ^~~~~~~~~~~~~~~~~~~~
issue.cpp:17:54: warning: comparison with NaN always evaluates to false in fast floating point modes [-Wtautological-constant-compare]
        bool is_first_nan = std::isnan(curr_val) && !std::isnan(min_val);
                                                     ^~~~~~~~~~~~~~~~~~~
2 warnings generated.
    idx: 0, expected: 0
    idx: 0, expected: 1
    idx: 0, expected: 2
    idx: 0, expected: 1
    idx: 2, expected: 1
    idx: 0, expected: 2
    idx: 1, expected: 2
    idx: 0, expected: 0
    idx: 0, expected: 0
    idx: 0, expected: 0
    idx: 0, expected: 1
  FAILED!

oneMKL runtime API tests fail with seg fault on GPU with Level0 driver

Summary

oneMKL runtime API tests fail with seg fault on GPU when DPC++ compiler uses Level0 as a backend (default behavior)

Version

The problem appeared with Intel oneMKL and DPC++ compiler update to 2021.1.

Environment

The issue can be reproduced with:

  • HW: Intel GPU (Gen9)
  • OS: Ubuntu 18.04
  • Intel oneMKL version: 2021.1
  • Intel DPC++ Compiler version: 2021.1

Steps to reproduce

On the machine with Intel GPU and Level0 driver run oneMKL tests:

$> mkdir build && cd build
$> cmake .. && cmake --build . -j 24 && ctest
...
        Start   1: BLAS/RT/GemmTestSuite/GemmTests.HalfHalfFloatPrecision/Column_Major_Intel_R__Core_TM__i7_6770HQ_CPU___2_60GHz
  1/664 Test   #1: BLAS/RT/GemmTestSuite/GemmTests.HalfHalfFloatPrecision/Column_Major_Intel_R__Core_TM__i7_6770HQ_CPU___2_60GHz ........................   Passed    0.44 sec
        Start   2: BLAS/RT/GemmTestSuite/GemmTests.HalfHalfFloatPrecision/Row_Major_Intel_R__Core_TM__i7_6770HQ_CPU___2_60GHz
  2/664 Test   #2: BLAS/RT/GemmTestSuite/GemmTests.HalfHalfFloatPrecision/Row_Major_Intel_R__Core_TM__i7_6770HQ_CPU___2_60GHz ...........................   Passed    0.43 sec
        Start   3: BLAS/RT/GemmTestSuite/GemmTests.HalfHalfFloatPrecision/Column_Major_Intel_R__Graphics_Gen9__0x193b_
  3/664 Test   #3: BLAS/RT/GemmTestSuite/GemmTests.HalfHalfFloatPrecision/Column_Major_Intel_R__Graphics_Gen9__0x193b_ ..................................***Exception: SegFault  2.72 sec
        Start   4: BLAS/RT/GemmTestSuite/GemmTests.HalfHalfFloatPrecision/Row_Major_Intel_R__Graphics_Gen9__0x193b_
  4/664 Test   #4: BLAS/RT/GemmTestSuite/GemmTests.HalfHalfFloatPrecision/Row_Major_Intel_R__Graphics_Gen9__0x193b_ .....................................***Exception: SegFault  2.10 sec

Observed behavior

Tests with runtime dispatching report correct results, but they fail at the final step when libraries are unloaded. It looks like Level0 is unloaded before Intel oneMKL, which still uses it.

Expected behavior

All tests pass.

Dense linear algebra functions need encapsulations for matrices and vectors

Summary

oneMKL's dense linear algebra functions should provide or use classes that encapsulate matrices and vectors, instead of taking vectors and matrices as 1-D sycl::buffer or raw pointers. This would improve memory safety and usability, and make the interface more idiomatically C++. It would also better align with the various linear algebra proposals currently being considered for the C++ Standard Library.

Problem statement

1-D sycl::buffer does not correctly encapsulate matrices or vectors

The current oneMKL dense BLAS interface takes vectors and matrices in two different ways:

  1. as 1-D sycl::buffer (e.g., sycl::buffer<double, 1>); and,
  2. as raw pointers (e.g., double*).

Both overloads take the dimensions and strides as separate, integer arguments. Using raw pointers has the same memory safety issues as the C BLAS interface; for discussion, see P1674. Using 1-D sycl::buffer, with separate integer dimension and stride arguments, has the following issues:

  1. It's as memory unsafe as the C BLAS interface, but more verbose.
  2. It discards dimension(s) that sycl::buffer already stores, in favor of extra integer arguments that might be incorrect.
  3. Matrices are 2-D objects, and sycl::buffer<T, 2> exists, yet the interface takes matrices as sycl::buffer<T, 1>.
  4. "Batched" interfaces like gemm_batch compound the issue by adding another dimension.

Using sycl::buffer directly as a matrix or vector interface would offer less functionality than the current oneMKL interface. This is because sycl::buffer always expresses contiguous memory, but (C and Fortran) BLAS functions can work with strided, possibly noncontiguous memory. There is a way to create a "sub-buffer" of an existing sycl::buffer, but the sub-buffer must also be contiguous.
(See Chapter 4.7.2 of the SYCL 1.2.1 spec.) The proposed oneMKL interface already accepts strided, possibly noncontiguous memory, just like the C or Fortran BLAS.

basic_mdspan could replace the raw pointers interface, but not the sycl::buffer interface

It would be tempting to use basic_mdspan in place of sycl::buffer. That would let callers express all the different strided or contiguous matrix or vector layouts that the C BLAS can already express. It would also make this proposal nearly a subset of a pending C++ Standard Library proposal.

This would work perfectly well for the raw pointers interface, but it would not work for the sycl::buffer interface. The problem is that basic_mdspan is a "view" in the C++ Standard Library sense. Views do not own their storage. This means that views can't (or shouldn't really) do the things that SYCL needs to do with buffers, such as possibly allocate temporary storage on host or device, or track data dependencies.

Preferred solution

Replace the raw pointer interface with a basic_mdspan interface

Using basic_mdspan instead of raw pointers would be more expressive, easier to use, and less error prone. It would be no less accessible from other programming languages than oneMKL's current interface. (The current interface is not an extern "C" interface; it uses namespaces and class references.)
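To make the idea concrete, here is an illustrative sketch (not a proposed oneMKL signature) of a gemm-like routine taking mdspan views, written against C++23 std::mdspan; on older toolchains the reference mdspan implementation could be substituted. The extents travel with the view, so they cannot disagree with the data, and std::layout_stride can express BLAS-style leading dimensions:

#include <cstddef>
#include <iostream>
#include <mdspan> // C++23
#include <vector>

template <class T, class LA, class LB, class LC>
void gemm(T alpha,
          std::mdspan<const T, std::dextents<std::size_t, 2>, LA> A,
          std::mdspan<const T, std::dextents<std::size_t, 2>, LB> B,
          T beta,
          std::mdspan<T, std::dextents<std::size_t, 2>, LC> C) {
    // Naive reference loops; a real backend would dispatch to an optimized kernel.
    for (std::size_t i = 0; i < C.extent(0); ++i)
        for (std::size_t j = 0; j < C.extent(1); ++j) {
            T acc{};
            for (std::size_t k = 0; k < A.extent(1); ++k)
                acc += A[i, k] * B[k, j]; // C++23 multidimensional subscript
            C[i, j] = alpha * acc + beta * C[i, j];
        }
}

int main() {
    std::vector<double> a(4, 1.0), b(4, 2.0), c(4, 0.0);
    std::mdspan<const double, std::dextents<std::size_t, 2>> A(a.data(), 2, 2);
    std::mdspan<const double, std::dextents<std::size_t, 2>> B(b.data(), 2, 2);
    std::mdspan<double, std::dextents<std::size_t, 2>> C(c.data(), 2, 2);
    gemm(1.0, A, B, 0.0, C);
    std::cout << C[0, 0] << '\n'; // prints 4
}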

Consider basic_mdarray as a sycl::buffer wrapper

Please consider whether basic_mdarray, the container variant of basic_mdspan, could correctly represent matrices or vectors backed by sycl::buffer storage. (Note how oneMKL's functions all take sycl::buffer by reference. On the other hand, sycl::buffer is reference counted, while basic_mdarray's copy construction and copy assignment deep-copy just like std::vector and the other C++ Standard Library containers.) If not, please consider whether basic_mdarray could be changed to fix this, or whether some other container or buffer type would be appropriate. If the latter, perhaps this type could benefit from standardization, in SYCL and/or in the C++ Standard Library.

hipRAND backend

Summary

We have Intel CPU, GPU and cuRAND backends for the RNG domain. Let's do one more and support AMD GPUs as well. Note that this backend would require:

  • ROCm >= 3.5
  • HIP >= 3.5
  • llvm >= 11.0.0 (with AMDGPU target)
  • hipSYCL >= 0.9.0

Problem statement

oneMKL is missing RNG (and BLAS) for AMD GPUs.

Preferred solution

Add support for hipRAND. Given the uncanny resemblance to cuRAND, I expect this will be a trivial amount of work.

Use LLVM libomp as the threading layer

Hi,

I am using Intel oneAPI 2021.3 and it comes with a MKLConfig.cmake file that allows customized importing of MKL::MKL target.

However, on macOS with brew-installed llvm and libomp, MKLConfig.cmake does not recognize libomp as an option for the threading layer:

-- MKL_ARCH: intel64
-- MKL_LINK: static
-- MKL_INTERFACE_FULL: intel_lp64
CMake Error at /usr/local/lib/cmake/mkl-2021.3.0/MKLConfig.cmake:154 (message):
  Invalid MKL_THREADING `gnu_thread`, options are: sequential intel_thread
  tbb_thread
Call Stack (most recent call first):
  /usr/local/lib/cmake/mkl-2021.3.0/MKLConfig.cmake:326 (define_param)
  CMakeLists.txt:31 (find_package)

Inspecting the relevant section of code in MKLConfig.cmake, I found it checks only the GNU compiler, not LLVM:

<omitted>
if(DPCPP_COMPILER)
  set(DEFAULT_MKL_THREADING tbb_thread)
  list(REMOVE_ITEM MKL_THREADING_LIST intel_thread)
# C, Fortran API
elseif(PGI_COMPILER)
  # PGI compiler supports PGI OpenMP threading, additionally
  list(APPEND MKL_THREADING_LIST pgi_thread)
  # PGI compiler does not support TBB threading
  list(REMOVE_ITEM MKL_THREADING_LIST tbb_thread)
  if(WIN32)
    # PGI 19.10 and 20.1 on Windows, do not support Intel OpenMP threading
    list(REMOVE_ITEM MKL_THREADING_LIST intel_thread)
    set(DEFAULT_MKL_THREADING pgi_thread)
  endif()
elseif(GNU_C_COMPILER OR GNU_Fortran_COMPILER)
  list(APPEND MKL_THREADING_LIST gnu_thread)
else()
  # Intel and Microsoft compilers
  # Nothing to do, only for completeness
endif()
define_param(MKL_THREADING DEFAULT_MKL_THREADING MKL_THREADING_LIST)
<omitted>

Also, I found this online link line advisor https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl/link-line-advisor.html
which does not provide the option of using LLVM Clang either.

So, my question is: is it supported to use LLVM libomp + LLVM Clang + Intel MKL? If so, what would be the link line?

Add several working examples

Summary

Provide several examples that can be built and run.

Problem statement

As mentioned in #142, the README example section doesn't cover all link lines, and it doesn't reflect the latest changes in compiler behavior. At least one or two full examples that users can build and execute would improve the experience with the project. Regular testing could also show whether the examples need updating when something changes in the latest compilers.

Preferred solution

Add an examples folder with several examples and include them in the main build system.

Improper namespace for BLAS domain

Summary

Namespace scopes in oneAPI are of the form oneapi::mkl::domain, e.g. oneapi::mkl::blas and oneapi::mkl::rng. In oneMKL, the RNG domain libraries are of the form oneapi::mkl::rng::library, e.g. oneapi::mkl::rng::mklgpu.

Problem statement

The namespace convention in oneMKL differs between the BLAS and RNG domain libraries: where BLAS uses oneapi::mkl::library for library namespaces, RNG uses the more appropriate oneapi::mkl::rng::library. This observation was made while reading through Integrating a Third-party Library to oneAPI Math Kernel Library (oneMKL) Interfaces:

python scripts/generate_backend_api.py include/oneapi/mkl/blas.hpp \                                  # Base header file
                                       include/oneapi/mkl/blas/detail/newlib/onemkl_blas_newlib.hpp \ # Output header file
                                       oneapi::mkl::newlib                                            # Wrappers namespace

where I believe it should be:

python scripts/generate_backend_api.py include/oneapi/mkl/blas.hpp \                                  # Base header file
                                       include/oneapi/mkl/blas/detail/newlib/onemkl_blas_newlib.hpp \ # Output header file
                                       oneapi::mkl::domain::newlib                                    # Wrappers namespace

With the former instruction, adding a new third-party library to the RNG domain generates an incorrect namespace, e.g. onemkl_rng_curand.hpp:

...
namespace oneapi {
namespace mkl {
namespace curand {} // namespace curand
} // namespace mkl
} // namespace oneapi

while it should be:

...
namespace oneapi {
namespace mkl {
namespace rng {
namespace curand {} // namespace curand
} // namespace rng
} // namespace mkl
} // namespace oneapi

Preferred solution

The BLAS domain should be namespaced consistently with the RNG domain. That is, the more suitable naming would be:

oneapi::mkl::mklcpu --> oneapi::mkl::blas::mklcpu
oneapi::mkl::mklgpu --> oneapi::mkl::blas::mklgpu
oneapi::mkl::netlib --> oneapi::mkl::blas::netlib
oneapi::mkl::cublas --> oneapi::mkl::blas::cublas

which matches the RNG domain:

oneapi::mkl::rng::mklcpu
oneapi::mkl::rng::mklgpu

This also better reflects the header file naming convention, e.g. onemkl_blas_netlib.hpp.
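
For illustration, backend wrappers in both domains would then nest under their domain namespace:

namespace oneapi {
namespace mkl {
namespace blas {
namespace cublas {} // BLAS wrappers for the cuBLAS backend
} // namespace blas
namespace rng {
namespace curand {} // RNG wrappers for the cuRAND backend
} // namespace rng
} // namespace mkl
} // namespace oneapi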

If this is an acceptable change, I will make a new PR.
