oneapi-src / oneAPI-samples

Samples for Intel® oneAPI Toolkits

Home Page: https://oneapi-src.github.io/oneAPI-samples/

License: MIT License

Makefile 0.04% C++ 70.08% CMake 0.69% C 1.66% Fortran 0.08% Shell 0.10% Python 0.34% Jupyter Notebook 2.92% HTML 24.00% Cuda 0.04% Verilog 0.01% Tcl 0.03% SystemVerilog 0.01% Batchfile 0.01% PureBasic 0.01% Stata 0.01%
oneapi ai cpu cuda fortran fpga gpu jupyter oneccl onedal

oneapi-samples's People

Contributors

ajaykumarkannan akertesz anandhv andreyfe1 ankur-singh artemrad avijitbag07 bdmoore1 hagabb igorochocki ishaghosh27 jenniferdimatteo jimmytwei jkinsky joeoster justin-rosner kanclerzpiotr krisdale krisrak louie-tsai manjulachalla mdbtucker michaelrcarroll-intel petercad praveenkk123 shuoniu-intel tyoungsc whitepau williamwang256 yuguen-intel


oneapi-samples's Issues

Feature update for intrin_dot_sample.cpp

Summary

Include a short summary of the request. Sections below provide guidance on
what factors are considered important for a feature request.

README says that "They provide access to instructions that cannot be generated using the standard constructs of the C and C++ languages, and allow code to leverage performance enhancing features unique to specific processors. "
May we use some SIMD compiler options to achieve vectorization?

Problem statement

A comparison between the performance of using intrinsics and the performance of using SIMD compiler options

Preferred solution

Add a solution to the problem where the size is not a multiple of 8. Currently, it is 24.
Add a solution, if possible, that uses SIMD compiler options to achieve vectorization (see the sketch below).
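
A minimal sketch of what the SIMD-pragma alternative might look like; the function name and the use of #pragma omp simd are illustrative assumptions, not taken from intrin_dot_sample.cpp:

// Illustrative alternative to intrinsics: let the compiler vectorize via
// '#pragma omp simd' (e.g. icx/icpx with -qopenmp-simd). Remainder iterations
// are handled automatically, so the size need not be a multiple of 8.
#include <cstddef>

float dot_simd(const float *a, const float *b, std::size_t n) {
  float sum = 0.0f;
#pragma omp simd reduction(+ : sum)
  for (std::size_t i = 0; i < n; ++i)
    sum += a[i] * b[i];
  return sum;
}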

Thanks

GPU training

Is it possible to train AI models on a GPU with oneAPI?

The current PyTorch and TensorFlow examples call mkldnn. oneDNN claims it can run on OpenCL (AMD), Intel Xe, and NVIDIA GPUs; is there any example that shows whether it is running on CPU or GPU devices?

Request to Add a oneAPI Sample OpenMP Offload Features

Summary

This is a request for a new code Sample called OpenMP Offload Features

Purpose

To show the new OpenMP offload features supported by the oneAPI DPC++/C++ compiler. Xinmin Tian and I are doing a webinar on this next week and would like to have the samples available as well.

Domain

Direct Programming C++

Description

The samples here will show the new OpenMP Offload Features supported by the oneAPI compiler.
Initially 4 features will be shown.

  1. Class Member Functor usage in an Offload Region
  2. Function Pointer usage in an Offload Region
  3. User Defined Mapper
  4. USM and DPC++ Composability

Additional samples can be added to show new features in the future.
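
For illustration, the second feature (a function pointer used in an offload region) might look roughly like the following minimal sketch; the function names are made up and the exact clauses may differ from the eventual sample:

// Minimal illustration: calling a 'declare target' function through a
// function pointer inside an OpenMP offload region.
// Compile with, e.g.: icpx -fiopenmp -fopenmp-targets=spir64 fp_offload.cpp
#include <cstdio>

#pragma omp declare target
int square(int x) { return x * x; }
#pragma omp end declare target

int main() {
  int result = 0;
  int (*fp)(int) = &square;  // host address, translated for the device (OpenMP 5.x)
#pragma omp target map(from : result)
  { result = fp(5); }
  std::printf("result = %d\n", result);  // expected: 25
  return 0;
}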

Proposed folder Location

DirectProgramming/C++/CompilerInfrastructure/OpenMP_Offload_Features

Checklist

[ ] Samples Working Group Permission accepted on

Default optimization

The README states that the default build for the OpenMP iso3dfd code creates the baseline w/o any optimizations. It appears that the default build is actually level-3 optimization. To build without optimization, you have to use cmake -DUSE_OPT3=0 ...

Cannot compile samples using VS2019 GUI

Hi,

I tried to compile the sample project with the latest version of VS2019 (16.11.4), the latest oneAPI (2021.4), and the latest samples from Nov. 1st. All samples show the same compile error with "llvm-objcopy.exe". Not sure what the problem is, but I found that I have different versions of that file installed (some in the VS 2019 folder, another in the oneAPI 2021.4 folder).

1>llvm-objcopy.exe: : error : 'x64\Debug\main.obj': function not supported
1>C:\PROGRA2\Intel\oneAPI\compiler\202141.0\windows\bin\clang-offload-bundler: : error : 'llvm-objcopy' tool failed
1>dpcpp: : error : clang-offload-bundler command failed with exit code 1 (use -v to see invocation)
1>Done building project "mandelbrot.vcxproj" -- FAILED.

Any idea what I'm doing wrong?

Thanks,
Daniel

IntelPyTorch_TorchCCL_Multinode_Training testing DLRM with aikit docker pytorch env saw Segment Fault

Summary

Saw a segmentation fault when testing the AI Kit docker image docker.io/intel/oneapi-aikit:latest with DLRM training.
Can anyone give some insight? Is there anyone we can approach internally?

Version

DLRM training is runnable with

  • g++8.4
  • torch 1.5.0a0+b58f89b tags/v1.5.0-rc3
  • intel-extension-for-pytorch 0.1 checkout tags/v0.2
  • oneCCL 2021.1-beta07-1
  • torch_ccl 1.0 2021.1-beta07-1

DLRM fails (segmentation fault) with
docker.io/intel/oneapi-aikit:latest pytorch env
versions are

  • g++ 7.5
  • torch 1.8.0a0+37c1f4a
  • intel-extension-for-pytorch 1.8.0
  • oneccl 2021.4
  • torch_ccl 1.1.0+064d9eb

Environment

docker.io/intel/oneapi-aikit:latest pytorch env

Reproduce

https://github.com/mlperf/training_results_v0.7/tree/master/Intel/benchmarks/dlrm/1-node-4s-cpx-pytorch

Observed behavior

(screenshot of the observed failure attached in the original issue)

Core dump file

core_dump.zip

dpc_common.hpp path update

Summary

Provide a short summary of the issue. Sections below provide guidance on what
factors are considered important to reproduce an issue.

Could you please update the CMake files to include the path to dpc_common.hpp?

Does %ONEAPI_ROOT%\dev-utilities\latest\include exist?

Thanks

Additional "," in PipeArray README file

Summary

There is a typo in the pipe array tutorial README file, in the "Example 1: A simple array of pipes" section:

using MyPipeArray = PipeArray< // Defined in "pipe_array.h".
class MyPipe, // An identifier for the pipe.
int, // The type of data in the pipe.
32, // The capacity of each pipe.
10, // array dimension. This line has an additional "," after "10".
>;

URLs

https://github.com/oneapi-src/oneAPI-samples/tree/master/DirectProgramming/DPC%2B%2BFPGA/Tutorials/DesignPatterns/pipe_array#example-1-a-simple-array-of-pipes

Additional details

Provide detailed description of the expected changes in documentation
and suggestions you have.

Run fails on CPU device

Summary

Run fails on the CPU device.

Version

dpcpp --version
Intel(R) oneAPI DPC++ Compiler Pro 2021.1 (2020.8.0.0819)

Environment

Linux

Steps to reproduce

dpcpp -O3 -fsycl -std=c++17 -O2 -g -DNDEBUG -ltbb -lsycl -lmpi src/main.cpp -o dpc_reduce
SYCL_DEVICE_TYPE=CPU ./dpc_reduce
Number of steps is 1000000
terminate called after throwing an instance of 'cl::sycl::runtime_error'
what(): OpenCL API failed. OpenCL API returns: -5 (CL_OUT_OF_RESOURCES) -5 (CL_OUT_OF_RESOURCES)
Aborted (core dumped)

Test Bug

Summary

Provide a short summary of the issue. Sections below provide guidance on what
factors are considered important to reproduce an issue.

Version

Report oneAPI Toolkit version and oneAPI Sample version or hash.

Environment

Provide OS information and hardware information if applicable.

Steps to reproduce

Please check that the issue is reproducible with the latest revision on
master. Include all the steps to reproduce the issue.

Observed behavior

Document behavior you observe. For performance defects, like performance
regressions or a function being slow, provide a log if possible.

Expected behavior

Document behavior you expect.

Cannot override optimization level set for particle-diffusion sample

Summary

particle-diffusion/src/CMakeLists.txt sets default optimization level to -O3, regardless of the CMAKE_BUILD_TYPE

Version

oneAPI Base Tookit 2021.4

Environment

Ubuntu 20.04
Coffee Lake / GEN9 graphics

Steps to reproduce

from particle-diffusion sample directory run:
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=DEBUG ..

Observed behavior

Running "cmake -DCMAKE_BUILD_TYPE=DEBUG .." results in both -O0 and -O3 specified in flags.make:
CMakeFiles/motionsim.exe.dir/flags.make:CXX_FLAGS = -g -O0 -O3 -std=c++17 -g
As a result, dpcpp seems to perform -O3 optimization, defeating the purpose of the DEBUG build.

Expected behavior

cmake uses -O0 -g compiler flags when CMAKE_BUILD_TYPE=DEBUG is set

Request to Add a oneAPI Sample

Description

Three samples to demonstrate the use of the dpct migration tool. These samples have been approved and developed in concert with Tom L., Yury P. and Swapna D.

Dependencies

These samples have no external header or library dependencies that need to be available within this repo. Obviously, they do require installation of the oneAPI dpct tool.

Sample Folder

These three samples will be going into the Tools/Migration folder (the Migration folder is new).

I should be issuing a pull request soon after this issue is submitted, assuming no problems. 😄

BTW - There is no CI section, yet, in the sample.json because the CI system will need to be modified in order to be able to perform automated testing on these samples.

Dependence on dpc_common.hpp

The samples depend on dpc_common.hpp. It ships with the product DPC++ compiler, but is not part of the open-source DPC++.

It would be better if the DPC++ samples worked with the open-source DPC++, especially since the missing functionality is trivial (see the sketch below).
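
A minimal sketch of a stand-in header, assuming that the pieces most samples actually use are the async exception handler and the simple timer; the namespace name is made up to avoid clashing with the real dpc_common.hpp:

// Minimal stand-in for the commonly used parts of dpc_common.hpp, so samples
// can build against open-source DPC++/SYCL as well.
#include <sycl/sycl.hpp>   // <CL/sycl.hpp> on older oneAPI releases
#include <chrono>
#include <exception>
#include <iostream>

namespace dpc_common_lite {

// Rethrows asynchronous SYCL exceptions on the host.
inline auto exception_handler = [](sycl::exception_list exceptions) {
  for (const std::exception_ptr &e : exceptions) {
    try {
      std::rethrow_exception(e);
    } catch (const sycl::exception &ex) {
      std::cerr << "Async SYCL exception: " << ex.what() << "\n";
      std::terminate();
    }
  }
};

// Simple wall-clock timer.
class TimeInterval {
 public:
  TimeInterval() : start_(std::chrono::steady_clock::now()) {}
  double Elapsed() const {
    return std::chrono::duration<double>(
               std::chrono::steady_clock::now() - start_).count();
  }
 private:
  std::chrono::steady_clock::time_point start_;
};

}  // namespace dpc_common_lite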

fatal error: 'sycl/ext/intel/fpga_extensions.hpp' file not found

Summary

I am attempting to complete some of the tutorials and reference designs. Each time I run Make in the instructions it gives the following error.

[ 25%] Building CXX object src/CMakeFiles/fpga_reg.fpga.dir/fpga_reg.cpp.o
/home/u86836/oneAPI-samples/DirectProgramming/DPC++FPGA/Tutorials/Features/fpga_reg/src/fpga_reg.cpp:7:10: fatal error: 'sycl/ext/intel/fpga_extensions.hpp' file not found
#include <sycl/ext/intel/fpga_extensions.hpp>

I am running on just the standard OneAPI login node on the DevCloud

Version

Not Sure

Environment

Ubuntu 20 LTS for host machine to ssh to the devcloud

Steps to reproduce

Just follow any of the Reference Designs or Tutorials. The error appears when I use the Make command. I have tried multiple reference designs and tutorials.

Specifically the DPC++FPGA/Tutorials/Features/fpga_reg, DPC++FPGA/ReferenceDesigns/mvdr_beamforming and DPC++FPGA/ReferenceDesigns/crr

Observed behavior

Error when using Make

[ 25%] Building CXX object src/CMakeFiles/fpga_reg.fpga.dir/fpga_reg.cpp.o
/home/u86836/oneAPI-samples/DirectProgramming/DPC++FPGA/Tutorials/Features/fpga_reg/src/fpga_reg.cpp:7:10: fatal error: 'sycl/ext/intel/fpga_extensions.hpp' file not found
#include <sycl/ext/intel/fpga_extensions.hpp>

Expected behavior

Make to complete without error.

Possible logic error in dpc_reduce example code

I think the following code has a logic error in lines 657-666: https://github.com/oneapi-src/oneAPI-samples/blob/master/DirectProgramming/DPC%2B%2B/ParallelPatterns/dpc_reduce/src/main.cpp.

(screenshot of the code excerpt from main.cpp attached in the original issue)

The if-test on line 657 has MPI rank-0 calling the DPC++ kernels to force JIT compilation. The programmer assumes that the rest of the MPI communicator will have access to the master’s cached JIT compilations. The other MPI ranks are separate processes, so they won’t have access. Contrary to the programmer’s expectations (specified on lines 660-662), the other MPI ranks will still perform the JIT compilations when the DPC++ kernels are first invoked.
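
One possible fix, sketched below, is to let every rank perform its own warm-up submission before the timed region; the helper name and the placeholder kernel are illustrative, and a real fix would submit the same kernels that are timed later:

// Sketch: each MPI rank is a separate process with its own JIT cache,
// so every rank (not just rank 0) warms up before timing starts.
#include <mpi.h>
#include <sycl/sycl.hpp>   // <CL/sycl.hpp> on older oneAPI releases

void warm_up_jit(sycl::queue &q) {
  // Placeholder kernel; in practice submit the same kernels used later
  // so exactly those kernels get JIT-compiled here.
  q.single_task([]() {}).wait();
}

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
  sycl::queue q;
  warm_up_jit(q);               // no "if (rank == 0)" guard
  MPI_Barrier(MPI_COMM_WORLD);  // start timing only after all ranks are ready
  // ... timed reduction work ...
  MPI_Finalize();
  return 0;
}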

Test Issue 3 AnjgolaRead

Summary

Provide a short summary of the issue. Sections below provide guidance on what
factors are considered important to reproduce an issue.

Version

Report oneAPI Toolkit version and oneAPI Sample version or hash.

Environment

Provide OS information and hardware information if applicable.

Steps to reproduce

Please check that the issue is reproducible with the latest revision on
master. Include all the steps to reproduce the issue.

Observed behavior

Document behavior you observe. For performance defects, like performance
regressions or a function being slow, provide a log if possible.

Expected behavior

Document behavior you expect.

Request to Add a oneAPI Sample: Initial Rendering Toolkit sample

Summary

This is a request for a new code Sample called Initial Rendering Toolkit Sample

Purpose

Answer the following questions

  • What specifically is this code sample trying to show?
    • Provide getting started guide samples for the oneAPI Rendering Toolkit (IRTK) in a similar experience to the base toolkit.
  • Why is this important to the oneAPI ecosystem?
    • Stop users from having to navigate samples in the preexisting GitHub repos (more suitable for advanced users).
    • This will be much faster to get going with than the current getting-started-guide flows anywhere.

Domain

Please supply what Domain that you feel represents your Code Sample. (Best Effort)

Likely oneVPL reviewers... Marc Valle @mav-intel. The Rendering Toolkit domain is not listed.

Description

The Initial Rendering Toolkit Sample will allow the user to build and link the most basic programs for OSPRay, Embree, Open VKL, and Open Image Denoise. Sources are taken from existing samples on Intel-managed library product repositories. Their build systems are retrofitted for the oneAPI samples repo with this update.

Dependencies

No third-party runtime dependencies are required with proposed sources.
A third-party image viewer program is needed to review the output. ImageMagick is easy to use and spans target platforms.
Open Image Denoise sample denoises the output of the OSPRay sample. ImageMagick program is used independently to prep the OSPRay data for OIDN parsing.
ImageMagick is an extremely common toolset for this interest area.

Proposed folder Location

Please include the proposed folder location for your sample to reside.
I'm using oneapi-src/oneAPI-samples/RenderingToolkit ... This hierarchy and folder name are flexible; thus far it looks like this:

$ find RenderingToolkit/ -type f
RenderingToolkit/embree_gsg/CMakeLists.txt
RenderingToolkit/embree_gsg/minimal.cpp
RenderingToolkit/embree_gsg/ospray.json
RenderingToolkit/embree_gsg/sample.json
RenderingToolkit/oidn_gsg/apps/oidnDenoise.cpp
RenderingToolkit/oidn_gsg/apps/utils/arg_parser.h
RenderingToolkit/oidn_gsg/apps/utils/CMakeLists.txt
RenderingToolkit/oidn_gsg/apps/utils/image_io.cpp
RenderingToolkit/oidn_gsg/apps/utils/image_io.h
RenderingToolkit/oidn_gsg/CMakeLists.txt
RenderingToolkit/oidn_gsg/common/CMakeLists.txt
RenderingToolkit/oidn_gsg/common/platform.cpp
RenderingToolkit/oidn_gsg/common/platform.h
RenderingToolkit/oidn_gsg/common/timer.h
RenderingToolkit/oidn_gsg/sample.json
RenderingToolkit/openvkl_gsg/CMakeLists.txt
RenderingToolkit/openvkl_gsg/sample.json
RenderingToolkit/openvkl_gsg/vklTutorial.c
RenderingToolkit/ospray_gsg/CMakeLists.txt
RenderingToolkit/ospray_gsg/ospTutorial.cpp
RenderingToolkit/ospray_gsg/sample.json
RenderingToolkit/README.md

Checklist

[ ] Samples Working Group Permission accepted on

Offload time of iso2dfd is about 100 times slower than the CPU time

Here is the output:

Initializing ...
Grid Sizes: 1000 1000
Iterations: 2000

Computing wavefield in device ..
Running on Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz
The Device Max Work Group Size is : 8192
The Device Max EUCount is : 8
Offload time: 3186.34 s

Computing wavefield in CPU ..
Initializing ...
CPU time: 39.8897 s

Final wavefields from device and CPU are equivalent: Success
Final wavefields (from device and CPU) written to disk
Finished.

I am assuming both simulations are run on the CPU because my GPU is unsupported, so what is making it run so much slower?

zero copy example does not compile on devcloud

Summary

I am trying to run the zero-copy example on DevCloud. It seems that the default platform specified in the CMakeLists.txt cannot be found on DevCloud, so I changed it to pac_s10_dc. I then got the compilation error from the aoc compiler detailed below.

Is this example supported on devcloud S10 board? If not, where can I find a board that supports USM and zero-copy?

Version

I am on the latest commit of master branch. Here is the dpcpp version

u68165@s001-n143:~$ dpcpp -v
Intel(R) oneAPI DPC++ Compiler 2021.2.0 (2021.2.0.20210317)

For Quartus and BSP version

u68165@s001-n143:~$ tools_setup -t S10DS
sourcing /glob/development-tools/versions/fpgasupportstack/d5005/2.0.1/inteldevstack/init_env.sh
export QUARTUS_HOME=/glob/development-tools/versions/fpgasupportstack/d5005/2.0.1/inteldevstack/quartus
export OPAE_PLATFORM_ROOT=/glob/development-tools/versions/fpgasupportstack/d5005/2.0.1/inteldevstack/d5005_ias_2_0_1_b237
export AOCL_BOARD_PACKAGE_ROOT=/glob/development-tools/versions/fpgasupportstack/d5005/2.0.1/inteldevstack/d5005_ias_2_0_1_b237/opencl/opencl_bsp
Adding $OPAE_PLATFORM_ROOT/bin to PATH
export INTELFPGAOCLSDKROOT=/glob/development-tools/versions/fpgasupportstack/d5005/2.0.1/inteldevstack/hld
export ALTERAOCLSDKROOT=/glob/development-tools/versions/fpgasupportstack/d5005/2.0.1/inteldevstack/hld
Adding $QUARTUS_HOME/bin to PATH
source /glob/development-tools/versions/fpgasupportstack/d5005/2.0.1/inteldevstack/hld/init_opencl.sh

Environment

Intel devcloud cluster. S10 node with oneAPI

--------------------------------------------------------------------------------------
Nodes with Stratix 10 OneAPI: (1 available/3 total)
s001-n143
--------------------------------------------------------------------------------------

Steps to reproduce

Just clone the repo, generate Makefile and run make fpga as mentioned in the instructions.

Observed behavior

Here is what I got after running make fpga

u68165@s001-n144:~/oneAPI-samples/DirectProgramming/DPC++FPGA/Tutorials/DesignPatterns/zero_copy_data_transfer/build$ make fpga
Scanning dependencies of target zero_copy_data_transfer.fpga
[ 50%] Building CXX object src/CMakeFiles/zero_copy_data_transfer.fpga.dir/zero_copy_data_transfer.cpp.o
[100%] Linking CXX executable ../zero_copy_data_transfer.fpga
aoc: Running OpenCL parser....
Error: SPIRV to LLVM IR FAILED
dpcpp: error: fpga compiler command failed with exit code 1 (use -v to see invocation)
src/CMakeFiles/zero_copy_data_transfer.fpga.dir/build.make:94: recipe for target 'zero_copy_data_transfer.fpga' failed
make[3]: *** [zero_copy_data_transfer.fpga] Error 1
CMakeFiles/Makefile2:229: recipe for target 'src/CMakeFiles/zero_copy_data_transfer.fpga.dir/all' failed
make[2]: *** [src/CMakeFiles/zero_copy_data_transfer.fpga.dir/all] Error 2
CMakeFiles/Makefile2:268: recipe for target 'src/CMakeFiles/fpga.dir/rule' failed
make[1]: *** [src/CMakeFiles/fpga.dir/rule] Error 2
Makefile:183: recipe for target 'fpga' failed
make: *** [fpga] Error 2

Expected behavior

to generate bitstream successfully.

dpl_buffer.cpp seems seg-fault and output not consistent?

Summary

Running the examples from the oneAPI-samples master branch led to inconsistent sorting output and then to a segmentation fault.

Version

oneAPI Base Toolkit, oneDPL policy execution

Environment

Device : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
Ubuntu 18.04

Steps to reproduce

Run the examples in Jupyter/oneapi-essentials-training/07_DPCPP_Library/oneDPL_Introduction.ipynb and do the same in VS Code (compile with dpcpp -o <binary_name> dpl_buffer.cpp).

Observed behavior

The first attempt's sorted output is correct; the second attempt led to a segmentation fault.
First attempt ==> sorting an array of integers v{2,3,1,4} * 3 should give 12 9 6 3
Device : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
12
9
6
3
Second Attempt ==>
Device : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
./run_dpl_buffer.sh: line 6: 13425 Segmentation fault (core dumped) ./dpl_buffer

Third attempt in VS Code on the same system; output is incorrect:

Device : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
0
3
0
6

Fourth attempt in VS Code:

Device : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
Segmentation fault (core dumped)

Expected behavior

No segmentation fault; output should be consistent and correct.

Test Bug Agola2

Summary

Provide a short summary of the issue. Sections below provide guidance on what
factors are considered important to reproduce an issue.

Version

Report oneAPI Toolkit version and oneAPI Sample version or hash.

Environment

Provide OS information and hardware information if applicable.

Steps to reproduce

Please check that the issue is reproducible with the latest revision on
master. Include all the steps to reproduce the issue.

Observed behavior

Document behavior you observe. For performance defects, like performance
regressions or a function being slow, provide a log if possible.

Expected behavior

Document behavior you expect.

Run fail on FPGA emulator device

Summary
Run fails on the FPGA emulator device.

Version
dpcpp --version
Intel(R) oneAPI DPC++ Compiler Pro 2021.1 (2020.8.0.0827)

Environment
Linux

Steps to reproduce
dpcpp -O3 -fsycl -std=c++17 -O2 -g -DNDEBUG -ltbb -lsycl -lmpi src/main.cpp -o dpc_reduce
SYCL_DEVICE_TYPE="ACC" ./dpc_reduce
Intel(R) FPGA Emulation Device
Number of steps is 1000000
terminate called after throwing an instance of 'cl::sycl::runtime_error'
what(): OpenCL API failed. OpenCL API returns: -5 (CL_OUT_OF_RESOURCES) -5 (CL_OUT_OF_RESOURCES)
Aborted (core dumped)

Request to Add a oneAPI Sample OpenMP Offload Jupyter Notebook

Summary

I have some OpenMP Offload tutorial Jupyter notebooks and associated code that I would like to add to oneAPI-samples (currently they are on the DevCloud). This is similar to the DPC++ Essentials training that is currently in the samples.
I have both C++ and Fortran Jupyter notebooks. The samples should reside in DirectProgramming/C++/Jupyter and DirectProgramming/Fortran/Jupyter.

Problem statement

Having these notebooks and code samples would help introduce users to OpenMP Offload which is supported in the oneAPI HPC toolkit.

@pmpeter1 @JoeOster @srdontha

Replace 'dpstd' to 'oneapi::dpl' for dpc_reduce sample test

Summary

Newer oneDPL releases will remove the dpstd namespace, so please update the dpc_reduce test to replace 'dpstd' with 'oneapi::dpl'.

Problem line

main.cpp
750: buffer calc_values(results_per_rank, num_step_per_rank);
751: auto calc_begin2 = dpstd::begin(calc_values);
752: auto calc_end2 = dpstd::end(calc_values);
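
For reference, a self-contained sketch of the requested change; the buffer type and size here are illustrative, not the sample's:

// dpstd::begin/end become oneapi::dpl::begin/end in newer oneDPL releases.
#include <oneapi/dpl/execution>
#include <oneapi/dpl/algorithm>
#include <oneapi/dpl/iterator>
#include <sycl/sycl.hpp>   // <CL/sycl.hpp> on older oneAPI releases

int main() {
  sycl::buffer<float> calc_values{sycl::range<1>(1024)};
  auto calc_begin2 = oneapi::dpl::begin(calc_values);   // was dpstd::begin(...)
  auto calc_end2   = oneapi::dpl::end(calc_values);     // was dpstd::end(...)
  oneapi::dpl::fill(oneapi::dpl::execution::dpcpp_default,
                    calc_begin2, calc_end2, 0.0f);
  return 0;
}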

question about __attribute__((always_inline))

I don't observe a performance difference whether __attribute__((always_inline)) is added for the sepia filter.
Can you please explain whether this attribute is needed?

Thanks

// always_inline as calls are expensive on Gen GPU.

__attribute__((always_inline)) static void ApplyFilter(uint8_t *src_image, 
                                                       uint8_t *dst_image,
                                                       int i) {

[SYCL][FPGA] unknown attributes warnings

Summary

unknown attribute

Version

Beta10

Environment

Intel DevCloud

Steps to reproduce

DPC++FPGA/Tutorials/Features/

Observed behavior

1 warning: unknown attribute 'loop_coalesce' ignored [-Wunknown-attributes]
[[intel::loop_coalesce(coalesce_factor)]]

2 warning: unknown attribute 'ivdep' ignored [-Wunknown-attributes]
[[intel::ivdep(safe_len)]]
^

Initialize environment variables gives error

Summary

Running

 >Intel oneAPI: Initialize environment variables

in VS Code on macOS returns Error 1.
(screenshot attached in the original issue)

Version

oneAPI Base Toolkit version 2021.2
oneAPI HPC Toolkit version 2021.3

Environment

Operating system:
MacOS Catalina 10.15.7

Visual Studio Code:
Version: 1.58.0 (Universal)
Commit: 2d23c42a936db1c7b3b06f918cde29561cc47cd6
Date: 2021-07-08T06:54:17.694Z
Electron: 12.0.13
Chrome: 89.0.4389.128
Node.js: 14.16.0
V8: 8.9.255.25-electron.0
OS: Darwin x64 19.6.0

Steps to reproduce

I haven't cloned the repo, so I am not sure whether it is reproducible with the master branch.

Observed behavior

Another notification pops up after the error saying that all new terminals will have their environment set. Also, another pop-up window in the bottom right of VS Code appears saying it found the setvars.sh script. However, opening a new terminal in VS Code and typing "ifort --version" returns an error.

Expected behavior

The result from the get started tutorials

Request to Add a oneAPI Sample Jacobi

Summary

This is a request for a new code Sample called Jacobi.

Purpose

The sample code contains several bugs, so our users can try the debugger to find and fix real bugs in this sample.

Domain

Tools / ApplicationDebugger

Description

This is a new sample for the Application debugger. It is more complicated than the Array transform and has several bugs introduced intentionally.
The program solves the linear equation Ax=b, where matrix A is an n x n sparse matrix with diagonals [1 1 4 1 1], and vector b is set such that the solution is [1 1 ... 1]^T. The linear system is solved via Jacobi iteration. Each Jacobi iteration submits a kernel to the device (CPU, GPU, FPGA).
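
For reference, a plain C++ sketch of one Jacobi sweep for this matrix; the real sample submits each sweep as a SYCL kernel and intentionally contains bugs, so this is only an illustration of the math:

// One Jacobi sweep for Ax = b with A having diagonals [1 1 4 1 1]:
// x_new[i] = (b[i] - (x[i-2] + x[i-1] + x[i+1] + x[i+2])) / 4
#include <cstddef>
#include <vector>

void jacobi_sweep(const std::vector<float> &b,
                  const std::vector<float> &x_old,
                  std::vector<float> &x_new) {
  const long n = static_cast<long>(b.size());
  auto at = [&](long i) {  // out-of-range neighbors contribute 0
    return (i >= 0 && i < n) ? x_old[static_cast<std::size_t>(i)] : 0.0f;
  };
  for (long i = 0; i < n; ++i) {
    float off_diag = at(i - 2) + at(i - 1) + at(i + 1) + at(i + 2);
    x_new[static_cast<std::size_t>(i)] =
        (b[static_cast<std::size_t>(i)] - off_diag) / 4.0f;
  }
}
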
@barisaktemur
@JoeOster

Dependencies

oneAPI BaseKit: DPCPP compiler

Proposed folder Location

Tools/ApplicationDebugger/jacobi

Test Bug Anjgola Triage

Summary

Provide a short summary of the issue. Sections below provide guidance on what
factors are considered important to reproduce an issue.

Version

Report oneAPI Toolkit version and oneAPI Sample version or hash.

Environment

Provide OS information and hardware information if applicable.

Steps to reproduce

Please check that the issue is reproducible with the latest revision on
master. Include all the steps to reproduce the issue.

Observed behavior

Document behavior you observe. For performance defects, like performance
regressions or a function being slow, provide a log if possible.

Expected behavior

Document behavior you expect.

Error in the execution of PyTorch_Hello_World.py

Following the tutorial to make this sample work, after the installation of the oneAPI AI Analytics Toolkit, I executed the commands:


. /opt/intel/oneapi/setvars.sh
conda activate pytorch
cd /opt/intel/oneapi/intelpython/latest/envs/pytorch (where I cloned the script, as administrator)
python PyTorch_Hello_World.py 

I get the error:

Segmentation fault (core dumped)

Instead, if I try to clone the environment:

conda create --name usr_pytorch --clone pytorch

the error is:

Source:      /opt/intel/oneapi/intelpython/latest/envs/pytorch
Destination: /home/franka/.conda/envs/usr_pytorch
The following packages cannot be cloned out of the root environment:
 - file:///opt/intel/oneapi/conda_channel/linux-64::conda-4.9.2-py37hea4d9f2_0
Packages: 74
Files: 204

Downloading and Extracting Packages
cpuonly-1.0          | ########################################################################################################################################################## | 100% 
dataclasses-0.8      | ########################################################################################################################################################## | 100% 
python_abi-3.7       | ########################################################################################################################################################## | 100% 

# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<

    Traceback (most recent call last):
      File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/conda/exceptions.py", line 1079, in __call__
        return func(*args, **kwargs)
      File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/conda/cli/main.py", line 84, in _main
        exit_code = do_call(args, p)
      File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/conda/cli/conda_argparse.py", line 83, in do_call
        return getattr(module, func_name)(args, parser)
      File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/conda/cli/main_create.py", line 41, in execute
        install(args, parser, 'create')
      File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/conda/cli/install.py", line 222, in install
        clone(args.clone, prefix, json=context.json, quiet=context.quiet, index_args=index_args)
      File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/conda/cli/install.py", line 74, in clone
        index_args=index_args)
      File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/conda/misc.py", line 290, in clone_env
        force_extract=False, index_args=index_args)
      File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/conda/misc.py", line 90, in explicit
        assert not any(spec_pcrec[1] is None for spec_pcrec in specs_pcrecs)
    AssertionError

`$ /opt/intel/oneapi/intelpython/latest/bin/conda create --name usr_pytorch --clone pytorch`

  environment variables:
                 CIO_TEST=<not set>
                CLASSPATH=/opt/intel/oneapi/mpi/2021.2.0//lib/mpi.jar:/opt/intel/oneapi/dal/2021
                          .2.0/lib/onedal.jar
        CMAKE_PREFIX_PATH=/opt/intel/oneapi/tbb/2021.2.0/env/..:/opt/intel/oneapi/dal/2021.2.0:/
                          home/franka/elia_ws/devel:/home/franka/catkin_ws/devel:/home/franka/ws
                          _moveit/devel:/opt/ros/melodic
                CONDA_EXE=/opt/intel/oneapi/intelpython/latest/bin/conda
         CONDA_PYTHON_EXE=/opt/intel/oneapi/intelpython/latest/bin/python
               CONDA_ROOT=/opt/intel/oneapi/intelpython/latest
              CONDA_SHLVL=0
                    CPATH=/opt/intel/oneapi/tbb/2021.2.0/env/../include:/opt/intel/oneapi/mpi/20
                          21.2.0//include:/opt/intel/oneapi/mkl/latest/include:/opt/intel/oneapi
                          /ipp/2021.2.0/include:/opt/intel/oneapi/dev-utilities/2021.2.0/include
                          :/opt/intel/oneapi/dal/2021.2.0/include:/opt/intel/oneapi/compiler/202
                          1.2.0/linux/include
           CURL_CA_BUNDLE=<not set>
         FI_PROVIDER_PATH=
          LD_LIBRARY_PATH=/opt/intel/oneapi/tbb/2021.2.0/env/../lib/intel64/gcc4.8:/opt/intel/on
                          eapi/mpi/2021.2.0//libfabric/lib:/opt/intel/oneapi/mpi/2021.2.0//lib/r
                          elease:/opt/intel/oneapi/mpi/2021.2.0//lib:/opt/intel/oneapi/mkl/lates
                          t/lib/intel64:/opt/intel/oneapi/ipp/2021.2.0/lib/intel64:/opt/intel/on
                          eapi/dal/2021.2.0/lib/intel64:/opt/intel/oneapi/compiler/2021.2.0/linu
                          x/lib:/opt/intel/oneapi/compiler/2021.2.0/linux/lib/x64:/opt/intel/one
                          api/compiler/2021.2.0/linux/lib/emu:/opt/intel/oneapi/compiler/2021.2.
                          0/linux/compiler/lib/intel64_lin:/opt/intel/oneapi/compiler/2021.2.0/l
                          inux/compiler/lib:/home/franka/elia_ws/devel/lib:/home/franka/catkin_w
                          s/devel/lib:/home/franka/ws_moveit/devel/lib:/opt/ros/melodic/lib:/opt
                          /halcon/lib/x64-linux
             LIBRARY_PATH=/opt/intel/oneapi/tbb/2021.2.0/env/../lib/intel64/gcc4.8:/opt/intel/on
                          eapi/mpi/2021.2.0//libfabric/lib:/opt/intel/oneapi/mpi/2021.2.0//lib/r
                          elease:/opt/intel/oneapi/mpi/2021.2.0//lib:/opt/intel/oneapi/mkl/lates
                          t/lib/intel64:/opt/intel/oneapi/ipp/2021.2.0/lib/intel64:/opt/intel/on
                          eapi/dal/2021.2.0/lib/intel64:/opt/intel/oneapi/compiler/2021.2.0/linu
                          x/compiler/lib/intel64_lin:/opt/intel/oneapi/compiler/2021.2.0/linux/l
                          ib
                  MANPATH=/opt/intel/oneapi/mpi/2021.2.0/man::/opt/intel/oneapi/compiler/2021.2.
                          0/documentation/en/man/common:
                  NLSPATH=/opt/intel/oneapi/mkl/latest/lib/intel64/locale/%l_%t/%N
                     PATH=/opt/intel/oneapi/intelpython/latest/bin:/opt/intel/oneapi/intelpython
                          /latest/bin/libfabric:/opt/intel/oneapi/mpi/2021.2.0/libfabric/bin:/op
                          t/intel/oneapi/mpi/2021.2.0/bin:/opt/intel/oneapi/mkl/latest/bin/intel
                          64:/opt/intel/oneapi/dev-utilities/2021.2.0/bin:/opt/intel/oneapi/comp
                          iler/2021.2.0/linux/bin/intel64:/opt/intel/oneapi/compiler/2021.2.0/li
                          nux/bin:/opt/intel/oneapi/compiler/2021.2.0/linux/ioc/bin:/opt/ros/mel
                          odic/bin:/home/franka/anaconda3/condabin:/opt/halcon/bin/x64-linux:/ho
                          me/franka/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
                          :/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
          PKG_CONFIG_PATH=/opt/intel/oneapi/mkl/latest/tools/pkgconfig:/home/franka/elia_ws/deve
                          l/lib/pkgconfig:/home/franka/catkin_ws/devel/lib/pkgconfig:/home/frank
                          a/ws_moveit/devel/lib/pkgconfig:/opt/ros/melodic/lib/pkgconfig
               PYTHONPATH=/home/franka/catkin_ws/devel/lib/python2.7/dist-
                          packages:/home/franka/ws_moveit/devel/lib/python2.7/dist-
                          packages:/opt/ros/melodic/lib/python2.7/dist-packages
       REQUESTS_CA_BUNDLE=<not set>
         ROS_PACKAGE_PATH=/home/franka/elia_ws/src:/home/franka/catkin_ws/src:/home/franka/ws_mo
                          veit/src/franka_ros/franka_description:/home/franka/ws_moveit/src/fran
                          ka_ros/franka_gripper:/home/franka/ws_moveit/src/franka_ros/franka_msg
                          s:/home/franka/ws_moveit/src/franka_ros/franka_hw:/home/franka/ws_move
                          it/src/franka_ros/franka_control:/home/franka/ws_moveit/src/franka_ros
                          /franka_example_controllers:/home/franka/ws_moveit/src/franka_ros/fran
                          ka_ros:/home/franka/ws_moveit/src/franka_ros/franka_visualization:/hom
                          e/franka/ws_moveit/src/geometric_shapes:/home/franka/ws_moveit/src/han
                          deye_calibration:/home/franka/ws_moveit/src/moveit/moveit:/home/franka
                          /ws_moveit/src/moveit_calibration-master/moveit_calibration_plugins:/h
                          ome/franka/ws_moveit/src/moveit_msgs:/home/franka/ws_moveit/src/moveit
                          /moveit_planners/moveit_planners:/home/franka/ws_moveit/src/moveit/mov
                          eit_plugins/moveit_plugins:/home/franka/ws_moveit/src/moveit_resources
                          /moveit_resources:/home/franka/ws_moveit/src/moveit_resources/fanuc_de
                          scription:/home/franka/ws_moveit/src/moveit_resources/fanuc_moveit_con
                          fig:/home/franka/ws_moveit/src/moveit/moveit_commander:/home/franka/ws
                          _moveit/src/moveit_resources/panda_description:/home/franka/ws_moveit/
                          src/moveit_resources/panda_moveit_config:/home/franka/ws_moveit/src/mo
                          veit_resources/pr2_description:/home/franka/ws_moveit/src/moveit/movei
                          t_core:/home/franka/ws_moveit/src/moveit/moveit_planners/chomp/chomp_m
                          otion_planner:/home/franka/ws_moveit/src/moveit/moveit_planners/chomp/
                          chomp_optimizer_adapter:/home/franka/ws_moveit/src/moveit/moveit_ros/m
                          oveit_ros:/home/franka/ws_moveit/src/moveit/moveit_ros/occupancy_map_m
                          onitor:/home/franka/ws_moveit/src/moveit/moveit_ros/perception:/home/f
                          ranka/ws_moveit/src/moveit/moveit_ros/planning:/home/franka/ws_moveit/
                          src/moveit/moveit_plugins/moveit_fake_controller_manager:/home/franka/
                          ws_moveit/src/moveit/moveit_kinematics:/home/franka/ws_moveit/src/move
                          it/moveit_planners/ompl:/home/franka/ws_moveit/src/moveit/moveit_ros/m
                          ove_group:/home/franka/ws_moveit/src/moveit/moveit_ros/manipulation:/h
                          ome/franka/ws_moveit/src/moveit/moveit_ros/robot_interaction:/home/fra
                          nka/ws_moveit/src/moveit/moveit_ros/warehouse:/home/franka/ws_moveit/s
                          rc/moveit/moveit_ros/benchmarks:/home/franka/ws_moveit/src/moveit/move
                          it_ros/planning_interface:/home/franka/ws_moveit/src/moveit/moveit_pla
                          nners/chomp/chomp_interface:/home/franka/ws_moveit/src/moveit/moveit_r
                          os/visualization:/home/franka/ws_moveit/src/moveit/moveit_runtime:/hom
                          e/franka/ws_moveit/src/moveit/moveit_ros/moveit_servo:/home/franka/ws_
                          moveit/src/moveit/moveit_setup_assistant:/home/franka/ws_moveit/src/mo
                          veit/moveit_plugins/moveit_simple_controller_manager:/home/franka/ws_m
                          oveit/src/moveit/moveit_plugins/moveit_ros_control_interface:/home/fra
                          nka/ws_moveit/src/panda_moveit_config:/home/franka/ws_moveit/src/rviz_
                          visual_tools:/home/franka/ws_moveit/src/moveit_visual_tools:/home/fran
                          ka/ws_moveit/src/moveit_calibration-
                          master/moveit_calibration_gui:/opt/ros/melodic/share
        SETVARS_VARS_PATH=/opt/intel/oneapi/tensorflow/latest/env/vars.sh
            SSL_CERT_FILE=<not set>
     TERMINATOR_DBUS_PATH=/net/tenshu/Terminator2
               WINDOWPATH=2

     active environment : None
            shell level : 0
       user config file : /home/franka/.condarc
 populated config files : /opt/intel/oneapi/intelpython/latest/.condarc
                          /home/franka/.condarc
          conda version : 4.9.2
    conda-build version : not installed
         python version : 3.7.9.final.0
       virtual packages : __glibc=2.27=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /opt/intel/oneapi/intelpython/latest  (read only)
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
                          file:///opt/intel/oneapi/conda_channel/linux-64
                          file:///opt/intel/oneapi/conda_channel/noarch
                          https://conda.anaconda.org/intel/linux-64
                          https://conda.anaconda.org/intel/noarch
          package cache : /opt/intel/oneapi/intelpython/latest/pkgs
                          /home/franka/.conda/pkgs
       envs directories : /home/franka/.conda/envs
                          /opt/intel/oneapi/intelpython/latest/envs
               platform : linux-64
             user-agent : conda/4.9.2 requests/2.25.1 CPython/3.7.9 Linux/5.6.19-rt11 ubuntu/18.04.5 glibc/2.27
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False


An unexpected error has occurred. Conda has prepared the above report.

How can this be solved?

Request to Add a oneAPI Sample OpenCLInterop

Summary

This is a request for a new code Sample called DPC++ OpenCL Interoperabilty

Purpose

This code sample will show how OpenCL objects and kernels can interact with DPC++. This is useful for OpenCL programmers who want to migrate to DPC++ in a piecemeal manner, making the move from OpenCL easier.

Domain

Direct Programming / DPC++

Description

Two examples: one showing DPC++ compiling and launching an OpenCL kernel, and another showing how DPC++ can use OpenCL memory, context, platform, device, and kernel objects.

Proposed folder Location

oneAPI-samples/DirectProgramming/DPC++/OpenCLInterop

@JoeOster @pmpeter1 @bjodom @moushumi-maria

OpenMP offloading options

What are the OpenMP offloading options in Beta10? Thanks.

:~/oneAPI-samples/DirectProgramming/C++/StructuredGrids/iso3dfd_omp_offload/build$ make
[ 25%] Building CXX object src/CMakeFiles/iso3dfd.dir/iso3dfd.cpp.o
clang++: error: unsupported option '-fiopenmp'
clang++: error: unsupported option '-fopenmp-targets=spir64'

src/CMakeFiles/iso3dfd.dir/build.make:62: recipe for target 'src/CMakeFiles/iso3dfd.dir/iso3dfd.cpp.o' failed
make[2]: *** [src/CMakeFiles/iso3dfd.dir/iso3dfd.cpp.o] Error 1
CMakeFiles/Makefile2:86: recipe for target 'src/CMakeFiles/iso3dfd.dir/all' failed
make[1]: *** [src/CMakeFiles/iso3dfd.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

$ which icpx
/opt/intel/oneapi/compiler/2021.1-beta10/linux/bin/icpx

oneAPI for FPGA kernels may not be executing concurrently

Summary

Producer and Consumer kernels in the oneAPI pipe examples may not be executing concurrently. The reported start and end times for the kernels do not suggest concurrent execution. Is there a more appropriate way to collect overall FPGA profiling information than using the sycl::info commands inside of each kernel definition?

Version

OneAPI Beta 10 (latest on Intel DevCloud)

Environment

Intel DevCloud default

Steps to reproduce

Followed steps listed at https://github.com/oneapi-src/oneAPI-samples/tree/master/DirectProgramming/DPC%2B%2BFPGA/Tutorials/Features/pipes using source code there.

Observed behavior

Kernels are not executing concurrently. When adding profiling to the kernels, the first kernel (Producer) reports its start and end time, and the subsequent kernel's (Consumer) start time is after the first kernel's end time.

Profiling commands after q is submitted in the kernel definitions:
k_start = q.get_profiling_info<sycl::info::event_profiling::command_start>();
k_end = q.get_profiling_info<sycl::info::event_profiling::command_end>();
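
A hedged sketch of one common way to time the two kernels: enable profiling on the queue and query the events returned by the submissions. The kernel bodies below are placeholders, not the tutorial's Producer/Consumer code:

#include <sycl/sycl.hpp>   // <CL/sycl.hpp> on older oneAPI releases
#include <iostream>

int main() {
  sycl::queue q{sycl::property::queue::enable_profiling{}};

  sycl::event producer = q.single_task([]() { /* produce */ });
  sycl::event consumer = q.single_task([]() { /* consume */ });
  q.wait();

  auto start = [](const sycl::event &e) {
    return e.get_profiling_info<sycl::info::event_profiling::command_start>();
  };
  auto end = [](const sycl::event &e) {
    return e.get_profiling_info<sycl::info::event_profiling::command_end>();
  };

  // Overlap exists if the consumer starts before the producer ends.
  std::cout << "overlap: "
            << (start(consumer) < end(producer) ? "yes" : "no") << "\n";
  return 0;
}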

Expected behavior

Consumer kernel's start time should be before Producer kernel's end time.

Test bug agola2 read

Summary

Provide a short summary of the issue. Sections below provide guidance on what
factors are considered important to reproduce an issue.

Version

Report oneAPI Toolkit version and oneAPI Sample version or hash.

Environment

Provide OS information and hardware information if applicable.

Steps to reproduce

Please check that the issue is reproducible with the latest revision on
master. Include all the steps to reproduce the issue.

Observed behavior

Document behavior you observe. For performance defects, like performance
regressions or a function being slow, provide a log if possible.

Expected behavior

Document behavior you expect.

read-only accessor does not return const, it should be changed to read-write accessor

Summary

Defining a read-only accessor results in an error when its elements are used, because they are returned as const objects.
In this example:
https://github.com/oneapi-src/oneAPI-samples/blob/master/DirectProgramming/DPC%2B%2B/Jupyter/oneapi-essentials-training/02_DPCPP_Program_Structure/src/complex_mult_solution.cpp#L61-L70

V1 and V2 must be defined with read-write accessors in order to call complex_mul (see the sketch below for the underlying reason).
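
A simplified sketch of the underlying issue and one possible fix; the struct and function below are stand-ins for the tutorial's types, not the tutorial code. Elements obtained through a read-only accessor are const, so the member function must itself be declared const to be callable; otherwise the accessors have to be read-write:

#include <sycl/sycl.hpp>   // <CL/sycl.hpp> on older oneAPI releases
#include <vector>

struct Complex2 {
  float re, im;
  // Declaring the member function const lets it be called on elements
  // of a read-only accessor.
  Complex2 complex_mul(const Complex2 &o) const {
    return {re * o.re - im * o.im, re * o.im + im * o.re};
  }
};

int main() {
  std::vector<Complex2> v1(4, {1, 2}), v2(4, {3, 4}), v3(4, {0, 0});
  sycl::queue q;
  {
    sycl::buffer b1(v1), b2(v2), b3(v3);
    q.submit([&](sycl::handler &h) {
      sycl::accessor a1(b1, h, sycl::read_only);
      sycl::accessor a2(b2, h, sycl::read_only);
      sycl::accessor a3(b3, h, sycl::write_only);
      h.parallel_for(sycl::range<1>(v1.size()), [=](sycl::id<1> i) {
        a3[i] = a1[i].complex_mul(a2[i]);  // OK because complex_mul is const
      });
    });
  }
  return 0;
}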

Environment

DevCloud, Jupyter notebook

Steps to reproduce

Run the example code in tutorial.

test

Summary

Provide a short summary of the issue. Sections below provide guidance on what
factors are considered important to reproduce an issue.

Version

Report oneAPI Toolkit version and oneAPI Sample version or hash.

Environment

Provide OS information and hardware information if applicable.

Steps to reproduce

Please check that the issue is reproducible with the latest revision on
master. Include all the steps to reproduce the issue.

Observed behavior

Document behavior you observe. For performance defects, like performance
regressions or a function being slow, provide a log if possible.

Expected behavior

Document behavior you expect.

printf question

It is not clear why printf is not supported in DPC++. printf is supported in DPC++ with CUDA support.
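
For context, a minimal sketch of the portable DPC++ mechanism for in-kernel output, sycl::stream, which is what dpct migrates printf calls to (the buffer sizes below are illustrative):

#include <sycl/sycl.hpp>   // <CL/sycl.hpp> on older oneAPI releases

int main() {
  sycl::queue q;
  q.submit([&](sycl::handler &h) {
     // total buffer size and per-work-item size are illustrative choices
     sycl::stream out(1024, 256, h);
     h.single_task([=]() { out << "kernel_main!" << sycl::endl; });
   }).wait();
  return 0;
}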

Addressing Warnings in Migrated Code
Migration generated one warning for code that dpct could not migrate:

warning: DPCT1015:0: Output needs adjustment.
As you have noticed, the migration of this project resulted in one DPCT message that needs to be addressed, DPCT1015. This message is shown because the Compatibility Tool migrated the printf-style formatted string in the CUDA code to the output stream supported by DPC++, and manual adjustment is needed to generate the equivalent output.

Open result/foo/bar/util.dp.cpp and locate the error DPCT1015. Then make the following changes:

Change:

stream_ct1 << "kernel_util,%d\n";
to

stream_ct1 << "kernel_util," << c << sycl::endl;
You’ll also need to change the stream statement in result/foo/main.dp.cpp.

Change:

stream_ct1 << "kernel_main!\n";
to

stream_ct1 << "kernel_main!" << sycl::endl;

Unit of time in iso2dfd.cpp

Hi,
I think the unit of time in DirectProgramming/DPC++/StructuredGrids/iso2dfd_dpcpp/src/iso2dfd.cpp
is not correct: "ms" is displayed while it should be "s".

Request to Add a oneAPI Sample fourier_correlation

Summary

This is a request for a new code sample called fourier_correlation, which composes the Fourier correlation algorithm from oneMKL functions.

Purpose

The samples show how to compose more complex mathematical operations from multiple functions while paying attention to where the data resides (to minimize host-device transfers) and where explicit synchronization is required vs. where synchronization is implicit in the task graph.

Description

The Fourier correlation algorithm is commonly used to align 1D signals, overlay 2D images, perform 3D volumetric medical image registration, etc. The algorithm has high arithmetic intensity and the datasets are usually large so performance is critical.

The Fourier correlation algorithm is corr = IDFT(DFT(sig1) * CONJG(DFT(sig2))) where sig1 and sig2 are the real input data (e.g., 1D signals, 2D images, or 3D volumetric images), DFT is the discrete Fourier transform, IDFT is the inverse DFT, and CONJG is the complex conjugate. The necessary functions are available in oneMKL. In addition, oneMKL random number generators are used to add noise to the input data, which is a common technique in signal processing.

The two example codes implement the Fourier correlation algorithm using explicit buffering and USM. They show how to compose more complex mathematical operations from multiple functions while paying attention to where the data resides (to minimize host-device transfers) and where explicit synchronization is required vs. where synchronization is implicit in the task graph.
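
To make the formula concrete, here is a naive O(N^2) std::complex illustration of corr = IDFT(DFT(sig1) * CONJG(DFT(sig2))); the actual sample uses oneMKL DFT routines and SYCL queues, which this sketch does not:

#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Naive DFT: sign = -1 for forward, +1 for inverse (inverse scaled by 1/N).
static std::vector<cd> dft(const std::vector<cd> &x, int sign) {
  const double pi = std::acos(-1.0);
  const std::size_t n = x.size();
  std::vector<cd> y(n);
  for (std::size_t k = 0; k < n; ++k) {
    cd sum{0.0, 0.0};
    for (std::size_t j = 0; j < n; ++j)
      sum += x[j] * std::polar(1.0, sign * 2.0 * pi *
                               static_cast<double>(k * j) / static_cast<double>(n));
    y[k] = (sign > 0) ? sum / static_cast<double>(n) : sum;
  }
  return y;
}

std::vector<cd> fourier_correlation(const std::vector<cd> &sig1,
                                    const std::vector<cd> &sig2) {
  auto f1 = dft(sig1, -1);
  auto f2 = dft(sig2, -1);
  for (std::size_t i = 0; i < f1.size(); ++i)
    f1[i] *= std::conj(f2[i]);  // DFT(sig1) * CONJG(DFT(sig2))
  return dft(f1, +1);           // IDFT(...)
}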

@srdontha
@JoeOster
@petercad

Domain

Libraries/oneMKL

Dependencies

None

Proposed folder Location

oneAPI-samples/Libraries/oneMKL/fourier_correlation

fpga compiler command failed with exit code 1

Summary

fpga compiler command failed with exit code 1

Steps to reproduce

Following the tutorial, I can't compile the program for FPGA.

Observed behavior

u50623@s001-n001:~/download/oneAPI-samples/DirectProgramming/DPC++/DenseLinearAlgebra/simple-add$ make hw -f Makefile.fpga
dpcpp -O2 -g -std=c++17 -fintelfpga a_buffers.o -o simple-add-buffers.fpga -Xshardware
aoc: Compiling for FPGA. This process may take several hours to complete. Prior to performing this compile, be sure to check the reports to ensure the design will meet your performance targets. If the reports indicate performance targets are not being met, code edits may be required. Please refer to the oneAPI FPGA Optimization Guide for information on performance tuning applications for FPGAs.
Error (23035): Tcl error:
Error (23031): Evaluation of Tcl script build/entry.tcl unsuccessful
Error: Quartus Prime Shell was unsuccessful. 2 errors, 0 warnings
For more detail, full Quartus compile output can be found in files quartuserr.tmp and quartus_sh_compile.log.
Error: Compiler Error, not able to generate hardware

dpcpp: error: fpga compiler command failed with exit code 1 (use -v to see invocation)
Makefile.fpga:33: recipe for target 'simple-add-buffers.fpga' failed
make: *** [simple-add-buffers.fpga] Error 1

What should I do?

dpcpp sample iso3dfd run on the incorrect device on tigerlake

Summary

Provide a short summary of the issue. Sections below provide guidance on what
factors are considered important to reproduce an issue.

std::string pattern("CPU");
std::string pattern_gpu("Gen");
// Replacing the pattern string to Gen if running on a GPU
if (is_gpu) {
pattern.replace(0, 3, pattern_gpu);
}

  • the pattern does not work on Tiger Lake (see the sketch below)
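
A hedged sketch of a more robust check than matching the device-name string ("Gen" does not appear in "Intel(R) Iris(R) Xe Graphics" on Tiger Lake): ask the SYCL device for its type instead of parsing its name. The selector uses SYCL 2020 syntax; 2021.x releases spell it sycl::gpu_selector{}.

#include <sycl/sycl.hpp>   // <CL/sycl.hpp> on older oneAPI releases
#include <iostream>

int main() {
  sycl::queue q{sycl::gpu_selector_v};  // sycl::cpu_selector_v for the CPU run
  sycl::device dev = q.get_device();
  std::cout << "Running on: " << dev.get_info<sycl::info::device::name>() << "\n"
            << "Is GPU: " << std::boolalpha << dev.is_gpu() << "\n";
  return 0;
}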

Version

Report oneAPI Toolkit version and oneAPI Sample version or hash.

  • 2021.4

Environment

Provide OS information and hardware information if applicable.

  • Ubuntu 20.04

Steps to reproduce

Please check that the issue is reproducible with the latest revision on
master. Include all the steps to reproduce the issue.

  • Just run on tigerlake

Observed behavior

Document behavior you observe. For performance defects, like performance
regressions or a function being slow, provide a log if possible.

  • make run should run on the GPU, but runs on the CPU
  • make run_cpu should run on the CPU, but runs on the GPU
$ make run 
Grid Sizes: 256 256 256
Memory Usage: 230 MB
 ***** Running C++ Serial variant *****
Initializing ...
--------------------------------------
time         : 1.91385 secs
throughput   : 87.6621 Mpts/s
flops        : 5.34739 GFlops
bytes        : 1.05195 GBytes/s

--------------------------------------

--------------------------------------
 ***** Running SYCL variant *****
Initializing ...
 Running on 11th Gen Intel(R) Core(TM) i7-1185G7E @ 2.80GHz
 The Device Max Work Group Size is : 8192
 The Device Max EUCount is : 8
 The blockSize x is : 32
 The blockSize y is : 8
 Using Global Memory Kernel
--------------------------------------
time         : 0.657646 secs
throughput   : 255.11 Mpts/s
flops        : 15.5617 GFlops
bytes        : 3.06132 GBytes/s

--------------------------------------

--------------------------------------
Final wavefields from SYCL device and CPU are equivalent: Success

$ make run_cpu

Scanning dependencies of target run_cpu
Grid Sizes: 256 256 256
Memory Usage: 230 MB
 ***** Running C++ Serial variant *****
Initializing ...
--------------------------------------
time         : 1.7656 secs
throughput   : 95.0226 Mpts/s
flops        : 5.79638 GFlops
bytes        : 1.14027 GBytes/s

--------------------------------------

--------------------------------------
 ***** Running SYCL variant *****
Initializing ...
 Running on Intel(R) Iris(R) Xe Graphics [0x9a49]
 The Device Max Work Group Size is : 512
 The Device Max EUCount is : 96
 The blockSize x is : 256
 The blockSize y is : 1
 Using Global Memory Kernel
--------------------------------------
time         : 0.505061 secs
throughput   : 332.182 Mpts/s
flops        : 20.2631 GFlops
bytes        : 3.98618 GBytes/s

--------------------------------------

--------------------------------------
Final wavefields from SYCL device and CPU are equivalent: Success

Expected behavior

Document behavior you expect.

  • run on the correct device

Request to Add a oneAPI Sample oneVPL Installation and Testing

Summary

This is a request for a new code Sample called oneVPL Installation and Testing. The goal is to add Dockerfiles enabling quick and easy installation of the full oneVPL component from the public repo on Linux, then running a simple validation test automatically.

Purpose

Answer the following questions

  • What specifically is this code sample trying to show?
  1. Provide a quick start and demonstration of different OS installation deployments. Augment the installation-guide article for VPL.
  2. Provide a quick start and concrete configuration parameters for multiple H/W platforms (i.e. Tiger Lake, Sapphire Rapids, dGFX). The expectation is that hardware combinations will require substantial permutations.
  3. Create a quick path that does not require the Base Toolkit (5GB download/25GB disk space) for VPL developers.
  4. Give specific instructions that may be best left abstract (universal) in the online HTML-based installation guide.
  • Why is this important to the oneAPI ecosystem?
  1. Provide a quick and easy way to enable oneVPL on Linux.
  2. Follow the community trend of using Docker for accurate instructions.
  3. Improve the developer experience and create an efficient support channel.
  4. Extendable for more Linux distribution support.

Domain

oneVPL, Docker, Linux environment

Description

The oneVPL software stack includes several components outside the standard release package, which makes installation complicated. The purpose of this sample is to use a Dockerfile to execute the whole process of installation and validation automatically. This is critical to enabling the oneVPL product for the oneAPI community and broadening our support in different dimensions.

Dependencies

Docker, oneVPL of Base Toolkit, oneVPL GPU RT from public Intel graphic repo.

Proposed folder Location

oneVPL installation and test with Docker

Checklist

[ ] Samples Working Group Permission accepted on

Missing Fortran library after installation

Hi,

I have installed the latest version of intel-aikit (2021.1.1) on an Ubuntu 20.04 64-bit LTS.
Even the simplest use case, setting the environment variables once for a shell (as described in https://software.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux/top/before-you-begin.html#before-you-begin), does not lead to a functional environment.

Output of $ source setvars.sh

:: initializing oneAPI environment ...
BASH version = 5.0.16(1)-release
:: intelpython -- latest
:: mkl -- latest
:: iLiT -- latest
:: modelzoo -- latest
:: mpi -- latest
:: ipp -- latest
:: tbb -- latest
:: dal -- latest
:: dev-utilities -- latest
:: compiler -- latest
:: oneAPI environment initialized ::

Trying Sample Daal4py Linear Regression Example for Distributed Memory Systems [SPMD mode]
https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Getting-Started-Samples/IntelPython_daal4py_GettingStarted

If I run the file directly through python: python IntelPython_daal4py_GettingStarted.py then I encounter a problem with the second import (the first line: import daal4py as d4p does finish).

Traceback (most recent call last):
File "/tmp/IntelPython_daal4py_GettingStarted.py", line 36, in
from sklearn.datasets import load_boston
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/sklearn/init.py", line 80, in
from .base import clone
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/sklearn/base.py", line 21, in
from .utils import _IS_32BIT
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/sklearn/utils/init.py", line 23, in
from .class_weight import compute_class_weight, compute_sample_weight
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/sklearn/utils/class_weight.py", line 7, in
from .validation import _deprecate_positional_args
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/sklearn/utils/validation.py", line 25, in
from .fixes import _object_dtype_isnan, parse_version
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/sklearn/utils/fixes.py", line 18, in
import scipy.stats
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/scipy/stats/init.py", line 388, in
from .stats import *
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/scipy/stats/stats.py", line 174, in
from scipy.spatial.distance import cdist
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/scipy/spatial/init.py", line 101, in
from ._procrustes import procrustes
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/scipy/spatial/_procrustes.py", line 9, in
from scipy.linalg import orthogonal_procrustes
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/scipy/linalg/init.py", line 194, in
from .misc import *
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/scipy/linalg/misc.py", line 3, in
from .blas import get_blas_funcs
File "/opt/intel/oneapi/intelpython/latest/lib/python3.7/site-packages/scipy/linalg/blas.py", line 213, in
from scipy.linalg import _fblas
ImportError: libifport.so.5: cannot open shared object file: No such file or directory

I have tried to use the specific var setting shell file for the compiler, but that just messes up the variables so that it no longer sees intel_python.

There are other shell files in the /opt/intel/oneapi folder named modulefiles-setup.sh and sys_check.sh.
I have run modulefiles-setup.sh as root with output

:: Initializing oneAPI modulefiles folder ...
:: Removing any previous oneAPI modulefiles folder content.
:: Generating oneAPI modulefiles folder links.
-- compiler/2021.1.2 -> /opt/intel/oneapi/compiler/2021.1.2/modulefiles/compiler
-- compiler-rt/2021.1.2 -> /opt/intel/oneapi/compiler/2021.1.2/modulefiles/compiler-rt
-- compiler-rt32/2021.1.2 -> /opt/intel/oneapi/compiler/2021.1.2/modulefiles/compiler-rt32
-- compiler32/2021.1.2 -> /opt/intel/oneapi/compiler/2021.1.2/modulefiles/compiler32
-- compiler/latest -> /opt/intel/oneapi/compiler/latest/modulefiles/compiler
-- compiler-rt/latest -> /opt/intel/oneapi/compiler/latest/modulefiles/compiler-rt
-- compiler-rt32/latest -> /opt/intel/oneapi/compiler/latest/modulefiles/compiler-rt32
-- compiler32/latest -> /opt/intel/oneapi/compiler/latest/modulefiles/compiler32
-- dev-utilities/2021.1.1 -> /opt/intel/oneapi/dev-utilities/2021.1.1/modulefiles/dev-utilities
-- dev-utilities/latest -> /opt/intel/oneapi/dev-utilities/latest/modulefiles/dev-utilities
-- mkl/2021.1.1 -> /opt/intel/oneapi/mkl/2021.1.1/modulefiles/mkl
-- mkl32/2021.1.1 -> /opt/intel/oneapi/mkl/2021.1.1/modulefiles/mkl32
-- mkl/latest -> /opt/intel/oneapi/mkl/latest/modulefiles/mkl
-- mkl32/latest -> /opt/intel/oneapi/mkl/latest/modulefiles/mkl32
-- mpi/2021.1.1 -> /opt/intel/oneapi/mpi/2021.1.1/modulefiles/mpi
-- mpi/latest -> /opt/intel/oneapi/mpi/latest/modulefiles/mpi
-- tbb/2021.1.1 -> /opt/intel/oneapi/tbb/2021.1.1/modulefiles/tbb
-- tbb/latest -> /opt/intel/oneapi/tbb/latest/modulefiles/tbb
:: oneAPI modulefiles folder initialized.
:: oneAPI modulefiles folder is here: "/opt/intel/oneapi/modulefiles"

Should the missing library be installed as part of installing intel-aikit or do I need another package to get this to work?

Running make fpga_emu -f Makefile.fpga on DevCloud for vector-add encountered an error.

u89487@login-2:~/oneAPI-samples/DirectProgramming/DPC++/DenseLinearAlgebra/vector-add$ make fpga_emu -f Makefile.fpga
dpcpp -O2 -g -std=c++17 -fintelfpga src/vector-add-buffers.cpp -o vector-add-buffers.fpga_emu -DFPGA_EMULATOR=1
src/vector-add-buffers.cpp:103:8: warning: 'INTEL' is deprecated: use 'ext::intel' instead [-Wdeprecated-declarations]
INTEL::fpga_emulator_selector d_selector;
^
/glob/development-tools/versions/oneapi/2021.4/inteloneapi/compiler/2021.4.0/linux/include/sycl/ext/intel/fpga_reg.hpp:49:11: note: 'INTEL' has been explicitly marked deprecated here
namespace __SYCL2020_DEPRECATED("use 'ext::intel' instead") INTEL {
^
/glob/development-tools/versions/oneapi/2021.4/inteloneapi/compiler/2021.4.0/linux/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:52:40: note: expanded from macro '__SYCL2020_DEPRECATED'
#define __SYCL2020_DEPRECATED(message) __SYCL_DEPRECATED(message)
^
/glob/development-tools/versions/oneapi/2021.4/inteloneapi/compiler/2021.4.0/linux/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:43:38: note: expanded from macro '__SYCL_DEPRECATED'
#define __SYCL_DEPRECATED(message) [[deprecated(message)]]
^
1 warning generated.
Platform name: Intel(R) FPGA Emulation Platform for OpenCL(TM)
Device name: Intel(R) FPGA Emulation Device
Driver version: 2021.12.9.0.24_005321
terminate called recursively
terminate called after throwing an instance of 'std::runtime_error'
terminate called recursively
terminate called recursively
llvm-foreach: Aborted
dpcpp: error: fpga compiler command failed with exit code 254 (use -v to see invocation)
Intel(R) oneAPI DPC++/C++ Compiler 2021.4.0 (2021.4.0.20210924)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /glob/development-tools/versions/oneapi/2021.4/inteloneapi/compiler/2021.4.0/linux/bin
dpcpp: note: diagnostic msg: Error generating preprocessed source(s).
Makefile.fpga:19: recipe for target 'vector-add-buffers.fpga_emu' failed
make: *** [vector-add-buffers.fpga_emu] Error 254

Unable to fix this problem. Could you please help me?
