question about which version to use (ideep, 25 comments, closed)

intel commented on September 5, 2024
question about which version to use

from ideep.

Comments (25)

wangyuyue commented on September 5, 2024

export BOOST_ROOT=<boost_install_folder>/include
cd /path/to/mlperf_submit/pytorch
cd third_party/ideep/euler/
mkdir build; cd build
cmake3 .. -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DWITH_VNNI=2
make
I think the Intel C++ compiler is used here first.
I changed icc to gcc, icpc to g++, cmake3 to cmake, and set BOOST_ROOT=/opt/boost_1_70_0/.
Then I get the error message below:
/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/eld_conv.cpp:270:68: error: 'new' of type 'euler::elx_int8_conv_wino_t<euler::ConvTypes<unsigned char, float, signed char, float>, euler::ConvImplTypes<unsigned char, signed char, float, float, float>, 6, 3, 16, 512>' with extended alignment 64 [-Werror=aligned-new=]
    xc = new elx_int8_conv_wino_t<U, T, 6, 3, 16, ISA_AVX512>(*this);
Thanks for your kind help!
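A possible GCC-side workaround (a sketch, untested against this tree; the flag choice is an assumption): GCC warns, and under -Werror fails, on `new` of over-aligned types unless C++17 aligned allocation is enabled, and the -faligned-new switch enables it explicitly even under -std=c++11. Note this only silences the alignment diagnostic; it does not change anything about the instruction-set requirements discussed later in the thread.

```shell
# Sketch: pass -faligned-new through CMAKE_CXX_FLAGS so GCC accepts 'new' of
# types with extended (64-byte) alignment even with -std=c++11 -Werror.
EXTRA_CXX_FLAGS="-faligned-new"
CONFIGURE="cmake .. -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DCMAKE_CXX_FLAGS=$EXTRA_CXX_FLAGS -DWITH_VNNI=2"
echo "$CONFIGURE"   # run this from euler/build
```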

wangyuyue commented on September 5, 2024

> My server doesn't have AVX512 support. So is there an alternative? I mean, can I skip the Euler-related steps and still complete the MLPerf experiment? I hope I don't need to make too many changes. Thanks.

> AVX512 is needed for MLPerf numbers. Otherwise you can only run the workload. 😊

Hi, I don't need to get exactly the same performance as the original experiment. It is enough to run the whole process.
@uyongw I'm reproducing using this doc

pinzhenx commented on September 5, 2024

Hi @wangyuyue

To reproduce the MLPerf results, you may refer to this doc. It will choose the right ideep commit for you:

cd pytorch
git fetch origin pull/25235/head:mlperf
git checkout mlperf
git submodule sync && git submodule update --init --recursive

wangyuyue commented on September 5, 2024

Hi, thanks for the response. I notice that I should set MKLROOT. Should I install MKL in advance and set MKLROOT to /opt/intel/mkl (the default install location is /opt/intel)?
And is the Intel C++ compiler a necessity? Can I use gcc for all the compilation?
For these two lines, what should I set if I use libgomp.so instead of libiomp5.so?
export LD_PRELOAD=/opt/intel/compilers_and_libraries_2018.1.163/linux/compiler/lib/intel64/libiomp5.so
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/intel/compilers_and_libraries_2018.1.163/linux/compiler/lib/intel64

CaoZhongZ commented on September 5, 2024

Yes, you should install MKL. We do not use the Intel C++ compiler... or did we use ICX (Intel's next-gen compiler)? I'm not sure. You could use libgomp.so instead of libiomp5.so, but that will have a performance impact, because all the configurations target Intel OpenMP. (Intel OpenMP is an open-source project under the LLVM umbrella.)
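As a sketch of what the libgomp variant of those two exports might look like (the paths are machine-dependent assumptions, not from the doc): with libgomp there is normally nothing to LD_PRELOAD at all, because GCC already links libgomp when you build with -fopenmp. If you still need the library directory for LD_LIBRARY_PATH, the gcc driver can resolve it:

```shell
# libgomp_path asks the gcc driver where its bundled libgomp.so lives;
# the echoed export line mirrors the libiomp5 setup from the question.
libgomp_path() { gcc -print-file-name=libgomp.so; }
echo "export LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:$(dirname "$(libgomp_path 2>/dev/null)")"
```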

CaoZhongZ commented on September 5, 2024

Oh, my bad: Euler must be compiled with ICC, or you will get no performance at all. πŸ₯³

wangyuyue commented on September 5, 2024

I'm confused... So what does this project use: Euler, MKL, or MKL-DNN? I notice that when generating the int8 model and building PyTorch with Deep-learning-math-kernel, it is export MKLROOT=/home/huiwu1/src/mkl2019_3, but for the roofline test and building the PyTorch backend, it is export MKLROOT=/path_to_mkl_dnn/mkl2019_3. So what does MKLROOT refer to here, MKL or MKL-DNN? I think they are different concepts.

wangyuyue commented on September 5, 2024

Now I have installed icc and icpc.

First I run cmake .. -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DWITH_VNNI=2

-- Euler version: v0.0.1+HEAD.c18604e
-- MT_RUNTIME: omp
-- No preference for use of exported gflags CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported gflags CMake configuration if available.
-- Found installed version of gflags: /usr/lib/x86_64-linux-gnu/cmake/gflags
-- Detected gflags version: 2.2.1
-- Found gflags include dirs: /usr/include
-- Found gflags libraries: gflags_shared
-- Found gflags namespace: google
-- Configuring done
-- Generating done
-- Build files have been written to: /opt/mlperf_submit/pytorch/third_party/ideep/euler/build

Then I run make and hit errors. Below is part of the error message.

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(101): error: class "_mm<16>" has no member "stream_ps"
          _mm<V>::stream_ps(&md6(atinput6, _hA, _wA, _I3, _I2, _T, 0),
                  ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(102): error: expression must be a pointer to a complete object type
                         *((__m<V> *)&md3(at, _hA, _wA, 0)));
                          ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(107): error: class "_mm<16>" has no member "store_ps"
          _mm<V>::store_ps(&md6(atinput6, _hA, _wA, _I3, _I2, _T, 0),
                  ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(108): error: expression must be a pointer to a complete object type
                        *((__m<V> *)&md3(at, _hA, _wA, 0)));
                         ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(116): error: class "_mm<16>" has no member "cvt_f32_b16"
          auto fp16v = _mm<V>::cvt_f32_b16(*(__i<V> *)&md3(at, _hA, _wA, 0));
                               ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(116): error: expression must be a pointer to a complete object type
          auto fp16v = _mm<V>::cvt_f32_b16(*(__i<V> *)&md3(at, _hA, _wA, 0));
                                            ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

CaoZhongZ commented on September 5, 2024

@uyongw

CaoZhongZ commented on September 5, 2024

> I'm confused... So what does this project use: Euler, MKL, or MKL-DNN? I notice that when generating the int8 model and building PyTorch with Deep-learning-math-kernel, it is export MKLROOT=/home/huiwu1/src/mkl2019_3, but for the roofline test and building the PyTorch backend, it is export MKLROOT=/path_to_mkl_dnn/mkl2019_3. So what does MKLROOT refer to here, MKL or MKL-DNN? I think they are different concepts.

DNNL/MKL is the major acceleration library, with production quality. Euler was a Winograd kernel library. For competitive performance, as long as the code was published for scrutiny, we could do whatever reasonable optimization we wanted. So it is more complicated than an ordinary environment. 😊
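If in doubt about which library a given MKLROOT points at, a quick heuristic check can help (a sketch only; the library names and directory layouts are assumptions based on typical MKL and MKL-DNN installs, not taken from this thread):

```shell
# check_mklroot guesses the library family from the files under a prefix:
# classic MKL ships lib/intel64/libmkl_core.so, while an MKL-DNN/DNNL
# install prefix typically has lib/libmkldnn.so (or lib64/).
check_mklroot() {
  if [ -e "$1/lib/intel64/libmkl_core.so" ]; then
    echo "MKL"
  elif [ -e "$1/lib/libmkldnn.so" ] || [ -e "$1/lib64/libmkldnn.so" ]; then
    echo "MKL-DNN"
  else
    echo "unknown"
  fi
}
check_mklroot /opt/intel/mkl
```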

uyongw commented on September 5, 2024

@wangyuyue For the Euler build error, can you paste the contents of the build commands generated by cmake, as below?

$ head -n10 build/compile_commands.json

wangyuyue commented on September 5, 2024

> @wangyuyue For the Euler build error, can you paste the contents of the build commands generated by cmake, as below?
>
> $ head -n10 build/compile_commands.json
Here is the content:

[
{
  "directory": "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build",
  "command": "/opt/intel/sw_dev_tools/bin/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -DWITH_VNNI -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/common/el_log.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common/el_log.cpp",
  "file": "/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common/el_log.cpp"
},
{
  "directory": "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build",
  "command": "/opt/intel/sw_dev_tools/bin/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -DWITH_VNNI -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/eld_conv.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/eld_conv.cpp",
  "file": "/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/eld_conv.cpp"

uyongw commented on September 5, 2024

Can you check your ICC version?
$ /opt/intel/sw_dev_tools/bin/icpc --version

wangyuyue commented on September 5, 2024

$ /opt/intel/sw_dev_tools/bin/icpc --version

icpc (ICC) 19.1.2.254 20200623
Copyright (C) 1985-2020 Intel Corporation.  All rights reserved.

I downloaded the installer from Intel System Studio, and I think it's the latest version. Could you also point me to the download page in case I need to install an older version? Thanks.

wangyuyue commented on September 5, 2024

Here are some environment variables I set:

export MKLROOT=/opt/intel/mkl
export USE_OPENMP=ON
export CAFFE2_USE_MKLDNN=ON
export MKLDNN_USE_CBLAS=ON
export LD_PRELOAD=/usr/lib/gcc/x86_64-linux-gnu/7/libgomp.so
export OMP_NUM_THREADS=8 KMP_AFFINITY=proclist=[0-7],granularity=thread,explicit
export PYTHONPATH=/opt/mlperf_submit/pytorch/build
export BOOST_ROOT=/opt/boost_1_70_0
export PATH=$PATH:/opt/intel/sw_dev_tools/bin

Hope this helps with debugging.
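One side note on the list above (an observation, not from the thread): KMP_AFFINITY is read by the Intel OpenMP runtime only, so with libgomp.so in LD_PRELOAD it is likely ignored. The GNU runtime's equivalents would be something like:

```shell
# GNU OpenMP (libgomp) equivalents of the Intel KMP_AFFINITY setting:
# OMP_NUM_THREADS caps the team size, GOMP_CPU_AFFINITY pins threads to cores.
export OMP_NUM_THREADS=8
export GOMP_CPU_AFFINITY="0-7"   # pin OpenMP threads to logical CPUs 0..7
echo "$OMP_NUM_THREADS threads pinned to cores $GOMP_CPU_AFFINITY"
```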

uyongw commented on September 5, 2024

Can you run a verbose make?

$ cd build
$ make VERBOSE=1

BTW, for ICC, you normally run the setup script like this (before the build):
$ source /opt/intel/sw_dev_tools/bin/compilervars.sh -arch intel64 -platform linux

wangyuyue commented on September 5, 2024

$ source /opt/intel/sw_dev_tools/bin/compilervars.sh -arch intel64 -platform linux
$ cmake .. -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DWITH_VNNI=2

-- Euler version: v0.0.1+HEAD.c18604e
-- MT_RUNTIME: omp
-- No preference for use of exported gflags CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported gflags CMake configuration if available.
-- Found installed version of gflags: /usr/lib/x86_64-linux-gnu/cmake/gflags
-- Detected gflags version: 2.2.1
-- Found gflags include dirs: /usr/include
-- Found gflags libraries: gflags_shared
-- Found gflags namespace: google
-- Configuring done
You have changed variables that require your cache to be deleted.
Configure will be re-run and you may have to reset some variables.
The following variables have changed:
CMAKE_CXX_COMPILER= icpc
CMAKE_C_COMPILER= icc

-- The CXX compiler identification is Intel 19.1.2.20200623
-- The C compiler identification is Intel 19.1.2.20200623
-- Check for working CXX compiler: /opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc
-- Check for working CXX compiler: /opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Check for working C compiler: /opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icc
-- Check for working C compiler: /opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Found Git: /usr/bin/git (found version "2.17.1")
-- Euler version: v0.0.1+HEAD.c18604e
-- MT_RUNTIME: omp
-- No preference for use of exported gflags CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported gflags CMake configuration if available.
-- Found installed version of gflags: /usr/lib/x86_64-linux-gnu/cmake/gflags
-- Detected gflags version: 2.2.1
-- Found Gflags: /usr/include
-- Found gflags include dirs: /usr/include
-- Found gflags libraries: gflags_shared
-- Found gflags namespace: google
-- Configuring done
-- Generating done
-- Build files have been written to: /opt/mlperf_submit/pytorch/third_party/ideep/euler/build

make VERBOSE=1

/usr/bin/cmake -H/opt/mlperf_submit/pytorch/third_party/ideep/euler -B/opt/mlperf_submit/pytorch/third_party/ideep/euler/build --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles /opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/progress.marks
make -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
make -f CMakeFiles/el.dir/build.make CMakeFiles/el.dir/depend
make[2]: Entering directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
cd /opt/mlperf_submit/pytorch/third_party/ideep/euler/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /opt/mlperf_submit/pytorch/third_party/ideep/euler /opt/mlperf_submit/pytorch/third_party/ideep/euler /opt/mlperf_submit/pytorch/third_party/ideep/euler/build /opt/mlperf_submit/pytorch/third_party/ideep/euler/build /opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/el.dir/DependInfo.cmake --color=
Dependee "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/el.dir/DependInfo.cmake" is newer than depender "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/el.dir/depend.internal".
Dependee "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/el.dir/depend.internal".
Scanning dependencies of target el
make[2]: Leaving directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
make -f CMakeFiles/el.dir/build.make CMakeFiles/el.dir/build
make[2]: Entering directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
[  0%] Building CXX object CMakeFiles/el.dir/src/common/el_log.cpp.o
/opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/common/el_log.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common/el_log.cpp
[  1%] Building CXX object CMakeFiles/el.dir/src/eld_conv.cpp.o
/opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/eld_conv.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/eld_conv.cpp
[  1%] Building CXX object CMakeFiles/el.dir/src/elx_conv.cpp.o
/opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/elx_conv.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv.cpp
[  2%] Building CXX object CMakeFiles/el.dir/src/elx_conv_wino_trans_input.cpp.o
/opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/elx_conv_wino_trans_input.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp
------------------------------------------------------------------------------------------------
I omitted the error messages, which are similar to what I pasted above https://github.com/intel/ideep/issues/43#issuecomment-672547982.
------------------------------------------------------------------------------------------------
/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(280): error: incomplete type is not allowed
    const __i<V> vindex = _mm<V>::set_epi32(SET_VINDEX_16(xc->ih * xc->iw));
                 ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_nchw(TinputType *, InputType *, int, int, int) [with TinputType=short, InputType=float, I=512, A=4, K=3, V=16]" at line 359

compilation aborted for /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp (code 4)
CMakeFiles/el.dir/build.make:134: recipe for target 'CMakeFiles/el.dir/src/elx_conv_wino_trans_input.cpp.o' failed
make[2]: *** [CMakeFiles/el.dir/src/elx_conv_wino_trans_input.cpp.o] Error 4
make[2]: Leaving directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/el.dir/all' failed
make[1]: *** [CMakeFiles/el.dir/all] Error 2
make[1]: Leaving directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

uyongw commented on September 5, 2024

Can you check your CPU info like this:

$ lscpu
or
$ cat /proc/cpuinfo | grep flags | head -n1

Euler expects at least a Skylake server with the AVX512 instruction set. For MLPerf it requires Cascade Lake (CLX) or Cooper Lake (CPX), which have AVX512_VNNI instruction support. The error you met is most likely because your build machine does not have AVX512: -xHost tells the compiler to generate instructions for the host build environment, but Euler uses AVX512 intrinsics.

If you can, please use CLX to build. Alternatively, you can cross-compile it with the commands below, but you will not be able to run it without hardware that has AVX512/VNNI.

$ mkdir -p build && cd build
$ cmake .. -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DCMAKE_CXX_FLAGS=-xCore-AVX512 -DWITH_VNNI=ON -DWITH_TEST=OFF
$ make -j
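The flag check suggested above can be wrapped into a small script; the flag spellings (avx512f for AVX512 Foundation, avx512_vnni for VNNI) are the ones Linux prints in /proc/cpuinfo:

```shell
# has_flag tests for one flag word in a flags string; the rest reads the
# live flags line (falls back to empty on systems without /proc/cpuinfo).
has_flag() { echo " $1 " | grep -qw "$2"; }
FLAGS="$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)"
if has_flag "$FLAGS" avx512_vnni; then echo "CLX/CPX: VNNI available"
elif has_flag "$FLAGS" avx512f; then echo "SKX: AVX512 without VNNI"
else echo "no AVX512: cross-compile only"
fi
```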

wangyuyue commented on September 5, 2024

My server doesn't have AVX512 support. So is there an alternative? I mean, can I skip the Euler-related steps and still complete the MLPerf experiment? I hope I don't need to make too many changes. Thanks.

uyongw commented on September 5, 2024

You can disable Euler and use MKLDNN engine instead.

During the build of PyTorch/Caffe2, disable EULER like this:

export CAFFE2_USE_EULER=OFF

BTW, which document are you referring to? @wuhuikx may have more insights on that part.

CaoZhongZ commented on September 5, 2024

> My server doesn't have AVX512 support. So is there an alternative? I mean, can I skip the Euler-related steps and still complete the MLPerf experiment? I hope I don't need to make too many changes. Thanks.

AVX512 is needed for MLPerf numbers. Otherwise you can only run the workload. 😊

uyongw commented on September 5, 2024

@CaoZhongZ Yes. The MLPerf submission is based on INT8 low precision. In 2019 even MKLDNN did not support INT8 on AVX2, so the minimal HW requirement to run INT8 with either MKLDNN or Euler is Skylake with AVX512. @wangyuyue

vpirogov commented on September 5, 2024

INT8 optimizations for systems without Intel AVX512 are available in oneDNN v1.2 and later.

wangyuyue commented on September 5, 2024

Thanks for the response. So I need to clone the latest oneDNN as a third-party submodule of PyTorch? Do I need any other changes? @vpirogov
Here is the result of cat /proc/cpuinfo | grep flags | head -n1 on my host:

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d

uyongw commented on September 5, 2024

Unfortunately this will not work without quite an effort of back-porting. @wangyuyue
