VS2015 Build Errors about deepcl HOT 46 CLOSED

jakakonda avatar jakakonda commented on May 24, 2024
VS2015 Build Errors

from deepcl.

Comments (46)

hughperkins avatar hughperkins commented on May 24, 2024

Interesting. Maybe I'm using an old fork, I mean, I am using an old fork. What happens if you replace the clBLAS referenced by DeepCL with a new fork? I think to do that you would do something like:

cd clMathLibraries/clBLAS
git fetch
git checkout [put-some-commit-hash-here]
cd ..
git add clBLAS

... and then rebuild again.

If that works, please submit a pull request, or at least the commit hash that you tested against, and found working, and I will take a look.

Edit: hmmm, the error is appearing in the header files. It's possible I'm calling something in an unsupported way. Don't suppose... can you provide the full build output please? (Edit 2: hmmm, having looked at the header file, it seems unlikely that it's to do with how I'm calling it (though it's possible), so I reckon the first thing to do is try using a newer fork, as per the start of this reply.)

Edit 3: Per https://github.com/clMathLibraries/clBLAS/releases/tag/v2.8 , v2.8 supports VS2015. That strongly implies to my mind that pre-2.8 does not support VS2015. Please confirm, as per test outlined in the start of this reply, and then I'll update DeepCL to point to clBLAS 2.8.

jakakonda avatar jakakonda commented on May 24, 2024

Changing clBLAS library to the latest version does solve a lot of problems (alignment and a couple of others), but it still doesn't build successfully.

Full (new) output log here.
Partial for clBLAS-external here.
If I can provide it in a nicer format, please let me know.

Limiting myself to the clBLAS-external project, I get this error:

LNK2019 unresolved external symbol clEnqueueBarrierWithWaitList referenced in function emptyAction [D:\DeepCL\build_VS14_x64\clBLAS\src\clBLAS-external-build\library\clBLAS.vcxproj] clBLAS-external D:\DeepCL\build_VS14_x64\matrix.obj 1

I'll check whether the OpenCL libraries are incorrectly configured and continue my "investigation" from there.

hughperkins avatar hughperkins commented on May 24, 2024

I created a branch with clblas-2.8.0 on. Builds ok for me, but doesn't run on my nvidia GPU: I get clMathLibraries/clBLAS#153, but it might work for you? Branch is clblas-2.8.0, https://github.com/hughperkins/DeepCL/tree/clblas-2.8.0 (Edit: note, I didn't see you had added a post before posting this, or maybe we posted at the same time?)

hughperkins avatar hughperkins commented on May 24, 2024

For those sprintf build warnings, pzawal found a solution in #30 (comment).

hughperkins avatar hughperkins commented on May 24, 2024

For the clEnqueueBarrierWithWaitList error, the clew build referenced by the EasyCL build in the clblas-2.8.0 branch above fixes this error.

jakakonda avatar jakakonda commented on May 24, 2024

I was a minute or two faster, same time practically.

I downloaded your clblas-2.8.0 branch and it compiled without any problems :). Thanks!
As for the error, I get something similar to yours on my notebook amd (hd7730) and also on my integrated intel gpu (hd4000). I still have one slightly stronger amd gpu to test.

D:\DeepCL3\clMathLibraries\clBLAS\src\library\blas\xgemm.cc
clBuildProgram Failed
err = -11
Error: Failed to build program executable!
Build Log:
Error: aclBinary init failure

jakakonda avatar jakakonda commented on May 24, 2024

It works with my older amd gpu, which doesn't support OpenCL 2.0. OpenCL 2.0 support seems to be the problem I get on my notebook.

Thank you very much for your help! Based on my experimenting today it's really awesome, keep up the good work!

hughperkins avatar hughperkins commented on May 24, 2024

Ok. What do you mean by 'doesn't support OpenCL 2.0'? You mean, if OpenCL 2.0 is available, then something in DeepCL stops working?

jakakonda avatar jakakonda commented on May 24, 2024

I believe everything is OK with DeepCL itself. I was referring to the clMathLibraries/clBLAS#153 error that you and I both received (but with a different message). It doesn't appear on my older amd gpu, while it does on a newer one with OpenCL 2.0. So I'd say that the problem is purely clBLAS specific.

hughperkins avatar hughperkins commented on May 24, 2024

Hmmm, ok. Arguably, it's still DeepCL's responsibility to fix it somehow, eg by not using clBLAS, or by fixing it in clBLAS, but since it's a new release of clBLAS, I reckon might be worth waiting a few weeks, in case someone fixes it for us :-)

(Note that I'm pondering that my own segfaults, on NVIDIA, might be because clBLAS really is using clEnqueueBarrierWithWaitList, which just points into the void on my card, hence the segfault. I might see if I can add a guard into clew for that, which at least throws a useful exception instead, ideally.)
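
The guard idea can be sketched in plain C. This is a toy model under stated assumptions, not clew's actual code: `barrier_fn`, `call_barrier_guarded`, `fake_barrier` and `ERR_FUNCTION_NOT_LOADED` are hypothetical names, and the pointer stands in for an entry point that a dynamic loader may have failed to resolve on an older driver.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical signature standing in for a dynamically loaded entry point. */
typedef int (*barrier_fn)(void);

/* Hypothetical error code meaning "symbol missing from the driver". */
#define ERR_FUNCTION_NOT_LOADED (-1000)

/* Call through the pointer only if the loader actually resolved it. */
static int call_barrier_guarded(barrier_fn fn, const char *name)
{
    if (fn == NULL) {
        fprintf(stderr, "%s is not available in this OpenCL driver\n", name);
        return ERR_FUNCTION_NOT_LOADED;
    }
    return fn();
}

/* Stand-in for a driver that does provide the function. */
static int fake_barrier(void)
{
    return 0;
}
```

Calling `call_barrier_guarded(NULL, "clEnqueueBarrierWithWaitList")` reports the missing symbol and returns the error code instead of jumping through a null pointer.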

hughperkins avatar hughperkins commented on May 24, 2024

Hi jakakonda, I've fixed both the errors I was getting on clblas now, and updated DeepCL to point to fixes for those. It's still a bit leaky (very leaky actually :-( ) but at least no obvious failures/seg-faults. Do you want to pull down the latest clblas-2.8.0 branch of DeepCL, and see what happens?

jakakonda avatar jakakonda commented on May 24, 2024

Wow that was fast! I was just getting ready to wait for a week or two :).
Sure will. I'll let you know what happens.

hughperkins avatar hughperkins commented on May 24, 2024

Ok :-)

(Edit: make sure to monitor your memory carefully, if you run the unit-tests; I can only run about half of them, and then I have to ctrl-C out :-P )

hughperkins avatar hughperkins commented on May 24, 2024

Hi. Quick update, commit d93a41a , on branch clblas-2.8.0 of DeepCL, fixes the memory leaks, and now the tests run to completion for me:

[----------] Global test environment tear-down
[==========] 159 tests from 28 test cases ran. (36835 ms total)
[  PASSED  ] 157 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] testbackward.compare_1_n_kgsgo_32c5
[  FAILED  ] testsinglebatch.imagesize5_filtersize3_batchsize2_10filters
$ git log -n 5 --oneline
d93a41a fix leak in clblas teardown
9133a42 update to latest clblas, which fixes segfault after teardown/setup
a93593d Update clblas to 2.8.0.  doesnt work on nvidia.
719e82b Add guards to BackpropWeightsScratch and BackpropWeightsScratchLarge
7634e54 make inputimage size 15 on amd for certain kernels for backward.test_kgs

hughperkins avatar hughperkins commented on May 24, 2024

Hmmm, but the error you posted earlier, about failure to build program, is quite different from the errors I've been seeing... there's nothing I've changed that will have addressed that... (edit... well, it might have done ... if it was caused by the initializer list... I guess you can try, and then you will probably need to log the issue with clBLAS, if it persists, along with as many logfiles and so on as you can find)

jakakonda avatar jakakonda commented on May 24, 2024

After a couple of hiccups while building (purely my stupidity), builds fine, all tests passing until:

Assertion failed: false, file D:\DeepCL\clMathLibraries\clBLAS\src\library\blas\xgemm.cc, line 163

Output for unittests. Tests ran on intel hd4000, but the same assertion occurs while running mnist training on amd.
(Edit: Currently searching from where this code is called)

hughperkins avatar hughperkins commented on May 24, 2024

Hi jakakonda, accidentally replied to the gist rather than to the issue here :-P

Can you do the following please:

  1. To find out which build options are being used, insert the following at line 159 of xgemm.cc:
      printf("build options %s\n", sourceBuildOptions);

then rebuild and rerun.
  2. To see if we can get a simple reproducible example to log on clblas issues, can you download clblas directly, build the samples, and try running the sgemm sample:

git clone --recursive https://github.com/clMathLibraries/clBLAS.git
cd clBLAS/src
mkdir build
cd build
ccmake ..
# press  'c'
# turn off tests, turn on samples
# press 'c', then 'g', ccmake should exit, then:
make -j 4
LD_LIBRARY_PATH=library samples/example_sgemm

(Edit: hopefully the sgemm sample will fail for you, with the same error about build options, in which case it's easy to raise an issue in the clblas project issue tracker)

jakakonda avatar jakakonda commented on May 24, 2024
  1. Build Options

    build options -cl-std=CL2.0
    OpenCL error -43 on line 164 of D:\DeepCL\clMathLibraries\clBLAS\src\library\blas\xgemm.cc
    Assertion failed: false, file D:\DeepCL\clMathLibraries\clBLAS\src\library\blas\xgemm.cc, line 164
    
  2. The sgemm example compiled directly from clBLAS runs without any problems.

    size_t valueSize;
    clGetDeviceInfo(device, CL_DEVICE_VERSION, 0, NULL, &valueSize);
    char *value = (char*)malloc(valueSize);
    clGetDeviceInfo(device, CL_DEVICE_VERSION, valueSize, value, NULL);
    printf("OpenCL Version: %s", value);
    free(value);
    

    For informational purposes I printed the OpenCL version used by the sgemm example (with the code above) and got:

    OpenCL Version: OpenCL 1.2

I suspect that the problem originates from the different OpenCL versions, which would also explain why it works on an older gpu.
But my knowledge of OpenCL is fairly limited, so I won't vouch that the code above prints the correct version.
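
To compare versions rather than just print them, the string can be parsed. A minimal sketch, assuming the string follows the "OpenCL <major>.<minor> <vendor info>" layout that the OpenCL specification gives for CL_DEVICE_VERSION; `parse_opencl_version` and `device_supports` are hypothetical helper names, not part of OpenCL or clBLAS.

```c
#include <stdio.h>
#include <stddef.h>

/* Parse a CL_DEVICE_VERSION-style string, e.g. "OpenCL 1.2 <vendor info>".
   Returns 1 on success and stores the numbers, 0 if the format is unexpected. */
static int parse_opencl_version(const char *version, int *major, int *minor)
{
    if (version == NULL || major == NULL || minor == NULL) {
        return 0;
    }
    return sscanf(version, "OpenCL %d.%d", major, minor) == 2;
}

/* True when the reported version is at least want_major.want_minor. */
static int device_supports(const char *version, int want_major, int want_minor)
{
    int major = 0, minor = 0;
    if (!parse_opencl_version(version, &major, &minor)) {
        return 0;
    }
    return major > want_major || (major == want_major && minor >= want_minor);
}
```

With the `value` buffer from the snippet above, `device_supports(value, 2, 0)` would tell whether the device reports OpenCL 2.0 or newer.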

hughperkins avatar hughperkins commented on May 24, 2024

That's interesting. Hmmm, so what do you see if, from the DeepCL build directory, you do:

cd clBLAS/src/clBLAS-external-build
ccmake ../../../../clMathLibraries/clBLAS/src/

There is a setting 'OPENCL_VERSION'. What does it say? What happens if you change it to read '1.2', press 'c' for configure, 'g' for generate, at which point ccmake will exit, and then rebuild?

jakakonda avatar jakakonda commented on May 24, 2024

OPENCL_VERSION:STRING is set to 1.2, which is strange as the kernel is compiled with 2.0 (based on build options output).

Now I'm compiling clBLAS with OpenCL 2.0 to see what happens.

Edit: same error occurs with DeepCL (assertion... xgemm.cc line 163).

So I'd say there are two bugs:

  1. clBLAS xgemm assertion
  2. DeepCL somehow uses OpenCL 2.0 even though 1.2 is specified in the config.

hughperkins avatar hughperkins commented on May 24, 2024

Hmmm, that's .... odd :-P There's nothing in DeepCL that knowingly requests opencl 2.0. In fact, my own gpus have always been opencl 1.1, for a very long time (until about 3 days ago, when I upgraded the drivers of one of them, to the lofty heights of 1.2 :-P ). But I have a few ideas of what could be going on plausibly, and will have a dig through the code a bit.

hughperkins avatar hughperkins commented on May 24, 2024

Hi jakakonda, in your DeepCL directory, can you open the file build/clBLAS/src/clBLAS-external-build/include/AutoGemmIncludes/AutoGemmKernelBuildOptionsSource.cpp , and copy/paste eg lines 9-11?

jakakonda avatar jakakonda commented on May 24, 2024

Sure I can:

const char * const sgemm_Col_NN_B0_MX096_NX096_KX08_srcBuildOptions = "-cl-std=CL1.2";
const char * const sgemm_Col_NN_B0_MX096_NX096_KX01_srcBuildOptions = "-cl-std=CL1.2";
const char * const sgemm_Col_NN_B0_MX064_NX064_KX16_srcBuildOptions = "-cl-std=CL1.2";

Everywhere the string value is -cl-std=CL1.2

I also did a search for the "cl2.0" string in the entire DeepCL directory and got:

  D:\DeepCL\clMathLibraries\clBLAS\src\flags_public.txt (2 hits)
    Line 3: HAWAII2_OCL "-cl-std=CL2.0";
    Line 4: BONAIRE_OCL "-cl-std=CL2.0";
  D:\DeepCL\clMathLibraries\clBLAS\src\library\blas\AutoGemm\UserGemmKernelSources\UserGemmKernelSourceIncludes.cpp (1 hit)
    Line 54: //const char * const User_srcBuildOptions = "-cl-std=CL2.0";
  D:\DeepCL\clMathLibraries\clBLAS\src\library\blas\AutoGemm\UserGemmKernelSources\UserGemmKernelSourceIncludes.h (2 hits)
    Line 12: const char * const User_srcBuildOptions = "-cl-std=CL2.0";
    Line 13: const char * const User_binBuildOptions = "-cl-std=CL2.0";
  D:\DeepCL\clMathLibraries\clBLAS\src\library\blas\functor\gcn_sgemm.cc (1 hit)
    Line 107:   NULL , "-cl-std=CL2.0",                                                        \
  D:\DeepCL\clMathLibraries\clBLAS\src\library\blas\functor\gcn_sgemmSmallMatrices.cc (1 hit)
    Line 103:   NULL , "-cl-std=CL2.0",                                                        \
  D:\DeepCL\clMathLibraries\clBLAS\src\library\blas\functor\gcn_zgemm.cc (1 hit)
    Line 107:   NULL , "-cl-std=CL2.0",                                                        \
  D:\DeepCL\clMathLibraries\clBLAS\src\library\blas\trtri\TrtriKernelSourceIncludes.h (2 hits)
    Line 9: const char * const TrtriBuildOptions = "-cl-std=CL2.0";
    Line 10: const char * const TrtribinBuildOptions = "-cl-std=CL2.0";
  D:\DeepCL\clMathLibraries\clBLAS\src\library\CMakeLists.txt (1 hit)
    Line 713:     set( OCL_COMPILER_FLAGS "-cl-std=CL2.0")
  D:\DeepCL\dist\bin\src\flags_public.txt (2 hits)
    Line 3: HAWAII2_OCL "-cl-std=CL2.0";
    Line 4: BONAIRE_OCL "-cl-std=CL2.0";
  D:\DeepCL\dist\bin\src\library\blas\AutoGemm\UserGemmKernelSources\UserGemmKernelSourceIncludes.cpp (1 hit)
    Line 54: //const char * const User_srcBuildOptions = "-cl-std=CL2.0";
  D:\DeepCL\dist\bin\src\library\blas\AutoGemm\UserGemmKernelSources\UserGemmKernelSourceIncludes.h (2 hits)
    Line 12: const char * const User_srcBuildOptions = "-cl-std=CL2.0";
    Line 13: const char * const User_binBuildOptions = "-cl-std=CL2.0";
  D:\DeepCL\dist\bin\src\library\blas\functor\gcn_sgemm.cc (1 hit)
    Line 107:   NULL , "-cl-std=CL2.0",                                                        \
  D:\DeepCL\dist\bin\src\library\blas\functor\gcn_sgemmSmallMatrices.cc (1 hit)
    Line 103:   NULL , "-cl-std=CL2.0",                                                        \
  D:\DeepCL\dist\bin\src\library\blas\functor\gcn_zgemm.cc (1 hit)
    Line 107:   NULL , "-cl-std=CL2.0",                                                        \
  D:\DeepCL\dist\bin\src\library\blas\trtri\TrtriKernelSourceIncludes.h (2 hits)
    Line 9: const char * const TrtriBuildOptions = "-cl-std=CL2.0";
    Line 10: const char * const TrtribinBuildOptions = "-cl-std=CL2.0";
  D:\DeepCL\dist\bin\src\library\CMakeLists.txt (1 hit)
    Line 713:     set( OCL_COMPILER_FLAGS "-cl-std=CL2.0")

clBLAS is surely set to 1.2:

  D:\DeepCL\build\clBLAS\src\clBLAS-external-build\CMakeCache.txt (1 hit)
      Line 220: OPENCL_VERSION:STRING=1.2

The only place where the values above could become 2.0 (based on my findings) is at clMathLibraries\clBLAS\src\library\CMakeLists.txt line 713...

set( OCL_COMPILER_FLAGS " ")
if( OPENCL_VERSION STREQUAL "2.0")
    set( OCL_COMPILER_FLAGS "-cl-std=CL2.0") # 713 <======
endif()

... but before it is a pretty obvious "if", which makes everything odd...

hughperkins avatar hughperkins commented on May 24, 2024

Hmmm... that's kind of odd. So, let's work up the chain. I'll do it here, as I'm writing:

  • in xgemm.cc, the error is at line 164, and the sourceBuildOptions there is 2.0 (right?)
  • that sourceBuildOptions gets passed in on line 94, into the makeGemmKernel call
  • makeGemmKernel is called from line 494 of xgemm.cc
  • that sourceBuildOptions is null at line 376
  • ...and set by the call to gemmSelectKernel, at line 395
  • (note: you could add a printf at line 423, to confirm that sourceBuildOptions is already 2.0 by this point)
  • gemmSelectKernel is in DeepCL/build/clBLAS/src/clBLAS-external-build/include/AutoGemmIncludes/AutoGemmKernelSelection.cpp
  • according to the geometry of your gemm, it's going to set sourceBuildOptions to one of the existing strings such as sgemm_Col_NN_B0_MX096_NX096_KX16_srcBuildOptions
  • ... and sgemm_Col_NN_B0_MX096_NX096_KX16_srcBuildOptions is defined in the file you checked just now, ie DeepCL/build/clBLAS/src/clBLAS-external-build/include/AutoGemmIncludes/AutoGemmKernelBuildOptionsSource.cpp

I reckon that somehow the rebuild hasn't fully recompiled AutoGemmKernelBuildOptionsSource. One thing you might consider doing is to either use gdb to examine the stack frame at each point in this chain, or else put printf's at strategic points along this chain.
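
A minimal sketch of what such strategic printf's could look like, so each checkpoint records where it fired. `format_trace` and `TRACE_BUILD_OPTIONS` are hypothetical helpers for illustration, not part of clBLAS; the NULL check matters because sourceBuildOptions starts out as NULL in the chain above.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical tracing helper: format one checkpoint line into buf.
   A NULL options pointer is printed as "(null)" instead of being dereferenced.
   Returns the number of characters snprintf wanted to write. */
static int format_trace(char *buf, size_t len, const char *file, int line,
                        const char *build_options)
{
    return snprintf(buf, len, "%s:%d sourceBuildOptions=%s",
                    file, line, build_options ? build_options : "(null)");
}

/* Convenience macro that records the call site automatically. */
#define TRACE_BUILD_OPTIONS(buf, len, opts) \
    format_trace((buf), (len), __FILE__, __LINE__, (opts))
```

Dropping `TRACE_BUILD_OPTIONS(buf, sizeof buf, sourceBuildOptions)` followed by a `puts(buf)` at each step of the chain would show the first point at which the value changes.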

jakakonda avatar jakakonda commented on May 24, 2024

Just an intermediate finding about:

(note: you could add a printf at line 423, to confirm that sourceBuildOptions is already 2.0 by this point)

I added the following line to line number 424 in xgemm.cc:

 `printf("FIND ME! %s\n", sourceBuildOptions);`

It fired 88 times, always with the value -cl-std=CL1.2.

Last few lines of unittest output:

ForwardIm2Col.cl build log: 
fcl build 1 succeeded.
fcl build 2 succeeded.
bcl build succeeded.

FIND ME! -cl-std=CL1.2
build options -cl-std=CL1.2
FIND ME! -cl-std=CL1.2
FIND ME! -cl-std=CL1.2
BackpropWeightsAuto: kernel 4 224ms
calcGradWeights try kernel 4
   ... seems valid
ForwardIm2Col.cl build log: 
fcl build 1 succeeded.
fcl build 2 succeeded.
bcl build succeeded.

build options -cl-std=CL2.0
OpenCL error -43 on line 164 of D:\DeepCL\clMathLibraries\clBLAS\src\library\blas\xgemm.cc

The result from "find usages" also gave me a few additional results (special cases):

Search target
  makeGemmKernel
Found 14 usages in solution
  <clBLAS> (14 items)
    GemmSpecialCases.cpp (10 items)
      (266,6) makeGemmKernel(tileClKernel, commandQueues[0], tileKernelSource, User_srcBuildOptions, &tileKernelBinary, &tileKernelBinarySize, User_binBuildOptions);
      (341,6) makeGemmKernel(tileClKernel, commandQueues[0], tileKernelSource, User_srcBuildOptions, &tileKernelBinary, &tileKernelBinarySize, User_binBuildOptions);
      (480,4) makeGemmKernel(tileClKernel, commandQueues[0], tileKernelSource, User_srcBuildOptions, &tileKernelBinary, &tileKernelBinarySize, User_binBuildOptions);
      (481,4) makeGemmKernel(rowClKernel, commandQueues[0], rowKernelSource, User_srcBuildOptions, &rowKernelBinary, &rowKernelBinarySize, User_binBuildOptions);
      (482,4) makeGemmKernel(columnClKernel, commandQueues[0], columnKernelSource, User_srcBuildOptions, &columnKernelBinary, &columnKernelBinarySize, User_binBuildOptions);
      (483,4) makeGemmKernel(singleClKernel, commandQueues[0], singleKernelSource, User_srcBuildOptions, &singleKernelBinary, &singleKernelBinarySize, User_binBuildOptions);
      (580,4) makeGemmKernel(tileClKernel, commandQueues[0], tileKernelSource, User_srcBuildOptions, &tileKernelBinary, &tileKernelBinarySize, User_binBuildOptions);
      (625,4) makeGemmKernel(tileClKernel, commandQueues[0], tileKernelSource, User_srcBuildOptions, &tileKernelBinary, &tileKernelBinarySize, User_binBuildOptions);
      (670,4) makeGemmKernel(tileClKernel, commandQueues[0], tileKernelSource, User_srcBuildOptions, &tileKernelBinary, &tileKernelBinarySize, User_binBuildOptions);
      (768,3) makeGemmKernel(tileClKernel, commandQueues[0], tileKernelSource, User_srcBuildOptions, &tileKernelBinary, &tileKernelBinarySize, User_binBuildOptions);
    xgemm.cc (4 items)
      (491,25) if (needTileKernel)   makeGemmKernel(  tileClKernel, commandQueues[0],   tileKernelSource, sourceBuildOptions,   &tileKernelBinary,   tileKernelBinarySize, binaryBuildOptions);
      (492,25) if (needRowKernel)    makeGemmKernel(   rowClKernel, commandQueues[0],    rowKernelSource, sourceBuildOptions,    &rowKernelBinary,    rowKernelBinarySize, binaryBuildOptions);
      (493,25) if (needColKernel)    makeGemmKernel(   colClKernel, commandQueues[0],    colKernelSource, sourceBuildOptions,    &colKernelBinary,    colKernelBinarySize, binaryBuildOptions);
      (494,25) if (needCornerKernel) makeGemmKernel(cornerClKernel, commandQueues[0], cornerKernelSource, sourceBuildOptions, &cornerKernelBinary, cornerKernelBinarySize, binaryBuildOptions);

Currently I'm working through a quick command-line debugging tutorial, or how to get to the crash point from VS to get the call stack (I imagine everything will get much easier).

EDIT1: Got successfully hooked up with VS debugger

EDIT2: -cl-std=CL2.0 is present at GemmSpecialCases.cpp line 670, working further up the chain.

EDIT3: It gets really easy from there on as the values are hardcoded:
UserGemmKernelSourceIncludes.h line 11 to 13:

//**** compiler flags
const char * const User_srcBuildOptions = "-cl-std=CL2.0";
//**** online compilation flags
const char * const User_binBuildOptions = "-cl-std=CL2.0";

Is this a clBLAS bug?

jakakonda avatar jakakonda commented on May 24, 2024

Got the full call stack!

(There was a slight problem with the project cmake generated, so initially I couldn't get the program debug database for DeepCL.dll to load, which left some frames missing. Fixed by changing DeepCL project properties -> Linker -> Debugging -> Generate Program Database File to inherit.)

Added call stack with parameter values to gist for better readability.

And a slightly cleaner version of call stack.

clBLAS.dll!makeGemmKernel Line 159  
clBLAS.dll!SGEMM_BRANCH_32 Line 672 
clBLAS.dll!GemmSpecialCases<float> Line 889 
clBLAS.dll!clblasGemm<float> Line 325   
clBLAS.dll!clblasSgemm Line 616 
DeepCL.dll!ClBlasHelper::Gemm Line 53   
DeepCL.dll!BackpropWeightsIm2Col::calcGradWeights Line 88   
DeepCL.dll!BackpropWeightsAuto::calcGradWeights Line 71 
DeepCL.dll!ConvolutionalLayer::backward Line 433    
DeepCL.dll!NeuralNet::backward Line 234 
DeepCL.dll!SGD::train Line 83   
DeepCL.dll!SGD::train Line 103  
DeepCL.dll!Trainer::train Line 52   
DeepCL.dll!NetLearnAction2::run Line 28 
DeepCL.dll!Batcher2::internalTick Line 113  
DeepCL.dll!Batcher2::tick Line 104  
DeepCL.dll!Batcher2::run Line 131   
DeepCL.dll!EpochMaker::run Line 31  
deepcl_unittests.exe!testsimpleconvolvenet_imagesize_5_4_2layers_filtersize_2_4_biased_n3_Test::TestBody Line 630   
deepcl_unittests.exe!testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test,void> Line 3547    
deepcl_unittests.exe!testing::internal::HandleExceptionsInMethodIfSupported<testing::Test,void> Line 3598   
deepcl_unittests.exe!testing::Test::Run Line 3641   
deepcl_unittests.exe!testing::TestInfo::Run Line 3814   
deepcl_unittests.exe!testing::TestCase::Run Line 3929   
deepcl_unittests.exe!testing::internal::UnitTestImpl::RunAllTests Line 5800 
deepcl_unittests.exe!testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool> Line 3547  
deepcl_unittests.exe!testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool> Line 3598 
deepcl_unittests.exe!testing::UnitTest::Run Line 5410   
deepcl_unittests.exe!RUN_ALL_TESTS Line 20059   
deepcl_unittests.exe!main Line 59   
deepcl_unittests.exe!invoke_main Line 75    
deepcl_unittests.exe!__scrt_common_main_seh Line 264    
deepcl_unittests.exe!__scrt_common_main Line 309    
deepcl_unittests.exe!mainCRTStartup Line 17 
kernel32.dll!00007ffa20122d92   Unknown
ntdll.dll!00007ffa22999f64  Unknown

EDIT1: Changed UserGemmKernelSourceIncludes.h line 11 to 13 to version 1.2 just to see what happens. Problematic test passed successfully, but after that nothing pretty...

OpenCL error -38 on line 673
Assertion failed: false, file D:\DeepCL\clMathLibraries\clBLAS\src\library\blas\specialCases\GemmSpecialCases.cpp, line 673

hughperkins avatar hughperkins commented on May 24, 2024

Changed UserGemmKernelSourceIncludes.h line 11 to 13 to version 1.2 just to see what happens. Problematic test passed successfully

Nice!

but after that nothing pretty...

Error -38? That sounds familiar. That was the error I was getting when it was reusing an old kernel from a previous OpenCL context, because the previous one hadn't been cleared down during clblasTeardown. Seems plausible that the cause could be similar this time too.
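
For reference, here is a small lookup for the numeric OpenCL status codes that have come up in this thread. The values follow the definitions in CL/cl.h; `cl_error_name` itself is a hypothetical helper and deliberately covers only these few codes.

```c
#include <stddef.h>

/* Names for the OpenCL status codes that appear in this thread.
   Anything not listed falls through to "unknown". */
static const char *cl_error_name(int code)
{
    switch (code) {
    case 0:   return "CL_SUCCESS";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -38: return "CL_INVALID_MEM_OBJECT";
    case -43: return "CL_INVALID_BUILD_OPTIONS";
    default:  return "unknown";
    }
}
```

Printing `cl_error_name(err)` next to the raw number makes log lines such as "OpenCL error -43" easier to read.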

hughperkins avatar hughperkins commented on May 24, 2024

Hi, it looks like if we define AUTOGEMM_PRINT_DEBUG, it will print out a bunch of useful-looking debugging information. (eg, you can see there is an #ifdef for this at line 106 of xgemm.cc)

(Edit: eg you could add to the top of clBLAS.h:

#define AUTOGEMM_PRINT_DEBUG

)

hughperkins avatar hughperkins commented on May 24, 2024

Hmmm, I think it would be good to find out on line 669 of GemmSpecialCases.cpp:

  • what is the value of tileClKernel? specifically: is it NULL?
  • is tileKernelBinary NULL ?
  • what is the value of tileKernelBinarySize?

I usually need a bit of hacking around to print addresses, because compiler tries to make sure we really do intend to do that, but maybe something like:

printf("kernel %lld binary %lld size %lld\n", (long long)(void *)tileClKernel, (long long)(void *)tileKernelBinary, (long long)tileKernelBinarySize);

... but I remember that in VS, %lld might not be quite right?
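
An alternative that avoids the integer casts is to use the conversions meant for these types. A minimal sketch, assuming a C99-conforming printf family; `format_kernel_state` is a hypothetical helper, and the exact text that %p produces is implementation-defined.

```c
#include <stdio.h>
#include <stddef.h>

/* Format the three values with conversions meant for them:
   %p for object pointers (passed as void *) and %zu for size_t. */
static int format_kernel_state(char *buf, size_t len, const void *kernel,
                               const void *binary, size_t binary_size)
{
    return snprintf(buf, len, "kernel %p binary %p size %zu",
                    (void *)kernel, (void *)binary, binary_size);
}
```

Passing tileClKernel, tileKernelBinary and tileKernelBinarySize (dereferenced first if it is a pointer to the size) would give the same information as the printf above.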

Edit: basically what I reckon is that one of these values is non-zero, and is not being wiped correctly by clblasTeardown. I suspect it is the binary field, since I already added wiping for the kernel field, ie clMathLibraries/clBLAS#163

hughperkins avatar hughperkins commented on May 24, 2024

Oh: sgemm_Col_TN_B1_MX032_NX032_KX16_BRANCH_clKernel is not in AutoGemmClKernels.cpp. Probably needs to be wiped too somehow.

Edit: looks like it is in clMathLibraries/clBLAS/src/library/blas/AutoGemm/UserGemmKernelSources/UserGemmClKernels.h. I can probably add wiping for this somehow. Though what I'm lacking is a test-case...

jakakonda avatar jakakonda commented on May 24, 2024

I'll try to get all of the logs and outputs today in the evening (I'm GMT +1) or at the latest tomorrow in the afternoon. I'm at faculty all day (student here).

hughperkins avatar hughperkins commented on May 24, 2024

Hi, I've upgraded clblas to wipe autogemm userkernels during teardown. I don't have solid evidence that it's the cause of your error, but it seems likely. Can you pull down the latest DeepCL, rerun git submodule update --recursive etc, rebuild, and see if that changes anything when you rerun the tests? (Relevant commits: 971712b in DeepCL, and hughperkins/clBLAS@2111bb0 in clBLAS.)

TimmyLiu avatar TimmyLiu commented on May 24, 2024

Hi I was redirected from clBLAS issue #169. Browsing through your conversation, I have a couple comments.
1. It is a bug in clBLAS "UserGemmKernelSourceIncludes.h", where the compiler flags are hard-coded to CL2.0. It should be configurable through the cmake settings.
2. The reason the inconsistent OpenCL compiler flag works on the older AMD gpu is that the card probably does not support OpenCL 2.0 and thus ignores the OpenCL 2.0 compiler flag. In other words, everything was compiled with 1.2, so at least it is consistent.
3. You certainly have a more intensive test scenario than the clBLAS test suite currently has. That's wonderful.
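
One way the hard-coded flag from point 1 could be made configurable is a compile-time macro with a default. This is only a sketch under stated assumptions: `USER_GEMM_CL_STD` is a hypothetical macro name, not an existing clBLAS or cmake option, and a build system would need to define it as a string literal such as "2.0" to override the default.

```c
/* Hypothetical macro: the build system may define USER_GEMM_CL_STD as a
   string literal (for example "2.0"); otherwise fall back to "1.2". */
#ifndef USER_GEMM_CL_STD
#define USER_GEMM_CL_STD "1.2"
#endif

/* Adjacent string literals are concatenated at compile time. */
static const char *const User_srcBuildOptions = "-cl-std=CL" USER_GEMM_CL_STD;
static const char *const User_binBuildOptions = "-cl-std=CL" USER_GEMM_CL_STD;
```

With no override, both strings come out as -cl-std=CL1.2, which matches the value the generated AutoGemm build options already use in this build.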

jakakonda avatar jakakonda commented on May 24, 2024

Updated everything (redownloaded, reconfigured and rebuilt the entire solution just to be sure), kept UserGemmKernelSourceIncludes.h lines 11 and 13 at version 1.2.

#define AUTOGEMM_PRINT_DEBUG

Added.

printf("kernel %lld binary %lld size %lld\n", (long long)(void *)tileClKernel, (long long)(void *)tileKernelBinary, (long long)tileKernelBinarySize);

VS (surprisingly) did not complain.
Otherwise, here are the last four lines of the unit test output (it fails at the same test):

kernel 140712488066176 binary 0 size 0
makeGemmKernel: "sgemm_Col_TN_B1_MX032_NX032_KX16_BRANCH_src" already built; returning.
OpenCL error -38 on line 674
Assertion failed: false, file D:\DeepCL\clMathLibraries\clBLAS\src\library\blas\specialCases\GemmSpecialCases.cpp, line 674

Entire output from unittests.

If you're making educated guesses and fixing things (working blind basically) and need access to a machine with specific hardware, I'm sure we can work something out...

hughperkins avatar hughperkins commented on May 24, 2024

Hi jakakonda,

Hmmm, ok. Don't suppose, can you provide the output of the following please, just to check you really do have the latest clblas version. Starting from the build directory, of DeepCL:

(cd ../clMathLibraries/clBLAS/; git --no-pager log -n 5 --oneline)
grep GemmClKernels ../clMathLibraries/clBLAS/src/library/blas/init.c
nm clBLAS/src/clBLAS-external-build/library/libclBLAS.so | grep initUserGemm
nm ../dist/lib/libclBLAS.so | grep UserGemm

On my build, this gives:

(cd ../clMathLibraries/clBLAS/; git --no-pager log -n 5 --oneline)
2111bb0 Fix teardown of UserGemmClKernels
3681c78 Fix catch-22 in build order, following pull-163, where init.c compile fails because AutoGemmClKernels.h hasnt been built yet
27ab572 Merge pull request #163 from hughperkins/fix-teardown
c56c725 Fix https://github.com/clMathLibraries/clBLAS/issues/159 , teardown/setup when using autogemm causes next call to gemm to fail (segfault)
16744bf fix 'array initializer must be an initializer list', https://github.com/clMathLibraries/clBLAS/issues/153
user@pear:~/git/DeepCL/build$ grep GemmClKernels ../clMathLibraries/clBLAS/src/library/blas/init.c
#include "UserGemmClKernels.h"
   initUserGemmClKernels();
   initAutoGemmClKernels();
user@pear:~/git/DeepCL/build$ nm clBLAS/src/clBLAS-external-build/library/libclBLAS.so | grep initUserGemm
00000000002b64c0 T initUserGemmClKernels
user@pear:~/git/DeepCL/build$ nm ../dist/lib/libclBLAS.so | grep UserGemm
00000000002b64c0 T initUserGemmClKernels

If you're making educated guesses and fixing things (working blind basically) and need access to a machine with specific hardware, I'm sure we can work something out...

Yes, I lack an AMD card to test on really...

Edit: oh, I sent you linux commands :-P Ok, let me re-ponder that.

jakakonda avatar jakakonda commented on May 24, 2024

I can give you the output from the first two commands...

user@PC /d/DeepCL/build (clblas-2.8.0)
$ (cd ../clMathLibraries/clBLAS/; git --no-pager log -n 5 --oneline)
2111bb0 Fix teardown of UserGemmClKernels
3681c78 Fix catch-22 in build order, following pull-163, where init.c compile fails because AutoGemmClKernels.h hasnt been built yet
27ab572 Merge pull request #163 from hughperkins/fix-teardown
c56c725 Fix https://github.com/clMathLibraries/clBLAS/issues/159 , teardown/setup when using autogemm causes next call to gemm to fail (segfault)
16744bf fix 'array initializer must be an initializer list', https://github.com/clMathLibraries/clBLAS/issues/153

user@PC /d/DeepCL/build (clblas-2.8.0)
$ grep GemmClKernels ../clMathLibraries/clBLAS/src/library/blas/init.c
#include "UserGemmClKernels.h"
   initUserGemmClKernels();
   initAutoGemmClKernels();

...but you slightly lost me with nm. I haven't switched from VS yet (I can do that tomorrow), so instead I tried dumpbin.exe /ALL but couldn't find initUserGemmClKernels. Therefore I hooked up the debugger to make sure it's executed (it is) and calculated SHA1 hashes of both files (they are identical) to make sure they are the same (I assume that was the point behind those two lines).

jakakonda avatar jakakonda commented on May 24, 2024

Added call stack for the current error.
Added call stack for the first 3 calls (explained below in point 2).

While running the unit tests, the if below in initUserGemmClKernels is never true (the inner body, with clReleaseKernel and the reset to NULL, is NEVER executed).
if(sgemm_Col_TN_B1_MX032_NX032_KX16_BRANCH_clKernel != NULL) {
clReleaseKernel(sgemm_Col_TN_B1_MX032_NX032_KX16_BRANCH_clKernel);
sgemm_Col_TN_B1_MX032_NX032_KX16_BRANCH_clKernel = NULL;
}

Based on the values and the error (variable A points to an invalid chunk of memory) I see two options:

  1. The clBLAS part of the stack has an identical value for A, so the problem would be in DeepCL (I find it highly unlikely).
  2. Problem with tileClKernel (= sgemm_Col_TN_B1_MX032_NX032_KX16_BRANCH_clKernel). This chunk of code (GemmSpecialCases.cpp from line 664 on) is executed 3 times before the assertion; after that initUserGemmClKernels is run, and then the fourth time the crash occurs.

hughperkins commented on May 24, 2024

(gah, ctrl-c causes everything to be deleted, starting over again :-( )

Ok, will take a look through this. In the meantime, here is what is happening conceptually. I already typed this once, and accidentally lost it, which is a bit annoying :-P Anyway, so for each of the unit tests, what happens is:

  • an OpenCL context is created
  • an OpenCL command queue is created, in this context
  • an OpenCL kernel is compiled, also bound to the context
  • a command is sent to the queue, to copy data from main memory to the gpu
  • a command is sent to the queue, to run the kernel, using that data
  • a command is sent to the queue, to copy the results back to main memory, from the gpu
  • the kernel is released
  • the command queue is released
  • the context is released

That's how it works without clblas. Note that the kernel is bound to a context. If you create a new context, you'll need to compile the kernel again.
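To make the "kernel is bound to a context" point concrete, here is a minimal sketch. It does not call the real OpenCL API: Context, Queue, Kernel, compileKernel and enqueue are hypothetical stand-ins, used only to model the ownership rule described above.

```cpp
#include <cassert>

// Hypothetical stand-ins for OpenCL objects; NOT the real OpenCL API.
struct Context { int id; };
struct Queue   { const Context* ctx; };  // a queue lives in one context
struct Kernel  { const Context* ctx; };  // a kernel is bound to one context

// "Compiling" a kernel binds it to the context it was built in.
Kernel compileKernel(const Context& ctx) { return Kernel{&ctx}; }

// A kernel can only be enqueued on a queue from the same context.
bool enqueue(const Queue& q, const Kernel& k) { return q.ctx == k.ctx; }

// One unit test: context, queue and kernel are created together, used,
// and then go out of scope (released) together.
bool runOneTest(int id) {
    Context ctx{id};
    Queue queue{&ctx};
    Kernel kernel = compileKernel(ctx);
    assert(kernel.ctx == &ctx);  // bound to the context that built it
    return enqueue(queue, kernel);
}

// Trying to run a kernel built for context a on a queue from context b.
bool reuseAcrossContexts() {
    Context a{1}, b{2};
    Kernel k = compileKernel(a);
    Queue qb{&b};
    return enqueue(qb, k);  // rejected: the contexts differ
}
```

In this model runOneTest succeeds for any id, while reuseAcrossContexts returns false, which is the situation a new context runs into if it is handed an old kernel.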

Now, let's throw clblas into the mix. This means we need to add in a clblasSetup, and a clblasTeardown, and let's add in a call to xgemm:

  • OpenCL context is created
  • opencl queue is created, in this context
  • we call clblasSetup
    • this doesn't do much yet
  • we call xgemm:
    • an appropriate kernel is compiled, and stored in one of the variables, eg in sgemm_Col_TN_B1_MX032_NX032_KX16_BRANCH_clKernel (this is where I got to when I hit ctrl-c on my first attempt at writing this :-P)
    • our data is copied to gpu
    • the kernel is run
    • data copied back from gpu
  • we call clblasTeardown
    • this may or may not do very much
    • ideally it should:
      • release the kernel
      • set the kernel variable to 0
  • the command queue is released
  • the context is released

Ok, so far so good. That will all run ok. But what if we call xgemm twice, within the same test? Well, it will reuse the existing kernel, and that's ok, because it is the same context.

But what is happening is that when we run multiple tests, one after the other, each test runs in its own opencl context. Therefore, it's vital that the kernel variable, in clblas, is set back to 0. Otherwise it will try to use a kernel object created for a different context. And we'll get error -38. So, the test should call clblasTeardown at the end, and clblasTeardown should release the kernel (ideally), and set the kernel variable to 0 (essential).
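A small sketch of that failure mode, under heavy assumptions: fakeGemm, fakeTeardown and runTwoTests are hypothetical names, not clBLAS functions, and the -1 status is just a placeholder for "kernel belongs to another context", not a real OpenCL error code. cachedKernel plays the role of a variable such as sgemm_Col_TN_B1_MX032_NX032_KX16_BRANCH_clKernel.

```cpp
#include <cassert>

// Hypothetical model; NOT the real clBLAS or OpenCL API.
struct Kernel { int contextId; };

// Plays the role of one of the cached kernel variables in clblas.
static Kernel* cachedKernel = nullptr;

// Models xgemm: compile on first use, then reuse the cached kernel.
// Returns 0 on success, -1 if the cached kernel is from another context.
int fakeGemm(int contextId) {
    if (cachedKernel == nullptr) {
        cachedKernel = new Kernel{contextId};  // "compile" for this context
    }
    return cachedKernel->contextId == contextId ? 0 : -1;
}

// Models teardown. With resetVariable == false it does nothing, which is
// the buggy behaviour: the stale kernel stays cached.
void fakeTeardown(bool resetVariable) {
    if (resetVariable && cachedKernel != nullptr) {
        delete cachedKernel;     // release the kernel (ideally)
        cachedKernel = nullptr;  // set the variable to 0 (essential)
    }
}

// Two "tests" in a row, each in its own context (ids 1 and 2).
// Returns the status of the gemm call in the second test.
int runTwoTests(bool teardownResets) {
    delete cachedKernel;         // start from a clean state
    cachedKernel = nullptr;

    int first = fakeGemm(1);     // test 1: compiles for context 1
    assert(first == 0);
    fakeTeardown(teardownResets);

    int second = fakeGemm(2);    // test 2: new context

    delete cachedKernel;         // clean up the model itself
    cachedKernel = nullptr;
    return second;
}
```

With a teardown that resets the variable, runTwoTests(true) returns 0; with one that leaves it alone, runTwoTests(false) returns -1, because the second test picks up the kernel cached for the first test's context.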

So, there are at least a couple of reasons why we might get error -38:

  • reason number 1: clblasTeardown doesn't actually set the kernel variable to 0 (this was the case for the autogemm kernels, but I patched that; it is still the case for the usergemm kernels in clblas master, but in theory I've patched that too, in a branch)
  • reason number 2: maybe one or more deepcl tests aren't calling clblasTeardown

It's possible that both these reasons are occurring in fact...

Anyway, this is how I see the issue conceptually.

hughperkins commented on May 24, 2024

2, the reason why the inconsistent OpenCL compiler flag works for older AMD gpu is that card probably does not support OpenCL2.0 and thus ignored the compiler flag of OpenCL2.0. In other words,

Hi Timmy,

Thanks! Couple of questions:

  1. why would the compiler ignore an unknown flag? I would expect it to choke on a strange unknown flag?
  2. why does compiling for opencl 2.0 on an opencl 2.0 card cause a failure?

hughperkins commented on May 24, 2024

jakakonda, I'm going to see if I can create a test case that calls UserKernels, so I can check whether the kernels are being cleaned up during teardown.

hughperkins commented on May 24, 2024

Hi jakakonda and Timmy, logged an issue for the 2.0 hardcoding issue at clMathLibraries/clBLAS#172 , including a sample test case, that fails ok on my machine too.

hughperkins commented on May 24, 2024

Hi jakakonda, I've reproduced the bug you see above on my machine, see test case at clMathLibraries/clBLAS#169 (comment) , and fixed these in patches to clblas hughperkins/clBLAS@79ea756 and hughperkins/clBLAS@7d708e4 , and pushed this to branch clblas-2.8.0 of DeepCL.

Don't suppose... can you pull down the latest version of the DeepCL clblas-2.8.0 branch, and run git submodule update --recursive, and rebuild, and retry please?

jakakonda commented on May 24, 2024

Hi, sorry for the slow response.
I'm gonna let the tests speak for themselves :).

[----------] Global test environment tear-down
[==========] 160 tests from 29 test cases ran. (107904 ms total)
[  PASSED  ] 158 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] testsinglebatch.imagesize5_filtersize3_batchsize2_10filters
[  FAILED  ] testNorbLoader.load1000

 2 FAILED TESTS
  YOU HAVE 2 DISABLED TESTS

And here is the full output for those failed tests, in case there is anything interesting in it.
The MNIST demo runs without any problems (on Intel and AMD).

hughperkins commented on May 24, 2024

Cool :-) I think this is really closed this time :-) Of the two tests that failed, one is just stochastic (I should probably make it less so), and the other requires the norb dataset, so they don't worry me. Cool, looks good :-)

jakakonda commented on May 24, 2024

Things quickly deviated from the primary problem (the VS2015 build errors), but everything got solved eventually :-). Thanks a lot!

hughperkins commented on May 24, 2024

:-D I put your name in the 'recent changes' on the front page by the way, for helping with getting this working (*). Thank you for all your help. Very much appreciated :-)

(*) If you don't want your name there, feel free to let me know, and I can remove it. Up to you :-)
