System information (version) mandelbulber2 => 2.29-dev

Error during compilation of OpenCL program (after nvidia driver upgrade) about mandelbulber2 HOT 27 CLOSED

nikolas-davis80 commented on June 2, 2024

Error during compilation of OpenCL program (after nvidia driver upgrade)

from mandelbulber2.

Comments (27)

buddhi1980 commented on June 2, 2024 1

I have finally reported it to the Debian maintainers.

buddhi1980 commented on June 2, 2024

Thanks for reporting it. I haven't tried this new driver yet. I still use version 470 (which is still in Debian testing).
Unfortunately, the compiler error message tells us nothing (the build log is empty). It looks like an error in the OpenCL compiler itself.
The only thing that might help (I hope) is deleting the ~/.nv folder (it holds the OpenCL compiler cache).
I will try to find out whether other people report similar problems with this driver version.
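
When the build log comes back empty, the numeric status returned by the build call is often the only remaining clue. A minimal sketch, assuming the host code can print that status: the helper below maps a few status values to their symbolic names. The numeric values are the ones defined in the Khronos OpenCL headers; the function name clErrorName is hypothetical and is not part of OpenCL or Mandelbulber.

```cpp
#include <string>

// Hypothetical helper (not part of OpenCL or Mandelbulber): maps a few
// OpenCL status codes to their symbolic names. The numeric values match
// the Khronos headers; codes not listed here fall through to "unknown".
std::string clErrorName(int code) {
  switch (code) {
    case 0:     return "CL_SUCCESS";
    case -1:    return "CL_DEVICE_NOT_FOUND";
    case -2:    return "CL_DEVICE_NOT_AVAILABLE";
    case -3:    return "CL_COMPILER_NOT_AVAILABLE";
    case -5:    return "CL_OUT_OF_RESOURCES";
    case -6:    return "CL_OUT_OF_HOST_MEMORY";
    case -11:   return "CL_BUILD_PROGRAM_FAILURE";
    case -43:   return "CL_INVALID_BUILD_OPTIONS";
    case -1001: return "CL_PLATFORM_NOT_FOUND_KHR";
    default:    return "unknown OpenCL status " + std::to_string(code);
  }
}
```

Printing clErrorName(err) next to the empty build log would, for example, distinguish a missing compiler (CL_COMPILER_NOT_AVAILABLE) from a generic build failure (CL_BUILD_PROGRAM_FAILURE).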

nikolas-davis80 commented on June 2, 2024

Thank you. Unfortunately, deleting ~/.nv folder does not solve the problem - I get the same error, and the folder is recreated.

Concerning the driver: I can confirm mandelbulber2 with OpenCL was working fine with previously installed version 515, prior to the update.

One thing I don't understand (sorry, I'm a newbie wrt opencl): why do I get a compile error at runtime? Shouldn't there be an error while compiling mandelbulber already?

Please let me know if you think of any diagnostics I could run.

buddhi1980 commented on June 2, 2024

OpenCL programs work differently from CPU programs. GPU programs are compiled by the graphics driver just before they run. Thanks to that, they are well optimized for the given graphics card and the given task.
This is why you see compiler errors at runtime.

raisingw commented on June 2, 2024

Can confirm this. I'm also on EndeavourOS and updated this morning to the new driver and kernel. Nothing seems to fix the issue, and I too get an empty build log, so I'm not sure where the error might be.

buddhi1980 commented on June 2, 2024

I have one test to run to see whether the problem is in Mandelbulber or in the OpenCL library.
To do the test:
install the opencl-nvidia, git, make and gcc packages

open a terminal, create a work folder and go into it. For example:
cd ~
mkdir work
cd work

and run the following commands:
git clone https://github.com/rsnemmen/OpenCL-examples.git
cd OpenCL-examples
cd add_numbers
make
./add_numbers

The last command should run the test and display:
Computed sum = 2016.0.
Check passed.

Please let me know if you can run this test.

raisingw commented on June 2, 2024

12:43:26 [kirk@gaudi add_numbers]$ make
gcc -std=c99 -Wall -DUNIX -g -DDEBUG -m64 -o add_numbers add_numbers.c   -lOpenCL
In file included from /usr/include/CL/cl.h:20,
                 from add_numbers.c:14:
/usr/include/CL/cl_version.h:22:9: note: ‘#pragma message: cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 300 (OpenCL 3.0)’
   22 | #pragma message("cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 300 (OpenCL 3.0)")
      |         ^~~~~~~
add_numbers.c: In function ‘main’:
add_numbers.c:197:4: warning: ‘clCreateCommandQueue’ is deprecated [-Wdeprecated-declarations]
  197 |    queue = clCreateCommandQueue(context, device, 0, &err);
      |    ^~~~~
/usr/include/CL/cl.h:1913:1: note: declared here
 1913 | clCreateCommandQueue(cl_context                     context,
      | ^~~~~~~~~~~~~~~~~~~~
12:43:29 [kirk@gaudi add_numbers]$ ./add_numbers

12:43:36 [kirk@gaudi add_numbers]$

Output is blank, must be a driver thing. I suppose it's a case of waiting for a fix or roll back.

Thanks for the help though!

nikolas-davis80 commented on June 2, 2024

I get a different output:

$ ./add_numbers
Couldn't access any devices: File exists

and:

$ clinfo --list
Platform #0: Clover
Platform #1: NVIDIA CUDA
 `-- Device #0: NVIDIA GeForce GTX 960M

so, I'm assuming I need to set some environment variable to access my NVIDIA card, but I'm not sure which one. Any ideas?

Edit: Some more (hopefully) useful info:

$ ldd add_numbers
        linux-vdso.so.1 (0x00007fff0ad7d000)
        libOpenCL.so.1 => /opt/cuda/lib64/libOpenCL.so.1 (0x00007f7541c00000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f7541a19000)
        libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f7541e1f000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f7541e1a000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f7541e57000)

raisingw commented on June 2, 2024

clinfo output:

14:03:00 [kirk@gaudi ~]$ clinfo --list
Platform #0: NVIDIA CUDA
 +-- Device #0: NVIDIA GeForce GT 1030
 `-- Device #1: NVIDIA GeForce GTX 750 Ti
Platform #1: Portable Computing Language
 `-- Device #0: pthread-Intel(R) Core(TM) i7 CPU         965  @ 3.20GHz
Platform #2: Clover

PoCL does work and renders the rectangular work groups in single precision as you'd expect (though you can see the age of my computer here, it's not a viable option due to how slow doing this on CPU is)

Got to be a driver thing.

nikolas-davis80 commented on June 2, 2024

@raisingw
how did you get PoCL to work? It seems to me that the nVidia OpenCL driver is working, since I get CUDA functionality in other applications, e.g. CuPy. But for some reason, I cannot access the device when running the OpenCL examples and/or mandelbulber2.

I believe this could be solved with setting some environment variable. I will try to ask around in the EndeavourOS forums.

raisingw commented on June 2, 2024

It was just a case of installing pocl from the main community repo. By default it only works on my CPU.

There is a pocl-cuda-git package in the AUR that I just tried. It drags in cuda and llvm and then compiles, but despite seeing my two GPUs and listing them, it simply crashed Mandelbulber after the compile stage.

thebatguy commented on June 2, 2024

I had the same issue after upgrading to the latest OpenCL driver on EndeavourOS with two GTX 960s, but I seem to have solved it by installing the lib32 beta package "lib32-opencl-nvidia-beta 520.56.06-1" from the AUR with yay opencl-nvidia (number 4 in the list).
It was actually an accident on my side: I wanted to install the opencl-beta (64-bit) and picked this one by mistake.
During installation it also asks to replace some nvidia packages.

I had Mandelbulber open and it suddenly worked again in OpenCL mode, without even needing a reboot.

raisingw commented on June 2, 2024

Can also confirm that installing the standard opencl-nvidia-beta package from that list gets Mandelbulber working again.

Nvidia has done something weird to their OpenCL module. Not a Mandelbulber issue.

nikolas-davis80 commented on June 2, 2024

@raisingw
Can you please clarify if you installed the opencl-nvidia-beta or the lib32-opencl-nvidia-beta (32bit version) from AUR, or both? Also, did you have to replace any standard packages in the process?

One more thing: I think this fix is unrelated to the OpenCL examples test. Looking at add_numbers.c, I see it tries to use the 1st device of the 1st OpenCL platform in the system. In my case:

$ clinfo --list
Platform #0: Clover
Platform #1: NVIDIA CUDA
 `-- Device #0: NVIDIA GeForce GTX 960M

So, the 1st platform is "Clover", and no device is found. I suspect if I modified the code to use the 2nd platform - or set some environment variables to disable Clover - that it would run, but I haven't found out how to do that; if someone could instruct me, I'd appreciate it. In the case of raisingw, though, I see the 1st platform is the NVIDIA CUDA, so the result would be different.
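
The idea of skipping a platform that has no devices can be sketched without touching the driver at all. A minimal sketch, assuming the platform names and device counts have already been collected (in real host code they would come from the OpenCL platform and device queries): PlatformSummary and firstPlatformWithDevice are hypothetical names used only for illustration, not part of add_numbers.c or of OpenCL.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Plain stand-in for what the OpenCL API reports about one platform.
// In real host code these fields would be filled from the platform
// name and device list queries; here they are ordinary data.
struct PlatformSummary {
  std::string name;
  std::size_t deviceCount;
};

// Returns the index of the first platform that exposes at least one
// device, or -1 if none does. Hypothetical helper for illustration.
int firstPlatformWithDevice(const std::vector<PlatformSummary>& platforms) {
  for (std::size_t i = 0; i < platforms.size(); ++i) {
    if (platforms[i].deviceCount > 0) return static_cast<int>(i);
  }
  return -1;
}
```

With the listing above, that is {"Clover", 0} followed by {"NVIDIA CUDA", 1}, the function returns 1, so the NVIDIA platform would be used instead of the empty Clover one.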

nikolas-davis80 commented on June 2, 2024

I confirm the solution by @thebatguy works for me (installing lib32-opencl-nvidia-beta & dependencies from AUR).

raisingw commented on June 2, 2024

Just the standard opencl-nvidia-beta package. I wanted to check if it worked without resorting to 32-bit packages and it does.

Installing this replaced nvidia-utils with nvidia-utils-beta. Both packages report version 520.56.06-1 though so I don't understand why the main repo version is broken while the AUR beta package is working.

EurekaChen commented on June 2, 2024

I still get a slow “Compiling OpenCL Programs” step, more than 20 seconds, with the new Nvidia driver on Windows with an RTX 3090.

palWorx commented on June 2, 2024

Same problem here on Debian sid (6.1.0-3-amd64).

I haven't used Mandelbulber for some time (shame on me) and updated nvidia today to 525.85.12-1.

Versions used: Mandelbulber_v2-2.28-x86_64.appimage & Mandelbulber_v2-6d5d221-x86_64.AppImage

add_numbers output is blank.

$ clinfo --list
Platform #0: NVIDIA CUDA
 `-- Device #0: Quadro K2200
Platform #1: Portable Computing Language
 `-- Device #0: pthread-ivybridge-Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz

I am on sid; there is no such package as opencl-nvidia-beta or lib32-opencl-nvidia-beta, and I am already using non-free-firmware in sources.list.

Feel free to ask me to run tests if it helps you find a solution.

buddhi1980 commented on June 2, 2024

Now I have the same problem on Debian. After I upgraded the driver to version 520, I also get an OpenCL compiler error without any error output. It looks like the driver is broken, or NVidia changed something that is, as always, not documented.

buddhi1980 commented on June 2, 2024

The problem is definitely not in Mandelbulber. I also cannot run a minimal OpenCL example program like this one:

#include <CL/opencl.hpp>
#include <algorithm>
#include <cstdio>    // printf
#include <cstdlib>   // exit
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

using namespace std;
using namespace cl;

int factorial(int n) { return (n <= 1) ? 1 : n * factorial(n - 1); }

Platform getPlatform() {
  /* Returns the first platform found. */
  std::vector<Platform> all_platforms;
  Platform::get(&all_platforms);

  if (all_platforms.size() == 0) {
    cout << "No platforms found. Check OpenCL installation!\n";
    exit(1);
  }
  return all_platforms[0];
}

Device getDevice(Platform platform, int i, bool display = false) {
  /* Returns the device specified by the index i on platform.
   * If display is true, then all of the devices are listed.
   */
  std::vector<Device> all_devices;
  platform.getDevices(CL_DEVICE_TYPE_ALL, &all_devices);
  if (all_devices.size() == 0) {
    cout << "No devices found. Check OpenCL installation!\n";
    exit(1);
  }

  if (display) {
    for (size_t j = 0; j < all_devices.size(); j++)
      printf("Device %zu: %s\n", j,
             all_devices[j].getInfo<CL_DEVICE_NAME>().c_str());
  }
  // Guard against an out-of-range index (e.g. i = 1 on a single-GPU system).
  if (i < 0 || static_cast<size_t>(i) >= all_devices.size()) {
    cout << "Device index " << i << " is out of range.\n";
    exit(1);
  }
  return all_devices[i];
}

int main() {
  const int n = 1024;   // size of vectors
  const int c_max = 5;  // max value to iterate to
  const int coeff = factorial(c_max);

  int A[n], B[n], C[n];  // A is initial, B is result, C is expected result
  for (int i = 0; i < n; i++) {
    A[i] = i;
    C[i] = coeff * i;
  }
  Platform default_platform = getPlatform();
  // Index 1 selects the second device on the platform; use 0 if there is only one.
  Device default_device = getDevice(default_platform, 1);
  Context context({default_device});
  Program::Sources sources;

  std::string kernel_code =
      "void kernel multiply_by(global int* A, const int c) {"
      "   A[get_global_id(0)] = c * A[get_global_id(0)];"
      "}";
  sources.push_back({kernel_code.c_str(), kernel_code.length()});

  Program program(context, sources);
  if (program.build({default_device}) != CL_SUCCESS) {
    cout << "Error building: "
         << program.getBuildInfo<CL_PROGRAM_BUILD_LOG>(default_device)
         << std::endl;
    exit(1);
  }

  Buffer buffer_A(context, CL_MEM_READ_WRITE, sizeof(int) * n);
  CommandQueue queue(context, default_device);
  queue.enqueueWriteBuffer(buffer_A, CL_TRUE, 0, sizeof(int) * n, A);

  Kernel multiply_by = Kernel(program, "multiply_by");
  multiply_by.setArg(0, buffer_A);

  for (int c = 2; c <= c_max; c++) {
    multiply_by.setArg(1, c);
    queue.enqueueNDRangeKernel(multiply_by, NullRange, NDRange(n), NDRange(32));
  }

  queue.enqueueReadBuffer(buffer_A, CL_TRUE, 0, sizeof(int) * n, B);

  if (std::equal(std::begin(B), std::end(B), std::begin(C)))
    cout << "Arrays are equal!" << endl;
  else
    cout << "Uh-oh, the arrays aren't equal!" << endl;

  return 0;
}

palWorx commented on June 2, 2024

So it's a Nvidia problem, but which package:
nvidia-opencl-common 525.85.12-1 amd64 NVIDIA OpenCL driver - common files
nvidia-opencl-icd:amd64 525.85.12-1 amd64 NVIDIA OpenCL installable client driver (ICD)
ocl-icd-libopencl1:amd64 2.3.1-1 amd64 Generic OpenCL ICD Loader
ocl-icd-opencl-dev:amd64 2.3.1-1 amd64 OpenCL development files

I am not lazy, but I honestly don't know which package I should open a bug against and what to report. Will you do it, or can you tell me in detail whom to contact and what to say?

buddhi1980 commented on June 2, 2024

You should report it against just the nvidia-driver package.
I will also report it, but Debian testing currently has version 520 and there is already version 525 in unstable. I hesitate to install it because I need the system running. I will wait until version 525 is in testing. If it still doesn't work, I will file the bug report.

palWorx commented on June 2, 2024

What should I report?
Telling them that Mandelbulber isn't working will not raise their interest.
Quote: "I also cannot run minimal OpenCL example program...". Shall I report that to the maintainer?

buddhi1980 commented on June 2, 2024

Here is the bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1031080
I have already upgraded the driver to 525 and the problem still exists.

palWorx commented on June 2, 2024

I have already seen the bug report. Thanks for your effort. I am curious to see if something happens.

buddhi1980 commented on June 2, 2024

I have a workaround for Debian:
install the libnvidia-nvvm4 package and create symbolic links to the files libnvidia-nvvm.so.4 and libnvidia-nvvm.so.525.85.12 in /usr/lib/x86_64-linux-gnu/

palWorx commented on June 2, 2024

Thank you very much for your efforts! It works again.
