Running the example from the posts/tensor-cores folder

Getting errors running the tensor-cores example (nvidia-developer-blog/code-samples)

Comments (4)

agschrei commented on June 14, 2024

Hey everyone,
I recently ran into the same problem with CUDA 11. For me, it was an issue with the device code that got generated.

If you want to run this sample on Turing, make sure you compile with the -gencode arch=compute_75,code=sm_75 flags.
Running a binary compiled for a Volta target (sm_70) on Turing produces the error above. I'm guessing the wmma instructions are so low-level that they are not compatible between architectures.
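
As a rough illustration of how the flag follows from the device, here is a minimal C++ sketch that builds the -gencode string from a compute capability (7.0 for Volta, 7.5 for Turing, 8.0 for A100, 8.6 for RTX 3090). The helper name gencode_flag is hypothetical and not part of the sample. In a CUDA program, the major and minor numbers could be read from the major and minor fields that cudaGetDeviceProperties fills in.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the tensor-cores sample): builds the nvcc
// -gencode flag for a device with compute capability <major>.<minor>.
std::string gencode_flag(int major, int minor) {
    // Compute capabilities are written as two digits, e.g. 7.5 -> "75".
    assert(major > 0 && minor >= 0 && minor < 10);
    const std::string cc = std::to_string(major) + std::to_string(minor);
    return "-gencode arch=compute_" + cc + ",code=sm_" + cc;
}
```

For a Turing card, gencode_flag(7, 5) gives the same flag quoted above.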

I'm just leaving this here for future reference, hoping I'll save somebody a lot of head-scratching.


Ivanrs297 commented on June 14, 2024

Which version of nvcc are you using, and did you solve it?


yofufufufu commented on June 14, 2024

Device: RTX3090
In CMakeLists: set(CMAKE_CUDA_ARCHITECTURES 86)
NVCC version: 11.1
I get the same issue. Can anyone help?


ken012git commented on June 14, 2024

Hi, I have the same issue.
Device: A100
NVCC version: 11.1
I tried -arch=sm_80 but it does not work for me.

The results seem correct after reducing MATRIX_M, MATRIX_N, and MATRIX_K from 16384 to 1024.
I think the 0.01% relative tolerance and 1e-5 absolute tolerance in the code are too tight for a large matrix such as 16384x16384.
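
To make the tolerance point concrete, here is a minimal host-side sketch of a combined relative/absolute comparison. It is an assumption about the general shape of such a check, not the sample's actual verification code. The names close_enough and count_mismatches are hypothetical. The values 1e-4 (0.01%) and 1e-5 are the tolerances mentioned above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical check (not the sample's code): two values match when their
// difference is within abs_tol, or within rel_tol of the larger magnitude.
bool close_enough(float a, float b, float rel_tol, float abs_tol) {
    assert(rel_tol >= 0.0f && abs_tol >= 0.0f);
    const float diff = std::fabs(a - b);
    const float scale = std::fmax(std::fabs(a), std::fabs(b));
    return diff <= abs_tol || diff <= rel_tol * scale;
}

// Counts the element pairs that fail the check above.
std::size_t count_mismatches(const std::vector<float>& x,
                             const std::vector<float>& y,
                             float rel_tol, float abs_tol) {
    assert(x.size() == y.size());
    std::size_t bad = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (!close_enough(x[i], y[i], rel_tol, abs_tol)) {
            ++bad;
        }
    }
    return bad;
}
```

With rel_tol = 1e-4f, a pair such as 1000.0 and 1000.5 is reported as a mismatch. Loosening rel_tol to 1e-3f accepts it, which is the kind of adjustment a large K might call for.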

However, I did not get a speedup with 1024x1024 matrices:
wmma took 0.300032ms
cublas took 0.041984ms
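
For comparison, the timings above can be turned into throughput with the usual 2*M*N*K operation count for a GEMM. This is a small sketch with hypothetical helper names (gemm_flops, tflops). It assumes the reported times cover only the matrix multiply.

```cpp
#include <cassert>

// Floating-point operations in an M x N x K GEMM: one multiply and one add
// per term of each inner product.
double gemm_flops(double m, double n, double k) {
    return 2.0 * m * n * k;
}

// Converts an operation count and an elapsed time in milliseconds to TFLOP/s.
double tflops(double flops, double elapsed_ms) {
    assert(elapsed_ms > 0.0);
    return flops / (elapsed_ms * 1e-3) / 1e12;
}
```

Under that assumption, the 1024x1024x1024 runs work out to roughly 7 TFLOP/s for wmma and 51 TFLOP/s for cuBLAS, about a 7x gap. At this size, fixed launch overhead may account for part of the measured time.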

I guess we should just use cuBLAS, or refer to the faster implementation here.

   // Use tensor cores
   cublasErrCheck(cublasSetMathMode(cublasHandle, CUBLAS_TENSOR_OP_MATH));

