I'm having an issue trying to get mamba running on 3xP40. The model will load into vr


RuntimeError: CUDA error: no kernel image is available for execution on the device on 3xP40 about mamba HOT 5 OPEN

DewEfresh commented on August 28, 2024 2

RuntimeError: CUDA error: no kernel image is available for execution on the device on 3xP40


Comments (5)

Tworan commented on August 28, 2024 3

I ran into the same issue with a GTX 1080 Ti, a Pascal card that needs compute_60. In setup.py, both mamba_ssm and causal_conv1d compile only for compute_70, compute_80, and compute_90; compute_60 (which corresponds to sm_60) is missing:

mamba/setup.py, lines 108 to 114 in 86a3a90:

    cc_flag.append("-gencode")
    cc_flag.append("arch=compute_70,code=sm_70")
    cc_flag.append("-gencode")
    cc_flag.append("arch=compute_80,code=sm_80")
    if bare_metal_version >= Version("11.8"):
        cc_flag.append("-gencode")
        cc_flag.append("arch=compute_90,code=sm_90")

So I added two lines to the setup.py of both mamba_ssm and causal_conv1d, reinstalled both packages, and it works for me.

    # Added: also build for Pascal (sm_60)
    cc_flag.append("-gencode")
    cc_flag.append("arch=compute_60,code=sm_60")
    # Existing targets, unchanged
    cc_flag.append("-gencode")
    cc_flag.append("arch=compute_70,code=sm_70")
    cc_flag.append("-gencode")
    cc_flag.append("arch=compute_80,code=sm_80")
    if bare_metal_version >= Version("11.8"):
        cc_flag.append("-gencode")
        cc_flag.append("arch=compute_90,code=sm_90")

I think compute_60 may have been left out because Pascal GPUs do not support the bf16 data type, so functions that use bf16 may be unusable on these cards.
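The change above amounts to emitting one `-gencode` pair per target architecture. A minimal sketch of that pattern, assuming a helper named `gencode_flags` (hypothetical, not part of mamba's setup.py) and the same `arch=compute_XX,code=sm_XX` flag format as in the snippet above:

```python
from typing import Iterable, List


def gencode_flags(archs: Iterable[str]) -> List[str]:
    """Build nvcc -gencode flags for each architecture number, e.g. "60"."""
    flags: List[str] = []
    for arch in archs:
        # Each target needs the "-gencode" switch followed by its arch/code pair.
        flags.append("-gencode")
        flags.append(f"arch=compute_{arch},code=sm_{arch}")
    return flags


# Hypothetical target list: Pascal added in front of the existing targets.
cc_flag = gencode_flags(["60", "70", "80"])
print(cc_flag)
```

Keeping the targets in one list makes it harder to forget an architecture when the same edit has to be repeated in a second setup.py.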


hhhhpaaa commented on August 28, 2024 2

I built a wheel for Python 3.10, PyTorch 2.1, and CUDA 11.8 that supports compute_60. For details, please refer to the link.
@SamsongB @DewEfresh


masc-it commented on August 28, 2024

Have you tried running it on a single GPU (maybe a smaller one)? If that works, I think there are some issues with the custom operators and DDP.

Or it could be that the P40 is missing some kernels that the optimized mamba version uses. On my Ampere-series card there are no problems whatsoever.
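To try the single-GPU suggestion, one option is to hide the other cards through the `CUDA_VISIBLE_DEVICES` environment variable before CUDA is initialized. A minimal sketch, assuming a helper named `restrict_to_gpu` (hypothetical) that is called before `torch` is imported:

```python
import os


def restrict_to_gpu(index: int) -> None:
    """Expose only one physical GPU to CUDA for this process."""
    # Set this before the CUDA runtime initializes (for example, before
    # importing torch); changing it later has no effect on visible devices.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)


restrict_to_gpu(0)
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

If the error still appears with a single visible card, DDP is unlikely to be the cause.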


SamsongB commented on August 28, 2024

I also see the same issue with a GM200 [GeForce GTX TITAN X]. I tried the above solution with compute_60 but still cannot get it to run. Is there an update on this?
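One thing worth checking here: flags of the form `arch=compute_60,code=sm_60` embed only sm_60 machine code, which CUDA runs on devices with the same major compute capability and an equal or higher minor version. The GM200-based GTX TITAN X is, as far as I know, compute capability 5.2, so an sm_60 build would not cover it. An `arch=compute_52,code=sm_52` entry would be the analogous addition, though it is untested here. The sketch below illustrates that rule; `sass_covers` is a hypothetical helper, not a CUDA or PyTorch API.

```python
from typing import Iterable, Tuple

Capability = Tuple[int, int]  # (major, minor), e.g. (6, 1)


def sass_covers(device_cc: Capability, compiled_sms: Iterable[Capability]) -> bool:
    """Return True if any compiled sm_XY binary can run on the device.

    Rule assumed: a binary built for X.y runs on devices X.z with z >= y.
    PTX fallback is ignored because the flags in this thread embed none.
    """
    major, minor = device_cc
    return any(m == major and n <= minor for m, n in compiled_sms)


patched_build = [(6, 0), (7, 0), (8, 0)]
print(sass_covers((6, 1), patched_build))  # Pascal card, compute capability 6.1
print(sass_covers((5, 2), patched_build))  # Maxwell card, compute capability 5.2
```

On a machine with PyTorch installed, `torch.cuda.get_device_capability()` returns the `(major, minor)` tuple to pass as `device_cc`.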


a987042035 commented on August 28, 2024

@DewEfresh Hello. Have you solved your problem? I get the same error on a single P40 with CUDA 11.8. Please help.
