cudnn-python-wrappers

Python wrappers for the NVIDIA cuDNN libraries.

This is a set of minimal Python wrappers for the NVIDIA cuDNN library of convolutional neural network primitives. NVIDIA cuDNN is available free of charge, but requires an NVIDIA developer account to download. Users should follow the cuDNN API documentation to use these wrappers, as they faithfully replicate the cuDNN C API.

These wrappers expose the full cuDNN API as Python functions, but are minimalistic in that they don't implement any higher-order functionality, such as operating directly on data structures like PyCUDA GPUArray or cudamat CUDAMatrix. Since the interface faithfully replicates the C API, the user is responsible for allocating and deallocating handles to all cuDNN data structures and for passing references to arrays as pointers. However, cuDNN status codes are translated to Python exceptions. The most common use of these wrappers will be alongside PyCUDA, but they will work equally well with other frameworks such as CUDAMat.
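
Because handles must be released by hand, it can help to tie each create call to its matching destroy call. The following is a minimal sketch, not part of the wrappers: `managed_handle` is a hypothetical helper, and the stand-in create/destroy functions only simulate a handle so that the example runs without a GPU.

```python
from contextlib import contextmanager

@contextmanager
def managed_handle(create, destroy):
    """Hypothetical helper: yield create() and always call destroy() on it."""
    handle = create()
    try:
        yield handle
    finally:
        # Runs even if the body raises, so the handle is not leaked.
        destroy(handle)

# Stand-ins for a create/destroy pair, used so the sketch runs anywhere.
live_handles = set()

def fake_create():
    handle = len(live_handles) + 1
    live_handles.add(handle)
    return handle

def fake_destroy(handle):
    live_handles.discard(handle)

with managed_handle(fake_create, fake_destroy) as desc:
    assert desc in live_handles  # handle is valid inside the block

assert not live_handles  # handle was released on exit
```

With the wrappers, the same helper could presumably be called with a pair such as `libcudnn.cudnnCreateTensorDescriptor` and `libcudnn.cudnnDestroyTensorDescriptor`.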

This version of cudnn-python-wrappers targets cudnn-8.0-v6.0. Please use version 1.x of the wrappers for cudnn-6.5-R1. Please use version 2.1b of the wrappers for cudnn-7.0-v4.0.

Users need to make sure that they pass all arguments as the correct data type, that is, ctypes.c_void_p for all handles and array pointers and ctypes.c_int for all integer arguments and enums. Here is an example of how to perform forward convolution on a PyCUDA GPUArray:

import pycuda.autoinit
import pycuda.driver as drv
from pycuda import gpuarray
import libcudnn, ctypes
import numpy as np

# Create a cuDNN context
cudnn_context = libcudnn.cudnnCreate()

# Set some options and tensor dimensions
tensor_format = libcudnn.cudnnTensorFormat['CUDNN_TENSOR_NCHW']
data_type = libcudnn.cudnnDataType['CUDNN_DATA_FLOAT']
convolution_mode = libcudnn.cudnnConvolutionMode['CUDNN_CROSS_CORRELATION']
convolution_fwd_pref = libcudnn.cudnnConvolutionFwdPreference['CUDNN_CONVOLUTION_FWD_PREFER_FASTEST']

start, end = (drv.Event(), drv.Event())

def start_bench():
    start.record()

def end_bench(op):
    # Stop the timer, wait for the GPU to finish, and report the elapsed time
    end.record()
    end.synchronize()
    msecs = end.time_since(start)
    print("%s: %7.3f msecs" % (op, msecs))

n_input = 64
filters_in = 128
filters_out = 128
height_in = 112
width_in = 112
height_filter = 7
width_filter = 7
pad_h = 3
pad_w = 3
vertical_stride = 1
horizontal_stride = 1
upscalex = 1
upscaley = 1
alpha = 1.0
beta = 1.0

# Input tensor
X = gpuarray.to_gpu(np.random.rand(n_input, filters_in, height_in, width_in)
    .astype(np.float32))

# Filter tensor
filters = gpuarray.to_gpu(np.random.rand(filters_out,
    filters_in, height_filter, width_filter).astype(np.float32))

# Descriptor for input
X_desc = libcudnn.cudnnCreateTensorDescriptor()
libcudnn.cudnnSetTensor4dDescriptor(X_desc, tensor_format, data_type,
    n_input, filters_in, height_in, width_in)

# Filter descriptor
filters_desc = libcudnn.cudnnCreateFilterDescriptor()
libcudnn.cudnnSetFilter4dDescriptor(filters_desc, data_type, tensor_format, filters_out,
    filters_in, height_filter, width_filter)

# Convolution descriptor
conv_desc = libcudnn.cudnnCreateConvolutionDescriptor()
libcudnn.cudnnSetConvolution2dDescriptor(conv_desc, pad_h, pad_w,
    vertical_stride, horizontal_stride, upscalex, upscaley,
    convolution_mode, data_type)

# Get output dimensions (first two values are n_input and filters_out)
_, _, height_output, width_output = libcudnn.cudnnGetConvolution2dForwardOutputDim(
    conv_desc, X_desc, filters_desc)

# Output tensor
Y = gpuarray.empty((n_input, filters_out, height_output, width_output), np.float32)
Y_desc = libcudnn.cudnnCreateTensorDescriptor()
libcudnn.cudnnSetTensor4dDescriptor(Y_desc, tensor_format, data_type, n_input,
    filters_out, height_output, width_output)

# Get pointers to GPU memory
X_data = ctypes.c_void_p(int(X.gpudata))
filters_data = ctypes.c_void_p(int(filters.gpudata))
Y_data = ctypes.c_void_p(int(Y.gpudata))

# Perform convolution
algo = libcudnn.cudnnGetConvolutionForwardAlgorithm(cudnn_context, X_desc,
    filters_desc, conv_desc, Y_desc, convolution_fwd_pref, 0)

print("Cudnn algorithm = %d" % algo.value)

ws_size = libcudnn.cudnnGetConvolutionForwardWorkspaceSize(cudnn_context, X_desc, filters_desc, conv_desc, Y_desc, algo)
ws_ptr  = drv.mem_alloc(ws_size.value) if ws_size.value > 0 else 0
ws_data = ctypes.c_void_p(int(ws_ptr))

start_bench()

libcudnn.cudnnConvolutionForward(cudnn_context, alpha, X_desc, X_data,
    filters_desc, filters_data, conv_desc, algo, ws_data, ws_size.value, beta,
    Y_desc, Y_data)

end_bench("fprop")

ws_ptr = None

# Clean up
libcudnn.cudnnDestroyTensorDescriptor(X_desc)
libcudnn.cudnnDestroyTensorDescriptor(Y_desc)
libcudnn.cudnnDestroyFilterDescriptor(filters_desc)
libcudnn.cudnnDestroyConvolutionDescriptor(conv_desc)
libcudnn.cudnnDestroy(cudnn_context)
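
The output size queried with cudnnGetConvolution2dForwardOutputDim in the example above can also be cross-checked by hand. The sketch below assumes the usual formula for a convolution without dilation, out = 1 + (in + 2*pad - filter) // stride. `conv_output_dim` is a hypothetical helper, not part of the wrappers.

```python
def conv_output_dim(size_in, size_filter, pad, stride):
    """Output length along one spatial axis of a convolution without dilation."""
    return 1 + (size_in + 2 * pad - size_filter) // stride

# Values taken from the example above: 112x112 input, 7x7 filter,
# padding 3, stride 1.
height_output = conv_output_dim(112, 7, 3, 1)
width_output = conv_output_dim(112, 7, 3, 1)
print(height_output, width_output)  # 112 112
```

With a 7x7 filter and a padding of 3 at stride 1, the spatial size is preserved, which is why Y in the example has the same height and width as X.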

Installation

Install from PyPI with

pip install cudnn-python-wrappers

cudnn-python-wrappers's People

Contributors

andravin, felix-neko, hannes-brt, kashif, lukepfister, thestew42

cudnn-python-wrappers's Issues

cudaSetDevice

Hi! Could you add a cudaSetDevice function to your library?

For example,

_libcudart.cudaSetDevice.restype = int
_libcudart.cudaSetDevice.argtypes = [ctypes.c_int]
# Added to select device to execute CUDA streams
def cudaSetDevice(devid):
    """
    Args:
        devid (int): device from 0 to N_DEV - 1
    Returns:
        int
    """
    status = _libcudart.cudaSetDevice(ctypes.c_int(devid))
    return status

Alas, there is no cudaSetDevice in PyCUDA, so we users have to call it manually.

Support cuDNN v4

Placeholder for cuDNN v4 support development effort.

I am only set up to test on Linux, so it would be helpful if somebody else could test on Windows and Mac.

Support cudnn v5.1?

I have edited the filename of the cudnn64_x.dll, so it can load cuDNN v5.1 now.

if sys.platform in ('linux2', 'linux'):
    _libcudnn_libname_list = ['libcudnn.so', 'libcudnn.so.6.5', 'libcudnn.so.6.5.18']
elif sys.platform == 'win32':
    _libcudnn_libname_list = ['cudnn64_5.dll']
else:
    raise RuntimeError('unsupported platform')

But at line 311 it hit another error: cuDNN v5.1 doesn't have a "cudnnCreateTensor4dDescriptor" function. It has a function called "cudnnCreateTensorDescriptor" instead.

>>> import libcudnn, ctypes
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda2\Lib\site-packages\libcudnn.py", line 311, in <module>
    _libcudnn.cudnnCreateTensor4dDescriptor.restype = int
  File "C:\Anaconda2\lib\ctypes\__init__.py", line 375, in __getattr__
    func = self.__getitem__(name)
  File "C:\Anaconda2\lib\ctypes\__init__.py", line 380, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: function 'cudnnCreateTensor4dDescriptor' not found
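
The platform-specific name list above suggests a load-the-first-match pattern, and the AttributeError shows what happens when a symbol is bound without first checking that the library exports it. Here is a minimal sketch of both ideas using only ctypes. `load_first_available` and `has_symbol` are hypothetical helpers, not functions of the wrappers, and this is not necessarily how libcudnn.py itself does it.

```python
import ctypes

def load_first_available(libnames):
    """Hypothetical helper: return the first library in libnames that loads."""
    for name in libnames:
        try:
            return ctypes.cdll.LoadLibrary(name)
        except OSError:
            # Not found under this name; try the next candidate.
            continue
    raise OSError("none of these libraries could be loaded: %r" % (libnames,))

def has_symbol(lib, symbol):
    """Return True if the loaded library exports the given function name."""
    return hasattr(lib, symbol)

try:
    lib = load_first_available(['libcudnn.so', 'cudnn64_5.dll'])
    # The newer function name reported in this issue.
    print(has_symbol(lib, 'cudnnCreateTensorDescriptor'))
except OSError as err:
    print("cuDNN not available:", err)
```

Checking for a symbol before setting its restype would let a wrapper report an unsupported cuDNN version with a clear message instead of failing at import time.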

libcudnn.cudnnCreate() cudnnStatusArchMismatch Error

Hi, I am trying to run some sample code to test my cuDNN installation.
I can import libcudnn without any error. But when I run:
cudnn_context = libcudnn.cudnnCreate()
to initialize cuDNN and return a handle to the cuDNN context

I get the following error:

Traceback (most recent call last):
  File "", line 1, in
  File "C:\Anaconda\lib\site-packages\libcudnn.py", line 277, in cudnnCreate
    cudnnCheckStatus(status)
  File "C:\Anaconda\lib\site-packages\libcudnn.py", line 246, in cudnnCheckStatus
    raise cudnnExceptions[status]
libcudnn.cudnnStatusArchMismatch

I have an NVIDIA GeForce 635M with CUDA 6.5 installed on my computer, along with VS2013 Professional.
Is there any solution, please?
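
The traceback above shows the wrappers turning a cuDNN status code into a Python exception through `cudnnCheckStatus` and `cudnnExceptions`. The following is a rough reconstruction of that pattern, not the wrappers' actual source: the class and function names are stand-ins, and the numeric codes (3 for a bad parameter, 6 for an architecture mismatch) are assumptions based on the cudnn.h headers of that era.

```python
class CudnnError(Exception):
    """Stand-in base class for errors reported by cuDNN."""

class CudnnStatusBadParam(CudnnError):
    pass

class CudnnStatusArchMismatch(CudnnError):
    pass

# Assumed numeric status codes; 0 means success.
STATUS_EXCEPTIONS = {
    3: CudnnStatusBadParam,
    6: CudnnStatusArchMismatch,
}

def check_status(status):
    """Raise the exception mapped to a nonzero status code."""
    if status != 0:
        exc_class = STATUS_EXCEPTIONS.get(status, CudnnError)
        raise exc_class("cuDNN status %d" % status)

try:
    check_status(6)
except CudnnStatusArchMismatch as err:
    print("caught:", err)
```

The cuDNN documentation describes the architecture-mismatch status as a function requiring a feature that is absent from the GPU device, which usually points at a compute capability the library does not support.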

Strange output data with double

I've tried to use your branch for the CUDNN RC2 API, but ran into something strange: when I used it with double, the convolution gave me odd values of about 10^-312. When I switched to float, everything was okay again.

Here I made a small test for this error:
https://github.com/Felix-neko/cudnn-python-bugreport

Could you take a look at it?

P.S. Thank you for your patience.
^__^
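
One possible explanation, not confirmed in this thread: the cuDNN documentation states that the alpha and beta scaling factors must be stored as double when the tensors use CUDNN_DATA_DOUBLE. If a wrapper passed pointers to 4-byte floats instead, the library would read them as 8-byte doubles. The sketch below shows, in plain Python and without a GPU, that the bit pattern of a float 1.0 read as a double is a tiny denormal number, the same kind of value reported above. The zero padding bytes are an assumption about what lies next to the float in memory.

```python
import struct

# Bytes of the single-precision value 1.0, little-endian.
float_bytes = struct.pack('<f', 1.0)

# Read those 4 bytes, plus 4 assumed zero bytes, as one double.
misread = struct.unpack('<d', float_bytes + b'\x00' * 4)[0]

print(misread)  # a tiny positive denormal value, far below 1e-300
```

If this is the cause, passing the scaling factors as `ctypes.c_double` for double tensors would be the thing to try.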

New CUDNN release

Good morning!
There is a new CUDNN release (R2). Do you plan to support it?

Right now it gives the following error:

    import libcudnn as cudnn
  File "/usr/local/lib/python3.4/dist-packages/libcudnn.py", line 311, in <module>
    _libcudnn.cudnnCreateTensor4dDescriptor.restype = int
  File "/usr/lib/python3.4/ctypes/__init__.py", line 364, in __getattr__
    func = self.__getitem__(name)
  File "/usr/lib/python3.4/ctypes/__init__.py", line 369, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/local/cuda-6.5/lib64/libcudnn.so: undefined symbol: cudnnCreateTensor4dDescriptor

UPD: As I can see, the API has changed a bit.
