nvidia / cnmem

A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory

License: BSD 3-Clause "New" or "Revised" License

CMake 1.80% C 11.53% C++ 82.77% Cuda 3.90%

cnmem's Introduction

CNMeM Library

A simple library to help deep learning frameworks manage CUDA memory.

CNMeM is not intended to be a general purpose memory management library. It was designed as a simple tool for applications which work on a limited number of large memory buffers.

CNMeM is mostly developed on Ubuntu Linux. It should support other operating systems as well. If you encounter an issue with the library on another operating system, please submit a bug report (or a fix).

Prerequisites

CNMeM relies on the CUDA toolkit. It uses the C++ STL and, on Linux, the Pthread library. On Windows, it uses the native Win32 threading library. The build system uses CMake. The unit tests are written with Google Test (building them is optional).

CUDA

The CUDA toolkit is required. We recommend using CUDA >= 7.0, although earlier versions should also work.

  • Download from the CUDA website
  • Follow the installation instructions
  • Don't forget to set your path. For example:
    • CUDA_HOME=/usr/local/cuda
    • LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

Build CNMeM

Grab the source

% cd $HOME
% git clone https://github.com/NVIDIA/cnmem.git cnmem

Build CNMeM without the unit tests

% cd cnmem
% mkdir build
% cd build
% cmake ..
% make

Build CNMeM with the unit tests

To build the tests, you need to add an extra option to the cmake command.

% cd cnmem
% mkdir build
% cd build
% cmake -DWITH_TESTS=True ..
% make

Link with CNMeM

The source folder contains a header file 'include/cnmem.h' and the build directory contains the library 'libcnmem.so', 'cnmem.lib/cnmem.dll' or 'libcnmem.dylib', depending on your operating system.

cnmem's People

Contributors

harrism, jdemouth, lukeyeager, mattangus, nouiz

cnmem's Issues

cnmem cannot allocate the full memory for a device

This is really minor, but as I was playing around with RAPIDS and RMM, which wraps cnmem, I noticed that

cnmem/src/cnmem.cpp

Lines 1103 to 1104 in 37896cc

CNMEM_CHECK_TRUE(
size > 0 && size < props.totalGlobalMem, CNMEM_STATUS_INVALID_ARGUMENT);

is checking that the memory allocated is < the total memory on the device, not <=. Not that 1 byte really matters all that much most of the time. We were playing around with unified memory and wanted to be able to overcommit in the worst case, so we allocated more than the device supported, and when I traced down the returned error I noticed this. If we do decide to support unified memory with overcommit, we may come back with further issues.
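
The boundary behaviour described in this issue can be illustrated with a small standalone sketch. It does not use CNMeM or CUDA: `checkStrict`, `checkInclusive` and `boundaryDemo` are hypothetical helpers introduced only for illustration, and `total` stands in for `props.totalGlobalMem`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the quoted check: a request equal to the
// device total is rejected because the comparison is strict.
constexpr bool checkStrict(std::size_t size, std::size_t total) {
    return size > 0 && size < total;
}

// Hypothetical relaxed variant: a request for the whole device passes.
constexpr bool checkInclusive(std::size_t size, std::size_t total) {
    return size > 0 && size <= total;
}

// Runtime demonstration; `total` stands in for props.totalGlobalMem.
inline void boundaryDemo() {
    const std::size_t total = 1024;
    assert(checkStrict(total - 1, total));
    assert(!checkStrict(total, total));         // full-device request rejected
    assert(checkInclusive(total, total));       // accepted with <=
    assert(!checkInclusive(total + 1, total));  // larger requests still rejected
}
```

Note that neither variant admits a request larger than the device total, so overcommitting with unified memory would need a different condition altogether.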

Unused value warnings

I'm getting some set but unused value warnings:

cnmem.cpp:909:33: warning: variable ‘prev’ set but not used [-Wunused-but-set-variable]
     Block *curr = mUsedBlocks, *prev = NULL;
                                 ^~~~
cnmem.cpp:933:12: warning: variable ‘result’ set but not used [-Wunused-but-set-variable]
     Block *result = curr;
            ^~~~~~
cnmem.cpp:1221:31: warning: unused variable ‘lockStatus’ [-Wunused-variable]
                 cnmemStatus_t lockStatus = mutexes[i]->unlock();
                               ^~~~~~~~~~

Not a big deal, but as far as I can tell these are warranted. I can make a PR to fix these if that's OK to do and desired.
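
One common way to address a warning like the `lockStatus` one is to make use of the value rather than discard it. The sketch below is not taken from the CNMeM sources: `Status`, `FakeMutex` and `unlockAll` are hypothetical stand-ins that only mirror the shape of the code in the warning.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for cnmemStatus_t (hypothetical, not the real enum).
enum Status { STATUS_SUCCESS = 0, STATUS_UNKNOWN_ERROR = 1 };

// Hypothetical mutex wrapper whose unlock() reports a status.
struct FakeMutex {
    bool locked = true;
    Status unlock() {
        if (!locked) return STATUS_UNKNOWN_ERROR;
        locked = false;
        return STATUS_SUCCESS;
    }
};

// Instead of storing each unlock() result in a variable that is never read
// (which triggers -Wunused-variable), fold it into the returned status.
inline Status unlockAll(std::vector<FakeMutex*>& mutexes) {
    Status result = STATUS_SUCCESS;
    for (std::size_t i = 0; i < mutexes.size(); ++i) {
        Status lockStatus = mutexes[i]->unlock();
        if (lockStatus != STATUS_SUCCESS) result = lockStatus;  // value is used
    }
    return result;
}
```

If the status is genuinely irrelevant at that point, dropping the variable and calling `unlock()` directly would also silence the warning.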

Add versioning

Can you add a version number to CNMeM? I want a lib with the version baked into the SONAME.

Memory Leak

Is there a known memory leak in cnmem?
I found this in the logs from valgrind.

==1049== HEAP SUMMARY:
==1049==     in use at exit: 1,349,153,600 bytes in 1,333,650 blocks
==1049==   total heap usage: 2,621,433 allocs, 1,287,783 frees, 2,955,906,430 bytes allocated
==1049==
==1049== 32 bytes in 1 blocks are definitely lost in loss record 85,881 of 182,145
==1049==    at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1049==    by 0x5C56659: cnmem::Manager::allocateBlockUnsafe(cnmem::Block*&, cnmem::Block*&, unsigned long) (in /opt/tritonserver/lib/libtritonserver.so)
==1049==    by 0x5C57568: cnmem::Manager::reserve(unsigned long)
==1049==    by 0x5C5806B: cnmemInit

make compilation gets errors

I ran cmake and got this configuration output:

-- The C compiler identification is GNU 7.2.1       
-- The CXX compiler identification is GNU 7.2.1
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Configuring done
-- Generating done
-- Build files have been written to: /media/ephemeral0/cnmem/build

Then I ran make:

/media/ephemeral0/cnmem/src/cnmem.cpp: In member function ‘cnmemStatus_t cnmem::Manager::allocateBlockUnsafe(cnmem::Block*&, cnmem::Block*&, std::size_t)’:
/media/ephemeral0/cnmem/src/cnmem.cpp:513:59: error: too few arguments to function ‘cudaError_t cudaMallocManaged(void**, size_t, unsigned int)’
             CNMEM_CHECK_CUDA(cudaMallocManaged(&data, size));
                                                           ^
/media/ephemeral0/cnmem/src/cnmem.cpp:133:30: note: in definition of macro ‘CNMEM_CHECK_CUDA’
     cudaError_t cudaError = (call); \
                              ^~~~
In file included from /media/ephemeral0/cnmem/include/cnmem.h:35:0,
                 from /media/ephemeral0/cnmem/src/cnmem.cpp:29:
/usr/local/cuda-7.5/include/cuda_runtime_api.h:2938:58: note: declared here
 extern __host__ __cudart_builtin__ cudaError_t CUDARTAPI cudaMallocManaged(void **devPtr, size_t size, unsigned int flags);
                                                          ^~~~~~~~~~~~~~~~~
/media/ephemeral0/cnmem/src/cnmem.cpp:514:30: error: ‘cudaMemPrefetchAsync’ was not declared in this scope
             CNMEM_CHECK_CUDA(cudaMemPrefetchAsync(data, size, mDevice));
                              ^
/media/ephemeral0/cnmem/src/cnmem.cpp:133:30: note: in definition of macro ‘CNMEM_CHECK_CUDA’
     cudaError_t cudaError = (call); \
                              ^~~~
/media/ephemeral0/cnmem/src/cnmem.cpp:514:30: note: suggested alternative: ‘cudaMemset3DAsync’
             CNMEM_CHECK_CUDA(cudaMemPrefetchAsync(data, size, mDevice));
                              ^
/media/ephemeral0/cnmem/src/cnmem.cpp:133:30: note: in definition of macro ‘CNMEM_CHECK_CUDA’
     cudaError_t cudaError = (call); \
                              ^~~~
make[2]: *** [CMakeFiles/cnmem.dir/src/cnmem.cpp.o] Error 1
make[1]: *** [CMakeFiles/cnmem.dir/all] Error 2
make: *** [all] Error 2

The magic number in Context

Hi Julien, Frédéric, Luke and Mark,

I have a simple question about the way the singleton is checked in cnmem::Context.
In the code, we have create() and check():

cnmemStatus_t Context::create() {
    sCtx = new Context;
    sCtxCheck = CTX_VALID;
    return CNMEM_STATUS_SUCCESS;
}

/// Check that the context was created.
static inline bool check() { return sCtxCheck == CTX_VALID && sCtx; }

where sCtx and sCtxCheck are defined as:

    /// The global context.
    static Context *sCtx;
    /// Use a magic number to specify that the context was created.
    static int sCtxCheck;

One thing that confuses me is why we need the extra magic number sCtxCheck.
Why is checking sCtx != nullptr alone not enough for check()?

Thanks for helping me!
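
To make the question concrete, here is a minimal standalone sketch of the pattern. It is not the real `cnmem::Context`: the value of `CTX_VALID`, the `destroy()` method and `checkPointerOnly()` are assumptions added for illustration.

```cpp
#include <cassert>

// Minimal stand-in for the pattern in the question; not the real cnmem::Context.
class Context {
public:
    // Create the global context and mark it valid.
    static void create() {
        sCtx = new Context;
        sCtxCheck = CTX_VALID;
    }
    // Destroy the context and clear the marker (hypothetical helper).
    static void destroy() {
        delete sCtx;
        sCtx = nullptr;
        sCtxCheck = 0;
    }
    // Two-part check, as in the snippet quoted above.
    static bool check() { return sCtxCheck == CTX_VALID && sCtx; }
    // Pointer-only alternative proposed in the question.
    static bool checkPointerOnly() { return sCtx != nullptr; }

private:
    enum { CTX_VALID = 0x1234abcd };  // assumed placeholder value
    static Context* sCtx;
    static int sCtxCheck;
};

Context* Context::sCtx = nullptr;
int Context::sCtxCheck = 0;
```

In this sketch the two checks always agree, because both statics are zero-initialized and updated together. It only restates the question in runnable form and does not explain the original design intent.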
