
dmlc / minerva


Minerva: a fast and flexible tool for deep learning on multi-GPU. It provides an ndarray programming interface, just like NumPy. Both Python and C++ bindings are available. The resulting code can run on CPU or GPU, and multi-GPU support is easy to use.

License: Other

Python 37.08% C++ 49.64% CMake 1.90% C 0.05% Cuda 5.33% Shell 0.37% Protocol Buffer 5.63%

minerva's Issues

Differences with MShadow

Hi all,
What's the difference between MShadow and Minerva? From the documentation, I understand that both can perform tensor operations in a unified form on both CPU and GPU, and that both are written in C++. So what is the main difference between them?

script to convert the pre-trained model to caffe format

I am very interested in the Minerva project and would be glad to deploy Minerva on our lab servers. But we currently have a lot of existing code that is largely compatible only with Caffe. Could you provide a script that converts networks trained by Minerva back to Caffe format? I think it would help grab the attention of Caffe users and gradually turn them to Minerva :)

Binary Classifier - Log loss function

Fantastic library!
I am trying to use the library for a binary classifier experiment, training the model with the log loss function. This is for a university experiment on benchmarking different models. Would you have time to provide an example of how to use the library to achieve this?
It would also help to show a visualization of how the algorithm learns and decreases the error.
Many thanks,
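
For readers with the same question, here is a minimal sketch of what such an experiment computes. It is plain Python, not Minerva/owl code; the toy data, the learning rate and the function names (`train`, `log_loss`) are assumptions made for illustration. The `history` list holds the log loss per epoch, which is the series one would plot to see the error decrease.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def log_loss(y_true, y_prob, eps=1e-12):
    # Mean binary cross-entropy; eps keeps log() away from zero.
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def train(xs, ys, epochs=200, lr=0.5):
    # Batch gradient descent on a linear model w.x + b.
    dim = len(xs[0])
    n = len(xs)
    w, b = [0.0] * dim, 0.0
    history = []
    for _ in range(epochs):
        probs = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in xs]
        history.append(log_loss(ys, probs))
        # The gradient of the log loss w.r.t. the logit is (p - y).
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y, p in zip(xs, ys, probs):
            err = p - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b, history

# Hypothetical toy data: two Gaussian blobs, labelled 0 and 1.
random.seed(0)
xs = [[random.gauss(-1, 1), random.gauss(-1, 1)] for _ in range(50)]
xs += [[random.gauss(1, 1), random.gauss(1, 1)] for _ in range(50)]
ys = [0] * 50 + [1] * 50

w, b, history = train(xs, ys)
print("first loss %.4f, last loss %.4f" % (history[0], history[-1]))
```

With all-zero initial weights every prediction starts at 0.5, so the first recorded loss is ln 2; later entries should be lower if training is working.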

Failed to run MNIST example due to CudaPerformNormAddOnRow CUDA: invalid device function

(Owl Ready) ➜  mnist: python
[18:48:23] /home/zer0n/minerva/minerva/system/minerva_system.cpp:86: dag engine enabled
Training data: 235 mini-batches
Test data: 10000 samples
(256, 10)
---Start epoch #0
[18:48:28] /home/zer0n/minerva/minerva/op/impl/cuda/ Check failed: (e) == (cudaSuccess) CudaPerformNormAddOnRow CUDA: invalid device function
[1]    72399 abort (core dumped)  python

My environment:

  • Minerva built with cmake 3.2.3 and g++ 4.9 on Ubuntu 12.04
  • CUDA 7
  • CuDNN 2

FYI, I ran Caffe's MNIST example using GPU just fine.

doesn't compile with cuDNN R2

cuDNN R2 has a different interface than R1. Our code does not compile with R2, producing messages like "identifier "cudnnTensor4dDescriptor_t" is undefined". We should fix it or at least give a warning about this.

self-defined elemwise function on GPU ?


I've looked at the NIPS paper, and the code snippet there presents a very convenient way of defining an element-wise function:

float Sigmoid(float x) {
  return 1.0 / (1.0 + exp(-x));
}

Then we just do:

Matrix z = (V * y + c).Map(&Sigmoid);

However, the current version seems to be completely different from the one presented in the paper, so I wonder why. How can I easily achieve the same thing in the current version of the code (without writing CUDA code myself)? Sorry, I only started looking at this project recently, so I might have missed some context.


mnist_cnn failed at Epoch #0

Here is the error:
F0310 16:38:25.787619 9513 narray.cpp:126] Check failed: lhs.Size(1) == rhs.Size(0) (512 vs. -6833920) size must match
acts[6] has size [32845682 4 32 256], and minibatch_size is 256.

Difference between Minerva and MShadow

Dear all,

I was surprised to see that Minerva is now part of the umbrella project DMLC, so I'm a little bit lost among these awesome tools. What is the difference between them, especially between Minerva, MShadow and Cxxnet? I'm interested in implementing a distributed version of a triplet convolutional network that I have in Caffe.

Request owl.elewise.pow/sqrt

Hi guys,

Would it be possible to provide an element-wise power/sqrt function in the Python interface? I'd like to adjust the weights using Adagrad or RMSprop, which requires such an operation. However, the NumPy solution does not work well in the multi-GPU case.
Or could you give me some instructions on how to add such operations?
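
Until such functions exist, one possible workaround relies on the identity x^p = exp(p * ln x) for x > 0, which needs only element-wise exp and log. The sketch below is plain Python on lists, not owl code: whether owl exposes suitable exp/log primitives is an assumption to check, and `elem_pow`, `elem_sqrt` and `rmsprop_step` are hypothetical names.

```python
import math

def elem_pow(xs, p):
    # x**p == exp(p * ln(x)) for x > 0; built only from exp and log.
    return [math.exp(p * math.log(x)) for x in xs]

def elem_sqrt(xs):
    return elem_pow(xs, 0.5)

def rmsprop_step(w, grad, cache, lr=0.01, decay=0.9, eps=1e-8):
    # cache holds a running average of squared gradients.
    cache = [decay * c + (1.0 - decay) * g * g for c, g in zip(cache, grad)]
    # eps is added inside the root here so that log() never sees zero.
    denom = elem_sqrt([c + eps for c in cache])
    w = [wi - lr * g / d for wi, g, d in zip(w, grad, denom)]
    return w, cache

w, cache = [1.0, -2.0], [0.0, 0.0]
w, cache = rmsprop_step(w, [0.5, -0.5], cache)
print(w, cache)
```

The identity only holds for strictly positive inputs, which is why the sketch adds eps before taking the root rather than after.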


Tutorial, API Documentation?

Hi,
Is there a tutorial for the project's API? I've found only the sample code (mnist_mlp, mnist_cnn, ...).


Pooling output dimension is confusing if we give a non-square matrix as input


In the MNIST_CNN example provided with owl, if we set the batch size to 4, the input dimension is something like [28, 28, 1, 4], since each image is a 28*28 square matrix. However, I found that when the input matrix is non-square, the output dimension is confusing. I am wondering whether this is a bug or intended behavior.

For example, suppose input.shape is [4, 2, 1, 4] in owl format and I set "pooling = conv.Pooler(2, 2, 2, 2, 0, 0, conv.pool_op.max)". After calling "pooling.ff(input)" I expect to get [2, 1, 1, 4], but I actually get [1, 2, 1, 4] as the output dimension in owl. Should I expect [1, 2, 1, 4], or is something going wrong inside the pooling function?

I would greatly appreciate any comments and suggestions. Thanks!
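
For reference, the arithmetic behind the expected shape can be sketched as follows. This is plain Python using the common floor formula out = (in + 2*pad - window) / stride + 1, not owl's implementation; how the Pooler arguments map onto the two spatial dimensions is an assumption, and `pooled_shape` is a hypothetical helper.

```python
def pooled_size(size, window, stride, pad=0):
    # Floor convention: count the window positions that fit entirely.
    return (size + 2 * pad - window) // stride + 1

def pooled_shape(shape, window=(2, 2), stride=(2, 2), pad=(0, 0)):
    # shape is (dim0, dim1, channels, batch); window[i], stride[i] and pad[i]
    # apply to spatial dimension i. Which one is "height" is an assumption.
    d0 = pooled_size(shape[0], window[0], stride[0], pad[0])
    d1 = pooled_size(shape[1], window[1], stride[1], pad[1])
    return (d0, d1, shape[2], shape[3])

print(pooled_shape((4, 2, 1, 4)))    # (2, 1, 1, 4)
print(pooled_shape((28, 28, 1, 4)))  # (14, 14, 1, 4)
```

Because the window and stride are the same in both directions here, the formula gives [2, 1, 1, 4] whichever spatial dimension is treated as height, so an output of [1, 2, 1, 4] would suggest the two spatial dimensions are swapped somewhere along the way.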

build failed on ubuntu14.04

File /home/ubgpu/github/DMLC/minerva/release/CMakeFiles/CMakeTmp/CheckSymbolExists.c:
/* */

#include <pthread.h>

int main(int argc, char** argv)
{
#ifndef pthread_create
  return ((int*)(&pthread_create))[argc];
#else
  return 0;
#endif
}

Determining if the function pthread_create exists in the pthreads failed with the following output:
Change Dir: /home/ubgpu/github/DMLC/minerva/release/CMakeFiles/CMakeTmp

Run Build Command:/usr/bin/make "cmTryCompileExec1004526741/fast"
/usr/bin/make -f CMakeFiles/cmTryCompileExec1004526741.dir/build.make CMakeFiles/cmTryCompileExec1004526741.dir/build
make[1]: Entering directory `/home/ubgpu/github/DMLC/minerva/release/CMakeFiles/CMakeTmp'
/usr/bin/cmake -E cmake_progress_report /home/ubgpu/github/DMLC/minerva/release/CMakeFiles/CMakeTmp/CMakeFiles 1
Building C object CMakeFiles/cmTryCompileExec1004526741.dir/CheckFunctionExists.c.o
/usr/bin/gcc -DCHECK_FUNCTION_EXISTS=pthread_create -o CMakeFiles/cmTryCompileExec1004526741.dir/CheckFunctionExists.c.o -c /usr/share/cmake-2.8/Modules/CheckFunctionExists.c
Linking C executable cmTryCompileExec1004526741
/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTryCompileExec1004526741.dir/link.txt --verbose=1
/usr/bin/gcc -DCHECK_FUNCTION_EXISTS=pthread_create CMakeFiles/cmTryCompileExec1004526741.dir/CheckFunctionExists.c.o -o cmTryCompileExec1004526741 -rdynamic -lpthreads
/usr/bin/ld: cannot find -lpthreads
collect2: error: ld returned 1 exit status
make[1]: *** [cmTryCompileExec1004526741] Error 1
make[1]: Leaving directory `/home/ubgpu/github/DMLC/minerva/release/CMakeFiles/CMakeTmp'
make: *** [cmTryCompileExec1004526741/fast] Error 2


How is Minerva/Owl different from Theano?

What advantages does it have over Theano?

Does Minerva/Owl have the automatic differentiation capability that Theano has?

I had thought that Minerva was more comparable to Caffe, but it looks like Minerva is more similar to Theano than to Caffe. Is that correct?

Unittest compilation failed

When compiling the unit tests, I got an error on my machine (GCC 4.9.2 prerelease, Arch Linux 2015.03.01, kernel 3.18.6). It is caused by the following changes in commit 2e886a4:

  1. unittest_main is built as a shared library while for gtest, usually it only builds .a.
  2. The -flto flag is not working with my GCC.

I suggest fixing them by switching unittest back to a static library and removing the -flto flag (or adding a check for whether the compiler supports it).

./main: symbol lookup error: ./main: undefined symbol: _ZN7minerva13MinervaSystem15CreateCpuDeviceEv

I can successfully run the given C++ sample code, but I ran into this problem when trying to build my own app.

g++ -std=c++11 -DHAS_CUDA -I./dmlc-core/include/ -I/usr/local/cuda-6.5/include -I./include/ -rdynamic ./ -rdynamic ./ main.cpp -o main

When I run it, it says:

./main: symbol lookup error: ./main: undefined symbol: _ZN7minerva13MinervaSystem15CreateCpuDeviceEv

the main.cpp program is just:

#include <minerva.h>

using namespace std;
using namespace minerva;
int main(int argc, char ** argv) {
  MinervaSystem::Initialize(&argc, &argv);
  MinervaSystem& ms = MinervaSystem::Instance();
  uint64_t cpuDevice = ms.CreateCpuDevice();
  return 0;
}

This seems strange, since I can find exactly CreateCpuDeviceEv in the binary "".
I am using CentOS 6.5.

broadcasting NArray + NArray

Hello. First, thanks for the really cool library!

I am experiencing odd issues with broadcasting when doing element-wise operations between two arrays. I would expect all of the following to work, but not all of them do.

owl.zeros((3,3)) + 10 # works
owl.zeros((3,3)) + np.array(10) # works
owl.zeros((3,3)) + np.array([10]) # works
owl.zeros((3,3)) + np.array([[10]]) # works

owl.zeros((3,3)) + owl.from_numpy(np.array([[10]])) # Fails

what():  [22:31:13] /home/luke/Apps/minerva/minerva/op/impl/cuda.cpp:223: Check failed: (closure.dims_to_replicate.NumDims()) == (1) currently do norm on one dimension only

owl.zeros((3,3)) + owl.from_numpy(np.array(10)) # Fails

*** RuntimeError: [22:32:19] /home/luke/Apps/minerva/minerva/narray/narray.cpp:197: Check failed: (lhs.Size().NumDims()) == (rhs.Size().NumDims()) #dimension mismatch

It appears that broadcasting does not work between two owl arrays. Is this correct? Is there a way to work around this?
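
To make the expectation concrete, the NumPy broadcasting rule for shapes can be sketched in plain Python as below. This is an illustration of the rule, not of how owl works; `broadcast_shape` and `needs_expansion` are hypothetical helpers. The second helper lists the dimensions along which the right-hand operand would have to be replicated, which may relate to the "currently do norm on one dimension only" message above, though that link is an assumption.

```python
def broadcast_shape(a, b):
    # NumPy rule: align shapes from the right; sizes must match or one is 1.
    ndim = max(len(a), len(b))
    a = (1,) * (ndim - len(a)) + tuple(a)
    b = (1,) * (ndim - len(b)) + tuple(b)
    out = []
    for x, y in zip(a, b):
        if x == y or y == 1:
            out.append(x)
        elif x == 1:
            out.append(y)
        else:
            raise ValueError("cannot broadcast %r with %r" % (a, b))
    return tuple(out)

def needs_expansion(lhs, rhs):
    # Dimensions along which rhs must be replicated to reach the result shape.
    target = broadcast_shape(lhs, rhs)
    rhs = (1,) * (len(target) - len(rhs)) + tuple(rhs)
    return [i for i, (t, r) in enumerate(zip(target, rhs)) if t != r]

print(broadcast_shape((3, 3), (1, 1)))  # (3, 3)
print(needs_expansion((3, 3), (1, 1)))  # [0, 1]
print(needs_expansion((3, 3), (1, 3)))  # [0]
```

Under this rule a (1, 1) operand needs replication along two dimensions while a (1, 3) operand needs only one, so expanding the small array to shape (1, 3) or (3, 3) before the addition is one workaround worth trying.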


Can't get libminervaps.a

I'm following the instructions to integrate Minerva with the parameter server, but I always fail to get libminervaps.a, and I am confused about a few things. I don't understand "then compile with make minerva". What does this mean: should I run make in the parameter server sources or in Minerva? In the parameter server directory, make says there is no such target, and in the Minerva directory it does not produce libminervaps.a. I'm sure that I ran configure with the parameter server enabled.

indexing reference of NArray?

What I really want to do is update the parameters of an LSTM.
I've realized that my vocabulary is relatively large (about 500K) and that keeping a vector of NArray is very inefficient (I don't know the reason). When I try to sync (by calling WaitForAll) in the following code:

int N = 600000;
int D = 128;
vector<NArray> A = vector<NArray>(N, NArray::Zeros({1, D}));
for (int i = 0; i < N; i++) {
  A[i] = NArray::Randn({1, D}, 0, 0.05);
  // calling ms.WaitForAll() here takes a long long time.....
}

it takes a very long time (maybe because pushing NArrays into the vector one at a time causes many malloc calls on the GPU, which is slow?).
So instead I am thinking about keeping one giant 2D matrix and doing the following:

NArray A = NArray::Randn({N,D},0,0.01);
NArray b = NArray::Randn({1,D},0,1);
A[10] = A[10] + b; // update a

It seems that this feature is not supported?
Any suggestions or comments would be much appreciated.
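
If in-place row assignment is unavailable, one possible workaround is to express the row update with whole-matrix operations only: A[i] += b is the same as A + e_i b, where e_i is a one-hot column of length N and e_i b is an outer product. The sketch below shows the idea in plain Python on small nested lists; it is not NArray code, the helper names are hypothetical, and whether the equivalent matrix product is available and efficient in Minerva is an assumption.

```python
def outer(u, v):
    # Outer product of a length-N vector and a length-D vector: N x D matrix.
    return [[ui * vj for vj in v] for ui in u]

def add(a, b):
    # Element-wise sum of two matrices with the same shape.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def one_hot(n, i):
    return [1.0 if k == i else 0.0 for k in range(n)]

def add_to_row(A, i, b):
    # A[i] += b written as A + e_i (outer) b, using whole-matrix ops only.
    return add(A, outer(one_hot(len(A), i), b))

A = [[0.0] * 3 for _ in range(4)]
A = add_to_row(A, 2, [1.0, 2.0, 3.0])
print(A[2])  # [1.0, 2.0, 3.0]
```

Note the trade-off: each update touches all N x D entries, so with N = 600000 it is only attractive when many row updates can be batched into a single product.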

race condition in owl imports

Hello again,
I think I found a race condition in importing of owl.
I have a fairly simple test program that crashes maybe 1 in 3 times.

import owl
import owl.elewise as ele

c = owl.create_cpu_device()

x = owl.zeros((10, 10))
y = ele.relu(x)

The error received on a bad run looks something like:

○ → python
[22:55:26] /home/luke/Repos/minerva/minerva/system/minerva_system.cpp:86: dag engine enabled
[22:55:26] /home/luke/Repos/minerva/minerva/backend/dag/dag_scheduler.cpp:46: create new op node #1 on device #0
[22:55:26] /home/luke/Repos/minerva/minerva/backend/dag/dag_scheduler.cpp:149: node #1 running right after creation
[22:55:26] /home/luke/Repos/minerva/minerva/backend/dag/dag_scheduler.cpp:46: create new op node #3 on device #0
[22:55:26] /home/luke/Repos/minerva/minerva/backend/dag/dag_scheduler.cpp:176: dispatching node #1 to device #0
[22:55:26] /home/luke/Repos/minerva/minerva/device/device.cpp:95: CPU device #0 create output for task data #0
[22:55:26] /home/luke/Repos/minerva/minerva/device/data_store.cpp:18: create data #0 length 400
terminate called after throwing an instance of 'dmlc::Error'
  what():  [22:55:26] minerva/common/singleton.h:13: Check failed: data_ please initialize before use
Aborted (core dumped)

Sadly, I cannot get a proper stack trace, because the failure does not occur when I run it under gdb.

Adding a short sleep right after the import seems to fix the problem.

This is on CPU, Ubuntu 15.04, with the dag engine enabled.


Minerva GPU-- free memory for a variable


Thanks for the great tool.
I am trying to use Minerva to run an RNN on a huge dataset. However, GPU memory usage increases gradually and the program crashes once it exceeds the GPU memory limit. My first question: is there any way or function to free the memory of unused variables? I have used wait_for_all, but it does not solve the problem.

I tried writing a memory-freeing function using cudaFree. The function is able to free the memory of the NArray variable; however, the pointer to the NArray variable still exists. When I reuse the NArray variable in repeated assignments, e.g. data=owl.from_numpy(np.array([...])), it works for a new array of a different size, but it crashes when the new array has the same size as the old one that was deleted. For example:

# does not work
data = owl.from_numpy(np.arange(10000).reshape(100,100))
owl.free_memory(data)  # free_memory is a function I wrote myself using cudaFree
data = owl.from_numpy(np.arange(10000).reshape(100,100))
owl.free_memory(data)  # core dump here

# works
data = owl.from_numpy(np.arange(10000).reshape(100,100))
data = owl.from_numpy(np.arange(20000).reshape(200,100))

Could you help me find the reason? Thanks a lot.

`mnist_mlp.cpp` cannot compile

In the recent merge, I removed all code related to file_loader, since it can be completely replaced by the MakeNArray interface. As a result, apps/mnist_mlp.cpp and apps/ps/mnist_mlp.cpp no longer compile. These two applications need to be fixed by writing IO code similar to that in apps/mnist_cnn.cpp.

minerva with ps


I am trying to compile Minerva with the PS (as suggested in,), but I ran into some problems.
First, I cannot find the ps branch of Minerva. Did you publish it in the repository?
Second, I downloaded the PS from, and after compiling I can obtain libminervaps.a and (using a previous version of the Makefile). However, when I try to compile Minerva, I get the following errors:

Linking CXX shared library ../lib/
/usr/bin/ld: cannot find -lminervaps
collect2: error: ld returned 1 exit status
make[2]: *** [lib/] Error 1
make[1]: *** [minerva/CMakeFiles/minerva.dir/all] Error 2
make: *** [all] Error 2

I tried to include the path, but it doesn't work.
BTW, compiling the project without PS works fine.

Can you help me with these?


use_dag flag

I am experiencing a crash when trying to launch without the dag engine from Python.

$ python --use_dag=false
[22:16:29] /home/luke/Repos/minerva/minerva/system/minerva_system.cpp:89: dag engine disabled
*** Error in `python': free(): invalid pointer: 0x000000000102a498 ***
Aborted (core dumped)

I tracked it back to:
where argv is being freed.

Commenting that out fixes the problem, but most likely leaks memory.

I am running on Ubuntu 15.04 with a CPU-only build. Launching with no flags works without an issue.


Can't build apps

Does Minerva work with CUDA 7? I installed CUDA 7 and all of its samples work well, but I cannot build the Minerva apps; I get errors like undefined reference to curandGenerateNormal.

-- cmake generator: Unix Makefiles
-- cmake build tool: /usr/bin/make
-- cmake build type: Release
-- Found cuDNN (include: ~/cudnn2, library: ~/cudnn2/
-- Found BLAS (include: /usr/include, library: /opt/openblas/lib/
-- build C++ applications              -- 1
-- build unit tests                    -- 0
-- build cpu-only version              -- 0
-- build with parameter server support -- 0
-- build with BLAS library for CPU     -- 1
-- Build CXX Applications:
--   mnist_cnn_2gpu
--   mnist_mlp
--   main
--   mnist_cnn
-- Configuring done
-- Generating done
-- Build files have been written to: ~/minerva/release
[ 16%] Built target gflags
[ 32%] Built target dmlc-core
[ 32%] Built target third-party
Linking CXX shared library ../lib/
[ 91%] Built target minerva
Linking CXX executable main
../lib/ undefined reference to `curandGenerateNormal'
../lib/ undefined reference to `curandSetPseudoRandomGeneratorSeed'
../lib/ undefined reference to `curandCreateGenerator'

Here's my file


I have successfully installed Caffe with CUDA 7 using these instructions, though.

C++ Documentation?

Is there a plan to release C++ API docs?

For instance, I saw a concat function in Python. Is there an equivalent one in C++?

