cdeterman / gpuR
R interface to use GPUs
Hi All,
I am trying to use gpuR with the parallel package. The code is like the following:
funGPU <- function(t) {
  a <- matrix(rnorm(16), nrow = 16)
  b <- matrix(rnorm(16), ncol = 16)
  a <- vclMatrix(a, type = "float")
  b <- vclMatrix(b, type = "float")
  return(a %*% b)
}
system.time({
  t <- 1:500
  cl <- makeCluster(4)
  results <- parLapply(cl, t, funGPU)
  stopCluster(cl)
})
Then the following message is shown:
Error in checkForRemoteErrors(val) :
4 nodes produced errors; first error: could not find function "vclMatrix"
What should I do?
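The likely cause is that parLapply() workers are fresh R sessions that have not loaded gpuR, so vclMatrix is not visible to them. A minimal sketch of the fix, demonstrated here with the stock splines package standing in for gpuR (which may not be installed on every machine):

```r
library(parallel)

cl <- makeCluster(2)
# Load the needed package on every worker; for gpuR this would be
# clusterEvalQ(cl, library(gpuR)) before calling parLapply(cl, t, funGPU).
clusterEvalQ(cl, library(splines))
res <- parLapply(cl, 1:4, function(i) ncol(bs(1:10, df = 3)))
stopCluster(cl)
```

Note that several workers sharing one GPU may still contend for the same device; that is a separate concern from the missing-function error.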
To make rational use of the GPUs and run jobs in parallel, do you have any suggestions for automatically running a job on a free GPU?
Currently, when a job starts on a GPU, I create a file as a lock to record the GPU index and delete the file when the job is done. Before the next job starts, I check the lock folder to see which GPU is free. But sometimes a job runs out of GPU RAM and stops immediately, so the lock file isn't removed; then I have to delete it manually.
Do you have any idea to make it better?
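For the stale-lock problem, one option is to register the cleanup with on.exit() inside the function that takes the lock, so the lock file is removed even when the job dies from running out of GPU RAM. A minimal sketch (run_with_lock and the job argument are hypothetical names, not gpuR functions):

```r
# Take a lock file for a GPU, run the job, and guarantee the lock is
# released on both success and error.
run_with_lock <- function(gpu_idx, job, lock_dir = tempdir()) {
  lock <- file.path(lock_dir, sprintf("gpu-%d.lock", gpu_idx))
  if (file.exists(lock)) stop("GPU ", gpu_idx, " is busy")
  file.create(lock)
  on.exit(unlink(lock), add = TRUE)  # runs even if job() throws
  job()
}
```

With this pattern, an out-of-memory error inside job() still triggers the on.exit handler, so no manual cleanup is needed.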
Getting the following error trying to install:
I downloaded the OpenCL code builder, then Visual Studio as well as CUDA, but R could never find a g++ compiler, so I installed Cygwin. I've also added "C:\cygnus\cygwin-b20\H-i586-cygwin32\bin" to my user and system PATH variables.
I'm an analyst trying to install this, not a computer scientist. I know my way around a computer but have very limited experience with compiling, etc.
Thanks in advance
Tried and failed; the errors are in the attached file. I installed the C++11 headers for OpenCL as instructed, and it looks like clang is set for C++11. I have OS X 10.11.6 (not Sierra). I assume the SDKs for the graphics cards ship with the Mac.
error_install_gpuR_macpro_amdfirepro500.zip
An S3 str method could prove useful for gpuR objects. Ideally it should have the same output as base R:
str(matrix(rnorm(16), 4))
# num [1:4, 1:4] -1.5 0.2 -0.6 0.2 -0.3 ...
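A rough sketch of what such a method might look like; the gpuMatrix internals are assumed here (in particular that `[` copies the data back to a host matrix), so treat this as illustration only:

```r
# Hypothetical S3 str() method mimicking base R's one-line summary.
str.gpuMatrix <- function(object, ...) {
  d <- dim(object)
  vals <- as.vector(object[])[seq_len(min(5, prod(d)))]  # assumed host copy
  cat(sprintf("flt [1:%d, 1:%d] %s ...\n",
              d[1], d[2], paste(signif(vals, 3), collapse = " ")))
  invisible(NULL)
}
```

The real method would also need to print the correct type tag ("flt", "dbl", "int") based on the object's storage type.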
Hello Sir!
I tried to use the CRAN version of your package, but ran into the problem already discussed in #39.
I cloned the package from GitHub and checked out the development branch, but I ran into the following when building:
In file included from ../inst/include/gpuR/getVCLptr.hpp:6:0,
from chol.cpp:11:
../inst/include/gpuR/dynEigenMat.hpp:31:61: error: expression 'new viennacl::matrix<ScalarType, viennacl::row_major, 1u>' is not a constant-expression
Tried to browse through the issues but did not find anything similar.
I also cloned and built the newest RViennaCL version directly from github. Trying to run your package on Windows, with an AMD HD 7300 gpu.
Curiously, multiplying two gpuMatrix objects works correctly, but if the user reassigns the objects the program will segfault. Some initial debugging shows that the program is crashing when the OpenCL program is being built (program.build(devices)). Could definitely use help from anyone very familiar with OpenCL code structure. For example:
ORDER = 32
Aint <- matrix(sample(seq(10), ORDER^2, replace=TRUE), nrow=ORDER, ncol=ORDER)
Bint <- matrix(sample(seq(10), ORDER^2, replace=TRUE), nrow=ORDER, ncol=ORDER)
igpuA <- gpuMatrix(Aint, type="integer")
igpuB <- gpuMatrix(Bint, type="integer")
igpuC <- igpuA %*% igpuB
# Let's reassign the values
Aint <- matrix(sample(seq(10), ORDER^2, replace=TRUE), nrow=ORDER, ncol=ORDER)
Bint <- matrix(sample(seq(10), ORDER^2, replace=TRUE), nrow=ORDER, ncol=ORDER)
igpuA <- gpuMatrix(Aint, type="integer")
igpuB <- gpuMatrix(Bint, type="integer")
# THIS WILL SEGFAULT!!!
igpuC <- igpuA %*% igpuB
It may be convenient for the install to automatically detect environment variables for popular SDKs such as CUDA_HOME or AMDAPPSDKROOT. Still likely best to continue use of OPENCL_LIB and OPENCL_INC, which will supersede the aforementioned variables to allow customization on the user's part.
Results from my laptop with an Intel Iris Pro card and the dev version of the package.
I can't explain this, but it seems the GPU backend returns control to the R interpreter before materializing the result in GPU memory (check the median values):
library(gpuR)
library(microbenchmark)
N = 1024
A = matrix(rnorm(N * N), nrow = N)
G = vclMatrix(A, type = 'float')
microbenchmark(
{GG <- crossprod(G);},
{AA <- crossprod(A);},
times = 100
)
#Unit: milliseconds
# expr min lq mean median uq max neval
# { GG <- crossprod(G) } 2.855407 3.261372 6.636427 3.46686 4.065919 64.58746 100
# { AA <- crossprod(A) } 12.821452 14.132459 16.936565 15.33515 16.685954 57.84831 100
microbenchmark(
{GG <- crossprod(G); GG[1]},
{AA <- crossprod(A); AA[1]},
times = 100
)
#Unit: milliseconds
# expr min lq mean median uq max neval
# { GG <- crossprod(G) GG[1] } 20.43659 25.42089 26.19543 26.43395 27.38519 30.57564 100
# { AA <- crossprod(A) AA[1] } 12.41177 14.07693 17.03150 16.03094 17.82994 68.36778 100
For example, results with a larger matrix:
N = 1024 * 4
A = matrix(rnorm(N * N), nrow = N)
G = vclMatrix(A, type = 'float')
microbenchmark(
{GG <- crossprod(G);}, times = 10
)
#Unit: milliseconds
# expr min lq mean median uq max neval
# { GG <- crossprod(G) } 3.044952 3.150829 28.66298 3.572825 4.44852 249.6691 10
Note that lq and uq are similar to the previous experiment with the smaller matrix...
Hi Charles, recently I did some experiments with OpenCL and it seems that OpenCL.dll or OpenCL.so can be easily compiled from the OpenCL ICD loader source. So I think that we could just include them in the package, so that users do not even need to install any giant-sized OpenCL SDK. All they need is the OpenCL runtime library, which is usually already included in the video card driver.
This also allows the Windows and Mac binary package to be available on CRAN, and it will be much more convenient for users to try things out without stepping into the troublesome installation part.
What do you think of this idea? If it makes any sense, I'd like to do some work on that. 😄
Consider the following example:
library(gpuR)
ORDER = 1024 * 6
set.seed(1)
A = matrix(rnorm(ORDER^2), nrow=ORDER)
B = matrix(rnorm(ORDER^2), nrow=ORDER)
object.size(A)/1e6
#301.990088 bytes
C = A %*% B
C[1:10]
# [1] 80.85419 80.67761 41.98283 -126.17838 -99.32701 55.94015
# [7] 108.84205 -150.05794 -84.27298 -80.38638
gpuA = gpuMatrix(A, type="float")
gpuB = gpuMatrix(B, type="float")
gpuC = gpuA %*% gpuB
str(gpuC)
# flt [1:6144, 1:6144] 0 0 0 0 0 ...
When I try to use vclMatrix it crashes with: Abort trap: 6.
When using 4 locations for both parameters I would again expect a symmetric matrix as a result.
print(gpulocations.4)
Source: gpuR Matrix [4 x 3]
[,1] [,2] [,3]
[1,] -305.0518 -3396.980 2010.261
[2,] -239.3692 -3421.902 1976.606
[3,] 742.4894 -2864.292 2630.251
[4,] 248.7933 -3382.905 2041.504
and calling
print( distance(gpulocations.4,gpulocations.4))
Source: gpuR Matrix [4 x 4]
[,1] [,2] [,3] [,4]
[1,] 0.00000 77.89696 1328.7159 0
[2,] 77.89696 0.00000 1304.6937 0
[3,] 1328.71592 1304.69372 NaN 0
[4,] 554.90411 493.99902 926.9942 0
There is some oddity with the last column.
The NaN at [3,3] is also a little off; the results on the diagonal should have been 0.
Cross checking with
print( distance(gpulocations.4,gpulocations.4, method="sqEuclidean"))
Source: gpuR Matrix [4 x 4]
[,1] [,2] [,3] [,4]
[1,] 0.000 6067.936 1.765486e+06 0
[2,] 6067.936 0.000 1.702226e+06 0
[3,] 1765486.005 1702225.705 -3.725290e-09 0
[4,] 307918.566 244035.032 8.593182e+05 0
Again shows 0s in the last column. [3,3] is now a small negative number. All other matrix elements seem to be proper positive squares.
I ran some additional distance computations with sets of 3 and 4 locations. They showed a similar pattern of issues. There also seems to be a difference in the allocated size of the result: the distance between 3 and 4 locations has a different result size than the distance between 4 and 3. There seems to be some implicit ordering assumption.
print( distance(gpulocations.4,gpulocations.3))
Source: gpuR Matrix [4 x 4]
[,1] [,2] [,3] [,4]
[1,] 430.9922 820.3234 1384.9828 0
[2,] 479.5628 749.3972 1365.7792 0
[3,] 1133.9736 978.7315 104.8199 0
[4,] 694.6057 307.7565 1010.0145 0
print( distance(gpulocations.3,gpulocations.4))
Source: gpuR Matrix [3 x 3]
[,1] [,2] [,3]
[1,] 430.9922 479.5628 1133.9736
[2,] 820.3234 749.3972 978.7315
[3,] 1384.9828 1365.7792 104.8199
print( distance(gpulocations.3,gpulocations.3))
Source: gpuR Matrix [3 x 3]
[,1] [,2] [,3]
[1,] 0.000 1.000599e+03 1161.673
[2,] 1000.599 6.103516e-05 1076.708
[3,] 1161.673 1.076708e+03 NaN
print( distance(gpulocations.3,gpulocations.3, method="sqEuclidean"))
Source: gpuR Matrix [3 x 3]
[,1] [,2] [,3]
[1,] 0 1.001199e+06 1.349484e+06
[2,] 1001199 3.725290e-09 1.159300e+06
[3,] 1349484 1.159300e+06 -3.725290e-09
Hope this helps.
Hello,
you made a nice piece of work! I walked through the documentation and could not find an inverse of a matrix. Did I miss something? Or do you plan to implement that?
Thanks for your answer!
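For what it's worth, until a GPU-side inverse exists, the usual workaround is to pull the data back to the host and use base R's solve(); the gpuA[] extraction mentioned below is the assumed gpuR API for copying back, and the sketch uses a plain matrix so it stands on its own:

```r
# Invert on the CPU as a stopgap; with gpuR one would presumably write
# Ainv <- solve(gpuA[]) after pulling the data off the device.
A    <- matrix(c(2, 1, 1, 3), 2)  # stand-in for gpuA[]
Ainv <- solve(A)
```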
Can this be made to work with pure open source tools, or do I have to hunt down vendor-specific tools?
Although crossprod and tcrossprod are available, it would be valuable to have t implemented.
I'm getting the following errors when running functions under gpuR installed in a docker container loaded under nvidia-docker:
> detectPlatforms()
Error in eval(substitute(expr), envir, enclos) :
ViennaCL: FATAL ERROR: ViennaCL encountered an unknown OpenCL error.
Most likely your OpenCL SDK or driver is not installed properly.
In some cases, this error is due to an invalid global work size or several kernel compilation errors.
If you think that this is a bug in ViennaCL, please report it at [email protected]
and supply at least the following information:
- Operating System
- Which OpenCL implementation (AMD, NVIDIA, etc.)
- ViennaCL version
Many thanks in advance!
gpuR was compiled and installed successfully along with ViennaCL headers during image building. The NVIDIA GPU driver was exposed to the docker container through nvidia-docker.
On the container, the following files are present:
bash-4.2$ ls -al /usr/local/cuda/lib64/ | grep CL
lrwxrwxrwx. 1 root root 14 Nov 17 12:28 libOpenCL.so -> libOpenCL.so.1
lrwxrwxrwx. 1 root root 16 Nov 17 12:28 libOpenCL.so.1 -> libOpenCL.so.1.0
lrwxrwxrwx. 1 root root 18 Nov 17 12:28 libOpenCL.so.1.0 -> libOpenCL.so.1.0.0
-rw-r--r--. 1 root root 25840 Sep 5 13:15 libOpenCL.so.1.0.0
bash-4.2$ ls -al /usr/local/nvidia/lib64 | grep cl
lrwxrwxrwx. 1 969 965 26 Nov 13 18:32 libnvidia-opencl.so.1 -> libnvidia-opencl.so.367.57
-rwxr-xr-x. 2 root root 8592200 Oct 4 03:41 libnvidia-opencl.so.367.57
My docker build logs and images are here:
https://hub.docker.com/r/mjmg/centos-rstudio-opencpu-server-cuda/builds/
Hello,
I tried to install gpuR and got:
Installing package into ‘C:/Users/tony_/Documents/R/win-library/3.3’
(as ‘lib’ is unspecified)
Package which is only available in source form, and may need compilation of
C/C++/Fortran: ‘gpuR’
These will not be installed.
Installing package into ‘C:/Users/tony_/Documents/R/win-library/3.3’
(as ‘lib’ is unspecified)
Can someone please help? Thanks in advance!
Tony.
I am loving gpuR. I would like to request that you add support for using the Rsymphony package. From its description:
SYMPHONY is an open source solver for solving mixed integer linear programs (MILPs). The current
version can be found at https://projects.coin-or.org/SYMPHONY. Package Rsymphony
uses the C interface of the callable library provided by SYMPHONY, and supplies a high level
solver function in R using the low level C interface.
> devtools::install_github("cdeterman/gpuR", ref = "develop")
...
Error: package ‘RViennaCL’ 1.7.1-0 was found, but >= 1.7.1.1 is required by ‘cdeterman-gpuR-a27bc9c’
R version 3.2.3 (2015-12-10) -- "Wooden Christmas-Tree"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)
MacOSX 10.10.5 (Yosemite)
Installing and running a simple example gives
> library(devtools)
> dev_mode(TRUE)
Dev mode: ON
d>
d> library(gpuR)
d> A <- gpuMatrix(rnorm(10000), 100, 100)
d> A %*% A
LLVM ERROR: Cannot select: 0x47b28c0: f64 = fpow 0x478f110, 0x44c0330 [ORD=12] [ID=59]
0x478f110: f64 = bitcast 0x47b02d0 [ORD=6] [ID=56]
0x47b02d0: i64,ch = load 0x49617d0, 0x478efe8, 0x487c510<LD8[%69(addrspace=1)](tbaa=<0x3ed1378>)> [ORD=6] [ID=54]
0x478efe8: i64 = add 0x478ec70, 0x4201038 [ORD=5] [ID=51]
0x478ec70: i64,ch = CopyFromReg 0x49617d0, 0x4590138 [ORD=5] [ID=26]
0x4590138: i64 = Register %vreg69 [ID=4]
0x4201038: i64 = shl 0x44c10f0, 0x47b1098 [ORD=5] [ID=48]
0x44c10f0: i64 = sign_extend 0x4746ea8 [ORD=4] [ID=45]
0x4746ea8: i32 = add 0x478f908, 0x4748358 [ORD=3] [ID=40]
0x478f908: i32,ch = CopyFromReg 0x49617d0, 0x482f660 [ORD=3] [ID=25]
0x482f660: i32 = Register %vreg23 [ID=3]
0x4748358: i32 = mul 0x49a1950, 0x47b0e48 [ORD=2] [ID=36]
0x49a1950: i32,ch = CopyFromReg 0x49617d0, 0x487c888 [ORD=2] [ID=23]
0x487c888: i32 = Register %vreg29 [ID=1]
0x47b0e48: i32,ch = CopyFromReg 0x49617d0, 0x49a1cc8 [ORD=2] [ID=24]
0x49a1cc8: i32 = Register %vreg73 [ID=2]
0x47b1098: i32 = Constant<3> [ID=22]
0x487c510: i64 = undef [ID=5]
0x44c0330: f64 = bitcast 0x44c8f08 [ORD=11] [ID=58]
0x44c8f08: i64,ch = load 0x49617d0, 0x458eb60, 0x487c510<LD8[%74(addrspace=1)](tbaa=<0x3ed1378>)> [ORD=11] [ID=55]
0x458eb60: i64 = add 0x458f940, 0x47b19d8 [ORD=10] [ID=52]
0x458f940: i64,ch = CopyFromReg 0x49617d0, 0x4746c58 [ORD=10] [ID=29]
0x4746c58: i64 = Register %vreg75 [ID=8]
0x47b19d8: i64 = shl 0x47b20a8, 0x47b1098 [ORD=10] [ID=49]
0x47b20a8: i64 = sign_extend 0x49a24e0 [ORD=9] [ID=46]
0x49a24e0: i32 = add 0x458f4a0, 0x487d1c8 [ORD=8] [ID=41]
0x458f4a0: i32,ch = CopyFromReg 0x49617d0, 0x482f410 [ORD=8] [ID=28]
0x482f410: i32 = Register %vreg24 [ID=7]
0x487d1c8: i32 = mul 0x49a1950, 0x4748480 [ORD=7] [ID=37]
0x49a1950: i32,ch = CopyFromReg 0x49617d0, 0x487c888 [ORD=2] [ID=23]
0x487c888: i32 = Register %vreg29 [ID=1]
0x4748480: i32,ch = CopyFromReg 0x49617d0, 0x47b0648 [ORD=7] [ID=27]
0x47b0648: i32 = Register %vreg79 [ID=6]
0x47b1098: i32 = Constant<3> [ID=22]
0x487c510: i64 = undef [ID=5]
In function: element_op
~ $
d> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] gpuR_1.0.2 devtools_1.10.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.3 assertive.sets_0.0-1
[3] assertive.data.us_0.0-1 assertive.types_0.0-1
[5] assertive.properties_0.0-1 digest_0.6.9
[7] assertive.base_0.0-3 assertive.models_0.0-1
[9] assertive.code_0.0-1 assertive.strings_0.0-1
[11] assertive.matrices_0.0-1 assertive.reflection_0.0-1
[13] assertive.data_0.0-1 assertive_0.3-1
[15] assertive.datetimes_0.0-1 tools_3.2.3
[17] assertive.numbers_0.0-1 assertive.files_0.0-1
[19] assertive.data.uk_0.0-1 memoise_1.0.0
[21] knitr_1.12.3
I was trying to install gpuR on Ubuntu 16.10. I did the following to install the package:
devtools::install_github("cdeterman/RViennaCL")
devtools::install_github("cdeterman/gpuR", ref = "develop")
But I am getting an error:
/home/boby/R/x86_64-pc-linux-gnu-library/3.3/RViennaCL/include/viennacl/traits/size.hpp:164:44: error: ‘const class Eigen::Map<Eigen::Matrix<int, -1, -1>, 0,
/usr/lib/R/etc/Makeconf:141: recipe for target 'chol.o' failed
make: *** [chol.o] Error 1
ERROR: compilation failed for package ‘gpuR’
It would be great if you can help me in this regard.
Boby Mathew
Error from master 0e54a5d.
library(gpuR)
A <- seq.int(from=0, to=999)
gpuA <- gpuVector(A)
Error in deviceHasDouble(platform_index, device_index) :
is_integer : gpu_idx is not of type 'integer'; it has class 'numeric'.
It seems that somewhere internally we call deviceHasDouble with a numeric argument. I can open a new issue for that.
gpuInfo()
$deviceName
[1] "Iris Pro"
$deviceVendor
[1] "Intel"
$numberOfCores
[1] 40
$maxWorkGroupSize
[1] 512
$maxWorkItemDim
[1] 3
$maxWorkItemSizes
[1] 512 512 512
$deviceMemory
[1] 1610612736
$clockFreq
[1] 1200
$localMem
[1] 65536
$maxAllocatableMem
[1] 402653184
$available
[1] "yes"
$deviceExtensions
[1] "cl_APPLE_SetMemObjectDestructor" "cl_APPLE_ContextLoggingFunctions"
[3] "cl_APPLE_clut" "cl_APPLE_query_kernel_names"
[5] "cl_APPLE_gl_sharing" "cl_khr_gl_event"
[7] "cl_khr_global_int32_base_atomics" "cl_khr_global_int32_extended_atomics"
[9] "cl_khr_local_int32_base_atomics" "cl_khr_local_int32_extended_atomics"
[11] "cl_khr_byte_addressable_store" "cl_khr_image2d_from_buffer"
[13] "cl_khr_gl_depth_images" "cl_khr_depth_images"
[15] "cl_khr_3d_image_writes" ""
$double_support
[1] FALSE
Need to add the following element-wise functions:
sin
cos
tan
asin
acos
atan
sinh
cosh
tanh
asinh
acosh
atanh
ViennaCL is known for good sparse matrix operations. Should add new classes to allow users to leverage this.
Thank you a lot for this package and all the work done. I have an issue with the installation of the "develop" branch on OS X.
I have a pretty modern MacBook with Intel Iris Pro graphics and OpenCL 1.2.
I can install the package from the master branch; however, the "develop" branch produces:
.........
cd ../inst/include/loader/ && make libOpenCL.a
CC="clang" CFLAGS=" -fPIC -Wall -mtune=core2 -g -O2 -march=native -ffast-math -Ofast -mtune=native" AR="ar" RM="rm -f"
ICD_OS=icd_linux
clang -I../ -fPIC -Wall -mtune=core2 -g -O2 -march=native -ffast-math -Ofast -mtune=native -c icd.c -o icd.o
In file included from icd.c:40:
In file included from ./icd.h:53:
../CL/cl.h:703:74: error: expected function body after function declarator
...cl_command_queue /* command_queue */) CL_API_SUFFIX__VERSION_2_1;
^
../CL/cl.h:708:63: error: expected function body after function declarator
cl_ulong /* host_timestamp */) CL_API_SUF...
^
../CL/cl.h:712:52: error: expected function body after function declarator
cl_ulong * /* host_timestamp */) CL_API_SUFFIX__VERSION_2_1;
^
../CL/cl.h:749:80: error: expected function body after function declarator
...cl_int * /* errcode_ret */) CL_API_SUFFIX__VERSION_2_0;
^
../CL/cl.h:793:60: error: expected function body after function declarator
cl_int * /* errcode_ret */) CL_API_SUFFIX...
^
../CL/cl.h:828:60: error: expected function body after function declarator
size_t * /* param_value_size_ret */) CL_API_SUFFIX...
^
../CL/cl.h:841:46: error: expected function body after function declarator
cl_uint /* alignment */) CL_API_SUFFIX__VERSION_2_0;
^
../CL/cl.h:845:48: error: expected function body after function declarator
void * /* svm_pointer */) CL_API_SUFFIX__VERSION_2_0;
^
../CL/cl.h:851:81: error: expected function body after function declarator
...cl_int * /* errcode_ret */) CL_API_SUFFIX__VERSION...
^
../CL/cl.h:894:56: error: expected function body after function declarator
cl_int /* errcode_ret */) CL_API_SUFFIX__VE...
^
../CL/cl.h:966:48: error: expected function body after function declarator
cl_int /* errcode_ret */) CL_API_SUFFIX__VERSION_2_1;
^
../CL/cl.h:983:56: error: expected function body after function declarator
const void * /* arg_value */) CL_API_SUFFIX__VE...
^
../CL/cl.h:989:61: error: expected function body after function declarator
const void * /* param_value */) CL_API_SUFFI...
^
../CL/cl.h:1022:82: error: expected function body after function declarator
...size_t /* param_value_size_ret */ ) CL_API_SUFFIX_...
^
../CL/cl.h:1322:49: error: expected function body after function declarator
cl_event * /* event */) CL_API_SUFFIX__VERSION_2_0;
^
../CL/cl.h:1332:51: error: expected function body after function declarator
cl_event * /* event */) CL_API_SUFFIX__VERSION_2_0;
^
../CL/cl.h:1342:52: error: expected function body after function declarator
cl_event * /* event */) CL_API_SUFFIX__VERSION_2_0;
^
../CL/cl.h:1352:48: error: expected function body after function declarator
cl_event * /* event */) CL_API_SUFFIX__VERSION_2_0;
^
../CL/cl.h:1359:50: error: expected function body after function declarator
cl_event * /* event */) CL_API_SUFFIX__VERSION_2_0;
^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
make[1]: *** [icd.o] Error 1
make: *** [../inst/include/loader/libOpenCL.a] Error 2
ERROR: compilation failed for package ‘gpuR’
- removing ‘/Users/dymitriyselivanov/Library/R/3.3/library/gpuR’
Error: Command failed (1)
I would like to request a second dist method. Something like:
distance(A, B, method="euclidean")
The inputs are two sets of points and the metric to use for the distance calculation. The idea is to compute the distance from every point in A to every point in B, so a modification of the existing method, which computes all distances between all points in a single set.
The result would be a matrix containing nrow(A) x nrow(B) distance scores.
The scenario I am trying to solve is one where I have a very large set of fixed points and need to test the distances of a secondary, variable set of points against the larger one. Hence I would also like to request a 'squared Euclidean' "metric" (I am aware this is not a proper mathematical metric), as it avoids computation of the square root and is therefore faster. This would still suffice to compare the squared distance against the squared value of a target/cutoff distance.
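For reference, the requested cross-distance reduces to matrix products plus element-wise ops, which is presumably why it maps well to a GPU. A hedged base-R sketch (cross_dist is a hypothetical name, not a gpuR function):

```r
# All Euclidean distances from each row of A to each row of B, via
# |a - b|^2 = |a|^2 + |b|^2 - 2 a.b; returns an nrow(A) x nrow(B) matrix.
cross_dist <- function(A, B, squared = FALSE) {
  d2 <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * tcrossprod(A, B)
  d2 <- pmax(d2, 0)              # clamp tiny negatives from rounding
  if (squared) d2 else sqrt(d2)  # squared = TRUE skips the sqrt
}
```

The squared = TRUE path corresponds to the requested 'squared Euclidean' option.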
I am using OS X El Capitan and had no problems so far (well, SIP is a problem if you don't have a recovery partition but that is an Apple issue). Planning to also test the GPU selection under the other ticket. Using a Mac Pro with 2 GPUs so there is potential there.
Thanks for all the great work.
Reference in Issue #15
@vsmaier I have been testing different aspects and I am confident that when objects are removed and R's garbage collection is run, the GPU memory is freed (it was how I designed the package). That said, I am actually stunned that you could run dist or distance on a matrix with 600,000 rows. That would result in a distance matrix of 600,000 x 600,000. Assuming you are using the double type by default, that would equate to almost 3 TB of RAM! Your AMD FirePro GPUs only have about 2 GB of RAM.
Were you actually receiving results before? A quick dim of the resulting object would be sufficient to demonstrate the matrix was created.
Are you actually receiving an error anywhere?
How are these repeat calls being accomplished? A loop or are you just manually rerunning the same code?
Either way, if you are rapidly calling the same function, you likely want to explicitly call the rm() and gc() functions to make sure RAM is freed.
gpublocks <- gpuR::vclMatrix( as.matrix( mblocks[sample(1:nrow(blocks), 25000,replace=FALSE),] ))
gpuD <- dist(gpublocks)
rm(gpublocks)
gc()
In-place operations for many basic functions would be useful (e.g. +, -, *, etc.).
+
-
*
/
sin, asin, sinh
cos, acos, cosh
tan, atan, tanh
I am sorry if this looks messy, but here is the copy-and-pasted output my R gives me when running install.packages('gpuR'):
install.packages('gpuR')
Installing package into ‘C:/Users/Alex/Documents/R/win-library/3.3’
(as ‘lib’ is unspecified)
Package which is only available in source form, and may need compilation of C/C++/Fortran: ‘gpuR’
Do you want to attempt to install these from sources?
y/n: y
installing the source package ‘gpuR’
trying URL 'https://mran.revolutionanalytics.com/snapshot/2016-07-01/src/contrib/gpuR_1.1.2.tar.gz'
Content type 'application/octet-stream' length 323285 bytes (315 KB)
downloaded 315 KB
c:/Rtools/mingw_64/bin/g++ -m64 -std=c++0x -I"C:/PROGRA~1/MICROS~3/MRO-331.1/include" -DNDEBUG -I"C:/Users/Alex/Documents/R/win-library/3.3/Rcpp/include" -I"C:/Users/Alex/Documents/R/win-library/3.3/RcppEigen/include" -I"C:/Users/Alex/Documents/R/win-library/3.3/RViennaCL/include" -I"C:/Users/Alex/Documents/R/win-library/3.3/BH/include" -I"C:/swarm/workspace/External-R-3.3.1/vendor/extsoft/include" -I../inst/include -I"C:/Users/Alex/AMD APP SDK/3.0/include" -O2 -Wall -mtune=core2 -c RcppExports.cpp -o RcppExports.o
c:/Rtools/mingw_64/bin/g++ -m64 -std=c++0x -I"C:/PROGRA~1/MICROS~3/MRO-331.1/include" -DNDEBUG -I"C:/Users/Alex/Documents/R/win-library/3.3/Rcpp/include" -I"C:/Users/Alex/Documents/R/win-library/3.3/RcppEigen/include" -I"C:/Users/Alex/Documents/R/win-library/3.3/RViennaCL/include" -I"C:/Users/Alex/Documents/R/win-library/3.3/BH/include" -I"C:/swarm/workspace/External-R-3.3.1/vendor/extsoft/include" -I../inst/include -I"C:/Users/Alex/AMD APP SDK/3.0/include" -O2 -Wall -mtune=core2 -c context.cpp -o context.o
In file included from context.cpp:11:0:
C:/Users/Alex/Documents/R/win-library/3.3/RViennaCL/include/viennacl/ocl/device.hpp:28:19: fatal error: CL/cl.h: No such file or directory
#include <CL/cl.h>
^
compilation terminated.
make: *** [context.o] Error 1
Warning: running command 'make -f "Makevars.win" -f "C:/PROGRA~1/MICROS~3/MRO-331.1/etc/x64/Makeconf" -f "C:/PROGRA~1/MICROS~3/MRO-331.1/share/make/winshlib.mk" CXX='$(CXX1X)
ERROR: compilation failed for package 'gpuR'
I believe it has something to do with RViennaCL. At the same time, my work computer installed it completely fine, but my home computer is having trouble. They both use the same version of R, the Microsoft Open version. I tried it on an earlier version but no luck. Looking forward to some help :)
BH is only minimally used for such a large package. It shouldn't be necessary.
It would be beneficial to have objects point to an object that is already on the GPU to avoid the overhead of transferring back and forth.
Two options:
1. Modify gpuMatrix to always point to GPU memory.
2. Create a new vclMatrix class to point to GPU memory and make sure the two play well together.
Currently leaning towards option 2, as the vclMatrix objects could be very useful for writing algorithms, whereas the gpuMatrix objects are good for many separate runs without clogging up the GPU RAM. The gpuMatrix objects will therefore be retained as the host objects that represent the CPU pointer that can still be passed to the GPU (this will avoid conflicts with base methods).
This will require helper functions to transfer between (host, device, toHost, toDevice, toGPU?). There will probably be corresponding as.vclMatrix and vclMatrix calls for those who want to work only with GPU-specific memory. Note that 'vcl' emphasizes that these objects are ViennaCL objects.
This should be relatively simple to have pointers directly to the objects once they are copied to the GPU. However, need to make sure of object persistence and destruction.
I will try to write a tiny deep neural network project with different backends - GPU with gpuR (on top of vclMatrix) or CPU with BLAS. Need some more math functions:
sign
max, min
pmax, pmin
Probably I can do it similarly to how it is done with the existing math functions. Will take a look when I have time.
Conducting matrix multiplication with float type gpuMatrix objects is resulting in significant memory leakage. This first appeared to me when I switched over to an NVIDIA card (it may have existed with AMD but it wasn't apparent). Not sure if this is a consequence of NVIDIA not playing nice with the OpenCL code within clBLAS or not.
This does NOT, however, appear to be an issue with integer type gpuMatrix objects. The biggest difference between the two is the use of clBLAS. The clBLAS code requires the use of the C API, whereas there is no igemm function, so I created the kernel and utilized the C++ API to make the code more concise. It is possible an object is somehow closed with the C++ API that I am somehow overlooking with the C API.
I have also verified that a standalone script (clBLAS sgemm example) works without a problem (i.e. no memory leakage) according to valgrind. Running valgrind on a script containing:
ORDER = 1024
A <- matrix(sample(seq(10), ORDER^2, replace=TRUE), nrow=ORDER)
B <- matrix(sample(seq(10), ORDER^2, replace=TRUE), nrow=ORDER)
gpuA <- gpuMatrix(A, type="float")
gpuB <- gpuMatrix(B, type="float")
gpuC <- gpuA %*% gpuB
reports the 'definitely lost' data, but I can only find information about some uninitialised values (I am not very fluent with valgrind). I suspect I may need to point directly to the NVIDIA opencl.so file.
Hi,
Just started playing with the gpuR package. I am wondering if there is a way to select the graphics processor. I have a laptop with 2 GPUs (Intel Iris Pro + AMD R9). gpuR always uses the Intel GPU (even if I force the OS to use AMD only).
Thanks,
Dennis
It would be nice to have a norm function similar to the one in base R. I have noticed that most of the functionality is there to achieve this, but I had not noticed a feature request for this particular function/operation.
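Indeed, the Frobenius norm at least is expressible with already-available element-wise operations; a small sketch of the identity in plain R, with base norm() as the reference (the claim that this carries over to gpuMatrix objects is an assumption):

```r
# Frobenius norm as the sqrt of the sum of squared entries; on a gpuMatrix
# the same expression would only need element-wise '*' and sum().
frob <- function(M) sqrt(sum(M * M))
```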
OS - Debian testing (aka 'stretch')
GPU - Intel(R) HD Graphics Haswell GT2 Mobile
Reproducible script (truncated):
A <- gpuMatrix(seq.int(16), nrow=4, ncol=4, type="integer")
A %*% A
Segmentation fault
Bug filed with beignet on bugzilla (https://bugs.freedesktop.org/show_bug.cgi?id=94823)
This doesn't appear to happen with either the AMD or NVIDIA SDKs.
I am trying to install gpuR but I am not having success.
I installed the prerequisites (opencl-headers, CUDA, and the dependency R packages), but the gpuR installation returns "version OPENCL_2.0 not defined in file libOpenCL.so.1 with link time reference".
How could I solve this problem?
platform x86_64-pc-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 3
minor 3.1
year 2016
month 06
day 21
svn rev 70800
language R
version.string R version 3.3.1 (2016-06-21)
nickname Bug in Your Hair
Need to have consistency between the types, also more efficient memory handling this way.
Hello.
I've installed the NVIDIA CUDA Toolkit, which includes OpenCL.
I have created the environment variables
OPENCL_INC
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include
OPENCL_LIB64 and OPENCL_LIB32
C:\Program Files\NVIDIA Corporation\OpenCL
Those folders exist, and contain files such as cl.h, OpenCL32.dll or OpenCL64.dll.
but when I try to install gpuR I get this error:
- installing source package 'gpuR' ...
** package 'gpuR' successfully unpacked and MD5 sums checked
g++ version = 4.6.3
OPENCL_INC not found!
Please set OPENCL_INC to OpenCL headers.
Warning: running command 'sh ./configure.win' had status 1
ERROR: configuration failed for package 'gpuR'- removing 'C:/Program Files/Microsoft/MRO/R-3.2.4/library/gpuR'
What can I do?
Regards
Also tried with
OPENCL_INC
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL
Should I install Intel or AMD OpenCL SDK as well? I guess not because they are made for Intel and AMD cards.
Hello,
Would it be possible to have eigenvalue & eigenvector decomposition implemented? As well as covariance matrix calculations?
Thank you
This can likely be accomplished using LU decomposition, whereby det(L) * det(U) = det(X). However, this requires a cumprod type of functionality for the diagonal elements of the L and U matrices (given that they are triangular). I want to get a more recent version of gpuR released soon, so setting this back to version 1.3.0.
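A CPU-side sketch of the idea using the Matrix package (which ships with R), to make the product-over-diagonals step concrete; lu() and expand() here are Matrix functions, not gpuR ones:

```r
library(Matrix)

set.seed(1)
X <- matrix(rnorm(9), 3)
f <- expand(lu(Matrix(X)))               # factors P, L, U with X = P L U
d <- prod(diag(f$L)) * prod(diag(f$U))   # L is unit-diagonal, so this is prod(diag(U))
# |d| matches |det(X)|; the sign comes from the permutation P.
```

On the GPU, prod(diag(U)) is exactly the cumprod-style reduction over the diagonal mentioned above.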
I was trying to run your example here, the first part where you check whether the computation in R and on the GPU returns the exact same values.
I ran the code, and the compare statement all(C == gpuC[]) returns FALSE. I decreased the ORDER variable and it returns TRUE for all values in the range 1:40; then it becomes FALSE.
I checked the norm of the difference for ORDER as 1024 and I get:
> all(C == gpuC[])
[1] FALSE
>
> norm(C-gpuC[],type="f")
[1] 3.660638e-11
So the results are very similar, but not identical as they are in your example.
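(This is generally expected: the float type carries roughly 7 significant decimal digits, so results that round-trip through 32-bit GPU storage will differ from R's 64-bit doubles in the low bits. Comparisons should use a tolerance, e.g. via all.equal(), as this toy illustration shows:)

```r
# A tiny relative difference, far smaller than float precision can resolve:
x <- 1 + 1e-9
identical(x, 1)                            # FALSE: not bit-identical
isTRUE(all.equal(x, 1, tolerance = 1e-6))  # TRUE: equal within a float-scale tolerance
```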
I am running Ubuntu 16.04 on a Lenovo W540 machine, and the output from gpuInfo() is:
> gpuInfo()
$deviceName
[1] "Quadro K2100M"
$deviceVendor
[1] "NVIDIA Corporation"
$numberOfCores
[1] 3
$maxWorkGroupSize
[1] 1024
$maxWorkItemDim
[1] 3
$maxWorkItemSizes
[1] 1024 1024 64
$deviceMemory
[1] 2095251456
$clockFreq
[1] 666
$localMem
[1] 49152
$maxAllocatableMem
[1] 523812864
$available
[1] "yes"
$deviceExtensions
[1] "cl_khr_global_int32_base_atomics" "cl_khr_global_int32_extended_atomics" "cl_khr_local_int32_base_atomics"
[4] "cl_khr_local_int32_extended_atomics" "cl_khr_fp64" "cl_khr_byte_addressable_store"
[7] "cl_khr_icd" "cl_khr_gl_sharing" "cl_nv_compiler_options"
[10] "cl_nv_device_attribute_query" "cl_nv_pragma_unroll" "cl_nv_copy_opts"
[13] "cl_khr_gl_event"
$double_support
[1] TRUE
So my question is, are minor inconsistencies expected? Or is this something which is probably only happening on my system?
And btw, thanks for creating this package, it is awesome!
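To answer the question directly: yes, small discrepancies are expected. The GPU result is computed in single precision and with a different summation order, so an exact == comparison is too strict; all.equal with a tolerance is the usual check. A self-contained sketch that reproduces the same effect by rounding a double result to float precision (the writeBin/readBin round-trip stands in for the GPU's type = "float" arithmetic):

```r
set.seed(3)
A <- matrix(rnorm(64), 8)
B <- matrix(rnorm(64), 8)
C  <- A %*% B
# Round-trip through 4-byte floats to mimic a single-precision GPU result.
Cf <- matrix(readBin(writeBin(c(C), raw(), size = 4),
                     "double", n = length(C), size = 4), nrow = 8)
all(C == Cf)                        # FALSE: the last bits differ
all.equal(C, Cf, tolerance = 1e-6)  # TRUE: identical to float precision
```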
During installation, devtools::install_github("cdeterman/RViennaCL") completes fine, but devtools::install_github("cdeterman/gpuR") throws an error:
installing source package ‘gpuR’ ...
** libs
clang++ -std=c++11 -I/usr/local/Cellar/r/3.2.1_1/R.framework/Resources/include -DNDEBUG -I/usr/local/opt/gettext/include -I/usr/local/opt/readline/include -I"/usr/local/Cellar/r/3.2.1_1/R.framework/Versions/3.2/Resources/library/Rcpp/include" -I"/usr/local/Cellar/r/3.2.1_1/R.framework/Versions/3.2/Resources/library/RcppEigen/include" -I"/usr/local/Cellar/r/3.2.1_1/R.framework/Versions/3.2/Resources/library/RViennaCL/include" -I"/usr/local/Cellar/r/3.2.1_1/R.framework/Versions/3.2/Resources/library/BH/include" -I../inst/include -fPIC -g -O2 -c RcppExports.cpp -o RcppExports.o
clang++ -std=c++11 -I/usr/local/Cellar/r/3.2.1_1/R.framework/Resources/include -DNDEBUG -I/usr/local/opt/gettext/include -I/usr/local/opt/readline/include -I"/usr/local/Cellar/r/3.2.1_1/R.framework/Versions/3.2/Resources/library/Rcpp/include" -I"/usr/local/Cellar/r/3.2.1_1/R.framework/Versions/3.2/Resources/library/RcppEigen/include" -I"/usr/local/Cellar/r/3.2.1_1/R.framework/Versions/3.2/Resources/library/RViennaCL/include" -I"/usr/local/Cellar/r/3.2.1_1/R.framework/Versions/3.2/Resources/library/BH/include" -I../inst/include -fPIC -g -O2 -c detectCPUs.cpp -o detectCPUs.o
In file included from detectCPUs.cpp:6:
In file included from /usr/local/Cellar/r/3.2.1_1/R.framework/Versions/3.2/Resources/library/RViennaCL/include/CL/cl.hpp:158:
In file included from /System/Library/Frameworks/OpenCL.framework/Headers/opencl.h:14:
/System/Library/Frameworks/OpenCL.framework/Headers/cl_gl_ext.h:136:29: error: unknown type name 'cl_image_desc'; did you mean 'cl_image_info'?
const cl_image_desc * /* image_desc */,
It would be great to add support for slicing with ranges (and to document the limitations due to GPU RAM latency):
library(gpuR)
N = 128
A = matrix(rnorm(N * N), nrow = N)
G = gpuMatrix(A, type = 'float') # or vclMatrix
# works
G[1, 4]
# all below don't work
# Error in eval(substitute(expr), envir, enclos) : expecting a single value
G[1:4, ]
G[, 1:4]
G[1:4, 1:4]
It would also be great to have docs explaining how matrices are stored on the GPU: row-major or column-major.
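Until range indexing lands, a hedged workaround is to copy the matrix back to the host with `[` (which already works) and slice the resulting base R matrix; the cost is a full device-to-host transfer:

```r
library(gpuR)

N <- 128
A <- matrix(rnorm(N * N), nrow = N)
G <- gpuMatrix(A, type = "float")

# G[] copies the whole matrix back to an ordinary R matrix; slice that.
block <- G[][1:4, 1:4]
```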
QR and SVD appear to be implemented in ViennaCL. Cholesky appears to have limited support from what I have found so far.
I have installed the gpuR package on my Windows machine without any issues (no compilation or install errors). However, when I run some basic commands from the package, like gpuInfo(), detectGPUs(), detectPlatforms(), etc., RStudio takes a long time to respond and eventually crashes with a fatal error from R. Do you have any suggestions as to why this might happen? Is it a user-permissions issue?
Thanks in advance.
Hello,
I tried to install the gpuR package on my Windows computer, after having downloaded the CUDA toolkit, but it didn't work. I also installed Rtools and changed the system PATH. I got a message telling me that the package "is only available in source form, and may need compilation of C/C++/Fortran". I tried to install the package from the source file, but that wasn't any more successful.
Do you know how to fix my problem?
Kind regards,
Yassin
Hi there,
I need some help with my installation of "gpuR". I have a Dell Alienware MX14R2 with an NVIDIA GeForce GT 650M 2 GB GPU and Windows 8.1 Enterprise. I'm using RStudio with the MS R Open 3.3.0 distro of R.
The function calls are not returning anything except a list. For example:
library(gpuR)
gpuR 1.1.4
ORDER = 1024
A = matrix(rnorm(ORDER^2), nrow=ORDER)
B = matrix(rnorm(ORDER^2), nrow=ORDER)
gpuA = gpuMatrix(A, type="float")
gpuB = gpuMatrix(B, type="float")
gpuA
An object of class "fgpuMatrix"
Slot "address":
<pointer: 0x000000000863e460>
Slot ".context_index":
[1] 1
Slot ".platform_index":
[1] 1
Slot ".platform":
[1] "Intel(R) OpenCL"
Slot ".device_index":
[1] 1
Slot ".device":
[1] " Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz"
By the looks of things gpuR is using an Intel OpenCL library. My laptop also has an Intel HD 4000 graphics card. However, a call to gpuInfo()
generated a list as follows:
gpuInfo()
$deviceName
[1] "GeForce GT 650M"
$deviceVendor
[1] "NVIDIA Corporation"
$numberOfCores
[1] 2
$maxWorkGroupSize
[1] 1024
$maxWorkItemDim
[1] 3
$maxWorkItemSizes
[1] 1024 1024 64
$deviceMemory
[1] 2147483648
$clockFreq
[1] 835
$localMem
[1] 49152
$maxAllocatableMem
[1] 536870912
$available
[1] "yes"
$deviceExtensions
[1] "cl_khr_global_int32_base_atomics" "cl_khr_global_int32_extended_atomics" "cl_khr_local_int32_base_atomics"
[4] "cl_khr_local_int32_extended_atomics" "cl_khr_fp64" "cl_khr_byte_addressable_store"
[7] "cl_khr_icd" "cl_khr_gl_sharing" "cl_nv_compiler_options"
[10] "cl_nv_device_attribute_query" "cl_nv_pragma_unroll" "cl_nv_d3d9_sharing"
[13] "cl_nv_d3d10_sharing" "cl_khr_d3d10_sharing" "cl_nv_d3d11_sharing"
[16] "cl_nv_copy_opts"
$double_support
[1] TRUE
I followed these steps in the installation:
Sys.setenv(OPENCL_INC = 'C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v7.5/include')
Sys.setenv(OPENCL_LIB32 = 'C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v7.5/lib/Win32')
Sys.setenv(OPENCL_LIB = 'C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v7.5/lib/x64')
devtools::install_local after a GitHub clone, as follows: git clone -b develop https://[email protected]/cdeterman/gpuR.git
I'm in all likelihood overlooking something simple, but I've been pushing the bounds of my abilities to get this far. If it helps, my PATH is like this:
Sys.getenv('PATH') [1] "C:\\Program Files\\Microsoft\\MRO\\R-3.3.0\\bin\\x64;C:\\Rtools-3.3\\bin;C:\\Rtools-3.3\\gcc-4.6.3\\bin;C:\\Rtools-3.3\\bin;C:\\Rtools-3.3\\gcc-4.6.3\\bin;C:\\Rtools-3.3\\bin;C:\\Rtools-3.3\\mingw_32\\bin;C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v7.5\\bin;C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v7.5\\libnvvp;C:\\Rtools-3.3\\bin;C:\\Rtools-3.3\\mingw_32\\bin;C:\\ProgramData\\Oracle\\Java\\javapath;C:\\Program Files (x86)\\Intel\\iCLS Client\\;C:\\Program Files\\Intel\\iCLS Client\\;C:\\Windows\\system32;C:\\Windows;C:\\Windows\\System32\\Wbem;C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\;C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common;C:\\Program Files\\Intel\\Intel(R) Management Engine Components\\DAL;C:\\Program Files\\Intel\\Intel(R) Management Engine Components\\IPT;C:\\Program Files (x86)\\Intel\\Intel(R) Management Engine Components\\DAL;C:\\Program Files (x86)\\Intel\\Intel(R) Management Engine Components\\IPT;C:\\Program Files (x86)\\Skype\\Phone\\;C:\\Program Files (x86)\\PuTTY\\;C:\\Program Files (x86)\\Windows Kits\\8.1\\Windows Performance Toolkit\\;C:\\Program Files\\Microsoft SQL Server\\110\\Tools\\Binn\\;C:\\Program Files (x86)\\Microsoft SDKs\\TypeScript\\1.0\\;C:\\Program Files\\Git\\cmd;C:\\RailsInstaller\\Git\\cmd;C:\\RailsInstaller\\Ruby2.2.0\\bin"
Any help would be great. Thank you.
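One thing that stands out above: the gpuMatrix objects were created in a context on the "Intel(R) OpenCL" platform, while gpuInfo() reports the GeForce. If I read the gpuR API correctly (listContexts() and setContext() are assumed from the package documentation), you can inspect the available contexts and switch to the NVIDIA one before allocating:

```r
library(gpuR)

listContexts()   # one row per platform/device combination
setContext(2L)   # index of the GeForce-backed context (check the listing first)
# Matrices created from now on live on the selected device:
gpuA <- gpuMatrix(matrix(rnorm(16), 4), type = "float")
```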
> devtools::install_github("cdeterman/gpuR")
Downloading github repo cdeterman/gpuR@master
Installing gpuR
Skipping 1 packages not available: RViennaCL
Skipping 4 packages ahead of CRAN: evaluate, knitr, mime, Rcpp
Installing 2 packages: assertive, assertive.base
'/Library/Frameworks/R.framework/Resources/bin/R' --no-site-file --no-environ \
--no-save --no-restore CMD INSTALL \
'/private/var/folders/f2/9jwh0h8s4y70r1jl3s7cq_5c0000gn/T/Rtmpk1kn3Z/devtools5d004d66aa49/cdeterman-gpuR-0e54a5d' \
--library='/Library/Frameworks/R.framework/Versions/3.2/Resources/library' \
--install-tests
ERROR: dependency ‘RViennaCL’ is not available for package ‘gpuR’
* removing ‘/Library/Frameworks/R.framework/Versions/3.2/Resources/library/gpuR’
Error: Command failed (1)
> install.packages("RViennaCL")
Warning message:
package ‘RViennaCL’ is not available (for R version 3.2.2)
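Since RViennaCL is not on CRAN for that R version, installing it from GitHub first should clear the dependency error; a sketch:

```r
# Pull the header-only dependency from GitHub, then gpuR itself.
install.packages("devtools")
devtools::install_github("cdeterman/RViennaCL")
devtools::install_github("cdeterman/gpuR")
```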