bigmemory's People
Forkers
cdeterman xinchoubiology linearregression eddelbuettel r2evans czarrar adamryczkowski hoardboard yaohuizeng nemochina2008 jeblundell itsmenotu kaikaiguo gridl rtobar sashikant123 twesleyb privefl gmweaver rnaimehaom isabelle-c-s-oliveira liyunfei5126 vineetp6
bigmemory's Issues
Session crash when using two versions of bigmemory.
I have a package that depends on bigmemory. If I build it with one version of bigmemory, it is OK.
Then if I rebuild another version of bigmemory (without rebuilding my package), my package makes the session crash when I use a function.
Rebuilding the package with the current version of bigmemory solves the problem.
If someone can tell why this happens, I am interested.
TL;DR: Each time you build another version of bigmemory, you may need to rebuild the packages that depend on it; otherwise they may crash your session.
Migrate util.h and util.cpp to Rcpp
We are currently resorting to undefining length in util.h on the Mac (and Windows?) because we are mixing R libraries with Rcpp. It would be nice if this file were migrated to use Rcpp only.
sub.big.matrix of a non-shared big.matrix?
a <- big.matrix(5, 5, shared = FALSE)
a.sub <- sub.big.matrix(a, 1, 3)
Error in DescribeBigMatrix(x) :
you can't describe a non-shared big.matrix.
Is it normal? Why?
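The error text suggests that sub.big.matrix goes through the descriptor mechanism, which is only available for shared matrices. A minimal sketch of the shared case, assuming the default shared allocation is acceptable; firstRow and lastRow are spelled out only to make the row window explicit:

```r
library(bigmemory)

# A shared big.matrix can be described, so a sub-matrix can be built from it.
a <- big.matrix(5, 5, type = "double", init = 0, shared = TRUE)

# Contiguous window covering rows 1 to 3 and all columns.
a.sub <- sub.big.matrix(a, firstRow = 1, lastRow = 3)

# The sub-matrix is a view on the same memory, not a copy.
a.sub[1, 1] <- 42
a[1, 1]
dim(a.sub)
```

Because the window is a view, writes through a.sub are visible in a.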
Non-contiguous sub.big.matrix?
Currently we have sub.big.matrix, but it explicitly states that non-contiguous subsets are not possible. It would be great if we could find a way to get this implemented before the next release.
Bigmemory gives an error in R CMD CHECK environment
From @cdeterman. The primary problem appears to be in the big.matrix.Rd file with the example
x <- big.matrix(10, 2, type='integer', init = -5)
Strangely this returns the error
Error: memory could not be allocated for instance of type big.matrix
where somehow CreateSharedMatrix is returning a null address. This is very odd as it works without a problem when run in the R console even when multiple big.matrix objects are created consecutively. The only way I have been able to reproduce the error is by rerunning R CMD CHECK or R --vanilla < bigmemory-Ex.R where bigmemory-Ex.R was previously produced by R CMD CHECK.
Get rid of dependence on boost uuid
rbind-like function?
Hi,
I am wondering if there is an rbind-like function, as it seems that rbind() cannot handle big.matrix objects.
Thanks
Vincent
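One possible workaround, sketched under the assumption that both inputs have the same type and number of columns: allocate a new big.matrix and copy the rows in. The helper name rbind_bm is hypothetical and not part of bigmemory.

```r
library(bigmemory)

# Hypothetical helper, not part of bigmemory: stack two big.matrix objects
# row-wise into a newly allocated big.matrix of the same type.
rbind_bm <- function(x, y) {
  stopifnot(ncol(x) == ncol(y), typeof(x) == typeof(y))
  out <- big.matrix(nrow(x) + nrow(y), ncol(x), type = typeof(x))
  rows_x <- seq_len(nrow(x))
  rows_y <- nrow(x) + seq_len(nrow(y))
  # Copy one column at a time so that only a single column is held in RAM.
  for (j in seq_len(ncol(x))) {
    out[rows_x, j] <- x[, j]
    out[rows_y, j] <- y[, j]
  }
  out
}

x <- as.big.matrix(matrix(1:6, nrow = 3))
y <- as.big.matrix(matrix(7:10, nrow = 2))
z <- rbind_bm(x, y)
dim(z)
```

The result is a copy, so it needs as much additional memory as both inputs combined.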
Problem is SetElements
Fix this and add a test.
N <- 1000
K <- 26
x <- big.matrix(N, K, type="char", backingfile="toy.bin", descriptorfile="toy.desc")
dim(x)
# [1] 1000 26
# # Some interesting points, here...
options(bigmemory.typecast.warning=FALSE)
options(bigmemory.allow.dimnames=TRUE)
# # Then do some basic things with the natural R syntax:
x[1:2, 1:2] <- 1:4 # Simple assignment
# Error in SetElements.bm(x, i, j, value) :
# number of items to replace is not a multiple of replacement length
Question about multiplyr fix
@privefl Thanks very much for the multiplyr pull request. @jeblundell has committed it and the new version of bigmemory should be on CRAN soon.
I was just reviewing the change here. Can you tell me what the problem was? I'm having trouble isolating it with a simple example:
library(bigmemory)
x <- big.matrix(5, 2, type="integer", init=0,
dimnames=list(NULL, c("alpha", "beta")))
# I thought the following would reproduce the problem.
x[x[,1] < 3, 2]
Get bigmemory to build for Travis-CI
We need to fix some documentation: https://travis-ci.org/kaneplusplus/bigmemory
big.matrix doesn’t store row names
Someone else described the same problem in stackoverflow:
http://stackoverflow.com/questions/12576735/bigmemory-and-rownames-dimnames-of-matrix
Current state on Windows unclear
Hi Michael,
I see that a lot of work has been done since this question was last asked (see #2), and I had no problems installing the package from CRAN on a virtual machine running Windows, so I am wondering whether there are any remaining issues with Windows support. http://bigmemory.org still mentions that "updating bigmemory with restored support for Windows" is a work in progress.
Thanks,
Alex
Get rid of startup message
Questions on dev
I'm starting to put my nose in the code of this package to
- understand it better,
- try to solve some problems,
- try to add some features.
I have some questions on some choices of implementation. It will be mostly C/C++ questions as I'm more comfortable with R. They may sound "stupid" to you.
I will use this conversation to ask my questions and hope for your answers, so that I can contribute more to this package.
Ordering Columns?
Right now we have mpermute, which will reorder rows. It seems to me we should have the option to reorder columns as well. Thoughts?
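Until an in-place option exists, a reordered copy can be produced with deepcopy and its cols argument. This is only a sketch of a workaround: unlike mpermute, it allocates a second matrix rather than permuting in place.

```r
library(bigmemory)

x <- as.big.matrix(matrix(1:6, nrow = 2))

# Desired column order; the copy takes column 3 first, then 1, then 2.
new_order <- c(3, 1, 2)
y <- deepcopy(x, cols = new_order)
y[, ]
```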
Resource leak when shared big.matrix used with multicore
library(bigmemory)
library(doMC)
registerDoMC(cores=2)
a = big.matrix(nrow=3, ncol=3, shared=TRUE)
desc = describe(a)
foreach (i=1:3) %dopar% {
m = attach.big.matrix(desc)
m[i,1] = i
NULL
}
causes a shared memory resource leak. The following does not.
library(bigmemory)
library(doMC)
registerDoMC(cores=2)
a = big.matrix(nrow=3, ncol=3, shared=TRUE)
desc = describe(a)
foreach (i=1:3) %dopar% {
m = attach.big.matrix(desc)
m[i,1] = i
rm(m)
gc()
NULL
}
What are the current limitations for windows?
I am well aware that bigmemory is not currently installable for Windows. I was wondering what are the current reasons that it is limited? What about Windows is preventing the package from being platform independent?
big.matrix alters R's RNG seeds
If I run the following sequence of function calls multiple times, the outputs are always the same:
set.seed(123)
x <- rbinom(25, 1, 0.2)
m <- matrix(1, nrow = 5, ncol = 5)
rbinom(25, 1, 0.2)
but if I use big.matrix, then the calls are different each time:
set.seed(123)
x <- rbinom(25, 1, 0.2)
m <- big.matrix(nrow = 5, ncol = 5, init = 1)
rbinom(25, 1, 0.2)
Any idea why this is happening? I suspected somewhere in the C++ code a 'random' function is called but I couldn't find anything to that effect.
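Whatever the cause turns out to be, a workaround is to save R's RNG state before creating the big.matrix and restore it afterwards. A minimal sketch; the helper with_preserved_seed is hypothetical and not part of bigmemory:

```r
library(bigmemory)

# Hypothetical helper: evaluate an expression, then put the RNG state back.
with_preserved_seed <- function(expr) {
  has_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (has_seed) {
    old_seed <- get(".Random.seed", envir = globalenv(), inherits = FALSE)
    on.exit(assign(".Random.seed", old_seed, envir = globalenv()))
  }
  force(expr)
}

set.seed(123)
x <- rbinom(25, 1, 0.2)
m <- with_preserved_seed(big.matrix(nrow = 5, ncol = 5, init = 1))
y <- rbinom(25, 1, 0.2)

# Reference run without any big.matrix in between.
set.seed(123)
x_ref <- rbinom(25, 1, 0.2)
y_ref <- rbinom(25, 1, 0.2)
identical(y, y_ref)
```

This hides the symptom rather than explaining it, so it is best treated as a stopgap for reproducible scripts.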
Windows won't create consecutive filebacked.big.matrix objects
For some strange reason, Windows builds but errors out when you try to build a filebacked.big.matrix object a second time. This results in problems when trying to run the testing framework. The following reproduces the problem:
z <- filebacked.big.matrix(3, 3, type='integer', init=123,
backingfile="example.bin",
descriptorfile="example.desc",
dimnames=list(c('a','b','c'), c('d', 'e', 'f')))
z <- filebacked.big.matrix(3, 3, type='integer', init=123,
backingfile="example.bin",
descriptorfile="example.desc",
dimnames=list(c('a','b','c'), c('d', 'e', 'f')))
Error in CreateFileBackedBigMatrix(as.character(backingfile), as.character(backingpath), :
Problem creating filebacked matrix.
So this is clearly having issues with the create function on line 2558 of bigmemory.cpp. I have no idea at the moment why this is happening, as it is not reproducible on Linux.
allow.duplicates not working?
From an email to bigmemoryauthors:
I've been trying to use the "bigmemory" R package but I'm having some issues when trying to remove all duplicated rows (across all columns) from a big.matrix. According to the manual I should be able to use mpermute with "allow.duplicates = FALSE", but I don't think it is working. For example:
m = matrix(as.double(as.matrix(iris)), nrow=nrow(iris))
x=m[,c(1,2)]
mpermute(x, cols=c(1:ncol(x)), allow.duplicates=FALSE)
I still get duplicated lines (e.g. 134, 135 and 136).
Is it a bug? Or am I doing something wrong?
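As far as the mpermute documentation describes it, allow.duplicates appears to control whether the order vector may repeat a row index, rather than acting as a de-duplication switch. A possible workaround, assuming the matrix fits in RAM long enough to call base R's duplicated(), is to compute the rows to keep and copy only those:

```r
library(bigmemory)

# Rows 1 and 3 are identical.
m <- matrix(c(1, 2, 1,
              3, 4, 3), nrow = 3)
x <- as.big.matrix(m)

# x[, ] pulls the data into an ordinary R matrix for duplicated().
keep <- which(!duplicated(x[, ]))

# Copy only the first occurrence of each distinct row.
y <- deepcopy(x, rows = keep)
y[, ]
```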
`UndefinedBehaviorSanitizer`: object of type `SharedMemoryBigMatrix` is not `BigMatrix`
Hi @kaneplusplus ,
I use library(bigmemory) in my package bigKRLS (with @rbshaffer) (and it's been highly effective, increasing what a typical laptop can do fivefold). bigKRLS also uses Rcpp and RcppArmadillo. bigKRLS uses library(parallel) for the marginal effects. (bigKRLS regresses y on some matrix X. If ncol(X) = p, bigKRLS uses p or Ncores - 2, whichever is less.)
When I submit to CRAN, I get the following error:
> N <- 500 # proceed with caution above N = 5,000 for system with 8 gigs made available to R
> P <- 4
> X <- matrix(rnorm(N*P), ncol=P)
> b <- 1:P
> y <- sin(X[,1]) + X %*% b + rnorm(N)
> out <- bigKRLS(y, X, Ncores=1)
gauss_kernel.cpp:38:40: runtime error: member call on address 0x6120001ac8c0 which does not point to an object of type 'BigMatrix'
0x6120001ac8c0: note: object is of type 'SharedMemoryBigMatrix'
25 00 80 47 38 80 92 67 87 7f 00 00 04 00 00 00 00 00 00 00 f4 01 00 00 00 00 00 00 f4 01 00 00
Similar errors repeat for the other function calls.
This means that X is coerced to a big.matrix by as.big.matrix() without issue, but by the time the Rcpp function is called there is apparently some slippage between SharedBigMatrix and BigMatrix. Any thoughts or suggestions on how to address this issue? (Also, please let me know if more detail re: bigKRLS would be helpful.)
Add Coveralls Testing
object of type 'S4' is not subsettable when using dim names for subsetting
Hi,
the newest version of the bigmemory package on CRAN (4.5.28) introduced an issue. big.matrix used to support subsetting by dim names, e.g. row and column names. Now this leads to the error: object of type S4 not subsettable
Example:
test_big_matrix <- big.matrix(nrow = 5, ncol = 5,
dimnames = list(
c("A", "B", "C", "D", "E"),
c("F", "G", "H", "I", "J")))
test_big_matrix[,] <- 0 #works
test_big_matrix[c("A", "D"), c("H", "J")] <- 1 #used to work
I also checked whether setting this option helps, but it did not:
$bigmemory.allow.dimnames
[1] TRUE
Missing GetIndivMatrixElements in RcppExports
When building I'm getting the following note:
GetIndivElements.bm: no visible global function definition for
‘GetIndivMatrixElements’
It looks like this is happening because GetIndivElements.bm in bigmemory.R is calling GetIndivMatrixElements, which isn't defined.
I think there should be a GetIndivMatrixElements in RcppExports.R that calls GetIndivMatrixElements in bigmemory.cpp. However, I'm a little bit confused by the signature.
bigmemory is not installing on the Mac.
How can I create a shared big.matrix without using the backing file?
How can I create a shared big.matrix without using the backing file so as to force the matrix to always be used in memory? The reason I wish to do this is that the big.matrix seems to use a swap file when it thinks it needs to, drastically slowing performance, so I would like to make swap file usage off limits.
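For reference, a shared big.matrix without a backing file can be requested by leaving out backingfile and passing shared = TRUE. A minimal sketch; note that whether the operating system swaps those pages out under memory pressure is outside the package's control, so this does not by itself rule out swapping:

```r
library(bigmemory)

# No backingfile argument: the matrix lives in shared memory only.
x <- big.matrix(1000, 10, type = "double", init = 0, shared = TRUE)
is.filebacked(x)

# A second handle on the same memory, obtained through the descriptor.
y <- attach.big.matrix(describe(x))
y[1, 1] <- 1
x[1, 1]
```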
attaching a file backed big matrix fails when using '~' in backingpath on OS X
Currently I am on R 3.3.2, with bigmemory 4.5.19, OS X 10.11.6.
I noticed that when I use the tilde ('~') in a backing path of a file backed big matrix, attaching the matrix fails. It's a minor issue, I can work around it, but I think it's worth sharing with the community since it took me quite some time to find out what went wrong.
Here is the code that reproduces the error:
library(bigmemory)
options(bigmemory.typecast.warning=FALSE)
# create a simple filebacked matrix
BM = filebacked.big.matrix(10, 10, type = "integer", init = 5,
backingfile = 'M.bin',
backingpath = '~/Documents', descriptorfile='M.desc')
BMdescription <- describe(BM)
# attach it from the backing file
y <- attach.big.matrix(BMdescription, backingpath='~/Documents')
and the error:
Error in attach.resource(obj, path = list(...)[["backingpath"]], ...) :
Fatal error in attach: big.matrix could not be attached.
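A workaround is to expand the tilde in R before the path reaches bigmemory. The sketch below uses a hypothetical helper, safe_backingpath, and substitutes tempdir() for '~/Documents' so that it runs on any machine:

```r
library(bigmemory)
options(bigmemory.typecast.warning = FALSE)

# Hypothetical helper: expand "~" and return an absolute path.
safe_backingpath <- function(path) {
  normalizePath(path.expand(path), mustWork = FALSE)
}

# In the report above this would be safe_backingpath("~/Documents").
backing_dir <- safe_backingpath(tempdir())

BM <- filebacked.big.matrix(10, 10, type = "integer", init = 5,
                            backingfile = "M.bin",
                            backingpath = backing_dir,
                            descriptorfile = "M.desc")

# Attach from the descriptor using the already expanded path.
y <- attach.big.matrix(describe(BM), backingpath = backing_dir)
y[1, 1]
```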
Memory leak in bigmemory?
Hi bigmemory authors,
I am resubmitting my R package biglasso, which depends on bigmemory, to CRAN, and noticed that the Memtest with clang-UBSAN reveals some strange runtime errors related to the BigMatrix object, though I didn't encounter any errors with R CMD check --as-cran, nor with win-builder.
I attached some output below. Detailed output can be found here.
Could you clarify how to fix this? In addition, my package depends on bigmemory (>= 4.0.0); should I change the dependency to the latest version, say >= 4.5.0?
> cleanEx()
> nameEx("biglasso")
> ### * biglasso
>
> flush(stderr()); flush(stdout())
>
> ### Name: biglasso
> ### Title: Fit lasso penalized regression path for big data
> ### Aliases: biglasso
>
> ### ** Examples
>
> ## Linear regression
> data(colon)
> X <- colon$X
> y <- colon$y
> X.bm <- as.big.matrix(X)
> # lasso, default
> par(mfrow=c(1,2))
> fit.lasso <- biglasso(X.bm, y, family = 'gaussian')
gaussian_hsr.cpp:87:17: runtime error: member call on address 0x6120001bd540 which does not point to an object of type 'BigMatrix'
0x6120001bd540: note: object is of type 'SharedMemoryBigMatrix'
36 00 80 0f 38 c1 ae 75 ad 7f 00 00 d0 07 00 00 00 00 00 00 3e 00 00 00 00 00 00 00 3e 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
vptr for 'SharedMemoryBigMatrix'
SUMMARY: AddressSanitizer: undefined-behavior gaussian_hsr.cpp:87:17 in
/data/gannet/ripley/R/test-clang/bigmemory/include/bigmemory/BigMatrix.h:44:37: runtime error: member access within address 0x6120001bd540 which does not point to an object of type 'const BigMatrix'
0x6120001bd540: note: object is of type 'SharedMemoryBigMatrix'
36 00 80 0f 38 c1 ae 75 ad 7f 00 00 d0 07 00 00 00 00 00 00 3e 00 00 00 00 00 00 00 3e 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
vptr for 'SharedMemoryBigMatrix'
SUMMARY: AddressSanitizer: undefined-behavior /data/gannet/ripley/R/test-clang/bigmemory/include/bigmemory/BigMatrix.h:44:37 in
/data/gannet/ripley/R/test-clang/bigmemory/include/bigmemory/MatrixAccessor.hpp:37:39: runtime error: member call on address 0x6120001bd540 which does not point to an object of type 'BigMatrix'
0x6120001bd540: note: object is of type 'SharedMemoryBigMatrix'
36 00 80 0f 38 c1 ae 75 ad 7f 00 00 d0 07 00 00 00 00 00 00 3e 00 00 00 00 00 00 00 3e 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
vptr for 'SharedMemoryBigMatrix'
SUMMARY: AddressSanitizer: undefined-behavior /data/gannet/ripley/R/test-clang/bigmemory/include/bigmemory/MatrixAccessor.hpp:37:39 in
/data/gannet/ripley/R/test-clang/bigmemory/include/bigmemory/BigMatrix.h:41:28: runtime error: member access within address 0x6120001bd540 which does not point to an object of type 'BigMatrix'
0x6120001bd540: note: object is of type 'SharedMemoryBigMatrix'
36 00 80 0f 38 c1 ae 75 ad 7f 00 00 d0 07 00 00 00 00 00 00 3e 00 00 00 00 00 00 00 3e 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
vptr for 'SharedMemoryBigMatrix'
SUMMARY: AddressSanitizer: undefined-behavior /data/gannet/ripley/R/test-clang/bigmemory/include/bigmemory/BigMatrix.h:41:28 in
/data/gannet/ripley/R/test-clang/bigmemory/include/bigmemory/MatrixAccessor.hpp:38:23: runtime error: member call on address 0x6120001bd540 which does not point to an object of type 'BigMatrix'
0x6120001bd540: note: object is of type 'SharedMemoryBigMatrix'
36 00 80 0f 38 c1 ae 75 ad 7f 00 00 d0 07 00 00 00 00 00 00 3e 00 00 00 00 00 00 00 3e 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
vptr for 'SharedMemoryBigMatrix'
read from gzfile
Hi,
Thank you for this useful package. I usually read my matrices from text files using read.big.matrix. I was wondering whether it would be possible to support input from gzfiles?
Best regards,
Marc
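In the meantime, one option is to decompress to a temporary file with base R and pass that file to read.big.matrix. A sketch under the assumption that there is enough disk space for the uncompressed text; the helper read_big_matrix_gz and its chunk_lines parameter are hypothetical:

```r
library(bigmemory)

# Hypothetical helper: decompress a gzipped text file to a temporary file in
# chunks, then hand that file to read.big.matrix().
read_big_matrix_gz <- function(gz_path, ..., chunk_lines = 100000L) {
  tmp <- tempfile(fileext = ".txt")
  on.exit(unlink(tmp), add = TRUE)
  inp <- gzfile(gz_path, open = "rt")
  out <- file(tmp, open = "wt")
  repeat {
    chunk <- readLines(inp, n = chunk_lines)
    if (length(chunk) == 0L) break
    writeLines(chunk, out)
  }
  close(inp)
  close(out)
  read.big.matrix(tmp, ...)
}

# Usage: write a small gzipped CSV, then read it back.
gz_path <- tempfile(fileext = ".csv.gz")
con <- gzfile(gz_path, open = "wt")
write.table(matrix(1:6, nrow = 3), con, sep = ",",
            row.names = FALSE, col.names = FALSE)
close(con)

x <- read_big_matrix_gz(gz_path, sep = ",", header = FALSE, type = "integer")
x[, ]
```

Reading in chunks keeps the R-side memory use bounded, which matters for the file sizes this package targets.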
Binary backing file not consistent
Here is the code:
# test big.matrix
x = matrix(4, 2, 2)
y = as.big.matrix(backingfile = "test.bin", descriptorfile="test.desc", backingpath = "/tmp", binarydescriptor=TRUE, x = x)
y[, ]
On Ubuntu 15.04:
kaiyin@tron 18:58:11 | /tmp =>
xxd -b test.bin
0000000: 00000000 00000000 00000000 00000000 00000000 00000000 ......
0000006: 00010000 01000000 00000000 00000000 00000000 00000000 .@....
000000c: 00000000 00000000 00010000 01000000 00000000 00000000 ...@..
0000012: 00000000 00000000 00000000 00000000 00010000 01000000 .....@
0000018: 00000000 00000000 00000000 00000000 00000000 00000000 ......
000001e: 00010000 01000000
On Mac OS 10.10:
kaiyin@kaiyins-mbp 18:59:48 | /tmp =>
xxd -b test.bin
0000000: 00000000 00000000 00000000 00000000 00000000 00000000 ......
0000006: 00010000 01000000 00000000 00000000 00000000 00000000 .@....
000000c: 00000000 00000000 00010000 01000000 00000000 00000000 ...@..
0000012: 00000000 00000000 00000000 00000000 00010000 01000000 .....@
0000018: 00000000 00000000 00000000 00000000 00000000 00000000 ......
000001e: 00010000 01000000 00000000
As you can see, there is an extra null byte on mac.
Can we replace bigmemory with a small ALTREP?
Add the size of the allocation in the Create* functions in BigMatrix.cpp
RProtect Possible Stack Imbalance
See here.
float type matrices?
Given that the focus of this package is to use the minimal amount of memory as efficiently as possible, I believe it should include big.matrix objects of type float. There may be situations in which single precision is sufficient and would therefore need only about half the space of a double matrix. I know R does not have any single precision data type, but seeing how all the 'heavy lifting' is done in C++, it seems like this should be approachable. However, I would like to have additional opinions on the following points before I begin writing a bunch of code.
- Naturally, do you agree that float type matrices should be a part of this package?
- The current structure has the matrix_type representing the byte size of each type. Continuing this with float types would lead to a conflict in various case statements. The only solution I could come up with off the top of my head was to add another field to the big.matrix object that is more specific to the data type and not the byte size, e.g. pMat->matrix_data_type() would return a string (i.e. 'int', 'float', 'double', etc.). This would lead to other code requiring updating, such as the Rcpp Gallery posts, unless a more elegant solution can be conceived.
- Approaches would likely involve the use of typeid from <typeinfo>, unless we also want to begin moving towards C++11 standards, where we could use the newer decltype. That is possibly a moot point here, but it is worth beginning to think about C++11.
Any thoughts are appreciated :)
Is it possible to access a memory-backed bigmemory deterministically?
I want to use a big.matrix as a means of communication between two unrelated R processes. A big.matrix can be shared through its descriptor file, but the descriptor file has to be sent to the other process, so there is a chicken-and-egg problem.
One way of solving the problem is to save the descriptor file in a mutually agreed location in the file system.
Is there any way to get access to the big.matrix without using the filesystem? Something like a "named big.matrix" that works similarly to named mutexes?
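The agreed-location approach mentioned above can be sketched as follows. The path desc_path is a hypothetical agreed location, and both "processes" are shown in one session for brevity; the creating process has to keep its matrix alive for as long as others need to attach.

```r
library(bigmemory)

# Hypothetical agreed location; both processes must know this path.
desc_path <- file.path(tempdir(), "shared_matrix.desc")

# Process A: create an in-memory shared matrix and publish its descriptor.
a <- big.matrix(3, 3, type = "double", init = 0, shared = TRUE)
dput(describe(a), file = desc_path)

# Process B (shown in the same session here): read the descriptor and attach.
b <- attach.big.matrix(dget(desc_path))
b[2, 2] <- 7
a[2, 2]
```

Only the small descriptor touches the file system here; the matrix data itself stays in shared memory.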
Winbuilder notes to fix in CRAN submission.
* using log directory 'd:/RCompile/CRANguest/R-devel/bigmemory.Rcheck'
* using R Under development (unstable) (2017-03-24 r72390)
* using platform: x86_64-w64-mingw32 (64-bit)
* using session charset: ISO8859-1
* checking for file 'bigmemory/DESCRIPTION' ... OK
* this is package 'bigmemory' version '4.5.22'
* checking CRAN incoming feasibility ... Note_to_CRAN_maintainers
Maintainer: 'Michael J. Kane <[email protected]>'
* checking package namespace information ... OK
* checking package dependencies ... NOTE
Package which this enhances but not available for checking: 'synchronicity'
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking whether package 'bigmemory' can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking 'build' directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* loading checks for arch 'i386'
** checking whether the package can be loaded ... OK
** checking whether the package can be loaded with stated dependencies ... OK
** checking whether the package can be unloaded cleanly ... OK
** checking whether the namespace can be loaded with stated dependencies ... OK
** checking whether the namespace can be unloaded cleanly ... OK
** checking loading without being on the library search path ... OK
** checking use of S3 registration ... OK
* loading checks for arch 'x64'
** checking whether the package can be loaded ... OK
** checking whether the package can be loaded with stated dependencies ... OK
** checking whether the package can be unloaded cleanly ... OK
** checking whether the namespace can be loaded with stated dependencies ... OK
** checking whether the namespace can be unloaded cleanly ... OK
** checking loading without being on the library search path ... OK
** checking use of S3 registration ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking line endings in C/C++/Fortran sources/headers ... OK
* checking compiled code ... NOTE
File 'bigmemory/libs/i386/bigmemory.dll':
Found no calls to: 'R_registerRoutines', 'R_useDynamicSymbols'
File 'bigmemory/libs/x64/bigmemory.dll':
Found no calls to: 'R_registerRoutines', 'R_useDynamicSymbols'
It is good practice to register native routines and to disable symbol
search.
See 'Writing portable packages' in the 'Writing R Extensions' manual.
* checking sizes of PDF files under 'inst/doc' ... OK
* checking installed files from 'inst/doc' ... OK
* checking files in 'vignettes' ... OK
* checking examples ...
** running examples for arch 'i386' ... [2s] OK
** running examples for arch 'x64' ... [2s] OK
* checking for unstated dependencies in 'tests' ... OK
* checking tests ...
** running tests for arch 'i386' ... [4s] OK
Running 'testthat.R' [4s]
** running tests for arch 'x64' ... [7s] OK
Running 'testthat.R' [6s]
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in 'inst/doc' ... OK
* checking re-building of vignette outputs ... [6s] OK
* checking PDF version of manual ... OK
* DONE
Status: 2 NOTEs
Support for n-dimensional arrays?
I was wondering if you would consider supporting n-dimensional arrays in your package. It seems it would be relatively straightforward to do so. ff does this, but it doesn't provide a C++-level interface that would allow me to work with ff objects from Rcpp, as bigmemory does.
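Until such support exists, an n-dimensional array can be emulated on top of a 2-D big.matrix by folding the trailing dimensions into the columns. A minimal sketch for three dimensions; the accessor get3 is hypothetical and not part of bigmemory:

```r
library(bigmemory)

dims <- c(2, 3, 4)
arr <- array(seq_len(prod(dims)), dim = dims)

# Flatten dimensions 2 and 3 into the columns of a 2-D big.matrix.
# R stores arrays in column-major order, so the reshape keeps element order.
bm <- as.big.matrix(matrix(arr, nrow = dims[1]))

# Hypothetical accessor: element (i, j, k) lives in column j + (k - 1) * d2.
get3 <- function(bm, i, j, k, d2) bm[i, j + (k - 1) * d2]

get3(bm, 2, 3, 4, d2 = dims[2])
arr[2, 3, 4]
```

The same index arithmetic can be applied on the C++ side through a MatrixAccessor, which is what makes this layout usable from Rcpp.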
Using macros instead of extensive dispatch functions.
I've just come across the Cleaner Generic Functions with RCPP_RETURN Macros which are implemented there.
I think, with some tweaking, we could use macros for MatrixAccessors and the types raw, char, short, int, float, double, complex in order to factorize code, instead of extensive dispatch functions.
Yet, I don't know anything about C++ macros.
@eddelbuettel @nathan-russell Could this be feasible?
Add Travis-CI support
as.big.matrix() with type "raw"
I tried:
> x <- matrix(as.raw(sample(0:255, 100)), 10, 10)
> class(x)
[1] "matrix"
> typeof(x)
[1] "raw"
> as.big.matrix(x, type = "raw")
Error in SetMatrixElements(x@address, as.double(j), as.double(i), value) :
RAW() can only be applied to a 'raw', not a 'double'
In addition: Warning messages:
1: In as.big.matrix(x, type = "raw") : Casting to numeric type
2: In SetElements.bm(x, i, j, value) :
Assignment will down cast from double to raw
Hint: To remove this warning type: options(bigmemory.typecast.warning=FALSE)
Am I doing something wrong or is it a missing implementation?
object of type 'S4' is not subsettable
a <- big.matrix(3, 3)
m <- matrix(1:6, 2, 3)
a[1:2, ] <- m
now gives
Error in a[1:2, ] <- m : object of type 'S4' is not subsettable
This seems to be a problem only when accessing a subset of rows.
It worked in earlier versions (don't know which though).
I now use a[1:2, ] <- as.numeric(m), which is less straightforward but works.
How to use big.matrix with raw bytes?
It seems that a big.matrix of type char is... char, i.e. a signed byte. In R a byte is unsigned (there is no signed byte in R).
How to store a byte in a big.matrix?
Or: how to make this test pass?
test_that("Reading and writing byte>=128 on memory-backed file",{
m<-big.matrix(10,1,type='char')
m[4,1]<-as.raw(130)
expect_equal(m[4,1],130)
})
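One way to sidestep the sign issue, at the cost of two bytes per element instead of one, is to store the byte values in a matrix of type short. A sketch of that trade-off, not a fix for the char type itself:

```r
library(bigmemory)
options(bigmemory.typecast.warning = FALSE)

# Two bytes per element, but the full 0..255 range fits without wrap-around.
m <- big.matrix(10, 1, type = "short", init = 0)

# Convert the raw byte to an integer before storing it.
m[4, 1] <- as.integer(as.raw(130))
m[4, 1]

# Convert back to raw when reading.
as.raw(m[4, 1])
```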
Define Location of 'backing counter' files?
When I create a given big.matrix object, a series of associated files (counter, counter_mutex, etc.) are created in a system-defined temporary directory. For example, I have seen /dev/shm and /run/shm. Is there a way to define an alternate directory where these files could go? If not, perhaps an option similar to the one available in bigalgebra with options(bigalgebra.tempdir) would be preferable. This way a user can place these files in a defined location and deal with them separately from other temp files if they so wish.
deepcopy appears to drop dimnames when `cols` argument used
Even when options(bigmemory.allow.dimnames = TRUE) is set, it appears that deepcopy is dropping dimnames (notably column names) when you wish to subset the columns.
For example:
df <- data.frame(A = seq(2), B = rnorm(2))
bm <- as.big.matrix(df)
dm <- deepcopy(bm, cols = 1)
colnames(bm)
[1] "A" "B"
colnames(dm)
NULL
This is not the case though if the cols argument is not used:
dm <- deepcopy(bm)
colnames(bm)
[1] "A" "B"
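Until deepcopy carries the names over, they can be reassigned by hand after the copy. A minimal sketch, assuming bigmemory.allow.dimnames is enabled so that column names may be set:

```r
library(bigmemory)
options(bigmemory.allow.dimnames = TRUE)

df <- data.frame(A = seq(2), B = rnorm(2))
bm <- as.big.matrix(df)

cols <- 1
dm <- deepcopy(bm, cols = cols)

# Restore the column names of the selected columns if they were dropped.
if (is.null(colnames(dm))) {
  colnames(dm) <- colnames(bm)[cols]
}
colnames(dm)
```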
convert a sparse matrix to a big.matrix.
When I use biglasso, my data is a sparse matrix (a class from the Matrix package). biglasso seems to support only big.matrix, and I cannot convert a sparse matrix to a big.matrix. Any suggestions? Thanks.
Test coverage is low
@kaneplusplus The bigmemory package has low test coverage, so it is easy to add a feature that breaks something else. (I will add the quoted m[1:2,1] == c(NA, -5) to the tests.)
When I was browsing the code, I found a lot of repetition, which can easily introduce errors as the package evolves. I am willing to put some time and effort into writing unit tests and refactoring the code to eliminate the redundancies and possible bugs. But since it is your project, I would like to first discuss your vision of this package with you, so I don't waste my time doing things that would ultimately get rejected.
Can we talk via VoIP (e.g. Skype) sometime?
Add Appveyor CI
timeline for big.sparse.matrix support?
Thanks for this great package. My package depends on bigmemory, and now I wish to add support for sparse matrices to it. I noticed that you put big.sparse.matrix on your wish list. I am just wondering whether you have any plan to implement it, and any expected timeline?
Of course I can use sparse matrices from the Eigen or Armadillo libraries. But those don't support memory mapping, which is the key feature I need.
Do big.matrices know where they are filebacked?
Is there a way to know where a filebacked big.matrix is stored on disk (the directory)?
If not, I think it should be easy to add one slot to the big.matrix object with its stored backingpath so that we can directly attach or sub a big.matrix without asking the user to specify the directory (backingpath). Or maybe add it to the description object instead?
Do you want to do it? I think I can do it if you want to.
If not, I will have to make an object that extends a big.matrix.