bedmatrix's Introduction

BEDMatrix

BEDMatrix is an R package that provides a matrix-like wrapper around .bed, one of the genotype/phenotype file formats of PLINK, the whole genome association analysis toolset. BEDMatrix objects are created in R by providing the path to a .bed file. Once created, they behave much like regular matrices, with the advantage that genotypes are retrieved on demand without loading the entire file into memory. This makes it possible to handle very large files with limited memory.

This package is deliberately kept simple. For computational methods that use BEDMatrix, check out the BGData package.

Example

This example uses a dummy .bed file that is bundled with this R package. It was generated using plink --dummy 50 1000 0.02 acgt --seed 4711 --out example with PLINK 1.90 beta 3.452.

To get the path to the example .bed file (system.file finds the full file names of files in packages and is only used to find the example data):

path <- system.file("extdata", "example.bed", package = "BEDMatrix")

To wrap the example .bed file in a BEDMatrix object:

m <- BEDMatrix(path)
#> Extracting number of samples and rownames from example.fam...
#> Extracting number of variants and colnames from example.bim...

To get the dimensions of the BEDMatrix object:

dim(m)
#> [1] 50 1000

To extract a subset of the BEDMatrix object:

m[1:3, 1:5]
#>           snp0_A snp1_C snp2_G snp3_G snp4_G
#> per0_per0      0      1      1      1      0
#> per1_per1      1      1      1      1     NA
#> per2_per2      1      0      0      2      0

Installation

Install the stable version from CRAN:

install.packages("BEDMatrix")

Alternatively, install the development version from GitHub:

# install.packages("remotes")
remotes::install_github("QuantGen/BEDMatrix")

Documentation

Further documentation can be found on RDocumentation.

bedmatrix's People

Contributors

agrueneberg

bedmatrix's Issues

Eliminate compiler warnings

subsetBED.cpp:72:24: warning: variable 'mapping' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
            } else if (genotype == 1) {
                       ^~~~~~~~~~~~~
subsetBED.cpp:77:26: note: uninitialized use occurs here
            out(idx_i) = mapping;
                         ^~~~~~~
subsetBED.cpp:72:20: note: remove the 'if' if its condition is always true
            } else if (genotype == 1) {
                   ^~~~~~~~~~~~~~~~~~~
subsetBED.cpp:65:24: note: initialize the variable 'mapping' to silence this warning
            int mapping;
                       ^
                        = 0
subsetBED.cpp:176:26: warning: variable 'mapping' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
              } else if (genotype == 1) {
                         ^~~~~~~~~~~~~
subsetBED.cpp:181:35: note: uninitialized use occurs here
              out(idx_i, idx_j) = mapping;
                                  ^~~~~~~
subsetBED.cpp:176:22: note: remove the 'if' if its condition is always true
              } else if (genotype == 1) {
                     ^~~~~~~~~~~~~~~~~~~
subsetBED.cpp:169:26: note: initialize the variable 'mapping' to silence this warning
              int mapping;
                         ^
                          = 0
2 warnings generated.
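
Both warnings come from an if/else-if chain that has no final else, so the compiler cannot prove that mapping is assigned on every path. A minimal sketch of one way to silence them follows. The function name decode_genotype, the kMissing sentinel, and the specific code-to-value mapping are assumptions made for illustration and are not taken from subsetBED.cpp.

```cpp
#include <climits>

// Stand-in for R's integer NA (assumption for this sketch).
constexpr int kMissing = INT_MIN;

// Map a 2-bit .bed genotype code to an allele count.
// The mapping values are an assumption based on the PLINK encoding:
// 0 = homozygous first allele, 1 = missing, 2 = heterozygous,
// 3 = homozygous second allele.
int decode_genotype(int genotype) {
    int mapping = kMissing;  // initialized up front, so no path leaves it unset
    switch (genotype) {
        case 0: mapping = 2; break;
        case 2: mapping = 1; break;
        case 3: mapping = 0; break;
        default: break;      // code 1 (and anything unexpected) stays missing
    }
    return mapping;
}
```

Initializing the variable and adding a default branch keeps the behavior for the four valid codes unchanged while making the control flow exhaustive.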

installation error for R version 3.4.1

Hi,

I ran into the following error when installing, though the package supports R (≥ 3.0.0) according to the manual. Any advice? Thanks!

Best,
Rosie

Warning: unable to access index for repository https://mirrors.sorengard.com/cran/src/contrib:
  cannot open URL 'https://mirrors.sorengard.com/cran/src/contrib/PACKAGES'
Warning message:
package ‘BEDMatrix’ is not available (for R version 3.4.1)

Error in BEDMatrix(bd1) : File not found.

Hello, I was trying to read a PLINK .bed file in R using BEDMatrix. I am getting:

> path=system.file("extdata", "1321.maf5.indep0.1.bed", package = "BEDMatrix")
> bd2=BEDMatrix(path)
Error in BEDMatrix(path) : File not found.

But the file exists in the working directory. I also tried relative and absolute paths and always get this error. Any help? Best, Zillur

error message on convertIndex

Hi there,

When I typed

mat <- BEDMatrix("test.bed")
mat[1:2,1:2]

I received the following error message:

Error in convertIndex(x, i, "i", allowDoubles = allowDoubles) :
  could not find function "convertIndex"

Any suggestions? Thanks!

Best,
Rosie

Extract ranges more efficiently

Right now each element is extracted individually. We could just read in chunks and go from there. This should be a significant optimization for most use-cases.
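
A rough sketch of the idea, assuming the SNP-major layout in which each byte packs four genotypes with the lowest two bits first: a contiguous range of samples can be decoded by reading each byte once instead of recomputing the byte position for every element. The function decode_range, the kMissing sentinel, and the lookup values are hypothetical and not part of the package.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for R's integer NA (assumption for this sketch).
constexpr int kMissing = INT_MIN;

// Decode samples [first, first + count) of one variant.
// `column` holds the packed bytes of that variant (header already skipped).
std::vector<int> decode_range(const std::vector<std::uint8_t>& column,
                              std::size_t first, std::size_t count) {
    // Assumed mapping from 2-bit code to allele count.
    static const int lookup[4] = {2, kMissing, 1, 0};
    std::vector<int> out;
    out.reserve(count);
    std::size_t i = first;
    const std::size_t end = first + count;
    while (i < end) {
        const std::uint8_t byte = column[i / 4];  // one read serves up to 4 samples
        for (std::size_t slot = i % 4; slot < 4 && i < end; ++slot, ++i) {
            out.push_back(lookup[(byte >> (2 * slot)) & 3]);
        }
    }
    return out;
}
```

The caller is responsible for passing a range that lies within the column; the sketch does no bounds checking.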

Mutualize code

Hi,

I'm adding some features to my package {bigsnpr} (https://github.com/privefl/bigsnpr) to directly work on memory-mapped bed files.
My code is: https://github.com/privefl/bigsnpr/blob/bedpca/src/bed-acc.h.

I see that you already developed this feature in this package and wonder if we could share the code.
I also see that you directly use the memory-mapping in recent versions of R, which is nice.

How easy is it to use this package to develop C++ code that accesses bed files?
(What I need to do at the moment: https://github.com/privefl/bigsnpr/blob/bedpca/src/bed-fun.cpp#L13-L14)

BEDMatrix instance has been unmapped

Hi all,
I have been trying to run a simple chunkedApply() function using parallel processing in R, either with slurm_apply or parLapply. In both cases, I get the error "BEDMatrix instance has been unmapped". I have no problems running the function below when using serial processing.

Any idea what's causing problems here?

Many thanks for your help,
Tabea

chunkedApply(X = geno(bg), MARGIN = 2, FUN = sd, j="rs11665242_G")

Support individual-major mode

For a potential component that reads in a BEDMatrix from a RAW file through a generated BED file, it would be much easier to write the BED file in individual-major mode. The overhead of reading BED files in individual-major mode might be too large, though, so it might make more sense to write SNP-major mode.
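
To make the trade-off concrete, here is a small sketch of where a single genotype lives in each mode, assuming the PLINK layout of a 3-byte header followed by 2-bit genotypes padded to whole bytes per variant (SNP-major) or per sample (individual-major). The names BedLocation and locate are hypothetical.

```cpp
#include <cstddef>

// Position of one genotype inside a .bed file.
struct BedLocation {
    std::size_t byte_offset;  // offset from the start of the file
    unsigned shift;           // right shift that brings the 2-bit code to the low bits
};

// Locate sample i, variant j (both 0-based) in a file with n samples and p variants.
BedLocation locate(std::size_t i, std::size_t j,
                   std::size_t n, std::size_t p, bool snp_major) {
    const std::size_t header = 3;  // two magic bytes plus one mode byte
    if (snp_major) {
        const std::size_t stride = (n + 3) / 4;  // bytes per variant
        return {header + j * stride + i / 4, static_cast<unsigned>(2 * (i % 4))};
    }
    const std::size_t stride = (p + 3) / 4;      // bytes per sample
    return {header + i * stride + j / 4, static_cast<unsigned>(2 * (j % 4))};
}
```

In SNP-major mode, all samples of one variant are contiguous, so column-wise access is cheap. In individual-major mode the same holds for all variants of one sample, which is why the preferred mode depends on the dominant access pattern.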

automatically detect number of markers in .bim file

Hello Alex, the map file associated with a .bed file generally ends with .bim instead of .map. It might be good to be able to automatically detect the number of markers from the .bim file as well.

Since a .bed file is almost always accompanied by a .bim and a .fam file, it might also be good to be able to create another object that has the phenotype and map information ready in R. (Maybe the same structure as BGData?)

SNPs 0,1,2

Hi!

I am using this SNP from Illumina, exm1615904. When I use PLINK, it only uses the name of the SNP to get the genotype (e.g., in this case the normal genotype would be CC vs. GC or GG).
When using BEDMatrix, I need to add a '_X' (underscore something) to the SNP name, and then when I call the SNP it shows me 0, 1, 2, ... So I wonder what that means.

Kind regards

BEDMatrix error

Hello,

I have a .bed file that I am trying to extract genotypes from. I can read it into R just fine. However, I cannot extract any genotypes.
M<-BEDMatrix("my filename") works fine, as do "dim", "rownames", and "colnames".

If I try to extract the 1st three individuals I get:
M[1:3]
Error in convertIndex(x, i, "k") : unused argument ("k")

Running R 3.3.2. Installed package "xts".

Any help appreciated.
