coxpresdbr's People

Contributors

russhyde

coxpresdbr's Issues

`get_coex_partners` is really slow for 20k genes from a CoxpresDbDataframeAccessor

This should be really fast.

Could be implemented as:

extract_dataframe(importer) %>%
  group_by(source_id) %>%
  top_n(-how_many_are_required, mutual_rank) %>%
  ungroup()

The current implementation maps over each source gene, extracts the rows for that gene, filters to the best hits, then runs bind_rows(). Any rewrite must still respect mr_threshold etc.
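The grouped approach can be sketched on a toy table. This is only an illustration, not the package's implementation: the `top_partners` helper, the `target_id` column, and the reading of `mr_threshold` as an upper bound on `mutual_rank` are assumptions, and `slice_min()` (dplyr >= 1.0.0) stands in for `top_n()` with a negative count.

```r
library(dplyr)

# Hypothetical toy coexpression table; the real accessor holds many more rows
coex <- tibble::tibble(
  source_id   = c("a", "a", "a", "b", "b", "b"),
  target_id   = c("b", "c", "d", "a", "c", "d"),
  mutual_rank = c(1, 5, 20, 1, 3, 50)
)

# Keep the n_partners best-ranked partners per source gene in one grouped pass.
# Assumption: mr_threshold is an upper bound on mutual_rank.
top_partners <- function(coex_df, n_partners, mr_threshold = Inf) {
  coex_df %>%
    filter(mutual_rank <= mr_threshold) %>%
    group_by(source_id) %>%
    slice_min(mutual_rank, n = n_partners, with_ties = FALSE) %>%
    ungroup()
}

res <- top_partners(coex, n_partners = 2, mr_threshold = 10)
```

Filtering on the threshold before grouping means the per-gene ranking only ever sees rows that are eligible to be returned.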

replace [overwrite/rewrite]_in_bunzip2 with overwrite and rewrite arguments

Add the following arguments for pass-through to R.utils::bunzip2:

overwrite # overwrite the extracted archive if it already exists
remove # remove the compressed archive once it's extracted
skip # if already extracted, use the existing extracted archive

Or, pass arguments along as dots or provide bunzip2_args
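The dots option might look roughly like the sketch below. The wrapper name `extract_bz2` is hypothetical, and the example assumes that R.utils is installed and that `R.utils::bunzip2` forwards `overwrite`, `remove` and `skip` to its decompression routine.

```r
# Hypothetical wrapper: forward any extra arguments straight to R.utils::bunzip2
extract_bz2 <- function(archive, ...) {
  R.utils::bunzip2(archive, ...)
}

# Build a small .bz2 file to demonstrate on
tmp_dir <- tempfile("bz2_")
dir.create(tmp_dir)
archive <- file.path(tmp_dir, "example.txt.bz2")
con <- bzfile(archive, open = "w")
writeLines("hello", con)
close(con)

# Keep the compressed archive and overwrite any earlier extraction
extracted <- extract_bz2(archive, remove = FALSE, overwrite = TRUE)
```

With dots, the wrapper does not need to change if the caller later wants `skip = TRUE` instead.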

use `validity = function(x) my_validity_fn(x)` rather than `validity = my_validity_fn`

covr does not run the code for S4-object validity test functions when they are added in this format:

my_validity_fn <- function(object) {
  # blah blah blah
}
methods::setClass("className", slots = ..., validity = my_validity_fn)

I think the discussion of function-factories here explains the issue

To ensure that S4-object validity functions are run by covr when instantiating CoxpresDbPartners and CoxpresDbAccessor objects, use the following syntax:

methods::setClass("className", slots = ..., validity = function(x) my_validity_fn(x))
  • rewrite setClass for CoxpresDbPartners
  • rewrite setClass for CoxpresDbDataframeAccessor
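A self-contained toy version of the wrapped-validity pattern follows. The class `PositiveNumber` and the function `check_positive` are made up for illustration and are not part of coxpresdbr. The sketch only shows that the wrapper still enforces validity; it does not itself demonstrate the covr behaviour.

```r
# Hypothetical validity function: return TRUE or a message describing the problem
check_positive <- function(object) {
  if (length(object@value) == 1 && object@value > 0) {
    TRUE
  } else {
    "value must be a single positive number"
  }
}

# Wrap the validity function in an anonymous function, as suggested above
methods::setClass(
  "PositiveNumber",
  slots = c(value = "numeric"),
  validity = function(object) check_positive(object)
)

ok <- methods::new("PositiveNumber", value = 1)

# An invalid object fails at construction; capture the message for inspection
err <- tryCatch(
  methods::new("PositiveNumber", value = -1),
  error = function(e) conditionMessage(e)
)
```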

Like, dude, where's the README?

  • how to import a coxpresdb dataset from the files provided by coxpresdb
  • how to annotate an imported dataset
  • how to identify genes with behaviour similar to their neighbours
    • z-score based test (node-versus-neighbourNodes)
    • correlation based test (node-versus-outEdges)

remove dependence on purrr

We only use purrr::map twice in R/, so we could easily rewrite those calls and stop importing purrr. We would probably still need purrr in the tests, so move it from Imports to Suggests.
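For plain function arguments, `lapply()` from base R is a drop-in replacement for `purrr::map()`. A minimal sketch, with made-up input rather than the package's actual call sites:

```r
genes <- c("gene_a", "gene_b")

# purrr::map(genes, toupper) would return the same unnamed list
res <- lapply(genes, toupper)
```

Call sites that rely on purrr's formula shorthand (`~ .x`) would need an explicit `function(x)` instead.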

rewrite to use *.zip rather than *.tar.bz2

... since the coxpresdb.jp database is now released using .zip files.

  • convert the Spo data subset into .zip format (as we did for .bz2)
  • separate the code for importing from .bz2
  • write new tests for importing from .zip
    • function to check a .zip is a valid CoxpresDb archive
    • construction of CoxpresDbImporter
    • get the file path for a specific gene from the CoxpresDbImporter
    • get names of all genes that are defined in the CoxpresDb archive
    • import all coexpression partners for a given gene
  • write new code for importing from .zip
  • replace [overwrite/rewrite]_in_bunzip2 with overwrite and rewrite arguments

rewrite to work with a collapsed all-gene database

In the current version, the coxpresdb files look like

# file for target gene_a
gene_b    MR_ab    COR_ab
gene_c    MR_ac    COR_ac
gene_d    MR_ad    COR_ad
...

This leads to inefficient sampling: for each gene sampled, you have to re-read its coexpression data.

It would be more efficient to read in the data for all genes at one time from a single file

The file should look like

gene_a    gene_b    MR_ab    COR_ab
gene_a    gene_c    MR_ac    COR_ac
gene_a    gene_d    MR_ad    COR_ad
...
gene_b    ...
...
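One way to build such a collapsed table from the per-gene files is sketched below in base R. It assumes that each per-gene file is tab-separated with no header line and three columns. The function name `collapse_gene_files` and the column names are hypothetical.

```r
# Read each per-gene file, prepend the source gene id, and stack the results.
# Assumption: files are tab-separated, headerless, with three columns.
collapse_gene_files <- function(files, source_ids) {
  tables <- Map(
    function(path, source_id) {
      df <- utils::read.delim(
        path,
        header = FALSE,
        col.names = c("target_id", "mutual_rank", "correlation"),
        stringsAsFactors = FALSE
      )
      data.frame(source_id = source_id, df, stringsAsFactors = FALSE)
    },
    files, source_ids
  )
  out <- do.call(rbind, unname(tables))
  rownames(out) <- NULL
  out
}

# Toy per-gene files written to a temporary directory
tmp_dir <- tempfile("coex_")
dir.create(tmp_dir)
path_a <- file.path(tmp_dir, "gene_a")
path_b <- file.path(tmp_dir, "gene_b")
writeLines(c("gene_b\t1\t0.9", "gene_c\t4\t0.7"), path_a)
writeLines("gene_a\t1\t0.9", path_b)

collapsed <- collapse_gene_files(c(path_a, path_b), c("gene_a", "gene_b"))
```

Once collapsed, the data for every source gene can be read in a single pass and filtered in memory.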

generalise the input statistics: allow correlation-coefficients as input

Either

  • allow the user to pass in a gene-id-indexed matrix and get coxpresdbr to compute correlations between sources & targets
  • or pass in correlation-coefficients instead of p-values (coxpresdbr should convert corrs to z-score)
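For the second option, one common choice (not necessarily the one coxpresdbr would adopt) is the Fisher transformation, which maps a correlation r from n samples to an approximately standard-normal score z = atanh(r) * sqrt(n - 3). The function name `fisher_z` and the requirement that the caller knows the sample size are assumptions of this sketch.

```r
# Fisher transformation: convert correlation coefficients to z-scores.
# Assumption: n_samples, the number of samples behind each correlation, is known.
fisher_z <- function(r, n_samples) {
  stopifnot(all(abs(r) < 1), n_samples > 3)
  atanh(r) * sqrt(n_samples - 3)
}

z <- fisher_z(c(-0.5, 0, 0.5), n_samples = 28)
```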

? Find out alternatives for meta-analysis of correlation scores
