biobroom's People

Contributors

aaronwolen, ajbass, jdstorey, laurentgatto, ltobalina

biobroom's Issues

CRAN Check Failure for Upcoming broom Release

Hi there! The broom dev team just ran reverse dependency checks on the upcoming broom 0.7.0 release and found new errors/test failures for the CRAN version of this package. I've pasted the results below, which seem to result from our decision to no longer export the fix_data_frame() function (for maintainability purposes).

  • checking tests ...
     ERROR
    Running the tests in ‘tests/testthat.R’ failed.
    Last 13 lines of output:
      Backtrace:
       1. generics::tidy(dds)
       2. biobroom::tidy.EList(dds)
       3. biobroom:::tidy_matrix(x$E)
       7. broom::fix_data_frame
       8. base::getExportedValue(pkg, name)
      
      ══ testthat results  ═══════════════════════════════════════════════════════════
      [ OK: 33 | SKIPPED: 0 | WARNINGS: 0 | FAILED: 3 ]
      1. Error: limma tidier works as expected (@test-limma_tidiers.R#5) 
      2. Error: voom tidier adds weight column (@test-limma_tidiers.R#26) 
      3. Error: voomWithQualityWeights tidier adds weight and sample.weight columns (@test-limma_tidiers.R#49) 
      
      Error: testthat unit tests failed
      Execution halted
    

I've pasted the most recently exported function definition below as a place to start from in making the necessary fixes.🙂

fix_data_frame <- function(x, newnames = NULL, newcol = "term") {
  if (!is.null(newnames) && length(newnames) != ncol(x)) {
    stop("newnames must be NULL or have length equal to number of columns")
  }

  if (all(rownames(x) == seq_len(nrow(x)))) {
    # don't need to move rownames into a new column
    ret <- data.frame(x, stringsAsFactors = FALSE)
    if (!is.null(newnames)) {
      colnames(ret) <- newnames
    }
  }
  else {
    ret <- data.frame(
      ...new.col... = rownames(x),
      unrowname(x),
      stringsAsFactors = FALSE
    )
    colnames(ret)[1] <- newcol
    if (!is.null(newnames)) {
      colnames(ret)[-1] <- newnames
    }
  }
  as_tibble(ret)
}

We hope to submit this new version of the package to CRAN in the coming weeks. If you encounter any problems fixing these issues, please feel free to reach out!
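
For a package that called broom::fix_data_frame(), one option is to carry a private copy. The following is a minimal sketch under stated assumptions, not broom's own code: the name fix_data_frame_local is hypothetical, broom's internal unrowname() helper is replaced by resetting the row names to NULL, and the result is a plain data.frame rather than a tibble (wrap it in tibble::as_tibble() to match the original return type).

```r
# Hypothetical local replacement for broom::fix_data_frame() (base R only).
fix_data_frame_local <- function(x, newnames = NULL, newcol = "term") {
  if (!is.null(newnames) && length(newnames) != ncol(x)) {
    stop("newnames must be NULL or have length equal to number of columns")
  }

  rn <- rownames(x)
  # Missing or default (1, 2, ..., n) row names carry no information to keep.
  if (is.null(rn) || all(rn == seq_len(nrow(x)))) {
    ret <- data.frame(x, stringsAsFactors = FALSE)
    if (!is.null(newnames)) colnames(ret) <- newnames
  } else {
    body <- data.frame(x, stringsAsFactors = FALSE)
    rownames(body) <- NULL  # stands in for broom's internal unrowname()
    ret <- data.frame(rn, body, stringsAsFactors = FALSE)
    colnames(ret)[1] <- newcol
    if (!is.null(newnames)) colnames(ret)[-1] <- newnames
  }
  ret
}

# Usage: row names of a small matrix move into a "gene" column.
m <- matrix(1:4, nrow = 2, dimnames = list(c("geneA", "geneB"), c("s1", "s2")))
tidy_m <- fix_data_frame_local(m, newcol = "gene")
print(tidy_m)
```

Keeping the copy unexported inside the dependent package avoids relying on broom internals again.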

tbl_df() is deprecated

Running biobroom::tidy on a DESeq2 object, I saw the warning:

Warning message:
`tbl_df()` is deprecated as of dplyr 1.0.0.
Please use `tibble::as_tibble()` instead.

This was with biobroom v1.20.0, and dplyr v1.0.2.

Ideally, biobroom would be updated to avoid this warning.

Thank you!

typo in augment.DGEList leads to error

augment.DGEList always stops with the error
stop("No columns to augment in DGEList")
regardless of input.

Change
if (is.null(names(list())))
to
if (is.null(names(ret)))
on line 13 of the function augment.DGEList.
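
Why the original condition can never be false can be seen in a short base R sketch. The `ret` object below is a made-up stand-in for whatever augment.DGEList has assembled at that point, not the function's real intermediate value.

```r
# Hypothetical stand-in for the list built inside augment.DGEList.
ret <- list(weights = c(0.5, 1.5))

# A freshly created empty list never has names, so this test is always TRUE
# and the stop() branch is always taken.
always_true <- is.null(names(list()))

# Testing the object that was actually built gives the intended behavior.
ret_is_empty <- is.null(names(ret))

print(always_true)
print(ret_is_empty)
```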

Handling column names with special characters

Hi,
Thanks for the really useful package. Sometimes sample names get mangled if they contain special characters, e.g.:

> data(hammer)
> pData(hammer)
                  sample.id num.tech.reps protocol         strain     Time
SRX020102         SRX020102             1  control Sprague Dawley 2 months
SRX020103         SRX020103             2  control Sprague Dawley 2 months
SRX020104         SRX020104             1   L5 SNL Sprague Dawley 2 months
SRX020105         SRX020105             2   L5 SNL Sprague Dawley  2months
SRX020091-3     SRX020091-3             1  control Sprague Dawley  2 weeks
SRX020088-90   SRX020088-90             2  control Sprague Dawley  2 weeks
SRX020094-7     SRX020094-7             1   L5 SNL Sprague Dawley  2 weeks
SRX020098-101 SRX020098-101             2   L5 SNL Sprague Dawley  2 weeks

> tidy(hammer)
# A tibble: 236,128 x 3
                 gene    sample value
                <chr>     <chr> <int>
1  ENSRNOG00000000001 SRX020102     2
2  ENSRNOG00000000007 SRX020102     4
3  ENSRNOG00000000008 SRX020102     0
4  ENSRNOG00000000009 SRX020102     0
5  ENSRNOG00000000010 SRX020102    19
6  ENSRNOG00000000012 SRX020102     7
7  ENSRNOG00000000014 SRX020102     0
8  ENSRNOG00000000017 SRX020102     4
9  ENSRNOG00000000021 SRX020102     7
10 ENSRNOG00000000024 SRX020102    86
# ... with 236,118 more rows
> pData(hammer) %>% dplyr::filter(grepl('SRX020091',sample.id))
    sample.id num.tech.reps protocol         strain    Time
1 SRX020091-3             1  control Sprague Dawley 2 weeks

> tidy(hammer) %>% dplyr::filter(grepl('SRX020091',sample))
# A tibble: 29,516 x 3
                 gene      sample value
                <chr>       <chr> <int>
1  ENSRNOG00000000001 SRX020091.3     7
2  ENSRNOG00000000007 SRX020091.3     5
3  ENSRNOG00000000008 SRX020091.3     0
4  ENSRNOG00000000009 SRX020091.3     0
5  ENSRNOG00000000010 SRX020091.3    50
6  ENSRNOG00000000012 SRX020091.3    31
7  ENSRNOG00000000014 SRX020091.3     0
8  ENSRNOG00000000017 SRX020091.3    21
9  ENSRNOG00000000021 SRX020091.3    30
10 ENSRNOG00000000024 SRX020091.3   257
# ... with 29,506 more rows
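
The `-` to `.` substitution above matches what base R's make.names() does to syntactically invalid names. A minimal sketch of that mechanism follows, assuming (this is not confirmed from the biobroom source here) that the expression matrix passes through data.frame() with its default check.names = TRUE; the toy `counts` matrix is made up for illustration.

```r
# Sample IDs taken from the example above; the counts are placeholder zeros.
ids <- c("SRX020102", "SRX020091-3", "SRX020088-90")
counts <- matrix(0L, nrow = 2, ncol = 3, dimnames = list(c("g1", "g2"), ids))

# Default check.names = TRUE runs column names through make.names().
mangled <- colnames(data.frame(counts))

# check.names = FALSE leaves the column names exactly as given.
kept <- colnames(data.frame(counts, check.names = FALSE))

print(mangled)
print(kept)
```

If this is indeed the cause, passing check.names = FALSE at the conversion step would preserve the original sample IDs.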

untidy() function

Hi,

would you be able to easily add an untidy() function, which converts the tidy object back to its original format, including any changes made to the tidy version?

Something like:
edgeR_object_tidy <- edgeR_object %>% tidy()
edgeR_object <- edgeR_object_tidy %>% untidy()

I often end up having to jump between the two formats, as the package functions need the original format.

Cheers
Jakob
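
For the expression values alone, the reverse step can be sketched in base R. This is only an illustration under assumptions: untidy_matrix is a hypothetical helper, not a biobroom function; it expects the gene/sample/value columns that tidy() returns in the examples on this page; and it rebuilds a plain numeric matrix rather than a full edgeR object.

```r
# Hypothetical helper: rebuild a gene-by-sample matrix from a long table.
untidy_matrix <- function(tidy_df, row = "gene", col = "sample", value = "value") {
  rows <- unique(tidy_df[[row]])
  cols <- unique(tidy_df[[col]])
  m <- matrix(NA_real_, nrow = length(rows), ncol = length(cols),
              dimnames = list(rows, cols))
  # Fill each (gene, sample) cell using a two-column index matrix.
  idx <- cbind(match(tidy_df[[row]], rows), match(tidy_df[[col]], cols))
  m[idx] <- tidy_df[[value]]
  m
}

# Usage with a made-up long table.
tidy_df <- data.frame(
  gene   = c("g1", "g2", "g1", "g2"),
  sample = c("s1", "s1", "s2", "s2"),
  value  = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)
m <- untidy_matrix(tidy_df)
print(m)
```

Restoring a complete object would also need the sample and gene annotation, which this sketch does not attempt.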

Add fasta format tidier?

I've been using a broom-style function to tidy seqinr::read.fasta objects. Would there be any interest in adding this to biobroom if I do a pull request?

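library(magrittr)  # provides %>%

# Note: broom::fix_data_frame() is no longer exported as of broom 0.7.0
# (see the first issue above), so this needs an older broom or a local copy.
read_fasta <- function(fasta_filename, annot = FALSE) {
    fasta <- seqinr::read.fasta(fasta_filename, as.string = TRUE)

    # Convert seqinr SeqFastadna object to data.frame
    # (subsetting keeps the sequence string and drops seqinr's extra attributes)
    fasta_df <- fasta %>%
                   sapply(function(x) x[seq_along(x)]) %>%
                   as.data.frame() %>%
                   broom::fix_data_frame(newcol = "ID", newnames = "Sequence")

    if (annot) {
        # Optionally add the FASTA header lines as an "Annot" column
        annot_df <- seqinr::getAnnot(fasta) %>%
                         sapply(function(x) x[seq_along(x)]) %>%
                         as.data.frame() %>%
                         broom::fix_data_frame(newnames = "Annot")

        fasta_df <- cbind(fasta_df, annot_df)
    }
    return(fasta_df)
}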
read_fasta('https://www.uniprot.org/uniprot/?query=PGH1&format=fasta&limit=10')

https://gist.github.com/clairemcwhite/a5e889f6192a664be45c0226d0ab5813

`tidy.DESeqTransform` method

Hi,

(firstly, thanks a lot for such a convenient package!)

I was wondering what your view is on having a tidy() method for DESeqTransform objects (returned by the rlog() and varianceStabilizingTransformation() functions)?

Here's a gist with one:
https://gist.github.com/tavareshugo/3973461a7daf8a43e65e3566d5deed14

So, this should work:

# load libraries
library(DESeq2)
library(biobroom)
library(magrittr)

# Source gist
devtools::source_gist("3973461a7daf8a43e65e3566d5deed14", filename = "tidy_DESeqTransform.R")

# Example
dds <- makeExampleDESeqDataSet(betaSD = 1)

# transformations
vst_norm <- varianceStabilizingTransformation(dds)
rlog_norm <- rlog(dds)

# tidying
tidy(vst_norm)
tidy(vst_norm, colData = TRUE)
tidy(rlog_norm)
tidy(rlog_norm, colData = TRUE)

I'm happy to fork and submit a pull request, if you think something along these lines is worth it.

Standardized location for dplyr-like methods on SE objects

The scater package implements a number of dplyr verbs for SingleCellExperiment objects, e.g., mutate. I have been trying to get rid of these functions for a while, and I was wondering whether biobroom would be a better home for them (once generalized to work on SummarizedExperiment objects).

This would be a win-win for all of us: for tidyverse/BioC users, who would no longer have to put up with masking issues (alanocallaghan/scater#74); for biobroom, which would add and centralize functionality relating to tidyverse/BioC integration; and for me, as I would no longer have to maintain these verbs that I never use.

Let me know if this is of interest - I am willing to put in a PR.
