taylor-lab / annotatemaf

Add functional variant annotation to MAF file

License: Other

R 96.17% Python 3.83%
r bioinformatics variant-annotation mutations

annotatemaf's Introduction

annotateMaf

A set of functions to add variant annotation to a MAF file. Sources currently include OncoKB, BRCA Exchange and somatic hotspots from the Taylor Lab.

Installation

This package originally required the Python modules urllib3 and ga4gh, the latter of which only works in Python (< 3.0).
2019-02-14: The Python-based querying is disabled due to an issue. Instead of querying the API, we now use a static table.

Load and install the library this way:

devtools::install_github('taylor-lab/annotateMaf')
library(annotateMaf)

Examples

Run the functions simply with your MAF (as a data.table, not the file path) as the input.

hotspot_annotate_maf requires a VEP-annotated MAF file.

# Note that the BRCA Exchange database is geared towards germline variants but by default the variant allele in a MAF is called Tumor_Seq_Allele2
annotated_maf = brca_exchange_annotate_maf(input_maf)

# Only retain oncogenic or likely oncogenic mutations after OncoKB annotation
# (%>% and filter() come from dplyr, %like% from data.table)
input_maf %>%
    oncokb_annotate_maf() %>%
    filter(oncogenic %like% 'Oncogenic')
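
The examples above assume that input_maf is already a data.table. As a minimal sketch of getting there, the following reads a tab-separated file with data.table::fread(). The file contents and the maf_path name are made up for illustration; a real MAF has many more columns.

```r
library(data.table)

# Write a tiny tab-separated stand-in for a MAF file (hypothetical contents).
maf_path <- tempfile(fileext = ".maf")
writeLines(c(
    "Hugo_Symbol\tChromosome\tStart_Position\tVariant_Classification",
    "BRCA1\t17\t41245466\tMissense_Mutation",
    "KRAS\t12\t25398284\tMissense_Mutation"
), maf_path)

# fread() returns a data.table, which is the input type the annotation functions expect.
input_maf <- fread(maf_path, sep = "\t")
```

The resulting object, rather than maf_path itself, is what would be passed to the annotation functions.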

Annotation sources:

  • OncoKB: Queries the latest version of OncoKB. The version number is included, but querying older versions is currently not supported.
  • BRCA Exchange: Queries the latest version of BRCA Exchange. Versioning is also currently not supported.
  • Somatic hotspots: List generated from PMIDs 26619011, 29247016, 28115009. Semi-manual curation was carried out to remove false-positive germline variants that were in the oldest publication.

annotatemaf's People

Contributors

gongyixiao, kpjonsson, tischfis

annotatemaf's Issues

brca_annotate_maf doesn't work as expected

query_brca_exchange works as expected and returns three additional fields; however, brca_annotate_maf returns all three fields with NA values. I guess the function that parses the annot field returned by query_brca_exchange does not work properly.

Also, query_brca_exchange checks the start position, end position, ref and alt. This seems too stringent, as I can see a few insertion/deletion variants with missing annotation. The reason is that the convention for setting the start and end of an insertion or deletion is not well established. It might be better to check either start or end, plus ref and alt. Thanks!
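
The relaxed matching suggested above could look roughly like the following sketch. The table, its column names, and the match_relaxed() helper are all hypothetical and are not taken from the package; the point is only that start and end are combined with "or" while ref and alt must still match.

```r
library(dplyr)

# Hypothetical reference table; column names are placeholders.
brca_variants <- tibble(
    chrom = c("13", "13"),
    start = c(32914437, 32914438),
    end   = c(32914438, 32914438),
    ref   = c("GT", "T"),
    alt   = c("G", "C")
)

# Match on chromosome, ref and alt, but accept agreement on EITHER start OR end.
match_relaxed <- function(variants, q_chrom, q_start, q_end, q_ref, q_alt) {
    filter(variants,
           chrom == q_chrom,
           start == q_start | end == q_end,
           ref == q_ref,
           alt == q_alt)
}

# A deletion whose start follows a different convention (off by one) still matches via end.
hits <- match_relaxed(brca_variants, "13", 32914438, 32914438, "GT", "G")
```

Whether this is loose enough, or too loose, for real indels is exactly the open question raised in the issue.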

tidyr update causes failure when replacing NAs

In the line referenced below, an error results when using tidyr 1.2.0+.

tidyr::replace_na(list(start_residue = 0, end_residue = 0)) %>%

The error generated is as follows:
Error in vec_assign():
! Can't convert replace$start_residue to match type of data$start_residue .
Backtrace:
  1. └─global annotate_maf_with_hotspots(combined_maf)
  2.   └─annotateMaf::hotspot_annotate_maf(maf)
  3.     └─... %>% ...
  4.       ├─base::withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
  5.       └─base::eval(quote(`_fseq`(`_lhs`)), env, env)
  6.         └─base::eval(quote(`_fseq`(`_lhs`)), env, env)
  7.           └─annotateMaf `_fseq`(`_lhs`)
  8.             └─magrittr::freduce(value, `_function_list`)
  9.               └─function_list[[i]](value)
 10.                 ├─tidyr::replace_na(., list(start_residue = 0, end_residue = 0))
 11.                 └─tidyr:::replace_na.data.frame(., list(start_residue = 0, end_residue = 0))
 12.                   └─vctrs::vec_assign(...)
 13.                     └─vctrs `<fn>`()
 14.                       └─vctrs::vec_default_cast(...)
 15.                         └─vctrs::stop_incompatible_cast(...)
 16.                           └─vctrs::stop_incompatible_type(...)
 17.                             └─vctrs:::stop_incompatible(...)
 18.                               └─vctrs:::stop_vctrs(...)
 19.                                 └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = vctrs_error_call(call))

The issue is caused by the following tidyr updates:
replace_na() no longer allows the type of data to change when the replacement is applied. replace will now always be cast to the type of data before the replacement is made. For example, this means that using a replacement value of 1.5 on an integer column is no longer allowed. Similarly, replacing missing values in a list-column must now be done with list("foo") rather than just "foo".

replace_na() no longer replaces empty atomic elements in list-columns (like integer(0)). The only value that is replaced in a list-column is NULL (#1168).

This might be fixed by casting start_residue and end_residue to double in the replace_na() call.
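
A minimal sketch of that suggestion, assuming the two columns are stored as character (the issue does not state their type, and the hotspots table below is a made-up stand-in for the package's data):

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the hotspot table; the real columns may differ.
hotspots <- tibble(
    Hugo_Symbol   = c("KRAS", "TP53"),
    start_residue = c("12", NA),   # assumed character column with a missing value
    end_residue   = c("12", NA)
)

# Cast to double first, so the numeric replacement 0 matches the column type.
fixed <- hotspots %>%
    mutate(across(c(start_residue, end_residue), as.numeric)) %>%
    replace_na(list(start_residue = 0, end_residue = 0))
```

Without the mutate() step, tidyr 1.2.0+ would refuse to place a double into a character column, which is consistent with the error shown above.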

Filtering BRCA Exchange Variants not working properly

qq = dplyr::filter(brca_exchange_variants, Chr == chrom, start - 1, end + 1) %>%

The error message is as follows; it looks like a syntax error:

Error: Problem with `filter()` input `..2`.
i Input `..2` is `start - 1`.
x Input `..2` must be a logical vector, not a double.

This can be replicated with any BRCA1/2 mutation.

Correct syntax should be:

qq = dplyr::filter(brca_exchange_variants, Chr == chrom, pyhgvs_Hg37_Start == start - 1 ... )

The exact filtering criteria still need to be discussed, though. This is also mentioned in #2.
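
For illustration, here is one way every filter() argument can be turned into a logical condition. This is a sketch of one possible reading (a one-base window around the query), not the package's agreed criteria. The toy table, the query_window() helper, and the pyhgvs_Hg37_End column are assumptions; only Chr and pyhgvs_Hg37_Start appear in the issue.

```r
library(dplyr)

# Toy stand-in for the BRCA Exchange table (hypothetical values).
brca_exchange_variants <- tibble(
    Chr               = c("17", "17", "13"),
    pyhgvs_Hg37_Start = c(41245466, 41246000, 32906729),
    pyhgvs_Hg37_End   = c(41245466, 41246001, 32906729)  # assumed column name
)

# Each argument to filter() is a comparison, so each evaluates to a logical vector.
query_window <- function(variants, q_chrom, q_start, q_end) {
    filter(variants,
           Chr == q_chrom,
           pyhgvs_Hg37_Start >= q_start - 1,
           pyhgvs_Hg37_End <= q_end + 1)
}

hits <- query_window(brca_exchange_variants, "17", 41245466, 41245466)
```

The original line passed start - 1 and end + 1 as bare numbers, which newer dplyr versions reject because they are not logical vectors.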

Any help is appreciated here @md09 @tischfis @arichards2564

`brca_annotate_maf` error with newest installation

We are finding the following error with our latest docker build including taylor-lab/annotateMaf. The error is:

Error: Problem with `mutate()` column `annot`.
i `annot = purrr::pmap(...)`.
x Problem with `filter()` input `..2`.
i Input `..2` is `start - 1`.
x Input `..2` must be a logical vector, not a double.
Backtrace:
     x
  1. +-annotateMaf::brca_annotate_maf(maf)
  2. | +-`%>%`(...)
  3. | +-dplyr::mutate(...)
  4. | \-dplyr:::mutate.data.frame(...)
  5. |   \-dplyr:::mutate_cols(.data, ..., caller_env = caller_env())
  6. |     +-base::withCallingHandlers(...)
  7. |     \-mask$eval_all_mutate(quo)
  8. +-dplyr::select(., -annot)
  9. +-dplyr::mutate(...)
 10. +-purrr::pmap(...)
 11. | \-annotateMaf:::.f(...)
 12. |   +-`%>%`(...)
 13. |   +-dplyr::filter(...)
 14. |   \-dplyr:::filter.data.frame(...)
 15. |     \-dplyr:::filter_rows(.data, ..., caller_env = caller_env())
 16. |       +-base::withCallingHandlers(...)
 17. |       \-mask$eval_all_filter(dots, env_filter)
 18. +-purrr::transpose(.)
 19. +-dplyr:::abort_glue(...)
 20. | +-rlang::exec(abort, message = message, class = class, !(!(!data)))
 21. | \-(function (message = NULL, class = NULL, ..., trace = NULL, parent = NULL, ...
 22. |   \-rlang:::signal_abort(cnd)
 23. |     \-base::signalCondition(cnd)
 24. \-(function (e) ...
Execution halted

We haven't changed much about the R packages in Dockerfile but some of the dependencies are not version-controlled. I noticed that plyr == 1.8.4 and dplyr == 0.8.3 in the last working image, and the newest image has plyr == 1.8.6 and dplyr == 1.0.7. When I added a line at the end of the dockerfile to revert plyr and dplyr back to 1.8.4 and 0.8.3, the brca_annotate_maf function completed without error.

I am not sure if reverting both packages was necessary or if just one would have been sufficient. Although I was able to design a workaround, perhaps users of the package would benefit from an update that is compatible with recent versions of plyr/dplyr.
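
As a rough sketch of the workaround described above, the pinned versions can be checked from R before reinstalling anything. The pins vector uses the versions reported as working in this issue; the pin_mismatches() helper is hypothetical, and the reinstall step is left commented out because remotes::install_version() needs network access.

```r
# Versions reported as working in the issue above.
pins <- c(plyr = "1.8.4", dplyr = "0.8.3")

# Return the names of pinned packages whose installed version differs from
# the pin, or that are not installed at all.
pin_mismatches <- function(pins) {
    ok <- vapply(names(pins), function(pkg) {
        installed <- tryCatch(as.character(utils::packageVersion(pkg)),
                              error = function(e) NA_character_)
        !is.na(installed) && installed == pins[[pkg]]
    }, logical(1))
    names(pins)[!ok]
}

# Reinstall only what deviates from the pins:
# for (pkg in pin_mismatches(pins)) {
#     remotes::install_version(pkg, version = pins[[pkg]], upgrade = "never")
# }
```

Running the check for one package at a time would also show whether reverting both packages is really necessary.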

definition of truncating_mutations is missing

Hi There,

I tried to call hotspot_annotate_maf and got the error message below:
Error in Variant_Classification %in% truncating_mutations :
object 'truncating_mutations' not found

I checked your source package and didn't find anywhere that truncating_mutations is defined. I just wonder if it's possible to fix this.

thanks,
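
Until this is fixed, a possible workaround is to define truncating_mutations yourself before calling hotspot_annotate_maf, since R falls back to the global environment when a package function cannot find a variable. The list below is an assumption built from standard MAF Variant_Classification values; the definition the package intends is not shown anywhere in this issue.

```r
# ASSUMPTION: illustrative list only; the package's intended definition is unknown.
truncating_mutations <- c("Nonsense_Mutation", "Frame_Shift_Del",
                          "Frame_Shift_Ins", "Splice_Site")

# Toy MAF-like table to show how the failing expression is evaluated.
maf <- data.frame(
    Hugo_Symbol = c("TP53", "KRAS", "BRCA2"),
    Variant_Classification = c("Nonsense_Mutation", "Missense_Mutation",
                               "Frame_Shift_Del"),
    stringsAsFactors = FALSE
)

# The same test that fails in the error message above.
maf$is_truncating <- maf$Variant_Classification %in% truncating_mutations
```

Which classes count as truncating affects the annotation, so the list should be checked against the hotspot publications before relying on the results.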
