somalogic / somaplotr

A highly specialized suite of standardized plotting routines based on the "Grammar of Graphics" framework of mapping variables to aesthetics used in 'ggplot2'. Graphics types are biased towards visualizing SomaScan (proteomic) data.

Home Page: https://somalogic.github.io/SomaPlotr/

License: Other

Makefile 1.30% R 98.70%
dataviz ggplot2 proteomics proteomics-data-analysis r r-package somascan

somaplotr's People

Contributors

amanda-hi, stufield

somaplotr's Issues

Add ROC curve functionality and associated support

ROC curves are common in this predictive space; we should add at least basic ROC plotting functionality to the package.

  • this could come from a geom_roc() style framework to hook into the existing ggplot2 system of workflows.
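As a rough illustration of what basic ROC plotting might involve, the sketch below computes ROC coordinates in base R and draws them with standard ggplot2 layers. The helper `roc_points()` and the simulated data are hypothetical and not part of SomaPlotr; a geom_roc()-style layer would presumably wrap similar logic inside a ggplot2 stat.

```r
library(ggplot2)

# Hypothetical helper (not part of SomaPlotr): compute ROC coordinates
# for a numeric score and a binary truth vector (1 = case, 0 = control).
roc_points <- function(score, truth) {
  stopifnot(length(score) == length(truth), all(truth %in% c(0, 1)))
  # Start at Inf so the curve begins at (0, 0)
  thresholds <- c(Inf, sort(unique(score), decreasing = TRUE))
  tpr <- vapply(thresholds, function(t) mean(score[truth == 1] >= t), numeric(1))
  fpr <- vapply(thresholds, function(t) mean(score[truth == 0] >= t), numeric(1))
  data.frame(threshold = thresholds, fpr = fpr, tpr = tpr)
}

set.seed(101)
truth <- rep(c(0, 1), each = 50)
score <- rnorm(100, mean = truth)   # simulated: cases score higher on average

roc_df <- roc_points(score, truth)

# geom_path() connects points in row order, i.e. from high to low threshold
p_roc <- ggplot(roc_df, aes(x = fpr, y = tpr)) +
  geom_path() +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
  labs(x = "False positive rate", y = "True positive rate")
```

Keeping the coordinate calculation separate from the plotting layers is one possible design; it would let the same data feed a static plot or a custom stat.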

`plotVolcanoHTML()` tooltip AptName field is wrong in vignette

Randomly generate some AptNames and TargetNames and add them to the simulated plotVolcano() dataset in the package vignette. Currently, the example dataset doesn't have this information, so it appears either A) incorrect, or B) NA in the interactive hovering tooltip.
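One way this could be done is sketched below in base R. The identifier pattern follows the "seq.5489.18" style seen elsewhere in these issues; the column names `FC` and `p.value`, the placeholder target names, and the number of analytes are assumptions for illustration and may not match the vignette's actual dataset.

```r
set.seed(42)
n <- 50   # number of simulated analytes (arbitrary)

# Random identifiers in the "seq.<digits>.<digits>" style
apt_names <- sprintf(
  "seq.%d.%d",
  sample(1000:9999, n),             # sampled without replacement, so unique
  sample(1:99, n, replace = TRUE)
)

# Placeholder target names; real TargetNames would be protein names
target_names <- paste0("Target", seq_len(n))

# Hypothetical simulated results table
sim <- data.frame(
  AptName    = apt_names,
  TargetName = target_names,
  FC         = rnorm(n),
  p.value    = runif(n)
)
head(sim, 3)
```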

Add grouping capability to boxplotSubarray()

Note: This issue/feature request was originally made by Eshita Mutt from GSE on Nov 17th, 2023

Checks

  • Please search previous issues before creating a new
    one, to ensure yours is not a duplicate
  • The proposed feature fits in the scope of SomaPlotr
    data visualization of SomaScan.

Problem and/or Feature

Is it possible to add group.var = c("SampleType", "PlateId"), as possible in boxplotGrouped()? From the vignettes, there isn't evidence of this functionality for boxplotSubarray(). Primarily, the goal is to make boxplots for all analytes (log10) and group by SampleType & then color by PlateId.

There are two potential methods to group a boxplotSubarray() plot:

  1. Use facet_wrap() or facet_grid() to split the plot by a variable of interest. This will not require the addition of a new argument to boxplotSubarray(), and will utilize already-available ggplot2 functions. The problem: Metadata variables can't currently be used to group a plot made by boxplotSubarray() via facet_*() because the unused metadata columns are stripped away internally before the final plot is produced.

Example:

reprex::reprex({
  suppressPackageStartupMessages({
    library(SomaDataIO)
    library(SomaPlotr)
    library(ggplot2)
  })
  
  p <- boxplotSubarray(SomaDataIO::example_data, color.by = "PlateId", do.log = TRUE)
  p + facet_wrap(~PlateId) # Produces an error, `PlateId` isn't found
  
  colnames(p$data) # PlateId is not present in plot data
})
  2. Add an argument to boxplotSubarray() that will instruct the function to group based on the specified variable. This variable can then be retained when the plot data object is made, and should resolve the variable not found error produced when using facet_wrap(). The problem: this can already be accomplished by adding a facet_*() layer to the plot.

Example:

boxplotSubarray(SomaDataIO::example_data, color.by = "SampleType", group.by = "PlateId", do.log = TRUE)
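Until one of these options is implemented, the general idea can be sketched by building the long-format plot data by hand, so that the metadata column survives and facet_wrap() can find it. This is not how boxplotSubarray() works internally; the simulated data frame and its column names are assumptions standing in for a real ADAT.

```r
library(ggplot2)

set.seed(1)
# Simulated stand-in for an ADAT: 6 samples, 2 plates, 20 analyte columns
n_samples <- 6
rfu <- matrix(
  rlnorm(n_samples * 20, meanlog = 7),
  nrow = n_samples,
  dimnames = list(NULL, paste0("seq.", 1001:1020, ".1"))
)
wide <- data.frame(
  SampleId = paste0("S", seq_len(n_samples)),
  PlateId  = rep(c("Plate1", "Plate2"), each = 3),
  rfu,
  check.names = FALSE
)

analytes <- grep("^seq\\.", names(wide), value = TRUE)

# Reshape to long format, repeating the metadata once per analyte column
long <- data.frame(
  SampleId = rep(wide$SampleId, times = length(analytes)),
  PlateId  = rep(wide$PlateId, times = length(analytes)),
  RFU      = unlist(wide[analytes], use.names = FALSE)
)

# PlateId is present in the plot data, so faceting on it works
p <- ggplot(long, aes(x = SampleId, y = log10(RFU), fill = PlateId)) +
  geom_boxplot() +
  facet_wrap(~PlateId, scales = "free_x")
```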

Priority Level

  • High
  • Medium
  • Low

Thanks for contributing 🥳!

Review internal code chunks in README

  • the README.Rmd contains some figure generation code that we may (or may not) want to expose to users
  • the README mentions the use of patchwork, but we may not need to show/discuss that officially in the README
  • unless we wish users to be able to follow and execute the code in the README locally
  • clean up README for consistency and decide on "full disclosure" or "no disclosure" (right now it's a little of both)
  • we may want to consider if patchwork is an official dependency (what are the downstream dependencies of patchwork?)
  • do we wish to rely on patchwork in the future of the package as it evolves?

Clean up Getting Started vignette examples

  • some examples can be improved in the vignette
  • take some examples from the README or from the individual function examples
  • setting a seed will help produce more "realistic" (and reproducible) figures

Create short palettes/themes vignette

Problem

The current package vignette mentions a "themes" vignette, but this vignette does not currently exist.

Feature

The package needs a short vignette that details all the ggplot2-compatible themes and color palettes that are available in SomaPlotr.


Thanks for contributing 🥳!

Re-assess OS requirements for snapshot unit tests

The Problem

Currently, snapshots for plots are not generated when run on Mac or Windows operating systems (this is defined in tests/testthat/helper.R). This is an artifact from previous versions of SomaLogic plotting code, and no longer makes sense in the context of SomaPlotr package development and maintenance, which is primarily done on Mac OS (previously, it was Linux).

Skipping snapshots on Mac OS or Windows should be re-assessed or potentially removed. Snapshots will most likely be generated on Mac OS in the future.

Issues to consider:

  • How will this affect snapshots generated on Windows operating systems? Will this be a problem?

Thanks for reporting 🥳!

boxplotSubarray() produces a "dangling" boxplot legend when color.by=NULL

Description

When boxplotSubarray() is used without specifying a color variable, a "dangling" legend is produced on the top of the plot. "Dangling" here refers to a legend with only 1 color category (i.e. an uninformative legend). By default, the legend title is removed, so only the box icon is displayed. This shrinks the overall size of the boxplot area (to accommodate the legend), and results in an unclean final plot. The ultimate goal of this package is to produce out-of-the-box, polished figures for SomaScan, and this unused legend would need to be trimmed off of any final figure.

I think this issue stems from the use of a dummy variable for the fill aesthetic in ggplot(). When fill= or color= is mapped, a legend is produced by default. In the example below, "class" is a dummy variable that contains no grouping information; it is generated and added to plot_data when boxplotSubarray(..., color.by=NULL) is called:

p <- plot_data |>
    ggplot(aes(x = as.character(.id), y = RFU_values, fill = class)) +
    geom_boxplot(notch = TRUE, alpha = 0.75, outlier.alpha = 0.2) +
    scale_fill_soma() +
NULL

Note: this is not an urgent issue, because boxplotSubarray() is typically used with a column specified for the color.by= argument. However, when using the default (color.by=NULL), this uninformative legend is produced.
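One possible fix is sketched below: suppress the legend only when no color variable was supplied. The `plot_data` here is simulated, and the `color.by` check is an assumption about how the function's argument could be tested; the package's own scale and notch settings are omitted so the example runs with ggplot2 alone.

```r
library(ggplot2)

set.seed(2)
# Simulated plot data with a single-level dummy fill variable
plot_data <- data.frame(
  .id        = rep(1:4, each = 25),
  RFU_values = rnorm(100, mean = 3),
  class      = "a"
)
color.by <- NULL   # mimics the default argument value

p <- ggplot(plot_data, aes(x = as.character(.id), y = RFU_values, fill = class)) +
  geom_boxplot(alpha = 0.75, outlier.alpha = 0.2)

# Drop the legend only when no color variable was requested
if (is.null(color.by)) {
  p <- p + theme(legend.position = "none")
}
```

Because the legend is removed rather than hidden behind an empty title, the plotting area should keep its full size.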

Output

reprex::reprex(
    SomaPlotr::boxplotSubarray(SomaDataIO::example_data, color.by = NULL)
)

Priority Level

  • High
  • Medium
  • Low

Thanks for reporting 🥳!

Clean up params for `plotVolcanoHTML()`

  • currently the arguments do not match plotVolcano()
  • uses the stat_table method, which we may not want to expose
  • uses target.label as opposed to label as in plotVolcano()
  • remove the `signed.log2.fold.change`, `p.value` defaults as they are internal-only defaults
  • clean up for consistency

Add plotting routine for creating "Manhattan" plots

  • create a "Manhattan" plotting routine for package
  • there is an existing routine from our internal code base, called plotManhattan(), that could be used as a starting point.
  • doesn't have to be identical, but could act as a basis
  • we would have to either figure out how to generate the effect sizes/p-values or (similar to plotVolcano()) have the user provide them as an argument. The latter is probably preferable, because calculating statistics is out of scope for this plotting package
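A minimal sketch of the user-supplied-statistics approach, using plain ggplot2, might look like the following. It is not based on the internal plotManhattan() routine; the data frame, its column names, and the Bonferroni threshold are all illustrative assumptions.

```r
library(ggplot2)

set.seed(3)
# Simulated, user-supplied statistics; nothing is calculated from raw data here
stats <- data.frame(
  analyte = paste0("seq.", 2001:2100, ".1"),
  p.value = runif(100)^3            # skewed toward small p-values
)
stats$neg_log10_p <- -log10(stats$p.value)
stats$index <- seq_len(nrow(stats))

alpha <- 0.05 / nrow(stats)         # Bonferroni threshold (illustrative choice)

p_manhattan <- ggplot(stats, aes(x = index, y = neg_log10_p)) +
  geom_point(aes(color = p.value < alpha)) +
  geom_hline(yintercept = -log10(alpha), linetype = "dashed") +
  labs(x = "Analyte", y = "-log10(p-value)", color = "Significant")
```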

Fix broken snapshot unit tests

Snapshots for plotConcord() are broken because they were built on a previous version of the function.

  • they need to be updated and run on Mac or Linux?
  • not sure which is best, but whichever gives the best general coverage (Windows may not)
  • there are currently skip()s in place which is why the build checks pass in Windows

Add "wrap" param for wrapping variables to `plotLongitudinal()`

Feature

  • data (and variables) are internally selected (subset) prior to plotting
  • this limits the post plotting options, e.g. facet_wrap()
  • possibly add a new param for additional variables to be carried forward into the plotting routine (gg$data),
    to make post-plotting modification easier
  • possible name: include_vars = NULL
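The proposed behavior could be sketched as a small column-selection step. The helper `subset_plot_data()` and the example columns are hypothetical; only the argument name `include_vars` comes from the proposal above.

```r
# Hypothetical helper illustrating the proposed `include_vars` argument
subset_plot_data <- function(data, required, include_vars = NULL) {
  keep <- unique(c(required, include_vars))
  not_found <- setdiff(keep, names(data))
  if (length(not_found) > 0) {
    stop("Columns not found in `data`: ", paste(not_found, collapse = ", "))
  }
  data[, keep, drop = FALSE]
}

df <- data.frame(
  SampleId  = rep(c("A", "B"), each = 3),
  TimePoint = rep(1:3, times = 2),
  RFU       = c(10, 12, 15, 9, 11, 16),
  Sex       = rep(c("F", "M"), each = 3),
  Site      = "X"
)

# Default: only the variables needed for the plot are kept
default <- subset_plot_data(df, c("SampleId", "TimePoint", "RFU"))

# With include_vars, `Sex` is carried forward and would be available to facet_wrap()
extended <- subset_plot_data(df, c("SampleId", "TimePoint", "RFU"),
                             include_vars = "Sex")
names(extended)
```

Defaulting `include_vars` to NULL would leave existing calls unchanged.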

Thanks for contributing 🥳!

Fix `plotPDFlist()` and `plotCDFlist()` so that they can plot arbitrary length vectors

  • currently they must be the same size (length)
  • this shouldn't be necessary; but because they are converted into a data.frame prior to plotting, R chokes
  • data frame is part of pipeline into plotPDFbyGroup(), which must be a data frame
  • this bug is diabolical because for small lengths, the bug disappears and elements are recycled as necessary! Which is absolutely the wrong thing to do!
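The recycling behavior, and one way around it, can be demonstrated in base R. The long-format reshaping below is a sketch of a possible fix and is not the functions' current implementation.

```r
# Vectors of different lengths, as might be passed to plotPDFlist()
x <- list(a = c(1, 2, 3, 4), b = c(10, 20), c = c(5, 6, 7))

# data.frame() silently recycles when one length divides the other ...
recycled <- data.frame(a = x$a, b = x$b)
nrow(recycled)   # `b` has been repeated to fill the rows

# ... and throws an error otherwise
res <- try(data.frame(a = x$a, c = x$c), silent = TRUE)
inherits(res, "try-error")

# Long format avoids both problems: one row per value, labeled by its group
long <- data.frame(
  group = rep(names(x), times = lengths(x)),
  value = unlist(x, use.names = FALSE)
)
table(long$group)
```

A long data frame with a grouping column has the shape a grouped plotting function such as plotPDFbyGroup() would need, without forcing equal lengths.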

Fix deprecated usage of `scale_name=` param in `discrete_scale()`

Need to update to newest ggplot2 usage

  • scale_name is deprecated in discrete_scale()
Warning (test-plotCDFbyGroup.R:51:3): `plotCDFbyGroup()` throws warning when non-positive RFU values are detected
The `scale_name` argument of `discrete_scale()` is deprecated as of ggplot2 3.5.0.
Backtrace:
     ▆
  1. ├─testthat::expect_warning(...) at test-plotCDFbyGroup.R:51:3
  2. │ └─testthat:::expect_condition_matching(...)
  3. │   └─testthat:::quasi_capture(...)
  4. │     ├─testthat (local) .capture(...)
  5. │     │ └─base::withCallingHandlers(...)
  6. │     └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
  7. └─SomaPlotr::plotCDFbyGroup(data, "seq.5489.18", Sex)
  8.   └─SomaPlotr::scale_color_soma() at SomaPlotr/R/plotCDFbyGroup.R:91:5
  9.     └─ggplot2::discrete_scale(...) at SomaPlotr/R/style-soma.R:31:3
 10.       └─ggplot2:::deprecate_soft0("3.5.0", "discrete_scale(scale_name)")

Fix in test-plotCDFbyGroup.R and test-plotLongitudinal.R
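The update might look like the sketch below, which assumes ggplot2 >= 3.5.0. The palette function and `scale_color_example()` are hypothetical stand-ins, not the package's actual scale_color_soma() code.

```r
library(ggplot2)

# Hypothetical stand-in for a package palette function
example_palette <- function(n) {
  cols <- c("#1b9e77", "#d95f02", "#7570b3", "#e7298a")
  cols[seq_len(n)]
}

# Old style (soft-deprecated as of ggplot2 3.5.0):
#   discrete_scale("colour", scale_name = "example", palette = example_palette)
# New style: omit `scale_name` and pass the remaining arguments by name
scale_color_example <- function(...) {
  discrete_scale(aesthetics = "colour", palette = example_palette, ...)
}

p <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
  geom_point() +
  scale_color_example()
```

Naming the arguments avoids relying on positional matching, which changed meaning once `scale_name` was deprecated.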

Add new plotting feature for median normalization scale factors

Feature

Users would likely benefit from a simple plotting routine to plot normalization scale factors (by group), in order to identify potential bias in the sample groups that might skew analyses.

R code

This would involve porting over some functionality from SomaLogic's internal code base, SomaNormalization.

plotMedNorm()
