nanostring-biostats / GeoDiff
GeoDiff, an R package of count-generating models for analyzing GeoMx RNA data. Note that this version of the package is still under development: it is undergoing submission to the Bioconductor 3.14 release and has not yet completed NanoString's internal verification process.

License: MIT License

R 74.76% C++ 25.24%

GeoDiff's People

Contributors

maddygriz, nicoleeo, nturaga, yangleicq, zhiiiyang


GeoDiff's Issues

Vignette blurbs

  • 1. The mouse data still errors at line 37 because of duplicate probe names.
  • 2. In the aggreprobe section, if there are no all-zero probes, all probes get removed at line 89.
  • 3. In the score test, when removeoutlier = TRUE it says "n outliers are removed prior to the score test". This number n differs from the number of probes flagged as outliers in the which(assayDataElement(SZ018_diag, "low_outlier") == 1, arr.ind = TRUE) and which(assayDataElement(SZ018_diag, "up_outlier") == 1, arr.ind = TRUE) section. It would be good to understand why those probes are being removed and how outliers are called.
  • 4. Lines 176-179 say "We also need the mean of the background, since all gene has 1 probes so adjustment factor=1." Is this still true if we're trying to make the vignette generalizable to any number of probes per gene?
  • 5. In general, whenever plots are output there should be an explanation of what you expect to see and how to interpret the results (Maddy may already be doing this).
  • 6. I didn't run the random slope and random intercept models; my data didn't have the right structure and the vignette said it would take hours. It was a little confusing to figure out where to pick up the vignette after that. In general there are a lot of DE options, and it would help to guide users more explicitly through this section (Maddy may already be doing this).
  • 7. I got stuck for a while at line 500 because I couldn't figure out the possible values of the variable. I eventually worked it out by trying different combinations of the grouping-variable name and its levels in the test. It would help to point users to where they can look these up.
  • 8. When you run DE on a grouping variable with more than two levels, what is the comparison? Is it each level vs. all other ROIs? It doesn't seem to be all pairwise comparisons, since there is one result per level.
  • 9. For the DE results table, it would be helpful to indicate somewhere which group is the numerator and which is the denominator for the fold change (FC).
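On point 3, the mismatch can arise simply because one flagged probe can carry several flagged entries. A base-R sketch with made-up flag matrices, standing in for the `low_outlier` / `up_outlier` assayData elements, shows how the two counts diverge:

```r
# Toy flag matrices (probes x ROIs), standing in for the vignette's
# "low_outlier" and "up_outlier" assayData elements (values are made up)
low_outlier <- matrix(c(1, 0, 0, 0,
                        0, 0, 1, 0,
                        0, 0, 0, 0), nrow = 3, byrow = TRUE)
up_outlier  <- matrix(c(0, 1, 0, 0,
                        0, 0, 0, 0,
                        0, 0, 0, 1), nrow = 3, byrow = TRUE)

# Flagged positions, as in which(assayDataElement(...) == 1, arr.ind = TRUE)
low_idx <- which(low_outlier == 1, arr.ind = TRUE)
up_idx  <- which(up_outlier == 1, arr.ind = TRUE)

# 4 flagged entries in total ...
flagged_entries <- nrow(low_idx) + nrow(up_idx)
# ... but only 3 distinct probes, since probe 1 is flagged both low and up
flagged_probes  <- length(union(low_idx[, "row"], up_idx[, "row"]))
```

Whether the vignette's "n outliers removed" counts entries or probes is exactly the ambiguity worth documenting.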

aggreprobe function fails when there are all-zero probes

The GeoDiff:::aggreprobe() function is also failing on my dataset, giving the error "Error in if (all(x > 0.85)) y <- names(x) else y <- setdiff(names(x), : missing value where TRUE/FALSE needed" when there are all-zero probes in the dataset.
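Until the package handles this case, one workaround is to drop all-zero probes from the count matrix before aggregation. A minimal base-R sketch with a toy matrix (no GeoDiff objects involved):

```r
# Toy count matrix: 4 probes x 3 ROIs; probe "P3" is all zero
counts <- matrix(c(5, 2, 8,
                   0, 1, 3,
                   0, 0, 0,
                   7, 4, 6),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("P1", "P2", "P3", "P4"),
                                 c("ROI1", "ROI2", "ROI3")))

# Flag probes with zero counts in every ROI
all_zero <- rowSums(counts) == 0

# Drop them before probe aggregation, since zero-variance rows yield NA
# correlations, which is consistent with the TRUE/FALSE-needed error
counts_filtered <- counts[!all_zero, , drop = FALSE]
```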

Add `split` in `fitNBth`

The split parameter needs to be added to the selection of initial values for threshold_start.

Address reviewer comments

DESCRIPTION

  • Please wrap your Description field with a limit of 80 characters so that it is legible.
  • Add URL and BugReports fields.
  • Update the R dependency version to R (>= 4.1.0).

NAMESPACE

  • Consider using the native pipe |> rather than magrittr's.
  • Consider suggesting NanoStringNCTools if you are only using one function from the package.
  • Move GeomxTools to Imports (@yangleicq: this is new, per a reviewer comment).
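On the pipe suggestion: the native pipe has been part of base R since 4.1.0 (the same version the reviewers ask for in the R dependency), so switching removes the magrittr import. A quick sketch:

```r
# magrittr pipe (what the package currently imports):
#   counts %>% log1p() %>% colMeans()
# Native pipe, part of base R since 4.1.0, so no magrittr needed:
counts <- matrix(1:6, nrow = 2)
res <- counts |> log1p() |> colMeans()
```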

size_scale not passed to matrix function call

fitNBthDE has an optional parameter size_scale in the NanoStringGeoMxSet method. If set by the user, this parameter is not passed on to the matrix method call inside the function, so the user's setting is silently overridden by the S4 definition's default, "sum".
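The bug pattern can be reproduced with a tiny S4 example; `fitDemo` and `DemoSet` below are hypothetical stand-ins for `fitNBthDE` and `NanoStringGeoMxSet`:

```r
library(methods)

setGeneric("fitDemo", function(object, ...) standardGeneric("fitDemo"))
setClass("DemoSet", slots = c(counts = "matrix"))

# matrix method: the workhorse, with size_scale defaulting to "sum"
setMethod("fitDemo", "matrix",
          function(object, size_scale = c("sum", "first"), ...) {
  match.arg(size_scale)
})

# Buggy pattern: the wrapper accepts size_scale but never forwards it
setMethod("fitDemo", "DemoSet",
          function(object, size_scale = c("sum", "first"), ...) {
  fitDemo(object@counts)                      # user's size_scale dropped here
})

obj <- new("DemoSet", counts = matrix(1:4, 2))
buggy <- fitDemo(obj, size_scale = "first")   # comes back as "sum"

# Fix: forward the argument explicitly
setMethod("fitDemo", "DemoSet",
          function(object, size_scale = c("sum", "first"), ...) {
  size_scale <- match.arg(size_scale)
  fitDemo(object@counts, size_scale = size_scale)
})
fixed <- fitDemo(obj, size_scale = "first")   # now "first"
```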

Normalisation methods for TCR spike-in add-on

Hi,

We have tried Q3 and TMM normalisation for DSP WTA data. But for the TCR spike-in add-on data, which normalisation method (Q3, TMM, or GeoDiff's Poisson threshold model based normalisation) is applicable?

The TCR data contains probes for only 146 variable and joining segments (far fewer genes than WTA), and expression levels vary among the different TCR genes, so TMM's assumption that most genes are not differentially expressed may not hold in this setting.

Cheers!

Error in solve.default(cov_mat)

I am receiving the following error from the fitPoisthNorm() function.

probe finished
Error in solve.default(cov_mat) :
system is computationally singular: reciprocal condition number = 1.04181e-19

Could you please let me know what might be causing this error? Is there an upstream step where I assign a value that could be leading to the downstream error?

I am running code from your vignette with a few modifications and have had success with other datasets, but this one for some reason is giving me this error. All the preceding steps of the vignette code work just fine, but I cannot apply the normalization.

Any help you could provide would be greatly appreciated.
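For what it's worth, this error usually indicates a rank-deficient design or covariance matrix, e.g. collinear annotation columns or too few high-signal ROIs. A base-R reproduction of the failure mode (toy matrix, not GeoDiff internals):

```r
# A rank-deficient design reproduces the solve() failure: here column 3
# is exactly twice column 2, so t(X) %*% X cannot be inverted
X <- cbind(1, c(1, 2, 3, 4), c(2, 4, 6, 8))
cov_mat <- t(X) %*% X

msg <- tryCatch(solve(cov_mat), error = function(e) conditionMessage(e))
is.character(msg)   # TRUE: solve() raised a singularity error

# qr() reports the numerical rank; anything below ncol(X) signals
# linearly dependent columns worth hunting for in the input data
qr(X)$rank          # 2, not 3
```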

Error when running fitPoisthNorm multiple times

I get this error when I run fitPoisthNorm a second time with minimal argument settings.
Error in sample.int(length(x), size, replace, prob) : invalid first argument

Code that produces the error:
kidney <- fitPoisthNorm(object = kidney)

But if I set ROIs_high it runs without an error.
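The message points at `sample.int()` receiving an invalid population size, which would be consistent with an internal ROI-selection quantity coming back empty or NA on the second run when `ROIs_high` is not supplied. A minimal reproduction of the error itself (not of the package's internals):

```r
# sample.int()'s first argument must be a finite positive count; an NA
# (e.g. from a failed upstream computation) reproduces the reported error
msg <- tryCatch(sample.int(NA, 1), error = function(e) conditionMessage(e))
msg   # "invalid first argument"
```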

fitNBthDE error with prior_type="equal"

This error occurs when prior_type = "equal". It doesn't happen every time but I don't see a pattern for when it does.

Error in t(X) %*% preci1con : non-conformable arguments

The Traceback doesn't seem helpful but I'm adding it anyway.
Traceback:
6. .local(object, ...)
5. fitNBthDE(form = form, annot = annot[ROIs_high, ], object = countmat[, ROIs_high], probenum = probenum, features_high = features_high, features_all = features_all, sizefact_start = sizefact_start, sizefact_BG = sizefact_BG, threshold_mean = threshold_mean, ...
4. fitNBthDE(form = form, annot = annot[ROIs_high, ], object = countmat[, ROIs_high], probenum = probenum, features_high = features_high, features_all = features_all, sizefact_start = sizefact_start, sizefact_BG = sizefact_BG, threshold_mean = threshold_mean, ...
3. .local(object, ...)
2. fitNBthDE(form = ~region, split = FALSE, object = kidney, preci2 = 10000, prior_type = "equal", preci1con = 1/25, sizescalebythreshold = TRUE)
1. fitNBthDE(form = ~region, split = FALSE, object = kidney, preci2 = 10000, prior_type = "equal", preci1con = 1/25, sizescalebythreshold = TRUE)
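"Non-conformable arguments" means the shapes in `t(X) %*% preci1con` do not line up. The actual shapes inside `fitNBthDE` are not visible from the traceback, so the matrices below are purely illustrative of the error class:

```r
X <- matrix(1, nrow = 4, ncol = 3)   # design: 4 samples x 3 model terms
preci_ok <- diag(1, 4)               # 4 x 4 conforms with t(X), which is 3 x 4
dim(t(X) %*% preci_ok)               # 3 x 4

# A 1 x 1 precision matrix (what a scalar preci1con = 1/25 amounts to
# under %*%) does not conform, reproducing the error
preci_bad <- matrix(1 / 25, 1, 1)
msg <- tryCatch(t(X) %*% preci_bad, error = function(e) conditionMessage(e))
```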

Handle probenum >=1 as default

  • Make the probenum compatible with WTA
  • Remove the default values for probenum (not all 1)
  • Add unit tests/specs/reqs for aggreprobe for WTA
  • Modify vignette to aggregate probes for all data (default methods: use = "cor")
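For the aggregation item, the bookkeeping can be sketched in base R: `rowsum()` sums probe counts within targets, and `table()` yields the per-target probe counts that `probenum` tracks. (The real `aggreprobe` with `use = "cor"` selects probes by correlation before combining, which this sketch omits.)

```r
# Toy CTA-style count matrix: 5 probes mapping to 3 targets
counts <- matrix(1:15, nrow = 5,
                 dimnames = list(paste0("probe", 1:5), paste0("ROI", 1:3)))
targets <- c("GeneA", "GeneA", "GeneB", "GeneC", "GeneC")

# Sum probe counts within each target (one simple aggregation scheme)
target_counts <- rowsum(counts, group = targets)

# Per-target probe numbers: the quantity probenum must track, which is
# all 1s only after aggregation (or for WTA, with one probe per gene)
probenum <- table(targets)
```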

Question re methods behind the function calls?

Firstly thanks for releasing this package. I'm looking forward to trying out the mixed effect model methods to handle repeated sampling from individuals (which has been an ongoing struggle to get my head around with limma approaches). And something comparable with other packages!

From working through the vignette - I have a few questions about the methods behind the functions that I can't figure out from the docs. Please let me know if there's a resource somewhere I should check out.

  1. In the aggreprobe function - what are the “cor” and “score” tests? How are these used to include/exclude probes, and how are probes for a target combined?

  2. In the BGScoreTest function, how is the 'score' being calculated? Why is the suggested p-value threshold 1e-3?

  3. When filtering ROIs ("... keep those which have a high enough signal in comparison to the background."), how is the thresholding actually happening in this line?

ROIs_high <- sampleNames(kidney)[which((quantile(fData(kidney)[["para"]][, 1],
                                                 probs = 0.90, na.rm = TRUE) -
                                        notes(kidney)[["threshold"]]) * kidney$sizefact_fitNBth > 2)]

  4. Are these the same filtering functions and DE methods used in the backend of the GeoMx GUI?

Thanks!
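On question 3, the line can be unpacked with plain base R: take the 90th percentile of the per-feature signal estimates, subtract the background threshold, scale by each ROI's size factor, and keep ROIs where the result exceeds 2. All numbers below are made up for illustration:

```r
para_col1 <- c(2.1, 3.4, 0.5, 5.0, 4.2)   # per-feature signal (hypothetical)
threshold <- 1.0                          # background threshold, as in notes()
sizefact  <- c(0.3, 1.5, 2.0)             # per-ROI size factors from fitNBth
roi_names <- c("ROI1", "ROI2", "ROI3")

# 90th percentile of signal minus background, scaled per ROI;
# ROIs pass when the scaled excess signal exceeds 2
signal_q90 <- unname(quantile(para_col1, probs = 0.90, na.rm = TRUE))
ROIs_high  <- roi_names[which((signal_q90 - threshold) * sizefact > 2)]
ROIs_high   # ROI1 is dropped: its size factor is too small
```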

default values

fitNBthDE

  • prior_type = c("equal", "contrast") should be changed to prior_type = c("contrast", "equal")
  • preci2 needs a default value
  • preci1con needs a default value

fitNBthmDE

  • preci1 needs a default value
  • preci2 needs a default value
  • threshold_mean needs default value

fitPoisthNorm

  • preci2 needs a default value
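The `prior_type` reordering matters because `match.arg()` picks the first element of a character-vector default, so swapping the order changes the effective default without touching any call sites. A quick sketch with a hypothetical `fit_demo()`:

```r
# With the reordered default, match.arg() selects "contrast" when the
# caller does not specify prior_type
fit_demo <- function(prior_type = c("contrast", "equal")) {
  match.arg(prior_type)
}
fit_demo()          # "contrast" -- the new default
fit_demo("equal")   # explicit choice still works
```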

CTA workflow

Hi, I don't see any instructions for processing CTA data with this package; I only see a WTA workflow. Is there anything that needs to be handled differently for CTA data, or can I simply follow the WTA workflow on my CTA data?

Thank you!
Zee

Add documentation

  • reqs with links to specs
  • specs with links to test
  • edit links to tests based on latest updates
  • corcutoff needs updating in rox comments and rmd
  • @zhiiiyang I also noticed a second .gitignore file in the vignette folder; does this need deletion?
