cnio-bu / beyondcell

Beyondcell is a computational methodology for identifying tumour cell subpopulations with distinct drug responses in single-cell RNA-seq and Spatial Transcriptomics data.

License: Other

R 100.00%

bioinformatics cancer drugs single-cell spatial-transcriptomics

beyondcell's Introduction

Package status

News: Beyondcell 2.2.1 has been added to the bu_cnio conda channel.

Introduction

Beyondcell is a methodology for the identification of drug vulnerabilities in single-cell RNA-seq (scRNA-seq) and Spatial Transcriptomics (ST) data. To this end, beyondcell focuses on the analysis of drug-related commonalities between cells/spots by classifying them into distinct Therapeutic Clusters (TCs).

Workflow overview

Beyondcell workflow. Given two inputs, the expression matrix and a collection of drug signatures, the methodology calculates a Beyondcell Score (BCS) for each drug-cell/spot pair. The BCS ranges from 0 to 1 and measures the susceptibility of each cell/spot to a given drug. The resulting BCS matrix can be used to determine the sample’s TCs. Furthermore, drugs are prioritized in a table and each drug score can be visualized in a UMAP. When using ST data, the TCs and individual scores can also be visualized on top of the tissue slice to dissect the therapeutic architecture of the sample.

Depending on the evaluated signatures, the BCS represents the cell/spot perturbation susceptibility (PSc) or the sensitivity to the drug effect (SSc). BCS can also be estimated from functional signatures to evaluate each cell/spot functional status.

Beyondcell's key applications

Analyse the intratumoural heterogeneity (ITH) of your experiment
Classify your cells/spots into TCs
Prioritize cancer treatments
If time points are available, identify the changes in drug tolerance
Identify mechanisms of resistance

Installing beyondcell

The beyondcell package is implemented in R >= 4.0.0. We recommend running the installation via mamba:

# Create a conda environment.
conda create -n beyondcell 
# Activate the environment.
conda activate beyondcell
# Install beyondcell package and dependencies.
mamba install -c bu_cnio r-beyondcell

Results

We validated beyondcell on a population of MCF7-AA cells exposed to 500 nM bortezomib and collected at four time points: t0 (before treatment), t12, t48, and t96 (72 h of treatment followed by drug wash and 24 h of recovery). The data were obtained from Ben-David U, et al., Nature, 2018. We integrated all four conditions using the Seurat pipeline (left). After calculating the BCS for each cell using PSc, we applied a clustering analysis. Beyondcell clustered the cells by treatment time point, separated untreated cells from treated cells (center), and recapitulated the changes arising from the bortezomib treatment (right).

How to run

For general instructions on running beyondcell, check out the analysis workflow and visualization tutorials. For more information about how beyondcell normalization works, please refer to this vignette. You can also find an example ST analysis here.

Authors

Coral Fustero-Torre
María José Jiménez-Santos
Santiago García-Martín
Carlos Carretero-Puche
Luis García-Jimeno
Tomás Di Domenico
Gonzalo Gómez-López
Fátima Al-Shahrour

Citation

Fustero-Torre, C., Jiménez-Santos, M.J., García-Martín, S. et al. Beyondcell: targeting cancer therapeutic heterogeneity in single-cell RNA-seq data. Genome Med 13, 187 (2021). https://doi.org/10.1186/s13073-021-01001-x

Support

If you have any questions regarding the use of beyondcell, feel free to submit an issue.

beyondcell's Issues

Test failure: `bcScore(pbmc.raw, gs = gs10)` did not throw the expected error.

Failure (test-bcScore.R:142): errors
`bcScore(pbmc.raw, gs = gs10)` did not throw the expected error.

Warning (test-bcScore.R:142): errors
Arguments in `...` must be used.
x Problematic argument:
* fixed = TRUE
Backtrace:
 1. testthat::expect_error(...)
      at test-bcScore.R:142:2
 2. testthat:::expect_condition_matching(...)
 3. rlang (local) `<fn>`()
 4. rlang:::check_dots(env, error, action, call)
 5. rlang:::action_dots(...)
 6. base (local) try_dots(...)

bcMerge on a list

I propose to expand bcMerge so that it can merge a given bc object with a subsequent list of bc objs, like Seurat does.
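One way to picture the proposal is to fold a two-argument merge over a list with base R's `Reduce`. This is only a sketch: `merge_all` is a hypothetical helper, and `rbind` on plain matrices stands in for the real merge function.

```r
# Hypothetical helper: merge `x` with every element of `y.list`, left to right,
# using any two-argument merge function.
merge_all <- function(x, y.list, merge.fun) {
  Reduce(merge.fun, y.list, x)
}

# Stand-in data: plain one-row matrices instead of beyondcell objects.
a <- matrix(1, nrow = 1, ncol = 2)
b <- matrix(2, nrow = 1, ncol = 2)
c3 <- matrix(3, nrow = 1, ncol = 2)

merged <- merge_all(a, list(b, c3), rbind)
dim(merged)
```

Assuming bcMerge keeps its current two-object interface, the same fold would presumably read `merge_all(bc, list(bc2, bc3), bcMerge)`.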

questions of PSc and SSc

I would like to know how you identify the drug perturbation (PSc) and the drug sensitivity (SSc) signatures. I am a little confused about this, since your tool starts from them.
Thanks!

Test warning: warning (test-bcScore.R:349): default values

Warning (test-bcScore.R:349): default values
no non-missing arguments to min; returning Inf
Backtrace:
 1. beyondcell::bcScore(pbmc, gs = ssc, expr.thres = 0)
      at test-bcScore.R:349:2
 3. base::apply(bc@normalized, 1, scales::rescale, to = c(0, 1))
      at beyondcell/R/Score.R:171:2
 5. scales:::rescale.numeric(newX[, i], ...)
 7. base::range(x, na.rm = TRUE, finite = TRUE)

This is actually a sister issue to #64. In this case it is min(x) that fails, whereas in #64 it is max(x).
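Both warnings appear when a row contains no finite values, so `range()` has nothing to work with. A minimal sketch of a guarded row-wise rescale follows; `rescale_rows` is a hypothetical base R stand-in for the `scales::rescale` call in the backtrace, not the package's code.

```r
# Hypothetical helper: min-max rescale each row to [0, 1].
# Rows without any finite value are set to NA instead of reaching min()/max().
rescale_rows <- function(m) {
  out <- m
  for (i in seq_len(nrow(m))) {
    x <- m[i, ]
    ok <- is.finite(x)
    if (!any(ok)) {
      out[i, ] <- NA_real_
      next
    }
    rng <- range(x[ok])
    out[i, ] <- if (rng[1] == rng[2]) {
      ifelse(ok, 0.5, NA_real_)  # constant row: midpoint of the target range
    } else {
      (x - rng[1]) / (rng[2] - rng[1])
    }
  }
  out
}

m <- rbind(drugA = c(1, 2, 3), drugB = c(NaN, NaN, NaN))
scaled <- rescale_rows(m)
```

Whether all-NaN rows should be kept as NA or dropped beforehand is a design decision this sketch does not settle.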

bcAddMetaData doesn't allow for existing columns to be replaced

Hi,

Could you edit the bcAddMetaData function so that, if metadata columns with identical names already exist in the object's meta.data slot, the function simply removes and replaces the existing columns (similar to how Seurat::AddMetaData works)?

Thank you,
Lucas
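The requested behaviour can be sketched on plain data frames. `replace_metadata` is a hypothetical helper, not part of beyondcell, and it assumes both tables share the same row names in the same order.

```r
# Hypothetical helper: add the columns of `new` to `meta`, replacing any
# existing column that has the same name.
replace_metadata <- function(meta, new) {
  stopifnot(identical(rownames(meta), rownames(new)))
  keep <- setdiff(colnames(meta), colnames(new))
  cbind(meta[, keep, drop = FALSE], new)
}

meta <- data.frame(condition = c("t0", "t12"), batch = c(1, 1),
                   row.names = c("cell1", "cell2"))
new <- data.frame(batch = c(2, 3), patient = c("p1", "p2"),
                  row.names = c("cell1", "cell2"))

updated <- replace_metadata(meta, new)
colnames(updated)
```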

Create bioconda package

I think it'd be good to have a conda package for the project. Happy to help set it up.

Variable drugs option in bcUMAP

In GitLab by @cfustero on Dec 9, 2020, 13:37

Add option in bcUMAP to calculate the UMAP using the X most variable drugs in order to enhance differences between the analyzed populations.
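As a rough illustration of the selection step, the sketch below keeps the rows with the highest variance from a score matrix (drugs in rows, cells in columns). `top_variable_drugs` is a hypothetical name, and variance is an assumed measure of variability; the issue does not specify one.

```r
# Hypothetical helper: keep the n rows (drugs) with the highest variance.
top_variable_drugs <- function(bcs, n) {
  v <- apply(bcs, 1, stats::var, na.rm = TRUE)
  ranked <- names(sort(v, decreasing = TRUE))
  bcs[head(ranked, n), , drop = FALSE]
}

bcs <- rbind(flat    = rep(0.5, 6),
             mild    = c(0.4, 0.5, 0.6, 0.4, 0.5, 0.6),
             bimodal = c(0, 0, 0, 1, 1, 1))

top <- top_variable_drugs(bcs, 2)
rownames(top)
```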

Drug screening against cells in a pathological state

In GitLab by @rpmoraga on Jan 4, 2022, 13:15

Hello, congratulations for the publication and the package 😄

I was wondering whether it would be possible to use the package to find a drug capable of reverting the transcriptomic signature of a diseased cell to the healthy cell state.

In this case, I could use the L1000 level 5 information, taking the z-scores for each of the drugs tested on cell lines considered healthy, and follow the procedure suggested in this issue to generate a custom GeneSet of signatures. However, my main question is whether there is any way to verify in silico that the drugs that are "effective" against the group of pathological cells according to the BCS are also capable of reverting the transcriptomic signal from pathological to healthy.

Thank you very much in advance

Error in bcUMAP

bcUMAP raises an error when add.DSS = TRUE

Failure (test-bcScore.R:178): errors

`bcScore(pbmc, gs = gs1, expr.thres = 1)` did not throw the expected error.

Add r-hdf5r as dependency

Add novel cutoff SSc signatures

Related to the current SSc branch.

Can the beyondcell package be installed on Windows?

Thank you very much for providing such a great tool!
I tried to install beyondcell in R 4.2.1 under Windows, but it failed.

Error (test-GenerateGenesets.R:11): errors

Error in `readGMT(x)`: a does not exist.

can not find the object "path_to_sc"

When I ran beyondcell following the [analysis workflow] tutorial, I got an error. Could you please tell me where the object "path_to_sc" is defined?

Setup CI

GitHub has tons of functionality pertaining to continuous integration and we should take advantage of it. I'll handle it soon(TM).

Modify bcSubset function in order to eliminate the spatial coordinates for the unwanted cells

In GitLab by @cfustero on May 26, 2022, 15:52

The bcSubset function does not take into account some particularities of Spatial objects. If we want to remove cells from a BC object using bcSubset, we should also delete these cells from the coordinates slot (a Spatial-specific slot that can be found at bc@SeuratInfo$images$slice@coordinates). Otherwise the function still works, but the leftover cells affect other downstream functions, such as bcSignatures.
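The intended behaviour can be sketched with plain stand-ins for the two pieces of data: a score matrix with spots in columns and a coordinates table with spot barcodes as row names. `subset_spots` is a hypothetical helper and does not touch real beyondcell or Seurat objects.

```r
# Hypothetical helper: keep the same spots in the scores and in the coordinates.
subset_spots <- function(scores, coords, keep) {
  keep <- Reduce(intersect, list(keep, colnames(scores), rownames(coords)))
  list(scores = scores[, keep, drop = FALSE],
       coords = coords[keep, , drop = FALSE])
}

scores <- matrix(seq(0.1, 0.6, by = 0.1), nrow = 2,
                 dimnames = list(c("sigA", "sigB"), c("s1", "s2", "s3")))
coords <- data.frame(row = c(1, 1, 2), col = c(1, 2, 1),
                     row.names = c("s1", "s2", "s3"))

res <- subset_spots(scores, coords, keep = c("s3", "s1"))
rownames(res$coords)
```

Indexing both structures with the same `keep` vector keeps them in the same order, so downstream code can rely on the barcodes matching.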

Convert PSc and SSc into preloaded GS objects

The t matrices are of no particular interest and hinder development.

GetCollection default values test fails

── Error (test-GetCollection.R:203): default values ────────────────────────────
Error in `paste0(z, ": ", paste0(filters[[z]], collapse = ", "), ".\n")`: object 'z' not found
Backtrace:
  1. testthat::expect_equal(...)
       at test-GetCollection.R:203:2
  4. beyondcell::GetCollection(SSc, filters = list(MoAs = "NFkB signaling inhibitor"))
  6. base::lapply(...)
       at beyondcell/R/Genesets.R:173:4
  7. beyondcell (local) FUN(X[[i]], ...)
  8. base::tryCatch(...)
       at beyondcell/R/Genesets.R:174:6
  9. base (local) tryCatchList(expr, classes, parentenv, handlers)
 12. base (local) tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
 13. base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 14. value[[3L]](cond)
 15. base::paste0(...)
       at beyondcell/R/Genesets.R:176:17

Imputation step is pretty slow

In GitLab by @KANG-BIOINFO on Jan 10, 2022, 23:50

Thanks for the package and congratulation on your paper.
When I ran beyondcell, I observed that the clusters and the UMAP were driven by sequencing depth, so I tried to regress out that effect. However, it got stuck at "Imputing normalized BCS..." forever, even though my input size was just 3x200, when I ran the following code:
bc = bcRegressOut(bc, vars.to.regress = "nCount_RNA")

I saw you mentioned "You migth need to refine the filtering of your single-cell experiment based on the amount of detected features." in the tutorial. Could you explain more about refinement of filtering here? Is it about the refinement of cells or genes?

Thanks!

Remove print line in bcUMAP

In line 196.

Imputing normalized BCS...

When I was using beyondcell to analyze my rds data, it got stuck at "Imputing normalized BCS..." and I couldn't continue the analysis. The software did not report an error but kept running (for more than ten hours). My rds data is smaller than the test data, and it only takes about an hour for my computer to analyze the test data.

Failure (test-bcRegressOut.R:224): warnings

`bcRegressOut(bc.corrupt5, vars.to.regress = "nFeature_RNA")` did not throw the expected warning.

Replace qusage::read.gmt with a custom function

Output should be the same as current but with builtin code.

Replace useful::find.case with a tidyverse alternative

Related to #74

Error (test-bcRegressOut.R:187): errors

Error in `bcRegressOut(bc.object.norm.complete.bg, vars.to.regress = "nFeature_RNA", 
    k.neighbors = n.complete.bg, add.DSS = FALSE)`: k.neighbors must be lower than the number of complete cases in @normalized slot: 80.

GetCollection warnings test fail

── Error (test-GetCollection.R:88): warnings ───────────────────────────────────
Error in `paste0(z, ": ", paste0(filters[[z]], collapse = ", "), ".\n")`: object 'z' not found
Backtrace:
  1. testthat::expect_warning(...)
       at test-GetCollection.R:88:2
  7. beyondcell::GetCollection(...)
  9. base::lapply(...)
       at beyondcell/R/Genesets.R:173:4
 10. beyondcell (local) FUN(X[[i]], ...)
 11. base::tryCatch(...)
       at beyondcell/R/Genesets.R:174:6
 12. base (local) tryCatchList(expr, classes, parentenv, handlers)
 15. base (local) tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
 16. base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 17. value[[3L]](cond)
 18. base::paste0(...)
       at beyondcell/R/Genesets.R:176:17

Add feedback on bcRegressOut

DSS background not computed. The imputation will be computed with just the drugs (not pathways) in the beyondcell object.
Imputing normalized BCS...

Could we add a progress bar here? I'm currently regressing ~40k cells and I'm unsure whether the imputation has crashed due to a lack of complete cases or it's just taking forever.
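For reference, base R already ships a text progress bar in `utils`. The sketch below wraps a chunked loop with it. `with_progress` is a hypothetical helper, and it assumes the work can be split into independent column chunks, which may not hold for the actual imputation.

```r
# Hypothetical helper: apply `fun` to consecutive column chunks of `m`
# while reporting progress on the console.
with_progress <- function(m, fun, chunk.size = 2) {
  starts <- seq(1, ncol(m), by = chunk.size)
  pb <- utils::txtProgressBar(min = 0, max = length(starts), style = 3)
  on.exit(close(pb))
  out <- m
  for (i in seq_along(starts)) {
    cols <- starts[i]:min(starts[i] + chunk.size - 1, ncol(m))
    out[, cols] <- fun(m[, cols, drop = FALSE])
    utils::setTxtProgressBar(pb, i)
  }
  out
}

m <- matrix(1:10, nrow = 2)
doubled <- with_progress(m, function(x) x * 2)
```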

Test failure: no non-missing arguments to max; returning -Inf

Warning (test-bcScore.R:349): default values
no non-missing arguments to max; returning -Inf
Backtrace:
 1. beyondcell::bcScore(pbmc, gs = ssc, expr.thres = 0)
      at test-bcScore.R:349:2
 3. base::apply(bc@normalized, 1, scales::rescale, to = c(0, 1))
      at beyondcell/R/Score.R:171:2
 5. scales:::rescale.numeric(newX[, i], ...)
 7. base::range(x, na.rm = TRUE, finite = TRUE)

I have the same problem Imputing normalized BCS...

Can I use other scRNA-seq imputation tools to the raw data and then input it into Beyondcell? Thank you very much.

Failure in test-bcRegressOut.R messages

Failure (test-bcRegressOut.R:249): messages

testthat::capture_messages(...) (`actual`) not equal to c(...) (`expected`).

lines(actual[[2]]) vs lines(expected[[2]])
- "No NaN values were found in bc@normalized. No imputation needed."
+ "Imputing normalized BCS..."
  ""

Failure (test-bcRegressOut.R:260): messages

testthat::capture_messages(...) (`actual`) not equal to c(...) (`expected`).

lines(actual[[2]]) vs lines(expected[[2]])
- "No NaN values were found in bc@normalized. No imputation needed."
+ "Imputing normalized BCS..."
  ""

lines(actual[[4]]) vs lines(expected[[4]])
- "No NaN values were found in bc@background. No imputation needed."
+ "Imputing background BCS..."
  ""

Failure (test-bcRegressOut.R:270): messages

testthat::capture_messages(...) (`actual`) not equal to c(...) (`expected`).

lines(actual[[2]]) vs lines(expected[[2]])
- "No NaN values were found in bc@normalized. No imputation needed."
+ "Imputing normalized BCS..."
  ""

lines(actual[[4]]) vs lines(expected[[4]])
- "No NaN values were found in bc@background. No imputation needed."
+ "Imputing background BCS..."
  ""

Failure (test-bcRegressOut.R:285): messages

testthat::capture_messages(...) (`actual`) not equal to c(...) (`expected`).

lines(actual[[2]]) vs lines(expected[[2]])
- "No NaN values were found in bc@normalized. No imputation needed."
+ "Imputing normalized BCS..."
  ""

lines(actual[[6]]) vs lines(expected[[6]])
- "No NaN values were found in bc@background. No imputation needed."
+ "Imputing background BCS..."
  ""

Failure (test-bcRegressOut.R:301): messages

testthat::capture_messages(...) (`actual`) not equal to c(...) (`expected`).

lines(actual[[1]]) vs lines(expected[[1]])
- "Computing background BCS using DSS signatures..."
+ "Background BCS already computed. Skipping this step."
  ""

lines(actual[[2]]) vs lines(expected[[2]])
- "No NaN values were found in bc@normalized. No imputation needed."
+ "Imputing normalized BCS..."
  ""

lines(actual[[6]]) vs lines(expected[[6]])
- "No NaN values were found in bc@background. No imputation needed."
+ "Imputing background BCS..."
  ""

Failure (test-bcRegressOut.R:314): messages

testthat::capture_messages(...) (`actual`) not equal to c(...) (`expected`).

lines(actual[[2]]) vs lines(expected[[2]])
- "Computing background BCS using DSS signatures..."
+ "Background BCS already computed. Skipping this step."
  ""

lines(actual[[3]]) vs lines(expected[[3]])
- "No NaN values were found in bc@normalized. No imputation needed."
+ "Imputing normalized BCS..."
  ""

lines(actual[[5]]) vs lines(expected[[5]])
- "No NaN values were found in bc@background. No imputation needed."
+ "Imputing background BCS..."
  ""

Failure (test-bcRegressOut.R:353): messages

testthat::capture_messages(...) (`actual`) not equal to c(...) (`expected`).

lines(actual[[4]]) vs lines(expected[[4]])
- "No NaN values were found in bc@background. No imputation needed."
+ "Imputing background BCS..."
  ""

Failure (test-bcRegressOut.R:380): messages

testthat::capture_messages(...) (`actual`) not equal to c(...) (`expected`).

lines(actual[[6]]) vs lines(expected[[6]])
- "No NaN values were found in bc@background. No imputation needed."
+ "Imputing background BCS..."
  ""

Failure (test-bcRegressOut.R:396): messages

testthat::capture_messages(...) (`actual`) not equal to c(...) (`expected`).

lines(actual[[1]]) vs lines(expected[[1]])
- "Computing background BCS using DSS signatures..."
+ "Background BCS already computed. Skipping this step."
  ""

Failure (test-bcRegressOut.R:409): messages

testthat::capture_messages(...) (`actual`) not equal to c(...) (`expected`).

lines(actual[[1]]) vs lines(expected[[1]])
- "DSS background not computed. The imputation will be computed with just the drugs (not pathways) in the beyondcell object."
+ "Background BCS already computed. Skipping this step."
  ""

lines(actual[[6]]) vs lines(expected[[6]])
- "No NaN values were found in bc@background. No imputation needed."
+ "Imputing background BCS..."
  ""

Replace gdata::trim with built-in code or a tidy alternative

UPDATE: I'm looking for actual calls to gdata::trim and it seems there are none. I see an import statement in GenerateGenesets and GetCollection, but no calls in the code.

It may be the case that I already removed the call when separating the old GenerateGenesets into two standalone functions.

Remove see dependency in bcCellCycle

We are still evaluating whether to keep r:see around for now.

Beyondcell for non-cancerous and non-drug related data?

In GitLab by @gwellem on Nov 19, 2021, 11:19

Thank you very much for this great package. Please can I also use the Beyondcell algorithm to analyse treatment vs control conditions, in non-cancerous and non-drug related single cell RNA seq data?
Thanks

Failure in test-bcScore.R default values

Failure (test-bcScore.R:306): default values

bc.object@scaled (`actual`) not equal to `scaled` (`expected`).

Failure (test-bcScore.R:309): default values

bc.object.up@scaled (`actual`) not equal to `scaled.up` (`expected`).

Failure (test-bcScore.R:312): default values

bc.object.down@scaled (`actual`) not equal to `scaled.down` (`expected`).

Failure (test-bcScore.R:340): default values

bc.object@normalized (`actual`) not equal to `norm` (`expected`).

Failure (test-bcScore.R:343): default values

bc.object.up@normalized (`actual`) not equal to `norm.up` (`expected`).

Failure (test-bcScore.R:346): default values

bc.object.down@normalized (`actual`) not equal to `norm.down` (`expected`).

Failure (test-bcScore.R:380): default values

unname([email protected]) (`actual`) not equal to sp.up.down(norm, scaled) (`expected`).

Make the regression step mandatory

In GitLab by @cfustero on Nov 27, 2020, 13:22

The number of features per cell seems to always influence the results (generating "feature-poor" clusters), and the residuals allow us to obtain comparable metrics for all drugs. Therefore, the bcRegressOut step should be mandatory in order to obtain the signature statistics.

Users should follow these steps for a correct analysis of their samples:

Compute BCS
Compute UMAP
Check clustering and look for unwanted sources of variation (bcClusters function)
Regress out unwanted sources of variation
Recompute UMAP
Obtain signature's statistics

Integrated bc-viability score

In GitLab by @cfustero on May 14, 2020, 16:44

Cell cycle, proliferation or apoptosis signatures can help us identify positive or adverse drug effects. We need to combine the bc scores obtained from both drugs and viability-related signatures in order to generate a bc-viability score.

Failure in test-bcScore.R messages

Failure (test-bcScore.R:223): messages

`bcScore(pbmc, gs = gs10, expr.thres = 0.1)` did not throw the expected message.

Failure (test-bcScore.R:228): messages

`bcScore(pbmc, gs = gs10, expr.thres = 0.3)` did not throw the expected message.

Change UMAP method to uwot

In GitLab by @SGMartin on Jan 21, 2021, 10:10

Seurat v3.1.0 changed its default UMAP method from umap-learn (Python) to that from r-uwot. Adopting uwot as the UMAP method for beyondcell would solve the following issues:

The need to use a wrapper to call Python from R.
Numerous issues regarding umap-learn installation (e.g. satijalab/seurat#3361).

Proposal:

Change the following code in R/Reductions.R, line 172:

sc <- Seurat::RunUMAP(sc, dims = 1:pc, umap.method = "umap-learn",
                      n.components = 2, verbose = FALSE)

to:

sc <- Seurat::RunUMAP(sc, dims = 1:pc, umap.method = "uwot",
                      n.components = 2, verbose = FALSE)

And update the recipe accordingly.

Failure (test-bcScore.R:198): warnings

`bcScore(pbmc, gs = gs.warning)` did not throw the expected warning.

bcMerge fails if bc object background matrix is empty

In GitLab by @SGMartin on Jun 4, 2021, 15:14

Running a previously working piece of code, I get the following error when performing a merge of two beyondcell objects:

Error in bc2@background[, cells] : no 'dimnames' attribute for array

Here are the contents of both objects' background:

<0 x 0 matrix>

I guess some kind of check is needed before this line of code is executed:

bc@background <- unique(rbind(bc1@background, bc2@background[, cells]))[, cells]
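A minimal sketch of such a check, written against plain matrices rather than beyondcell objects: `merge_background` is a hypothetical helper that skips empty backgrounds before subsetting by cell name.

```r
# Hypothetical helper: row-bind two background matrices (signatures x cells),
# ignoring any that are empty, and drop duplicated rows.
merge_background <- function(bg1, bg2, cells) {
  filled <- Filter(function(m) length(m) > 0, list(bg1, bg2))
  if (length(filled) == 0) return(matrix(ncol = 0, nrow = 0))
  merged <- do.call(rbind, lapply(filled, function(m) m[, cells, drop = FALSE]))
  unique(merged)
}

cells <- c("c1", "c2")
empty <- matrix(ncol = 0, nrow = 0)
bg <- matrix(c(0.1, 0.2, 0.3, 0.4), nrow = 2,
             dimnames = list(c("sigA", "sigB"), cells))

both.empty <- merge_background(empty, empty, cells)
one.empty <- merge_background(bg, empty, cells)
```

With this guard, two empty backgrounds give an empty result instead of the 'dimnames' error, and a single non-empty background is returned unchanged.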

Update beyondcell recipe

Once we have replaced the dependencies with built-in or tidyverse functions, we should:

Remove:

qusage
bnstruct
see
useful
plyr
ggplot2

Upgrade:

Seurat

Add:

tidyverse

Check if images slot has any image in bcSubset

Add parameters to bcUMAP

We should add a parameter for the number of PCs used to compute the PCA and another one for the random seed of both the PCA and UMAP.

Update project badges that point to the anaconda.org/bu_cnio repo

Since new versions will be in bioconda instead, we should update/remove those badges.

@SGMartin not sure if this is you, please assign it as needed.

Test failure: gs genes are in uppercase and sc genes are capitalized.

Warning (test-bcScore.R:187): warnings
gs genes are in uppercase and sc genes are capitalized. Please check your Seurat object and translate the genes if necessary.
Backtrace:
 1. testthat::expect_warning(...)
      at test-bcScore.R:187:2
 7. beyondcell::bcScore(pbmc, gs = gs.mouse)

Failure (test-bcScore.R:187): warnings
`bcScore(pbmc, gs = gs.mouse)` did not throw the expected warning.

bcScore/bcMerge/bcRecompute errors when using just 1 signature

Hello, I ran into some issues with the bcScore, bcMerge, and bcRecompute functions.

When I tried inputting my own GMT files/functional signatures to a beyondcell object (bcScore/bcMerge) and then run bcRecompute, I get an error:

Error in bc@data[x, cells] : subscript out of bounds

This might have to do with how R tries to handle one-row matrices as vectors, causing downstream problems along the way such as creating unequal metacolumns. Would you be able to take a look at this?
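The behaviour described above is easy to reproduce in base R. Whether it is the actual cause inside beyondcell is the reporter's hypothesis, and the matrix below is made-up illustration data.

```r
# A one-signature score matrix: 1 row, 3 cells.
scores <- matrix(c(0.2, 0.8, 0.5), nrow = 1,
                 dimnames = list("sig1", c("c1", "c2", "c3")))
cells <- c("c1", "c3")

# Default subsetting drops the matrix to a named vector.
dropped <- scores["sig1", cells]

# drop = FALSE keeps the 1 x 2 matrix, so later [row, col] indexing still works.
kept <- scores["sig1", cells, drop = FALSE]

is.null(dim(dropped))
dim(kept)
```

Adding `drop = FALSE` wherever a matrix is subset by signature is the usual defensive fix for this class of bug.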

suppressWarnings(bcRegressOut(bc.sub.reg, add.DSS = FALSE, vars.to.regress = "nFeature_RNA")@regression[ordering]) (`actual`) not equal to list(...) (`expected`).

`actual$order.background`:   "subset" "regression"
`expected$order.background`: ""       ""