
Comments (2)

jakezhusph commented on August 21, 2024

Hi! Thanks for the cool software. I downloaded the example simulated data from processed_data/sim_MOB_pattern2_fc3_tau35_count_power1.csv, and when I ran SpatialDE and SPARK on it, I got 0 (SpatialDE) and 112 (SPARK) significant genes at qval < 0.05. Is this (a single replicate of) the same data used for Figure 1c, i.e. the expected result? If I understand the plot correctly, it seems that more genes should be recovered at FDR = 0.05.

Code for SpatialDE

import numpy as np
import pandas as pd
import NaiveDE
import SpatialDE

# Load the simulated counts (spots x genes) and keep genes detected in at least 3 spots
counts = pd.read_csv("../SPARK-Analysis/processed_data/sim_MOB_pattern2_fc3_tau35_count_power1.csv", index_col = 0)
counts = counts.T[(counts > 0).sum(0) >= 3].T

# Spot coordinates are encoded in the row names as "x"-separated pairs
x, y = zip(*[pos.split('x') for pos in counts.index])
sample_info = pd.DataFrame({
    'x': np.array(x).astype(float),
    'y': np.array(y).astype(float),
    'total_counts': counts.sum(1)
})

# Variance-stabilize and regress out library size, as in the SpatialDE tutorial
norm_expr = NaiveDE.stabilize(counts.T).T
resid_expr = NaiveDE.regress_out(sample_info, norm_expr.T, 'np.log(total_counts)').T

X = sample_info[['x', 'y']]
results = SpatialDE.run(X, resid_expr)

(results.qval < 0.05).sum()  # returns 0

Code for SPARK

library(SPARK)

# Load the simulated counts (spots x genes); spot coordinates are in the row names
countdata <- read.csv("../SPARK-Analysis/processed_data/sim_MOB_pattern2_fc3_tau35_count_power1.csv", row.names = 1)

rn <- row.names(countdata)
info <- cbind.data.frame(x = as.numeric(sapply(strsplit(rn, split = "x"), "[", 1)), 
                         y = as.numeric(sapply(strsplit(rn, split = "x"), "[", 2)))
rownames(info) <- row.names(countdata)

# SPARK expects a genes x spots count matrix
spark <- CreateSPARKObject(counts = t(countdata), location = info[,1:2], 
    percentage = 0.1, min_total_counts = 10)
spark@lib_size <- apply(spark@counts, 2, sum)

# Fit the variance-component models, then test for spatial expression patterns
spark <- spark.vc(spark, covariates = NULL, lib_size = spark@lib_size, 
    num_core = 1, verbose = T, fit.maxiter = 500)
spark <- spark.test(spark, check_positive = T, verbose = T)

sum(spark@res_mtest$adjusted_pvalue < 0.05) # returns 102

Hi, this is not how power is calculated in the simulations. In the simulations we know which genes are true signals, so we compute power directly by counting the numbers of true and false signals among the top-ranked genes. The power is essentially the number of true signals detected at a given number of detected false signals. For example, say we simulate 1,000 genes with 100 signals and 900 non-signals. You apply both methods to the data, order the genes by p-value, and then count how many of the top genes are signals and how many are non-signals. Given, say, one detected false signal, the number of true signals recovered up to that point is the power.
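For concreteness, here is a minimal sketch in Python of the ranked-p-value power calculation described above. The function name and the variable names in the usage comment are hypothetical, not part of SPARK or the paper's analysis scripts.

import numpy as np

def power_at_n_false(pvals, is_signal, n_false = 1):
    # Rank genes from smallest to largest p-value
    order = np.argsort(np.asarray(pvals))
    signal_ranked = np.asarray(is_signal, dtype = bool)[order]
    # Running count of false signals as we move down the ranked list
    false_running = np.cumsum(~signal_ranked)
    # Keep every gene ranked before the (n_false + 1)-th false signal appears
    keep = false_running <= n_false
    # Power: fraction of the true signals recovered at that point
    return signal_ranked[keep].sum() / signal_ranked.sum()

# Hypothetical usage, assuming `truth` is a boolean vector marking the simulated
# signal genes in the same gene order as the method's results:
# power_at_n_false(results['pval'].values, truth, n_false = 1)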

The q-values or adjusted p-values are mainly for real data, where we don't know the ground truth.

Also, the example data is just a toy example; for the detailed simulation settings, you can check the supplementary material of our paper.

Let me know if you have any further questions.


yhtgrace commented on August 21, 2024

Thanks for the clarification!

