Error in AggregatePeakCounts() (sierra issue, closed)

vccri commented on May 28, 2024
Error in AggregatePeakCounts()

from sierra.

Comments (10)

rj-patrick commented on May 28, 2024

Hi, thanks for trying out the package.

Can I just confirm that "AML_ALL_merged_peaks.txt" is the file that you're using for CountPeaks for all of your data-sets? Theoretically all the peak identifiers should be in the count matrices as well (i.e. in the code you pointed to I would expect all row names of this.data to be in the peak file used for counting). That said, I should add a check, so thanks for pointing it out.
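A check along those lines could look like the following minimal sketch in base R. It is not Sierra's implementation; the object names (`peak_ids`, `count_list`) and the toy peak identifiers are assumptions made purely for illustration.

```r
# Toy stand-ins: peak identifiers from the merged peak file and
# per-data-set count matrices (all names and values are hypothetical).
peak_ids <- c("GeneA:1:100-200:1", "GeneA:1:500-600:1", "GeneB:2:50-150:-1")

count_list <- list(
  sample1 = matrix(1:6, nrow = 3,
                   dimnames = list(peak_ids, c("cell1", "cell2"))),
  sample2 = matrix(1:4, nrow = 2,
                   dimnames = list(c("GeneA:1:100-200:1", "GeneC:3:10-90:1"),
                                   c("cell3", "cell4")))
)

# For each matrix, list the row names that are absent from the peak file.
missing_peaks <- lapply(count_list, function(m) setdiff(rownames(m), peak_ids))

# Keep only the data sets with at least one unknown peak identifier.
problem_sets <- missing_peaks[lengths(missing_peaks) > 0]

if (length(problem_sets) > 0) {
  message("Peaks not found in the peak file for: ",
          paste(names(problem_sets), collapse = ", "))
}
```

Running a check like this before aggregation would show which data set, if any, was counted against a different peak file.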

Regarding your alternative workflow with Seurat merge, that should work fine.

pangxueyu233 commented on May 28, 2024

Yes, "AML_ALL_merged_peaks.txt" is the peak file for all of my datasets, @reprobate.
Thanks again. I will move on with your pipeline.

pangxueyu233 commented on May 28, 2024

Hello,
I saw a graph in your paper that looks very useful. Could you share the code used to generate it?
image

rj-patrick commented on May 28, 2024

Actually, I was planning on incorporating that functionality into the R package anyway. I will let you know when it is ready; hopefully I can make that update this coming week.

rj-patrick commented on May 28, 2024

Hi @pangxueyu233,

I've added the functionality for performing the 3'UTR length analysis and generating the above visualisations. The details are in the updated vignette under section 5. Let me know if you're able to run the new functions.

pangxueyu233 commented on May 28, 2024

Thanks for maintaining the package, @reprobate! I tried your new function as follows:

Idents(peaks.seurat) <- peaks.seurat$new_anno3
res.table = DetectUTRLengthShift(peaks.object = peaks.seurat, 
                          gtf_gr = gtf_gr,
                          gtf_TxDb = gtf_TxDb,
                          population.1 = "Neutrophil like", 
                          population.2 = NULL, ncores = 25)

But I got an error:

>       sel_clu <- unique(as.character(peaks.seurat$new_anno3))[i]
>       res.table = DetectUTRLengthShift(peaks.object = peaks.seurat,
+                                   gtf_gr = gtf_gr,
+                                   gtf_TxDb = gtf_TxDb,
+                                   population.1 = sel_clu,
+                                   population.2 = NULL, ncores = 25)
[1] "1204 expressed peaks in feature types UTR3"
[1] "1152 peaks after filtering out A-rich annotations"
[1] "111 genes detected with multiple peak sites expressed"
[1] "243 Individual peak sites to test"
converting counts to integer mode
[1] "Running DEXSeq test..."
-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.
[1] "Detecting shifts in 3'UTR length usage"
Error in data.frame(SiteLocation = site.diff, NumSites = num.sites, row.names = diff.site,  :
  arguments imply differing number of rows: 0, 1
In addition: Warning message:
In vst(exp(alleffects), object) :
  Dispersion function not parametric, applying log2(x+ 1) instead of vst...

It seems my data cannot be normalised with log2(x+1), and I don't know how to avoid this error.
When I switch from "Neutrophil like" to other clusters, the function runs fine for most of them, although a few also fail. I do want to know which APA events occur in the "Neutrophil like" cluster, so is there a way to achieve that?
Thanks.
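
When looping over clusters as in the console output above, one way to keep a single failing cluster from stopping the whole run is to wrap each call in `tryCatch`. The sketch below is a generic base R pattern, not part of Sierra; `run_one_cluster` is a hypothetical stand-in for the call to `DetectUTRLengthShift`, and it fails on purpose for one cluster to mimic the reported error.

```r
# Hypothetical stand-in for the per-cluster analysis call; it raises an
# error for one cluster to imitate the failure reported above.
run_one_cluster <- function(cluster) {
  if (cluster == "Neutrophil like") {
    stop("arguments imply differing number of rows: 0, 1")
  }
  data.frame(cluster = cluster, n_shifts = 1L)
}

clusters <- c("HSPC", "Neutrophil like", "GMP")

# Run every cluster; on error, record the message instead of aborting.
results <- lapply(clusters, function(cl) {
  tryCatch(
    list(ok = TRUE, value = run_one_cluster(cl)),
    error = function(e) list(ok = FALSE, value = conditionMessage(e))
  )
})
names(results) <- clusters

# Clusters that failed, with their error messages.
failed <- Filter(function(r) !r$ok, results)
print(names(failed))
```

This does not fix the underlying problem, but it makes it easy to see which clusters fail and with what message.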

rj-patrick commented on May 28, 2024

Hi @pangxueyu233

There are a couple of things going on here. The log2(x+1) message isn't an error - this comes from DEXSeq trying to fit its dispersion function. Is your "Neutrophil like" population a minor cluster? Having a small number of cells will result in a lower 'sequencing depth' of the pseudo-bulk profiles provided to DEXSeq and can cause this issue to occur. One thing you could try would be to decrease the expression threshold for a peak - e.g. try setting exp.thresh to 0.05 and see what happens.
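
To illustrate why lowering the threshold can increase the number of peaks tested, here is a small base R sketch. It assumes that `exp.thresh` acts as a minimum fraction of cells with a non-zero count for a peak; that interpretation, the toy matrix, and the exact comparison used are assumptions, so check the function documentation for the precise definition.

```r
# Toy peak-by-cell count matrix (hypothetical values): 4 peaks, 40 cells.
counts <- matrix(0, nrow = 4, ncol = 40,
                 dimnames = list(paste0("peak", 1:4), paste0("cell", 1:40)))
counts["peak1", 1:20] <- 3   # detected in 50% of cells
counts["peak2", 1:6]  <- 1   # detected in 15% of cells
counts["peak3", 1:3]  <- 2   # detected in 7.5% of cells
counts["peak4", 1]    <- 1   # detected in 2.5% of cells

# Fraction of cells with a non-zero count for each peak.
frac_expressing <- rowMeans(counts > 0)

# Number of peaks retained at two candidate thresholds.
n_kept <- sapply(c(0.1, 0.05), function(th) sum(frac_expressing >= th))
names(n_kept) <- c("thresh_0.1", "thresh_0.05")
print(n_kept)
```

In this toy example the peak detected in 7.5% of cells is only retained at the lower threshold.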

The actual error seems to be in the post-processing step, but I might need a bit more info to resolve it. Can you tell me what output you get when you run the function below?

res.table = DUTest(peaks.object = peaks.seurat,
                     population.1 = sel_clu, 
                     population.2 = NULL, 
                     feature.type = c("UTR3"), 
                     filter.pA.stretch = TRUE)

pangxueyu233 commented on May 28, 2024

I have tried your code, @reprobate, and the results were as follows:

> res.table = DUTest(peaks.object = peaks.seurat,
+                      population.1 = "Neutrophil like",
+                      population.2 = NULL,
+                      feature.type = c("UTR3"),
+                      filter.pA.stretch = TRUE)
[1] "1204 expressed peaks in feature types UTR3"
[1] "1152 peaks after filtering out A-rich annotations"
[1] "111 genes detected with multiple peak sites expressed"
[1] "243 Individual peak sites to test"
converting counts to integer mode
[1] "Running DEXSeq test..."
-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.
Warning message:
In vst(exp(alleffects), object) :
  Dispersion function not parametric, applying log2(x+ 1) instead of vst...

My "Neutrophil like" cluster is the second-largest population, as you can see:

> table(peaks.seurat$new_anno3)

           HSPC  Macrophages II   Macrophages I             MEP    Erythrocytes
           1032             943            2385             322             368
       GMP like    Erythroblast      Neutrophil             GMP        Mono pro
          11495            1772            5619             964            1042
Neutrophil like
           7405

rj-patrick commented on May 28, 2024

Thanks @pangxueyu233.

The error that needs to be resolved is:

Error in data.frame(SiteLocation = site.diff, NumSites = num.sites, row.names = diff.site,  :
  arguments imply differing number of rows: 0, 1

This occurs in the post-processing steps. I haven't been able to reproduce the error myself, though, which makes debugging a bit tricky. Are you able to show me what is in res.table?

In the meantime, I've added a check to the code that should allow you to run the function without that error occurring. However, from your output, it seems that only a small number of peaks are being detected as expressed, so I wouldn't expect many examples of APA to be detected.
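
For reference, the same error message appears in base R whenever `data.frame` receives a zero-length column alongside a column of length one. The sketch below reproduces that and shows one possible kind of guard. The variable names mirror those in the error message, but the values and the guard itself are assumptions, not the actual change made in Sierra.

```r
# Hypothetical values: no site was reported, but a count was still produced.
site.diff <- character(0)
num.sites <- 1L

# Building the table directly fails; capture the message rather than stopping.
err <- tryCatch(
  data.frame(SiteLocation = site.diff, NumSites = num.sites),
  error = function(e) conditionMessage(e)
)
print(err)

# One possible guard: return an empty, well-formed table when there is
# nothing to report, so downstream code still receives a data frame.
build_table <- function(site.diff, num.sites) {
  if (length(site.diff) == 0) {
    return(data.frame(SiteLocation = character(0), NumSites = integer(0)))
  }
  data.frame(SiteLocation = site.diff, NumSites = num.sites)
}

empty_res <- build_table(site.diff, num.sites)
print(nrow(empty_res))
```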

As I mentioned above, you could try reducing the stringency of the expression threshold to potentially increase the number of peaks considered and see if that makes a difference, for example:

res.table = DUTest(peaks.object = peaks.seurat,
                     population.1 = sel_clu, 
                     population.2 = NULL, 
                     exp.thresh = 0.05,
                     feature.type = c("UTR3"), 
                     filter.pA.stretch = TRUE)

or

res.table = DetectUTRLengthShift(
                     peaks.object = peaks.seurat,
                     gtf_gr = gtf_gr,
                     gtf_TxDb = gtf_TxDb,
                     population.1 = sel_clu, 
                     population.2 = NULL, 
                     exp.thresh = 0.05,
                     feature.type = c("UTR3"), 
                     filter.pA.stretch = TRUE)

pangxueyu233 commented on May 28, 2024

Thanks a lot! I got the right results, @reprobate.
