Hello, it's really an impressive tool to broaden the single cell analysis. And I w

Error in AggregatePeakCounts() about sierra HOT 10 CLOSED

vccri commented on May 28, 2024

Error in AggregatePeakCounts()

from sierra.

Comments (10)

rj-patrick commented on May 28, 2024

Hi, thanks for trying out the package.

Can I just confirm that "AML_ALL_merged_peaks.txt" is the file that you're using for CountPeaks for all of your data-sets? Theoretically all the peak identifiers should be in the count matrices as well (i.e. in the code you pointed to I would expect all row names of this.data to be in the peak file used for counting). That said, I should add a check, so thanks for pointing it out.

Regarding your alternative workflow with Seurat merge, that should work fine.
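
A check along the lines mentioned above could look something like this minimal base R sketch. The peak identifiers, the `merged.peaks` vector, and the toy `this.data` matrix are all made up for illustration; this is not the check that was added to Sierra.

```r
# Hypothetical peak identifiers standing in for the contents of the merged peak file
merged.peaks <- c("GeneA:1:100-200:1", "GeneA:1:500-600:1", "GeneB:2:50-150:-1")

# Toy count matrix; its row names play the role of rownames(this.data)
this.data <- matrix(
  1:6, nrow = 3,
  dimnames = list(
    c("GeneA:1:100-200:1", "GeneB:2:50-150:-1", "GeneC:3:10-90:1"),
    c("cell1", "cell2")
  )
)

# Peaks present in the count matrix but absent from the merged peak file
missing.peaks <- setdiff(rownames(this.data), merged.peaks)

if (length(missing.peaks) > 0) {
  message(length(missing.peaks), " peak(s) not found in the merged peak file: ",
          paste(missing.peaks, collapse = ", "))
}
```

If `missing.peaks` is non-empty, the count matrix was probably generated against a different peak file than the one being used for aggregation.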


pangxueyu233 commented on May 28, 2024

Yes, "AML_ALL_merged_peaks.txt" covers all of my datasets, @reprobate.
Thanks again. I will move on with your pipeline.


pangxueyu233 commented on May 28, 2024

Hello,
I saw a graph in your paper that looks very useful. Could you share the code used to generate it?


rj-patrick commented on May 28, 2024

I was actually planning to incorporate that functionality into the R package anyway. I hope to make that update this coming week and will let you know when it is ready.


rj-patrick commented on May 28, 2024

Hi @pangxueyu233,

I've added the functionality for performing the 3'UTR length analysis and generating the above visualisations. The details are in the updated vignette under section 5. Let me know if you're able to run the new functions.


pangxueyu233 commented on May 28, 2024

Thanks for maintaining this, @reprobate! I tried your new function as follows:

Idents(peaks.seurat) <- peaks.seurat$new_anno3
res.table = DetectUTRLengthShift(peaks.object = peaks.seurat, 
                          gtf_gr = gtf_gr,
                          gtf_TxDb = gtf_TxDb,
                          population.1 = "Neutrophil like", 
                          population.2 = NULL, ncores = 25)

But I got an error:

>       sel_clu <- unique(as.character(peaks.seurat$new_anno3))[i]
>       res.table = DetectUTRLengthShift(peaks.object = peaks.seurat,
+                                   gtf_gr = gtf_gr,
+                                   gtf_TxDb = gtf_TxDb,
+                                   population.1 = sel_clu,
+                                   population.2 = NULL, ncores = 25)
[1] "1204 expressed peaks in feature types UTR3"
[1] "1152 peaks after filtering out A-rich annotations"
[1] "111 genes detected with multiple peak sites expressed"
[1] "243 Individual peak sites to test"
converting counts to integer mode
[1] "Running DEXSeq test..."
-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.
[1] "Detecting shifts in 3'UTR length usage"
Error in data.frame(SiteLocation = site.diff, NumSites = num.sites, row.names = diff.site,  :
  arguments imply differing number of rows: 0, 1
In addition: Warning message:
In vst(exp(alleffects), object) :
  Dispersion function not parametric, applying log2(x+ 1) instead of vst...

It seems my data cannot be normalised with log2(x+1), and I don't know how to avoid this error.
When I switch from the "Neutrophil like" cluster to other clusters, the function runs fine for some of them but fails for others. I do want to know which APA events occur in the "Neutrophil like" cluster, so is there a way to achieve that?
Thanks.
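
Since the call above is run once per cluster and only some clusters fail, one way to see which ones fail without stopping the whole loop is to wrap each call in `tryCatch`. The sketch below uses a stand-in function, `run_one` (hypothetical), in place of the real `DetectUTRLengthShift` call, and the failure for "Neutrophil like" is simulated for illustration.

```r
clusters <- c("HSPC", "GMP like", "Neutrophil like")

# Hypothetical stand-in for the real per-cluster call; the error is simulated
run_one <- function(cluster) {
  if (cluster == "Neutrophil like") {
    stop("arguments imply differing number of rows: 0, 1")
  }
  data.frame(cluster = cluster)  # placeholder result
}

# Run every cluster; on error, keep the message instead of aborting the loop
results <- lapply(clusters, function(cl) {
  tryCatch(run_one(cl), error = function(e) conditionMessage(e))
})
names(results) <- clusters

# Clusters whose result is an error message rather than a data frame
failed <- names(results)[vapply(results, is.character, logical(1))]
```

With the real function substituted for `run_one`, `failed` would list the clusters that need a closer look, and `results` would keep the successful tables.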


rj-patrick commented on May 28, 2024

Hi @pangxueyu233

There are a couple of things going on here. The log2(x+1) message isn't an error - this comes from DEXSeq trying to fit its dispersion function. Is your "Neutrophil like" population a minor cluster? Having a small number of cells will result in a lower 'sequencing depth' of the pseudo-bulk profiles provided to DEXSeq and can cause this issue to occur. One thing you could try would be to decrease the expression threshold for a peak - e.g. try setting exp.thresh to 0.05 and see what happens.

The actual error seems to be in the post-processing step, but I might need a bit more info to resolve it. Can you tell me what output you get when you run the function below?

res.table = DUTest(peaks.object = peaks.seurat,
                     population.1 = sel_clu, 
                     population.2 = NULL, 
                     feature.type = c("UTR3"), 
                     filter.pA.stretch = TRUE)
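
To illustrate why relaxing the threshold can bring more peaks into the test, here is a toy sketch. It assumes that `exp.thresh` acts as a minimum fraction of cells in which a peak is detected; that reading, the simulated counts, and the cut-off of 0.1 used for comparison are assumptions for illustration, not a description of Sierra's internals.

```r
set.seed(1)
n.peaks <- 200
n.cells <- 50

# Simulated sparse counts; each peak (row) gets its own low expression rate
rates <- runif(n.peaks, min = 0.01, max = 0.3)
counts <- matrix(
  rpois(n.peaks * n.cells, lambda = rep(rates, times = n.cells)),
  nrow = n.peaks
)

# Fraction of cells in which each peak has at least one count
frac.expressing <- rowMeans(counts > 0)

# Number of peaks retained under a stricter and a relaxed cut-off
n.kept.default <- sum(frac.expressing >= 0.1)
n.kept.relaxed <- sum(frac.expressing >= 0.05)
```

By construction, `n.kept.relaxed` can never be smaller than `n.kept.default`, so lowering the threshold can only add peaks to the set that is tested.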


pangxueyu233 commented on May 28, 2024

I have tried your code, @reprobate, and the results were as follows:

> res.table = DUTest(peaks.object = peaks.seurat,
+                      population.1 = "Neutrophil like",
+                      population.2 = NULL,
+                      feature.type = c("UTR3"),
+                      filter.pA.stretch = TRUE)
[1] "1204 expressed peaks in feature types UTR3"
[1] "1152 peaks after filtering out A-rich annotations"
[1] "111 genes detected with multiple peak sites expressed"
[1] "243 Individual peak sites to test"
converting counts to integer mode
[1] "Running DEXSeq test..."
-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.
Warning message:
In vst(exp(alleffects), object) :
  Dispersion function not parametric, applying log2(x+ 1) instead of vst...

My "Neutrophil like" cluster is the second-largest population, as you can see:

> table(peaks.seurat$new_anno3)

           HSPC  Macrophages II   Macrophages I             MEP    Erythrocytes
           1032             943            2385             322             368
       GMP like    Erythroblast      Neutrophil             GMP        Mono pro
          11495            1772            5619             964            1042
Neutrophil like
           7405
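
The ranking can be checked directly from the counts in the table above. This sketch re-enters those counts by hand as a named vector (`cluster.sizes` is just an illustrative name).

```r
# Cell counts copied from the table() output above
cluster.sizes <- c(
  "HSPC" = 1032, "Macrophages II" = 943, "Macrophages I" = 2385,
  "MEP" = 322, "Erythrocytes" = 368, "GMP like" = 11495,
  "Erythroblast" = 1772, "Neutrophil" = 5619, "GMP" = 964,
  "Mono pro" = 1042, "Neutrophil like" = 7405
)

# Order clusters from largest to smallest
ranked <- sort(cluster.sizes, decreasing = TRUE)

# Position of the cluster of interest in that ordering
rank.of.target <- which(names(ranked) == "Neutrophil like")

# Share of all cells that belong to the cluster of interest
frac.of.cells <- cluster.sizes[["Neutrophil like"]] / sum(cluster.sizes)
```

On a live object, the same ordering could be obtained with `sort(table(peaks.seurat$new_anno3), decreasing = TRUE)`.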


rj-patrick commented on May 28, 2024

Thanks @pangxueyu233.

The error that needs to be resolved is:

Error in data.frame(SiteLocation = site.diff, NumSites = num.sites, row.names = diff.site,  :
  arguments imply differing number of rows: 0, 1

This occurs in the post-processing steps. I haven't been able to reproduce the error myself, though, which makes debugging a bit tricky. Could you show me what is in res.table?

In the meantime, I've added a check to the code that should allow you to run the function without that error occurring. However, your output suggests that only a small number of peaks are being detected as expressed, so I wouldn't expect many examples of APA to be detected.

As I mentioned above, you could try reducing the stringency of the expression threshold to potentially increase the number of peaks considered and see if that makes a difference, for example:

res.table = DUTest(peaks.object = peaks.seurat,
                     population.1 = sel_clu, 
                     population.2 = NULL, 
                     exp.thresh = 0.05,
                     feature.type = c("UTR3"), 
                     filter.pA.stretch = TRUE)

res.table = DetectUTRLengthShift(
                     peaks.object = peaks.seurat,
                     gtf_gr = gtf_gr,
                     gtf_TxDb = gtf_TxDb,
                     population.1 = sel_clu, 
                     population.2 = NULL, 
                     exp.thresh = 0.05,
                     feature.type = c("UTR3"), 
                     filter.pA.stretch = TRUE)
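
For reference, the `data.frame` error quoted above is what base R reports when one column has length 0 and another has length 1. The sketch below reproduces that message and shows one possible guard. Which variable was actually empty in Sierra is not known from this thread, and `build_site_table` is a hypothetical helper, not the check that was added to the package.

```r
# Assumed scenario: no sites survived filtering, but a scalar count is still present
site.diff <- character(0)
num.sites <- 1

# Capture the error message instead of stopping
err <- tryCatch(
  data.frame(SiteLocation = site.diff, NumSites = num.sites),
  error = function(e) conditionMessage(e)
)

# Hypothetical guard: return an empty table when there is nothing to report
build_site_table <- function(site.diff, num.sites) {
  if (length(site.diff) == 0 || length(num.sites) == 0) {
    return(data.frame(SiteLocation = character(0), NumSites = numeric(0)))
  }
  data.frame(SiteLocation = site.diff, NumSites = num.sites)
}

safe.table <- build_site_table(site.diff, num.sites)
```

Returning an empty table keeps downstream code running when no shifted sites are found, which matches the expectation above that few APA events would be detected with so few expressed peaks.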


pangxueyu233 commented on May 28, 2024

Thanks a lot! I got the right results, @reprobate.

