Comments (9)
Sorry for the slow response. I wouldn't expect three datasets to take a long time to merge. How many peaks are in each of the files?
from sierra.
Hi @rj-patrick. We are facing a similar issue. With a total of 37 samples, the merge step has been running for three weeks and has not finished yet. There are > 2.8 million peaks across all samples. "ncores" is set to 50. Do you have any insights regarding our situation? Thanks.
from sierra.
Hi @alanlamsiu,
Thanks for that. Do you have output from the MergePeaks function? It would be helpful to know where it has gotten up to. Three weeks is too long, but this is a much larger dataset than what we've tested it on. There are a couple of potential issues, but the output would be the best way to diagnose the problem.
Cheers,
Ralph
from sierra.
Thanks @rj-patrick for getting back.
I can see that the log file keeps growing and now has 27,482,128 lines, yet the output specified by "output.file" has not been generated. Based on the log file, I guess that "internal peak merging" is done for all 37 samples, and the messages "Comparing peaks from to remaining data-sets" have already been printed. For computing resources, the average memory usage is 4.5 GB and peak CPU usage is 39.05.
I recall that when I ran a test using five samples, with a total of > 490 thousand peaks, it took less than a week to complete.
Please let me know if you need more details.
Thanks.
from sierra.
Thanks, to clarify, the message "Comparing peaks from [DATASET X] to remaining data-sets" is printed for how many of the 37 datasets?
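With a log of tens of millions of lines, counting these messages by eye is impractical. Below is a minimal sketch of how the count could be obtained, assuming each message appears on its own line in the form quoted above. The function name `datasets_compared` and the log file name `merge.log` are hypothetical, and the exact log layout is an assumption, not something confirmed by Sierra's documentation.

```python
import re
from typing import Iterable, List

# Assumed message format, based on the text quoted in this thread.
PATTERN = re.compile(r"Comparing peaks from (.+?) to remaining data-sets")


def datasets_compared(lines: Iterable[str]) -> List[str]:
    """Return the distinct dataset names seen in matching log lines, in order."""
    seen = {}  # dict preserves insertion order and gives fast membership checks
    for line in lines:
        match = PATTERN.search(line)
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)


# Usage (hypothetical file name): stream the log instead of loading it whole.
# with open("merge.log") as handle:
#     names = datasets_compared(handle)
#     print(len(names), "datasets reached the comparison step")
```

Streaming the file line by line keeps memory use flat regardless of how large the log has grown.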
from sierra.
The message has been printed for all 37 datasets.
from sierra.
Thanks. I think I know where the problem is. There is a final step where peaks are iteratively checked for merging, but with your dataset, my guess is it's getting stuck in a loop. Perhaps the best thing at this point would be to merge whatever is remaining, but for now I've set a limit on the number of iterations to go through. Pull the latest update and see if that fixes the issue.
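For readers wondering what an iteration limit on such a step looks like, here is a minimal Python sketch of the general idea. It is not Sierra's actual code (the package is written in R): peaks are simplified to `(start, end)` intervals, and the names `merge_once`, `merge_until_stable` and `max_iter` are hypothetical.

```python
from typing import List, Tuple

Peak = Tuple[int, int]  # simplified peak: (start, end) on a single strand


def merge_once(peaks: List[Peak]) -> List[Peak]:
    """One pass: sort the peaks and merge any that overlap."""
    merged: List[Peak] = []
    for start, end in sorted(peaks):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous peak, so extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def merge_until_stable(peaks: List[Peak], max_iter: int = 10):
    """Repeat merge passes until nothing changes or max_iter is reached.

    Returns (peaks, iterations_used, converged).
    """
    current = list(peaks)
    for iteration in range(1, max_iter + 1):
        updated = merge_once(current)
        if updated == current:
            return current, iteration, True
        current = updated
    # Cap reached: return what we have instead of looping forever.
    return current, max_iter, False


result, used, converged = merge_until_stable([(1, 5), (4, 8), (10, 12)])
print(result, used, converged)
```

The cap trades completeness for a guaranteed finish: if the merge criterion never settles, the loop still exits and reports that it did not converge, so the caller can decide whether to accept the partial result.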
from sierra.
Thanks @rj-patrick for the fix. I tried the updated version, and the run finished within a few days. I think I am good to go with it.
from sierra.
Thank you for solving the problem!
from sierra.