Comments (3)
RM2 is described here: https://www.pnas.org/doi/10.1073/pnas.1921046117. Fig. 1 shows the workflow. Currently, the whole RM2 workflow is executed, and SINE/LINE elements are harvested from the final output of RM2. If a particular module could be separated out, or RM2 could be further accelerated, that would be great!
Shujun
from edta.
Hi Artem,
Unfortunately, this is the case. The LINE search is carried out by RepeatModeler, which is slow even on small genomes. Because RepeatModeler's search is based on copy number and multiple alignments, splitting the genome into small subsets may lose families that are already low copy. You can run EDTA on an SSD, which will significantly speed up your RepeatModeler/RepeatMasker runs because they are I/O intensive.
Shujun
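To illustrate the point about splitting: below is a toy sketch, not RepeatModeler's actual algorithm. It assumes a family is only recoverable when at least `MIN_COPIES` copies land in the same subset, and that copies spread evenly across subsets. `MIN_COPIES`, `even_split`, and `detectable` are hypothetical names made up for this example.

```python
# Toy model of why splitting a genome can hide low-copy families.
# Assumptions (not taken from RepeatModeler): a fixed copy-number
# threshold and an even spread of copies across subsets.

MIN_COPIES = 3  # hypothetical detection threshold


def even_split(n_copies, n_chunks):
    """Spread n_copies of one family as evenly as possible over n_chunks."""
    base, extra = divmod(n_copies, n_chunks)
    return [base + (1 if i < extra else 0) for i in range(n_chunks)]


def detectable(counts, min_copies=MIN_COPIES):
    """True if at least one chunk still holds min_copies or more copies."""
    return max(counts) >= min_copies


if __name__ == "__main__":
    for n_copies in (4, 40):
        whole = even_split(n_copies, 1)    # unsplit genome
        split = even_split(n_copies, 10)   # genome cut into 10 subsets
        print(n_copies, "copies:",
              "whole genome ->", detectable(whole),
              "| 10 subsets ->", detectable(split))
```

Under these assumptions, a 4-copy family passes the threshold in the whole genome but not in any of the 10 subsets, while a 40-copy family survives the split.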
Thanks, Shujun,
The cluster I ran EDTA on has only SSDs, I think :) I see the problem now: we need to parallelize RM, but that would require communication between all the parallel jobs. Could you please tell me which specific part of RM is responsible for the LINE search?