Comments (10)
The chromosome IDs in my genome are as follows:
Chr01
Chr10
Chr11
Chr12
Chr13
Chr14
Chr15
Chr16
Chr17
Chr18
Chr19
Chr02
Chr20
Chr03
Chr04
Chr05
Chr06
Chr07
Chr08
Chr09
from ltr_retriever.
Hello Ou, I have two questions and would appreciate your help. The first one:
When I run LTR_retriever, I get many repetitions of the following pair of error messages:
substr outside of string at call_seq_by_list.pl line 128.
Use of uninitialized value $seq in string eq at call_seq_by_list.pl line 129.
When I run call_seq_by_list.pl manually with the following command:
perl call_seq_by_list.pl genome.fasta.retriever.scn.full -C genome.fasta > ltrTE.fa
I get the same error messages as above, and I also found that some entries in ltrTE.fa have an empty sequence line, for example:
Chr02:99911289..99791824|Chr02:99911289..99918781
Chr02:99978658..99791824|Chr02:99978658..99989418
Chr02:99993211..99791824|Chr02:99993211..100003060
Chr02:100005808..99791824|Chr02:100005808..100015472
Chr02:100028374..99791824|Chr02:100028374..100041986
Chr02:100081604..99791824|Chr02:100081604..100092164
Chr02:100103379..99791824|Chr02:100103379..100113164
Chr02:100160631..99791824|Chr02:100160631..100170165
The length of Chr02 in my genome is 99791824. What puzzles me is that the start position is larger than the length of Chr02 (for example, Chr02:99911289..99791824|Chr02:99911289..99918781).
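A quick way to confirm this kind of mismatch is to compare each region against the sequence lengths in the genome FASTA. The following is only a minimal sketch: the function names `fasta_lengths` and `out_of_range` are hypothetical and not part of LTR_retriever, and it assumes regions are written as `ID:start..end`, as in the headers shown above.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Hypothetical helpers for illustration; not part of LTR_retriever.
REGION = re.compile(r"^(\S+):(\d+)\.\.(\d+)$")


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence ID: length}; the ID is the first word of each header."""
    lengths: Dict[str, int] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def out_of_range(regions: Iterable[str],
                 lengths: Dict[str, int]) -> List[Tuple[str, str]]:
    """Return (region, reason) for every region that does not fit its sequence."""
    problems: List[Tuple[str, str]] = []
    for region in regions:
        match = REGION.match(region)
        if match is None:
            problems.append((region, "unparsable"))
            continue
        seq_id = match.group(1)
        start, end = int(match.group(2)), int(match.group(3))
        if seq_id not in lengths:
            problems.append((region, "unknown sequence"))
        elif max(start, end) > lengths[seq_id]:
            problems.append((region, "beyond sequence end"))
    return problems
```

With a Chr02 length of 99791824, a region such as `Chr02:99911289..99918781` is reported as "beyond sequence end", which suggests the coordinates were computed on a different (longer) sequence than the one named in the header.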
The second question: in the directory RM_716953.ThuDec12115012022, many error log files were created, with messages such as: ncResults-1669900505-717007.err BLAST engine error: Empty CBlastQueryVector
My genome has 20 chromosomes, but in the result file genome.fasta.pass.list.gff3 only Chr01 and Chr02 have identified LTRs, while no LTRs were identified on the other 18 chromosomes.
Thanks
Dear wlhCNU
I have the same issue. Did you find a solution?
Hello @wlhCNU @MoradMMokhtar,
You are likely using different versions of genomes. For example, if you generate the candidate list with the v1 genome and use it on the v2 genome, you will run into this issue.
Best,
Shujun
Dear @oushujun
Thank you for your reply. I combined the results from LTR_finder and LTR_harvest and used the same FASTA file for LTR_finder, LTR_harvest and LTR_retriever. Maybe the cause is that the sequence ID is longer than 15 characters, e.g. "JAEFBJ010000014.1".
Hello @oushujun @MoradMMokhtar,
You are right that two different versions of the genome were effectively mixed up. The cause is that I split the genome into single-chromosome sequences and ran ltr_finder on them in parallel to save running time, but when combining the ltr_finder results I did not keep them in the same chromosome order as the input genome. In the LTR_retriever analysis, convert_ltr_finder.pl needs the combined ltr_finder result to be in the correct order; otherwise it leads to the "call_seq_by_list.pl line 129" error messages.
The original ltr_finder result contains clear chromosome information, for example ">Sequence: Chr01 Len:111624253" in my ltr_finder result. It would be better if convert_ltr_finder.pl could extract the chromosome ID directly from the original ltr_finder result.
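As an illustration of the ordering problem, per-chromosome results can be concatenated in the order of the genome FASTA rather than in file-name order. This is only a rough sketch under stated assumptions: the only format detail taken from this thread is the ">Sequence: Chr01 Len:111624253" line, each result is assumed to cover exactly one sequence, and the function names are hypothetical (they are not part of LTR_retriever or ltr_finder).

```python
import re
from typing import Dict, List

# Matches lines such as ">Sequence: Chr01 Len:111624253" (format as quoted above).
SEQ_LINE = re.compile(r"^>Sequence:\s*(\S+)\s+Len:\s*(\d+)")


def sequence_id(result_text: str) -> str:
    """Return the ID from the first '>Sequence: <ID> Len:<n>' line of one result."""
    for line in result_text.splitlines():
        match = SEQ_LINE.match(line)
        if match:
            return match.group(1)
    raise ValueError("no '>Sequence:' line found")


def fasta_order(fasta_text: str) -> List[str]:
    """Return sequence IDs (first word of each header) in file order."""
    ids: List[str] = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.append(parts[0])
    return ids


def combine_in_genome_order(fasta_text: str, results: List[str]) -> str:
    """Concatenate single-sequence results following the genome FASTA order."""
    by_id: Dict[str, str] = {sequence_id(text): text for text in results}
    order = fasta_order(fasta_text)
    unknown = set(by_id) - set(order)
    if unknown:
        raise ValueError(f"results for IDs missing from the genome: {sorted(unknown)}")
    parts = [by_id[seq_id] for seq_id in order if seq_id in by_id]
    # Make sure each part ends with a newline before joining.
    return "".join(p if p.endswith("\n") else p + "\n" for p in parts)
```

Sorting file names alphabetically can give an order like Chr01, Chr10, ..., Chr02, which differs from the genome order Chr01, Chr02, ...; keying on the ID inside each result avoids relying on file names.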
Thanks
Can you please try again?
@oushujun @wlhCNU
Thank you for your reply.
I combined the results from LTR_finder and LTR_harvest and used the same FASTA file for LTR_finder, LTR_harvest and LTR_retriever for Arabidopsis thaliana; it worked well and the results were good. When I did the same for Arabidopsis lyrata (https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/004/255/GCF_000004255.2_v.1.0/GCF_000004255.2_v.1.0_genomic.fna.gz), I got the error messages "call_seq_by_list.pl line 128 & 129". I have attached the combined file and the output from LTR_retriever. If you have time, could you please check what is wrong?
LTR_retriever.zip https://github.com/oushujun/LTR_retriever/files/10148827/LTR_retriever.zip
You may want to use LTR_FINDER_parallel and LTRharvest_parallel to accelerate your run to get input files.
Shujun
Thanks for your suggestion. I will use LTR_FINDER_parallel and LTRharvest_parallel.
There's some good digging here. Yes, the convert_ltr_finder.pl script relies on the order of the input sequences, so it does not work well for split genomes. A much faster way to accelerate runs on large genomes is to use LTR_FINDER_parallel and LTRharvest_parallel. This old way is therefore no longer supported, and I have decided not to fix it for the sake of time.
Best,
Shujun