Comments (10)

wlhCNU commented on June 12, 2024

The chromosome IDs in my genome are as follows:

Chr01
Chr10
Chr11
Chr12
Chr13
Chr14
Chr15
Chr16
Chr17
Chr18
Chr19
Chr02
Chr20
Chr03
Chr04
Chr05
Chr06
Chr07
Chr08
Chr09

from ltr_retriever.

MoradMMokhtar commented on June 12, 2024

Hello Ou, I have two questions I need your help with. The first one:

When I run LTR_retriever, I get many repeated error messages like the following:

substr outside of string at call_seq_by_list.pl line 128.
Use of uninitialized value $seq in string eq at call_seq_by_list.pl line 129.

When I run call_seq_by_list.pl manually with the following command:

perl call_seq_by_list.pl genome.fasta.retriever.scn.full -C genome.fasta > ltrTE.fa

I get the same error messages as above, and I also found that some entries in ltrTE.fa are empty, for example:

Chr02:99911289..99791824|Chr02:99911289..99918781

Chr02:99978658..99791824|Chr02:99978658..99989418

Chr02:99993211..99791824|Chr02:99993211..100003060

Chr02:100005808..99791824|Chr02:100005808..100015472

Chr02:100028374..99791824|Chr02:100028374..100041986

Chr02:100081604..99791824|Chr02:100081604..100092164

Chr02:100103379..99791824|Chr02:100103379..100113164

Chr02:100160631..99791824|Chr02:100160631..100170165

The length of Chr02 in my genome is 99791824. What puzzles me is that the start position is larger than the length of Chr02 (for example, Chr02:99911289..99791824|Chr02:99911289..99918781).
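One way to check for this kind of mismatch is to compare every candidate coordinate with the actual sequence lengths in the FASTA file. The following is a minimal Python sketch, not part of LTR_retriever. The function names `fasta_lengths` and `out_of_range` are hypothetical, and the region format is assumed to be the `ID:start..end` form shown in the examples above.

```python
import re
from typing import Dict, Iterable, List, Optional


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence_id: length} for FASTA text given as an iterable of lines."""
    lengths: Dict[str, int] = {}
    current: Optional[str] = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the sequence ID is the first word of the header line.
            current = line[1:].split()[0]
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


# Assumed region format, e.g. "Chr02:99911289..99918781".
REGION = re.compile(r"^(\S+?):(\d+)\.\.(\d+)$")


def out_of_range(regions: Iterable[str], lengths: Dict[str, int]) -> List[str]:
    """Return regions that are malformed, name an unknown ID, or exceed the sequence."""
    bad = []
    for region in regions:
        match = REGION.match(region.strip())
        if not match:
            bad.append(region)
            continue
        seq_id = match.group(1)
        start, end = int(match.group(2)), int(match.group(3))
        length = lengths.get(seq_id)
        if length is None or min(start, end) < 1 or max(start, end) > length:
            bad.append(region)
    return bad


if __name__ == "__main__":
    # Toy data: a 14 bp sequence and two candidate regions.
    demo_lengths = fasta_lengths([">Chr02 demo", "ACGTACGTAC", "ACGT"])
    print(out_of_range(["Chr02:3..10", "Chr02:12..20"], demo_lengths))
```

If any region is reported, the candidate list and the genome file probably do not describe the same sequences, or the coordinates were assigned to the wrong sequence.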

The second question: in the directory RM_716953.ThuDec12115012022, many error log files were created, for example ncResults-1669900505-717007.err, with the message: BLAST engine error: Empty CBlastQueryVector

My genome has 20 chromosomes, but in the result file genome.fasta.pass.list.gff3 only Chr01 and Chr02 have identified LTR entries; no LTRs were identified for the other 18 chromosomes.
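To see at a glance which chromosomes have entries in a GFF3 file, the features can be counted per sequence ID (column 1 of each tab-separated record). This is a small Python sketch for illustration only; the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def features_per_sequence(gff_lines: Iterable[str]) -> Counter:
    """Count GFF3 feature lines per sequence ID (first tab-separated column)."""
    counts: Counter = Counter()
    for line in gff_lines:
        # Skip blank lines and comment/directive lines, which start with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        counts[line.split("\t", 1)[0]] += 1
    return counts


def sequences_without_features(expected_ids: Iterable[str], counts: Counter) -> List[str]:
    """Return the expected sequence IDs that have no feature in the GFF3 file."""
    return [seq_id for seq_id in expected_ids if counts[seq_id] == 0]


if __name__ == "__main__":
    demo = [
        "##gff-version 3",
        "Chr01\tdemo\trepeat_region\t1\t10\t.\t+\t.\tID=a",
        "Chr01\tdemo\trepeat_region\t20\t30\t.\t+\t.\tID=b",
        "Chr02\tdemo\trepeat_region\t5\t15\t.\t-\t.\tID=c",
    ]
    counts = features_per_sequence(demo)
    print(sequences_without_features(["Chr01", "Chr02", "Chr03"], counts))
```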

Thanks

Dear wlhCNU,
I have the same issue. Did you find a solution?

oushujun commented on June 12, 2024

Hello @wlhCNU @MoradMMokhtar,

You are likely using different versions of genomes. For example, if you generate the candidate list with the v1 genome and use it on the v2 genome, you will run into this issue.

Best,
Shujun

MoradMMokhtar commented on June 12, 2024

Dear @oushujun,
Thank you for your reply. I combined the results from LTR_finder and LTR_harvest and used the same FASTA file for LTR_finder, LTR_harvest and LTR_retriever. Maybe the problem is that some sequence IDs are longer than 15 characters, for example "JAEFBJ010000014.1".
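To test the ID-length hypothesis, the FASTA headers can be scanned for IDs longer than a chosen limit. This is only an illustrative Python sketch: the function name `long_ids` is hypothetical, and the 15-character limit is taken from the comment above rather than from any confirmed LTR_retriever requirement.

```python
from typing import Iterable, List


def long_ids(fasta_lines: Iterable[str], max_len: int = 15) -> List[str]:
    """Return sequence IDs (first word of each FASTA header) longer than max_len."""
    ids = []
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            if len(seq_id) > max_len:
                ids.append(seq_id)
    return ids


if __name__ == "__main__":
    demo = [">JAEFBJ010000014.1 some description", "ACGT", ">Chr01", "ACGT"]
    print(long_ids(demo))
```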

oushujun commented on June 12, 2024

wlhCNU commented on June 12, 2024

Hello @oushujun @MoradMMokhtar,
You are right that the inputs effectively came from two different versions of the genome. The cause was that I split the genome into single-chromosome sequences and ran ltr_finder on them in parallel to save running time, but when I combined the ltr_finder results I did not keep them in the same chromosome order as the input genome. In the LTR_retriever workflow, convert_ltr_finder.pl needs the combined ltr_finder result to be in the correct order; otherwise it results in the "call_seq_by_list.pl line 129" error messages.
The original ltr_finder output contains clear chromosome information, for example ">Sequence: Chr01 Len:111624253" in my result. It would be better if convert_ltr_finder.pl could extract the chromosome ID directly from the original ltr_finder output.
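For anyone who has already run ltr_finder on split chromosomes, the per-chromosome outputs can be concatenated in the genome's sequence order by reading the ">Sequence:" header of each output. The Python sketch below is illustrative only: the function names are hypothetical, the header pattern is based solely on the example line quoted above, and whether the reordered file is sufficient for convert_ltr_finder.pl is an assumption that has not been verified.

```python
import re
from typing import Dict, Iterable, List

# Assumed header format, taken from the example ">Sequence: Chr01 Len:111624253".
SEQ_HEADER = re.compile(r"^>Sequence:\s*(\S+)\s+Len:\s*(\d+)")


def fasta_order(fasta_lines: Iterable[str]) -> List[str]:
    """Return the sequence IDs in the order they appear in the genome FASTA."""
    order = []
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                order.append(fields[0])
    return order


def sequence_id(result_text: str) -> str:
    """Return the sequence ID named by the first '>Sequence:' line of one output."""
    for line in result_text.splitlines():
        match = SEQ_HEADER.match(line)
        if match:
            return match.group(1)
    raise ValueError("no '>Sequence:' header found")


def combine_in_genome_order(results: Iterable[str], order: List[str]) -> str:
    """Concatenate per-chromosome outputs following the genome's sequence order."""
    by_id: Dict[str, str] = {sequence_id(text): text for text in results}
    parts = []
    for seq_id in order:
        if seq_id in by_id:  # chromosomes without an output are skipped
            text = by_id[seq_id]
            parts.append(text if text.endswith("\n") else text + "\n")
    return "".join(parts)


if __name__ == "__main__":
    order = fasta_order([">Chr01", "ACGT", ">Chr02", "ACGT", ">Chr10", "AC"])
    results = [
        ">Sequence: Chr10 Len:2\nc\n",
        ">Sequence: Chr01 Len:4\na\n",
        ">Sequence: Chr02 Len:4\nb\n",
    ]
    print(combine_in_genome_order(results, order), end="")
```

Keying on the header line rather than on file names avoids depending on how the shell or file system sorts names such as Chr02 and Chr10.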

Thanks

MoradMMokhtar commented on June 12, 2024

Can you please try again?

@oushujun @wlhCNU
Thank you for your reply.
I combined the results from LTR_finder and LTR_harvest and used the same FASTA file for LTR_finder, LTR_harvest and LTR_retriever for Arabidopsis thaliana, and it worked well and the results were good. When I did the same for Arabidopsis lyrata (https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/004/255/GCF_000004255.2_v.1.0/GCF_000004255.2_v.1.0_genomic.fna.gz), I got the error messages "call_seq_by_list.pl line 128 & 129". I have attached the combined file and the output from LTR_retriever. If you have time, could you please check what is wrong?

LTR_retriever.zip

oushujun commented on June 12, 2024

MoradMMokhtar commented on June 12, 2024

You may want to use LTR_FINDER_parallel and LTRharvest_parallel to accelerate your run to get input files.

Shujun

Thanks for your suggestion. I will use LTR_FINDER_parallel and LTRharvest_parallel.

oushujun commented on June 12, 2024

There's some good digging here. Yes, the convert_ltr_finder.pl script relies on the order of the input sequences, so it does not work well for split genomes. A much faster way to process large genomes is to use LTR_FINDER_parallel and LTRharvest_parallel. The old approach is therefore no longer supported, and I have decided not to fix it for the sake of time.

Best,
Shujun
