
Comments (4)

rsharris commented on June 11, 2024

In lastz, if the --segments option is used, and the input sequences are NOT accessible in random order, the items in the segments file must be consistent with the order in which they 'will be needed'. For most users, "accessible in random order" means YES for twoBit, NO for fasta.

For fasta this should just mean segment names are in the same order as in the fasta file, and all + segments for a given sequence appear before all - segments.
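As a rough illustration of the ordering rule above, here is a minimal Python sketch that checks a list of segments against the fasta order. It is not taken from lastz or SegAlign; the function name `first_order_violation` and the `(sequence_name, strand)` tuple layout are assumptions made for the example, and parsing of the actual segments file is left out.

```python
def first_order_violation(fasta_names, segments):
    """Return the index of the first out-of-order segment, or None.

    fasta_names: sequence names in the order they appear in the fasta file.
    segments:    iterable of (sequence_name, strand) pairs in file order,
                 where strand is "+" or "-" (hypothetical layout).
    """
    # Position of each sequence in the fasta file.
    rank = {name: i for i, name in enumerate(fasta_names)}
    prev_key = (-1, 0)
    for idx, (name, strand) in enumerate(segments):
        if name not in rank:
            # A segment that names a sequence absent from the fasta file.
            return idx
        # Order by fasta position first, then "+" before "-".
        key = (rank[name], 0 if strand == "+" else 1)
        if key < prev_key:
            return idx
        prev_key = key
    return None


if __name__ == "__main__":
    names = ["chr1", "chr2"]
    good = [("chr1", "+"), ("chr1", "-"), ("chr2", "+")]
    bad = [("chr1", "-"), ("chr1", "+")]
    print(first_order_violation(names, good))
    print(first_order_violation(names, bad))
```

A check like this could be run on a segments file before handing it to lastz with a fasta input, to find the first line that breaks the expected order.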

There's some detail here (starting with the paragraph that begins with "Query sequence names must appear in the same order as ..."):
http://www.bx.psu.edu/~rsharris/lastz/README.lastz-1.04.03.html#fmt_segments

That implementation choice seemed more important 15 years ago than it does now. Now it would make more sense to read all the segments into memory, sort them by sequence name (to facilitate binary search), and use the appropriate segments as each new fasta sequence arrives.
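The in-memory alternative could look roughly like the following Python sketch. This is only an illustration of the idea, not lastz code; the helper names `index_segments` and `segments_for` and the tuple layout (sequence name first) are assumptions.

```python
from bisect import bisect_left, bisect_right


def index_segments(segments):
    """Sort segments by sequence name and return (names, ordered).

    segments: list of tuples whose first field is the sequence name
              (hypothetical layout; the remaining fields are carried along).
    sorted() is stable, so segments sharing a name keep their file order.
    """
    ordered = sorted(segments, key=lambda seg: seg[0])
    names = [seg[0] for seg in ordered]
    return names, ordered


def segments_for(names, ordered, seq_name):
    """Binary-search the sorted names for all segments of one sequence."""
    lo = bisect_left(names, seq_name)
    hi = bisect_right(names, seq_name)
    return ordered[lo:hi]


if __name__ == "__main__":
    segs = [("chr2", 10, 20, "+"), ("chr1", 5, 9, "-"), ("chr1", 1, 4, "+")]
    names, ordered = index_segments(segs)
    # Look up segments as each fasta sequence "arrives", in any order.
    for seq in ["chr1", "chr2", "chr3"]:
        print(seq, segments_for(names, ordered, seq))
```

With an index like this, the order of lines in the segments file no longer has to match the order of sequences in the fasta file, at the cost of holding all segments in memory.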

from segalign.

gsneha26 commented on June 11, 2024

@glennhickey the error is definitely in one of the lastz jobs that Bob has pointed out. Ideally, SegAlign divides and sorts the segments in a way that takes care of this requirement. I can look into it. Could you share the input files?


glennhickey commented on June 11, 2024

I can confirm that this error doesn't happen with this older commit ComparativeGenomicsToolkit@1d2d38c

To reproduce:

  • install cactus binaries
  • make a config.xml as described here (I've been meaning to add a command line but haven't gotten around to it yet)
  • copy and unzip the input files from /public/home/hickey/dev/work/gpu-lastz on courtyard
  • make sure segalign is in your PATH
  • run cactus-blast ./jobstore 10mammalsplus.txt Anc05.cigar --root Anc05 --pathOverrides panTro6.fa.pp equCab3.fa.pp hg38.fa.pp canFam3.fa.pp felCat8.fa.pp --pathOverrideNames Chimp Horse Human Dog Cat --realTimeLogging --logInfo --retryCount 0 --maxCores 64 --cleanWorkDir never --configFile ./config.xml


gsneha26 commented on June 11, 2024

@glennhickey I have fixed the error in the latest commit. Thanks!

