
at-cg / panaligner


Long read aligner for cyclic and acyclic pangenome graphs

License: Other

Makefile 0.45% C 70.10% Perl 0.45% Gnuplot 0.73% Shell 2.94% C++ 15.17% Roff 1.76% JavaScript 8.41%
pangenome read-alignment variation-graphs

panaligner's Issues

Understanding chaining + extension: why don't valid paths in a GFA have end-to-end alignments?

Hi,

Thanks for this new addition to the graph alignment ecosystem, it is very much appreciated.

I am having issues with a simple test: I take a three-node GFA, concatenate the node sequences along a path, and align the resulting sequence back to the GFA to see whether the path is recovered.

Here is the GFA:

S	1	tcatggtacttgtcaccttctgtcctatttgagtcatgatgcatttgtctatcatctatctccctgcaatgaacatccctggggtggggccagggtctgttttgcttggtgctgcattcccattgccaagaagacctcacaggcatatcaagcacacagcaggtgcataacacttgctgagtttgctgaatgaatCCCTCCTCTGTATCCTACACATCCAACTGGTCACCACGTCCTGTGCGGTAGATGTCCTTTTTAGCAGGTCCCCTGCACTTCACTGTGTTTGTTAAGCCTTGGCTTGCTCTGtcatttcattttgcttaggTGGCGGCAGCAGCCCCTGACAGCCCCTGGCCTCACAGCCCCTGGCCTCAACCCATATCCTCTCCACTCTGCACACAGCTGCTGGGAGACTTTTCCTTAAAACTtggtttctctcctttcccttctgaAAACTGTCAGTGGCTTCCTGTGGCTCAGAGGAATCAATCGAAACATGGTGCTGCTCACAAGGAAACACAACGTTTCTAGGGCCAGTGCTGTCTTTGTCCTGCCTTCCTCCAAATTCAGCCACGTTAAGCCTCAAGCAGATCCTCCCACACAGCTGGGCTTCTTCTGCCTTTCTGCCAGGACTTGCTGCTCCCAGGAGCCTCCTCCCCCAGTCCCTGCAGCCCCAGGTCAGGTGTCTCCCCACCCAGCCTGGCTCAGCTGTCCTTTCCCGTGCAACTACAGGTTCATACAATTTGCTATTGCTCCATCTACTCTATGTGACTCTTGGTTTCTTGAAGGCAGGTATGggactctttctttccttctttgctttcttttctctttctctttctttctttttctttctttctttctttctttc
S	2	tttctttctttctttctttctttctttctttctttctt
S	3	tctctctctctctctctcttttcttttctttctttctctctttctgtttttagagactgggtcttgctgtcacccaggttggagtgcagtggtgcaatcatggctcaccgaactcctgggctcaagtgagcctcttgcctcagcctcccaattagttgggactacaggcatgtgccactacacctggctaattgttattattattattattattattattatttttgtaaagacagggtcttgctctgtttcctaggctggtcttgaacccctggcctcaaatgatcctcctgcctcagcctcccaaagtgctgggattccaggagtaagataccatgtctggccTGAGAAATTTTCTAAAAGGCATGTTTTTGCACTGtccaatatgtttttctttttatcccagcctctagcacaagtgcctgactcgaacgaagcagggactctgaaatatcttgaatgaacaagtAAATGGCACTTGAATAGGTGGCCTCTAATGtgcagaggaaggaaaagggagctgacgctttccgagtgttcagtaccttctaggcaccatgctgggccttttatgtagttctcgatgaatcctcatggcatcccGCTTTGCAGAATGATTCTAGTTACCCAGAGAGCAAGTAGCCCTTAGGGTCAGGACATGCATCTGAGTTGGTCTAGTCCCAAAGCTCCTGGTCCTCTCCTGACATCActtaaacagagaaacagaactccCCTTGGGCTTCTAGGGGCGCTGGGTTCAGGAGGCACAGCCACTCCCTTTGTTCTTCCTGGCAGCTGCCCCACCAGCAGTGAGCCCATCCCacctctgggttttcttttt
L	1	+	2	+	0M
L	1	+	3	+	0M
L	2	+	3	+	0M
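For anyone reproducing the test, here is a minimal sketch of how the query sequences could be built from a GFA like the one above. It assumes forward-strand paths and 0M overlaps only; the function names `parse_gfa` and `path_sequence` and the toy sequences are illustrative and are not part of PanAligner.

```python
def parse_gfa(text):
    """Return (segments, links) from GFA text; only S and L records are read."""
    segments = {}
    links = set()
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S":
            segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            # Store (from, from_orient, to, to_orient); the overlap is assumed to be 0M.
            links.add((fields[1], fields[2], fields[3], fields[4]))
    return segments, links


def path_sequence(segments, links, path):
    """Concatenate forward-strand node sequences along `path`, checking each link exists."""
    for a, b in zip(path, path[1:]):
        if (a, "+", b, "+") not in links:
            raise ValueError(f"no link {a}+ -> {b}+ in graph")
    return "".join(segments[name] for name in path)


# Toy graph with the same topology as the GFA above (sequences shortened).
toy_gfa = (
    "S\t1\tACGT\n"
    "S\t2\tTT\n"
    "S\t3\tGGA\n"
    "L\t1\t+\t2\t+\t0M\n"
    "L\t1\t+\t3\t+\t0M\n"
    "L\t2\t+\t3\t+\t0M\n"
)
segments, links = parse_gfa(toy_gfa)
for path in (["1", "2", "3"], ["1", "3"]):
    # Emit FASTA records named after the path, e.g. 1_2_3 and 1_3.
    print(">" + "_".join(path))
    print(path_sequence(segments, links, path))
```

Both paths are valid walks in the graph, so in principle each concatenated sequence has an exact end-to-end match.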

When I concatenate 1,2,3 or 1,3 and align them to the GFA, I never get an end-to-end alignment, and node 2 is always missing:

1       871     1       871     +       >1      871     1       871     870     870     60      tp:A:P  NM:i:0  cm:i:51 s1:i:714        s2:i:0  dv:f:0.0756     cg:Z:870=
3       841     5       836     +       >3      841     5       836     831     831     60      tp:A:P  NM:i:0  cm:i:48 s1:i:672        s2:i:0  dv:f:0.0790     cg:Z:831=
1_2_3   1750    1       857     +       >1      871     1       857     856     856     60      tp:A:P  NM:i:0  cm:i:50 s1:i:714        s2:i:0  dv:f:0.0751     cg:Z:856=
1_2_3   1750    930     1745    +       >3      841     21      836     815     815     60      tp:A:P  NM:i:0  cm:i:47 s1:i:671        s2:i:0  dv:f:0.0795     cg:Z:815=
1_3     1712    1       857     +       >1      871     1       857     856     856     60      tp:A:P  NM:i:0  cm:i:50 s1:i:700        s2:i:0  dv:f:0.0751     cg:Z:856=
1_3     1712    892     1707    +       >3      841     21      836     815     815     60      tp:A:P  NM:i:0  cm:i:47 s1:i:672        s2:i:0  dv:f:0.0795     cg:Z:815=
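To make the missing region explicit, the uncovered query intervals can be computed from the GAF records. This is a rough sketch that assumes PAF-style columns (query name, length, start, end in the first four fields) and half-open coordinates; `parse_gaf` and `uncovered` are hypothetical helper names, and the two records are the `1_2_3` lines from the output above with the tags dropped.

```python
def parse_gaf(lines):
    """Group (qstart, qend) by query name and record each query's length."""
    lengths, hits = {}, {}
    for line in lines:
        fields = line.split()
        if len(fields) < 12:
            continue  # skip blank or truncated lines
        name, qlen = fields[0], int(fields[1])
        qstart, qend = int(fields[2]), int(fields[3])
        lengths[name] = qlen
        hits.setdefault(name, []).append((qstart, qend))
    return lengths, hits


def uncovered(query_len, intervals):
    """Return the query intervals [start, end) not covered by any alignment."""
    gaps, pos = [], 0
    for start, end in sorted(intervals):
        if start > pos:
            gaps.append((pos, start))
        pos = max(pos, end)
    if pos < query_len:
        gaps.append((pos, query_len))
    return gaps


gaf = [
    "1_2_3 1750 1 857 + >1 871 1 857 856 856 60",
    "1_2_3 1750 930 1745 + >3 841 21 836 815 815 60",
]
lengths, hits = parse_gaf(gaf)
gaps = {name: uncovered(lengths[name], hits[name]) for name in hits}
print(gaps)  # {'1_2_3': [(0, 1), (857, 930), (1745, 1750)]}
```

The middle gap spans 73 query bases, which is more than the 38 bases of node 2 (1750 - 871 - 841): it also includes the last 14 bases of node 1 and the first 21 bases of node 3.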

Here is a visualization showing that neither the graph nor the linear query is fully covered:
image

I have tried multiple different combinations of parameters to attempt to find seeds/minimizers in node 2:

Defaults:

PanAligner -c -x lr -o /tmp/tmpn2oq6iyx/reads_vs_graph.gaf bad_tandem_graph.gfa bad_tandem_nodes.fasta
/tmp/tmpn2oq6iyx/reads_vs_graph.gaf

Permissive:

PanAligner -c -k 14 -n 1,1 -m 1,1 -x lr -o /tmp/tmpyid87tic/reads_vs_graph.gaf bad_tandem_graph.gfa bad_tandem_nodes.fasta
/tmp/tmpyid87tic/reads_vs_graph.gaf

I've also tried increasing the -g and -r parameters to allow the extension step to find more alignments, but that does not seem to have any effect. Extreme values of -f (0 or 1) also have no effect.
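One thing that may be worth checking when looking for seeds in node 2 is how many distinct k-mers it contains at all. The sketch below counts them for k = 14, assuming `-k` sets the k-mer length as it does in minigraph. It is a plain k-mer count, not PanAligner's minimizer code, and it is not a diagnosis of the behavior above.

```python
def kmers(seq, k):
    """Return every k-mer of `seq` in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Segment 2 from the GFA: a 38-base tandem repeat with unit "tttc".
node2 = "tttc" * 9 + "tt"
k = 14  # value passed via -k in the permissive run

all_kmers = kmers(node2, k)
distinct = set(all_kmers)
print(len(node2), len(all_kmers), len(distinct))  # 38 25 4
```

Because the node is a pure period-4 repeat, its 25 14-mers collapse to 4 distinct sequences, so any seed drawn from it can match at many offsets within the repeat.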

Could you help me with this case? I am having difficulty understanding why the base-level alignment step doesn't bridge the chains here. Ideally, I would rely on base-level alignment for most of the work and use minimizers only to anchor the flanks of the tandem repeat.

Thanks
