Smith-Waterman aligner bug (gkl-rs issue, closed)

philipc commented on September 26, 2024
Smith-waterman aligner bug

from gkl-rs.

Comments (6)

philipc commented on September 26, 2024

Which OverhangStrategy is this using? I know that OverhangStrategy::Ignore can return a longer alignment (I think this is the relevant code). Also, related to that, the return offset in Align should be isize, not usize, but I think you're already handling that.

If you can get a test case I'm happy to look into it.

rhysnewell commented on September 26, 2024

Okay, I don't think this is a problem with the actual algorithm implementation at all. I think this might be a C memory issue. The problem is only reproducible when you are performing multiple alignments in parallel. See https://github.com/rhysnewell/Lorikeet/blob/master/tests/smith_waterman_aligner_unit_tests.rs#L406

This test incorporates several test alignments using the basic Smith-Waterman aligner and the gkl implementation across a number of alignment parameters and overhang strategies. I noticed that when running these alignments without using rayon, the tests never failed. The results would always be the same between the two alignment methods. As soon as I wrapped the whole thing in a rayon parallel iterator, the tests started failing very consistently, with the gkl aligner giving completely different alignments than the standard implementation:

cargo test --release --test smith_waterman_aligner_unit_tests

test test_avx_mode ... FAILED

failures:

---- test_avx_mode stdout ----
thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `Some([Match(18), Ins(879), Match(398)])`,
 right: `Some([Del(6), Ins(427), Match(9), Ins(89), Match(4), Ins(64), Match(3), Ins(15), Match(5), Ins(283), Match(396)])`: Alignments are not equal:
 Some([Match(18), Ins(879), Match(398)])
 Some([Del(6), Ins(427), Match(9), Ins(89), Match(4), Ins(64), Match(3), Ins(15), Match(5), Ins(283), Match(396)])', tests/smith_waterman_aligner_unit_tests.rs:446:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    test_avx_mode

Additionally, sometimes the test would just hang indefinitely, suggesting that there is a sporadic deadlock somewhere too.

I had no idea that using rayon would cause an issue like this, but I suppose it makes sense? I'm not sure how you would hunt down a fix for this. Hopefully the stable Rust implementation will fix this bug. Pretty interesting interaction though; maybe play around with the test and see if you see the same thing. It definitely fails very consistently for me when run in parallel, but is safe single-threaded.
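For anyone who wants to reproduce this kind of check without pulling in the full test suite, here is a minimal sketch of the idea: run the same alignments once serially and once on separate threads, then report any index where the results differ. Everything in it is an assumption for illustration. `sw_score` is a toy Smith-Waterman scorer with a linear gap penalty, `parallel_mismatches` is a hypothetical helper, and `std::thread::scope` stands in for the rayon parallel iterator used in the real test. Neither function is part of gkl-rs or Lorikeet.

```rust
use std::thread;

/// Toy Smith-Waterman local alignment score with a linear gap penalty.
/// Hypothetical stand-in for a real aligner; not the gkl-rs API.
fn sw_score(a: &[u8], b: &[u8], match_s: i32, mismatch: i32, gap: i32) -> i32 {
    let mut prev = vec![0i32; b.len() + 1];
    let mut best = 0;
    for &x in a {
        let mut cur = vec![0i32; b.len() + 1];
        for (j, &y) in b.iter().enumerate() {
            let diag = prev[j] + if x == y { match_s } else { mismatch };
            let up = prev[j + 1] + gap;
            let left = cur[j] + gap;
            // Local alignment: a cell never drops below zero.
            let score = diag.max(up).max(left).max(0);
            cur[j + 1] = score;
            best = best.max(score);
        }
        prev = cur;
    }
    best
}

/// Runs `aligner` on every pair serially, then again with one thread per pair,
/// and returns the indices where the two runs disagree.
fn parallel_mismatches<F>(pairs: &[(&[u8], &[u8])], aligner: F) -> Vec<usize>
where
    F: Fn(&[u8], &[u8]) -> i32 + Sync,
{
    let f = &aligner;
    let serial: Vec<i32> = pairs.iter().map(|&(a, b)| f(a, b)).collect();
    let parallel: Vec<i32> = thread::scope(|s| {
        let handles: Vec<_> = pairs
            .iter()
            .map(|&(a, b)| s.spawn(move || f(a, b)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .collect::<Vec<i32>>()
    });
    serial
        .iter()
        .zip(&parallel)
        .enumerate()
        .filter(|(_, (s, p))| s != p)
        .map(|(i, _)| i)
        .collect()
}
```

A thread-safe aligner should always give an empty list here. An aligner with hidden shared state may give a non-empty list on some runs and not others, which matches the intermittent behaviour described above.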

philipc commented on September 26, 2024

That might be the fault of this global variable.
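The link to the variable in question is not preserved here, so the following is only a generic sketch of how a process-wide global can make results depend on thread interleaving. `GAP_PENALTY`, `score_with_global`, and `score_with_local` are hypothetical names invented for this illustration and do not correspond to anything in gkl-rs.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

// Hypothetical process-wide setting, standing in for any global
// that an aligner writes before a call and reads during it.
static GAP_PENALTY: AtomicI32 = AtomicI32::new(0);

/// Fragile pattern: store the parameter in a global, then read it back.
/// If another thread stores a different value between the two steps,
/// this call silently computes with the wrong penalty.
fn score_with_global(matches: i32, gaps: i32, gap_penalty: i32) -> i32 {
    GAP_PENALTY.store(gap_penalty, Ordering::SeqCst);
    thread::yield_now(); // widens the window in which another thread can interfere
    matches * 2 + gaps * GAP_PENALTY.load(Ordering::SeqCst)
}

/// Robust pattern: the parameter stays local to the call,
/// so concurrent callers cannot affect each other.
fn score_with_local(matches: i32, gaps: i32, gap_penalty: i32) -> i32 {
    matches * 2 + gaps * gap_penalty
}
```

Called from a single thread, both functions agree. Called from several threads with different penalties, only `score_with_local` is guaranteed to return the value each caller asked for, which fits the pattern of tests passing serially and failing under rayon.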

philipc commented on September 26, 2024

#13 should fix it. I can do a patch release of that if you want.

rhysnewell commented on September 26, 2024

That would be great, thank you!

philipc commented on September 26, 2024

Published 0.1.1
