philipc / gkl-rs
Rust bindings for Intel Genomics Kernel Library (GKL)
License: Apache License 2.0
Hi Philip,
I've been trying to compile Lorikeet + gkl-rs with x86_64-unknown-linux-musl as the target in order to produce a statically linked binary that can easily be distributed in releases. However, I'm having some difficulties with the custom build script that gkl-rs uses.
These are the errors I get when I've attempted it:
The following warnings were emitted during compilation:
warning: In file included from gkl/pairhmm/avx-pairhmm.h:26,
warning: from gkl/pairhmm/avx_impl.cc:25:
warning: gkl/pairhmm/Context.h:27:10: fatal error: cmath: No such file or directory
warning: #include <cmath> // std::isinf
warning: ^~~~~~~
warning: compilation terminated.
error: failed to run custom build command for `gkl v0.1.0 (https://github.com/philipc/gkl-rs#11a88b99)`
Caused by:
process didn't exit successfully: `/github/workspace/target/release/build/gkl-3421105fe0419bef/build-script-build` (exit status: 1)
--- stdout
cargo:rerun-if-changed=gkl
TARGET = Some("x86_64-unknown-linux-musl")
OPT_LEVEL = Some("3")
HOST = Some("x86_64-unknown-linux-gnu")
CC_x86_64-unknown-linux-musl = None
CC_x86_64_unknown_linux_musl = None
TARGET_CC = None
CC = None
CROSS_COMPILE = None
CFLAGS_x86_64-unknown-linux-musl = None
CFLAGS_x86_64_unknown_linux_musl = None
TARGET_CFLAGS = None
CFLAGS = None
CRATE_CC_NO_DEFAULTS = None
DEBUG = Some("false")
CARGO_CFG_TARGET_FEATURE = Some("fxsr,sse,sse2")
running: "musl-gcc" "-O3" "-ffunction-sections" "-fdata-sections" "-fPIC" "-m64" "-o" "/github/workspace/target/x86_64-unknown-linux-musl/release/build/gkl-9101aadbf0ecc6ed/out/gkl/pairhmm/pairhmm_common.o" "-c" "gkl/pairhmm/pairhmm_common.cc"
exit status: 0
running: "musl-gcc" "-O3" "-ffunction-sections" "-fdata-sections" "-fPIC" "-m64" "-o" "/github/workspace/target/x86_64-unknown-linux-musl/release/build/gkl-9101aadbf0ecc6ed/out/gkl/smithwaterman/smithwaterman_common.o" "-c" "gkl/smithwaterman/smithwaterman_common.cc"
exit status: 0
AR_x86_64-unknown-linux-musl = None
AR_x86_64_unknown_linux_musl = None
TARGET_AR = None
AR = None
running: "ar" "cq" "/github/workspace/target/x86_64-unknown-linux-musl/release/build/gkl-9101aadbf0ecc6ed/out/libgkl-common.a" "/github/workspace/target/x86_64-unknown-linux-musl/release/build/gkl-9101aadbf0ecc6ed/out/gkl/pairhmm/pairhmm_common.o" "/github/workspace/target/x86_64-unknown-linux-musl/release/build/gkl-9101aadbf0ecc6ed/out/gkl/smithwaterman/smithwaterman_common.o"
exit status: 0
running: "ar" "s" "/github/workspace/target/x86_64-unknown-linux-musl/release/build/gkl-9101aadbf0ecc6ed/out/libgkl-common.a"
exit status: 0
cargo:rustc-link-lib=static=gkl-common
cargo:rustc-link-search=native=/github/workspace/target/x86_64-unknown-linux-musl/release/build/gkl-9101aadbf0ecc6ed/out
TARGET = Some("x86_64-unknown-linux-musl")
OPT_LEVEL = Some("3")
HOST = Some("x86_64-unknown-linux-gnu")
CC_x86_64-unknown-linux-musl = None
CC_x86_64_unknown_linux_musl = None
TARGET_CC = None
CC = None
CROSS_COMPILE = None
CFLAGS_x86_64-unknown-linux-musl = None
CFLAGS_x86_64_unknown_linux_musl = None
TARGET_CFLAGS = None
CFLAGS = None
CRATE_CC_NO_DEFAULTS = None
DEBUG = Some("false")
CARGO_CFG_TARGET_FEATURE = Some("fxsr,sse,sse2")
running: "musl-gcc" "-O3" "-ffunction-sections" "-fdata-sections" "-fPIC" "-m64" "-mavx" "-o" "/github/workspace/target/x86_64-unknown-linux-musl/release/build/gkl-9101aadbf0ecc6ed/out/gkl/pairhmm/avx_impl.o" "-c" "gkl/pairhmm/avx_impl.cc"
cargo:warning=In file included from gkl/pairhmm/avx-pairhmm.h:26,
cargo:warning= from gkl/pairhmm/avx_impl.cc:25:
cargo:warning=gkl/pairhmm/Context.h:27:10: fatal error: cmath: No such file or directory
cargo:warning= #include <cmath> // std::isinf
cargo:warning= ^~~~~~~
cargo:warning=compilation terminated.
exit status: 1
--- stderr
error occurred: Command "musl-gcc" "-O3" "-ffunction-sections" "-fdata-sections" "-fPIC" "-m64" "-mavx" "-o" "/github/workspace/target/x86_64-unknown-linux-musl/release/build/gkl-9101aadbf0ecc6ed/out/gkl/pairhmm/avx_impl.o" "-c" "gkl/pairhmm/avx_impl.cc" with args "musl-gcc" did not execute successfully (status code exit status: 1).
warning: build failed, waiting for other jobs to finish...
error: build failed
I thought this might be something to do with using musl-gcc rather than musl-g++ to compile, but the Docker container this is running in is meant to use musl-g++ in order to get past this cmath issue. (See: https://github.com/rhysnewell/rust-cargo-musl-action/blob/master/Dockerfile)
Any help would be appreciated! :)
Cheers,
Rhys
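For reference, the variable names printed in the stdout section above (CC_x86_64-unknown-linux-musl, CC_x86_64_unknown_linux_musl, TARGET_CC, CC) follow a fixed pattern, and the log only ever shows CC lookups even though musl-gcc is being run on .cc files. The sketch below reproduces that lookup order with a hypothetical helper, `candidate_vars`. It is not the cc crate's actual code. The HOST_ branch and the idea that a C++ build would consult the same names with CXX in place of CC are assumptions that this log does not confirm.

```rust
/// Hypothetical helper: lists the environment variable names probed for a
/// compiler, in the order seen in the build log above.
/// `kind` is "CC" for C; "CXX" for C++ is an assumption, not shown in the log.
fn candidate_vars(kind: &str, target: &str, host: &str) -> Vec<String> {
    // Assumption: TARGET_ is used when cross-compiling (as in the log),
    // HOST_ when the host and target triples are identical.
    let prefix = if host == target { "HOST" } else { "TARGET" };
    vec![
        // Exact target triple, e.g. CC_x86_64-unknown-linux-musl
        format!("{}_{}", kind, target),
        // Triple with dashes replaced, e.g. CC_x86_64_unknown_linux_musl
        format!("{}_{}", kind, target.replace('-', "_")),
        // TARGET_CC or HOST_CC
        format!("{}_{}", prefix, kind),
        // Plain CC
        kind.to_string(),
    ]
}
```

If the C++ assumption holds, calling `candidate_vars("CXX", "x86_64-unknown-linux-musl", "x86_64-unknown-linux-gnu")` lists the names one could try setting to musl-g++ in the container.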
Hi Phil,
I've been getting this set of errors in Lorikeet when using gkl-rs to perform the Smith-Waterman alignment. It seems that occasionally gkl-rs will produce a CIGAR string suggesting that the read alignment extends past the end of the reference. I don't have specifics or a reproducible test case yet, but I just wanted to flag it with you.
Here is an example error produced by Lorikeet when using gkl-rs:
[2022-02-07T04:10:04Z INFO lorikeet] lorikeet version 0.6.2
[2022-02-07T04:10:04Z INFO lorikeet_genome] Using min-covered-fraction 0%
[2022-02-07T04:10:04Z INFO lorikeet_genome] Using min-read-aligned-percent 0%
[2022-02-07T04:10:04Z INFO lorikeet_genome::utils::utils] Creating cache directory results/lorikeet/cryoconite/20220207/bam_files
[2022-02-07T04:10:04Z INFO lorikeet_genome::utils::utils] Creating cache directory results/lorikeet/cryoconite/20220207/bam_files/short/
[2022-02-07T04:10:04Z INFO lorikeet_genome::utils::utils] Not pre-generating minimap2 index
[2022-02-07T04:10:04Z WARN lorikeet_genome::utils::utils] Not using reference index...
[2022-02-07T04:10:04Z INFO lorikeet_genome::utils::utils] Creating cache directory results/lorikeet/cryoconite/20220207/bam_files/long/
[2022-02-07T04:10:04Z INFO lorikeet_genome::utils::utils] Not pre-generating minimap2 index
[2022-02-07T04:10:04Z WARN lorikeet_genome::utils::utils] Not using reference index...
[2022-02-07T04:10:05Z INFO lorikeet_genome::processing::lorikeet_engine] Processing long reads...
[2022-02-07T04:11:30Z INFO lorikeet_genome::processing::lorikeet_engine] Processing short reads...
thread '<unnamed>' panicked at 'Read goes past end of reference: rstart - 0, necessary length - 295, ref len - 275, cigar - [Ins(66), Match(10), Del(134), Match(7), Ins(29), Match(4), Ins(87), Match(3), Ins(2), Match(3), Ins(201), Match(6), Ins(51), Match(5), Ins(107), Match(5), Ins(6), Match(2), Ins(93), Match(3), Ins(83), Match(3), Ins(44), Match(4), Ins(81), Match(3), Ins(208), Match(47), Ins(15), Match(6), Del(1), Match(2), Ins(16), Match(3), Ins(7), Match(10), Del(1), Match(2), Del(4), Match(3), Del(1), Match(2), Ins(3), Match(5), Ins(2), Match(4), Ins(10), Match(2), Del(1), Match(2), Ins(23), Match(3), Ins(31), Match(4), Ins(25)], indel index - 54', src/reads/alignment_utils.rs:446:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at 'Read goes past end of reference: rstart - 0, necessary length - 286, ref len - 275, cigar - [Match(1), Del(1), Match(7), Ins(6), Match(3), Del(192), Match(2), Ins(48), Match(4), Ins(59), Match(3), Ins(67), Match(3), Ins(86), Match(3), Ins(73), Match(5), Ins(11), Match(3), Ins(212), Match(4), Ins(4), Match(4), Ins(354), Match(6), Ins(36), Match(3), Ins(35), Match(5), Ins(84), Match(2), Del(1), Match(4), Ins(7), Match(4), Ins(6), Del(26)], indel index - 36', src/reads/alignment_utils.rs:446:9
thread '<unnamed>' panicked at 'Read goes past end of reference: rstart - 0, necessary length - 295, ref len - 275, cigar - [Ins(193), Match(20), Ins(19), Match(4), Ins(12), Match(4), Ins(83), Match(5), Ins(35), Match(3), Ins(50), Match(5), Ins(3), Match(3), Ins(29), Match(3), Ins(9), Match(2), Ins(1), Match(1), Ins(45), Match(4), Ins(51), Match(3), Ins(2), Match(4), Ins(58), Match(3), Ins(35), Match(3), Ins(6), Match(4), Ins(141), Match(5), Ins(109), Match(157), Del(1), Match(1), Del(1), Match(4), Ins(2), Match(3), Ins(3), Match(3), Del(1), Match(4), Del(5), Match(1), Del(3), Match(4), Ins(2), Match(1), Del(3), Match(2), Del(2), Match(5), Del(14), Match(4), Ins(1)], indel index - 58', src/reads/alignment_utils.rs:446:9
thread '<unnamed>' panicked at 'Read goes past end of reference: rstart - 0, necessary length - 295, ref len - 275, cigar - [Ins(125), Match(55), Del(64), Ins(21), Match(5), Ins(13), Match(4), Ins(10), Match(3), Ins(22), Match(4), Del(1), Match(3), Ins(4), Match(1), Ins(70), Match(5), Ins(2), Match(8), Ins(102), Match(4), Ins(8), Match(3), Ins(30), Match(3), Ins(32), Match(4), Del(1), Match(2), Ins(77), Match(7), Ins(18), Match(2), Ins(9), Match(2), Ins(5), Match(2), Ins(9), Match(3), Ins(135), Match(4), Ins(8), Match(3), Ins(7), Match(5), Ins(6), Match(5), Ins(28), Match(3), Ins(9), Match(2), Ins(6), Match(2), Ins(10), Match(4), Ins(6), Match(6), Ins(19), Match(4), Ins(18), Match(4), Ins(4), Match(1), Ins(3), Match(6), Ins(1), Match(1), Ins(1), Match(2), Ins(3), Match(3), Ins(5), Match(2), Ins(7), Match(8), Ins(4), Match(3), Ins(59), Match(4), Ins(27), Match(4), Ins(14), Match(3), Ins(2), Match(2), Ins(24), Match(4), Ins(2), Match(2), Ins(9), Match(2), Ins(1), Match(3), Ins(26), Match(3), Ins(20), Match(3), Ins(28), Match(2), Ins(23), Match(3), Ins(11), Match(3), Ins(3), Match(3), Ins(144)], indel index - 105', src/reads/alignment_utils.rs:446:9
thread '<unnamed>' panicked at 'Read goes past end of reference: rstart - 0, necessary length - 306, ref len - 286, cigar - [Ins(121), Match(54), Del(72), Ins(44), Match(2), Ins(39), Match(3), Ins(10), Match(3), Del(4), Match(3), Ins(2), Match(4), Ins(12), Match(2), Ins(4), Match(1), Ins(29), Match(4), Ins(9), Match(4), Ins(7), Match(3), Ins(8), Match(4), Ins(11), Match(4), Ins(6), Match(4), Ins(46), Match(3), Ins(4), Match(3), Ins(39), Match(2), Ins(15), Match(4), Ins(7), Match(3), Ins(10), Match(3), Ins(25), Match(5), Ins(21), Match(1), Del(1), Match(6), Ins(1), Match(2), Ins(12), Match(3), Ins(5), Match(2), Ins(2), Match(3), Ins(33), Match(5), Ins(35), Match(3), Ins(11), Match(5), Ins(2), Match(2), Ins(23), Match(4), Ins(6), Match(12), Ins(40), Match(3), Ins(1), Match(2), Ins(27), Match(6), Ins(2), Match(1), Ins(1), Match(3), Ins(1), Match(2), Ins(8), Match(5), Ins(5), Match(4), Ins(4), Match(3), Ins(26), Match(5), Ins(2), Match(5), Ins(4), Match(3), Ins(14), Match(3), Ins(16), Match(5), Ins(34), Match(4), Ins(1), Match(4), Ins(4), Match(1), Ins(17), Match(4), Ins(130)], indel index - 103', src/reads/alignment_utils.rs:446:9
thread '<unnamed>' panicked at 'Never found start Some(29) or stop None given cigar [Match(69), Ins(16), Match(2), Ins(11), Match(4), Ins(39), Match(3), Ins(13), Match(2), Ins(25), Match(2), Ins(10), Match(3), Ins(3), Match(3), Ins(26), Match(5), Ins(81), Match(5), Ins(7), Match(2), Ins(84), Match(3), Ins(36), Match(2), Ins(70), Match(4), Ins(14), Match(2), Ins(65), Match(6), Ins(33), Match(2), Ins(10), Match(5), Ins(50), Match(5), Ins(82), Match(5), Ins(16), Match(4), Ins(47), Match(5), Ins(69), Match(8), Ins(52), Match(5), Ins(14), Match(3), Ins(1), Match(4), Ins(12), Match(4), Ins(68), Match(5), Ins(20), Match(5), Ins(6), Match(1), Ins(42), Match(5), Ins(11), Match(2), Ins(13), Match(2), Ins(128), Match(6), Ins(87)] ref start 29 ref end 225 offset 0 bases AATCGGAAGCAGTGGGAGATTCTAAAGCAGAGAAGAAACGGTTTGTTTCAACCGTTGAAAATGCTATCAGGGGTGATACATATGCAAGTATTCTAAATTCTTTCTAAGAATAAAACCAAGCATACTATTTTTAATTACGTACGACTAAAAAATATCGGACGATTATTTTTGCTCGTTTTTTATTAGCTTAAATTTTTTGGTTTGTTTAGCTTTATTTTGTTCCTCTCTTAGAATGCGATAAGCGAAAATATAATTTTCAGTCCTTACTTGTATTGAATTTTATCAAGCAGTCAAATAATCATCAAACAATCCACTGTCTATCGTTGTCACTTTCCAATTTCCCGCATTGGTTTTAGGGGCTTTGTAAAAAATTTAGTAAGGTGGATATAGTTGAATGTTCGAATTCCTACAAAAAACGAACTATATTAGAGAAAGACCATTGTCTTTTTTTAAAGCTTTTATCACAACCATTAAAAAATTAGCTATCAACACACTACAAATAAGGATTTTTATCAAGGTTTTCATTGTTCAAAAAAGATACTTAGTAGAAAATTTTAATTAACCTGTTTAAACAATAACTCTCTGCATCTGATTTCCATACAGCGCAGTTGTTCGGCTCTAAATTCAAATAAATTACAATAAACTCAAAAAGTTCTTTTTTTTTTATGCTCTCCGTCATAGTATTTTTATTTTTCTTAGTTTCAATTTATTTTCACACTTCCTTTGAATTTCAATAATTGCACTGATAAAACACCGCCGTGAATGTTTTCTGAATCAGACTATTTTGGGGTACTTCATAATCCTGCATTCTCTTTATTCGAGTTATAAACCTGTCTTGCTCTGTAAATGCCTGACTTGTAATCATTGTATCCTTTGTCAAAGCAGAAACGGTATGTTCATCACATTTTAGTTTTGTAAAATTACGATCGTTTGTTCTACCGGACTAAACCAAACTAAACTTGGAAACTTTTTTCATCCGGCATTTATAGTATGCGATTTTGACCTCCTCTTTTACTTCTTCTGTTAATAGCCTTTCTCTCAAGCAACACATTTTAGTATGTCTTTAAACAGGCTAGTAGTGCTGTCGCCAATTTTAACATTCTTCCCCAGCATCTTAGATCGACTGTCCGATGAAACATAACTATACTTTGAGGTAACTTATTATCAATTTCTTCTGAAAACTCTACTTTCTACACTTATTTGCGTCCGACAATCTGCTCTTTCCAGGAAG
ATGATTGATCAGTTTCCCTGACGTGAAATCAACGAAACTGATTTACTAAAAGTAGTACTCTCCCGCAAAAAGACTTTTAAAATCAACGCTTACCGTGACGTCAATCAAATCTTCGAGAAGTTGTGTAATGCCTATCCTACGGCGTTCGTTTCGGCAGTTTATCTTCCGCATTTAAATCAAATTTGGTTGGGAGCTTCTCCCGAAACCCTTGTTTCTCAAGATG', src/reads/alignment_utils.rs:825:13
This error disappears when I use my basic Smith-Waterman implementation. The annoying thing is that this is not picked up in the test cases; everything is green and good, apparently. But it seems like gkl-rs is adding either additional Matches or Deletions, since the reference ends up needing to be longer for the alignment to make sense.
Cheers,
Rhys
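The "necessary length" in the panics above appears to be the number of reference bases the CIGAR consumes, i.e. the sum of the Match and Del lengths. For the second panic, the Match lengths sum to 66 and the Del lengths to 220, giving 286 against a reference length of 275. A minimal sketch of that check follows. The `Cigar` enum and both functions are hypothetical stand-ins, not Lorikeet's or gkl-rs's actual types.

```rust
/// Hypothetical CIGAR operation type, mirroring the names in the panic output.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Cigar {
    Match(u32),
    Ins(u32),
    Del(u32),
}

/// Number of reference bases consumed: Match and Del advance along the
/// reference, Ins only advances along the read.
fn reference_span(cigar: &[Cigar]) -> u32 {
    cigar
        .iter()
        .map(|op| match *op {
            Cigar::Match(n) | Cigar::Del(n) => n,
            Cigar::Ins(_) => 0,
        })
        .sum()
}

/// Returns an error describing the overrun if the alignment, starting at
/// `ref_start`, would extend past a reference of length `ref_len`.
fn fits_reference(ref_start: u32, cigar: &[Cigar], ref_len: u32) -> Result<(), String> {
    let needed = ref_start + reference_span(cigar);
    if needed <= ref_len {
        Ok(())
    } else {
        Err(format!(
            "alignment needs {} reference bases but only {} are available",
            needed, ref_len
        ))
    }
}
```

A check like this could be run directly on the aligner's output in a unit test, which might help narrow down a reproducible case.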
Hi Philip,
I've been playing around with large SW alignments and have noticed that the maximum alignment length is capped due to i16 constraints:
Line 24 in 9eec442
Cheers,
Rhys
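To make the limit concrete: an i16 can hold values up to 32767, so any length stored in one cannot exceed that. The sketch below shows a checked conversion using a hypothetical helper, `checked_len`. It does not reproduce the line referenced above, whose contents are not shown here.

```rust
/// Largest length representable in an i16 (32767).
const MAX_I16_LEN: usize = i16::MAX as usize;

/// Hypothetical helper: converts a sequence length to i16, returning None
/// when it does not fit instead of silently wrapping as a plain `as` cast would.
fn checked_len(seq_len: usize) -> Option<i16> {
    if seq_len <= MAX_I16_LEN {
        Some(seq_len as i16)
    } else {
        None
    }
}
```

Rejecting oversized inputs up front, as here, is one option; widening the integer type is another, though whether that is feasible depends on the underlying library.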
Hi Philip,
Just brainstorming here. I wanted to hear your thoughts on the possibility of taking the existing Smith-Waterman code and altering it to perform a global alignment instead. The only reason I'd want to do this is to see how gkl-rs would compare against existing global pairwise aligners, like WFA https://github.com/smarco/WFA2-lib
I'm hoping to take this on as a side project, but I first want to hear whether you think gkl-rs is adaptable in its current form.
Cheers,
Rhys
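As a rough illustration of what changing from local to global alignment involves, here is a score-only dynamic programming sketch with a linear gap penalty. It is not how GKL implements Smith-Waterman, and the function name, `Mode` enum, and scoring parameters are all made up for this example. The two modes differ in three places: how the first row and column are initialised, whether scores are floored at zero, and where the final score is read from.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Mode {
    Local,  // Smith-Waterman style
    Global, // Needleman-Wunsch style
}

/// Score-only alignment with a linear gap penalty (hypothetical sketch).
/// `gap` and `mismatch` are expected to be negative, `matched` positive.
fn align_score(a: &[u8], b: &[u8], mode: Mode, matched: i32, mismatch: i32, gap: i32) -> i32 {
    let cols = b.len() + 1;
    // First row: zeros for local, accumulated gap penalties for global.
    let mut prev: Vec<i32> = (0..cols)
        .map(|j| if mode == Mode::Global { gap * j as i32 } else { 0 })
        .collect();
    let mut best = 0;
    for i in 1..=a.len() {
        let mut cur = vec![0i32; cols];
        // First column follows the same rule as the first row.
        if mode == Mode::Global {
            cur[0] = gap * i as i32;
        }
        for j in 1..cols {
            let sub = if a[i - 1] == b[j - 1] { matched } else { mismatch };
            let mut s = (prev[j - 1] + sub).max(prev[j] + gap).max(cur[j - 1] + gap);
            if mode == Mode::Local {
                // Local alignment may restart anywhere, so never go below zero.
                s = s.max(0);
                best = best.max(s);
            }
            cur[j] = s;
        }
        prev = cur;
    }
    match mode {
        Mode::Local => best,            // best cell anywhere in the matrix
        Mode::Global => prev[cols - 1], // bottom-right cell only
    }
}
```

With match 2, mismatch -1, and gap -2, aligning ACGT against GGACGTGG scores 8 locally (the flanking bases are ignored) but 0 globally (four gap columns cancel the four matches).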