algbio / themisto

Space-efficient pseudoalignment with a colored de Bruijn graph

License: GNU General Public License v2.0

C++ 96.22% CMake 1.77% Dockerfile 0.22% Python 1.77% Makefile 0.02%

themisto's People

Contributors

cowkeyman, iosfwd, jnalanko, tmaklin, velimakinen

themisto's Issues

Compilation issue

Hi @jnalanko and thank you for this tool!

I'm running into a compilation issue after following the build instructions.
I'm compiling on Ubuntu 21.10 with gcc version 11.2.0.

make[2]: *** No rule to make target 'external/bzip2/libbz2.a', needed by 'bin/themisto'.  Stop.                                                                                   
make[1]: *** [CMakeFiles/Makefile2:444: CMakeFiles/themisto.dir/all] Error 2                                                                                                      
make: *** [Makefile:171: all] Error 2 

I've tried re-running make as suggested, with no luck.
I've also tried the pre-compiled binary for Linux, but it fails with:

Sorting KMC database
in1: 0% Illegal instruction (core dumped)

Any help appreciated!

core dumped during sorting KMC database

Hi @jnalanko, I tried to test themisto with 101 microbial genomes, but it crashed during the "Sorting KMC database" step. The disk has enough space, and the same error occurred on a second attempt.

$ fd fna.gz$ 101genomes/ > t.txt

$ themisto build -t 16 --temp-dir t -f -i t.txt -k 21 -o themisto 

42.4310 Thu Apr 13 09:16:31 2023 Themisto-3.0.0-13-gfea4f59
42.5140 Thu Apr 13 09:16:31 2023 Maximum k-mer length (size of the de Bruijn graph node labels): 31
43.0990 Thu Apr 13 09:16:31 2023 Build configuration:
Sequence file = t.txt
Index de Bruijn graph output file = themisto.tdbg
Index coloring output file = themisto.tcolors
Temporary directory = t
k = 21
Reverse complements = true
Number of threads = 16
Memory gigabytes = 2
Manual colors = false
Sequence colors = false
File colors = true
Load DBG = false
Handling of non-ACGT characters = delete
Coloring structure type: sdsl-hybrid
Verbosity = normal
43.1500 Thu Apr 13 09:16:31 2023 Starting
43.1670 Thu Apr 13 09:16:31 2023 Running GGCAT
Allocator initialized: mem: 2 GiB chunks: 8192 log2: 18
Started phase: reads bucketing prev stats: 
Temp buckets files size: 2.01 MiB
Finished phase: reads bucketing. phase duration: 2.87s gtime: 2.87s
Started phase: kmers merge prev stats: 
Processing bucket 295 of [1024[R:9599]]  ptime: 10.52s gtime: 13.39s phase eta: 27s est. tot: 37s
Processing bucket 576 of [1024[R:18999]]  ptime: 20.72s gtime: 23.60s phase eta: 16s est. tot: 37s
Processing bucket 847 of [1024[R:28284]]  ptime: 30.93s gtime: 33.80s phase eta: 6s est. tot: 37s
Total color subsets: 124451
Finished phase: kmers merge. phase duration: 38.44s gtime: 41.31s
Started phase: hashes sorting prev stats: 
Finished phase: hashes sorting. phase duration: 4.78s gtime: 46.09s
Started phase: links compaction prev stats: 
Iteration: 2
Remaining: 68628492  ptime: 14.27s gtime: 60.35s
Iteration: 6
Remaining: 15103529  ptime: 22.37s gtime: 68.45s
Completed compaction with 23 iters
Finished phase: links compaction. phase duration: 27.39s gtime: 73.48s
Started phase: reads reorganization prev stats: 
Finished phase: reads reorganization. phase duration: 6.92s gtime: 80.40s
Started phase: unitigs building prev stats: 
Finished phase: unitigs building. phase duration: 11.54s gtime: 91.94s
Started phase: maximal unitigs links building [step 1] prev stats: 
Finished phase: maximal unitigs links building [step 1]. phase duration: 657.82ms gtime: 92.60s
Started phase: maximal unitigs links building [step 2] prev stats: 
Finished phase: maximal unitigs links building [step 2]. phase duration: 1.05s gtime: 93.65s
Started phase: maximal unitigs links building [step 3] prev stats: 
Finished phase: maximal unitigs links building [step 3]. phase duration: 3.97s gtime: 97.62s
Compacted De Bruijn graph construction completed.
TOTAL TIME: 97.62s
Final stats:
        phase: reads bucketing  => 2.87s
        phase: kmers merge      => 38.44s
        phase: hashes sorting   => 4.78s
        phase: links compaction         => 27.39s
        phase: reads reorganization     => 6.92s
        phase: unitigs building         => 11.54s
        phase: maximal unitigs links building [step 1]  => 657.82ms
        phase: maximal unitigs links building [step 2]  => 1.05s
        phase: maximal unitigs links building [step 3]  => 3.97s
113923.5230 Thu Apr 13 09:18:25 2023 Building SBWT
113923.5740 Thu Apr 13 09:18:25 2023 Running KMC counter
**********************************************************************************************************************************
Stage 1: 100%
Stage 2: 100%


135340.1850 Thu Apr 13 09:18:46 2023 Sorting KMC database
in1: 0% Illegal instruction (core dumped)

$ ll t/
total 15G
-rw-r--r-- 1 shenwei shenwei 601K Apr 13 09:33 f26cCPUbWf.colors.dat
-rw-r--r-- 1 shenwei shenwei 3.2G Apr 13 09:34 f26cCPUbWf.fa
-rw-r--r-- 1 shenwei shenwei 1.5G Apr 13 09:34 iYDXFd2eZn.fa
-rw-r--r-- 1 shenwei shenwei 5.1M Apr 13 09:34 kmers1WiOuZI6Ad.kmc_pre
-rw-r--r-- 1 shenwei shenwei 9.5G Apr 13 09:34 kmers1WiOuZI6Ad.kmc_suf

$ ll  themisto.t*
-rw-r--r-- 1 shenwei shenwei 0 Apr 13 09:32 themisto.tcolors
-rw-r--r-- 1 shenwei shenwei 0 Apr 13 09:32 themisto.tdbg

BOSS_TEST.construction hangs sometimes

GDB backtrace when the hang happens:

#0  0x00007fd7468cbd2d in __GI___pthread_timedjoin_ex (threadid=140562561988352, thread_return=0x0, abstime=0x0, block=<optimized out>) at pthread_join_common.c:89
#1  0x00007fd7463affb3 in std::thread::join() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x000055e68a3253f6 in CExceptionAwareThread::join (this=0x55e68cf85490) at /usr/include/c++/8/bits/unique_ptr.h:345
#3  CKMC<2u>::ProcessStage2_impl (this=<optimized out>) at /home/niklas/code/Themisto/KMC/kmc_core/kmc.h:1659
#4  0x000055e68a24ec19 in CKMC<2u>::ProcessStage2 (this=<optimized out>) at /home/niklas/code/Themisto/KMC/kmc_core/kmc.h:1802
#5  KMC::CApplication<2u>::ProcessStage2 (stage2Params=..., this=0x55e68ce6c530) at /home/niklas/code/Themisto/KMC/kmc_core/kmc_runner.cpp:62
#6  KMC::CApplication<2u>::ProcessStage2 (stage2Params=..., this=0x55e68ce6c530) at /home/niklas/code/Themisto/KMC/kmc_core/kmc_runner.cpp:57
#7  KMC::CApplication<3u>::ProcessStage2 (stage2Params=..., this=<optimized out>) at /home/niklas/code/Themisto/KMC/kmc_core/kmc_runner.cpp:65
#8  KMC::CApplication<4u>::ProcessStage2 (stage2Params=..., this=<optimized out>) at /home/niklas/code/Themisto/KMC/kmc_core/kmc_runner.cpp:65
#9  KMC::CApplication<5u>::ProcessStage2 (stage2Params=..., this=<optimized out>) at /home/niklas/code/Themisto/KMC/kmc_core/kmc_runner.cpp:65
#10 KMC::CApplication<6u>::ProcessStage2 (stage2Params=..., this=<optimized out>) at /home/niklas/code/Themisto/KMC/kmc_core/kmc_runner.cpp:65
#11 KMC::CApplication<7u>::ProcessStage2 (stage2Params=..., this=<optimized out>) at /home/niklas/code/Themisto/KMC/kmc_core/kmc_runner.cpp:65
#12 KMC::CApplication<8u>::ProcessStage2 (stage2Params=..., this=<optimized out>) at /home/niklas/code/Themisto/KMC/kmc_core/kmc_runner.cpp:65
#13 KMC::Runner::RunnerImpl::RunStage2 (stage2Params=..., this=<optimized out>) at /home/niklas/code/Themisto/KMC/kmc_core/kmc_runner.cpp:424
#14 KMC::Runner::RunStage2 (this=<optimized out>, params=...) at /home/niklas/code/Themisto/KMC/kmc_core/kmc_runner.cpp:439
#15 0x000055e68a1beda8 in run_kmc (input_files=std::vector of length 1, capacity 1 = {...}, k=39, n_threads=2, ram_gigas=2, min_abundance=1, max_abundance=1000000000)
    at /home/niklas/code/Themisto/src/KMC_wrapper.cpp:47
#16 0x000055e68a0be958 in test_construction (tcase=..., reverse_complements=false) at /home/niklas/code/Themisto/include/libwheeler/BOSS_tests.hh:158
#17 0x000055e68a0bef5b in BOSS_TEST_construction_Test::TestBody (this=0x55e68cd88540) at /home/niklas/code/Themisto/include/libwheeler/BOSS_tests.hh:172
#18 0x000055e68a1fc56c in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x55e68cd88540, method=&virtual testing::Test::TestBody(), 
    location=0x55e68a47a323 "the test body") at /home/niklas/code/Themisto/googletest/googletest/src/gtest.cc:2599
#19 0x000055e68a1f63fb in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=0x55e68cd88540, method=&virtual testing::Test::TestBody(), 
    location=0x55e68a47a323 "the test body") at /home/niklas/code/Themisto/googletest/googletest/src/gtest.cc:2635
#20 0x000055e68a1d5344 in testing::Test::Run (this=0x55e68cd88540) at /home/niklas/code/Themisto/googletest/googletest/src/gtest.cc:2674
#21 0x000055e68a1d5d61 in testing::TestInfo::Run (this=0x55e68cd79d20) at /home/niklas/code/Themisto/googletest/googletest/src/gtest.cc:2853
#22 0x000055e68a1d6648 in testing::TestSuite::Run (this=0x55e68cd798b0) at /home/niklas/code/Themisto/googletest/googletest/src/gtest.cc:3012
#23 0x000055e68a1e5a04 in testing::internal::UnitTestImpl::RunAllTests (this=0x55e68cd750b0) at /home/niklas/code/Themisto/googletest/googletest/src/gtest.cc:5870
#24 0x000055e68a1fd75e in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x55e68cd750b0, 
    method=(bool (testing::internal::UnitTestImpl::*)(class testing::internal::UnitTestImpl * const)) 0x55e68a1e55c2 <testing::internal::UnitTestImpl::RunAllTests()>, 
    location=0x55e68a47ad80 "auxiliary test code (environments or event listeners)") at /home/niklas/code/Themisto/googletest/googletest/src/gtest.cc:2599
#25 0x000055e68a1f730f in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x55e68cd750b0, 
    method=(bool (testing::internal::UnitTestImpl::*)(class testing::internal::UnitTestImpl * const)) 0x55e68a1e55c2 <testing::internal::UnitTestImpl::RunAllTests()>, 
    location=0x55e68a47ad80 "auxiliary test code (environments or event listeners)") at /home/niklas/code/Themisto/googletest/googletest/src/gtest.cc:2635
#26 0x000055e68a1e40dc in testing::UnitTest::Run (this=0x55e68a7779e0 <testing::UnitTest::GetInstance()::instance>)
    at /home/niklas/code/Themisto/googletest/googletest/src/gtest.cc:5444
#27 0x000055e68a0cc5f6 in RUN_ALL_TESTS () at /home/niklas/code/Themisto/googletest/googletest/include/gtest/gtest.h:2293
#28 0x000055e68a0c43ca in main (argc=2, argv=0x7fffb7f7ca98) at /home/niklas/code/Themisto/tests/test_main.cpp:32

Themisto doesn't work with any input data (Error: Cannot open temporary file tmp/kmc_00250.bin)

With every input FASTA file I'm getting the following error:

$ /Users/karel/github/themisto/build/bin/build_index --k 31 --input-file small.fasta --index-dir index --temp-dir tmp
0.0250 Mon Sep 14 15:05:53 2020 Themisto-v0.2.0-1-gd8e44f5
Input file = small.fasta
Input format = fasta
Index directory = index
Temporary directory = tmp
k = 31
Number of threads = 1
Memory megabytes = 1000
Automatic colors = false
Load BOSS = false
0.0260 Mon Sep 14 15:05:53 2020 Starting
0.0260 Mon Sep 14 15:05:53 2020 Making all characters upper case and replacing non-{A,C,G,T} characters with random characeters from {A,C,G,T}
0.0260 Mon Sep 14 15:05:53 2020 Replaced 0 characters
0.0270 Mon Sep 14 15:05:53 2020 Building BOSS
0.0270 Mon Sep 14 15:05:53 2020 Listing (k+2)-mers
Calling KMC with: kmc -fm -k33 -b -m1 -ci1 -cs1 -cx4294967295 -t1 tmp/seqs-p0cfNR6pGewk8dL4ndt8NusCT tmp/KMCkqL5Ep8uXIfD7yGaYdY16rptL tmp 
**
Error: Cannot open temporary file tmp/kmc_00250.bin

Randomize non-ACGT and then build colors by --load-dbg may fail?

When non-ACGT characters are randomized in a run that builds the graph without colors, and the colors are then built in a separate run with --load-dbg, the program should in principle crash: unless the second run makes exactly the same random choices, some of the k-mers will change, and this messes up the coloring. At the moment it works only because we never give a seed value to std::rand, so according to the standard it behaves as if it were seeded with the value 1. No other piece of code uses std::rand before the coloring, so by luck we happen to get the same random choices. This might break in the future, so it should be fixed.
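One possible way to remove the dependence on the implicit std::rand seed is to give the replacement step its own explicitly seeded generator. The following is only a sketch: the function name `replace_non_acgt` and the `seed` parameter are hypothetical and not taken from the Themisto code base, and the input is assumed to be upper-cased already.

```cpp
#include <cstdint>
#include <random>
#include <string>

// Replace every character outside {A,C,G,T} with a pseudo-random nucleotide.
// The generator is seeded explicitly, so two runs with the same seed make the
// same choices regardless of who else calls std::rand.
// Hypothetical helper; assumes the input is already upper case.
std::string replace_non_acgt(std::string seq, std::uint32_t seed) {
    static const char alphabet[] = {'A', 'C', 'G', 'T'};
    std::mt19937 rng(seed);  // the output sequence of mt19937 is fixed by the standard
    for (char& c : seq) {
        if (c != 'A' && c != 'C' && c != 'G' && c != 'T') {
            // Use the raw engine output instead of a distribution object:
            // how std::uniform_int_distribution maps engine output to values
            // is implementation-defined, the engine output itself is not.
            c = alphabet[rng() & 3u];
        }
    }
    return seq;
}
```

With this approach the seed would have to be stored with the index (or fixed as a constant) so that the coloring run can reproduce the same replacements.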

Odd behavior if input multifasta to `build` contains empty sequences

Affects at least Themisto v2.1.0.

If the input multifasta file to themisto build contains a sequence with no nucleotides and the built index is used in the pseudoalign command, then the pseudoalignment seems to skip all sequences that come after the empty sequence in the fasta file and reports only matches in the ones that came before it. This seems a bit weird to me :)

I think the intuitive behavior in this case would be to either warn about the empty sequences during index building and prune them from the index, or exit the build process with a helpful error instructing the user to fix the issue (which would be my preferred option).
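As an illustration of the second option, a pre-build check could scan the input for records without any nucleotides and report their headers. This is a minimal sketch, not Themisto's actual input parser; the helper name `find_empty_fasta_records` is hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Return the headers (without the leading '>') of FASTA records that contain
// no sequence characters. Hypothetical helper, not part of Themisto.
std::vector<std::string> find_empty_fasta_records(std::istream& in) {
    std::vector<std::string> empty_headers;
    std::string line, header;
    bool in_record = false;     // have we seen a header yet?
    bool has_sequence = false;  // has the current record any sequence line?
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (!line.empty() && line[0] == '>') {
            // A new header closes the previous record.
            if (in_record && !has_sequence) empty_headers.push_back(header);
            header = line.substr(1);
            in_record = true;
            has_sequence = false;
        } else if (in_record && !line.empty()) {
            has_sequence = true;
        }
    }
    // The last record is closed by end of input.
    if (in_record && !has_sequence) empty_headers.push_back(header);
    return empty_headers;
}

// Convenience overload for in-memory text.
std::vector<std::string> find_empty_fasta_records(const std::string& text) {
    std::istringstream in(text);
    return find_empty_fasta_records(in);
}
```

A build command could call this first and exit with an error listing the returned headers if the vector is non-empty.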

Reproducible example

Input data

  • example.fasta
>false_match
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
>empty_seq
>true_match
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
  • example.fastq
>read
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG

Commands run

themisto build -k 31 -i example.fasta -o index --temp-dir tmp -m 2000 -t 4
themisto pseudoalign -q example.fastq -o example.aln -i index --temp-dir tmp -t 4

Output

  • What is returned:
0 
  • "Expected" output:
0 2 

Sequence and reverse complement generating different unitigs

If I give themisto-build a file with just a sequence and its reverse complement, extract-unitigs is generating two different unitigs -- is this the expected behavior?

e.g. if 80.fna contains

>k80
ATCAGCAGCGACATGGCGGTCATCACCGTAGTCGAGGCAAGCAATAATGGACGGCGCCCG
ACGTGGTCGATGATCGCAGA
>rc.k80
TCTGCGATCATCGACCACGTCGGGCGCCGTCCATTATTGCTTGCCTCGACTACGGTGATG
ACCGCCATGTCGCTGCTGAT

and then run
themisto build -k 31 -i 80.fna -o 80.k31 --temp-dir .
themisto extract-unitigs -i 80.k31 --colors-out 80.k31.colors --gfa-out 80.k31.gfa

I get a file with two lines in the colors file and two segments in the GFA file

H VN:Z:1.0
S 86 ATCAGCAGCGACATGGCGGTCATCACCGTAGTCGAGGCAAGCAATAATGGACGGCGCCCGACGTGGTCGATGATCGCAGA
S 77 TCTGCGATCATCGACCACGTCGGGCGCCGTCCATTATTGCTTGCCTCGACTACGGTGATGACCGCCATGTCGCTGCTGAT
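To check on the user side whether two reported segments are the same sequence on opposite strands, one can compare their canonical forms (the lexicographically smaller of a sequence and its reverse complement). The sketch below only illustrates that comparison; it makes no claim about how extract-unitigs is meant to behave, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>

// Complement of a single upper-case nucleotide.
char complement(char c) {
    switch (c) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default: throw std::invalid_argument("non-ACGT character");
    }
}

// Reverse the sequence, then complement every base.
std::string reverse_complement(const std::string& s) {
    std::string rc(s.rbegin(), s.rend());
    std::transform(rc.begin(), rc.end(), rc.begin(), complement);
    return rc;
}

// Canonical form: the lexicographically smaller of s and its reverse complement.
// Two sequences on opposite strands share the same canonical form.
std::string canonical(const std::string& s) {
    std::string rc = reverse_complement(s);
    return std::min(s, rc);
}
```

Applying `canonical` to the sequence field of every `S` line and removing duplicates would collapse such strand pairs in post-processing.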

Indexes for the same de Bruijn graph built from a different number of copies of the same FASTA file have different sizes

When I wanted to evaluate how much space the BOSS representation occupies as implemented in Themisto, the following simple test failed: I built the index from a FASTA file and then from two copies of the same file. Clearly, both files contain the same k-mers. However, the resulting indexes have different sizes:

cat ../pangenome-ngonorrhoeae.assembly18.fa > ngono.asm18_1cop.fa
(
        cat ../pangenome-ngonorrhoeae.assembly18.fa
        cat ../pangenome-ngonorrhoeae.assembly18.fa
) > ngono.asm18_2cop.fa

../../../themisto/bin/build_index --mem-megas 10000 --k 18 --input-file ngono.asm18_1cop.fa --n-threads 4 --index-dir index1cop --temp-dir tmp1cop

../../../themisto/bin/build_index --mem-megas 10000 --k 18 --input-file ngono.asm18_2cop.fa --n-threads 4 --index-dir index2cop --temp-dir tmp2cop
tar cvf - index1cop > index1cop.tar
tar cvf - index2cop > index2cop.tar
ls -alh index1cop.tar index2cop.tar 

-rw-rw-r-- 1 kb219 kb219 11M Nov  7 01:44 index1cop.tar
-rw-rw-r-- 1 kb219 kb219 13M Nov  7 01:44 index2cop.tar
(
        echo index1cop
        ls -alh index1cop/*
        echo index2cop
        ls -alh index2cop/*
)

index1cop
-rw-rw-r-- 1 kb219 kb219    0 Nov  7 01:38 index1cop/boss-alphabet
-rw-rw-r-- 1 kb219 kb219 1.8K Nov  7 01:38 index1cop/boss-C
-rw-rw-r-- 1 kb219 kb219   11 Nov  7 01:38 index1cop/boss-constants
-rw-rw-r-- 1 kb219 kb219 2.2M Nov  7 01:38 index1cop/boss-indegs
-rw-rw-r-- 1 kb219 kb219 549K Nov  7 01:38 index1cop/boss-indegs_rs
-rw-rw-r-- 1 kb219 kb219 252K Nov  7 01:38 index1cop/boss-indegs_ss0
-rw-rw-r-- 1 kb219 kb219 264K Nov  7 01:38 index1cop/boss-indegs_ss1
-rw-rw-r-- 1 kb219 kb219 2.2M Nov  7 01:38 index1cop/boss-outdegs
-rw-rw-r-- 1 kb219 kb219 549K Nov  7 01:38 index1cop/boss-outdegs_rs
-rw-rw-r-- 1 kb219 kb219 252K Nov  7 01:38 index1cop/boss-outdegs_ss0
-rw-rw-r-- 1 kb219 kb219 264K Nov  7 01:38 index1cop/boss-outdegs_ss1
-rw-rw-r-- 1 kb219 kb219 3.7M Nov  7 01:38 index1cop/boss-outlabels-wt
index2cop
-rw-rw-r-- 1 kb219 kb219    0 Nov  7 01:44 index2cop/boss-alphabet
-rw-rw-r-- 1 kb219 kb219 1.9K Nov  7 01:44 index2cop/boss-C
-rw-rw-r-- 1 kb219 kb219   12 Nov  7 01:44 index2cop/boss-constants
-rw-rw-r-- 1 kb219 kb219 2.6M Nov  7 01:44 index2cop/boss-indegs
-rw-rw-r-- 1 kb219 kb219 661K Nov  7 01:44 index2cop/boss-indegs_rs
-rw-rw-r-- 1 kb219 kb219 303K Nov  7 01:44 index2cop/boss-indegs_ss0
-rw-rw-r-- 1 kb219 kb219 318K Nov  7 01:44 index2cop/boss-indegs_ss1
-rw-rw-r-- 1 kb219 kb219 2.6M Nov  7 01:44 index2cop/boss-outdegs
-rw-rw-r-- 1 kb219 kb219 661K Nov  7 01:44 index2cop/boss-outdegs_rs
-rw-rw-r-- 1 kb219 kb219 303K Nov  7 01:44 index2cop/boss-outdegs_ss0
-rw-rw-r-- 1 kb219 kb219 318K Nov  7 01:44 index2cop/boss-outdegs_ss1
-rw-rw-r-- 1 kb219 kb219 4.4M Nov  7 01:44 index2cop/boss-outlabels-wt

OS: OSX, Themisto version: 21a48ec
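The premise of the test above, that duplicating the input does not change the set of distinct k-mers, can be verified independently of Themisto. The following sketch uses a hypothetical helper `distinct_kmers` and a plain std::set; it says nothing about why the index sizes differ.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Collect the distinct k-mers over a list of sequences. K-mers never span
// two sequences, which mirrors separate FASTA records. Hypothetical helper.
std::set<std::string> distinct_kmers(const std::vector<std::string>& seqs,
                                     std::size_t k) {
    std::set<std::string> kmers;
    for (const std::string& s : seqs) {
        if (k == 0 || s.size() < k) continue;  // too short to contain a k-mer
        for (std::size_t i = 0; i + k <= s.size(); ++i) {
            kmers.insert(s.substr(i, k));
        }
    }
    return kmers;
}
```

Calling it once with the records of the single-copy file and once with every record listed twice should give equal sets, so any size difference has to come from something other than the k-mer content.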

themisto build dies when multithreading

Hi there, I'm trying to run mGEMS via demix_check and getting an error when using --threads >1-2.
I don't know if this is a known issue; I'm trying it with the newest version of themisto.

`terminate called after throwing an instance of 'std::bad_alloc'` when asking for more memory than is available

Hi,

Running build_index with --mem-megas higher than what is available on the machine (16000 in my example) will terminate with an uncaught std::bad_alloc exception. The output from an example run where this happens is the following text:

[temaklin@xps13 example]$ ~/Tools/themisto/build/bin/build_index -k31 --mem-megas 100000 --input-file example.fasta --index-dir index --temp-dir index/tmp
0.0270 Tue Oct 19 16:08:06 2021 Themisto-v1.1.0-3-ge591cd2
0.0270 Tue Oct 19 16:08:06 2021 Maximum k-mer length (size of the de Bruijn graph node labels): 60
Input file = example.fasta
Input format = fasta
Index directory = index
Temporary directory = index/tmp
k = 31
Number of threads = 1
Memory megabytes = 100000
Automatic colors = false
Load BOSS = false
Preprocessing buffer size = 4096
0.0280 Tue Oct 19 16:08:06 2021 Starting
0.0280 Tue Oct 19 16:08:06 2021 Making all characters upper case and replacing non-{A,C,G,T} characters with random characters from {A,C,G,T}
0.2280 Tue Oct 19 16:08:06 2021 Replaced 0 characters
0.2280 Tue Oct 19 16:08:06 2021 Building BOSS
0.2280 Tue Oct 19 16:08:06 2021 Building KMC database
Validating input alphabet
Calling KMC with: kmc -b -fm -k32 -m93 -ci1 -cs1 -cx4294967295 -t1 index/tmp/seqs-emQeceWVoPCgQwZvP0I7XacvO index/tmp/KMCmS1OFXxZcnrO6MakHmnMkZ7I1 index/tmp 
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
caught signal: 6
Cleaning up temporary files
Aborting

In other cases where the call to build_index is somehow wrong, the exceptions are caught and Themisto gives a more helpful error message before terminating. Should this case similarly suggest checking the value of --mem-megas before terminating?
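A minimal sketch of what such handling could look like is given below. The wrapper name `run_with_memory_hint` and the message text are hypothetical, and the sketch assumes the exception reaches the calling thread; a std::bad_alloc thrown inside a worker thread would need to be forwarded separately.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Run a piece of work and turn std::bad_alloc into a message that points the
// user at the memory setting. Returns an empty string on success.
// Hypothetical wrapper, not part of Themisto.
template <typename Work>
std::string run_with_memory_hint(Work&& work, std::size_t mem_megas) {
    try {
        work();
        return "";
    } catch (const std::bad_alloc&) {
        return "Error: memory allocation failed with --mem-megas " +
               std::to_string(mem_megas) +
               ". Check that the value does not exceed the available memory.";
    }
}
```

The caller would print the returned message and exit with a non-zero status instead of letting the exception reach std::terminate.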

27092 Killed: 9

I tested the method with simplitigs of the human genome (HG38, k=31, https://zenodo.org/record/3770419/files/hg38-simplitigs31.fa.gz?download=1). I always get the error Killed: 9, which happens after the program gets stuck on "Dumping k-mers to disk" for a very long time (>1h).

  • OS: OS X
  • Command /Users/karel/github/themisto/build/bin/build_index --mem-megas 10000 --k 31 --input-file hg38-simplitigs31.fasta --n-threads 8 --index-dir index --temp-dir tmp
0.0230 Mon Sep 14 17:25:06 2020 Themisto-v0.2.0-1-gd8e44f5
Input file = hg38-simplitigs31.fasta
Input format = fasta
Index directory = index
Temporary directory = tmp
k = 31
Number of threads = 8
Memory megabytes = 10000
Automatic colors = false
Load BOSS = false
0.0250 Mon Sep 14 17:25:06 2020 Starting
0.0250 Mon Sep 14 17:25:06 2020 Making all characters upper case and replacing non-{A,C,G,T} characters with random characeters from {A,C,G,T}
69.9690 Mon Sep 14 17:26:16 2020 Replaced 0 characters
69.9750 Mon Sep 14 17:26:16 2020 Building BOSS
69.9750 Mon Sep 14 17:26:16 2020 Listing (k+2)-mers
Calling KMC with: kmc -fm -k33 -b -m10 -ci1 -cs1 -cx4294967295 -t8 tmp/seqs-AetcbrX4gJpCZDtoyYuBdjcAh tmp/KMCfyJijWik8oHkSTcWJIAzu3wE9 tmp 
**********************
Stage 1: 100%
Stage 2: 100%
Dumping k-mers to disk
./construct.sh: line 13: 27092 Killed: 9               /Users/karel/github/themisto/build/bin/build_index --mem-megas 10000 --k 31 --input-file hg38-simplitigs31.fasta --n-threads 8 --index-dir index --temp-dir tmp

Themisto executable or conda version?

Hi,
I would like to use your tool mGEMS, because I was happy with the results that I got with mSWEEP. Unfortunately, I cannot use Themisto since I'm working on a high performance cluster with multiple users. Do you plan on providing a conda version or an executable file for Themisto?

Many thanks and best regards,
Josephine
