gatb / simka
Simka and SimkaMin are comparative metagenomics methods dedicated to NGS datasets.
Home Page: https://gatb.inria.fr/software/simka/
License: GNU Affero General Public License v3.0
Hello there,
I'm running into an issue when executing test_simkaMin.py after the package builds. I have tested this both with the Debian package I am building and with a fresh, untouched copy of simka from this GitHub repository. The test fails intermittently with the following:
...
python ../../simkaMin/simkaMin.py -in ../../example/simka_input.txt -out __results__/k21_filter_0-1000_n1 -nb-cores 1 -max-memory 100 -kmer-size 21 -nb-kmers 1000 -bin ../../build/bin/simkaMinCore -max-reads 0 -filter
- TEST ERROR: mat_presenceAbsence_jaccard.csv
res
;A;B;C;D;E
A;0.000000;0.780808;0.940741;0.780808;0.446000
B;0.000000;0.000000;0.733333;0.000000;0.873737
C;0.000000;0.000000;0.000000;0.733333;0.970370
D;0.000000;0.000000;0.000000;0.000000;0.873737
E;0.000000;0.000000;0.000000;0.000000;0.000000
truth
;A;B;C;D;E
A;0.000000;0.783000;0.984000;0.783000;0.446000
B;0.000000;0.000000;0.918000;0.000000;0.875000
C;0.000000;0.000000;0.000000;0.918000;0.992000
D;0.000000;0.000000;0.000000;0.000000;0.875000
E;0.000000;0.000000;0.000000;0.000000;0.000000
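For what it's worth, since simkaMin estimates distances from a subsample of k-mers (-nb-kmers 1000 here), small deviations from the stored truth matrix seem expected, and a tolerance-based comparison might make the test robust. A minimal sketch (the parse_matrix helper and the 0.05 tolerance are assumptions of mine, not part of simka's test code):

```python
import csv
import io

def parse_matrix(text):
    """Parse a semicolon-separated Simka distance matrix into a dict
    keyed by (row_id, col_id)."""
    rows = list(csv.reader(io.StringIO(text.strip()), delimiter=";"))
    header = rows[0][1:]  # sample ids; the first header cell is empty
    values = {}
    for row in rows[1:]:
        for col_id, cell in zip(header, row[1:]):
            values[(row[0], col_id)] = float(cell)
    return values

def matrices_close(res_text, truth_text, tol=0.05):
    """Return True if every entry differs by at most `tol`."""
    res, truth = parse_matrix(res_text), parse_matrix(truth_text)
    return res.keys() == truth.keys() and all(
        abs(res[k] - truth[k]) <= tol for k in truth
    )
```

With such a check, the matrices shown above would compare as "close enough" while still catching genuinely wrong values.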
This is currently blocking the completion of the Debian package [1]. If there are any recommendations or remedies for how this can be patched or fixed upstream, I'd greatly appreciate it.
Many thanks,
Shayan Doust
My test has been stuck for an hour at:
[Merging datasets ] 81.8 % elapsed: 0 min 1 sec remaining: 0 min 0 sec cpu: 0.0 % mem: [ 11, 11, 12] MB
I tried the option -max-merge 4, but now it is stuck at another spot:
[Merging datasets ] 86.4 % elapsed: 0 min 0 sec remaining: 0 min 0 sec cpu: 0.0 % mem: [ 10, 10, 10] MB 21 already merged (remove file /home/bio/Desktop/simka/example/simka_temp_output/simka_output_temp//merge_synchro/21.ok to merge again)
Hello :)
Thanks for maintaining Simka!
Is it possible to obtain the k-mer count tables for each sample analyzed with Simka, to then use them in R to calculate alpha diversity metrics?
Thanks,
L
The build installs h5cc, which is already installed by hdf5, as well as .lib/libhdf5.settings and /usr/ports/biology/simka/work/.build/ext/gatb-core/include/Release/hdf5/H5ACpublic.h, which are most likely misplaced or not needed.
For example, it installs /usr/ports/biology/simka/work/.build/ext/gatb-core/include/Release/hdf5 (a directory where I built simka).
Because the release archive of the source lacks the .git folder, the commands git submodule init and git submodule update from the INSTALL file each fail with the following error:
fatal: Not a git repository (or any of the parent directories): .git
Hi,
I am wondering if it is possible to visualise a subsample of the simka results instead of all of the output.
Thanks,
George
Greetings, I enjoy using simka but I have a question about manipulating the output files.
Specifically, is there a way to convert 'mat_abundance_braycurtis.csv' into a lower triangular matrix?
I'm really interested in working with a triangular matrix, much like the one produced by vegan.
for example:
library(vegan)
mat <- matrix(1:9, 3, 3)
mat.dis<-vegdist(mat)
mat.dis
1 2
2 0.11111111
3 0.20000000 0.09090909
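One way to get there, sketched in Python under the assumption that the matrix uses Simka's usual semicolon-separated layout (the 8-decimal formatting simply mimics vegan's display and is my own choice):

```python
import csv

def lower_triangle(path):
    """Read a square semicolon-separated Simka matrix and print its
    strictly lower triangle, one row per sample, vegan-style."""
    with open(path) as fh:
        rows = list(csv.reader(fh, delimiter=";"))
    ids = rows[0][1:]  # sample ids; the first header cell is empty
    print("\t".join([""] + ids[:-1]))  # header: all samples but the last
    for i, row in enumerate(rows[1:]):
        if i == 0:
            continue  # the first sample has no lower-triangle entries
        cells = [f"{float(v):.8f}" for v in row[1 : i + 1]]
        print("\t".join([row[0]] + cells))
```

Called on mat_abundance_braycurtis.csv (uncompressed), this prints the same shape of output as vegdist in the example above.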
Hello, thanks for your work on this tool! I wanted to ask whether it is okay to reuse previous simka merge results on a subset of the data. E.g., I have already run simka on a large set of samples; if I rerun simka using the same temporary directory but pass only a subset of the original files as the files of interest, will this give the correct distance metrics for that subset of samples?
Hi
I have made a simple comparison between two fasta files. Simka performed well; here is one of the matrix files:
;id1;id2
id1;0.000000;0.991121
id2;0.991121;0.000000
but the scripts fail to create the heatmaps:
python create_heatmaps.py ../build/bin/test.txt/
Error in graphics:::plotHclust(n1, merge, height, order(x$order), hang, :
invalid dendrogram input
Calls: plot -> plot.hclust ->
Execution halted
Second question:
Concerning the results, are the values a fraction or a percentage?
Thanks for the help.
Hi,
I'm running simka on around 1000 samples with a total of 45 billion reads.
simka -in simka_input.txt -out results_simka -out-tmp temp_output -simple-dist -max-count 6 -max-merge 18 -nb-cores 112 -max-memory 100000
In the first two days it creates the following folders:
drwxr-xr-t 2 28039 Jun 17 14:32 input
drwxr-xr-t 2 0 Jun 17 14:32 merge_synchro
drwxr-xr-t 2 0 Jun 17 14:32 stats
drwxr-xr-t 2 0 Jun 17 14:32 job_count
drwxr-xr-t 2 0 Jun 17 14:32 job_merge
-rw-r--r-- 1 10989 Jun 17 14:32 datasetIds
-rw-r--r-- 1 46824 Jun 17 14:38 config.h5
drwxr-xr-t 344 8782 Jun 17 14:38 solid
drwxr-xr-t 2 38016 Jun 19 06:13 log
drwxr-xr-t 7 140 Jun 19 06:15 temp
drwxr-xr-t 2 31850 Jun 19 06:15 kmercount_per_partition
drwxr-xr-t 2 30854 Jun 19 06:15 count_synchro
But since June 19th nothing has happened, although the job is still running. Is this normal? Should I keep waiting?
Sometimes, especially when we have a lot of input files, the job stops at the merge stage (v1.3.2 and v1.4.0).
Dear developers,
I find it really strange that the figures are set in inches. The SI system uses meters; even NASA does its science in meters.
Could we at least have the option to set them in cm?
Hi, I ran into this error using this input (built from the latest version):
WP1310: /condo/ieg/qiqi/Haibei_metaG/WP1310_paired_1.fastq.gz ; /condo/ieg/qiqi/Haibei_metaG/WP1310_paired_2.fastq.gz
the error is: ERROR: Can't open dataset: WP1310
Any idea why?
Testing ran successfully.
It took me half a day to find out why. Is support for fastq.gz not ready?
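As a debugging aid, it may help to first confirm that each gzipped FASTQ in the input line opens and parses at all; a small sanity check, offered as a sketch rather than anything from simka:

```python
import gzip

def check_fastq_gz(path, max_records=10):
    """Read the first few records of a gzipped FASTQ and verify the
    basic 4-line record structure. Returns the number of records seen."""
    seen = 0
    with gzip.open(path, "rt") as fh:
        while seen < max_records:
            header = fh.readline()
            if not header:
                break  # clean end of file
            seq = fh.readline()
            plus = fh.readline()
            qual = fh.readline()
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"{path}: malformed record {seen + 1}")
            if len(seq) != len(qual):
                raise ValueError(f"{path}: sequence/quality length mismatch")
            seen += 1
    return seen
```

If both files pass, the problem is more likely in the input-line syntax (the 'ID: file1 ; file2' format) than in the files themselves.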
Thanks,
Jianshu
Hello,
Simka has been packaged [1] by the Debian Med Team. However, there is a regression, only on the arm64 architecture, that is blocking the unstable-to-testing migration. I'd rather sort this out before the next Debian release freeze.
Here is the log. The autopkgtest script is here (this is what gets executed and what causes the regression). Any ideas?
Thanks!
Can the kmers corresponding to a partition be recovered?
Hello! I hope you are well. simka looks like a great tool!
How are the samples normalized with the -max-reads 0 flag? I did not see a description of this in the paper.
Have you considered normalization options such as suggested here:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003531 ?
Or transformation options suggested here:
https://www.frontiersin.org/articles/10.3389/fmicb.2017.02224/full ?
Since the default is not to normalize, is the intended workflow to subset all samples to the same number of reads prior to running simka?
Have you tested how much the size discrepancies actually affect the various distance metrics?
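Subsetting all samples to the same number of reads can be sketched with reservoir sampling over 4-line FASTQ records (a plain-Python illustration, not simka functionality; tools like seqtk sample do the same thing more efficiently):

```python
import random

def subsample_fastq(lines, n, seed=0):
    """Reservoir-sample n 4-line FASTQ records from an iterable of lines.
    Returns a list of (header, seq, plus, qual) tuples."""
    rng = random.Random(seed)
    reservoir = []
    it = iter(lines)
    # enumerate counts one step per record: the three next() calls inside
    # the loop consume the remaining lines of the current record.
    for i, header in enumerate(it):
        record = (header, next(it), next(it), next(it))
        if len(reservoir) < n:
            reservoir.append(record)
        else:
            j = rng.randrange(i + 1)
            if j < n:
                reservoir[j] = record
    return reservoir
```

Each record ends up in the sample with probability n/total, which is the property an equal-depth subset needs.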
Thanks for the clarification.
best,
Roth
I'd like to congratulate the developers on a great metagenomics software tool. I'd like to recommend adding simka to the bioconda channel as a package, as this would enable easier installation and adoption by the community.
Best wishes,
Muslih.
Hi,
I would like to run Simka on multiple samples with the -max-reads option to deal with varying sequencing depth.
However, the samples also have varying read lengths.
I guess this may slightly bias the results, as longer reads increase the total number of k-mers.
Would it be possible to add an option to trim all reads to a given length?
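Until such an option exists, reads could be pre-trimmed before building the input file; a sketch over 4-line FASTQ records (the fixed length and the drop-shorter-reads policy are my own choices, and lines are assumed to have newlines already stripped):

```python
def trim_fastq(lines, length=100):
    """Trim every 4-line FASTQ record to `length` bases, dropping
    reads that are shorter. Yields the trimmed lines."""
    it = iter(lines)
    for header in it:
        seq, plus, qual = next(it), next(it), next(it)
        if len(seq) >= length:
            yield header
            yield seq[:length]  # trim sequence and quality in lockstep
            yield plus
            yield qual[:length]
```

After trimming, every surviving read contributes the same number of k-mers, removing the length bias.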
Florian
Hello,
I am packaging simka as a Debian package [1].
Cloning and compiling directly from source completes successfully. However, with regard to Debian policy, gatb-core is already available as a Debian package in the repository, and a packaged library takes precedence over the gatb-core bundled with simka. I have therefore made a patch [2] that adapts cmake to use the system gatb-core instead of the copy in thirdparty. This seemingly brings up issues during the linking stage with SimkaAlgorithm.cpp [3].
I would be grateful if this patch could be looked at, or if I could be pointed in the right direction. I am unsure whether gatb-core, or some other cmake file in gatb-core, injects something into simka that makes the build succeed.
Thanks,
Shayan Doust
include/hdf5/H5ACpublic.h
etc.
are installed by hdf5.
Dear SimkaMin Dev,
I recently stumbled upon your SimkaMin tool and tried to use it to compare my 4000 datasets against each other to get information on the similarity of these samples.
I found something odd in the output matrices: they don't seem to be symmetric. While the upper triangle mostly contains values between 0.0 and 1, the lower triangle contains mostly, but not exclusively, zeros. I could understand it if the lower triangle were empty, but a non-symmetric output is strange.
In fact, there always seems to be a subpart that is symmetric, but mostly it is not.
I attached a screenshot of parts of the matrix.
Do you know what to do with this information? Should I only use the column-based distances?
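To quantify how asymmetric a matrix is before deciding which triangle to trust, something like the following could be used (it assumes the same semicolon-separated layout as Simka's other matrices, already parsed into rows; the tolerance is my own choice):

```python
def asymmetric_entries(rows, tol=1e-9):
    """Given a parsed square matrix (first row and first column hold
    sample ids), return the (i, j) id pairs where m[i][j] != m[j][i]."""
    ids = rows[0][1:]
    m = {(r[0], c): float(v) for r in rows[1:] for c, v in zip(ids, r[1:])}
    return [
        (a, b)
        for i, a in enumerate(ids)
        for b in ids[i + 1 :]
        if abs(m[(a, b)] - m[(b, a)]) > tol
    ]
```

Running it over the 4000x4000 matrix would show whether the asymmetry is confined to particular rows or scattered everywhere.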
Best and thanks,
Hans
Hi,
I have to compare a multifasta file (200000 sequences) with a chromosomic region. I have already done this with the kmer-db tool (https://doi.org/10.1093/bioinformatics/bty610) and I need to compare the results of kmer-db with another tool such as simka.
But I can't find the correct command to do this.
kmer-db computes a list of distances between each sequence of the multifasta file and the chromosomic region.
This is my input file, simka_input.txt:
A: multifasta.fasta
B: chr1_region.fasta
The command line I used:
simka -in ./simka_input.txt -out ./simka_results/ -out-tmp ./simka_temp_output -max-memory 128000 -nb-cores 24
In the simka_results directory: zcat mat_abundance_jaccard.csv.gz
;A;B
A;0.000000;0.999993
B;0.999993;0.000000
I get only 2 values, whereas I have 200000 sequences in my input file; I don't understand. It seems that simka concatenates all the sequences of the multifasta file and then compares them with the other file. How can I avoid that?
Thank you
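Simka treats each line of the input file as one dataset, so the whole multifasta is compared as a single sample. A possible workaround, offered only as a sketch, is to split the multifasta so that each sequence becomes its own dataset line (the file naming and output layout are my own choices):

```python
import os

def split_multifasta(path, out_dir):
    """Write each sequence of a multifasta to its own file and return
    simka input lines of the form 'ID: file.fasta'."""
    os.makedirs(out_dir, exist_ok=True)
    input_lines, out = [], None
    with open(path) as fh:
        for line in fh:
            if line.startswith(">"):
                if out:
                    out.close()
                seq_id = line[1:].split()[0]  # id = first word of the header
                fasta = os.path.join(out_dir, f"{seq_id}.fasta")
                input_lines.append(f"{seq_id}: {fasta}")
                out = open(fasta, "w")
            if out:
                out.write(line)
    if out:
        out.close()
    return input_lines
```

Note that with 200000 sequences this produces 200000 input files and simka would compute a 200000 x 200000 all-against-all matrix, which may be impractical; kmer-db's one2all mode may remain the better fit for this one-vs-all use case.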
Hi,
The simple test has been stuck for some hours, and all files in the temporary folder merge_synchro/ are empty. Nothing has changed in my folders for 5 hours.
Here is the error file:
[Counting datasets ] 0 % elapsed: 0 min 0 sec remaining: 0 min 0 sec cpu: -1.0 % mem: [ 12, 12, 12] MB
[Counting datasets ] 20 % elapsed: 0 min 0 sec remaining: 0 min 1 sec cpu: 83.3 % mem: [ 13, 13, 13] MB (repeated many times)
[Counting datasets ] 40 % elapsed: 0 min 1 sec remaining: 0 min 2 sec cpu: 16.0 % mem: [ 13, 13, 13] MB (repeated many times)
[Counting datasets ] 60 % elapsed: 0 min 1 sec remaining: 0 min 1 sec cpu: 16.0 % mem: [ 13, 13, 13] MB (repeated many times)
[Counting datasets ] 80 % elapsed: 0 min 1 sec remaining: 0 min 0 sec cpu: 16.0 % mem: [ 13, 13, 13] MB (repeated many times)
[Counting datasets ] 100 % elapsed: 0 min 1 sec remaining: 0 min 0 sec cpu: 16.0 % mem: [ 13, 13, 13] MB (repeated many times)
[Merging datasets ] 0 % elapsed: 0 min 0 sec remaining: 0 min 0 sec cpu: -1.0 % mem: [ 13, 13, 13] MB
[Merging datasets ] 2.08 % through 93.8 %, each step printed twice in the same format
[Merging datasets ] 95.8 % elapsed: 0 min 0 sec remaining: 0 min 0 sec cpu: 21.7 % mem: [ 13, 13, 13] MB
and the log file:
Creating input
Nb input datasets: 5
Reads per sample used: all
Maximum ressources used by Simka:
- 5 simultaneous processes for counting the kmers (per job: 9 cores, 1000 MB memory)
- 48 simultaneous processes for merging the kmer counts (per job: 1 cores, memory undefined)
Nb partitions: 48 partitions
Counting k-mers... (log files are ./simka_temp_output/simka_output_temp//log/count_*)
Kmer repartition
0: 270
1: 294
2: 300
3: 323
4: 344
5: 324
6: 275
7: 239
8: 296
9: 298
10: 297
11: 304
12: 278
13: 312
14: 311
15: 320
16: 362
17: 320
18: 303
19: 340
20: 293
21: 269
22: 358
23: 293
24: 291
25: 274
26: 344
27: 305
28: 323
29: 308
30: 302
31: 277
32: 300
33: 309
34: 338
35: 278
36: 310
37: 300
38: 284
39: 330
40: 303
41: 280
42: 317
43: 324
44: 286
45: 388
46: 294
47: 282
Merging k-mer counts and computing distances... (log files are /simka_temp_output/simka_output_temp//log/merge_*)
How long should this test take?
Thank you very much in advance,
Best regards
Hi there,
I built an input file listing my reads, but every time I ran simka it failed with the same error, saying it can't read them. The provided example ran successfully. Could someone show me where the problem is? Thanks very much!
Here is the error:
Creating input
Nb input datasets: 1
HDF5-DIAG: Error detected in HDF5 (1.10.5) thread 0:
#000:
/scratch/fwang/simka-v1.5.3-Source/thirdparty/gatb-core/gatb-core/thirdparty/hdf5/src/H5A.c
line 425 in H5Aopen(): unable to load attribute info from object header
for attribute: 'version'
major: Attribute
minor: Can't open object
#1:
/scratch/fwang/simka-v1.5.3-Source/thirdparty/gatb-core/gatb-core/thirdparty/hdf5/src/H5Aint.c
line 433 in H5A__open(): unable to load attribute info from object
header for attribute: 'version'
major: Attribute
minor: Can't open object
#2:
/scratch/fwang/simka-v1.5.3-Source/thirdparty/gatb-core/gatb-core/thirdparty/hdf5/src/H5Oattribute.c
line 515 in H5O__attr_open_by_name(): can't locate attribute: 'version'
major: Attribute
minor: Object not found
HDF5-DIAG: Error detected in HDF5 (1.10.5) thread 0:
#000:
/scratch/fwang/simka-v1.5.3-Source/thirdparty/gatb-core/gatb-core/thirdparty/hdf5/src/H5A.c
line 704 in H5Aget_space(): not an attribute
major: Invalid arguments to routine
minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.5) thread 0:
#000:
/scratch/fwang/simka-v1.5.3-Source/thirdparty/gatb-core/gatb-core/thirdparty/hdf5/src/H5S.c
line 1013 in H5Sget_simple_extent_dims(): not a dataspace
major: Invalid arguments to routine
minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.5) thread 0:
#000:
/scratch/fwang/simka-v1.5.3-Source/thirdparty/gatb-core/gatb-core/thirdparty/hdf5/src/H5A.c
line 662 in H5Aread(): not an attribute
major: Invalid arguments to routine
minor: Inappropriate type
ERROR: Can't open dataset: ID1
Here are the IDs of my reads as input:
ID1: 2018031910_paired_1.fasta ; 2018031910_paired_2.fasta
ID2: 201803193_paired_1.fasta ; 201803193_paired_2.fasta
ID3: 20180319_paired_1.fasta ; 20180319_paired_2.fasta
ID4: 20180403_paired_1.fasta ; 20180403_paired_2.fasta
ID5: 20180405_paired_1.fasta ; 20180405_paired_2.fasta
ID6: 20180410_paired_1.fasta ; 20180410_paired_2.fasta
ID7: 2018041210_paired_1.fasta ; 2018041210_paired_2.fasta
ID8: 201804123_paired_1.fasta ; 201804123_paired_2.fasta
ID9: 20180412_paired_1.fasta ; 20180412_paired_2.fasta
ID10: 2018041710_paired_1.fasta ; 2018041710_paired_2.fasta
ID11: 201804173_paired_1.fasta ; 201804173_paired_2.fasta
ID12: 20180417_paired_1.fasta ; 20180417_paired_2.fasta
ID13: 20180419_paired_1.fasta ; 20180419_paired_2.fasta
ID14: 20180424_paired_1.fasta ; 20180424_paired_2.fasta
ID15: 2018042610_paired_1.fasta ; 2018042610_paired_2.fasta
ID16: 201804263_paired_1.fasta ; 201804263_paired_2.fasta
ID17: 20180426_paired_1.fasta ; 20180426_paired_2.fasta
ID18: 20180502_paired_1.fasta ; 20180502_paired_2.fasta
ID19: 20180503_paired_1.fasta ; 20180503_paired_2.fasta
ID20: 2018050810_paired_1.fasta ; 2018050810_paired_2.fasta
ID21: 201805083_paired_1.fasta ; 201805083_paired_2.fasta
ID22: 20180508_paired_1.fasta ; 20180508_paired_2.fasta
ID23: 2018051110_paired_1.fasta ; 2018051110_paired_2.fasta
ID24: 201805113_paired_1.fasta ; 201805113_paired_2.fasta
ID25: 20180511_paired_1.fasta ; 20180511_paired_2.fasta
ID26: 20180515_paired_1.fasta ; 20180515_paired_2.fasta
ID27: 20180517_paired_1.fasta ; 20180517_paired_2.fasta
ID28: 2018052210_paired_1.fasta ; 2018052210_paired_2.fasta
ID29: 201805223_paired_1.fasta ; 201805223_paired_2.fasta
ID30: 20180522_paired_1.fasta ; 20180522_paired_2.fasta
ID31: 20180524_paired_1.fasta ; 20180524_paired_2.fasta
ID32: 2018052910_paired_1.fasta ; 2018052910_paired_2.fasta
ID33: 201805293_paired_1.fasta ; 201805293_paired_2.fasta
ID34: 20180529_paired_1.fasta ; 20180529_paired_2.fasta
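Since the bundled example runs fine, one thing worth ruling out is the input listing itself (missing, unreadable, or empty files), which can surface later as opaque HDF5 errors like the one above. A minimal pre-flight check for the `ID: reads_1.fasta ; reads_2.fasta` format shown above might look like this (a sketch, not part of simka; the helper name is made up):

```python
import os

def check_simka_input(list_path):
    """Parse a simka input file ('ID: reads_1.fasta ; reads_2.fasta' per line)
    and report entries whose files are malformed, missing, or empty."""
    problems = []
    with open(list_path) as fh:
        for lineno, line in enumerate(fh, 1):
            line = line.strip()
            if not line:
                continue
            try:
                sample_id, files = line.split(":", 1)
            except ValueError:
                problems.append((lineno, line, "malformed line"))
                continue
            for path in (p.strip() for p in files.split(";")):
                if not os.path.isfile(path):
                    problems.append((lineno, sample_id.strip(), "missing: " + path))
                elif os.path.getsize(path) == 0:
                    problems.append((lineno, sample_id.strip(), "empty: " + path))
    return problems
```

Running this before simka, from the same working directory, also catches the common case where the listed paths are relative to a different directory than the one simka is launched from.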
Hi, thanks for this useful tool! I'm running into an error with a paired reads file, organized as:
1_1b28d4: 1_1b28d4-t_1.fq.gz ; 1_1b28d4-t_2.fq.gz
1_89e808: 1_89e808-t_1.fq.gz ; 1_89e808-t_2.fq.gz
...
I would expect for my output matrices to have one row for each sample, like:
| | 1_1b28d4 | 1_89e808 |
|---|---|---|
| 1_1b28d4 | 0 | 0.774246 |
| 1_89e808 | 0.774246 | 0 |
However, instead my table has duplicate rows for each of the paired ends, which cannot be distinguished:
| | 1_1b28d4 | 1_89e808 |
|---|---|---|
| 1_1b28d4 | 0 | 0.774246 |
| 1_89e808 | 0.774246 | 0 |
| 1_1b28d4 | 0 | 0.774246 |
| 1_89e808 | 0.774246 | 0 |
Is there a way to work around this?
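As a post-hoc workaround (assuming the two duplicated rows per sample really carry identical values, as in the table above), the repeated rows can be dropped from the semicolon-separated matrix by label, keeping the first occurrence. A sketch; the function name is made up:

```python
import csv

def dedupe_matrix_rows(in_path, out_path):
    """Drop repeated rows (keyed on the sample label in the first column)
    from a semicolon-separated simka distance matrix, keeping the first
    occurrence of each label."""
    seen = set()
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=";")
        writer = csv.writer(dst, delimiter=";")
        writer.writerow(next(reader))  # copy the header row unchanged
        for row in reader:
            if row and row[0] not in seen:
                seen.add(row[0])
                writer.writerow(row)
```

Note this only trims the output; it does not address why simka emitted the duplicates in the first place, so it is worth verifying that the duplicated rows are byte-identical before discarding them.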
Hi,
I am not sure how to run many samples on different compute nodes at the same time and then pool those results to compute the distance matrix. Any idea how?
Thanks,
Jianshu
Hi,
Simka crashes with a segfault when using an empty file.
$touch sample.fastq
$echo 'sample: sample.fastq' > simka_in.txt
$/usr/local/bin/simka -in simka_in.txt -out-tmp /tmp
Creating input
Nb input datasets: 1
Reads per sample used: all
Maximum ressources used by Simka:
- 1 simultaneous processes for counting the kmers (per job: 16 cores, 5000 MB memory)
- 16 simultaneous processes for merging the kmer counts (per job: 1 cores, memory undefined)
Segmentation fault
An explicit error message would be welcome.
Thanks,
Florian
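Until an explicit error lands upstream, a small wrapper can stand in for the missing message by refusing to launch simka when a listed read file is empty. This is a sketch: only the `-in`/`-out-tmp` flags from the command shown above are used, and the function names are made up.

```python
import os
import subprocess
import sys

def empty_input_files(input_list):
    """Return the read-file paths in a simka input file ('ID: f1 ; f2' per
    line) that exist on disk but have size zero."""
    empties = []
    with open(input_list) as fh:
        for line in fh:
            line = line.strip()
            if not line or ":" not in line:
                continue
            _, files = line.split(":", 1)
            for path in (p.strip() for p in files.split(";")):
                if os.path.isfile(path) and os.path.getsize(path) == 0:
                    empties.append(path)
    return empties

def run_simka(simka_bin, input_list, out_tmp):
    """Launch simka only after the empty-file check passes, to avoid the
    segfault reported above."""
    empties = empty_input_files(input_list)
    if empties:
        sys.exit("refusing to run simka, empty input file(s): " + ", ".join(empties))
    subprocess.run([simka_bin, "-in", input_list, "-out-tmp", out_tmp], check=True)
```

For example, `run_simka("/usr/local/bin/simka", "simka_in.txt", "/tmp")` would exit with a readable message for the `touch sample.fastq` case above instead of segfaulting.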
[0.0%] Computing k-mer spectrums [Time: 0:00:00]A job failed (simka_test_tmp/simka_database/kmer_spectrums/A/), exiting simka
Greetings,
running the test scripts of simka while stabilizing the upcoming Debian 11, the CI team noted that the program hangs under certain circumstances. Further investigation on my end seemed to reveal that the test hangs past 9 cores, so as a workaround we are capping the test suite at 8 cores for the moment. You can refer to Debian bug #986256 for more details.
Do you think this would be an issue within simka, or more something intrinsic to the test data topology?
Kind Regards,
Étienne.