czbiohub-sf / midas2
Metagenomic Intra-Species Diversity Analysis 2
License: MIT License
Hi! I downloaded the database for one selected species with the command midas2 database --download --midasdb_name gtdb --midasdb_dir my_midasdb_Catellicoccus --species_list species_list.txt
(only GTDB has the species I want, and species_list.txt contains a single species ID).
After that, I wanted to run the SPECIES workflow, but it keeps printing 'Processing 64000 queries':
1654588151.2: Single sample abundant species profiling in subcommand run_species with args
1654588151.2: {
1654588151.2: "subcommand": "run_species",
1654588151.2: "force": false,
1654588151.2: "debug": true,
1654588151.2: "zzz_worker_mode": false,
1654588151.2: "batch_branch": "master",
1654588151.2: "batch_memory": 378880,
1654588151.2: "batch_vcpus": 48,
1654588151.2: "batch_queue": "pairani",
1654588151.2: "batch_ecr_image": "pairani:latest",
1654588151.2: "midas_outdir": "midas2_output",
1654588151.2: "sample_name": "DYG1",
1654588151.2: "r1": "reads/DYG1.decon_1.fastq.gz",
1654588151.2: "r2": "reads/DYG1.decon_2.fastq.gz",
1654588151.2: "midasdb_name": "gtdb",
1654588151.2: "midasdb_dir": "my_midasdb_Catellicoccus",
1654588151.2: "word_size": 28,
1654588151.2: "aln_mapid": null,
1654588151.2: "aln_cov": 0.75,
1654588151.2: "marker_reads": 2,
1654588151.2: "marker_covered": 2,
1654588151.2: "max_reads": null,
1654588151.2: "num_cores": 8
1654588151.2: }
1654588151.2: Create OUTPUT directory for DYG1.
1654588151.2: 'rm -rf midas2_output/DYG1/species'
1654588151.2: 'mkdir -p midas2_output/DYG1/species'
1654588151.2: Create TEMP directory for DYG1.
1654588151.2: 'rm -rf midas2_output/DYG1/temp/species'
1654588151.2: 'mkdir -p midas2_output/DYG1/temp/species'
1654588151.3: MIDAS2::fetch_midasdb_files::start
1654588161.3: MIDAS2::fetch_midasdb_files::finish
1654588161.5: MIDAS2::map_reads_hsblastn::start
[HS-BLASTN] Loading database.
Loading /media/atm3/user02/MIDAS2.0/sample/my_midasdb_Catellicoccus/markers/phyeco/phyeco.fa.sequence, size = 0.4GB
Loading /media/atm3/user02/MIDAS2.0/sample/my_midasdb_Catellicoccus/markers/phyeco/phyeco.fa.bwt, size = 0.8GB
Loading /media/atm3/user02/MIDAS2.0/sample/my_midasdb_Catellicoccus/markers/phyeco/phyeco.fa.sa, size = 0.8GB
[HS-BLASTN] done. Time elapsed: 1.70 secs.
[HS-BLASTN] Processing /dev/stdin.
Processing 64000 queries.
Processing 64000 queries.
Processing 64000 queries.
[... the same line repeats many more times ...]
Hi! I am trying to download the UHGG database for all species and I am receiving an error. I already ran
midas2 database --init --midasdb_name uhgg --midasdb_dir /wynton/protected/scratch/clairedubin/midasdb_uhgg
with no errors.
midas2 database --download --midasdb_name uhgg --midasdb_dir /wynton/protected/scratch/clairedubin/midasdb_uhgg --species all
1689099642.8: Downloading MIDAS database for sliced species 3 with 12 cores in total::start
1689099642.8: Downloading MIDAS database for sliced species 10 with 12 cores in total::start
1689099643.0: Downloading MIDAS database for sliced species 4 with 12 cores in total::start
1689099643.1: Downloading MIDAS database for sliced species 2 with 12 cores in total::start
1689099643.2: Downloading MIDAS database for sliced species 1 with 12 cores in total::start
1689099643.5: Downloading MIDAS database for sliced species 11 with 12 cores in total::start
1689099643.5: Downloading MIDAS database for sliced species 8 with 12 cores in total::start
1689099643.6: Downloading MIDAS database for sliced species 0 with 12 cores in total::start
1689099643.6: Downloading MIDAS database for sliced species 7 with 12 cores in total::start
1689099643.7: Downloading MIDAS database for sliced species 9 with 12 cores in total::start
1689099643.7: Downloading MIDAS database for sliced species 5 with 12 cores in total::start
1689099643.7: Downloading MIDAS database for sliced species 6 with 12 cores in total::start
Traceback (most recent call last):
File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/__main__.py", line 28, in <module>
main()
File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/__main__.py", line 24, in main
return subcommand_main(subcommand_args)
File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/database.py", line 148, in main
download_midasdb(args)
File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/database.py", line 37, in download_midasdb
download_midasdb_worker(args)
File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/database.py", line 91, in download_midasdb_worker
midasdb.fetch_files("pangenome", species_id_list)
File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/models/midasdb.py", line 167, in fetch_files
return self.fetch_tarball(filename, list_of_species)
File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/models/midasdb.py", line 192, in fetch_tarball
md5_fetched = file_md5sum(_fetched_file)
File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/models/midasdb.py", line 341, in file_md5sum
return md5(open(local_file, "rb").read()).hexdigest()
FileNotFoundError: [Errno 2] No such file or directory: '/wynton/protected/scratch/clairedubin/midasdb_uhgg/pangenomes_filtered/100007/centroids.ffn'
Here is the output of ls /wynton/protected/scratch/clairedubin/midasdb_uhgg/:
chunks
genomes.tsv
markers_models
metadata.tsv
gene_annotations
markers
md5sum.json
pangenomes
So there is no pangenomes_filtered directory, but there is a pangenomes directory. I didn't have this error with an older version of MIDAS2, so I'm wondering whether a recent update has an issue with directory creation or naming. I also receive the same error when attempting to download select species instead of all species.
Thanks for your remarkable work. I use MIDAS2 to analyze CNVs, but I found that CNVs cannot be detected in all of my samples. So I want to use my own data, after binning, as the reference database; is that feasible for CNV analysis?
During multiprocessing_map(), an exception raised in a child process is not surfaced until all processes have finished. I set up four ValueErrors in the branch 2021-01-13-exception to reproduce this issue.
Ideally, we want to terminate the multiprocessing pool as soon as any worker process raises an exception.
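A minimal sketch of the desired fail-fast behavior (not MIDAS2's actual utils code; `fail_fast_map` is a hypothetical name, shown with ThreadPool so the example is self-contained, though multiprocessing.Pool exposes the same interface): `imap_unordered` re-raises a worker's exception as soon as that task's result is consumed, so the parent can terminate the pool immediately instead of waiting for every task.

```python
from multiprocessing.pool import ThreadPool  # multiprocessing.Pool has the same interface

def fail_fast_map(func, items, num_workers):
    """Like pool.map, but terminate the pool on the first worker exception.

    imap_unordered re-raises a worker's exception as soon as that task's
    result is consumed, rather than after all tasks have finished.
    """
    pool = ThreadPool(num_workers)
    try:
        results = list(pool.imap_unordered(func, items, chunksize=1))
        pool.close()
        return results
    except Exception:
        pool.terminate()  # discard queued tasks and stop workers immediately
        raise
    finally:
        pool.join()
```

The same pattern would drop into _multi_map, which currently uses p.map and therefore only sees the exception after the whole map completes.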
In the pan-genome workflow, we need to figure out where gene_info.txt is used. For species with a large number of genomes (e.g. species 102506, with 8288 genomes in total), there seems to be no usage of gene_info.txt (covering all genes) in the gene workflow of MIDAS. For future database builds, we could generate gene_info.txt for only the centroids.
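If we go that route, the filtering itself is trivial. A sketch under the assumption that gene_info.txt is a TSV with (at least) gene_id and centroid_99 columns, and that a gene is a 99% centroid exactly when gene_id == centroid_99 (the column names here are an assumption, not checked against the real schema):

```python
import csv

def centroids_only(gene_info_path, out_path):
    """Write a copy of gene_info.txt containing only rows whose gene is
    itself a 99% centroid.

    Assumes a tab-separated file with 'gene_id' and 'centroid_99' columns;
    adjust the names to match the real schema.
    """
    with open(gene_info_path) as fin, open(out_path, "w", newline="") as fout:
        reader = csv.DictReader(fin, delimiter="\t")
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames, delimiter="\t")
        writer.writeheader()
        for row in reader:
            if row["gene_id"] == row["centroid_99"]:
                writer.writerow(row)
```

For a species like 102506 this would shrink the file roughly by the ratio of total genes to 99% centroids.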
Let cX and cY be 99% clusters with centroids X and Y, respectively. Normally X is an element of cX and does not belong to any other 99% clusters. In some rare degenerate cases, X is also a member of cY.
Subsequent coarser reclustering at 95, 90, ... ANI for the elements of cX would then produce incorrect results. We need to modify the reclustering assignments to handle this case correctly.
There is a hypothesis that this case occurs primarily when contig IDs clash between genomes, so ongoing work to rename contigs during import so as to prevent such clashes could possibly address this problem. The hypothesis is just a wild guess at this point.
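For concreteness, a sketch of how the degenerate case could be detected before reclustering, assuming the 99% clusters are held as a hypothetical dict mapping each centroid ID to the set of its member sequence IDs:

```python
def find_degenerate_centroids(clusters_99):
    """Return centroids that also appear as members of some *other* 99% cluster.

    clusters_99: dict mapping centroid ID -> set of member sequence IDs.
    Normally each centroid belongs only to its own cluster; anything
    returned here needs its reclustering assignment repaired.
    """
    degenerate = set()
    for centroid in clusters_99:
        for other, other_members in clusters_99.items():
            if other != centroid and centroid in other_members:
                degenerate.add(centroid)
                break  # one foreign membership is enough to flag it
    return degenerate
```

In the example from the text, X belongs to both cX and cY, so X would be flagged. (This is quadratic in the number of clusters; an inverted member-to-cluster index would scale better.)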
I was able to install MIDAS2 using the "From Source" instructions only after modifying the midas2.yml file to include your anaconda channel "zhaoc1":
https://midas2.readthedocs.io/en/latest/installation.html#from-source
Without the zhaoc1 channel, I get the following error because the midas2 conda package is only available from zhaoc1:
Solving environment: failed
ResolvePackageNotFound:
- midas2=1.0.9
Additionally, the install provided under "Quickstart" didn't work for me. This was related to a Python version issue (Python 3.7.9 was installed, but Python 3.9 was required for something). This may have something to do with the difference between v1.0.0 (which is indicated in those instructions) and v1.0.9.
https://midas2.readthedocs.io/en/latest/quickstart.html#install-midas2
Finally, the installation instructions under "Conda" didn't work for me in practice, as the "solving environment" step hung for several hours before I killed it.
https://midas2.readthedocs.io/en/latest/installation.html#conda
I look forward to using the tool.
Mike
This should be easy with the recently released update to aegea. It involves removing the magic numbers 838 and 1715518 from aws_batch_init.
The snps workflow needs metadata for the representative genome, e.g. genome_length, genome_name (if we have it), contig_counts, etc. It would be better to have this information available when we import the genomes into iggtools.
In the future, perhaps we can group together the MIDAS shared analysis code in analysis/midas.py, with a view to separating code paths used for analysis from those used for DB construction --- because the analysis paths would likely run outside of AWS at some point. [from pull request #30]
Hi,
From your command line history midas2_output/LRDYA/snps/145629.snps.tsv.lz4, can you make sure the sample name provided to the merge_snps command via --samples_list list_of_samples.tsv is LRDYA?
Thanks,
Chunyu
Originally posted by @zhaoc1 in #88 (comment)
I referenced this issue, but this kind of bug still exists.
command:
midas2 merge_snps --samples_list list_of_samples.tsv --midasdb_name uhgg --midasdb_dir ~/database/midas2/ midas2/ --debug
issue:
1661742862.8: Across samples population SNV calling in subcommand merge_snps with args
1661742862.8: {
1661742862.8: "subcommand": "merge_snps",
1661742862.8: "force": false,
1661742862.8: "debug": true,
1661742862.8: "zzz_worker_mode": false,
1661742862.8: "batch_branch": "master",
1661742862.8: "batch_memory": 378880,
1661742862.8: "batch_vcpus": 48,
1661742862.8: "batch_queue": "pairani",
1661742862.8: "batch_ecr_image": "pairani:latest",
1661742862.8: "midas_outdir": "midas2/",
1661742862.8: "samples_list": "list_of_samples.tsv",
1661742862.8: "midasdb_name": "uhgg",
1661742862.8: "midasdb_dir": "/home/lbl/database/midas2/",
1661742862.8: "species_list": null,
1661742862.8: "genome_depth": 5.0,
1661742862.8: "genome_coverage": 0.4,
1661742862.8: "sample_counts": 2,
1661742862.8: "site_depth": 5,
1661742862.8: "site_ratio": 3.0,
1661742862.8: "site_prev": 0.9,
1661742862.8: "snv_type": "common",
1661742862.8: "snp_pooled_method": "prevalence",
1661742862.8: "snp_maf": 0.05,
1661742862.8: "snp_type": "bi, tri, quad",
1661742862.8: "locus_type": "any",
1661742862.8: "num_cores": 16,
1661742862.8: "chunk_size": 1000000,
1661742862.8: "advanced": false,
1661742862.8: "robust_chunk": false
1661742862.8: }
1661742863.7: 248 species pass the filter
1661742863.7: Create OUTPUT directory.
1661742863.7: 'rm -rf midas2/snps'
1661742863.7: 'mkdir -p midas2/snps'
1661742863.7: Create TEMP directory.
1661742863.7: 'rm -rf midas2/temp/snps'
1661742863.7: 'mkdir -p midas2/temp/snps'
1661742870.0: MIDAS2::write_species_summary::start
1661742870.0: MIDAS2::write_species_summary::finish
1661742870.6: MIDAS2::design_chunks::start
Traceback (most recent call last):
File "/home/lbl/miniconda3/envs/midas2.0/bin/midas2", line 8, in <module>
sys.exit(main())
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/__main__.py", line 24, in main
return subcommand_main(subcommand_args)
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 664, in main
merge_snps(args)
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 658, in merge_snps
raise error
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 639, in merge_snps
arguments_list = design_chunks(species_ids_of_interest, midas_db)
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 220, in design_chunks
all_site_chunks = multithreading_map(design_chunks_per_species, [(sp, midas_db) for sp in dict_of_species.values()], num_cores) #<---
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/common/utils.py", line 540, in multithreading_map
return _multi_map(func, items, num_threads, ThreadPool)
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/common/utils.py", line 520, in _multi_map
return p.map(func, items, chunksize=1)
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/multiprocessing/pool.py", line 268, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 205, in design_chunks_per_species
return sp.compute_snps_chunks(midas_db, chunk_size, "merge")
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/models/species.py", line 84, in compute_snps_chunks
chunks_of_sites = load_chunks_cache(local_file)
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/models/species.py", line 181, in load_chunks_cache
chunks_dict = json.load(stream)
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/json/__init__.py", line 296, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
How can I fix it?
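Not an official fix, but `Expecting value: line 1 column 1 (char 0)` from load_chunks_cache typically means a cached chunk-design JSON file is empty or truncated, e.g. left behind under the database's temp/ directory by an earlier interrupted run. A hedged sketch (the cache layout and .json suffix are assumptions) that deletes unparseable cache files so MIDAS2 regenerates them on the next run:

```python
import json
import os

def purge_bad_chunk_caches(cache_root):
    """Delete cached JSON files that fail to parse (e.g. truncated by an
    interrupted earlier run), so the caller can regenerate them.

    cache_root: hypothetical path such as <midasdb_dir>/temp/ containing
    per-species chunk cache files; returns the paths that were removed.
    """
    removed = []
    for dirpath, _dirs, files in os.walk(cache_root):
        for name in files:
            if not name.endswith(".json"):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path) as fh:
                    json.load(fh)  # parse check only; content is discarded
            except (json.JSONDecodeError, OSError):
                os.remove(path)
                removed.append(path)
    return removed
```

Running merge_snps again after purging the bad cache files should then rebuild the chunk design from scratch.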
Hello, I downloaded a database for the selected species '145629' and finished the run_snps step with the following command:
midas2 run_snps --sample_name ${line} -1 reads/${line}.decon_1.fastq.gz -2 reads/${line}.decon_2.fastq.gz --midasdb_name gtdb --midasdb_dir my_midasdb_Catellicoccus --species_list 145629 --select_threshold=-1 --num_cores 10 --advanced --ignore_ambiguous midas2_output
But something goes wrong with merge_snps:
midas2 merge_snps --samples_list list_of_samples.tsv --midasdb_name gtdb --midasdb_dir my_midasdb_Catellicoccus --genome_coverage 0.7 --num_cores 10 midas2_output/merge
1654743241.2: Across samples population SNV calling in subcommand merge_snps with args
1654743241.2: {
1654743241.2: "subcommand": "merge_snps",
1654743241.2: "force": false,
1654743241.2: "debug": false,
1654743241.2: "zzz_worker_mode": false,
1654743241.2: "batch_branch": "master",
1654743241.2: "batch_memory": 378880,
1654743241.2: "batch_vcpus": 48,
1654743241.2: "batch_queue": "pairani",
1654743241.2: "batch_ecr_image": "pairani:latest",
1654743241.2: "midas_outdir": "midas2_output/merge",
1654743241.2: "samples_list": "list_of_samples.tsv",
1654743241.2: "midasdb_name": "gtdb",
1654743241.2: "midasdb_dir": "my_midasdb_Catellicoccus",
1654743241.2: "species_list": null,
1654743241.2: "genome_depth": 5.0,
1654743241.2: "genome_coverage": 0.7,
1654743241.2: "sample_counts": 2,
1654743241.2: "site_depth": 5,
1654743241.2: "site_ratio": 3.0,
1654743241.2: "site_prev": 0.9,
1654743241.2: "snv_type": "common",
1654743241.2: "snp_pooled_method": "prevalence",
1654743241.2: "snp_maf": 0.05,
1654743241.2: "snp_type": "bi, tri, quad",
1654743241.2: "locus_type": "any",
1654743241.2: "num_cores": 10,
1654743241.2: "chunk_size": 1000000,
1654743241.2: "advanced": false,
1654743241.2: "robust_chunk": false
1654743241.2: }
1654743241.6: 1 species pass the filter
1654743241.6: Create OUTPUT directory.
1654743241.6: 'rm -rf midas2_output/merge/snps'
1654743241.6: 'mkdir -p midas2_output/merge/snps'
1654743241.6: Create TEMP directory.
1654743241.6: 'rm -rf midas2_output/merge/temp/snps'
1654743241.6: 'mkdir -p midas2_output/merge/temp/snps'
1654743241.7: MIDAS2::write_species_summary::start
1654743241.7: MIDAS2::write_species_summary::finish
1654743242.7: MIDAS2::design_chunks::start
1654743242.7: ================= Total number of compute chunks: 2
1654743242.7: MIDAS2::design_chunks::finish
1654743242.7: MIDAS2::multiprocessing_map::start
1654743242.8: MIDAS2::process::145629-0::start snps_worker
1654743242.8: MIDAS2::chunk_worker::145629-0::start accumulate_samples
1654743242.8: MIDAS2::process::145629-1::start snps_worker
1654743242.8: MIDAS2::process::145629--1::wait collect_chunks
1654743242.8: MIDAS2::chunk_worker::145629-1::start accumulate_samples
1654743242.8: WARNING: Non-zero exit code 141 from reader of midas2_output/LRDYA/snps/145629.snps.tsv.lz4.
1654743243.1: WARNING: Non-zero exit code 141 from reader of midas2_output/LRDYA/snps/145629.snps.tsv.lz4.
1654743243.1: MIDAS2::process::145629--1::start collect_chunks
cat: midas2_output/merge/temp/snps/145629/cid.0_snps_info.tsv.lz4: No such file or directory
cat: midas2_output/merge/temp/snps/145629/cid.1_snps_info.tsv.lz4: No such file or directory
1654743243.1: Bugs in the codes, keep the outputs for debugging purpose.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 270, in process
snps_worker(species_id, chunk_id)
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 293, in snps_worker
chunk_worker(chunks_of_sites[chunk_id][0])
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 345, in chunk_worker
accumulate(accumulator, proc_args)
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 378, in accumulate
for row in select_from_tsv(stream, schema=curr_schema, selected_columns=snps_pileup_basic_schema, result_structure=dict):
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/common/utils.py", line 392, in select_from_tsv
assert False, f"Line {i + j} has {len(values)} columns; was expecting {len(headers)}."
AssertionError: Line 0 has 13 columns; was expecting 8.
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/media/home/user02/miniconda3/envs/midas2.0/bin/midas2", line 8, in <module>
sys.exit(main())
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/__main__.py", line 24, in main
return subcommand_main(subcommand_args)
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 664, in main
merge_snps(args)
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 653, in merge_snps
raise error
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 643, in merge_snps
proc_flags = multiprocessing_map(process, arguments_list, args.num_cores)
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/common/utils.py", line 532, in multiprocessing_map
return _multi_map(func, items, num_procs, multiprocessing.Pool)
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/common/utils.py", line 520, in _multi_map
return p.map(func, items, chunksize=1)
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/multiprocessing/pool.py", line 268, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/media/home/user02/miniconda3/envs/midas2.0/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
AssertionError: Line 0 has 13 columns; was expecting 8.
Thank you so much
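One possible explanation (an assumption, not confirmed): run_snps was invoked with --advanced, which writes a wider per-site pileup, while merge_snps ran with "advanced": false above and therefore expects the 8-column basic schema. A toy reproduction of the assertion in select_from_tsv (the column counts are taken from the error message; the real schemas live inside the midas2 package):

```python
import io

BASIC_COLS = 8      # width merge_snps expected here
ADVANCED_COLS = 13  # width actually found in the pileup

def check_pileup_width(stream, expected_cols):
    """Mimic select_from_tsv's check: every row must match the schema width."""
    for i, line in enumerate(stream):
        values = line.rstrip("\n").split("\t")
        assert len(values) == expected_cols, (
            f"Line {i} has {len(values)} columns; was expecting {expected_cols}."
        )

# A 13-column row read against an 8-column schema trips the assertion:
advanced_row = "\t".join(["x"] * ADVANCED_COLS) + "\n"
try:
    check_pileup_width(io.StringIO(advanced_row), BASIC_COLS)
except AssertionError as e:
    print(e)  # Line 0 has 13 columns; was expecting 8.
```

If that is indeed the cause, making the two steps agree (running merge_snps with --advanced, or run_snps without it) should remove the mismatch.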
and test it over a small set of species.
From the MIDAS2 paper:
"with database customization and Bowtie2 alignment taking up to 75% of run time"
Given the apparent computational bottleneck of Bowtie2, are there any plans to add bwa-meme as an alternative aligner?
Hello,
We are running MIDAS2 on our computer cluster, where I was given temporary rights to install the database in a shared location, for many users to enjoy your tool. Then, the admins removed my rights to write to the install location (which also relieves my storage quota ;) ).
The problem is that when running the merge command, MIDAS2 is trying to write chunks at this database location, where a user may not have write access rights.
See below the stderr for the job affected by this (and the command):
midas2 merge_snps \
--num_cores 12 \
--midasdb_name gtdb \
--midasdb_dir /cluster/shared/databases/MIDSA2/latest/gtdb \
--genome_depth 5.0 \
--sample_counts 2 \
--site_depth 2 \
--site_ratio 3.0 \
--site_prev 0.9 \
--snv_type common \
--snp_pooled_method prevalence \
--snp_maf 0.1 \
--snp_type {bi,tri,quad} \
--locus_type any \
--force \
--samples_list \
${SCRATCH_FOLDER}/cluster/projects/nn8075k/federica/outputs/midas2_merge/after_midas2_78f86426289cf46f1cc5/illumina/gtdb/all_species/sample_list.txt \
${SCRATCH_FOLDER}/cluster/projects/nn8075k/federica/outputs/midas2_merge/after_midas2_78f86426289cf46f1cc5/illumina/gtdb/all_species
$ cat outputs/midas2_merge/after_midas2_78f86426289cf46f1cc5/jobs/output/slurm-midas2_merge.fdprf.mds2_78f86426289cf46f1cc5._illumina-midas2_snps_7425188.e
The following modules were not unloaded:
(Use "module --force purge" to unload all):
1) StdEnv
1673959603.1: Across samples population SNV calling in subcommand merge_snps with args
1673959603.1: {
1673959603.1: "subcommand": "merge_snps",
1673959603.1: "force": true,
1673959603.1: "debug": false,
1673959603.1: "zzz_worker_mode": false,
1673959603.1: "batch_branch": "master",
1673959603.1: "batch_memory": 378880,
1673959603.1: "batch_vcpus": 48,
1673959603.1: "batch_queue": "pairani",
1673959603.1: "batch_ecr_image": "pairani:latest",
1673959603.1: "midas_outdir": "/cluster/work/jobs/7425188/cluster/projects/nn8075k/federica/outputs/midas2_merge/after_midas2_78f86426289cf46f1cc5/illumina/gtdb/all_species",
1673959603.1: "samples_list": "/cluster/work/jobs/7425188/cluster/projects/nn8075k/federica/outputs/midas2_merge/after_midas2_78f86426289cf46f1cc5/illumina/gtdb/all_species/sample_list.txt",
1673959603.1: "midasdb_name": "gtdb",
1673959603.1: "midasdb_dir": "/cluster/shared/databases/MIDSA2/latest/gtdb",
1673959603.1: "species_list": null,
1673959603.1: "genome_depth": 5.0,
1673959603.1: "genome_coverage": 0.4,
1673959603.1: "sample_counts": 2,
1673959603.1: "site_depth": 2,
1673959603.1: "site_ratio": 3.0,
1673959603.1: "site_prev": 0.9,
1673959603.1: "snv_type": "common",
1673959603.1: "snp_pooled_method": "prevalence",
1673959603.1: "snp_maf": 0.1,
1673959603.1: "snp_type": [
1673959603.1: "bi",
1673959603.1: "tri",
1673959603.1: "quad"
1673959603.1: ],
1673959603.1: "locus_type": [
1673959603.1: "any"
1673959603.1: ],
1673959603.1: "num_cores": 12,
1673959603.1: "chunk_size": 1000000,
1673959603.1: "advanced": false,
1673959603.1: "robust_chunk": false
1673959603.1: }
1673959603.5: 98 species pass the filter
1673959603.5: Create OUTPUT directory.
1673959603.5: 'rm -rf /cluster/work/jobs/7425188/cluster/projects/nn8075k/federica/outputs/midas2_merge/after_midas2_78f86426289cf46f1cc5/illumina/gtdb/all_species/snps'
1673959603.5: 'mkdir -p /cluster/work/jobs/7425188/cluster/projects/nn8075k/federica/outputs/midas2_merge/after_midas2_78f86426289cf46f1cc5/illumina/gtdb/all_species/snps'
1673959603.5: Create TEMP directory.
1673959603.5: 'rm -rf /cluster/work/jobs/7425188/cluster/projects/nn8075k/federica/outputs/midas2_merge/after_midas2_78f86426289cf46f1cc5/illumina/gtdb/all_species/temp/snps'
1673959603.5: 'mkdir -p /cluster/work/jobs/7425188/cluster/projects/nn8075k/federica/outputs/midas2_merge/after_midas2_78f86426289cf46f1cc5/illumina/gtdb/all_species/temp/snps'
1673959606.4: MIDAS2::write_species_summary::start
1673959606.4: MIDAS2::write_species_summary::finish
1673959607.9: MIDAS2::design_chunks::start
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/120476’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/110537’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/144385’: Permission denied
[... the same "Permission denied" mkdir error repeats for dozens more species IDs ...]
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/103143’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/140084’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/106335’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/144859’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/129716’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/110859’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/143588’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/131067’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/138633’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/121261’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/127200’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/139834’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/130720’: Permission denied
mkdir: cannot create directory ‘/cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/140619’: Permission denied
1673959648.2: Deleting untrustworthy outputs due to error. Specify --debug flag to keep.
Traceback (most recent call last):
File "/cluster/projects/nn8075k/conda_envs/midas2/bin/midas2", line 10, in <module>
sys.exit(main())
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/site-packages/midas2/__main__.py", line 24, in main
return subcommand_main(subcommand_args)
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 664, in main
merge_snps(args)
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 658, in merge_snps
raise error
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 639, in merge_snps
arguments_list = design_chunks(species_ids_of_interest, midas_db)
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 220, in design_chunks
all_site_chunks = multithreading_map(design_chunks_per_species, [(sp, midas_db) for sp in dict_of_species.values()], num_cores) #<---
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/site-packages/midas2/common/utils.py", line 540, in multithreading_map
return _multi_map(func, items, num_threads, ThreadPool)
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/site-packages/midas2/common/utils.py", line 520, in _multi_map
return p.map(func, items, chunksize=1)
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/multiprocessing/pool.py", line 268, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 205, in design_chunks_per_species
return sp.compute_snps_chunks(midas_db, chunk_size, "merge")
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/site-packages/midas2/models/species.py", line 103, in compute_snps_chunks
command(f"mkdir -p {os.path.dirname(loc_fp)}")
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/site-packages/midas2/common/utils.py", line 246, in command
return subprocess.run(cmd, shell=shell, **subproc_args)
File "/cluster/projects/nn8075k/conda_envs/midas2/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'mkdir -p /cluster/shared/databases/MIDSA2/latest/gtdb/temp/chunksize.1000000/120476' returned non-zero exit status 1.
Is there a way to tell MIDAS2 to write elsewhere, maybe in a $TMPDIR location?
Thanks! F
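I don't know of an official flag for this, but one workaround sketch is to mirror the read-only database into a writable location (e.g. under $TMPDIR) using symlinks, then point --midasdb_dir at the mirror so MIDAS2 can create its temp/chunksize directories there. The helper name and all paths below are illustrative, not part of MIDAS2:

```shell
# Workaround sketch (not an official MIDAS2 feature): mirror a read-only
# MIDAS-DB into a writable directory via symlinks, with a locally
# writable temp/. mirror_midasdb is an illustrative helper name.
mirror_midasdb() {
  src=$1 dst=$2            # src must be an absolute path for cp -s
  mkdir -p "$dst"
  cp -Rs "$src"/. "$dst"/  # symlink the files instead of copying them
  rm -rf "$dst/temp"       # drop the mirrored (read-only) temp/ ...
  mkdir -p "$dst/temp"     # ... and replace it with a writable one
}
# Usage (paths illustrative):
# mirror_midasdb /cluster/shared/databases/MIDSA2/latest/gtdb "$TMPDIR/gtdb"
# midas2 merge_snps ... --midasdb_dir "$TMPDIR/gtdb" ...
```

GNU cp's -s requires absolute source paths, which is why the function expects one; the symlinks keep the mirror cheap even for a large database.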
We didn't clean up centroids.ffn to our standard format (one sequence per line); instead we copied the vsearch cluster results directly.
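Flattening wrapped FASTA records to one sequence line each can be sketched as follows (a minimal helper, assuming standard FASTA input; not the code used to build the database):

```python
def flatten_fasta(lines):
    """Join wrapped FASTA sequence lines so each record becomes
    one '>' header line followed by exactly one sequence line."""
    out, seq = [], []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if seq:                     # flush the previous record
                out.append("".join(seq))
                seq = []
            out.append(line)
        elif line:                      # accumulate wrapped sequence
            seq.append(line)
    if seq:                             # flush the final record
        out.append("".join(seq))
    return out
```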
When I run this command, there is no action or output:
$ midas2 run_species --sample_name ${sample_name} -1 reads/${sample_name}_R1.fastq.gz --midasdb_name uhgg --midasdb_dir my_midasdb_uhgg --num_cores 8 my_midas2_output
1652982326.0: Species abundance estimation in subcommand run_species with args
1652982326.0: {
1652982326.0: "subcommand": "run_species",
1652982326.0: "force": false,
1652982326.0: "debug": false,
1652982326.0: "zzz_worker_mode": false,
1652982326.0: "batch_branch": "master",
1652982326.0: "batch_memory": 378880,
1652982326.0: "batch_vcpus": 48,
1652982326.0: "batch_queue": "pairani",
1652982326.0: "batch_ecr_image": "pairani:latest",
1652982326.0: "midas_outdir": "my_midas2_output",
1652982326.0: "sample_name": "sample1",
1652982326.0: "r1": "reads/sample1_R1.fastq.gz",
1652982326.0: "r2": null,
1652982326.0: "midasdb_name": "uhgg",
1652982326.0: "midasdb_dir": "my_midasdb_uhgg",
1652982326.0: "word_size": 28,
1652982326.0: "aln_mapid": null,
1652982326.0: "aln_cov": 0.75,
1652982326.0: "marker_reads": 2,
1652982326.0: "marker_covered": 2,
1652982326.0: "max_reads": null,
1652982326.0: "num_cores": 8
1652982326.0: }
1652982326.0: Create OUTPUT directory for sample1.
1652982326.0: 'rm -rf my_midas2_output/sample1/species'
1652982326.0: 'mkdir -p my_midas2_output/sample1/species'
1652982326.0: Create TEMP directory for sample1.
1652982326.0: 'rm -rf my_midas2_output/sample1/temp/species'
1652982326.0: 'mkdir -p my_midas2_output/sample1/temp/species'
1652982326.0: MIDAS2::fetch_midasdb_files::start
download failed: s3://microbiome-pollardlab/uhgg_v1/genomes.tsv.lz4 to - An error occurred (403) when calling the HeadObject operation: Forbidden
1652982332.0: Sleeping 4.433524636219189 seconds before retry 1 of <function download_reference at 0x155553e93440> with ('s3://microbiome-pollardlab/uhgg_v1/genomes.tsv.lz4', '/global/u1/s/snayfach/test/my_midasdb_uhgg'), {}.
download failed: s3://microbiome-pollardlab/uhgg_v1/genomes.tsv.lz4 to - An error occurred (403) when calling the HeadObject operation: Forbidden
1652982337.9: Sleeping 11.755280753849886 seconds before retry 2 of <function download_reference at 0x155553e93440> with ('s3://microbiome-pollardlab/uhgg_v1/genomes.tsv.lz4', '/global/u1/s/snayfach/test/my_midasdb_uhgg'), {}.
download failed: s3://microbiome-pollardlab/uhgg_v1/genomes.tsv.lz4 to - An error occurred (403) when calling the HeadObject operation: Forbidden
1652982351.2: Deleting untrustworthy outputs due to error. Specify --debug flag to keep.
Traceback (most recent call last):
File "/global/homes/s/snayfach/.conda/envs/midas2/bin/midas2", line 10, in <module>
sys.exit(main())
File "/global/homes/s/snayfach/.conda/envs/midas2/lib/python3.7/site-packages/midas2/__main__.py", line 24, in main
return subcommand_main(subcommand_args)
File "/global/homes/s/snayfach/.conda/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/run_species.py", line 498, in main
run_species(args)
File "/global/homes/s/snayfach/.conda/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/run_species.py", line 492, in run_species
raise error
File "/global/homes/s/snayfach/.conda/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/run_species.py", line 443, in run_species
midas_db = MIDAS_DB(os.path.abspath(args.midasdb_dir), args.midasdb_name)
File "/global/homes/s/snayfach/.conda/envs/midas2/lib/python3.7/site-packages/midas2/models/midasdb.py", line 60, in __init__
self.local_toc = self.fetch_files("table_of_contents")
File "/global/homes/s/snayfach/.conda/envs/midas2/lib/python3.7/site-packages/midas2/models/midasdb.py", line 118, in fetch_files
return _fetch_file_from_s3((s3_path, local_path))
File "/global/homes/s/snayfach/.conda/envs/midas2/lib/python3.7/site-packages/midas2/models/midasdb.py", line 165, in _fetch_file_from_s3
return download_reference(s3_path, local_dir)
File "/global/homes/s/snayfach/.conda/envs/midas2/lib/python3.7/site-packages/midas2/common/utils.py", line 467, in wrapped_operation
return operation(*args, **kwargs)
File "/global/homes/s/snayfach/.conda/envs/midas2/lib/python3.7/site-packages/midas2/common/utils.py", line 643, in download_reference
command(f"set -o pipefail; aws s3 cp --only-show-errors --no-sign-request {ref_path} - | {uncompress_cmd} > {local_path}")
File "/global/homes/s/snayfach/.conda/envs/midas2/lib/python3.7/site-packages/midas2/common/utils.py", line 245, in command
return subprocess.run(cmd, shell=shell, **subproc_args)
File "/global/homes/s/snayfach/.conda/envs/midas2/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'set -o pipefail; aws s3 cp --only-show-errors --no-sign-request s3://microbiome-pollardlab/uhgg_v1/genomes.tsv.lz4 - | lz4 -dc > /global/u1/s/snayfach/test/my_midasdb_uhgg/genomes.tsv' returned non-zero exit status 1.
Hello,
In this ReadTheDocs page, the anatomy of the per-sample run_snps outputs shows that the alignments are present:
|- temp
   |- snps
      |- repgenomes.bam                (run_snps: rep-genome alignment file)
      |- {species}/snps_XX.tsv.lz4
|- bt2_indexes
   |- snps/repgenomes.*                (run_snps: sample-specific rep-genome database)
However, both the temp and bt2_indexes folders turn out empty after a successful run of run_snps.
I did not remove these files myself, and I do see in the stdout:
1695133677.6: 'samtools index -@ 4 /<my_path>/temp/snps/111210/111210.sorted.bam'
but this file is nowhere to be found after completion.
I am considering patching the tool so it does not remove these files, since they would be useful for my work.
Please, is there a way or flag to tell midas2 to keep these output files?
Thanks,
Franck
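One thing worth trying, based on MIDAS2's own log message elsewhere in this thread ("Deleting untrustworthy outputs due to error. Specify --debug flag to keep."): re-run with --debug, which by that message keeps intermediate outputs instead of deleting them. Whether it also preserves temp/ after a fully successful run is an assumption to verify; sample name and paths below are illustrative:

```shell
# Illustrative re-run with --debug to keep intermediate outputs
# (sample name, read paths, and database location are placeholders).
midas2 run_snps --sample_name sample1 \
  -1 reads/sample1_R1.fastq.gz -2 reads/sample1_R2.fastq.gz \
  --midasdb_name uhgg --midasdb_dir my_midasdb_uhgg \
  --num_cores 8 --debug midas2_output
```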
I have finished the run_species and run_snps steps, but something goes wrong with merge_snps. I used:
midas2 merge_snps --samples_list list_of_samples.tsv --midasdb_name gtdb --midasdb_dir my_midasdb_Escherichia_coli --num_cores 10 --genome_coverage 0.0 --force midas2_output/merge
Here is the output:
Traceback (most recent call last):
File "/home/luoqingqing/.conda/envs/midas2.0/bin/midas2", line 8, in <module>
sys.exit(main())
File "/home/luoqingqing/.conda/envs/midas2.0/lib/python3.7/site-packages/midas2/__main__.py", line 24, in main
return subcommand_main(subcommand_args)
File "/home/luoqingqing/.conda/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 664, in main
merge_snps(args)
File "/home/luoqingqing/.conda/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 653, in merge_snps
raise error
File "/home/luoqingqing/.conda/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 617, in merge_snps
assert species_ids_of_interest, f"No (specified) species pass the genome_coverage filter across samples, please adjust the genome_coverage or species_list"
AssertionError: No (specified) species pass the genome_coverage filter across samples, please adjust the genome_coverage or species_list
But I have already set genome_coverage to 0. How can I adjust the parameters to get the merge results?
Waiting for your kind reply, many thanks.
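To see why no species passes even at coverage 0, it can help to tally, per species, how many samples actually report coverage in the per-sample SNPs summaries (the filter also requires a minimum number of samples per species, so species seen in only one sample may still be excluded). A minimal diagnostic sketch, assuming a TSV with species_id and mean_coverage columns — the column names are an assumption, so check them against your own snps_summary.tsv:

```python
import csv
import io

def species_sample_counts(summary_tsv, min_coverage=0.0):
    """Count, per species, how many sample rows meet min_coverage.
    Assumes TSV columns 'species_id' and 'mean_coverage' (an
    assumption about the summary format, not a verified schema)."""
    counts = {}
    for row in csv.DictReader(io.StringIO(summary_tsv), delimiter="\t"):
        if float(row["mean_coverage"]) >= min_coverage:
            sid = row["species_id"]
            counts[sid] = counts.get(sid, 0) + 1
    return counts
```

Species with a count below the merge step's minimum sample threshold would then explain the empty result set.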
After downloading the v1.0.9 source code and entering the directory, I ran:
conda env create -n midas2 -f midas2.yml
This returns the error message:
Retrieving notices: ...working... done
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
Now that we know exactly which files are needed for MIDAS analysis, we should rename the MIDAS-DB built by iggtools so that the /temp dir is only needed for the DB-building step and not for the MIDAS analysis step. Then, when users download the 2.0 version of the database, we can skip the temp dirs.
Using the midas2 database command, I cannot download the UHGG database locally. Can you write a workflow showing how to build the UHGG database?
I currently have a problematic installation of Bowtie2, resulting in a bowtie2-build failure:
$ bowtie2-build --threads 10 data/SS01120.m.proc.iggtools/genes/temp_sc3.0/pangenomes.fa data/SS01120.m.proc.iggtools/genes/temp_sc3.0/pangenomes > data/SS01120.m.proc.iggtools/genes/temp_sc3.0/bowtie2-build.log
/pollard/home/bsmith/anaconda3/envs/ucfmt4/bin/bowtie2-build-s: /pollard/home/bsmith/anaconda3/envs/ucfmt4/bin/../lib/libstdc++.so.6: version `CXXABI_1.3.11' not found (required by /pollard/home/bsmith/anaconda3/envs/ucfmt4/bin/../lib/libtbb.so.2)
which exits with code 1. However, the iggtools exit code is still 0:
$ iggtools midas_run_genes --threads 10 --debug -1 data/SS01120.m.proc.r1.fq.gz -2 data/SS01120.m.proc.r2.fq.gz data/SS01120.m.proc.iggtools
1580506602.8: Doing important work in subcommand midas_run_genes with args
[....debug output removed....]
1580506645.8: 'bowtie2-build --threads 10 data/SS01120.m.proc.iggtools/genes/temp_sc3.0/pangenomes.fa data/SS01120.m.proc.iggtools/genes/temp_sc3.0/pangenomes > data/SS01120.m.proc.iggtools/genes/temp_sc3.0/bowtie2-build.log'
/pollard/home/bsmith/anaconda3/envs/ucfmt4/bin/bowtie2-build-s: /pollard/home/bsmith/anaconda3/envs/ucfmt4/bin/../lib/libstdc++.so.6: version `CXXABI_1.3.11' not found (required by /pollard/home/bsmith/anaconda3/envs/ucfmt4/bin/../lib/libtbb.so.2)
$ echo $?
0
As a result, pipelines that include this iggtools call do not fail, and try to run the next step.
The problem is that the iggtools.subcommands.midas_run_genes.midas_run_genes
function catches ALL exceptions and performs cleanup without a final non-zero exit.
This same (IMO anti-) pattern is also used in midas_run_snps, but doesn't seem to be used in midas_run_species.
The expected behavior is for errors in subcommands to result in iggtools exiting with a non-zero exit code.
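The fix could follow the pattern below: perform cleanup on failure, then re-raise so the top level returns a non-zero exit code. This is an illustrative sketch of the desired control flow, not iggtools' actual code:

```python
import subprocess

def run_subcommand(do_work, cleanup):
    """Run a subcommand body; on any failure, clean up untrustworthy
    outputs but RE-RAISE so the error still propagates to main()."""
    try:
        do_work()
    except Exception:
        cleanup()   # delete untrustworthy outputs
        raise       # do not swallow the error

def main(do_work, cleanup):
    """Top-level entry point: map failures to a non-zero exit code."""
    try:
        run_subcommand(do_work, cleanup)
    except subprocess.CalledProcessError as e:
        return e.returncode or 1   # surface the child's exit status
    except Exception:
        return 1
    return 0
```

With this shape, a failing bowtie2-build inside do_work would make the whole pipeline step fail loudly instead of reporting success.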