apcamargo / bioinformatics-snakemake-pipelines

Collection of Snakemake pipelines for common bioinformatics tasks

License: GNU General Public License v3.0

Python 100.00%

bioinformatics-snakemake-pipelines's Issues

Issues about bioinformatics-snakemake-pipelines/genome-ani-leiden-clustering-pipeline

Hi Dr. Camargo,
I found this nice project while reading your IMG/VR v4 manuscript published in Nucleic Acids Research. I have a few questions about it and hope you can help me with them.
1. I ran the pipeline with the following command:
nohup snakemake --config input=virus30_aftercheckv.fna.split output=representative_genomes_directory leiden_resolution=1.0 avg_ani=0.95 min_ani=0.0 min_cov=0.85 blast_threads=16 --cores 16 -s genome-ani-leiden-clustering-pipeline.smk &

and it failed with the following error:
[Tue Feb 20 21:55:34 2024]
rule anicalc:
input: virus30_aftercheckv.fna.split_concatenated.fna, virus30_aftercheckv.fna.split_blast_sorted.tsv
output: virus30_aftercheckv.fna.split_ani.tsv
jobid: 4

Removing temporary output file virus30_aftercheckv.fna.split_blast_sorted.tsv.
[Tue Feb 20 22:00:01 2024]
Finished job 4.
6 of 9 steps (67%) done

[Tue Feb 20 22:00:01 2024]
rule aniclust:
input: virus30_aftercheckv.fna.split_concatenated.fna, virus30_aftercheckv.fna.split_ani.tsv
output: virus30_aftercheckv.fna.split_clusters.tsv
jobid: 2

Traceback (most recent call last):
File "/mnt/sdb/xjh/proj_vlp_bulk/bulk_assembly_genomad/votu30_after_checkv/.snakemake/scripts/tmps86bt2ol.aniclust.py", line 64, in <module>
min_ani=snakemake.params.min_ani,
AttributeError: 'Params' object has no attribute 'min_ani'
[Tue Feb 20 22:00:01 2024]
Error in rule aniclust:
jobid: 2
output: virus30_aftercheckv.fna.split_clusters.tsv

RuleException:
CalledProcessError in line 65 of /mnt/sdb/xjh/proj_vlp_bulk/bulk_assembly_genomad/votu30_after_checkv/genome-ani-leiden-clustering-pipeline.smk:
Command 'set -euo pipefail; /mnt/sdb/xjh/miniconda3/envs/suv_threesoft/bin/python3.10 /mnt/sdb/xjh/proj_vlp_bulk/bulk_assembly_genomad/votu30_after_checkv/.snakemake/scripts/tmps86bt2ol.aniclust.py' returned non-zero exit status 1.
File "/mnt/sdb/xjh/miniconda3/envs/suv_threesoft/lib/python3.10/site-packages/snakemake/executors/__init__.py", line 2208, in run_wrapper
File "/mnt/sdb/xjh/proj_vlp_bulk/bulk_assembly_genomad/votu30_after_checkv/genome-ani-leiden-clustering-pipeline.smk", line 65, in __rule_aniclust
File "/mnt/sdb/xjh/miniconda3/envs/suv_threesoft/lib/python3.10/site-packages/snakemake/executors/__init__.py", line 551, in _callback
File "/mnt/sdb/xjh/miniconda3/envs/suv_threesoft/lib/python3.10/concurrent/futures/thread.py", line 52, in run
File "/mnt/sdb/xjh/miniconda3/envs/suv_threesoft/lib/python3.10/site-packages/snakemake/executors/__init__.py", line 537, in cached_or_run
File "/mnt/sdb/xjh/miniconda3/envs/suv_threesoft/lib/python3.10/site-packages/snakemake/executors/__init__.py", line 2239, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /mnt/sdb/xjh/proj_vlp_bulk/bulk_assembly_genomad/votu30_after_checkv/.snakemake/log/2024-02-20T211837.878935.snakemake.log
I don't know what is causing this error.
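
For reference, the traceback says that the `params` object handed to `aniclust.py` has no attribute named `min_ani`. Named entries in a rule's `params:` directive become attributes of `snakemake.params` inside the script, so one possible reading (an assumption, not confirmed against the actual `.smk` file) is that the rule does not declare `min_ani` even though the script reads it. The sketch below reproduces that failure mode with a stand-in object. `SimpleNamespace` and the helper `read_param` are illustrative only and are not part of Snakemake or of this pipeline.

```python
from types import SimpleNamespace

# Stand-in for the `snakemake.params` object that Snakemake passes to a script.
# Assumption: the rule declares avg_ani and min_cov but does not declare min_ani.
params = SimpleNamespace(avg_ani=0.95, min_cov=0.85)


def read_param(params, name, default=None):
    """Return a named parameter, or `default` if the rule did not declare it."""
    return getattr(params, name, default)


# Accessing an undeclared name fails the same way as in the traceback.
try:
    params.min_ani
except AttributeError as err:
    message = str(err)

# A defensive read falls back to a default instead of raising.
min_ani = read_param(params, "min_ani", default=0.0)
print(message)
print(min_ani)
```

If this reading is right, the fix would belong in the rule definition (declaring `min_ani` under `params:`) rather than in the script.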

2. In your IMG/VR v4 manuscript published in Nucleic Acids Research, I saw two additional steps (BLAST and ANI/AF computation) before the Leiden algorithm when clustering viral contigs into vOTUs. Can I use only the Leiden pipeline (https://github.com/apcamargo/bioinformatics-snakemake-pipelines/tree/main/genome-ani-leiden-clustering-pipeline) to cluster viral contigs into vOTUs? I am not sure what the BLAST and ANI/AF computation steps are for.
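
To make the question concrete: the log above shows a rule `anicalc` that takes a sorted BLAST table and writes an ANI table before `aniclust` runs. Below is a minimal sketch of how I understand such a step, assuming ANI is the alignment-length-weighted mean identity of the BLAST hits between two genomes and AF is the aligned fraction of the query genome. The function `ani_and_af`, its input layout, and the toy numbers are hypothetical. The sketch ignores overlapping alignments and is not necessarily what the pipeline computes.

```python
from collections import defaultdict


def ani_and_af(hits, lengths):
    """Summarise BLAST hits per genome pair.

    hits: iterable of (query, target, percent_identity, alignment_length).
    lengths: dict mapping genome id to genome length in bp.
    Returns {(query, target): (ani, af)}, with af relative to the query length.
    """
    aligned = defaultdict(int)
    weighted = defaultdict(float)
    for query, target, pident, aln_len in hits:
        if query == target:
            continue  # skip self hits
        aligned[(query, target)] += aln_len
        weighted[(query, target)] += pident * aln_len

    result = {}
    for pair, total in aligned.items():
        ani = weighted[pair] / total                # length-weighted identity
        af = min(1.0, total / lengths[pair[0]])     # aligned fraction of query
        result[pair] = (ani, af)
    return result


# Toy data: two local alignments between genomes A and B, plus a self hit.
hits = [("A", "B", 96.0, 4000), ("A", "B", 98.0, 4000), ("A", "A", 100.0, 10000)]
lengths = {"A": 10000, "B": 9000}
pairs = ani_and_af(hits, lengths)
print(pairs)
```

Under these assumptions, the clustering step would then keep only genome pairs whose ANI and AF pass thresholds such as the `avg_ani` and `min_cov` values given in the command above.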

3. If I have viral contigs from several samples (sample1, sample2, sample3, ...) and I cluster them into vOTUs across all samples with this Leiden pipeline, is it possible to determine which samples a given vOTU occurs in? Which of the following approaches is recommended:
a. Cluster vOTUs across all samples of the study.
b. Cluster vOTUs within each sample separately.
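
For option a, this is roughly what I have in mind for tracing vOTUs back to samples. It is a sketch under assumptions: that the clusters file has two tab-separated columns (representative genome, then a comma-separated list of member contigs), and that I keep my own contig-to-sample mapping. The function `votu_samples`, the contig names, and the file layout are hypothetical and may not match the real `*_clusters.tsv` output.

```python
import csv
import io
from collections import defaultdict


def votu_samples(clusters_tsv, contig_to_sample):
    """Map each cluster representative to the sorted samples of its members.

    clusters_tsv: open text file, one cluster per line, assumed layout
                  "<representative>\t<member1>,<member2>,...".
    contig_to_sample: dict mapping contig id to sample name.
    """
    table = defaultdict(set)
    for row in csv.reader(clusters_tsv, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        representative, members = row[0], row[1].split(",")
        for contig in members:
            table[representative].add(contig_to_sample[contig])
    return {rep: sorted(samples) for rep, samples in table.items()}


# Toy input standing in for a clusters file and a contig-to-sample mapping.
clusters = io.StringIO("s1_c1\ts1_c1,s2_c7\ns3_c2\ts3_c2\n")
contig_to_sample = {"s1_c1": "sample1", "s2_c7": "sample2", "s3_c2": "sample3"}
presence = votu_samples(clusters, contig_to_sample)
print(presence)
```

With real data, `clusters` would be the opened clusters file, and the mapping could be built while concatenating the per-sample FASTA files.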

I apologize for asking such basic questions; my knowledge in this area is limited.
Thank you very much for your time!
Best wishes
