
mdu-phl / abritamr


A pipeline for running AMRfinderPlus and collating results into functional classes

Python 92.09% Dockerfile 1.01% JavaScript 6.90%

abritamr's People

Contributors

andersgs, kristyhoran, pansapiens


abritamr's Issues

Missing interpretation in the Salmonella report

Hello,
I have some issues with the Salmonella report generated by `abritamr report`.
Some interpretation fields are left blank, which looks like a bug :(.
Best regards,

MDU Sample ID Item code Ampicillin - ResMech Ampicillin - Interpretation Cefotaxime (ESBL) - ResMech Cefotaxime (ESBL) - Interpretation Cefotaxime (AmpC) - ResMech Cefotaxime (AmpC) - Interpretation Tetracycline - ResMech Tetracycline - Interpretation Gentamicin - ResMech Gentamicin - Interpretation Kanamycin - ResMech Kanamycin - Interpretation Streptomycin - ResMech Streptomycin - Interpretation Sulfathiazole - ResMech Sulfathiazole - Interpretation Trimethoprim - ResMech Trimethoprim - Interpretation Trim-Sulpha - ResMech Trim-Sulpha - Interpretation Chloramphenicol - ResMech Chloramphenicol - Interpretation Ciprofloxacin - ResMech Ciprofloxacin - Interpretation Meropenem - ResMech Meropenem - Interpretation Azithromycin - ResMech Azithromycin - Interpretation Aminoglycosides (RMT) - ResMech Aminoglycosides (RMT) - Interpretation Colistin - ResMech Colistin - Interpretation Other - ResMech Other - Interpretation
abritamr   None detected Susceptible None detected Susceptible None detected Susceptible None detected Susceptible None detected Susceptible None detected Susceptible None detected Susceptible None detected Susceptible None detected Susceptible None detected Susceptible None detected Susceptible None detected Susceptible None detected Susceptible None detected Susceptible None detected Susceptible None detected mdsA*;mdsB*

Can't find `collate.py` when running within a Singularity container

When running within a Singularity container, we hit this error:

python3: can't open file '/opt/conda/lib/python3.7/site-packages/abritamr/utils/collate.py': [Errno 2] No such file or directory

I think it has to do with the Snakemake file being outside the container, so to speak.

How to run program

This is probably a stupid question, but how do you run the program? I have cd'd into the directory that contains my .fasta file and tried combinations of

abritamr --run path to fasta
abritamr --run
abritamr path to fasta

I'm wondering if someone could give me the dummies' guide to it?!

Cheers!

Need to check for empty dicts

I'm using the NCBI test data (https://github.com/ncbi/amr/wiki/Test-your-installation) to write some unit tests for this tool in our setup and found a bug.

If there are no partial matches it errors out (command and output below).
I think you need some checking for empty elements prior to your collate calls.
Will report back if I find a neat fix.

abritamr run -c test_dna.fa -px TEST --species Escherichia -j 2

[INFO:11/23/2021 11:41:56 AM] The input file seems to be in the correct format. Thank you.
[INFO:11/23/2021 11:41:56 AM] Checking if file test_dna.fa exists
[INFO:11/23/2021 11:41:56 AM] test_dna.fa is present. abritamr can proceed.
[INFO:11/23/2021 11:41:56 AM] Checking for amrfinder DB: None and comparing it to 2021-09-30.1
[WARNING:11/23/2021 11:41:56 AM] It seems you don't have the AMRFINDER_DB variable set. Now checking AMRfinder setup. Please note if the AMRFinder DB is not v 2021-09-30.1 this may cause errors
[CRITICAL:11/23/2021 11:41:56 AM] Your amrfinder database version is NOT 2021-09-30.1. abriTAMR will still run but behaviour may not be as expected in terms of binnig genes into the appropriate drug classes.
[INFO:11/23/2021 11:41:56 AM] You are running abritamr in assembly mode. Now executing : mkdir -p TEST && amrfinder -n test_dna.fa -o TEST/amrfinder.out --plus --organism Escherichia --threads 2
[INFO:11/23/2021 11:42:00 AM] AMRfinder completed successfully. Will now move on to collation.
[INFO:11/23/2021 11:42:00 AM] This is a single sample run.
[INFO:11/23/2021 11:42:00 AM] Opened amrfinder output for TEST
blaTEM-156 blaPDC blaOXA vanG blaEC blaTEM aph(3'')-Ib sul2 blaOXA blaTEM emrD3 pmrB_C84R pmrB_C84R 23S_A2058T nfsA_K141STOP nfsA_R15C ampC_T-14TGT

Traceback (most recent call last):
  File "/media/nvme/miniconda3/envs/amrfinder/bin/abritamr", line 8, in <module>
    sys.exit(main())
  File "/media/nvme/miniconda3/envs/amrfinder/lib/python3.7/site-packages/abritamr/abritamr.py", line 124, in main
    args.func(args)
  File "/media/nvme/miniconda3/envs/amrfinder/lib/python3.7/site-packages/abritamr/abritamr.py", line 19, in run_pipeline
    collated_data = C.run()
  File "/media/nvme/miniconda3/envs/amrfinder/lib/python3.7/site-packages/abritamr/Collate.py", line 346, in run
    summary_drugs, summary_partial, virulence = self.collate(prefix = self.prefix)
  File "/media/nvme/miniconda3/envs/amrfinder/lib/python3.7/site-packages/abritamr/Collate.py", line 299, in collate
    reftab=reftab, df=df, isolate=prefix
  File "/media/nvme/miniconda3/envs/amrfinder/lib/python3.7/site-packages/abritamr/Collate.py", line 202, in get_per_isolate
    partials = self.joins(dict_for_joining=partials)
  File "/media/nvme/miniconda3/envs/amrfinder/lib/python3.7/site-packages/abritamr/Collate.py", line 71, in joins
    dict_for_joining[i] = ",".join(dict_for_joining[i])
TypeError: sequence item 0: expected str instance, NoneType found
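The failing line in the traceback joins a list that contains `None`. A minimal sketch of the kind of check suggested above could look like the following. It only assumes that the argument maps a key to a list of names, as the traceback implies; the rest of `Collate.py` is not shown in this thread, so this is an illustration rather than the project's actual fix.

```python
def joins(dict_for_joining):
    """Return a new dict with each list of names joined by commas.

    None entries (and lists that are None or empty) are skipped instead of
    being passed to str.join, which is what raises the TypeError above.
    """
    joined = {}
    for key, values in dict_for_joining.items():
        # Treat a missing list as empty and drop None entries before joining.
        cleaned = [str(v) for v in (values or []) if v is not None]
        joined[key] = ",".join(cleaned)
    return joined
```

With this version a key whose list holds only `None` ends up with an empty string rather than crashing the run.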

POINTN causing None error

When using the --species flag with Staphylococcus_aureus, the following error was raised.

Traceback (most recent call last):
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/bin/abritamr", line 8, in <module>
    sys.exit(main())
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/abritamr.py", line 127, in main
    args.func(args)
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/abritamr.py", line 20, in run_pipeline
    collated_data = C.run()
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/Collate.py", line 296, in run
    summary_drugs, summary_partial, virulence = self._batch_collate(input_file = self.input)
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/Collate.py", line 277, in _batch_collate
    temp_match, temp_partial, temp_virulence = self.collate(prefix = prefix)
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/Collate.py", line 246, in collate
    reftab=reftab, df=df, isolate=prefix
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/Collate.py", line 143, in get_per_isolate
    partials = self.joins(dict_for_joining=partials)
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/Collate.py", line 41, in joins
    dict_for_joining[i] = ",".join(dict_for_joining[i])
TypeError: sequence item 0: expected str instance, NoneType found

Provide a combined output file

Provide a single output file, which contains all genes detected

  • Genes (and mutations, where reported) will still be provided by drug class (or listed as virulence)
  • No annotation = exact match detected
  • * = a blast or allele match is supplied; the name reported is the closest match reported by amrfinder
  • ^ = a partial match

TypeError: sequence item 0: expected str instance, NoneType found

When running abritamr in batch mode, a TypeError occurred for 8 out of 92 E. coli genomes. We looked for a pattern in these genomes that could have caused this but found nothing. The program ran successfully on another batch of K. pneumoniae genomes.

The code used:

abritamr run --contigs AMR_input.tsv --jobs 16 --amrfinder_db /mnt/storage/databases/bakta/db/amrfinderplus-db/2022-08-09.1 --species Escherichia

The error:

Traceback (most recent call last):
  File "/mnt/storage/conda/milly/abritamr/bin/abritamr", line 10, in <module>
    sys.exit(main())
  File "/mnt/storage/conda/milly/abritamr/lib/python3.10/site-packages/abritamr/abritamr.py", line 124, in main
    args.func(args)
  File "/mnt/storage/conda/milly/abritamr/lib/python3.10/site-packages/abritamr/abritamr.py", line 20, in run_pipeline
    C.run()
  File "/mnt/storage/conda/milly/abritamr/lib/python3.10/site-packages/abritamr/Collate.py", line 310, in run
    summary_drugs, summary_partial, virulence = self.collate(prefix = self.prefix)
  File "/mnt/storage/conda/milly/abritamr/lib/python3.10/site-packages/abritamr/Collate.py", line 262, in collate
    drug, partial, virulence = self.get_per_isolate(
  File "/mnt/storage/conda/milly/abritamr/lib/python3.10/site-packages/abritamr/Collate.py", line 159, in get_per_isolate
    drugclass_dict = self.joins(dict_for_joining=drugclass_dict)
  File "/mnt/storage/conda/milly/abritamr/lib/python3.10/site-packages/abritamr/Collate.py", line 41, in joins
    dict_for_joining[i] = ",".join(dict_for_joining[i])
TypeError: sequence item 0: expected str instance, NoneType found

Documentation improvement

Hello,
I have been working with your tool and noticed something that is not very clear in the documentation: the ISOLATE name in the MDU QC file MUST be the same as the isolate name in the matches file from the run (that is, the prefix of the output directory). Otherwise the program crashes with the following error:

Traceback (most recent call last):
  File "/usr/local/bin/lmod/abritamr/1.0.13/bin/abritamr", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/bin/lmod/abritamr/1.0.13/lib/python3.11/site-packages/abritamr/abritamr.py", line 129, in main
    args.func(args)
  File "/usr/local/bin/lmod/abritamr/1.0.13/lib/python3.11/site-packages/abritamr/abritamr.py", line 28, in mdu
    C.run()
  File "/usr/local/bin/lmod/abritamr/1.0.13/lib/python3.11/site-packages/abritamr/Collate.py", line 822, in run
    passed_match_df = self.mdu_reporting_general(match=self.match)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/bin/lmod/abritamr/1.0.13/lib/python3.11/site-packages/abritamr/Collate.py", line 769, in mdu_reporting_general
    exp_species = qcdf["SPECIES_EXP"].values[0]
                  ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: index 0 is out of bounds for axis 0 with size 0

I would suggest either turning this into a more meaningful error in the code or stating the requirement more clearly where the documentation describes the QC file.
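As an illustration of what a more meaningful error could look like, here is a minimal sketch. The column names `ISOLATE` and `SPECIES_EXP` come from the report above. The function name, the plain list-of-dicts input, and the message wording are hypothetical and are not taken from the abritamr code, which appears to use a DataFrame.

```python
def expected_species(qc_rows, isolate):
    """Look up SPECIES_EXP for an isolate, failing with a readable message.

    qc_rows is a list of dicts, one per QC file row (hypothetical input shape).
    """
    matches = [row for row in qc_rows if row.get("ISOLATE") == isolate]
    if not matches:
        known = ", ".join(sorted(str(row.get("ISOLATE")) for row in qc_rows)) or "none"
        raise ValueError(
            f"Isolate '{isolate}' was not found in the ISOLATE column of the QC file "
            f"(found: {known}). The ISOLATE name must match the name used in the "
            "matches file."
        )
    return matches[0]["SPECIES_EXP"]
```

The point of the design is that the lookup fails before any positional indexing, so the user sees which name was expected and which names the QC file actually contains.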
Best regards,

Fix v1.0.7 tag

The v1.0.7 tag is inconsistent: there is an extra dot between the v and the 1 (v.1.0.7).
Once that dot is seen, it can't be unseen. You could just add an additional tag to the same commit if removing the existing one would cause issues.

missing blaEC gene

Dieter has identified sequences where amrfinder does not report the presence of a gene where abricate does. The gene is blaEC-5.

parallel: not found

Hi

I am getting the following error:
`[CRITICAL:05/17/2023 12:59:31 PM] Your amrfinder database version is NOT 2022-08-09.1. abriTAMR will still run but behaviour may not be as expected in terms of binnig genes into the appropriate drug classes.
[INFO:05/17/2023 12:59:31 PM] You are running abritamr in batch mode. Now executing : parallel -j 16 --colsep '\t' 'mkdir -p {1} && amrfinder -n {2} -o {1}/amrfinder.out --plus --organism Enterococcus_faecalis --threads 1 -d AMRFINDER_DB' :::: nett.txt
[CRITICAL:05/17/2023 12:59:31 PM] There appears to have been a problem with running amrfinder plus. The following erro has been reported :
/bin/sh: 1: parallel: not found

[CRITICAL:05/17/2023 12:59:31 PM] The amrfinder output : LH_NE_10.fna/amrfinder.out is missing. Something has gone wrong with AMRfinder plus. Please check all inputs and try again.`

Please help me resolve this error.

Second, does the database work on all Enterococcus species, and
how can I update the database?

Thanks,
Hassan
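The log shows that the shell could not find `parallel` on the PATH. A small pre-flight check along these lines can confirm which external tools are missing before a batch run. It is a sketch that uses only the standard library; the function name is made up for illustration, and the tool names are the ones that appear in the command above.

```python
import shutil


def missing_tools(tools):
    """Return the names from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Both commands appear in the batch-mode command line logged above.
    missing = missing_tools(["parallel", "amrfinder"])
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```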

Path to Singularity image should be in config

Maybe change the following in the Snakemake template:

    input: "{sample}/contigs.fa"
    output: "{sample}/{sample}.out"
    shell:
        """
        singularity run ~/dev/Singularity/amrfinderplus/amrfinderplus.simg amrfinder -n {input} -o {output} -t 1
        """

To:

    input: "{sample}/contigs.fa"
    output: "{sample}/{sample}.out"
    params: 
      singularity_image=config['path_amrfinderplus_singularity']
    shell:
        """
        singularity run {params.singularity_image} amrfinder -n {input} -o {output} -t 1
        """

Add gene classes to salmonella mdu

  • Add chloramphenicol as a separate respect with interpretation
  • Add ALL carbapenemases as a resmech, not an interpretation
  • Verify and add Cotrimoxazole in addition to sulfathiazole and trimethoprim

abritamr should fail with non-zero exit code if the amrfinder command is not successful

I see the following error (due to missing parallel, see bioconda/bioconda-recipes#41088), but abritamr
exits with code 0 (which indicates success).

[INFO:05/22/2023 01:03:43 PM] The input file seems to be in the correct format. Thank you.
[INFO:05/22/2023 01:03:43 PM] Checking that the input data is present.
[INFO:05/22/2023 01:03:43 PM] Checking if file /tmp/tmp6j2k5pye/files/3/3/f/dataset_33fbe018-0709-481b-b4b2-3117e725ad73.dat exists
[INFO:05/22/2023 01:03:43 PM] Checking for amrfinder DB: /usr/local/lib/python3.11/site-packages/abritamr/db/amrfinderplus/data/2022-08-09.1 and comparing it to 2022-08-09.1
[INFO:05/22/2023 01:03:43 PM] You seem to have the correct AMRfinder DB setup. Well done!
[INFO:05/22/2023 01:03:43 PM] All check complete now running AMRFinder
[INFO:05/22/2023 01:03:43 PM] You are running abritamr in batch mode. Now executing : parallel -j 2 --colsep '\t' 'mkdir -p {1} && amrfinder -n {2} -o {1}/amrfinder.out --plus  --threads 1 -d /usr/local/lib/python3.11/site-packages/abritamr/db/amrfinderplus/data/2022-08-09.1 --ident_min 0.9 ' :::: /tmp/tmp6j2k5pye/job_working_directory/000/2/configs/tmp27psiuwg
[CRITICAL:05/22/2023 01:03:43 PM] There appears to have been a problem with running amrfinder plus. The following erro has been reported : 
 /bin/sh: parallel: command not found

[CRITICAL:05/22/2023 01:03:43 PM] The amrfinder output : CP009102.1.fasta/amrfinder.out is missing. Something has gone wrong with AMRfinder plus. Please check all inputs and try again.
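One way to get the behaviour requested here is to propagate the child's return code. The following is a minimal sketch, not abritamr's own code: `run_or_exit` is a hypothetical helper name, and it assumes the command is passed either as a shell string (as in the log above) or as an argument list.

```python
import subprocess
import sys


def run_or_exit(cmd):
    """Run cmd and return its stdout; exit with the same code if it fails."""
    proc = subprocess.run(
        cmd,
        shell=isinstance(cmd, str),  # strings go through the shell, lists do not
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if proc.returncode != 0:
        # Report the captured error, then stop with a non-zero status.
        sys.stderr.write(proc.stderr)
        raise SystemExit(proc.returncode)
    return proc.stdout
```

With this pattern a missing `parallel` would surface as the shell's non-zero exit status, so a workflow manager such as Galaxy or Snakemake could detect the failure instead of seeing exit code 0.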

Difficulty in running the abritamr report

After the run completed, I wanted to generate the report using the report functionality.

Unfortunately, I am receiving the following error about the QC file.

(abritamr) user@bioscience:/data/mg/abritamr$ abritamr report -m summary_matches.txt -p summary_partials.txt
[INFO:03/06/2023 10:16:02 AM] You are generating a general report
[INFO:03/06/2023 10:16:02 AM] Now checking all input files are present.
[INFO:03/06/2023 10:16:02 AM] Checking that QC is present.
[CRITICAL:03/06/2023 10:16:02 AM] The QC file supplied () does not exist. Please check your inputs and try again.

Could you please guide me on how to generate the QC file?

Issue with snakemake pipeline?

Hi,

I'm having some issues trying to run this pipeline. I've installed via conda on a nectar instance, and when I call abritamr I get the following output:

(abritamr) ubuntu@nectar2:/mnt/nectar/analyses/mdu_amr_pipeline_test$ abritamr -c contig_file.txt
[INFO:07/17/2021 03:16:37 AM] Checking the structure of your input file.
[INFO:07/17/2021 03:16:37 AM] The input file seems to be in the correct format. Thank you.
[INFO:07/17/2021 03:16:37 AM] Checking that the input data is present. If present will link to /mnt/nectar/analyses/mdu_amr_pipeline_test
[INFO:07/17/2021 03:16:37 AM] Checking if file /mnt/nectar/analyses/klebs_oxy/polishing/Kox100_round3.fasta exists
[INFO:07/17/2021 03:16:37 AM] Setting up workflow files.
[INFO:07/17/2021 03:16:37 AM] Writing config file
[INFO:07/17/2021 03:16:37 AM] Writing snakefile
[INFO:07/17/2021 03:16:37 AM] Written Snakefile and config.yaml to /mnt/nectar/analyses/mdu_amr_pipeline_test
[INFO:07/17/2021 03:16:37 AM] Running pipeline using command snakemake -s Snakefile_abritamr -j 16  2>&1 | tee -a job.log. This may take some time.
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/abritamr/bin/abritamr", line 10, in <module>
    sys.exit(main())
  File "/home/ubuntu/miniconda3/envs/abritamr/lib/python3.6/site-packages/abritamr/abritamr.py", line 112, in main
    args.func(args)
  File "/home/ubuntu/miniconda3/envs/abritamr/lib/python3.6/site-packages/abritamr/abritamr.py", line 18, in run_pipeline
    return P.run_amr()
  File "/home/ubuntu/miniconda3/envs/abritamr/lib/python3.6/site-packages/abritamr/AmrSetup.py", line 219, in run_amr
    wkflow = self.run_snakemake()
  File "/home/ubuntu/miniconda3/envs/abritamr/lib/python3.6/site-packages/abritamr/AmrSetup.py", line 191, in run_snakemake
    wkfl = subprocess.run(cmd, shell=True, capture_output=True)
  File "/home/ubuntu/miniconda3/envs/abritamr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
TypeError: __init__() got an unexpected keyword argument 'capture_output'

AMRFinder Plus is installed, and I am able to run it successfully on the genome that I've used in the above test.
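The traceback comes from a Python 3.6 environment, and the `capture_output` argument of `subprocess.run` was only added in Python 3.7. A sketch of an equivalent call that also works on 3.6 is shown below. The helper name is hypothetical; upgrading the environment to Python 3.7 or later is the other obvious option.

```python
import subprocess


def run_captured(cmd):
    """Run a shell command and capture stdout and stderr on Python 3.6+."""
    # stdout=PIPE and stderr=PIPE together are what capture_output=True
    # sets up on Python 3.7 and later.
    return subprocess.run(
        cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
```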

Unable to run abritamr in batch mode, giving error parallel: invalid option -- '-'

Hi,
I am unable to run abritamr in batch mode because the following error occurs. The screenshot is attached here. I followed the instructions given on the GitHub page and tried every suggestion I could find on the internet, but nothing has worked so far. If you have any better suggestions, please let me know. With a single input file abritamr runs fine.
Thanking you!
Manish Ranjan
Screenshot from 2023-01-25 12-56-16
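One possible cause, which is an assumption and not confirmed in this thread, is that the `parallel` found first on the PATH is not GNU parallel (for example the different tool of the same name shipped with moreutils), which would reject long options such as `--colsep`. A small sketch to check which one is installed follows. Both function names are made up for illustration.

```python
import subprocess


def is_gnu_parallel(version_output):
    """Return True if the text printed by `parallel --version` names GNU parallel."""
    return "GNU parallel" in version_output


def parallel_version_output():
    """Return whatever `parallel --version` prints, or '' if it is not installed."""
    try:
        proc = subprocess.run(
            ["parallel", "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # a non-GNU parallel complains on stderr
            universal_newlines=True,
        )
    except FileNotFoundError:
        return ""
    return proc.stdout


if __name__ == "__main__":
    if is_gnu_parallel(parallel_version_output()):
        print("GNU parallel found.")
    else:
        print("GNU parallel not found; batch mode is likely to fail.")
```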
