mdu-phl / abritamr

A pipeline for running AMRfinderPlus and collating results into functional classes

Python 92.09% Dockerfile 1.01% JavaScript 6.90%

abritamr's Issues

Use click instead of argparse

I know it is an extra dependency, but it is much nicer to work with than argparse.

Missing interpretation in the Salmonella report

Hello,
I have an issue with the Salmonella report generated by abritamr report.
Some Interpretation fields are left blank, which looks like a bug :(.
Best regards,

MDU Sample ID	Item code	Ampicillin - ResMech	Ampicillin - Interpretation	Cefotaxime (ESBL) - ResMech	Cefotaxime (ESBL) - Interpretation	Cefotaxime (AmpC) - ResMech	Cefotaxime (AmpC) - Interpretation	Tetracycline - ResMech	Tetracycline - Interpretation	Gentamicin - ResMech	Gentamicin - Interpretation	Kanamycin - ResMech	Kanamycin - Interpretation	Streptomycin - ResMech	Streptomycin - Interpretation	Sulfathiazole - ResMech	Sulfathiazole - Interpretation	Trimethoprim - ResMech	Trimethoprim - Interpretation	Trim-Sulpha - ResMech	Trim-Sulpha - Interpretation	Chloramphenicol - ResMech	Chloramphenicol - Interpretation	Ciprofloxacin - ResMech	Ciprofloxacin - Interpretation	Meropenem - ResMech	Meropenem - Interpretation	Azithromycin - ResMech	Azithromycin - Interpretation	Aminoglycosides (RMT) - ResMech	Aminoglycosides (RMT) - Interpretation	Colistin - ResMech	Colistin - Interpretation	Other - ResMech	Other - Interpretation
abritamr		None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected	Susceptible	None detected		mdsA;mdsB

Can't find `collate.py` when running within a Singularity container

When running within a Singularity container, we hit this error:

python3: can't open file '/opt/conda/lib/python3.7/site-packages/abritamr/utils/collate.py': [Errno 2] No such file or directory

I think it has to do with the Snakemake file being outside the container --- so to speak.

How to run program

This is probably a stupid question, but how do you run the program? I have cd'd into the directory that contains my .fasta file and tried combinations of

abritamr --run path to fasta
abritamr --run
abritamr path to fasta

I'm wondering if someone could give me the dummies' guide to it?!

Cheers!

Need to check for empty dicts

I'm using the NCBI test data (https://github.com/ncbi/amr/wiki/Test-your-installation) to make some Unit tests for this tool in our setup and found a bug.

If there are no partial matches it errors out (command and output below).
I think you need some checking for empty elements prior to your collate calls.
Will report back if I find a neat fix.

abritamr run -c test_dna.fa -px TEST --species Escherichia -j 2

[INFO:11/23/2021 11:41:56 AM] The input file seems to be in the correct format. Thank you.
[INFO:11/23/2021 11:41:56 AM] Checking if file test_dna.fa exists
[INFO:11/23/2021 11:41:56 AM] test_dna.fa is present. abritamr can proceed.
[INFO:11/23/2021 11:41:56 AM] Checking for amrfinder DB: None and comparing it to 2021-09-30.1
[WARNING:11/23/2021 11:41:56 AM] It seems you don't have the AMRFINDER_DB variable set. Now checking AMRfinder setup. Please note if the AMRFinder DB is not v 2021-09-30.1 this may cause errors
[CRITICAL:11/23/2021 11:41:56 AM] Your amrfinder database version is NOT 2021-09-30.1. abriTAMR will still run but behaviour may not be as expected in terms of binnig genes into the appropriate drug classes.
[INFO:11/23/2021 11:41:56 AM] You are running abritamr in assembly mode. Now executing : mkdir -p TEST && amrfinder -n test_dna.fa -o TEST/amrfinder.out --plus --organism Escherichia --threads 2
[INFO:11/23/2021 11:42:00 AM] AMRfinder completed successfully. Will now move on to collation.
[INFO:11/23/2021 11:42:00 AM] This is a single sample run.
[INFO:11/23/2021 11:42:00 AM] Opened amrfinder output for TEST
blaTEM-156 blaPDC blaOXA vanG blaEC blaTEM aph(3'')-Ib sul2 blaOXA blaTEM emrD3 pmrB_C84R pmrB_C84R 23S_A2058T nfsA_K141STOP nfsA_R15C ampC_T-14TGT

Traceback (most recent call last):
  File "/media/nvme/miniconda3/envs/amrfinder/bin/abritamr", line 8, in <module>
    sys.exit(main())
  File "/media/nvme/miniconda3/envs/amrfinder/lib/python3.7/site-packages/abritamr/abritamr.py", line 124, in main
    args.func(args)
  File "/media/nvme/miniconda3/envs/amrfinder/lib/python3.7/site-packages/abritamr/abritamr.py", line 19, in run_pipeline
    collated_data = C.run()
  File "/media/nvme/miniconda3/envs/amrfinder/lib/python3.7/site-packages/abritamr/Collate.py", line 346, in run
    summary_drugs, summary_partial, virulence = self.collate(prefix = self.prefix)
  File "/media/nvme/miniconda3/envs/amrfinder/lib/python3.7/site-packages/abritamr/Collate.py", line 299, in collate
    reftab=reftab, df=df, isolate=prefix
  File "/media/nvme/miniconda3/envs/amrfinder/lib/python3.7/site-packages/abritamr/Collate.py", line 202, in get_per_isolate
    partials = self.joins(dict_for_joining=partials)
  File "/media/nvme/miniconda3/envs/amrfinder/lib/python3.7/site-packages/abritamr/Collate.py", line 71, in joins
    dict_for_joining[i] = ",".join(dict_for_joining[i])
TypeError: sequence item 0: expected str instance, NoneType found
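
The traceback above points at a `",".join(...)` over a list that contains `None`. A minimal sketch of one possible guard follows. It assumes the dictionary values are lists that may be empty, missing, or contain `None`; the name `safe_joins` is hypothetical and this is not the project's actual fix.

```python
def safe_joins(dict_for_joining):
    """Join each list of gene names into a comma-separated string.

    Hypothetical helper: skips None entries and tolerates empty or
    missing lists instead of raising TypeError.
    """
    joined = {}
    for key, values in dict_for_joining.items():
        # Treat a missing list (None) the same as an empty one.
        cleaned = [str(v) for v in (values or []) if v is not None]
        joined[key] = ",".join(cleaned)
    return joined


if __name__ == "__main__":
    # Made-up input that mimics the failing case: a None inside a list.
    partials = {"Beta-lactam": ["blaTEM", None], "Other": []}
    print(safe_joins(partials))
```

Returning a new dictionary rather than mutating the input keeps the guard easy to test in isolation; whether an empty string is the right placeholder for "no partial matches" is a design choice for the maintainers.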

POINTN causing None error

When using the --species flag with Staphylococcus_aureus, the following error was raised.

Traceback (most recent call last):
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/bin/abritamr", line 8, in <module>
    sys.exit(main())
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/abritamr.py", line 127, in main
    args.func(args)
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/abritamr.py", line 20, in run_pipeline
    collated_data = C.run()
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/Collate.py", line 296, in run
    summary_drugs, summary_partial, virulence = self._batch_collate(input_file = self.input)
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/Collate.py", line 277, in _batch_collate
    temp_match, temp_partial, temp_virulence = self.collate(prefix = prefix)
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/Collate.py", line 246, in collate
    reftab=reftab, df=df, isolate=prefix
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/Collate.py", line 143, in get_per_isolate
    partials = self.joins(dict_for_joining=partials)
  File "/home/morjm/miniconda3/envs/bohra200522_cloned_060622/lib/python3.7/site-packages/abritamr/Collate.py", line 41, in joins
    dict_for_joining[i] = ",".join(dict_for_joining[i])
TypeError: sequence item 0: expected str instance, NoneType found

Provide a combined output file

Provide a single output file, which contains all genes detected

Genes (and mutations, where reported) will still be provided by drug class (or listed as virulence)
No annotation = exact match detected
* = a blast or allele match is supplied - name reported is the closest match reported by amrfinder
^ = a partial match
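
The annotation scheme above could be sketched as follows. The match-type labels (`exact`, `close`, `partial`) and the function names are hypothetical, chosen only for illustration; the real pipeline may derive them differently from the amrfinder output.

```python
# Hypothetical match-type labels mapped to the markers described above.
SUFFIX = {"exact": "", "close": "*", "partial": "^"}


def annotate(gene, match_type):
    """Return the gene name followed by the marker for its match type."""
    return f"{gene}{SUFFIX[match_type]}"


def combined_row(hits):
    """Group annotated gene names by drug class.

    `hits` is a list of (drug_class, gene, match_type) tuples.
    """
    row = {}
    for drug_class, gene, match_type in hits:
        row.setdefault(drug_class, []).append(annotate(gene, match_type))
    # One comma-separated cell per drug class.
    return {drug_class: ",".join(genes) for drug_class, genes in row.items()}
```

For example, `combined_row([("Beta-lactam", "blaTEM", "close"), ("Sulfonamide", "sul2", "exact")])` yields one cell per drug class, with `blaTEM*` marked as a non-exact match.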

Update DB to 2021-09-30.1

Update AMRfinder DB to 2021-09-30.1

update db
verify
deploy

TypeError: sequence item 0: expected str instance, NoneType found

When running abritamr in batch mode, a TypeError occurred for 8 of 92 E. coli genomes. We looked for a pattern in these genomes that could have caused this but found nothing. The program ran successfully on another batch of genomes for K. pneumoniae.

The code used:

abritamr run --contigs AMR_input.tsv --jobs 16 --amrfinder_db /mnt/storage/databases/bakta/db/amrfinderplus-db/2022-08-09.1 --species Escherichia

The error:

Traceback (most recent call last):
  File "/mnt/storage/conda/milly/abritamr/bin/abritamr", line 10, in <module>
    sys.exit(main())
  File "/mnt/storage/conda/milly/abritamr/lib/python3.10/site-packages/abritamr/abritamr.py", line 124, in main
    args.func(args)
  File "/mnt/storage/conda/milly/abritamr/lib/python3.10/site-packages/abritamr/abritamr.py", line 20, in run_pipeline
    C.run()
  File "/mnt/storage/conda/milly/abritamr/lib/python3.10/site-packages/abritamr/Collate.py", line 310, in run
    summary_drugs, summary_partial, virulence = self.collate(prefix = self.prefix)
  File "/mnt/storage/conda/milly/abritamr/lib/python3.10/site-packages/abritamr/Collate.py", line 262, in collate
    drug, partial, virulence = self.get_per_isolate(
  File "/mnt/storage/conda/milly/abritamr/lib/python3.10/site-packages/abritamr/Collate.py", line 159, in get_per_isolate
    drugclass_dict = self.joins(dict_for_joining=drugclass_dict)
  File "/mnt/storage/conda/milly/abritamr/lib/python3.10/site-packages/abritamr/Collate.py", line 41, in joins
    dict_for_joining[i] = ",".join(dict_for_joining[i])
TypeError: sequence item 0: expected str instance, NoneType found

new issue for testing

Test Issue for planner

correct installation instructions

Please correct `conda create -n abritamr -c bioconda ncbi-amrfinder` to `conda create -n abritamr -c bioconda ncbi-amrfinderplus`.

Documentation improvement

Hello,
I have been working with your tool and noticed that something is not very clear in the documentation: the ISOLATE name in the MDU QC file must be the same as the isolate name in the matches file from the run (that is, the prefix of the output directory). Otherwise the program crashes with the following error:

Traceback (most recent call last):
  File "/usr/local/bin/lmod/abritamr/1.0.13/bin/abritamr", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/bin/lmod/abritamr/1.0.13/lib/python3.11/site-packages/abritamr/abritamr.py", line 129, in main
    args.func(args)
  File "/usr/local/bin/lmod/abritamr/1.0.13/lib/python3.11/site-packages/abritamr/abritamr.py", line 28, in mdu
    C.run()
  File "/usr/local/bin/lmod/abritamr/1.0.13/lib/python3.11/site-packages/abritamr/Collate.py", line 822, in run
    passed_match_df = self.mdu_reporting_general(match=self.match)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/bin/lmod/abritamr/1.0.13/lib/python3.11/site-packages/abritamr/Collate.py", line 769, in mdu_reporting_general
    exp_species = qcdf["SPECIES_EXP"].values[0]
                  ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: index 0 is out of bounds for axis 0 with size 0

I would suggest either turning this error into a more meaningful one in the code, or stating the requirement more clearly where the documentation describes the QC file.
Best regards,
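
A more meaningful error for the mismatch described above could look like the sketch below. It uses the standard `csv` module rather than the project's own code, assumes a tab-separated QC file with `ISOLATE` and `SPECIES_EXP` columns (the column names come from the report above; the delimiter is an assumption), and the function name `expected_species` is hypothetical.

```python
import csv
import io


def expected_species(qc_text, isolate):
    """Look up the expected species for an isolate in QC file contents.

    Raises a descriptive ValueError instead of an IndexError when the
    isolate name does not appear in the ISOLATE column.
    """
    reader = csv.DictReader(io.StringIO(qc_text), delimiter="\t")
    rows = [row for row in reader if row.get("ISOLATE") == isolate]
    if not rows:
        raise ValueError(
            f"Isolate '{isolate}' was not found in the ISOLATE column of the "
            "QC file. The name must match the isolate name in the matches file."
        )
    return rows[0]["SPECIES_EXP"]
```

Checking for an empty selection before indexing is the whole point here: the lookup itself is unchanged, but the failure now names the isolate and the column involved.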

Remove restriction on running with a different DB version

Dani and Jacqui suggested that allowing users to use a different DB with a warning would be helpful.
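
A warn-but-continue version check could be sketched like this. The function name `check_db_version`, the `strict` switch, and the logger name are all hypothetical and are not taken from the abritamr code base.

```python
import logging

logger = logging.getLogger("abritamr_sketch")  # hypothetical logger name


def check_db_version(found, expected, strict=False):
    """Return True if the DB versions match.

    On a mismatch, warn and return False so the caller can continue,
    unless strict=True, in which case a RuntimeError is raised.
    """
    if found == expected:
        return True
    message = (
        f"AMRFinder DB version is {found}, expected {expected}; "
        "binning of genes into drug classes may differ."
    )
    if strict:
        raise RuntimeError(message)
    logger.warning(message)
    return False
```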

Fix v1.0.7 tag

The v1.0.7 tag is inconsistent, there is an extra dot between the v and 1 (v.1.0.7).
Once that dot is seen, it can't be unseen. Can just add an additional tag to the same commit if removing the existing one will cause issues.

missing blaEC gene

Dieter has identified sequences where amrfinder does not report the presence of a gene where abricate does. The gene is blaEC-5.

test 2 for planner

parallel: not found

I am getting the following error:
`[CRITICAL:05/17/2023 12:59:31 PM] Your amrfinder database version is NOT 2022-08-09.1. abriTAMR will still run but behaviour may not be as expected in terms of binnig genes into the appropriate drug classes.
[INFO:05/17/2023 12:59:31 PM] You are running abritamr in batch mode. Now executing : parallel -j 16 --colsep '\t' 'mkdir -p {1} && amrfinder -n {2} -o {1}/amrfinder.out --plus --organism Enterococcus_faecalis --threads 1 -d AMRFINDER_DB' :::: nett.txt
[CRITICAL:05/17/2023 12:59:31 PM] There appears to have been a problem with running amrfinder plus. The following erro has been reported :
/bin/sh: 1: parallel: not found

[CRITICAL:05/17/2023 12:59:31 PM] The amrfinder output : LH_NE_10.fna/amrfinder.out is missing. Something has gone wrong with AMRfinder plus. Please check all inputs and try again.`

Please help me out to resolve the error.

Second, does the database work on all Enterococcus species, and how can I update the database?

Thanks,
Hassan
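
The `parallel: not found` message in the log above means the shell could not find GNU parallel on the PATH. A quick way to check for required executables before starting a batch run is sketched below with the standard library's `shutil.which`; the helper name `missing_tools` is hypothetical.

```python
import shutil


def missing_tools(tools):
    """Return the names from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # The two executables that appear in the batch-mode command above.
    missing = missing_tools(["parallel", "amrfinder"])
    if missing:
        print("Missing executables: " + ", ".join(missing))
```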

add identity option

Add in an option for users to set the min_identity value for amrfinder
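
A rough sketch of how such an option might be wired through to amrfinder follows. The `--identity` option name, the parser layout, and both function names are hypothetical; `--ident_min` is the amrfinder flag that appears in the logs further down this page, and the range check is an assumption about sensible values.

```python
import argparse


def build_parser():
    """Build a small parser with a hypothetical --identity option."""
    parser = argparse.ArgumentParser(prog="abritamr-sketch")
    parser.add_argument("--contigs", required=True, help="Path to the assembly")
    parser.add_argument("--prefix", default="abritamr", help="Output directory")
    parser.add_argument(
        "--identity",
        type=float,
        default=None,
        help="Minimum identity passed to amrfinder (fraction between 0 and 1)",
    )
    return parser


def amrfinder_command(args):
    """Assemble the amrfinder argument list from parsed options."""
    cmd = ["amrfinder", "-n", args.contigs,
           "-o", f"{args.prefix}/amrfinder.out", "--plus"]
    if args.identity is not None:
        # Assumed valid range; reject anything outside (0, 1].
        if not 0 < args.identity <= 1:
            raise ValueError("--identity must be greater than 0 and at most 1")
        cmd += ["--ident_min", str(args.identity)]
    return cmd
```

Leaving the default as `None` means amrfinder's own default identity threshold applies unless the user asks for something else.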

duplicated sample ID in the input file resulted in a large

Many duplicated rows in 'abritamr.txt'. The file size is huge, but there is no other error message.
Can we have a sanity check implemented for these duplicated sample IDs?
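
Such a sanity check could be as small as the sketch below. It assumes the batch input is tab-separated with the sample ID in the first column, as the `--colsep '\t'` batch command elsewhere on this page suggests; the function name `duplicated_ids` is hypothetical.

```python
from collections import Counter


def duplicated_ids(lines):
    """Return the sorted sample IDs that occur more than once.

    `lines` is an iterable of tab-separated input lines; blank lines
    are ignored and the first column is taken as the sample ID.
    """
    ids = [line.split("\t")[0].strip() for line in lines if line.strip()]
    return sorted(sample for sample, count in Counter(ids).items() if count > 1)
```

Calling `duplicated_ids(open("AMR_input.tsv"))` before the run and aborting when the result is non-empty would surface the problem early instead of producing an oversized output file.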

Meaningful test/validation

Add a meaningful test to ensure that abritamr detects and reports genes as expected

Path to Singularity image should be in config

Maybe change the following in the Snakemake template:

    input: "{sample}/contigs.fa"
    output: "{sample}/{sample}.out"
    shell:
        """
        singularity run ~/dev/Singularity/amrfinderplus/amrfinderplus.simg amrfinder -n {input} -o {output} -t 1
        """

To:

    input: "{sample}/contigs.fa"
    output: "{sample}/{sample}.out"
    params: 
      singularity_image=config['path_amrfinderplus_singularity']
    shell:
        """
        singularity run {params.singularity_image} amrfinder -n {input} -o {output} -t 1
        """

Add gene classes to salmonella mdu

Add chloramphenicol as a separate respect with interpretation
add ALL carbapenemases as a resmech not interpretation
verify and add Cotrimoxazole in addition to sulfathiazole and trimethoprim

abritamr should fail with non-zero exit code if the amrfinder command is not successful

I see the following error (due to missing parallel; see bioconda/bioconda-recipes#41088), but abritamr
exits with code 0 (which indicates success).

[INFO:05/22/2023 01:03:43 PM] The input file seems to be in the correct format. Thank you.
[INFO:05/22/2023 01:03:43 PM] Checking that the input data is present.
[INFO:05/22/2023 01:03:43 PM] Checking if file /tmp/tmp6j2k5pye/files/3/3/f/dataset_33fbe018-0709-481b-b4b2-3117e725ad73.dat exists
[INFO:05/22/2023 01:03:43 PM] Checking for amrfinder DB: /usr/local/lib/python3.11/site-packages/abritamr/db/amrfinderplus/data/2022-08-09.1 and comparing it to 2022-08-09.1
[INFO:05/22/2023 01:03:43 PM] You seem to have the correct AMRfinder DB setup. Well done!
[INFO:05/22/2023 01:03:43 PM] All check complete now running AMRFinder
[INFO:05/22/2023 01:03:43 PM] You are running abritamr in batch mode. Now executing : parallel -j 2 --colsep '\t' 'mkdir -p {1} && amrfinder -n {2} -o {1}/amrfinder.out --plus  --threads 1 -d /usr/local/lib/python3.11/site-packages/abritamr/db/amrfinderplus/data/2022-08-09.1 --ident_min 0.9 ' :::: /tmp/tmp6j2k5pye/job_working_directory/000/2/configs/tmp27psiuwg
[CRITICAL:05/22/2023 01:03:43 PM] There appears to have been a problem with running amrfinder plus. The following erro has been reported :
 /bin/sh: parallel: command not found
[CRITICAL:05/22/2023 01:03:43 PM] The amrfinder output : CP009102.1.fasta/amrfinder.out is missing. Something has gone wrong with AMRfinder plus. Please check all inputs and try again.
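
Propagating the failure could be sketched as below: run the command, and if it returns a non-zero status, exit with that same status rather than carrying on. The function name `run_or_exit` is hypothetical and this is not how abritamr is actually structured.

```python
import subprocess
import sys


def run_or_exit(cmd):
    """Run a shell command and exit with its status if it fails."""
    proc = subprocess.run(cmd, shell=True)
    if proc.returncode != 0:
        # Pass the child's exit status on so callers (Galaxy, CI,
        # shell scripts) can see that the run did not succeed.
        sys.exit(proc.returncode)
    return proc
```

A wrapper that only logs a CRITICAL message still looks successful to any tool that inspects the exit status, which is the behaviour reported above.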

Difficulty in running the abritamr report

After the run completed, I wanted to generate the report using the report functionality.

Unfortunately, I am receiving the following error about the QC file.

(abritamr) user@bioscience:/data/mg/abritamr$ abritamr report -m summary_matches.txt -p summary_partials.txt
[INFO:03/06/2023 10:16:02 AM] You are generating a general report
[INFO:03/06/2023 10:16:02 AM] Now checking all input files are present.
[INFO:03/06/2023 10:16:02 AM] Checking that QC is present.
[CRITICAL:03/06/2023 10:16:02 AM] The QC file supplied () does not exist. Please check your inputs and try again.

Kindly guide me on how to generate the QC file.

add license

thanks!

Issue with snakemake pipeline?

Hi,

I'm having some issues trying to run this pipeline. I've installed via conda on a nectar instance, and when I call abritamr I get the following output:

(abritamr) ubuntu@nectar2:/mnt/nectar/analyses/mdu_amr_pipeline_test$ abritamr -c contig_file.txt
[INFO:07/17/2021 03:16:37 AM] Checking the structure of your input file.
[INFO:07/17/2021 03:16:37 AM] The input file seems to be in the correct format. Thank you.
[INFO:07/17/2021 03:16:37 AM] Checking that the input data is present. If present will link to /mnt/nectar/analyses/mdu_amr_pipeline_test
[INFO:07/17/2021 03:16:37 AM] Checking if file /mnt/nectar/analyses/klebs_oxy/polishing/Kox100_round3.fasta exists
[INFO:07/17/2021 03:16:37 AM] Setting up workflow files.
[INFO:07/17/2021 03:16:37 AM] Writing config file
[INFO:07/17/2021 03:16:37 AM] Writing snakefile
[INFO:07/17/2021 03:16:37 AM] Written Snakefile and config.yaml to /mnt/nectar/analyses/mdu_amr_pipeline_test
[INFO:07/17/2021 03:16:37 AM] Running pipeline using command snakemake -s Snakefile_abritamr -j 16  2>&1 | tee -a job.log. This may take some time.
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/abritamr/bin/abritamr", line 10, in <module>
    sys.exit(main())
  File "/home/ubuntu/miniconda3/envs/abritamr/lib/python3.6/site-packages/abritamr/abritamr.py", line 112, in main
    args.func(args)
  File "/home/ubuntu/miniconda3/envs/abritamr/lib/python3.6/site-packages/abritamr/abritamr.py", line 18, in run_pipeline
    return P.run_amr()
  File "/home/ubuntu/miniconda3/envs/abritamr/lib/python3.6/site-packages/abritamr/AmrSetup.py", line 219, in run_amr
    wkflow = self.run_snakemake()
  File "/home/ubuntu/miniconda3/envs/abritamr/lib/python3.6/site-packages/abritamr/AmrSetup.py", line 191, in run_snakemake
    wkfl = subprocess.run(cmd, shell=True, capture_output=True)
  File "/home/ubuntu/miniconda3/envs/abritamr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
TypeError: __init__() got an unexpected keyword argument 'capture_output'

AMRFinder Plus is installed, and I am able to run it successfully on the genome that I've used in the above test.
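
For context on the traceback above: the `capture_output` argument of `subprocess.run` was added in Python 3.7, and the paths show a Python 3.6 environment. A minimal sketch of an equivalent call that also works on 3.6 is shown below; the wrapper name `run_captured` is hypothetical.

```python
import subprocess


def run_captured(cmd):
    """Run a shell command and capture stdout and stderr.

    Passing stdout=PIPE and stderr=PIPE does what capture_output=True
    does, but is also accepted by Python versions before 3.7.
    """
    return subprocess.run(
        cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
```

Alternatively, creating the conda environment with Python 3.7 or later should avoid the error without code changes.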

Unable to run abritamr in batch mode , giving error parallel: invalid option -- '-'

Hi,
I am unable to run abritamr in batch mode because the following error occurs. The screenshot is attached here. I followed the instructions given on the GitHub page and tried every suggestion I could find on the internet, but nothing has worked so far. If you have any suggestions, please let me know. With a single input file, abritamr runs fine.
Thanking you!
Manish Ranjan
