oschwengers / platon

Identification & characterization of bacterial plasmid-borne contigs from short-read draft assemblies.

Home Page: https://doi.org/10.1099/mgen.0.000398

License: GNU General Public License v3.0

Shell 8.30% Nextflow 11.55% Python 77.98% Common Workflow Language 2.17%
microbiology bioinformatics plasmids ngs wgs contigs bacteria assembly

platon's Introduction

Platon: identification and characterization of bacterial plasmid contigs from short-read draft assemblies

Description

TL;DR Platon detects plasmid-borne contigs within bacterial draft (meta)genome assemblies. To do so, it analyzes the distribution bias of protein-coding gene families among chromosomes and plasmids. This analysis is complemented by comprehensive contig characterizations followed by heuristic filters.

Platon conducts three analysis steps:

  1. It predicts protein sequences and searches them against a custom, pre-computed database comprising marker protein sequences (MPS) and related replicon distribution scores (RDS). These scores express the empirically measured bias of protein sequence family distributions among plasmids and chromosomes, pre-computed on complete NCBI RefSeq replicons. Platon calculates the mean RDS for each contig and classifies the contig as chromosome if its RDS is below a sensitivity cutoff set for 95% sensitivity, or as plasmid if its RDS is above a specificity cutoff set for 99.9% specificity. Exact values for these thresholds have been computed based on Monte Carlo simulations of artificial replicon fragments created from complete RefSeq chromosome and plasmid sequences.
  2. Contigs passing the sensitivity filter get comprehensively characterized. In this step, Platon tries to circularize the contig sequences, searches for rRNA, replication, mobilization and conjugation genes, oriT sequences and incompatibility group DNA probes, and finally performs a BLAST+ search against the NCBI plasmid database.
  3. Finally, to increase the overall sensitivity, Platon classifies all remaining contigs based on the gathered information by several heuristics.
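
The first step can be illustrated with a minimal Python sketch. It is not Platon's own code: the function names are hypothetical, the thresholds are the values quoted in the Mode section of this README, and the "undecided" label is an assumption used here for contigs that fall between the two cutoffs.

```python
# Thresholds as quoted in the Mode section of this README.
SENSITIVITY_THRESHOLD = -7.9  # below this: chromosome (95% sensitivity)
SPECIFICITY_THRESHOLD = 0.7   # above this: plasmid (99.9% specificity)


def mean_rds(protein_scores):
    """Mean RDS over the scored proteins of one contig.

    Assumption: a contig without any scored protein gets 0.0.
    """
    if not protein_scores:
        return 0.0
    return sum(protein_scores) / len(protein_scores)


def classify_by_rds(rds):
    """Three-way split of a contig by its mean RDS (hypothetical helper)."""
    if rds < SENSITIVITY_THRESHOLD:
        return "chromosome"
    if rds > SPECIFICITY_THRESHOLD:
        return "plasmid"
    # Assumption: these contigs are left for the later characterization step.
    return "undecided"


print(classify_by_rds(mean_rds([-10.0, -20.0])))
print(classify_by_rds(mean_rds([1.5, 0.9])))
print(classify_by_rds(mean_rds([])))
```

Contigs labelled "undecided" in this sketch correspond to those that steps 2 and 3 go on to characterize and classify by heuristics.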

Fig: Replicon distribution and alignment hit frequencies of MPS. Shown are summed plasmid and chromosome alignment hit frequencies per MPS plotted against plasmid/chromosome hit count ratios scaled to [-1 (chromosome), 1 (plasmid)]; Hue: normalized RDS values (min=-100, max=100); hit count outliers below 10^-4 and above 1 are discarded for the sake of readability.

Input/Output

Input

Platon accepts draft (meta) genome assemblies in fasta format. If contigs have been assembled with SPAdes, Platon is able to extract the coverage information from the contig names.
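
As an illustration of what extracting coverage from a contig name can look like, here is a small sketch. The regular expression and the function name are assumptions based on the SPAdes-style names that appear in the issue reports on this page (for example NODE_20_length_823_cov_292.701149); Platon's actual parsing may differ.

```python
import re

# Assumed SPAdes-style header: NODE_<n>_length_<len>_cov_<coverage>[_suffix]
SPADES_HEADER = re.compile(r"^NODE_\d+_length_(\d+)_cov_(\d+(?:\.\d+)?)")


def parse_spades_header(contig_id):
    """Return (length, coverage) from a SPAdes-style name, or None if it does not match."""
    match = SPADES_HEADER.match(contig_id)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


print(parse_spades_header("NODE_20_length_823_cov_292.701149"))
print(parse_spades_header("contig_1"))
```

Names that do not follow this pattern simply yield None in the sketch, so no coverage value would be available for them.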

Output

For each contig classified as plasmid sequence the following columns are printed to STDOUT as tab separated values:

  • Contig ID
  • Length
  • Coverage
  • # ORFs
  • RDS
  • Circularity
  • Incompatibility Type(s)
  • # Replication Genes
  • # Mobilization Genes
  • # OriT Sequences
  • # Conjugation Genes
  • # rRNA Genes
  • # Plasmid Database Hits
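
Because the output is tab-separated with a header row, it can be read with Python's standard csv module. The sketch below is only an example: the helper name is hypothetical, and the header and data row are shortened, made-up values rather than real Platon output.

```python
import csv
import io


def read_platon_tsv(handle):
    """Return a list with one dict per contig, keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Made-up, shortened example; real output has more columns.
example = io.StringIO(
    "ID\tLength\tCoverage\tRDS\n"
    "contig_7\t40297\t44.0\t0.59\n"
)
rows = read_platon_tsv(example)
print(len(rows), rows[0]["ID"], rows[0]["RDS"])
```

All values are returned as strings, so numeric columns such as Length or RDS need an explicit int() or float() conversion before comparison.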

In addition, Platon writes the following files into the output directory:

  • <prefix>.plasmid.fasta: contigs classified as plasmid or of plasmid origin
  • <prefix>.chromosome.fasta: contigs classified as of chromosomal origin
  • <prefix>.tsv: dense information as printed to STDOUT (see above)
  • <prefix>.json: comprehensive results and information on each single plasmid contig

All files are prefixed (<prefix>) with the name of the input genome fasta file.

Installation

Platon can be installed via BioConda or Pip. However, we encourage using Conda, which automatically installs all required 3rd party dependencies. In all cases a mandatory database must be downloaded.

BioConda

$ conda install -c conda-forge -c bioconda -c defaults platon

Pip

$ python3 -m pip install --user cb-platon

Platon requires several 3rd party executables, which must be installed and executable (see Dependencies below).

Database download

Platon requires a mandatory database which is publicly hosted at Zenodo. Further information is provided in the database section below.

$ wget https://zenodo.org/record/4066768/files/db.tar.gz
$ tar -xzf db.tar.gz
$ rm db.tar.gz

The db path can either be provided via parameter (--db) or environment variable (PLATON_DB):

$ platon --db <db-path> genome.fasta

$ export PLATON_DB=<db-path>
$ platon genome.fasta

Additionally, for a system-wide setup, the database can be copied to the Platon base directory:

$ cp -r db/ <platon-installation-dir>
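
The three ways of providing the database can be pictured as a simple lookup chain. The sketch below is an assumption about how such a lookup could work, not Platon's implementation: the function name, its arguments and the priority order (parameter, then PLATON_DB, then a db directory inside the installation directory) are all hypothetical.

```python
import os
from pathlib import Path


def resolve_db_path(cli_db=None, env=None, base_dir=None):
    """Return the first existing database directory, or None.

    Assumed order: --db parameter, PLATON_DB variable, <base_dir>/db.
    """
    env = os.environ if env is None else env
    candidates = [
        cli_db,
        env.get("PLATON_DB"),
        (Path(base_dir) / "db") if base_dir else None,
    ]
    for candidate in candidates:
        if candidate and Path(candidate).is_dir():
            return Path(candidate).resolve()
    return None


# The current directory stands in for a real database directory here.
print(resolve_db_path(cli_db=".", env={}))
print(resolve_db_path(cli_db="/nonexistent-platon-db", env={}))
```

Passing env explicitly, as in the example calls, keeps the sketch independent of whatever PLATON_DB is set to in the current shell.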

Usage

usage: platon [--db DB] [--prefix PREFIX] [--output OUTPUT] [--mode {sensitivity,accuracy,specificity}] [--characterize] [--meta] [--help] [--verbose] [--threads THREADS] [--version] <genome>

Identification and characterization of bacterial plasmid contigs from short-read draft assemblies.

Input / Output:
  <genome>              draft genome in fasta format
  --db DB, -d DB        database path (default = <platon_path>/db)
  --prefix PREFIX, -p PREFIX
                        Prefix for output files
  --output OUTPUT, -o OUTPUT
                        Output directory (default = current working directory)

Workflow:
  --mode {sensitivity,accuracy,specificity}, -m {sensitivity,accuracy,specificity}
                        applied filter mode: sensitivity: RDS only (>= 95% sensitivity); specificity: RDS only (>=99.9% specificity); accuracy: RDS & characterization heuristics (highest accuracy) (default = accuracy)
  --characterize, -c    deactivate filters; characterize all contigs
  --meta                use metagenome gene prediction mode

General:
  --help, -h            Show this help message and exit
  --verbose, -v         Print verbose information
  --threads THREADS, -t THREADS
                        Number of threads to use (default = number of available CPUs)
  --version             show program's version number and exit

Examples

Simple:

$ platon genome.fasta

Expert: writing results to results directory with verbose output using 8 threads:

$ platon --db ~/db --output results/ --verbose --threads 8 genome.fasta

Mode

Platon provides 3 different modes controlling which filters will be used. Accuracy mode is the preset default.

Sensitivity

In the sensitivity mode Platon will classify all contigs with an RDS value below the sensitivity threshold as chromosomal and all remaining contigs as plasmid. This threshold was defined to account for 95% sensitivity and computed via Monte Carlo simulations of artificial contigs, resulting in an RDS=-7.9. Use this mode to exclude chromosomal contigs.

Specificity

In the specificity mode Platon will classify all contigs with an RDS value above the specificity threshold as plasmid and all remaining contigs as chromosomal. This threshold was defined to account for 99.9% specificity and computed via Monte Carlo simulations of artificial contigs, resulting in an RDS=0.7.

Accuracy (default)

In the accuracy mode Platon will classify all contigs with:

  • an RDS value below the sensitivity threshold as chromosomal
  • an RDS value above the specificity threshold as plasmid

In addition, a contig is classified as plasmid if at least one of the following is true. It:

  • can be circularized
  • has an incompatibility group sequence
  • has a replication or mobilization HMM hit
  • has an oriT hit
  • has an RDS above the conservative score (0.1), a RefSeq plasmid hit and no rRNA hit
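
Read literally, the accuracy rules above amount to the decision function sketched below. This is a direct transcription of the list, not Platon's source code: the Contig class and its field names are hypothetical, and treating every contig that matches no rule as chromosomal is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    """Hypothetical container for the per-contig characterization results."""
    rds: float
    is_circular: bool = False
    inc_types: int = 0
    replication_hits: int = 0
    mobilization_hits: int = 0
    orit_hits: int = 0
    rrna_hits: int = 0
    plasmid_hits: int = 0


SENSITIVITY_THRESHOLD = -7.9
SPECIFICITY_THRESHOLD = 0.7
CONSERVATIVE_SCORE = 0.1


def classify_accuracy(contig):
    """Apply the accuracy-mode rules in the order they are listed."""
    if contig.rds < SENSITIVITY_THRESHOLD:
        return "chromosome"
    if contig.rds > SPECIFICITY_THRESHOLD:
        return "plasmid"
    if (contig.is_circular or contig.inc_types > 0
            or contig.replication_hits > 0 or contig.mobilization_hits > 0
            or contig.orit_hits > 0):
        return "plasmid"
    if (contig.rds > CONSERVATIVE_SCORE and contig.plasmid_hits > 0
            and contig.rrna_hits == 0):
        return "plasmid"
    # Assumption: anything not matched by a rule is treated as chromosomal.
    return "chromosome"


print(classify_accuracy(Contig(rds=-3.7, is_circular=True)))
print(classify_accuracy(Contig(rds=0.0, plasmid_hits=1, rrna_hits=1)))
```

Under these rules a contig with an RDS of 0.0, one plasmid database hit and one rRNA hit stays chromosomal, because it fails both the conservative score and the rRNA condition.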

Database

Platon depends on a custom database based on MPS, RDS, RefSeq Plasmid database, PlasmidFinder db as well as manually curated MOB HMM models from MOBscan, custom conjugation and replication HMM models and oriT sequences from MOB-suite. This database, based on UniProt UniRef90 release 202, can be downloaded here (zipped 1.6 Gb, unzipped 2.8 Gb): https://zenodo.org/record/4066768/files/db.tar.gz

Please make sure that you use the latest Platon version along with the most recent database version! Older software versions are not compatible with the latest database version.

Dependencies

Platon was developed and tested in Python 3.5 and depends on BioPython (>=1.71).

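Additionally, it depends on 3rd party executables. The dependency check in the v1.6 log output reproduced further below lists prodigal, diamond, blastn, hmmsearch, nucmer and cmscan.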

Citation

Schwengers O., Barth P., Falgenhauer L., Hain T., Chakraborty T., & Goesmann A. (2020). Platon: identification and characterization of bacterial plasmid contigs in short-read draft assemblies exploiting protein sequence-based replicon distribution scores. Microbial Genomics, 95, 295. https://doi.org/10.1099/mgen.0.000398

As Platon takes advantage of the inc groups, MOB HMMs and oriT sequences of the following databases, please also cite:

  • Carattoli A., Zankari E., Garcia-Fernandez A., Voldby Larsen M., Lund O., Villa L., Aarestrup F.M., Hasman H. (2014) PlasmidFinder and pMLST: in silico detection and typing of plasmids. Antimicrobial Agents and Chemotherapy, https://doi.org/10.1128/AAC.02412-14

  • Garcillán-Barcia M. P., Redondo-Salvo S., Vielva L., de la Cruz F. (2020) MOBscan: Automated Annotation of MOB Relaxases. Methods in Molecular Biology, https://doi.org/10.1007/978-1-4939-9877-7_21

  • Robertson J., Nash J. H. E. (2018) MOB-suite: Software Tools for Clustering, Reconstruction and Typing of Plasmids From Draft Assemblies. Microbial Genomics, https://doi.org/10.1099/mgen.0.000206

Issues

If you run into any issues with Platon, we'd be happy to hear about it! Please, start the pipeline with -v (verbose) and do not hesitate to file an issue including as much of the following as possible:

  • a detailed description of the issue
  • the platon cmd line output
  • the <prefix>.json file if possible
  • A reproducible example of the issue with a small dataset that you can share (helps us identify whether the issue is specific to a particular computer, operating system, and/or dataset).

platon's People

Contributors

franciscozorrilla, mr-c, oschwengers, patrick-barth

platon's Issues

Error: No module named 'platon.platon' As #21

Hi! I installed (and used) Platon two weeks ago and it worked without problems; then I uninstalled it. However, I needed it again, tried to install it again, and now I am getting this error (same as #21):
"line 6, in <module>
from platon.platon import main
ModuleNotFoundError: No module named 'platon.platon'"

I have tried to use conda and pip for installation (as suggested there) and I am getting the same result. Any suggestion?
Thanks in advance!

Permission denied: 'prodigal'

Hi!
Traceback (most recent call last):
  File "/root/miniconda3/envs/platon/bin/platon", line 10, in <module>
    sys.exit(main())
  File "/root/miniconda3/envs/platon/lib/python3.8/site-packages/platon/platon.py", line 60, in main
    pu.test_dependencies() # test dependencies
  File "/root/miniconda3/envs/platon/lib/python3.8/site-packages/platon/utils.py", line 125, in test_dependencies
    version = read_tool_output(dependency)
  File "/root/miniconda3/envs/platon/lib/python3.8/site-packages/platon/utils.py", line 66, in read_tool_output
    tool_output = str(sp.check_output(command, stderr=sp.STDOUT)) # stderr must be added in case the tool output is not piped into stdout
  File "/root/miniconda3/envs/platon/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/root/miniconda3/envs/platon/lib/python3.8/subprocess.py", line 493, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/root/miniconda3/envs/platon/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/root/miniconda3/envs/platon/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
PermissionError: [Errno 13] Permission denied: 'prodigal'

It keeps reporting that permission is denied. Why is this? I am a beginner and hope someone can help me. Thank you very much.

diamond version

I cannot use Platon because of the required Diamond version, which is v2.0.14.
Initially I downloaded the newest version of Diamond, v2.1.8, and had no problem with the installation,
but Platon still insists on v2.0.14. I downloaded that version, but I couldn't install it.
So, any suggestions for installing v2.0.14? Or how else can I solve this issue?

Thank you

Error: No module named 'platon.platon'

Hi,

Thank you so much for Platon software!

The installation of program was successful, I followed these steps:

$ conda install -c conda-forge -c bioconda -c defaults platon

$ python3 -m pip install --user platon

But, when I run the program, I get this error:

$ platon
Traceback (most recent call last):
  File "/home/life/miniconda3/bin/platon", line 6, in <module>
    from platon.platon import main
ModuleNotFoundError: No module named 'platon.platon'

How can I fix this bug?

Cheers

Andrea

The number of contigs in plasmid.fasta is not the same as the number of contigs in .tsv

Hi,
I found that the number of contigs in plasmid.fasta is not the same as the number of contigs in .tsv. The number of contigs in my .tsv file is 112,598 but the number of contigs in plasmid.fasta is 111,936. Do you have any idea what the problem is? which one should I use? And also the contig number in plasmid.fasta+ the contig number in chromosome.fasta < the total number of the contigs. What are the others? Thanks a lot!

Best
Xuanji
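
One way to narrow down such a mismatch is to compare the contig IDs in the two files rather than only the counts. The sketch below is a generic helper and not part of Platon; the function names are hypothetical, the tiny in-memory example is made up, and it assumes that the first TSV column holds the contig ID and that the FASTA ID is the first word after ">".

```python
def fasta_ids(lines):
    """Collect contig IDs (first word after '>') from FASTA header lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def tsv_ids(lines):
    """Collect the first column of every non-empty row after the header row."""
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    return {row.split("\t")[0] for row in rows[1:]}


# Made-up example data standing in for <prefix>.plasmid.fasta and <prefix>.tsv.
fasta = [">contig_1 some description\n", "ACGT\n", ">contig_2\n", "GGCC\n"]
tsv = ["ID\tLength\n", "contig_1\t4\n", "contig_2\t4\n", "contig_3\t4\n"]

only_in_tsv = tsv_ids(tsv) - fasta_ids(fasta)
print(sorted(only_in_tsv))
```

With real files, the two functions can be given open file handles instead of lists, since both only iterate over lines.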

genome fasta file UNRECOGNIZED

Hello so I have been trying for hours now.
I keep running into this issue where my genome.fasta file is not being recognized

code i used -
platon --db /media/mustafa/New Volume/Platon/db --output Platon_results/ --verbose --threads 8 M01094-002.fasta

I am not sure where my mistake is being made. Please help !
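
One possible cause, not confirmed in this report, is the space in "New Volume": without quoting, a POSIX shell splits the database path into two arguments. The sketch below uses Python's shlex module only to show how such a command line is tokenized; it does not run Platon.

```python
import shlex

db_path = "/media/mustafa/New Volume/Platon/db"
rest = "--output Platon_results/ --verbose --threads 8 M01094-002.fasta"

# The command as reported, with the path left unquoted.
unquoted_args = shlex.split(f"platon --db {db_path} {rest}")
# The same command with the path quoted for the shell.
quoted_args = shlex.split(f"platon --db {shlex.quote(db_path)} {rest}")

print(unquoted_args[2:4])
print(quoted_args[2])
```

In the unquoted form the path arrives as two separate arguments, so the second half could be taken for an additional positional argument; quoting the path keeps it in one piece.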

database file (rds.tsv) not readable

Hi,

I tried to use platon 1.1.0, which I installed in a new conda environment with python 3.5.
When trying to start the first analysis I encountered the following error message:

ERROR: database file (rds.tsv) not readable!

The db folder is in the platon folder, but I also gave the absolute path of the db when trying to run the program.
Additionally I tried to change the permissions on the folder, but that didn't work either.

How can this be fixed?

Best wishes,
Lisa

"Marker protein search failed!" error after execution

Describe the bug
Hi, I am using platon version 1.7 installed through mamba. As input I am using draft assemblies obtained from bacterial whole genome sequencing (enterobacteria, mainly Klebsiella) with Illumina. My issues are:

  1. What is the required length of the contigs? One of the assemblies has 170 contigs; however, 56 are analyzed based on their size.
  2. Using any of the assemblies, the result is always "Marker protein search failed!" The error that appears in the log file is: ERROR - MAIN - diamond execution failed! diamond-error-code=-11.

Any hint on how to fix it?

Best regards.

Therefore, please provide us with at least the following information:

  • what exactly happened

"Marker protein search failed!" error after execution

  • what exact command was executed: just copy-paste the command line

platon --db ../Data_Bases/db_platon/ --prefix --output platon/ --verbose --threads 8 contigs/KP882418.fasta

  • what installation of Platon did you use: BioConda, GitHub, Pip

BioConda (mamba)

  • which version of Platon was used

1.7

Incorporate latest RefSeq Reference Plasmid database files

@oschwengers @mr-c @patrick-barth

Hey, guys!

This tool is great: its recent incorporation into our workflow has offered some really insightful perspectives we were struggling to obtain prior to coming across your work. Thank you!

It looks like the RefSeq Plasmid FTP site now contains eight "plasmid.*.1.genomic.fna.gz" files and, from 'build-db.sh', it looks like the current database (https://zenodo.org/record/3924529/files/db.tar.gz?download=1) may contain only six of these. It appears as though this data were added (2020-07-09) shortly after your publication of the database (2020-06-30), unfortunately.

Can we confirm or disconfirm what "db.tar.gz" was built using?

If there are now additional "plasmid.*.1.genomic.fna.gz" files available, could we have access to an updated database?

Grateful for all your hard work, I know we're all pulled in a lot of directions right now,
Timothy

Marker protein search failed!

Hi !

When I run platon, it shows "Marker protein search failed!". What does this mean?

command line: platon 9_M_contigs.fasta --db /mibi/Wanli/plton/db -o platontest -t 8 --verbose
Options, parameters and arguments:
db path: /mibi/users/Wanli/plton/db
use bundled binaries: False
genome path: /mibi/users/Wanli/gut_micr/test_platon/9_M_contigs.fasta
output path: /mibi/users/Wanli/gut_micr/test_platon/platontest
prefix: 9_M_contigs
mode: accuracy
characterize: False
tmp path: /tmp/tmpm92_0thl
# threads: 8
parse draft genome...
exclude contig 'NODE_9_length_966_cov_1.577053_cutoff_0', too short (966)
exclude contig 'NODE_10_length_909_cov_1.908654_cutoff_0', too short (909)
exclude contig 'NODE_11_length_728_cov_0.471582_cutoff_0', too short (728)
exclude contig 'NODE_12_length_674_cov_3.328308_cutoff_0', too short (674)
exclude contig 'NODE_13_length_617_cov_10.677778_cutoff_0', too short (617)
parsed 30 raw contigs
excluded 5 contigs by size filter
analyze 25 contigs
predict ORFs...
found 474 ORFs
search marker protein sequences (MPS)...
Marker protein search failed!
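
The "too short" and "too long" messages in logs like the one above come from a length filter applied before any analysis. A minimal sketch of such a filter follows. The cutoffs of 1,000 and 500,000 bp are assumptions chosen only to be consistent with the log excerpts on this page (966 bp excluded, 1061 bp kept, 435256 bp kept, 513869 bp excluded), and the function name is hypothetical.

```python
# Assumed cutoffs, consistent with the log excerpts on this page.
MIN_CONTIG_LENGTH = 1_000
MAX_CONTIG_LENGTH = 500_000


def size_filter(contig_lengths):
    """Split {contig_id: length} into (kept, excluded) dicts."""
    kept, excluded = {}, {}
    for contig_id, length in contig_lengths.items():
        if length < MIN_CONTIG_LENGTH:
            excluded[contig_id] = "too short"
        elif length > MAX_CONTIG_LENGTH:
            excluded[contig_id] = "too long"
        else:
            kept[contig_id] = length
    return kept, excluded


kept, excluded = size_filter({
    "NODE_9_length_966_cov_1.577053_cutoff_0": 966,
    "NODE_19_length_1061_cov_304.962527": 1061,
    "NODE_2_length_513869_cov_43.096975": 513869,
})
print(sorted(kept))
print(excluded)
```

Contigs removed at this stage never reach ORF prediction or the marker protein search, which is why fewer contigs are analyzed than were parsed.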

execution problem

  • what exact command was executed: just copy-paste the command line
    platon -h

  • what installation of Platon did you use: BioConda, GitHub, Pip
    I used mamba install

  • what exactly happened
    Traceback (most recent call last):
      File "/home/liu/.local/bin/platon", line 5, in <module>
        from platon.platon import main
      File "/home/liu/.local/lib/python3.10/site-packages/platon/platon.py", line 12, in <module>
        from Bio import SeqIO
      File "/home/liu/.local/lib/python3.9/site-packages/Bio/SeqIO/__init__.py", line 374, in <module>
        from Bio.Align import MultipleSeqAlignment
      File "/home/liu/.local/lib/python3.9/site-packages/Bio/Align/__init__.py", line 29, in <module>
        raise MissingPythonDependencyError(
    Bio.MissingPythonDependencyError: Please install numpy if you want to use Bio.Align. See http://www.numpy.org

But numpy 1.25.2 is installed in this conda environment, and I also created a new environment for platon.

ERROR: genome file not readable!

Hi,

First, thank you so much for creating and developing Platon.

I have an issue installing Platon.
Indeed, using Conda/BioConda when I run the command:
$ platon --db ./db genome.fasta

I have an error message:
ERROR: genome file (/home/administrateur/Platon/genome.fasta) not readable!
And when I check in the depository, I have no such file as genome.fasta

Do you have any idea how to solve this issue?

I'm using platon 1.5.0.

However, when I run:
$ platon --help
I have the help documentation.

Thank you so much in advance for your much appreciated help!

Audrey

Issue updating Platon v1.6

Describe the bug
To help users and fix any bugs and issues, a concise description is beneficial and often necessary.

Therefore, please provide us with at least the following information:

  • what exactly happened
  • what exact command was executed: just copy-paste the command line
  • what installation of Platon did you use: BioConda, GitHub, Pip
  • which version of Platon was used

Allow specification of output file prefix (not just path)

Hi,
I would like to be able to specify not only the output directory but also the prefix of the output files. The reason is that I am using platon in a snakemake workflow and I would like to assign the proper sample name even though the contig name might just be contig.fasta.

Do you think that is possible?

Thanks,
Carlus

Strange output - is it normal?

Hello,

First of all, thanks for this tool, I think it is really important to identify plasmid-related contigs. I am running platon on a series of genome contig files. The tool is able to find several contigs, in different genomes, that are plasmid-borne.

However, for some of the genomes I get the results below, and I am not sure whether this is normal. I wish the tool printed something like >>> ANALYSIS COMPLETED SUCCESSFULLY >>> or a similar sentence, so that I could tell whether it ran correctly.

An example of what I mean is this output:

(platon) [gian@dev-amd20 code]$ platon /mnt/home/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/assembly.fasta --threads 40 --verbose --db /mnt/research/ShadeLab/gian/databases/platon_db/db/ --output /mnt/home/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/platon2/
Platon v1.6
Options and arguments:
   input: /mnt/ufs18/home-150/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/contigs_hq_fixstart.fasta
   db: /mnt/ufs18/rs-033/ShadeLab/gian/databases/platon_db/db
   output: /mnt/ufs18/home-150/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/platon2
   prefix: assembly
   mode: accuracy
   characterize: False
   tmp path: /tmp/tmpz4iqsjqa
   # threads: 40
parse draft genome...
   exclude contig 'NODE_1_length_2147114_cov_45.153046', too long (2147114)
   exclude contig 'NODE_20_length_823_cov_292.701149', too short (823)
   exclude contig 'NODE_21_length_695_cov_331.225352', too short (695)
   exclude contig 'NODE_23_length_574_cov_71.671141', too short (574)
   exclude contig 'NODE_24_length_548_cov_296.399050', too short (548)
   exclude contig 'NODE_25_length_527_cov_299.475000', too short (527)
   exclude contig 'NODE_26_length_518_cov_40.744246', too short (518)
   exclude contig 'NODE_2_length_513869_cov_43.096975', too long (513869)
   parsed 22 raw contigs
   excluded 8 contigs by size filter
   analyze 14 contigs
predict ORFs...
   found 1909 ORFs
search marker protein sequences (MPS)...
   found 1730 MPS
compute replicon distribution scores (RDS)...
apply RDS sensitivity threshold (SNT=-7.9) filter...
   excluded 9 contigs by SNT filter
characterize contigs...
ID	Length	Coverage	# ORFs	RDS	Circular	Inc Type(s)	# Replication	# Mobilization	# OriT	# Conjugation	# AMRs	# rRNAs	# Plasmid Hits

and if I look in the output directory I have:

(platon) [gian@dev-amd20 code]$ ll /mnt/home/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/platon2/
total 4.7M
-rw-r----- 1 gian ShadeLab 4.6M Jan 25 21:00 assembly.chromosome.fasta
-rw-r----- 1 gian ShadeLab    2 Jan 25 21:00 assembly.json
-rw-r----- 1 gian ShadeLab  11K Jan 25 21:00 assembly.log
-rw-r----- 1 gian ShadeLab    0 Jan 25 21:00 assembly.plasmid.fasta
-rw-r----- 1 gian ShadeLab  131 Jan 25 21:00 assembly.tsv

and

(platon) [gian@dev-amd20 code]$ cat /mnt/home/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/platon2/assembly.log 
2023-01-25 21:00:02,269 - INFO - MAIN - version 1.6
2023-01-25 21:00:02,270 - INFO - MAIN - command line: /mnt/home/gian/anaconda2/envs/platon/bin/platon /mnt/home/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/assembly.fasta --threads 40 --verbose --db /mnt/research/ShadeLab/gian/databases/platon_db/db/ --output /mnt/home/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/platon2/
2023-01-25 21:00:02,270 - INFO - CONFIG - threads=40
2023-01-25 21:00:02,270 - INFO - CONFIG - verbose=True
2023-01-25 21:00:02,270 - DEBUG - CONFIG - test parameter db: db_tmp=/mnt/research/ShadeLab/gian/databases/platon_db/db/
2023-01-25 21:00:02,271 - INFO - CONFIG - database detected: type=parameter, path=/mnt/ufs18/rs-033/ShadeLab/gian/databases/platon_db/db
2023-01-25 21:00:02,271 - INFO - CONFIG - genome-path=/mnt/ufs18/home-150/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/contigs_hq_fixstart.fasta
2023-01-25 21:00:02,272 - INFO - CONFIG - tmp-path=/tmp/tmpz4iqsjqa
2023-01-25 21:00:02,272 - INFO - CONFIG - output-path=/mnt/ufs18/home-150/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/platon2
2023-01-25 21:00:02,272 - INFO - CONFIG - mode=accuracy
2023-01-25 21:00:02,272 - INFO - CONFIG - characterize=False
2023-01-25 21:00:02,289 - INFO - UTILS - dependency check: tool=prodigal, version=v2.6.3
2023-01-25 21:00:02,295 - INFO - UTILS - dependency check: tool=diamond, version=v2.0.15
2023-01-25 21:00:02,355 - INFO - UTILS - dependency check: tool=blastn, version=v2.12.0
2023-01-25 21:00:02,359 - INFO - UTILS - dependency check: tool=hmmsearch, version=v3.3.2
2023-01-25 21:00:02,363 - INFO - UTILS - dependency check: tool=nucmer, version=v4.0.0
2023-01-25 21:00:02,368 - INFO - UTILS - dependency check: tool=cmscan, version=v1.1.4
2023-01-25 21:00:02,405 - INFO - MAIN - exclude contig: too long: id=NODE_1_length_2147114_cov_45.153046, length=2147114
2023-01-25 21:00:02,406 - INFO - MAIN - exclude contig: too short: id=NODE_20_length_823_cov_292.701149, length=823
2023-01-25 21:00:02,407 - INFO - MAIN - exclude contig: too short: id=NODE_21_length_695_cov_331.225352, length=695
2023-01-25 21:00:02,407 - INFO - MAIN - exclude contig: too short: id=NODE_23_length_574_cov_71.671141, length=574
2023-01-25 21:00:02,407 - INFO - MAIN - exclude contig: too short: id=NODE_24_length_548_cov_296.399050, length=548
2023-01-25 21:00:02,407 - INFO - MAIN - exclude contig: too short: id=NODE_25_length_527_cov_299.475000, length=527
2023-01-25 21:00:02,408 - INFO - MAIN - exclude contig: too short: id=NODE_26_length_518_cov_40.744246, length=518
2023-01-25 21:00:02,413 - INFO - MAIN - exclude contig: too long: id=NODE_2_length_513869_cov_43.096975, length=513869
2023-01-25 21:00:02,429 - INFO - MAIN - length contig filter: # input=22, # discarded=8, # remaining=14
2023-01-25 21:00:10,088 - INFO - MAIN - ORF detection: # ORFs=1909
2023-01-25 21:00:10,088 - INFO - MAIN - ORF contig filter disabled! # passed contigs=14
2023-01-25 21:00:19,642 - INFO - MAIN - MPS detection: # MPS=1730
2023-01-25 21:00:25,249 - INFO - MAIN - contig RDS: contig=NODE_10_length_166327_cov_45.482142, RDS=-23.173201, score-sum=-3499.153336, #ORFs=151
2023-01-25 21:00:25,256 - INFO - MAIN - contig RDS: contig=NODE_11_length_115072_cov_45.243195, RDS=-15.155100, score-sum=-1636.750778, #ORFs=108
2023-01-25 21:00:25,257 - INFO - MAIN - contig RDS: contig=NODE_12_length_66594_cov_47.770924, RDS=-3.705217, score-sum=-207.492140, #ORFs=56
2023-01-25 21:00:25,258 - INFO - MAIN - contig RDS: contig=NODE_13_length_60171_cov_42.250333, RDS=-40.436105, score-sum=-2223.985785, #ORFs=55
2023-01-25 21:00:25,259 - INFO - MAIN - contig RDS: contig=NODE_14_length_40297_cov_44.013343, RDS=0.590191, score-sum=17.705734, #ORFs=30
2023-01-25 21:00:25,260 - INFO - MAIN - contig RDS: contig=NODE_15_length_31833_cov_44.891125, RDS=-116.252568, score-sum=-3255.071897, #ORFs=28
2023-01-25 21:00:25,260 - INFO - MAIN - contig RDS: contig=NODE_18_length_1134_cov_125.804369, RDS=0.000000, score-sum=0.000000, #ORFs=0
2023-01-25 21:00:25,260 - INFO - MAIN - contig RDS: contig=NODE_19_length_1061_cov_304.962527, RDS=0.000000, score-sum=0.000000, #ORFs=0
2023-01-25 21:00:25,263 - INFO - MAIN - contig RDS: contig=NODE_3_length_435256_cov_45.365944, RDS=-18.422025, score-sum=-7147.745700, #ORFs=388
2023-01-25 21:00:25,266 - INFO - MAIN - contig RDS: contig=NODE_4_length_347418_cov_46.100826, RDS=-17.713544, score-sum=-5367.203949, #ORFs=303
2023-01-25 21:00:25,267 - INFO - MAIN - contig RDS: contig=NODE_6_length_258053_cov_46.234761, RDS=-15.931166, score-sum=-3377.407243, #ORFs=212
2023-01-25 21:00:25,269 - INFO - MAIN - contig RDS: contig=NODE_7_length_218603_cov_46.117441, RDS=-15.435090, score-sum=-3009.842526, #ORFs=195
2023-01-25 21:00:25,270 - INFO - MAIN - contig RDS: contig=NODE_8_length_203124_cov_45.556353, RDS=0.369013, score-sum=63.470272, #ORFs=172
2023-01-25 21:00:25,272 - INFO - MAIN - contig RDS: contig=NODE_9_length_201256_cov_45.789384, RDS=-84.505940, score-sum=-17830.753259, #ORFs=211
2023-01-25 21:00:25,273 - INFO - MAIN - RDS SNT filter: # discarded contigs=9, # remaining contigs=5
2023-01-25 21:00:25,315 - DEBUG - functions - circularity: contig=NODE_12_length_66594_cov_47.770924, len=66594, seq-a-len=33297, seq-b-len=33297
2023-01-25 21:00:25,326 - DEBUG - functions - circularity: contig=NODE_14_length_40297_cov_44.013343, len=40297, seq-a-len=20148, seq-b-len=20149
2023-01-25 21:00:25,326 - DEBUG - functions - circularity: contig=NODE_18_length_1134_cov_125.804369, len=1134, seq-a-len=567, seq-b-len=567
2023-01-25 21:00:25,334 - DEBUG - functions - circularity: contig=NODE_19_length_1061_cov_304.962527, len=1061, seq-a-len=530, seq-b-len=531
2023-01-25 21:00:25,339 - DEBUG - functions - circularity: contig=NODE_8_length_203124_cov_45.556353, len=203124, seq-a-len=101562, seq-b-len=101562
2023-01-25 21:00:25,343 - INFO - functions - circularity: contig=NODE_12_length_66594_cov_47.770924, is-circ=False
2023-01-25 21:00:25,344 - INFO - functions - circularity: contig=NODE_18_length_1134_cov_125.804369, is-circ=False
2023-01-25 21:00:25,346 - INFO - functions - circularity: contig=NODE_14_length_40297_cov_44.013343, is-circ=False
2023-01-25 21:00:25,348 - INFO - functions - circularity: contig=NODE_19_length_1061_cov_304.962527, is-circ=False
2023-01-25 21:00:25,379 - INFO - functions - circularity: contig=NODE_8_length_203124_cov_45.556353, is-circ=False
2023-01-25 21:00:25,466 - INFO - functions - oriT: contig=NODE_12_length_66594_cov_47.770924, # oriT=0
2023-01-25 21:00:25,486 - INFO - functions - oriT: contig=NODE_18_length_1134_cov_125.804369, # oriT=0
2023-01-25 21:00:25,487 - INFO - functions - oriT: contig=NODE_19_length_1061_cov_304.962527, # oriT=0
2023-01-25 21:00:25,491 - INFO - functions - oriT: contig=NODE_14_length_40297_cov_44.013343, # oriT=0
2023-01-25 21:00:25,513 - INFO - functions - oriT: contig=NODE_8_length_203124_cov_45.556353, # oriT=0
2023-01-25 21:00:25,572 - INFO - functions - inc-type: contig=NODE_19_length_1061_cov_304.962527, # inc-types=0
2023-01-25 21:00:25,578 - INFO - functions - inc-type: contig=NODE_14_length_40297_cov_44.013343, # inc-types=0
2023-01-25 21:00:25,579 - INFO - functions - inc-type: contig=NODE_8_length_203124_cov_45.556353, # inc-types=0
2023-01-25 21:00:25,591 - INFO - functions - inc-type: contig=NODE_18_length_1134_cov_125.804369, # inc-types=0
2023-01-25 21:00:25,591 - INFO - functions - inc-type: contig=NODE_12_length_66594_cov_47.770924, # inc-types=0
2023-01-25 21:00:25,763 - INFO - functions - rRNAs: contig=NODE_14_length_40297_cov_44.013343, # rRNAs=0
2023-01-25 21:00:25,770 - INFO - functions - ref plasmids: contig=NODE_12_length_66594_cov_47.770924, # ref plasmids=0
2023-01-25 21:00:25,770 - INFO - functions - ref plasmids: contig=NODE_8_length_203124_cov_45.556353, # ref plasmids=0
2023-01-25 21:00:25,850 - INFO - functions - ref plasmids: contig=NODE_14_length_40297_cov_44.013343, # ref plasmids=0
2023-01-25 21:00:26,072 - INFO - functions - rRNAs: hit! contig=NODE_18_length_1134_cov_125.804369, type=LSU_rRNA_bacteria, start=661, end=1134, strand=+
2023-01-25 21:00:26,072 - INFO - functions - rRNAs: hit! contig=NODE_18_length_1134_cov_125.804369, type=SSU_rRNA_bacteria, start=1, end=82, strand=+
2023-01-25 21:00:26,074 - INFO - functions - rRNAs: contig=NODE_18_length_1134_cov_125.804369, # rRNAs=2
2023-01-25 21:00:26,163 - INFO - functions - rRNAs: contig=NODE_12_length_66594_cov_47.770924, # rRNAs=0
2023-01-25 21:00:26,868 - INFO - functions - ref plasmids: hit! contig=NODE_19_length_1061_cov_304.962527, id=NZ_AP023206.1, c-start=6, c-end=1054, coverage=0.988690, identity=0.968541
2023-01-25 21:00:26,868 - INFO - functions - ref plasmids: contig=NODE_19_length_1061_cov_304.962527, # ref plasmids=1
2023-01-25 21:00:26,883 - INFO - functions - ref plasmids: contig=NODE_18_length_1134_cov_125.804369, # ref plasmids=0
2023-01-25 21:00:27,438 - INFO - functions - AMRs: hit! contig=NODE_8_length_203124_cov_45.556353, type=RND_permease_1-NCBI, start=10910, end=14014, strand=+
2023-01-25 21:00:27,438 - INFO - functions - AMRs: contig=NODE_8_length_203124_cov_45.556353, # AMRs=1
2023-01-25 21:00:28,073 - INFO - functions - rRNAs: hit! contig=NODE_19_length_1061_cov_304.962527, type=LSU_rRNA_bacteria, start=1, end=1035, strand=+
2023-01-25 21:00:28,076 - INFO - functions - rRNAs: contig=NODE_19_length_1061_cov_304.962527, # rRNAs=1
2023-01-25 21:00:28,476 - INFO - functions - rRNAs: contig=NODE_8_length_203124_cov_45.556353, # rRNAs=0
2023-01-25 21:00:29,547 - DEBUG - MAIN - removed tmp dir: /tmp/tmpz4iqsjqa
2023-01-25 21:00:29,548 - DEBUG - MAIN - output: tsv=/mnt/ufs18/home-150/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/platon2/assembly.tsv
2023-01-25 21:00:29,550 - DEBUG - MAIN - output: json=/mnt/ufs18/home-150/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/platon2/assembly.json
2023-01-25 21:00:29,552 - DEBUG - MAIN - output: chromosomes=/mnt/ufs18/home-150/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/platon2/assembly.chromosome.fasta
2023-01-25 21:00:29,562 - DEBUG - MAIN - output: plasmids=/mnt/ufs18/home-150/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/platon2/assembly.plasmid.fasta

and the .tsv file is empty:

(platon) [gian@dev-amd20 code]$ cat /mnt/home/gian/project_82_genomes/data/PvP012-Illumina_Pantoea_agglomerans_52616.3.395054.CTCATTGC-CTCATTGC_results/platon2/assembly.tsv 
ID	Length	Coverage	# ORFs	RDS	Circular	Inc Type(s)	# Replication	# Mobilization	# OriT	# Conjugation	# AMRs	# rRNAs	# Plasmid Hits

Thanks much!
Gian
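
A quick way to confirm that a report like the one above contains only its header is to count the data rows. The following is a minimal sketch, not part of Platon: the function name `count_report_rows` is hypothetical, and the example string is a shortened copy of the header shown in this issue.

```python
import csv
import io


def count_report_rows(handle):
    """Return (header columns, number of non-empty data rows) of a tab-separated report."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])
    # Blank lines are parsed as empty lists, so they are skipped here.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    return header, len(rows)


# Header-only report, trimmed to the first few columns for brevity.
example = "ID\tLength\tCoverage\t# ORFs\tRDS\n\n"
header, n_rows = count_report_rows(io.StringIO(example))
print(header[0], n_rows)
```

A count of zero means no contig was written to the report; it does not say at which step the contigs were dropped, which is what the log above would need to be checked for.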

ERROR - diamond execution failed! diamond-error-code=-4

Hello,
After successfully installing Platon with conda as a remote user on the server, I was excited to run my samples. However, the above error persists and I have not made any progress in fixing it.
Here is the script:
platon --db db --output platon_S2 --verbose S2.fasta

And here is the error
output
Is there any way I can fix this?
I am using contigs assembled with SPAdes.

Plasmid verification

Hi Oliver,

Thanks for developing Platon, it's really handy. Do you know if there's any way I can run a quality check on the results? Something like CheckM, which checks the quality of assembled genomes: is there a way I can verify that the results are high-quality plasmids?

Cheers

Alan

Marker protein search failed! and ERROR - MAIN - diamond execution failed! diamond-error-code=-11

I get the same error that was reported here: #12

  • Installed platon through bioconda
  • version is 1.7

Command and error message:

(platon) bayraktar@archlinux:~/Downloads$ platon --db db/ --output results/ --verbose --threads 8 E_coli_test.fasta 
Platon v1.7
Options and arguments:
        input: /home/bayraktar/Downloads/E_coli_test.fasta
        db: /home/bayraktar/Downloads/db
        output: /home/bayraktar/Downloads/results
        prefix: E_coli_test
        mode: accuracy
        characterize: False
        tmp path: /tmp/tmp1l0x7gk_
        # threads: 8
parse draft genome...
        exclude contig 'SRR6985737_202105251828_1_length_878894_cov_36.967575', too long (878894)
        exclude contig 'SRR6985737_202105251828_2_length_588050_cov_36.850970', too long (588050)
        parsed 80 raw contigs
        excluded 2 contigs by size filter
        analyze 78 contigs
predict ORFs...
        found 3655 ORFs
search marker protein sequences (MPS)...
Marker protein search failed!

Log file:

2024-03-06 12:50:59,800 - INFO - CONFIG - metagenome=False
2024-03-06 12:50:59,803 - INFO - UTILS - dependency check: tool=prodigal, version=v2.6.3
2024-03-06 12:50:59,805 - INFO - UTILS - dependency check: tool=diamond, version=v2.1.9
2024-03-06 12:50:59,856 - INFO - UTILS - dependency check: tool=blastn, version=v2.15.0
2024-03-06 12:50:59,857 - INFO - UTILS - dependency check: tool=hmmsearch, version=v3.4.0
2024-03-06 12:50:59,858 - INFO - UTILS - dependency check: tool=nucmer, version=v4.0.0
2024-03-06 12:50:59,861 - INFO - UTILS - dependency check: tool=cmscan, version=v1.1.5
2024-03-06 12:50:59,866 - INFO - MAIN - exclude contig: too long: id=SRR6985737_202105251828_1_length_878894_cov_36.967575, length=878894
2024-03-06 12:50:59,867 - INFO - MAIN - exclude contig: too long: id=SRR6985737_202105251828_2_length_588050_cov_36.850970, length=588050
2024-03-06 12:50:59,871 - INFO - MAIN - length contig filter: # input=80, # discarded=2, # remaining=78
2024-03-06 12:51:04,412 - INFO - MAIN - ORF detection: # ORFs=3655
2024-03-06 12:51:04,412 - INFO - MAIN - ORF contig filter disabled! # passed contigs=78
2024-03-06 12:51:21,723 - ERROR - MAIN - diamond execution failed! diamond-error-code=-11
2024-03-06 12:51:21,723 - DEBUG - MAIN - diamond execution: cmd=['diamond', 'blastp', '--db', '/home/bayraktar/Downloads/db/mps.dmnd', '--query', '/tmp/tmp1l0x7gk_/proteins.faa', '--out', '/tmp/tmp1l0x7gk_/diamond.tsv', '--max-target-seqs', '1', '--id', '90', '--query-cover', '80', '--subject-cover', '80', '--threads', '8', '--tmpdir', '/tmp/tmp1l0x7gk_'], stdout='', stderr='diamond v2.1.9.163 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 8
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: /tmp/tmp1l0x7gk_
#Target sequences to report alignments for: 1
Opening the database...  [0.02s]
Database: /home/bayraktar/Downloads/db/mps.dmnd (type: Diamond database, sequences: 4847438, letters: 1549533412)
Block size = 2000000000
Opening the input file...  [0s]
Opening the output file...  [0s]
Loading query sequences...  [0.003s]
Length sorting queries...  [0.001s]
Masking queries...  [0.008s]
Building query seed set...  [0.07s]
Algorithm: Query-indexed
Building query histograms...  [0.006s]
Seeking in database...  [0s]
Loading reference sequences...  [0.855s]
Length sorting reference...  [0.667s]
Initializing temporary storage...  [0.002s]
Building reference histograms...  [2.18s]
Allocating buffers...  [0s]
Processing query block 1, reference block 1/1, shape 1/2.
Building reference seed array...  [1.33s]
Building query seed array...  [0.006s]
Computing hash join...  [0.276s]
Searching alignments...  [0.423s]
Deallocating memory...  [0s]
Processing query block 1, reference block 1/1, shape 2/2.
Building reference seed array...  [1.118s]
Building query seed array...  [0.005s]
Computing hash join...  [0.25s]
Searching alignments...  [0.423s]
Deallocating memory...  [0s]
Deallocating buffers...  [0.014s]
Clearing query masking...  [0s]
Computing alignments... Loading trace points...  [0.063s]
Sorting trace points...  [0.02s]
Computing alignments... '
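
The two diamond error codes reported in this section (-4 and -11) are negative. Assuming Platon prints the return code of the child process as Python's subprocess module reports it (an assumption, not confirmed on this page), a negative value -N on POSIX systems means the process was terminated by signal N. The sketch below, with the hypothetical helper `describe_exit_code`, translates such codes into signal names.

```python
import signal


def describe_exit_code(code):
    """Translate a subprocess-style return code into a readable description."""
    if code < 0:
        try:
            name = signal.Signals(-code).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {-code} ({name})"
    if code == 0:
        return "exited normally"
    return f"exited with status {code}"


# The two codes that appear in the issues above.
for code in (-11, -4):
    print(code, describe_exit_code(code))
```

On Linux, signal 11 is SIGSEGV (a segmentation fault) and signal 4 is SIGILL (an illegal instruction). Under that reading, both failures would be crashes of the diamond binary itself rather than errors in the input data, but the logs shown here do not establish the cause.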

result are all plasmids

Hi!

When I use Platon to predict plasmid contigs, the command looks like this:
"platon {input.f} --db {params.db} --mode {params.mode} -c -o {output.f} -t {threads} -v 2>{log.err} >{log.out}"

When I checked the result, the *.plasmid.fasta file contains all contigs, which means every contig has been classified as plasmid. What happened?

This occurs regardless of the mode (accuracy or sensitivity). The version is Platon 1.4.0.

Trouble testing on chromosomes

Hi
I am testing Platon 1.6 on the E. coli chromosome (accession number CP027572.1) as well as the bacterial chromosomes CP045233.1 and CP011509.1:
platon [-c] --db /env/ig/biobank/by-soft/platon/1.6/db/ --output …/test_ecoli_c/ --verbose …/ecoli.fasta
There is no output when running in accuracy mode. When launched with -c, I get a table with one row, where the ID is the sequence ID and the RDS is negative; the chromosome.fasta file is empty, whereas the sequence is in the plasmid.fasta file.
The same thing happens when I try an input file containing both chromosome and plasmid sequences: all sequences end up in the plasmid.fasta file.
Any idea what I might be missing?
Best regards
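
The console output in other issues in this section shows contigs being dropped before analysis as "too long" (588,050 bp and 878,894 bp) or "too short" (941 bp and below). A complete chromosome is longer than either excluded contig, so it may be removed by the same size filter, which would be one possible reason for the missing output in accuracy mode. The sketch below illustrates such a filter; the thresholds of 1,000 bp and 500,000 bp are assumptions that are merely consistent with those logs, and the function names are hypothetical rather than Platon's own.

```python
MIN_LENGTH = 1_000    # assumed lower bound, not confirmed on this page
MAX_LENGTH = 500_000  # assumed upper bound, not confirmed on this page


def read_fasta_lengths(lines):
    """Yield (contig_id, sequence_length) pairs from FASTA-formatted lines."""
    contig_id, length = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if contig_id is not None:
                yield contig_id, length
            parts = line[1:].split()
            contig_id, length = (parts[0] if parts else ""), 0
        elif line:
            length += len(line)
    if contig_id is not None:
        yield contig_id, length


def size_filter(lengths, min_length=MIN_LENGTH, max_length=MAX_LENGTH):
    """Split contig ids into kept, too short and too long (bounds are inclusive)."""
    kept, too_short, too_long = [], [], []
    for contig_id, length in lengths:
        if length < min_length:
            too_short.append(contig_id)
        elif length > max_length:
            too_long.append(contig_id)
        else:
            kept.append(contig_id)
    return kept, too_short, too_long


# Toy example with small bounds so that the sequences stay readable.
fasta = [">tiny first", "ACGT", ">medium", "ACGTACGTAC", "ACGTAC", ">huge", "A" * 40]
kept, too_short, too_long = size_filter(read_fasta_lengths(fasta), min_length=5, max_length=20)
print(kept, too_short, too_long)
```

Running a check like this on the input file shows which sequences would never reach the classification step under the assumed bounds.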

Blast contig results

Hi,
thanks for developing Platon. I have run Platon to identify contigs of a specific plasmid (pESI).
Some of the identified contigs have 100% identity with several different plasmids, but Platon only reports the first one in the BLAST list.
Is there a way to report all BLAST hits for contigs that match several plasmids with identical identity?

Best,

Thorsten

Error occurs while trying to run Platon

Hi,
I installed the dependencies as well as the database, but when I try to run Platon I get an error like this:

platon --verbose --threads 4 Ecloacae4.fasta

Options, parameters and arguments:
db path: /home/abb/platon/db
use bundled binaries: True
genome path: /home/abb/Ecloacae4.fasta
output path: /home/abb
prefix: Ecloacae4
mode: accuracy
characterize: False
tmp path: /tmp/tmpga06u329
# threads: 4
parse draft genome...
exclude contig 'E.cloacae4_contig_55', too short (341)
exclude contig 'E.cloacae4_contig_56', too short (227)
exclude contig 'E.cloacae4_contig_57', too short (333)
exclude contig 'E.cloacae4_contig_58', too short (590)
exclude contig 'E.cloacae4_contig_63', too short (247)
exclude contig 'E.cloacae4_contig_78', too short (307)
exclude contig 'E.cloacae4_contig_80', too short (587)
exclude contig 'E.cloacae4_contig_81', too short (739)
exclude contig 'E.cloacae4_contig_84', too short (279)
exclude contig 'E.cloacae4_contig_85', too short (448)
exclude contig 'E.cloacae4_contig_88', too short (941)
exclude contig 'E.cloacae4_contig_90', too short (474)
exclude contig 'E.cloacae4_contig_91', too short (723)
exclude contig 'E.cloacae4_contig_92', too short (535)
exclude contig 'E.cloacae4_contig_93', too short (453)
exclude contig 'E.cloacae4_contig_94', too short (412)
exclude contig 'E.cloacae4_contig_95', too short (875)
exclude contig 'E.cloacae4_contig_96', too short (875)
exclude contig 'E.cloacae4_contig_98', too short (281)
exclude contig 'E.cloacae4_contig_103', too short (914)
exclude contig 'E.cloacae4_contig_104', too short (660)
exclude contig 'E.cloacae4_contig_105', too short (374)
exclude contig 'E.cloacae4_contig_106', too short (452)
exclude contig 'E.cloacae4_contig_107', too short (401)
exclude contig 'E.cloacae4_contig_108', too short (250)
exclude contig 'E.cloacae4_contig_110', too short (532)
exclude contig 'E.cloacae4_contig_112', too short (451)
exclude contig 'E.cloacae4_contig_114', too short (922)
exclude contig 'E.cloacae4_contig_116', too short (432)
exclude contig 'E.cloacae4_contig_117', too short (209)
exclude contig 'E.cloacae4_contig_118', too short (382)
exclude contig 'E.cloacae4_contig_120', too short (520)
exclude contig 'E.cloacae4_contig_121', too short (615)
exclude contig 'E.cloacae4_contig_122', too short (313)
exclude contig 'E.cloacae4_contig_123', too short (251)
exclude contig 'E.cloacae4_contig_124', too short (221)
exclude contig 'E.cloacae4_contig_125', too short (416)
exclude contig 'E.cloacae4_contig_126', too short (305)
exclude contig 'E.cloacae4_contig_128', too short (355)
exclude contig 'E.cloacae4_contig_129', too short (251)
exclude contig 'E.cloacae4_contig_130', too short (249)
exclude contig 'E.cloacae4_contig_131', too short (565)
exclude contig 'E.cloacae4_contig_132', too short (487)
exclude contig 'E.cloacae4_contig_133', too short (220)
exclude contig 'E.cloacae4_contig_134', too short (292)
exclude contig 'E.cloacae4_contig_135', too short (271)
exclude contig 'E.cloacae4_contig_136', too short (220)
exclude contig 'E.cloacae4_contig_137', too short (221)
exclude contig 'E.cloacae4_contig_138', too short (253)
exclude contig 'E.cloacae4_contig_139', too short (323)
exclude contig 'E.cloacae4_contig_140', too short (335)
exclude contig 'E.cloacae4_contig_142', too short (346)
exclude contig 'E.cloacae4_contig_144', too short (633)
exclude contig 'E.cloacae4_contig_145', too short (314)
exclude contig 'E.cloacae4_contig_146', too short (248)
exclude contig 'E.cloacae4_contig_147', too short (223)
exclude contig 'E.cloacae4_contig_148', too short (461)
exclude contig 'E.cloacae4_contig_149', too short (857)
exclude contig 'E.cloacae4_contig_150', too short (521)
exclude contig 'E.cloacae4_contig_151', too short (311)
exclude contig 'E.cloacae4_contig_152', too short (275)
exclude contig 'E.cloacae4_contig_153', too short (240)
exclude contig 'E.cloacae4_contig_154', too short (161)
exclude contig 'E.cloacae4_contig_155', too short (174)
exclude contig 'E.cloacae4_contig_156', too short (219)
exclude contig 'E.cloacae4_contig_157', too short (244)
exclude contig 'E.cloacae4_contig_158', too short (550)
exclude contig 'E.cloacae4_contig_159', too short (252)
exclude contig 'E.cloacae4_contig_160', too short (448)
exclude contig 'E.cloacae4_contig_161', too short (223)
exclude contig 'E.cloacae4_contig_162', too short (318)
exclude contig 'E.cloacae4_contig_163', too short (311)
exclude contig 'E.cloacae4_contig_164', too short (215)
exclude contig 'E.cloacae4_contig_165', too short (220)
exclude contig 'E.cloacae4_contig_166', too short (250)
exclude contig 'E.cloacae4_contig_167', too short (256)
exclude contig 'E.cloacae4_contig_168', too short (401)
exclude contig 'E.cloacae4_contig_169', too short (251)
exclude contig 'E.cloacae4_contig_170', too short (326)
exclude contig 'E.cloacae4_contig_171', too short (424)
exclude contig 'E.cloacae4_contig_172', too short (448)
exclude contig 'E.cloacae4_contig_173', too short (365)
exclude contig 'E.cloacae4_contig_174', too short (389)
exclude contig 'E.cloacae4_contig_175', too short (259)
exclude contig 'E.cloacae4_contig_176', too short (262)
exclude contig 'E.cloacae4_contig_177', too short (325)
exclude contig 'E.cloacae4_contig_178', too short (205)
exclude contig 'E.cloacae4_contig_179', too short (299)
exclude contig 'E.cloacae4_contig_180', too short (270)
exclude contig 'E.cloacae4_contig_181', too short (233)
exclude contig 'E.cloacae4_contig_182', too short (247)
exclude contig 'E.cloacae4_contig_183', too short (347)
exclude contig 'E.cloacae4_contig_184', too short (427)
exclude contig 'E.cloacae4_contig_185', too short (248)
exclude contig 'E.cloacae4_contig_186', too short (235)
exclude contig 'E.cloacae4_contig_187', too short (263)
exclude contig 'E.cloacae4_contig_188', too short (491)
exclude contig 'E.cloacae4_contig_189', too short (257)
exclude contig 'E.cloacae4_contig_190', too short (278)
exclude contig 'E.cloacae4_contig_191', too short (309)
exclude contig 'E.cloacae4_contig_192', too short (463)
exclude contig 'E.cloacae4_contig_193', too short (233)
exclude contig 'E.cloacae4_contig_194', too short (277)
exclude contig 'E.cloacae4_contig_195', too short (402)
exclude contig 'E.cloacae4_contig_196', too short (252)
exclude contig 'E.cloacae4_contig_197', too short (399)
exclude contig 'E.cloacae4_contig_198', too short (259)
exclude contig 'E.cloacae4_contig_199', too short (219)
exclude contig 'E.cloacae4_contig_200', too short (259)
exclude contig 'E.cloacae4_contig_201', too short (494)
exclude contig 'E.cloacae4_contig_202', too short (549)
exclude contig 'E.cloacae4_contig_203', too short (249)
exclude contig 'E.cloacae4_contig_204', too short (300)
exclude contig 'E.cloacae4_contig_205', too short (252)
exclude contig 'E.cloacae4_contig_206', too short (250)
exclude contig 'E.cloacae4_contig_207', too short (379)
exclude contig 'E.cloacae4_contig_208', too short (361)
exclude contig 'E.cloacae4_contig_209', too short (175)
exclude contig 'E.cloacae4_contig_210', too short (256)
exclude contig 'E.cloacae4_contig_211', too short (586)
exclude contig 'E.cloacae4_contig_212', too short (251)
exclude contig 'E.cloacae4_contig_213', too short (413)
exclude contig 'E.cloacae4_contig_214', too short (226)
exclude contig 'E.cloacae4_contig_215', too short (411)
exclude contig 'E.cloacae4_contig_216', too short (366)
exclude contig 'E.cloacae4_contig_217', too short (432)
exclude contig 'E.cloacae4_contig_218', too short (470)
exclude contig 'E.cloacae4_contig_219', too short (251)
exclude contig 'E.cloacae4_contig_220', too short (260)
exclude contig 'E.cloacae4_contig_221', too short (236)
exclude contig 'E.cloacae4_contig_222', too short (411)
exclude contig 'E.cloacae4_contig_223', too short (251)
exclude contig 'E.cloacae4_contig_224', too short (444)
exclude contig 'E.cloacae4_contig_225', too short (247)
exclude contig 'E.cloacae4_contig_226', too short (378)
exclude contig 'E.cloacae4_contig_227', too short (251)
exclude contig 'E.cloacae4_contig_228', too short (220)
exclude contig 'E.cloacae4_contig_229', too short (470)
exclude contig 'E.cloacae4_contig_230', too short (251)
exclude contig 'E.cloacae4_contig_231', too short (434)
exclude contig 'E.cloacae4_contig_232', too short (251)
exclude contig 'E.cloacae4_contig_233', too short (251)
exclude contig 'E.cloacae4_contig_234', too short (293)
exclude contig 'E.cloacae4_contig_235', too short (248)
exclude contig 'E.cloacae4_contig_236', too short (202)
exclude contig 'E.cloacae4_contig_237', too short (250)
exclude contig 'E.cloacae4_contig_238', too short (319)
exclude contig 'E.cloacae4_contig_239', too short (251)
exclude contig 'E.cloacae4_contig_240', too short (251)
exclude contig 'E.cloacae4_contig_241', too short (465)
exclude contig 'E.cloacae4_contig_242', too short (248)
exclude contig 'E.cloacae4_contig_243', too short (314)
exclude contig 'E.cloacae4_contig_244', too short (251)
exclude contig 'E.cloacae4_contig_245', too short (267)
exclude contig 'E.cloacae4_contig_246', too short (247)
exclude contig 'E.cloacae4_contig_247', too short (251)
exclude contig 'E.cloacae4_contig_248', too short (402)
exclude contig 'E.cloacae4_contig_249', too short (423)
exclude contig 'E.cloacae4_contig_250', too short (369)
exclude contig 'E.cloacae4_contig_251', too short (310)
exclude contig 'E.cloacae4_contig_252', too short (349)
exclude contig 'E.cloacae4_contig_253', too short (208)
exclude contig 'E.cloacae4_contig_254', too short (284)
exclude contig 'E.cloacae4_contig_255', too short (41)
exclude contig 'E.cloacae4_contig_256', too short (281)
exclude contig 'E.cloacae4_contig_257', too short (249)
exclude contig 'E.cloacae4_contig_258', too short (216)
exclude contig 'E.cloacae4_contig_259', too short (229)
exclude contig 'E.cloacae4_contig_260', too short (167)
exclude contig 'E.cloacae4_contig_261', too short (285)
exclude contig 'E.cloacae4_contig_262', too short (422)
exclude contig 'E.cloacae4_contig_263', too short (263)
exclude contig 'E.cloacae4_contig_264', too short (244)
exclude contig 'E.cloacae4_contig_265', too short (250)
exclude contig 'E.cloacae4_contig_266', too short (358)
exclude contig 'E.cloacae4_contig_267', too short (251)
exclude contig 'E.cloacae4_contig_268', too short (246)
exclude contig 'E.cloacae4_contig_269', too short (312)
exclude contig 'E.cloacae4_contig_270', too short (398)
exclude contig 'E.cloacae4_contig_271', too short (251)
exclude contig 'E.cloacae4_contig_272', too short (287)
exclude contig 'E.cloacae4_contig_273', too short (245)
exclude contig 'E.cloacae4_contig_274', too short (450)
exclude contig 'E.cloacae4_contig_275', too short (437)
exclude contig 'E.cloacae4_contig_276', too short (234)
exclude contig 'E.cloacae4_contig_277', too short (250)
exclude contig 'E.cloacae4_contig_278', too short (246)
exclude contig 'E.cloacae4_contig_279', too short (250)
exclude contig 'E.cloacae4_contig_280', too short (391)
exclude contig 'E.cloacae4_contig_281', too short (251)
exclude contig 'E.cloacae4_contig_282', too short (330)
exclude contig 'E.cloacae4_contig_283', too short (404)
exclude contig 'E.cloacae4_contig_284', too short (227)
exclude contig 'E.cloacae4_contig_285', too short (440)
exclude contig 'E.cloacae4_contig_286', too short (217)
exclude contig 'E.cloacae4_contig_287', too short (339)
exclude contig 'E.cloacae4_contig_288', too short (251)
exclude contig 'E.cloacae4_contig_289', too short (250)
exclude contig 'E.cloacae4_contig_290', too short (301)
exclude contig 'E.cloacae4_contig_291', too short (237)
exclude contig 'E.cloacae4_contig_292', too short (251)
exclude contig 'E.cloacae4_contig_293', too short (358)
exclude contig 'E.cloacae4_contig_294', too short (250)
exclude contig 'E.cloacae4_contig_295', too short (251)
exclude contig 'E.cloacae4_contig_296', too short (217)
exclude contig 'E.cloacae4_contig_297', too short (282)
exclude contig 'E.cloacae4_contig_298', too short (129)
exclude contig 'E.cloacae4_contig_299', too short (515)
exclude contig 'E.cloacae4_contig_300', too short (248)
exclude contig 'E.cloacae4_contig_301', too short (266)
exclude contig 'E.cloacae4_contig_302', too short (364)
exclude contig 'E.cloacae4_contig_303', too short (282)
exclude contig 'E.cloacae4_contig_304', too short (233)
exclude contig 'E.cloacae4_contig_305', too short (251)
exclude contig 'E.cloacae4_contig_306', too short (225)
exclude contig 'E.cloacae4_contig_307', too short (303)
exclude contig 'E.cloacae4_contig_308', too short (270)
exclude contig 'E.cloacae4_contig_309', too short (372)
exclude contig 'E.cloacae4_contig_310', too short (231)
exclude contig 'E.cloacae4_contig_311', too short (396)
exclude contig 'E.cloacae4_contig_312', too short (251)
exclude contig 'E.cloacae4_contig_313', too short (247)
exclude contig 'E.cloacae4_contig_314', too short (245)
exclude contig 'E.cloacae4_contig_315', too short (258)
exclude contig 'E.cloacae4_contig_316', too short (250)
exclude contig 'E.cloacae4_contig_317', too short (229)
exclude contig 'E.cloacae4_contig_318', too short (250)
exclude contig 'E.cloacae4_contig_319', too short (239)
exclude contig 'E.cloacae4_contig_320', too short (200)
exclude contig 'E.cloacae4_contig_321', too short (364)
exclude contig 'E.cloacae4_contig_322', too short (217)
exclude contig 'E.cloacae4_contig_323', too short (251)
exclude contig 'E.cloacae4_contig_324', too short (250)
exclude contig 'E.cloacae4_contig_325', too short (251)
exclude contig 'E.cloacae4_contig_326', too short (297)
exclude contig 'E.cloacae4_contig_327', too short (207)
exclude contig 'E.cloacae4_contig_328', too short (448)
exclude contig 'E.cloacae4_contig_329', too short (202)
exclude contig 'E.cloacae4_contig_330', too short (414)
exclude contig 'E.cloacae4_contig_331', too short (250)
exclude contig 'E.cloacae4_contig_332', too short (300)
exclude contig 'E.cloacae4_contig_333', too short (229)
exclude contig 'E.cloacae4_contig_334', too short (240)
exclude contig 'E.cloacae4_contig_335', too short (206)
exclude contig 'E.cloacae4_contig_336', too short (223)
exclude contig 'E.cloacae4_contig_337', too short (394)
exclude contig 'E.cloacae4_contig_338', too short (251)
exclude contig 'E.cloacae4_contig_339', too short (228)
exclude contig 'E.cloacae4_contig_340', too short (263)
exclude contig 'E.cloacae4_contig_341', too short (445)
exclude contig 'E.cloacae4_contig_342', too short (248)
exclude contig 'E.cloacae4_contig_343', too short (267)
exclude contig 'E.cloacae4_contig_344', too short (290)
exclude contig 'E.cloacae4_contig_345', too short (250)
exclude contig 'E.cloacae4_contig_346', too short (350)
exclude contig 'E.cloacae4_contig_347', too short (297)
exclude contig 'E.cloacae4_contig_348', too short (225)
exclude contig 'E.cloacae4_contig_349', too short (252)
exclude contig 'E.cloacae4_contig_350', too short (300)
exclude contig 'E.cloacae4_contig_351', too short (251)
exclude contig 'E.cloacae4_contig_352', too short (297)
exclude contig 'E.cloacae4_contig_353', too short (312)
exclude contig 'E.cloacae4_contig_354', too short (304)
exclude contig 'E.cloacae4_contig_355', too short (235)
exclude contig 'E.cloacae4_contig_356', too short (216)
exclude contig 'E.cloacae4_contig_357', too short (209)
exclude contig 'E.cloacae4_contig_358', too short (270)
exclude contig 'E.cloacae4_contig_359', too short (328)
exclude contig 'E.cloacae4_contig_360', too short (247)
exclude contig 'E.cloacae4_contig_361', too short (252)
exclude contig 'E.cloacae4_contig_362', too short (279)
exclude contig 'E.cloacae4_contig_363', too short (245)
exclude contig 'E.cloacae4_contig_364', too short (251)
exclude contig 'E.cloacae4_contig_365', too short (250)
exclude contig 'E.cloacae4_contig_366', too short (251)
exclude contig 'E.cloacae4_contig_367', too short (376)
exclude contig 'E.cloacae4_contig_368', too short (250)
exclude contig 'E.cloacae4_contig_369', too short (248)
exclude contig 'E.cloacae4_contig_370', too short (285)
exclude contig 'E.cloacae4_contig_371', too short (251)
exclude contig 'E.cloacae4_contig_372', too short (251)
exclude contig 'E.cloacae4_contig_373', too short (568)
exclude contig 'E.cloacae4_contig_374', too short (243)
exclude contig 'E.cloacae4_contig_375', too short (235)
exclude contig 'E.cloacae4_contig_376', too short (233)
exclude contig 'E.cloacae4_contig_377', too short (248)
exclude contig 'E.cloacae4_contig_378', too short (321)
exclude contig 'E.cloacae4_contig_379', too short (412)
exclude contig 'E.cloacae4_contig_380', too short (251)
exclude contig 'E.cloacae4_contig_381', too short (250)
exclude contig 'E.cloacae4_contig_382', too short (340)
exclude contig 'E.cloacae4_contig_383', too short (415)
exclude contig 'E.cloacae4_contig_384', too short (250)
exclude contig 'E.cloacae4_contig_385', too short (236)
exclude contig 'E.cloacae4_contig_386', too short (251)
exclude contig 'E.cloacae4_contig_387', too short (251)
exclude contig 'E.cloacae4_contig_388', too short (251)
exclude contig 'E.cloacae4_contig_389', too short (251)
exclude contig 'E.cloacae4_contig_390', too short (251)
exclude contig 'E.cloacae4_contig_391', too short (259)
exclude contig 'E.cloacae4_contig_392', too short (246)
exclude contig 'E.cloacae4_contig_393', too short (244)
exclude contig 'E.cloacae4_contig_394', too short (251)
exclude contig 'E.cloacae4_contig_395', too short (248)
exclude contig 'E.cloacae4_contig_396', too short (251)
exclude contig 'E.cloacae4_contig_397', too short (258)
exclude contig 'E.cloacae4_contig_398', too short (264)
exclude contig 'E.cloacae4_contig_399', too short (250)
exclude contig 'E.cloacae4_contig_400', too short (249)
exclude contig 'E.cloacae4_contig_401', too short (332)
exclude contig 'E.cloacae4_contig_402', too short (232)
exclude contig 'E.cloacae4_contig_403', too short (328)
exclude contig 'E.cloacae4_contig_404', too short (362)
exclude contig 'E.cloacae4_contig_405', too short (310)
exclude contig 'E.cloacae4_contig_406', too short (249)
exclude contig 'E.cloacae4_contig_407', too short (245)
exclude contig 'E.cloacae4_contig_408', too short (262)
exclude contig 'E.cloacae4_contig_409', too short (248)
exclude contig 'E.cloacae4_contig_410', too short (298)
exclude contig 'E.cloacae4_contig_411', too short (230)
exclude contig 'E.cloacae4_contig_412', too short (244)
exclude contig 'E.cloacae4_contig_413', too short (296)
exclude contig 'E.cloacae4_contig_414', too short (251)
exclude contig 'E.cloacae4_contig_415', too short (201)
exclude contig 'E.cloacae4_contig_416', too short (239)
exclude contig 'E.cloacae4_contig_417', too short (387)
exclude contig 'E.cloacae4_contig_418', too short (273)
exclude contig 'E.cloacae4_contig_419', too short (257)
exclude contig 'E.cloacae4_contig_420', too short (251)
exclude contig 'E.cloacae4_contig_421', too short (284)
exclude contig 'E.cloacae4_contig_422', too short (213)
exclude contig 'E.cloacae4_contig_423', too short (231)
exclude contig 'E.cloacae4_contig_424', too short (277)
exclude contig 'E.cloacae4_contig_425', too short (251)
exclude contig 'E.cloacae4_contig_426', too short (251)
exclude contig 'E.cloacae4_contig_427', too short (251)
exclude contig 'E.cloacae4_contig_428', too short (251)
exclude contig 'E.cloacae4_contig_429', too short (251)
exclude contig 'E.cloacae4_contig_430', too short (251)
exclude contig 'E.cloacae4_contig_431', too short (251)
exclude contig 'E.cloacae4_contig_432', too short (278)
exclude contig 'E.cloacae4_contig_433', too short (251)
exclude contig 'E.cloacae4_contig_434', too short (251)
exclude contig 'E.cloacae4_contig_435', too short (242)
exclude contig 'E.cloacae4_contig_436', too short (271)
exclude contig 'E.cloacae4_contig_437', too short (250)
exclude contig 'E.cloacae4_contig_438', too short (251)
exclude contig 'E.cloacae4_contig_439', too short (251)
exclude contig 'E.cloacae4_contig_440', too short (226)
exclude contig 'E.cloacae4_contig_441', too short (247)
exclude contig 'E.cloacae4_contig_442', too short (251)
exclude contig 'E.cloacae4_contig_443', too short (294)
exclude contig 'E.cloacae4_contig_444', too short (222)
exclude contig 'E.cloacae4_contig_445', too short (324)
exclude contig 'E.cloacae4_contig_446', too short (251)
exclude contig 'E.cloacae4_contig_447', too short (251)
exclude contig 'E.cloacae4_contig_448', too short (262)
exclude contig 'E.cloacae4_contig_449', too short (226)
exclude contig 'E.cloacae4_contig_450', too short (222)
exclude contig 'E.cloacae4_contig_451', too short (210)
exclude contig 'E.cloacae4_contig_452', too short (213)
exclude contig 'E.cloacae4_contig_453', too short (311)
exclude contig 'E.cloacae4_contig_454', too short (205)
exclude contig 'E.cloacae4_contig_455', too short (249)
exclude contig 'E.cloacae4_contig_456', too short (205)
exclude contig 'E.cloacae4_contig_457', too short (222)
exclude contig 'E.cloacae4_contig_458', too short (353)
exclude contig 'E.cloacae4_contig_459', too short (246)
exclude contig 'E.cloacae4_contig_460', too short (214)
exclude contig 'E.cloacae4_contig_461', too short (211)
exclude contig 'E.cloacae4_contig_462', too short (249)
exclude contig 'E.cloacae4_contig_463', too short (206)
exclude contig 'E.cloacae4_contig_464', too short (265)
exclude contig 'E.cloacae4_contig_465', too short (251)
exclude contig 'E.cloacae4_contig_466', too short (251)
exclude contig 'E.cloacae4_contig_467', too short (258)
exclude contig 'E.cloacae4_contig_468', too short (243)
exclude contig 'E.cloacae4_contig_469', too short (248)
exclude contig 'E.cloacae4_contig_470', too short (222)
exclude contig 'E.cloacae4_contig_471', too short (222)
exclude contig 'E.cloacae4_contig_472', too short (230)
exclude contig 'E.cloacae4_contig_473', too short (64)
exclude contig 'E.cloacae4_contig_474', too short (127)
exclude contig 'E.cloacae4_contig_475', too short (164)
exclude contig 'E.cloacae4_contig_476', too short (158)
exclude contig 'E.cloacae4_contig_477', too short (397)
exclude contig 'E.cloacae4_contig_478', too short (147)
exclude contig 'E.cloacae4_contig_479', too short (220)
exclude contig 'E.cloacae4_contig_480', too short (306)
exclude contig 'E.cloacae4_contig_481', too short (178)
parsed 481 raw contigs
excluded 390 contigs by size filter
analyze 91 contigs
predict ORFs...
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/abb/platon/platon/platon.py", line 402, in <module>
    main()
  File "/home/abb/platon/platon/platon.py", line 178, in main
    proteins_path = pf.predict_orfs(config, contigs, genome_path)
  File "/home/abb/platon/platon/functions.py", line 501, in predict_orfs
    orf_id = cols[8].split(';')[0].split('=')[1].split('_')[1]
IndexError: list index out of range

Please help me find a solution to this problem.
Thanks

IndexError: list index out of range error

Hi,
I am trying to use Platon 1.6 installed with BioConda to identify plasmid contigs. By running the following command:

platon --db ~/database/db --prefix test --output ./ test.fa

Platon reports an IndexError: list index out of range

Traceback (most recent call last):
  File "/home/cui/miniconda3/envs/platon/bin/platon", line 10, in <module>
    sys.exit(main())
  File "/home/cui/miniconda3/envs/platon/lib/python3.8/site-packages/platon/platon.py", line 153, in main
    proteins_path = pf.predict_orfs(contigs, cfg.genome_path)
  File "/home/cui/miniconda3/envs/platon/lib/python3.8/site-packages/platon/functions.py", line 492, in predict_orfs
    orf_id = cols[8].split(';')[0].split('=')[1].split('_')[1]
IndexError: list index out of range

The versions of the dependencies are:

  • Prodigal: V2.6.3
  • diamond: V2.0.15
  • blastn: V2.13.0+
  • hmmsearch: V3.3.2
  • nucmer: V4.0.0
  • cmscan: V1.1.4

The fasta file I used is test.zip
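
The failing expression in the traceback assumes that the first attribute in column 9 of every GFF line has the form `ID=<n>_<m>`. As a minimal sketch under that assumption, the parse could be made defensive so that a line of a different shape yields `None` instead of raising. The helper name `parse_orf_id` and the sample lines are hypothetical and not part of Platon, and this does not explain why the input produced such a line.

```python
from typing import Optional


def parse_orf_id(gff_line: str) -> Optional[str]:
    """Return the ORF number from a GFF line, or None if it cannot be parsed.

    Assumes (as the expression in the traceback does) that column 9 starts
    with an attribute like 'ID=1_23', where '23' is the ORF number.
    """
    if gff_line.startswith('#'):
        return None  # comment or header line
    cols = gff_line.rstrip('\n').split('\t')
    if len(cols) < 9:
        return None  # not a complete 9-column GFF record
    first_attr = cols[8].split(';')[0]
    key, sep, value = first_attr.partition('=')
    if key != 'ID' or not sep:
        return None  # first attribute is not 'ID=...'
    _, sep, orf_number = value.rpartition('_')
    if not sep or not orf_number:
        return None  # no '_<number>' suffix present
    return orf_number


# Hypothetical example lines (tab-separated GFF records).
good = 'contig_1\tProdigal_v2.6.3\tCDS\t3\t98\t5.1\t+\t0\tID=1_23;partial=10'
bad = 'contig_1\tProdigal_v2.6.3\tCDS\t3\t98\t5.1\t+\t0\tID=123;partial=10'
print(parse_orf_id(good))
print(parse_orf_id(bad))
```

Running such a check over the intermediate GFF file would at least show which line triggers the IndexError.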

Platon report Mobilization and Conjugation

Hello everyone!

In the report produced by Platon, there are columns named # Mobilization and # Conjugation.
In the # Mobilization column I get either 1 or 0, which I assume means 1 = mobilizable and 0 = non-mobilizable.
In the # Conjugation column, however, I get many different numbers such as 0, 4, 15, 16, etc. What do they mean, and how should I interpret them? I could not find any hints in the manual.
How should I interpret the results below?

  1. #Mobilization 1, #Conjugation 16
  2. #Mobilization 0 #Conjugation 2
  3. #Mobilization 1 #Conjugation 0

Any help would be appreciated.

Interpretation of results

Hi @oschwengers,

Thanks again for recommending the platon tool, this is exactly what I was looking for.
I just wanted to double check the interpretation of the results with you, below is the verbose output of one of my isolate genome assemblies:

[10:52:43 am GMT] START SAMPLE metagem_lane821s003044.fa
Platon v1.6
Options and arguments:
	input: /rds/project/rds-XUr6B1Jhndg/fz274/kost_soil/drep/metagem_lane821s003044.fa
	db: /rds/user/fz274/hpc-work/platon_db/db
	output: /rds/project/rds-XUr6B1Jhndg/fz274/kost_soil/test_platon
	prefix: metagem_lane821s003044
	mode: accuracy
	characterize: False
	tmp path: /tmp/tmpzj0_z4kw
	# threads: 32
parse draft genome...
	exclude contig 'NODE_137_length_892_cov_367.357055', too short (892)
	exclude contig 'NODE_138_length_853_cov_2176.626289', too short (853)
	exclude contig 'NODE_139_length_829_cov_371.747340', too short (829)
	exclude contig 'NODE_140_length_818_cov_1585.425101', too short (818)
	exclude contig 'NODE_141_length_811_cov_758.149864', too short (811)
	exclude contig 'NODE_142_length_789_cov_547.790730', too short (789)
	exclude contig 'NODE_143_length_704_cov_116.652313', too short (704)
	exclude contig 'NODE_144_length_629_cov_1271.644928', too short (629)
	exclude contig 'NODE_145_length_629_cov_1248.532609', too short (629)
	exclude contig 'NODE_146_length_612_cov_309.685981', too short (612)
	exclude contig 'NODE_147_length_578_cov_592.802395', too short (578)
	exclude contig 'NODE_148_length_552_cov_409.675789', too short (552)
	exclude contig 'NODE_149_length_545_cov_3045.194444', too short (545)
	exclude contig 'NODE_150_length_528_cov_10815.097561', too short (528)
	parsed 150 raw contigs
	excluded 14 contigs by size filter
	analyze 136 contigs
predict ORFs...
	found 6608 ORFs
search marker protein sequences (MPS)...
	found 661 MPS
compute replicon distribution scores (RDS)...
apply RDS sensitivity threshold (SNT=-7.9) filter...
	excluded 0 contigs by SNT filter
characterize contigs...
ID	Length	Coverage	# ORFs	RDS	Circular	Inc Type(s)	# Replication	# Mobilization	# OriT	# Conjugation	# AMRs	# rRNAs	# Plasmid Hits
NODE_62_length_22943_cov_24.692863	22943	24.7	23	0.0	no	0	2	0	0	0	0	0	0
NODE_128_length_1337_cov_418.150000	1337	418.1	1	0.1	no	0	0	0	0	0	0	0	1
[10:55:06 am GMT] DONE RUNNING SAMPLE metagem_lane821s003044.fa

The printed table at the end seems to suggest that the contig NODE_62_length_22943_cov_24.692863 did not have any plasmid hits, however this contig is included in the *.plasmid.fasta file. Could you please clarify how I should interpret these results?

To give some background on my research question: I am interested in identifying plasmid-borne contigs and then searching for any metabolic genes present in those plasmids. Would you recommend I stick with the default accuracy mode for this?

Thank you and best wishes,
Francisco

Running multiple samples

Hello. I tried running draft genome assemblies of multiple samples (different isolates) using the following code:

for D in fasta_aba/ic2/*.fasta; do N=$(basename $D .fasta) ; platon --db db --output platon --prefix $N $D; done

The outputs are for individual isolates. I have 500+ isolates and I don't think I can check every single one of them.

  1. Is there any way to summarize the results in a single csv or tsv file? (i.e. presence/absence of plasmids, with corresponding %id and coverage; no. of plasmids detected per isolate; etc.)

  2. Is it possible to include the identity of the plasmid hits in the current TSV output? The current output only gives the number of plasmid hits.

Thank you
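
As a rough sketch of the first point, the per-isolate tables could be concatenated into one file after the loop has finished. This assumes each run wrote a tab-separated `<prefix>.tsv` with a header row into the output directory; the function name `merge_platon_tables` is hypothetical and not part of Platon. It only collects the columns Platon already reports, so it does not add the identity of plasmid hits.

```python
import csv
from pathlib import Path


def merge_platon_tables(result_dir: str, merged_path: str) -> int:
    """Concatenate all '*.tsv' files in result_dir into one TSV file.

    A 'Sample' column (the file name without '.tsv') is prepended to each
    row. Returns the number of data rows written.
    """
    merged = Path(merged_path).resolve()
    # Collect the inputs first and skip the merged file itself, in case it
    # lives in the same directory.
    tables = [p for p in sorted(Path(result_dir).glob('*.tsv'))
              if p.resolve() != merged]
    rows_written = 0
    header_written = False
    with open(merged, 'w', newline='') as out_handle:
        writer = csv.writer(out_handle, delimiter='\t', lineterminator='\n')
        for table in tables:
            with open(table, newline='') as in_handle:
                reader = csv.reader(in_handle, delimiter='\t')
                header = next(reader, None)
                if header is None:
                    continue  # empty file, nothing to merge
                if not header_written:
                    writer.writerow(['Sample'] + header)
                    header_written = True
                for row in reader:
                    if row:  # ignore blank lines
                        writer.writerow([table.stem] + row)
                        rows_written += 1
    return rows_written
```

With the loop above, a call such as `merge_platon_tables('platon', 'all_isolates.tsv')` would produce one table that can be filtered or counted per isolate in a spreadsheet or with pandas.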

ERROR: database file (orit.nhr) not readable!

Hello,

I tried to run Platon by cloning it from GitHub (v1.6). I downloaded the latest database (v1.5.0).

When trying to run the command:
platon --db /platon/db --prefix t --output test fasta.fna
I received the following error:

"ERROR: database file (orit.nhr) not readable!"

In the /test/db/ directory of the GitHub repository, I can see the following database names:

  • conjugation
  • inc-types
  • mobilization
  • mps
  • ncbifam-amr
  • orit
  • refseq-plasmids
  • replication
  • rRNA

This includes the missing "orit" database the error mentions.

However, the latest database (v1.5.0) does not seem to include the orit part; it contains these:

  • conjugation
  • inc-types
  • mobilization
  • ncbifam-amr
  • protein-scores
  • refseq-bacteria-nrpc-reps_0
  • refseq-bacteria-nrpc-reps
  • refseq-plasmids
  • replication
  • rRNA

I had the same issue with the bioconda version...

I would love to use and cite the tool, but I am having trouble with it. Please let me know if you have any idea how to fix this issue.

Thanks,

Alex
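
To narrow down a mismatch like this, a small check of the database directory before starting a run can help. The sketch below is only an illustration: the helper `unreadable_db_files` is hypothetical, and the list of expected file names has to be supplied by the user, since only `orit.nhr` is known from the error message above.

```python
import os
from typing import Iterable, List


def unreadable_db_files(db_dir: str, expected_files: Iterable[str]) -> List[str]:
    """Return the names from expected_files that are missing or unreadable in db_dir."""
    problems = []
    for name in expected_files:
        path = os.path.join(db_dir, name)
        # A file counts as a problem if it does not exist or cannot be read.
        if not (os.path.isfile(path) and os.access(path, os.R_OK)):
            problems.append(name)
    return problems
```

For example, `unreadable_db_files('/platon/db', ['orit.nhr'])` returns `['orit.nhr']` if that file is absent from the downloaded database, which would confirm that the database and the Platon version do not match.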

Platon is Time-Intensive

Hi, I'm using platon for the first time with a very large metagenome (330k contigs). Although I specified 50 threads (which our computer has) when calling platon, the program still takes ~3-4 days to complete. The most time intensive part appears to be characterizing contigs. Any tips on how to speed things up?

Thanks for making a great tool!

command line arguments: platon --db platon_db --output platon_supercent --threads 50 --verbose allContigs_sorted.fa
Version 1.6
Installed with Conda

Differences between .log and plasmid.fasta file

Dear Oliver,

I have a similar problem to xuanji2017. I analysed isolates harbouring the large pESI plasmid (CP016413).
When I grep for the specific plasmid ID 'CP016413' in the .log file, I get 42 hits, but only 39 sequences appear in the plasmid.fasta file. For example, contig00080, which matches the pESI plasmid perfectly, does not appear in the plasmid.fasta file.

Do you have an idea where the discrepancies might come from?
I used platon version 1.6.

I attached the assembly file.

Best wishes,

Thorsten
P8_contigs.txt
P8_S17__out.1.contigs.txt
P8_S17__out.1.contigs.log
P8_S17__out.1.contigs.plasmid.txt

Platon can not get ORF id

Hi
I installed Platon v1.3.1 via pip and ran it on a genome assembly.
I got the error message below.

Traceback (most recent call last):
  File "/home/chen1i6c04/miniconda3/envs/py38/bin/platon", line 8, in <module>
    sys.exit(main())
  File "/home/chen1i6c04/miniconda3/envs/py38/lib/python3.8/site-packages/platon/platon.py", line 171, in main
    proteins_path = pf.predict_orfs(config, contigs, genome_path)
  File "/home/chen1i6c04/miniconda3/envs/py38/lib/python3.8/site-packages/platon/functions.py", line 501, in predict_orfs
    orf_id = cols[8].split(';')[0].split('=')[1].split('_')[1]
IndexError: list index out of range

This error seems to occur when Platon parses the ORF ID in the GFF file.
Did I do something wrong?

Thanks

Problem with running short plasmid sequences

Hi,
thanks for the platon tool, I really like it.

I have trouble running platon on a small plasmid. It is composed of two contigs only, with a total length of about 10 kbp.

Is there a limit on the contig size for running platon?

The bash command and output are:

$ platon --threads 1 --verbose --mode accuracy --output results/pT11E03068/platon --db /home/xxx/databases/platon pT11E03068.fasta
Options, parameters and arguments:
        db path: /home/xxx/databases/platon
        use bundled binaries: False
        genome path: pT11E03068.fasta
        output path: results/pT11E03068/platon
        mode: accuracy
        characterize: False
        tmp path: /tmp/tmpv1w9w9m_
        # threads: 1
parse draft genome...
        parsed 2 raw contigs
        excluded 0 contigs by size filter
        analyze 2 contigs
predict ORFs...
        found 0 ORFs
search marker protein sequences (MPS)...
Marker protein search failed!

The logfile reads:

2020-06-05 17:03:28,129 - main - INFO - version 1.3.1
2020-06-05 17:03:28,130 - functions - DEBUG - config: base-dir=/home/DenekeC/anaconda3/envs/bakcharak_amrfinder/lib/python3.6/site-packages
2020-06-05 17:03:28,130 - functions - DEBUG - config: share-dir=/home/DenekeC/anaconda3/envs/bakcharak_amrfinder/lib/python3.6/site-packages/share
2020-06-05 17:03:28,130 - functions - DEBUG - config: bundled binaries=False
2020-06-05 17:03:28,131 - main - INFO - configuration: db-path=/home/DenekeC/Snakefiles/bakcharak/databases/platon
2020-06-05 17:03:28,131 - main - INFO - configuration: bundled binaries=False
2020-06-05 17:03:28,131 - main - INFO - configuration: tmp-path=/tmp/tmpv1w9w9m_
2020-06-05 17:03:28,131 - main - INFO - parameters: genome=pT11E03068.fasta
2020-06-05 17:03:28,131 - main - INFO - parameters: mode=accuracy
2020-06-05 17:03:28,131 - main - INFO - parameters: output=results/pT11E03068/platon
2020-06-05 17:03:28,131 - main - INFO - options: characterize=False
2020-06-05 17:03:28,131 - main - INFO - options: threads=1
2020-06-05 17:03:28,138 - main - INFO - length contig filter: # input=2, # discarded=0, # remaining=2
2020-06-05 17:03:28,148 - functions - WARNING - ORFs failed! prodigal-error-code=10
2020-06-05 17:03:28,148 - functions - DEBUG - ORFs: cmd=['prodigal', '-i', 'pT11E03068.fasta', '-a', '/tmp/tmpv1w9w9m_/proteins.faa', '-c', '-f', 'gff', '-o', '/tmp/tmpv1w9w9m_/prodigal.gff'] stdout='', stderr='-------------------------------------
PRODIGAL v2.6.3 [February, 2016]
Univ of Tenn / Oak Ridge National Lab
Doug Hyatt, Loren Hauser, et al.
-------------------------------------
Request:  Single Genome, Phase:  Training
Reading in the sequence(s) to train...

Error:  Sequence must be 20000 characters (only 10512 read).
(Consider running with the -p meta option or finding more contigs from the same genome.)

'
2020-06-05 17:03:28,148 - main - INFO - ORF detection: # ORFs=0
2020-06-05 17:03:28,149 - main - INFO - ORF contig filter disabled! # passed contigs=2
2020-06-05 17:03:28,160 - main - ERROR - diamond execution failed! diamond-error-code=1
2020-06-05 17:03:28,160 - main - DEBUG - diamond execution: cmd=['diamond', 'blastp', '--db', '/home/xxx/databases/platon/mps.dmnd', '--query', 'None', '--out', '/tmp/tmpv1w9w9m_/ghostz.tsv', '--max-target-seqs', '1', '--id', '90', '--query-cover', '80', '--subject-cover', '80', '--threads', '1', '--tmpdir', '/tmp/tmpv1w9w9m_'], stdout='', stderr='diamond v0.9.32.133 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org

#CPU threads: 1
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: /tmp/tmpv1w9w9m_
Opening the database...  [0s]
#Target sequences to report alignments for: 1
Reference = databases/platon/mps.dmnd
Sequences = 4108727
Letters = 1299539496
Block size = 2000000000
Opening the input file... No such file or directory
 [0s]
Error: Error opening file None
'

Any help appreciated. Thanks
Carlus
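
The log above shows Prodigal refusing to train because only 10512 characters were read where 20000 are required. As a minimal sketch, the total sequence length of a FASTA file can be checked before a run to catch this case early. The 20000 limit is taken from the Prodigal error message; the function names are hypothetical and not part of Platon or Prodigal.

```python
# Minimum total sequence length, as stated in the Prodigal error message above.
MIN_TRAINING_LENGTH = 20000


def total_sequence_length(fasta_path: str) -> int:
    """Sum the lengths of all sequence lines in a FASTA file."""
    total = 0
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith('>'):  # skip header lines
                total += len(line.strip())
    return total


def long_enough_for_training(fasta_path: str,
                             minimum: int = MIN_TRAINING_LENGTH) -> bool:
    """Return True if the FASTA file holds at least `minimum` characters."""
    return total_sequence_length(fasta_path) >= minimum
```

For the two-contig plasmid in this report, such a check would return `False`, matching the "only 10512 read" message.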

Known plasmid genes detected in the chromosome

Hello,
This is not a technical issue. I successfully ran Platon 1.4 on a Rhizobium genome. There are very well-known symbiotic genes that usually lie on the symbiotic plasmid. However, I detected some of them on the chromosome, and they were not present on the plasmid. I assembled the genome with SPAdes. Am I doing anything wrong? I ran Platon with default settings.

Clabe

RDS is always 0.0

Hi!
I am trying to use Platon 1.6 installed with BioConda to identify plasmid contigs. By running the following command:

platon contigs.fasta --db ~/Databases/db --output platon_accu --mode accuracy --threads 8 --characterize

I got the following result (I am showing the first few lines):

ID	Length	Coverage	# ORFs	RDS	Circular	Inc Type(s)	# Replication	# Mobilization	# OriT	# Conjugation	# AMRs	# rRNAs	# Plasmid Hits
NODE_1_length_66028_cov_26.537579	66028	26.5	50	0.0	no	0	0	0	0	0	0	0	0
NODE_1_length_63294_cov_26.832935	63294	26.8	48	0.0	no	0	0	0	0	0	0	0	0
NODE_1_length_63165_cov_26.834275	63165	26.8	48	0.0	yes	0	0	0	0	0	0	0	0
NODE_1_length_51546_cov_2.360878	51546	2.4	74	0.0	yes	0	0	0	0	0	0	0	0
NODE_2_length_32011_cov_1.484036	32011	1.5	39	0.0	yes	0	0	0	0	0	0	0	0
NODE_3_length_19747_cov_141.934964	19747	141.9	3	0.0	yes	0	0	0	0	0	0	2	0

After running the same command without "--characterize", the first two contigs are classified as chromosomal and the rest as plasmids. I am not sure whether this is a bug or whether I am misunderstanding how the RDS calculation or the classification criteria work, but the RDS value for all my contigs (over a thousand of them) is always 0.0. Moreover, it looks like rRNA genes were detected in the last contig shown and its number of ORFs is very low, but it was still classified as a plasmid. Lastly, when I used the sensitivity mode, I got the same results as with the accuracy mode, but with the specificity mode all my contigs were classified as chromosomes. Is this expected behavior?

platon: error: unrecognized arguments: assembly.fasta

hi,

When running the command from the tutorial, I encountered an error. My command is:

platon -db /mnt/microbio_research/dengwei/database/platon_db/db/ assembly.fasta

And the error:

$ usage: platon [-h] [--db DB] [--mode {sensitivity,accuracy,specificity}]
              [--characterize] [--output OUTPUT] [--prefix PREFIX]
              [--threads THREADS] [--verbose] [--version]
              <genome>
platon: error: unrecognized arguments: assembly.fasta

Could you kindly give any suggestions? Thanks in advance.
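
One possible reading of this error, assuming Platon's parser defines `-d` as a short alias for `--db` (which is not confirmed anywhere in this thread): Python's argparse treats `-db` as the short option `-d` with the attached value `b`, so the database path is consumed as the genome and the actual assembly is left over. The parser below is a hypothetical stand-in, not Platon's own, and the paths are placeholders.

```python
import argparse

# Hypothetical stand-in for the real parser: one option with a short alias
# and one positional argument.
parser = argparse.ArgumentParser(prog='platon-demo')
parser.add_argument('--db', '-d')
parser.add_argument('genome')

# Single dash: '-db' is read as '-d' with the attached value 'b'.
wrong, extras = parser.parse_known_args(['-db', '/path/to/db/', 'assembly.fasta'])
print(wrong.db, wrong.genome, extras)

# Double dash: the path goes to --db and the assembly to the positional.
right = parser.parse_args(['--db', '/path/to/db/', 'assembly.fasta'])
print(right.db, right.genome)
```

If this is indeed what happens, writing `--db` with two dashes should resolve the "unrecognized arguments" message.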

Error running the package

Hi there,

I have installed the Platon package to the best of my capabilities, using Conda to create an environment and then installing the dependencies using pip.

However, when I try to run it using a draft genome assembly I obtain the following error

Traceback (most recent call last):
  File "/usr/local/bin/platon", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/platon/platon.py", line 268, in main
    filtered_contigs = {k:v for (k,v) in scored_contigs.items() if filter_contig(v) }
  File "/usr/local/lib/python3.7/site-packages/platon/platon.py", line 268, in <dictcomp>
    filtered_contigs = {k:v for (k,v) in scored_contigs.items() if filter_contig(v) }
  File "/usr/local/lib/python3.7/site-packages/platon/functions.py", line 249, in filter_contig
    if( contig['is_circular'] ):
KeyError: 'is_circular'

Could you please help me fix this issue?

Thanks in advance.

Cheers,
Pablo

--meta missing

Hi,

I'm using the biocontainer quay.io/biocontainers/platon, version 1.6--pyhdfd78af_1.
The --meta option does not seem to be among the available parameters, even though it is mentioned in the README.

~$ platon assembly.fasta --meta  --db db --mode sensitivity -t 14
usage: platon [--db DB] [--prefix PREFIX] [--output OUTPUT] [--mode {sensitivity,accuracy,specificity}] [--characterize] [--help] [--verbose] [--threads THREADS] [--version] <genome>
platon: error: unrecognized arguments: --meta

Option to see the 'hits' in the results

Hi,

Thank you for the great tool!

I'm not sure if I've missed it or not but is there an option for a more detailed report which includes, for example, which plasmids or Inc types in the database(s) were detected?

I notice that you can see this info in the log file but obviously this is not a user friendly way to do this!

kind regards

Lorcan
