This project forked from robotod/genovi

GenoVi, an automated customizable circular genome visualizer for bacteria and archaea

License: Other

Python 99.53% Dockerfile 0.47%

GenoVi: Genome Visualizer Software

GenoVi generates circular genome representations for complete, draft, and multiple bacterial and archaeal genomes. The GenoVi pipeline combines several Python scripts to automatically generate all the files Circos needs to draw circular plots, with customisable options for colour palettes, fonts, font format, background colour, and scaling for genomes comprising more than one replicon. Optionally, GenoVi's built-in workflow integrates DeepNOG to annotate COG categories using alignment-free methods with user-defined thresholds, creating COG category histograms and COG distribution plots per genome, contig or replicon, which are useful for further analyses.

Diagram

Installation

GenoVi dependencies can be installed by creating the following Bioconda environment:

conda create -n genovi python=3.7 circos 

Activate the environment

conda activate genovi

GenoVi can then be installed using pip

pip install genovi

Dependencies

  • Circos 0.69-8
  • Python 3.7 or later
  • DeepNOG 1.2.3
  • NumPy 1.20.2
  • Matplotlib 3.5.2
  • Pandas 1.2.4
  • Biopython 1.79
  • CairoSVG 2.5.2
  • Seaborn 0.12
  • Perl 5
  • List::MoreUtils (Perl library)
  • Natsort 8.2.0

Usage

genovi [-h] [options ..] -i input_file -s status

Main arguments

  • -i, --input_file. GenBank input file path.
  • -o, --output_file. Output file name. Default: genovi.
  • -s, --status. “complete” or “draft”. Complete genomes are drawn as separate circles for each contig/replicon.
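
The main arguments above can be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper named `build_genovi_command` (not part of GenoVi) and a placeholder input file name; it only uses the `-i`, `-s` and `-o` flags listed above and does not run anything by itself.

```python
import shlex

# The two values accepted by -s / --status, as listed above.
VALID_STATUS = {"complete", "draft"}


def build_genovi_command(input_file, status, output_file=None):
    """Return the argv list for a basic genovi run (hypothetical helper)."""
    if status not in VALID_STATUS:
        raise ValueError(
            f"status must be one of {sorted(VALID_STATUS)}, got {status!r}"
        )
    argv = ["genovi", "-i", input_file, "-s", status]
    if output_file is not None:
        # When -o is omitted, the documented default output name is "genovi".
        argv += ["-o", output_file]
    return argv


if __name__ == "__main__":
    # "my_genome.gbk" and "my_plot" are placeholder names for illustration.
    cmd = build_genovi_command("my_genome.gbk", "complete", output_file="my_plot")
    print(shlex.join(cmd))
    # To actually run it (requires genovi to be installed):
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file paths contain spaces.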

Information:

  • -h, --help. Shows this help message and exits.
  • --version. Shows the currently installed version of genovi.

COGs:

  • -cu, --cogs_unclassified. Do not classify each coding sequence into Clusters of Orthologous Groups of proteins (COGs).
  • --cogs COGS. Specifies which COG categories to include in the circular representation, for example 'ABJKLX'.
  • -b, --deepnog_confidence_threshold. DeepNOG confidence threshold, in the range [0, 1]. Default: 0. If provided, predictions below the threshold are discarded.
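
The effect of the confidence threshold can be illustrated with a short sketch. It assumes predictions are available as `(protein_id, cog_category, confidence)` tuples, which is a simplification for illustration and not DeepNOG's actual output format. The function names and sample data are hypothetical, and the check that categories are single uppercase letters is likewise an assumption.

```python
import string


def filter_predictions(predictions, threshold=0.0):
    """Keep predictions whose confidence is at or above the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be within [0, 1]")
    return [p for p in predictions if p[2] >= threshold]


def validate_cog_categories(categories):
    """Check that a --cogs value such as 'ABJKLX' uses single uppercase letters."""
    invalid = [c for c in categories if c not in string.ascii_uppercase]
    if invalid:
        raise ValueError(f"unexpected COG category characters: {invalid}")
    return categories


if __name__ == "__main__":
    # Made-up example records: (protein id, COG category, confidence).
    sample = [("prot_1", "J", 0.95), ("prot_2", "K", 0.40), ("prot_3", "L", 0.70)]
    print(filter_predictions(sample, threshold=0.5))
    print(validate_cog_categories("ABJKLX"))
```

With the default threshold of 0, every prediction is kept, which matches the documented default behaviour.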

Format:

  • -a, --alignment. When --status complete is specified, this flag defines the alignment of each individual contig. Options: center, top, bottom, A (first on top), < (first to the left), U (two on top, the rest below). By default, this is defined by contig sizes.
  • --scale. When using --status complete, whether to use a different scale format to ensure visibility. Options: variable, linear, sqrt. Default: sqrt.
  • -k, --keep_temporary_files. Keep temporary files.
  • -r, --reuse_predictions. If available, reuse the DeepNOG prediction results from the previous run. Useful only if the --keep_temporary_files flag is enabled.
  • -w, --window. Window size, in base pairs, for the GC analysis. Default: 5000.
  • -v, --verbose. Enables verbose, in-console log messages.
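
To make the `--window` option concrete, here is a minimal sketch of a windowed GC analysis. It uses the standard definitions GC content = (G + C) / window length and GC skew = (G - C) / (G + C). This is an illustration only and is not GenoVi's own implementation; the function name `gc_windows` is hypothetical.

```python
def gc_windows(sequence, window=5000):
    """Return (start, gc_content, gc_skew) for consecutive windows.

    The last window may be shorter than `window` if the sequence length
    is not an exact multiple of it.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of base pairs")
    seq = sequence.upper()
    results = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / len(chunk)
        # Skew is undefined when a window has no G or C; report 0.0 in that case.
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, gc_content, gc_skew))
    return results


if __name__ == "__main__":
    # Tiny toy sequence with a 4 bp window, just to show the shape of the output.
    for row in gc_windows("GGGCATATGGCC", window=4):
        print(row)
```

A smaller window gives a more detailed but noisier track, while a larger one smooths local variation.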

Text:

  • -c, --captions_not_included. Do not include captions in the figure.
  • -cp, --captions_position. Captions position. Options: left, right, auto.
  • -t, --title. Figure title.
  • --title_position. Title position. Options: center, top, bottom.
  • --italic_words. How many title words should be written in italic. Default: 2.
  • --size. Displays the genome size of each independent circular representation.
  • -te, --tracks_explain. Adds a space break in the circular representation, including captions for each track within the ideogram.

Colours:

  • -cs, --colour_scheme. Prebuilt color scheme to use for CDS, RNAs, and GC analysis. Options: strong, autumn, dawn, blossom, paradise, neutral, blue, purple, soil, grayscale, velvet, pastel, ocean, wood, beach, desert, ice, island, forest, toxic, fire, spring.
  • -bc, --background. Background colour, in R, G, B format. Default: transparent.
  • -fc, --font_colour. Font color. Default: black.
  • -pc, --CDS_positive_colour. Colour for positive CDSs, in R, G, B format. Default: '180, 205, 222'.
  • -nc, --CDS_negative_colour. Colour for negative CDSs, in R, G, B format. Default: '53, 176, 42'.
  • -tc, --tRNA_colour. Colour for tRNAs, in R, G, B format. Default: '150, 5, 50'.
  • -rc, --rRNA_colour. Colour for rRNAs, in R, G, B format. Default: '150, 150, 50'.
  • -cc, --GC_content_colour. Colour for GC content, in R, G, B format. Default: '23, 0, 115'.
  • -sc, --GC_skew_colour. Colour scheme for positive and negative GC skew. A pair of RGB colors. Default: '140, 150, 198 - 158, 188, 218'.
  • -sl, --GC_skew_line_colour. Colour for GC skew line. Default: black.
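
The "R, G, B" strings used by the colour options, and the pair format used by `--GC_skew_colour`, can be checked before being passed to the command line. The sketch below is an assumption about how such values might be validated; the helpers `parse_rgb` and `parse_rgb_pair` are hypothetical and are not part of GenoVi.

```python
def parse_rgb(text):
    """Parse a string such as '180, 205, 222' into an (r, g, b) tuple."""
    parts = [p.strip() for p in text.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated values, got {text!r}")
    values = tuple(int(p) for p in parts)
    if any(not 0 <= v <= 255 for v in values):
        raise ValueError(f"RGB components must be within 0-255, got {values}")
    return values


def parse_rgb_pair(text):
    """Parse a pair such as '140, 150, 198 - 158, 188, 218' into two tuples."""
    first, sep, second = text.partition("-")
    if not sep:
        raise ValueError(f"expected two colours separated by '-', got {text!r}")
    return parse_rgb(first), parse_rgb(second)


if __name__ == "__main__":
    print(parse_rgb("180, 205, 222"))
    print(parse_rgb_pair("140, 150, 198 - 158, 188, 218"))
```

Validating colours up front gives a clear error message before a potentially long run starts.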

More detailed information about the arguments can be found in the user guide.

Tutorials

Check the tutorials in the user guide.

Output files

Resulting images are saved in a folder called [name] as [name].svg and [name].png (name being specified with the output_file argument or, by default, genovi). For a complete genome, individual contig image files are stored in a [name] subdirectory as [name]-contig_[i].png, with i ranging from 1 to the number of circles. For draft genomes, GenoVi displays the replicons as delivered by the initial GenBank file.

Besides images, if -k or --keep_temporary_files was specified, the files described in the user guide arguments section will also be stored.

Four additional files are stored in the [name] folder: a histogram displaying COG categories, named [name]_COG_histogram.png; a file with the COG classification of each replicon, named [name]_COG_Classification.csv; a CSV file named [name]_Gral_Stats.csv with general information on each replicon, including size, GC content, and number of CDSs, tRNAs and rRNAs; and a heatmap displaying the distribution of COGs within each replicon, named [name]_COG_Classification.csv_percentage.
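
The naming scheme described in this section can be sketched as a small helper that lists the paths to look for after a run. The function `expected_outputs` is hypothetical. It assumes that the contig images sit directly inside the [name] folder, which is one reading of the description above. The heatmap is left out because its file extension is not stated.

```python
from pathlib import Path


def expected_outputs(name="genovi", n_circles=0):
    """List output paths implied by the naming scheme (hypothetical helper)."""
    folder = Path(name)
    files = [folder / f"{name}.svg", folder / f"{name}.png"]
    # Per-contig images for complete genomes, numbered from 1.
    files += [folder / f"{name}-contig_{i}.png" for i in range(1, n_circles + 1)]
    # COG histogram, COG classification table, and general statistics table.
    files += [
        folder / f"{name}_COG_histogram.png",
        folder / f"{name}_COG_Classification.csv",
        folder / f"{name}_Gral_Stats.csv",
    ]
    return files


if __name__ == "__main__":
    for path in expected_outputs("my_plot", n_circles=2):
        status = "found" if path.exists() else "missing"
        print(f"{path.as_posix()}: {status}")
```

A check like this can be useful in batch pipelines to confirm that each genome produced its full set of files.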

Additional information

For further information, please read the user guide.

Publication

Cumsille, A., Durán, R.E., Rodríguez-Delherbe, A., Saona-Urmeneta, V., Cámara, B., Seeger, M., Araya, M., Jara, N., Buil-Aranda, C. (2023). GenoVi, an open-source automated circular genome visualizer for bacteria and archaea. (Accepted)

Citation and License

GenoVi is under a BY-NC-SA Creative Commons License. Please cite Cumsille et al., 2023.

You may remix, tweak, and build upon this work non-commercially, as long as you credit this work and license your new creations under the identical terms.

genovi's People

Contributors

andresicm, arodel21, cbuil, robotod, vsaona
