lczech / genesis

A library for working with phylogenetic and population genetic data.

Home Page: http://genesis-lib.org/

License: GNU General Public License v3.0

C++ 96.89% Makefile 0.03% CMake 1.71% Shell 0.49% Python 0.86% R 0.03%
c-plus-plus phylogenetics evolutionary-placement phylogenetic-data phylogenetic-trees phylogenetic-placements placement population-genetics pool-sequencing

genesis

A library for working with phylogenetic and population genetic data.

Features

Genesis is a C++ library for working with phylogenetic and population genetic data:

  • Trees
    • Read, annotate and write trees in various formats.
    • Versatile tree data structure that can store any data on the edges and nodes.
    • Easily iterate trees with different policies (e.g., postorder, preorder).
    • Directly draw trees with colored branches to SVG files.
  • Placements
    • Read, manipulate and write jplace files from phylogenetic placement analyses.
    • Manipulate placement data: extract, filter, merge, and much more.
    • Calculate distance measures (e.g., KR distance, EDPL).
    • Run analyses like k-means Clustering, Squash Clustering, Edge PCA.
    • Visualize aspects like read abundances or correlation with meta-data on the branches of the tree.
  • Populations
    • Read and work with genome mapping and variant formats such as sam/bam/cram, pileup, sync, and vcf, as well as auxiliary formats such as gff/gtf, bim/map, and bed.
    • Iterate positions in a genome, individually or in different types of windows.
    • Compute statistics such as Tajima's D and F_ST for pool sequencing data.
  • Sequences and Taxonomies
    • Read, filter, manipulate and write sequences in fasta, fastq, and phylip format.
    • Calculate consensus sequences with different methods.
    • Work with taxonomic paths and build a taxonomic hierarchy.
  • Utilities
    • Math tools (matrices, histograms, statistics functions, etc.)
    • Color support (color lists, gradients, etc., for making colored trees)
    • Various supportive file formats (bmp, csv, json, xml and more)

This is just an overview of the more prominent features. See the API reference for more.

Genesis is a library that is intended for researchers and developers who want to build their own tools and methods, or run their own custom analyses. If you are simply interested in analyzing your data with our methods, have a look at our command line tool Gappa for many common phylogenetic placement analyses.

Setup and Getting Started

For download and build instructions, see Setup.

You can find all the information for getting started with genesis in the documentation. It contains a user manual with setup instructions and tutorials, as well as the full API reference.

For bug reports and feature requests, please open an issue on our GitHub page.

For user support of the phylogenetic placement parts of the library, please see our Phylogenetic Placement Google Group. It is intended for discussions about phylogenetic placement, and for user support for our software tools, such as EPA-ng and Gappa.

Showcases

A focus of the library is working with phylogenetic placements. The following figure summarizes the placement positions of 7.5 million short reads on a reference tree with 190 taxa. The color code indicates the number of reads placed on each branch.

[Figure: Phylogenetic tree with coloured branches.]

This and other methods are presented in our manuscripts

Methods for Inference of Automatic Reference Phylogenies and Multilevel Phylogenetic Placement.
Lucas Czech, Pierre Barbera, and Alexandros Stamatakis.
Bioinformatics, 2018. https://doi.org/10.1093/bioinformatics/bty767

and

Scalable Methods for Analyzing and Visualizing Phylogenetic Placement of Metagenomic Samples.
Lucas Czech and Alexandros Stamatakis.
PLOS One, 2019. https://doi.org/10.1371/journal.pone.0217050

See there for more on what Genesis can do.

Citation

When using Genesis, please cite

Genesis and Gappa: processing, analyzing and visualizing phylogenetic (placement) data.
Lucas Czech, Pierre Barbera, and Alexandros Stamatakis.
Bioinformatics, 2020. https://doi.org/10.1093/bioinformatics/btaa070

Also, see Gappa for our command line tool to run your own analyses.

genesis's People

Contributors

bzizou, computations, frederic-mahe, lczech, pierrebarbera

genesis's Issues

Kuhner-Felsenstein distance metric

Related to #1, there could be a function for calculating the Kuhner-Felsenstein distance between two trees, as described in Kuhner and Felsenstein (1994), Equation (1).

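// Sketch: both maps are assumed to behave like a std::map from bipartition to edge length.
double distance = 0.0;
for( auto const& entry : bipartition_map_a ) {
  auto const other = bipartition_map_b.find( entry.first );
  double const other_length = ( other != bipartition_map_b.end() ) ? other->second : 0.0;
  distance += std::pow( entry.second - other_length, 2 );
}
// Bipartitions that occur only in the second tree contribute their full squared length.
for( auto const& entry : bipartition_map_b ) {
  if( bipartition_map_a.count( entry.first ) == 0 ) distance += std::pow( entry.second, 2 );
}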

Maybe this could be written in such a way that the actual distance can be easily exchanged? That is, the body of the for loop, without the find, could be passed as a functional.
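One way to make the distance exchangeable is sketched below. It is an illustration only and not part of the genesis API: the names BipartitionMap, EdgeTerm, bipartition_distance, squared_difference, and presence_mismatch are hypothetical, and a bipartition is represented by a plain string key for simplicity. The loop over the bipartitions stays fixed, while the per-edge term is passed in as a functional. Bipartitions missing from one tree are compared against an edge length of 0, and any final transformation of the sum (such as a square root) is left to the caller.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in: a bipartition is identified by a string key
// and maps to the length of the edge that induces it.
using BipartitionMap = std::map<std::string, double>;

// Per-edge term: receives the edge length in tree A and in tree B
// (0.0 if the bipartition is absent from that tree).
using EdgeTerm = std::function<double( double, double )>;

// Sum the per-edge term over the union of the bipartitions of both trees.
double bipartition_distance(
    BipartitionMap const& map_a,
    BipartitionMap const& map_b,
    EdgeTerm const& term
) {
    double sum = 0.0;
    for( auto const& entry : map_a ) {
        auto const it = map_b.find( entry.first );
        double const length_b = ( it != map_b.end() ) ? it->second : 0.0;
        sum += term( entry.second, length_b );
    }
    // Bipartitions that only occur in tree B are compared against length 0.
    for( auto const& entry : map_b ) {
        if( map_a.find( entry.first ) == map_a.end() ) {
            sum += term( 0.0, entry.second );
        }
    }
    return sum;
}

// Squared difference of edge lengths, as in the loop above.
double squared_difference( double a, double b )
{
    return ( a - b ) * ( a - b );
}

// Topology-only term: counts bipartitions present in exactly one tree.
// Simplification of this sketch: an edge of length 0 is treated as absent.
double presence_mismatch( double a, double b )
{
    return (( a > 0.0 ) != ( b > 0.0 )) ? 1.0 : 0.0;
}
```

Calling bipartition_distance( a, b, squared_difference ) gives the sum of squared edge length differences, while passing presence_mismatch instead counts the bipartitions that the two trees do not share. A lambda works just as well as a named function.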

Parsing multiple trees

It would be super awesome to be able to parse multiple Newick strings from one text file, ignoring the tree names.

Example File: 

Tree_A:
((A:2,B:3):3,D);
Tree_B:
(E:4,(A:3,C:D):3);

make: *** [Makefile:84: all] Error 2

Hi @frederic-mahe

I cloned the genesis repository and tried to install it, but when I call make in the main directory, it gives this error:

make[2]: *** [lib/genesis/CMakeFiles/genesis_lib_shared.dir/build.make:63: lib/genesis/CMakeFiles/genesis_lib_shared.dir///genesis_unity_sources/lib/all.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:144: lib/genesis/CMakeFiles/genesis_lib_shared.dir/all] Error 2
make: *** [Makefile:84: all] Error 2

It would be very helpful if you could suggest why the software is not installing.

Thank you
Vinita

Bipartition Utils

Bipartitions are a very important representation for analysing tree topologies. Maybe we can create Bipartition Utils containing:

  • extraction algorithms
  • efficient bipartition representations (Sparse BitVector Format or a Compressed BitVector Format)
  • distance functions
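As a rough illustration of two of these points, the sketch below stores a bipartition as a dense bitvector over the taxa and computes a Robinson-Foulds style distance between two sets of bipartitions. It is not taken from genesis: the names Bipartition, normalize, and rf_distance are hypothetical, std::vector<bool> stands in for the sparse or compressed formats mentioned above, and the extraction of bipartitions from a tree is not covered.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// A bipartition as a bitvector over the taxa:
// bit i is set if taxon i lies on one side of the split.
using Bipartition = std::vector<bool>;

// A split and its complement describe the same bipartition.
// Normalize so that taxon 0 is always on the unset side.
Bipartition normalize( Bipartition bip )
{
    if( !bip.empty() && bip[0] ) {
        bip.flip();
    }
    return bip;
}

// Robinson-Foulds style distance: the number of bipartitions
// that occur in exactly one of the two sets.
// Both sets are expected to contain normalized bipartitions.
std::size_t rf_distance(
    std::set<Bipartition> const& a,
    std::set<Bipartition> const& b
) {
    std::size_t count = 0;
    for( auto const& bip : a ) {
        if( b.count( bip ) == 0 ) {
            ++count;
        }
    }
    for( auto const& bip : b ) {
        if( a.count( bip ) == 0 ) {
            ++count;
        }
    }
    return count;
}
```

Normalizing before insertion means that equal splits compare equal regardless of which side was marked during extraction, so a std::set lookup is enough to match bipartitions between trees.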
