cogent3 / piqtree2

This project forked from iqtree/iqtree2

A Python front end for IQ-TREE2 (efficient phylogenomic software by maximum likelihood, http://www.iqtree.org)

License: GNU General Public License v2.0

Shell 0.07% C++ 69.48% Python 0.19% Perl 0.03% C 26.27% C# 0.41% Assembly 1.03% Ada 0.65% CLIPS 0.04% Pascal 0.51% SAS 0.01% Makefile 0.19% HTML 0.22% CMake 0.64% Batchfile 0.01% DIGITAL Command Language 0.20% Module Management System 0.01% M4 0.01% Roff 0.03% Jupyter Notebook 0.01%

piqtree2's Introduction

piqtree2

piqtree2 is a library which allows you to use IQ-TREE directly from Python! The interface with Python is through cogent3 objects, as shown below.

Note

This project is still in early development. If you encounter any problems or have any feature requests, feel free to raise an issue!

Examples

Phylogenetic Reconstruction

from piqtree2 import build_tree
from cogent3 import load_aligned_seqs # Included with piqtree2!

# Load Sequences
aln = load_aligned_seqs("tests/data/example.fasta", moltype="dna")
aln = aln.take_seqs(["Human", "Chimpanzee", "Rhesus", "Mouse"])

# Reconstruct a phylogenetic tree with IQ-TREE!
tree = build_tree(aln, "JC", rand_seed=1) # Optionally specify a random seed.

print("Tree topology:", tree) # A cogent3 tree object
print("Log-likelihood:", tree.params["lnL"])
# In a Jupyter notebook, try tree.get_figure() to see a dendrogram

Note: See the cogent3 docs for examples of what you can do with cogent3 trees.

Fit Branch Lengths to Tree Topology

from piqtree2 import fit_tree
from cogent3 import load_aligned_seqs, make_tree # Included with piqtree2!

# Load Sequences
aln = load_aligned_seqs("tests/data/example.fasta", moltype="dna")
aln = aln.take_seqs(["Human", "Chimpanzee", "Rhesus", "Mouse"])

# Construct tree topology
tree = make_tree("(Human, Chimpanzee, (Rhesus, Mouse));")

# Fit branch lengths with IQ-TREE!
tree = fit_tree(aln, tree, "JC", rand_seed=1) # Optionally specify a random seed.

print("Tree with branch lengths:", tree) # A cogent3 tree object
print("Log-likelihood:", tree.params["lnL"])

Create a Collection of Random Trees

from piqtree2 import TreeGenMode, random_trees

num_taxa = 5
num_trees = 3 

# Also supports YULE_HARDING, CATERPILLAR, BALANCED, BIRTH_DEATH and STAR_TREE
tree_gen_mode = TreeGenMode.UNIFORM 

# Randomly generate trees
trees = random_trees(num_taxa, tree_gen_mode, num_trees, rand_seed=1) # Optionally specify a random seed.

print(trees) # A tuple of 3 trees with 5 taxa each.

Pairwise Robinson-Foulds Distance between Trees

from piqtree2 import robinson_foulds
from cogent3 import make_tree # Included with piqtree2!

# Construct trees
tree1 = make_tree("(a,b,(c,(d,e)));")
tree2 = make_tree("(e,b,(c,(d,a)));")
tree3 = make_tree("(a,b,(d,(c,e)));")

# Calculate pairwise distances
pairwise_distances = robinson_foulds(tree1, tree2, tree3) # Supports any number of trees (for a sequence of trees use *seq_of_trees)

print(pairwise_distances) # A numpy array containing pairwise Robinson-Foulds distances between trees
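
For readers unfamiliar with the metric, here is a minimal pure-Python sketch of how an unrooted Robinson-Foulds distance can be computed for the three trees above. It is not piqtree2's implementation (that work happens inside IQ-TREE). The nested-tuple tree encoding and the helper names `leaves`, `splits` and `rf_distance` are assumptions made purely for illustration.

```python
def leaves(node):
    """Return the set of leaf names below a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def splits(tree):
    """Collect the non-trivial bipartitions of an unrooted tree.

    Each split is stored as the side that contains a fixed reference leaf,
    so the same bipartition always has the same representation.
    """
    all_leaves = leaves(tree)
    ref = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        # Skip trivial splits (a single leaf on either side, or the whole tree).
        if 1 < len(clade) < len(all_leaves) - 1:
            found.add(clade if ref in clade else all_leaves - clade)
        for child in node:
            visit(child)

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


# The same topologies as tree1, tree2 and tree3 above, as nested tuples.
trees = [
    ("a", "b", ("c", ("d", "e"))),
    ("e", "b", ("c", ("d", "a"))),
    ("a", "b", ("d", ("c", "e"))),
]
matrix = [[rf_distance(x, y) for y in trees] for x in trees]
print(matrix)  # prints [[0, 4, 2], [4, 0, 4], [2, 4, 0]]
```

In this sketch tree1 and tree3 share one of their two internal splits, which is why their distance is 2 rather than 4.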

piqtree2's People

Contributors

dependabot[bot], gavinhuttley, khiron, rmcar17

Forkers

khiron

piqtree2's Issues

Implement multiple_probabilities_calculation in libiqtree2/src/functions

  • implement export
extern "C" {
    /**
     * Calculates multiple probabilities.
     * 
     * @param tree_file          Path to the tree file.
     * @param second_align_file  Path to the second alignment file.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* multiple_probabilities_calculation(const char* tree_file, const char* second_align_file);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function
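
The ownership note in the comment above matters on the Python side. If the export is loaded with ctypes and its return type is declared as `c_char_p`, ctypes converts the result to `bytes` and the original pointer is lost, so the memory can never be released. Below is a minimal sketch of one way to handle this. The helper `take_c_string`, the library file name and the `free_string` export mentioned in the comments are hypothetical, and a ctypes buffer stands in for the library so that the snippet runs without a compiled libiqtree2.

```python
import ctypes


def take_c_string(address, free_func):
    """Copy the NUL-terminated C string at `address` into a str, then release it."""
    if not address:
        raise RuntimeError("library returned NULL")
    try:
        return ctypes.string_at(address).decode("utf-8")
    finally:
        # Always hand the pointer back, even if decoding fails.
        free_func(address)


# With a real build, the wiring might look like this (all names hypothetical):
#   lib = ctypes.CDLL("libiqtree2.so")
#   lib.multiple_probabilities_calculation.restype = ctypes.c_void_p
#   lib.free_string.argtypes = [ctypes.c_void_p]
#   text = take_c_string(
#       lib.multiple_probabilities_calculation(b"tree.nwk", b"aln.fa"),
#       lib.free_string,
#   )

# Stand-in for the library: a buffer owned by ctypes and a fake "free"
# that only records which address it was asked to release.
buffer = ctypes.create_string_buffer(b"(A,B,(C,D));")
freed = []
result = take_c_string(ctypes.addressof(buffer), freed.append)
print(result)  # prints (A,B,(C,D));
```

Memory allocated with `new[]` inside the library has to be released by the same C++ runtime, so a design along these lines would need the library to export a matching release function.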

Move params defaults from arg parser to params class

Currently the params singleton is set to its default state externally, in tools.parseArg.

This behaviour is local to the singleton, so it can be implemented inside the singleton class. So we'll add a method void setDefault() on Params that initializes the singleton to the default state, and call Params.setDefault() from tools.parseArg.

@bqminh This is a refactoring to allow a default state for Params independent from parsing a command line (and so necessary for automating IQTree2 from Cogent3). But it might also be a useful refactoring for the parsing code in parseArg.
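
The shape of this refactoring can be sketched in Python. This is only an analogy for the C++ change, not the actual code: the fields `rand_seed` and `num_threads`, the command-line flags and the function name `parse_arg` are illustrative assumptions.

```python
import argparse


class Params:
    """Singleton holding run parameters; it owns its own defaults."""

    _instance = None

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = cls()
            cls._instance.set_default()
        return cls._instance

    def set_default(self):
        # Illustrative fields only; the real Params has many more members.
        self.rand_seed = 0
        self.num_threads = 1


def parse_arg(argv):
    """Reset the singleton, then overlay whatever the command line provides."""
    params = Params.get_instance()
    params.set_default()  # the parser no longer knows any default values
    parser = argparse.ArgumentParser()
    parser.add_argument("--seed", type=int, dest="rand_seed")
    parser.add_argument("--threads", type=int, dest="num_threads")
    for name, value in vars(parser.parse_args(argv)).items():
        if value is not None:
            setattr(params, name, value)
    return params


params = parse_arg(["--seed", "42"])
print(params.rand_seed, params.num_threads)  # prints 42 1

# A library caller can now get a clean default state without any parsing.
Params.get_instance().set_default()
print(params.rand_seed)  # prints 0
```

The point of the design is the last two lines: a caller that never touches a command line can still put the singleton into a known state.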

Implement generate_random_tree in libiqtree2/src/functions

  • implement export
extern "C" {
    /**
     * Generates a random phylogenetic tree.
     * 
     * @param num_taxa          Number of taxa for the random tree.
     * @param branch_length_mode Mode for generating branch lengths.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* random_tree_generation(int num_taxa, const char* branch_length_mode);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Implement print_area in iqtree2_core

  • implement export
extern "C" {
    /**
     * Prints the area of a given tree.
     * 
     * @param tree_file Path to the tree file.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* print_area(const char* tree_file);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Implement scale_branch_length in iqtree2_core

  • implement export
extern "C" {
    /**
     * Scales the branch lengths of a given tree by a specified factor.
     * 
     * @param tree_file Path to the tree file.
     * @param scale_factor Factor to scale the branch lengths.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* scale_branch_length(const char* tree_file, float scale_factor);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function
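
As an illustration of what a mocked placeholder and its pytest could check, here is a small Python sketch that scales branch lengths in a Newick string. It rests on several assumptions: it takes the Newick text directly rather than a file path, the name `scale_branch_lengths` is hypothetical, and the regular expression only handles plain numeric lengths (quoted labels containing colons would confuse it). It says nothing about how IQ-TREE itself performs the scaling.

```python
import re

# A branch length is the number that follows a colon in Newick notation.
BRANCH_LENGTH = re.compile(r":(\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)")


def scale_branch_lengths(newick, scale_factor):
    """Multiply every branch length in a Newick string by scale_factor."""

    def rescale(match):
        return ":" + repr(float(match.group(1)) * scale_factor)

    return BRANCH_LENGTH.sub(rescale, newick)


scaled = scale_branch_lengths("(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);", 2.0)
print(scaled)  # prints (A:0.2,B:0.4,(C:0.6,D:0.8):1.0);
```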

Implement all_nni_trees in iqtree2_core

  • implement export
extern "C" {
    /**
     * Generates all possible Nearest Neighbor Interchange (NNI) trees from the given tree.
     * 
     * @param tree_file    Path to the tree file.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* all_nni_trees(const char* tree_file);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Implement guided_bootstrap in iqtree2_core

  • implement export
extern "C" {
    /**
     * Performs guided bootstrap analysis.
     * 
     * @param tree_file    Path to the tree file.
     * @param siteLL_file  Path to the site log-likelihood file.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* guided_bootstrap(const char* tree_file, const char* siteLL_file);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Add build target: libiqtree2

  • create a iqtree2_core.cpp module that exports a placeholder function
  • modify CMakeLists.txt to add_library
  • build
  • create a pytest in /pyqtree2/tests that automates the library and calls the placeholder function

Implement eco_pd_analysis in iqtree2_core

  • implement export
extern "C" {
    /**
     * Performs ECO phylogenetic diversity (PD) analysis.
     * 
     * @param tree_file    Path to the tree file.
     * @param eco_dag_file Path to the ECO Directed Acyclic Graph (DAG) file.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* eco_pd_analysis(const char* tree_file, const char* eco_dag_file);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Implement branch_statistics in iqtree2_core

  • implement export
extern "C" {
    /**
     * Calculates various statistics for the branches of the phylogenetic tree specified in the input file.
     * 
     * @param tree_file Path to the tree file.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* branch_statistics(const char* tree_file);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Implement phylogenetic_analysis in libiqtree2/src/functions

  • implement export
extern "C" {
    /**
     * Performs a phylogenetic analysis.
     * 
     * @param aln_file        Path to the alignment file.
     * @param partition_file  Path to the partition file.
     * @param tree_file       Path to the initial tree file.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* phylogenetic_analysis(
        const char* aln_file, 
        const char* partition_file, 
        const char* tree_file 
    );
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Implement print_taxa in iqtree2_core

  • implement export
extern "C" {
    /**
     * Prints the taxa of a given tree.
     * 
     * @param tree_file Path to the tree file.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* print_taxa(const char* tree_file);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Implement RF_distance_calculation in libiqtree2/src/functions

  • implement export
extern "C" {
    /**
     * Calculates the Robinson-Foulds distance between two phylogenetic trees.
     * 
     * @param tree1_file        Path to the first tree file.
     * @param tree2_file        Path to the second tree file.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* RF_distance_calculation(const char* tree1_file, const char* tree2_file);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Add libiqtree2 functions to marshall a JSON representation of parameters and their values across the process boundary

extern "C" {
    void set_params(const char* json_str) {
        Params& params = Params::getInstance();
        json j = json::parse(json_str);  // Parse JSON string
        params.from_json(j);  // Update Params with JSON data
    }

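    const char* get_params() {
        Params& params = Params::getInstance();
        std::string str = params.to_json().dump();
        // Copy to the heap: the pointer from dump().c_str() dangles once the temporary string is destroyed
        char* cstr = new char[str.length() + 1];
        std::strcpy(cstr, str.c_str());
        return cstr;  // Caller is responsible for deleting this memory
    }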

    char* get_param(const char* name) {
        try {
            Params& params = Params::getInstance();
            json j = params.get_param(name);
            std::string str = j.dump();
            char* cstr = new char[str.length() + 1];
            std::strcpy(cstr, str.c_str());
            return cstr;  // Caller is responsible for deleting this memory
        } catch (const std::exception& e) {
            // Handle error 
            return NULL;
        }
    }
}
  • Choose a trial set of Params methods
  • Implement Params::to_json, Params::from_json, Params::get_param methods
#include "json.hpp"
using json = nlohmann::json;

class Params {
public:
    // ...
    json to_json() const;
    void from_json(const json& j);
    json get_param(const std::string& name) const;
    // ...
};

json Params::to_json() const {
    json j;
    j["fai"] = fai;   // this is required for each member variable of Params 
    // ... 
    return j;
}

void Params::from_json(const json& j) {
    if (j.contains("fai")) { // this is required for each member variable of Params 
        fai = j["fai"].get<bool>();  
    }
    // ... 
}

json Params::get_param(const std::string& name) const {
    json j;
    if (name == "fai") {  // this is required for each member variable of Params
        j[name] = fai;
    }
    // ... 
    else {
        throw std::invalid_argument("Unknown parameter name");
    }
    return j;
}
  • add pytests to /pyqtree2/tests to test that parameters can be marshalled to and from the singleton in the library
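
The round trip those pytests would exercise can be sketched in Python with only the standard library. This is a stand-in for the C++ singleton, not a binding to it: `fai` is the only member named in this issue, and `rand_seed` is a hypothetical extra member added so that the example has more than one field.

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class Params:
    fai: bool = False   # the member used as the example in this issue
    rand_seed: int = 0  # hypothetical second member, for illustration only

    def to_json(self):
        """Serialise every member to a JSON string."""
        return json.dumps(asdict(self))

    def from_json(self, text):
        """Update members present in the JSON text; unknown keys are ignored."""
        known = {f.name for f in fields(self)}
        for name, value in json.loads(text).items():
            if name in known:
                setattr(self, name, value)

    def get_param(self, name):
        """Return a single member as a JSON string, or raise for unknown names."""
        if name not in {f.name for f in fields(self)}:
            raise ValueError("Unknown parameter name")
        return json.dumps({name: getattr(self, name)})


params = Params()
params.from_json('{"fai": true}')
print(params.get_param("fai"))  # prints {"fai": true}
print(params.to_json())         # prints {"fai": true, "rand_seed": 0}
```

The dataclass introspection used here has no direct C++ equivalent, which is why the C++ version above needs a line per member variable.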

Build iqtree2core library using pybind11

The iqtree2core target generates a shared library (.so on Linux, .dll on Windows) that exports:
std::string getIqTreeVersion();
void set_params(const std::string& json_str);
std::string get_params();
std::string get_param(const std::string& name);
void generate_random_tree(const std::string& json_str);

The source for iqtree2core is in the /core folder.

The Python library that imports the shared library is in /piqtree2/piqtree2.

/piqtree2/piqtree2/libs/test_iqtree2core_connectivity.py is a sample Python program that simply tests connectivity to the shared library in the same directory. To reproduce, build the shared library, copy it into the libs directory, and run:

python -m test_iqtree2core_connectivity.py

The result is:

Traceback (most recent call last):
  File "<frozen runpy>", line 189, in _run_module_as_main
  File "<frozen runpy>", line 112, in _get_module_details
  File "/iqtree2/piqtree2/libs/test_iqtree2core_connectivity.py", line 2, in <module>
import iqtree2core
ImportError: /iqtree2/piqtree2/libs/iqtree2core.so: undefined symbol: _ZN9PhyloTree21setParsimonyKernelAVXEv

The problem appears to be that the iqtree2core project is not built with the AVX libraries.
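
To see which function the loader is missing, the mangled symbol can be decoded. In practice a tool such as c++filt does this. The sketch below is a deliberately tiny decoder that only handles simple nested names of this shape, and the function name `demangle_nested_name` is an assumption for illustration.

```python
def demangle_nested_name(symbol):
    """Decode a simple Itanium-ABI symbol of the form _ZN<len><name>...E<args>."""
    if not symbol.startswith("_ZN"):
        raise ValueError("only simple nested names are handled")
    pos, parts = 3, []
    # Each name component is written as its decimal length followed by the name.
    while symbol[pos].isdigit():
        end = pos
        while symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])
        parts.append(symbol[end:end + length])
        pos = end + length
    if symbol[pos] != "E":
        raise ValueError("unsupported symbol shape")
    # A lone "v" after the closing E means the function takes no arguments.
    args = "" if symbol[pos + 1:] == "v" else "..."
    return "::".join(parts) + "(" + args + ")"


name = demangle_nested_name("_ZN9PhyloTree21setParsimonyKernelAVXEv")
print(name)  # prints PhyloTree::setParsimonyKernelAVX()
```

So the library references `PhyloTree::setParsimonyKernelAVX()` but no definition was linked in, which is consistent with the AVX-specific sources being left out of the iqtree2core target.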

pyqtree2 specific project documentation

  • create readme_pyqtree2.md
    • Describe the architecture of pyqtree2
  • Update project readme to retain IQTree2 content but to refer to the addition of the readme_pyqtree2

Determine which of the IQTREE2 functions will be exported in libiqtree2

Stubs for the first 3 exist in the library currently but need to be hooked up to the IQTREE2 code

computeRFDist
generateRandomTree
runPhyloAnalysis

runAliSim
doParsMultiState
testInputFile
printTaxa
printAreaList
scaleBranchLength
calcDistribution
branchStats
calcTreeCluster
processNCBITree
processECOpd
runterraceanalysis
guidedBootstrap
computeMulProb

The development process is:

  • create a function wrapping this feature in libiqtree2/src/functions
  • create a Catch2 test case containing multiple tests for this feature in libiqtree2/tests/c++/
  • translate the test to a pytest in libiqtree2/tests/python
  • create a feature in piqtree2 that implements the iqtree2 feature using a cogent3 context
    • as a cogent3 app
    • using cogent3 data structures
  • create pytests testing the feature in piqtree2/tests

Implement PD_distribution in iqtree2_core

  • implement export
extern "C" {
    /**
     * Calculates the Phylogenetic Diversity (PD) distribution for given taxa sets based on the input tree.
     * 
     * @param tree_file Path to the tree file.
     * @param taxa_set_file Path to the file containing taxa sets.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* PD_distribution(const char* tree_file, const char* taxa_set_file);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Add test case in /libiqtree2/tests/c++/test_interfaces.cpp and ../python/test_interfaces.py that verifies the interface of iqtree2 hasn't changed

This will need to inspect C++ header files, so there will have to be some research into how to do that easily.

Check:

  • /utils/tools.h contains the Params structure
  • Params member variables have not changed
  • The 17 called functions exist and have not changed
  • Build the C++ project and run the unit tests whenever the project has changed, even if the interface has not

We should be able to synchronize with the upstream repository and run tests to verify that pyqtree2 should still function as expected
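
One low-tech starting point for such a check is to extract declarations from the header text and compare them with a stored snapshot. The sketch below does this with a regular expression, using the `EXPORT const char*` declarations from these issues because their shape is known. The helper names and the snapshot dictionary are assumptions, and general C++ headers such as /utils/tools.h would probably need a real parser rather than a regex.

```python
import re

# Matches declarations of the form: EXPORT const char* name(params);
EXPORT_DECL = re.compile(r"EXPORT\s+const\s+char\s*\*\s*(\w+)\s*\(([^)]*)\)\s*;")


def exported_signatures(header_text):
    """Map each exported function name to its whitespace-normalised parameters."""
    return {
        name: " ".join(params.split())
        for name, params in EXPORT_DECL.findall(header_text)
    }


def interface_changes(expected, actual):
    """Names that are missing from the header or declared differently."""
    return sorted(name for name in expected if actual.get(name) != expected[name])


header = """
extern "C" {
    EXPORT const char* print_taxa(const char* tree_file);
    EXPORT const char* scale_branch_length(const char* tree_file, float scale_factor);
}
"""

# Snapshot of the interface the Python side was written against.
snapshot = {
    "print_taxa": "const char* tree_file",
    "scale_branch_length": "const char* tree_file, float scale_factor",
}

print(interface_changes(snapshot, exported_signatures(header)))  # prints []

# Simulate an upstream change to one parameter type.
drifted = exported_signatures(header.replace("float", "double"))
print(interface_changes(snapshot, drifted))  # prints ['scale_branch_length']
```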

Implement input_file_testing in iqtree2_core

  • implement export
extern "C" {
    /**
     * Tests the input file 
     * 
     * @param input_file Path to the input file to be tested.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* input_file_testing(const char* input_file);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Implement ncbi_tree_processing in iqtree2_core

  • implement export
extern "C" {
    /**
     * Processes the NCBI tree based on the specified NCBI taxid file.
     * 
     * @param ncbi_taxid_file Path to the file containing NCBI taxid.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* ncbi_tree_processing(const char* ncbi_taxid_file);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Implement tree_clustering in iqtree2_core

  • implement export
extern "C" {
    /**
     * Clusters the phylogenetic trees based on the specified threshold.
     * 
     * @param tree_file Path to the tree file.
     * @param cluster_threshold Threshold for clustering.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* tree_clustering(const char* tree_file, float cluster_threshold);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function

Implement parsimony_multistate in iqtree2_core

  • implement export
extern "C" {
    /**
     * Performs a parsimony analysis on multistate data.
     * 
     * @param input_file Path to the input file containing multistate data.
     * 
     * @return A dynamically allocated C-style string containing the results. 
     *         The caller is responsible for deleting this memory 
     */
    EXPORT const char* parsimony_multistate(const char* input_file);
}
  • add mocked placeholder functionality
  • add pyqtree2 function to call the iqtree2_core function
  • create a /pyqtree2/tests pytest to probe this functionality
  • hook up the exported function to the iqtree2_core internal function
